Python offers several ways to check whether a file exists in a given directory. The check is often performed right before accessing (reading and/or writing) the file. Below we go through each method of checking whether a file exists (and whether it is accessible), and discuss some of the potential issues with each one.

1. os.path.isfile(path)

This function returns True if the given path is an existing regular file. It follows symbolic links, so os.path.islink(path) and os.path.isfile(path) can both be True for the same path. It is a handy way to check whether a file exists because it is a simple one-liner. Unfortunately, it only checks whether the specified path is a file; it does not guarantee that the user has access to it. It also only tells you that the file existed at the moment you called the function. It is possible (although unlikely) that the file is deleted, moved or renamed between that call and the moment you access the file.

For example, it may fail in the following scenario, where foo.txt exists but the current user is not allowed to read it:

>>> import os
>>> os.path.isfile('foo.txt')
True
>>> f = open('foo.txt', 'r')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IOError: [Errno 13] Permission denied: 'foo.txt'
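As a rough illustration of what os.path.isfile does and does not tell you, here is a minimal sketch, assuming Python 3. The helper name describe_path is hypothetical and only exists for this example; it separates regular files from directories and from paths that do not exist. Note that it says nothing about permissions.

```python
import os
import tempfile


def describe_path(path):
    """Hypothetical helper: classify a path using os.path checks only."""
    if os.path.isfile(path):
        return 'file'
    if os.path.isdir(path):
        return 'directory'
    if os.path.exists(path):
        return 'other'      # e.g. a socket or a device node
    return 'missing'        # also reported for a broken symbolic link


with tempfile.TemporaryDirectory() as tmpdir:
    file_path = os.path.join(tmpdir, 'foo.txt')
    with open(file_path, 'w') as f:
        f.write('hello')

    print(describe_path(file_path))                        # file
    print(describe_path(tmpdir))                           # directory
    print(describe_path(os.path.join(tmpdir, 'bar.txt')))  # missing
```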

2. os.access(path, mode)

This function tests whether the current user (using the real uid/gid) has access to a given path. Pass os.R_OK to test whether the file is readable, and os.W_OK to test whether it is writable. For example:

>>> # Check for read access to foo.txt
>>> os.access('foo.txt', os.R_OK)
True
>>> # True means the file exists AND you can read it.
>>>
>>> # Check for write access to foo.txt
>>> os.access('foo.txt', os.W_OK)
False
>>> # False means you cannot write to the file. It may or may not exist.
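The mode flags can also be combined with a bitwise OR, and os.F_OK tests for existence alone. The following is a small sketch, assuming Python 3; access_summary is a hypothetical helper written for this example, not part of the os module.

```python
import os
import tempfile


def access_summary(path):
    """Hypothetical helper: collect what os.access() reports for a path."""
    return {
        'exists': os.access(path, os.F_OK),
        'readable': os.access(path, os.R_OK),
        'writable': os.access(path, os.W_OK),
        # Both permissions must be granted for this to be True.
        'read_write': os.access(path, os.R_OK | os.W_OK),
    }


with tempfile.TemporaryDirectory() as tmpdir:
    existing = os.path.join(tmpdir, 'foo.txt')
    with open(existing, 'w') as f:
        f.write('hello')

    print(access_summary(existing))
    # Every value is False for a path that does not exist.
    print(access_summary(os.path.join(tmpdir, 'bar.txt')))
```

The result is still only a snapshot of the moment each check ran.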

If you plan to access the file, os.access is somewhat safer (although not completely recommended) because it also checks whether you can read or write the file. However, it is possible (although unlikely) that the file is deleted, moved or renamed between the time you check that it is accessible and the time you access it. This is known as a race condition, and should be avoided. The following is an example of how it can happen.

>>> # The file 'foo.txt' currently exists and is readable.
>>> if os.access('foo.txt', os.R_OK):
...     # After executing os.access() and before open(),
...     # another program deletes the file.
...     f = open('foo.txt', 'r')
...
Traceback (most recent call last):
  File "<stdin>", line 4, in <module>
IOError: [Errno 2] No such file or directory: 'foo.txt'
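The race is hard to trigger by hand, so the sketch below simulates it deterministically, assuming Python 3. Both check_then_open and its between parameter are hypothetical names invented for this example: between stands in for the other program that touches the file after the check and before the open.

```python
import os
import tempfile


def check_then_open(path, between=None):
    """Racy pattern: test access first, open afterwards.

    `between` is a hypothetical hook that simulates another program
    acting on the file after the check and before the open.
    """
    if not os.access(path, os.R_OK):
        return 'check failed'
    if between is not None:
        between(path)  # the "other program" runs here
    try:
        with open(path) as f:
            f.read()
    except OSError:  # IOError is an alias of OSError on Python 3.3+
        return 'check passed, open failed'
    return 'check passed, open succeeded'


with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'foo.txt')
    with open(path, 'w') as f:
        f.write('hello')

    print(check_then_open(path))                     # check passed, open succeeded
    print(check_then_open(path, between=os.remove))  # check passed, open failed
```

The second call shows that a successful os.access() result does not guarantee the open that follows it.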

3. Attempting to access (open) the file.

To be certain that the file not only exists but is also accessible right now, the simplest method is to actually attempt to open it.

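try:
    # A successful open() proves the file exists and is readable right now.
    f = open('foo.txt')
    f.close()
except IOError:
    # Raised when the file is missing or cannot be opened.
    print('Uh oh!')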
This can be transformed into an easy to use function, as follows.

def file_accessible(filepath, mode):
    ''' Check if a file exists and is accessible.

    Note: a write mode such as 'w' creates or truncates the file.
    '''
    try:
        f = open(filepath, mode)
        f.close()
    except IOError:
        return False

    return True

For example, you can use it as follows:

>>> # Say the file 'foo.txt' exists and is readable,
>>> # whereas the file 'bar.txt' doesn't exist.
>>> file_accessible('foo.txt', 'r')
True
>>>
>>> file_accessible('bar.txt', 'r')
False
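If you need the contents anyway, one option is to do the real work inside the same try block, so that nothing can change between a check and the use. The sketch below assumes Python 3.3 or later, where open() raises the OSError subclasses FileNotFoundError and PermissionError. The function name try_read and its return convention are hypothetical, chosen only for this example.

```python
import os
import tempfile


def try_read(filepath):
    """Hypothetical helper: return (contents, None) or (None, reason)."""
    try:
        with open(filepath, 'r') as f:
            return f.read(), None
    except FileNotFoundError:
        return None, 'missing'
    except PermissionError:
        return None, 'permission denied'
    except OSError as e:
        # Any other failure, such as the path being a directory.
        return None, 'other error: %s' % e.strerror


with tempfile.TemporaryDirectory() as tmpdir:
    foo = os.path.join(tmpdir, 'foo.txt')
    with open(foo, 'w') as f:
        f.write('hello')

    print(try_read(foo))                               # ('hello', None)
    print(try_read(os.path.join(tmpdir, 'bar.txt')))   # (None, 'missing')
```

Unlike a boolean result, the reason string lets the caller tell a missing file apart from a permissions problem.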

So… which is best?

The method you choose depends on why you need to check whether a file exists, whether speed matters, and often how many files you are trying to open at any given time. In many cases os.path.isfile should suffice. However, keep in mind that each method has its own benefits and potential issues.
