Manipulating files is an essential aspect of scripting in Python, and luckily for us, the process isn’t complicated. The built-in open function is the preferred way to read files of any type, and probably all you’ll ever need to use. Let’s first demonstrate how to use this function on a simple text file.
For clarity, let’s first write our text in a standard text editor (MS Notepad in this example) and save it as textfile.txt. When opened in the editor it will look like this (note the empty trailing line):
First line of our text.
Second line of our text.
3rd line, one line is trailing.
To open our file with Python, we first have to know the path to the file. In this example the file path is relative to the current working directory, so we won’t need to type the full path into the interpreter.
>>> tf = 'textfile.txt'
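To check which directory a relative path like this is resolved against, the standard os module can help. The following is a small sketch, assuming Python 3; the variable name full_path is illustrative and not part of the original example.

```python
import os

tf = 'textfile.txt'

# A relative path is resolved against the current working directory.
full_path = os.path.abspath(tf)

print(os.getcwd())   # the directory Python is currently working in
print(full_path)     # the absolute path that open(tf) would look for
```

If the printed directory is not the one that contains the file, either change directory before starting the interpreter or pass the full path to open.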
Open a File in Python
Passing this variable as the first argument to the open function gives us a file object:
>>> f = open(tf)
Read a File with Python
When we reference our file object f in the interpreter, Python shows us its name and mode, along with some details that depend on the Python version (such as the open or closed status, the encoding, or the object’s memory address).
We already knew the name, and we haven’t closed it so we know it’s open, but the mode deserves special attention. Our file f is in mode r, for read. This means we can only read data from the file, not edit it or write new data to it. (It’s also in t mode, for text, though this isn’t shown explicitly; like r, it’s the default.) Let’s read our text from the file with the read method:
>>> f.read()
'First line of our text.\nSecond line of our text.\n3rd line, one line is trailing.\n'
This doesn’t exactly look like what we typed into Notepad, but it’s how Python represents the raw text data. To get the text as we typed it (without the \n newline characters), we can print it:
>>> print(_)
First line of our text.
Second line of our text.
3rd line, one line is trailing.
Note how we used the _ variable in the interactive interpreter (IDLE in this case) to reference the most recent output instead of calling the read method again. Here’s what happens if we call read a second time:
>>> f.read()
''
This happens because read returned the full contents of the file, and the invisible position marker (how Python keeps track of your position in the file) is at the end of the file; there’s nothing left to read.
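The same behaviour can be reproduced in a standalone script. This is a minimal sketch, assuming Python 3; the temporary file and the helper name read_twice are illustrative and not part of the original example.

```python
import os
import tempfile


def read_twice(path):
    """Return the results of two consecutive read() calls on one file object."""
    with open(path) as f:
        first = f.read()   # reads everything and leaves the position at the end
        second = f.read()  # nothing is left to read, so this is an empty string
    return first, second


# Build a throwaway file so the sketch does not depend on textfile.txt existing.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample.txt")
    with open(sample_path, "w") as out:
        out.write("First line of our text.\nSecond line of our text.\n")
    first, second = read_twice(sample_path)

print(repr(first))
print(repr(second))
```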
Partial Reading of Files in Python
Note: You can pass an integer argument to read if you don’t want the full contents of the file; Python will then read at most that many characters (or bytes, if the file was opened in binary mode).
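One common use of the integer argument is reading a file in fixed-size pieces. The sketch below assumes Python 3; the helper name read_in_chunks, the chunk size, and the temporary file are illustrative choices.

```python
import os
import tempfile


def read_in_chunks(path, chunk_size=10):
    """Yield successive pieces of a text file, at most chunk_size characters each."""
    with open(path) as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # read() returns '' once the end of the file is reached
                break
            yield chunk


with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample.txt")
    with open(sample_path, "w") as out:
        out.write("First line of our text.\n")
    chunks = list(read_in_chunks(sample_path, 10))

print(chunks)
```

The last chunk is simply shorter than the others when the file length is not a multiple of the chunk size.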
To get back to the start of the file (or anywhere else in the file), use the seek method on f with an integer position. By going back to the start you can read the contents from the beginning again with read:
>>> f.seek(0)
# We only read a small chunk of the file, 10 characters
>>> print(f.read(10))
First line
Also, to find the current position in the file, use the tell method on f like so:
>>> f.tell()
10
If you don’t know the size of your file or how much of it you want, you might not find that useful.
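To see seek and tell working together outside the interpreter, here is a short sketch, assuming Python 3 and a plain ASCII text file; the temporary file and variable names are illustrative.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample.txt")
    with open(sample_path, "w") as out:
        out.write("First line of our text.\n")

    with open(sample_path) as f:
        start = f.tell()        # position right after opening the file
        chunk = f.read(10)      # read the first 10 characters
        after_chunk = f.tell()  # position after that partial read
        f.seek(0)               # rewind to the beginning
        rewound = f.tell()
        again = f.read(10)      # the same 10 characters as before

print(start, after_chunk, rewound)
print(chunk == again)
```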
Reading Files Line by Line in Python
What is useful, however, is reading the contents of the file line by line. One way to do this is with the readline or readlines methods: the first reads one line at a time, and the second returns a list of every line in the file. Both take an optional integer argument that limits how much of the file to read:
# Make sure we're at the start of the file
>>> f.seek(0)
>>> f.readlines()
['First line of our text.\n', 'Second line of our text.\n', '3rd line, one line is trailing.\n']
# readlines left us at the end of the file, so go back to the start
>>> f.seek(0)
>>> f.readline()
'First line of our text.\n'
>>> f.readline(20)
'Second line of our t'
# Note if int is too large it just reads to the end of the line
>>> f.readline(20)
'ext.\n'
Another option for reading a file line-by-line is treating it as a sequence and looping through it, like so:
>>> f.seek(0)
>>> for line in f:
...     print(line)
...
First line of our text.

Second line of our text.

3rd line, one line is trailing.

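When looping over a file it is often handy to number the lines and drop the trailing newline from each one. The following sketch assumes Python 3; the helper name numbered_lines and the temporary file are illustrative.

```python
import os
import tempfile


def numbered_lines(path):
    """Return (line_number, text) pairs, with the trailing newline removed."""
    with open(path) as f:
        # enumerate() supplies the counter; iterating over f yields one line at a time
        return [(number, line.rstrip("\n")) for number, line in enumerate(f, start=1)]


with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample.txt")
    with open(sample_path, "w") as out:
        out.write("First line of our text.\nSecond line of our text.\n")
    numbered = numbered_lines(sample_path)

for number, text in numbered:
    print(number, text)
```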
Reading a Specific Line of a File with Python
We can use the readlines method to access a specific line in the file.
Say we have a file named test.txt containing the following text:
one
two
three
four
We can access the second line as follows:
>>> test_file = open('test.txt', 'r')
>>> test_lines = test_file.readlines()
>>> test_file.close()
>>> # Print second line
>>> print(test_lines[1])
two

Note that it prints two lines, because the line itself ends with a line ending and print adds another. To avoid this, strip the line first, for example print(test_lines[1].strip()).
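For large files, loading every line with readlines just to pick one can be wasteful. As an alternative sketch (not the approach used above), itertools.islice can stop reading once the wanted line is reached. This assumes Python 3; the helper name get_line and the temporary file are hypothetical.

```python
import os
import tempfile
from itertools import islice


def get_line(path, line_number):
    """Return line `line_number` (1-based) without its line ending, or None if the file is shorter."""
    with open(path) as f:
        # islice skips the first line_number - 1 lines and yields at most one line
        line = next(islice(f, line_number - 1, line_number), None)
    return None if line is None else line.rstrip("\n")


with tempfile.TemporaryDirectory() as tmp:
    test_path = os.path.join(tmp, "test.txt")
    with open(test_path, "w") as out:
        out.write("one\ntwo\nthree\nfour\n")
    second = get_line(test_path, 2)
    missing = get_line(test_path, 10)

print(second)
print(missing)
```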
Python File Writing Modes
That covers the basic reading methods for files. Before looking at writing methods, we’ll briefly examine the other modes of file objects returned by open.
We already know mode r, but there are also the w and a modes (which stand for write and append, respectively). In addition to these there are the options + and b. The + option added to a mode makes the file open for updating, in other words to read from it or write to it.
With this option it might seem like there’s no difference between r+ mode and w+ mode, but there is a very important one: in w mode the file is automatically truncated, meaning its entire contents are erased. Even in w+ mode the existing contents are wiped as soon as the file is opened, so be careful. Alternatively, you can truncate an open file yourself with the truncate method.
If you want to write to the end of the file, just use append mode (with + if you also want to read from it).
The b option indicates to open the file as a binary file (instead of the text mode default). Use this whenever you have data in the file that is not regular text (e.g. when opening an image file).
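The difference between these modes is easiest to see by opening the same file in each one and checking what is left afterwards. This is a minimal sketch, assuming Python 3; the helper name contents_after_open and the temporary file are illustrative.

```python
import os
import tempfile


def contents_after_open(path, mode):
    """Open path in the given mode, close it straight away, and return what is left on disk."""
    open(path, mode).close()
    with open(path) as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "modes.txt")
    with open(path, "w") as out:
        out.write("original\n")

    after_r_plus = contents_after_open(path, "r+")  # read/update: contents kept
    after_a_plus = contents_after_open(path, "a+")  # append/read: contents kept
    after_w_plus = contents_after_open(path, "w+")  # write/read: contents erased on open

print(repr(after_r_plus))
print(repr(after_a_plus))
print(repr(after_w_plus))
```

Note that nothing was written in any of the three cases; merely opening the file in w+ mode was enough to empty it.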
Now let’s look at writing to our file. We’ll use a+ mode so we don’t erase what we have. First let’s close our file f and open a new one, f2:
# It's important to close the file to free system resources
>>> f.close()
>>> f2 = open(tf, 'a+')
Our file f is now closed, meaning it is no longer tying up resources, and we can’t read from it or call other I/O methods on it.
Note: If you don’t want to have to call close explicitly on the file, you can use a with statement to open the file. The with statement will close the file automatically:
# f remains open only within the 'with'
>>> with open(tf) as f:
...     print(f.read())
...
First line of our text.
Second line of our text.
3rd line, one line is trailing.

# This attribute tells us if the file is closed
>>> f.closed
True
With f2, let’s write to the end of the file. We’re already in append mode, so we can just call write:
f2.write('Our 4th line, with write()\n')
Writing Multiple Lines to a File in Python
With this we’ve written to our file. We can also write multiple strings at once with the writelines method, which writes a sequence (e.g., a list) of strings to the file:
f2.writelines(['And a fifth', 'And also a sixth.'])
f2.close()
Note: The name writelines is a misnomer, as it does not write newline characters to the end of each string in the sequence automatically, as we’ll see.
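Ok, now we’ve written our text and closed f2, so the changes we’ve made should be visible when we open the file in our text editor:
First line of our text.
Second line of our text.
3rd line, one line is trailing.
Our 4th line, with write()
And a fifthAnd also a sixth.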
We can see the writelines method didn’t separate our fifth and sixth lines for us, so keep that in mind.
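One way to work around this is to add the newline yourself before handing the strings to writelines. The sketch below assumes Python 3; the helper name append_lines and the temporary file are illustrative.

```python
import os
import tempfile


def append_lines(path, lines):
    """Append each string in lines to path, adding the newline that writelines omits."""
    with open(path, "a") as f:
        f.writelines(line + "\n" for line in lines)


with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample.txt")
    with open(sample_path, "w") as out:
        out.write("Our 4th line, with write()\n")
    append_lines(sample_path, ["And a fifth", "And also a sixth."])
    with open(sample_path) as f:
        result = f.read()

print(result)
```

Because writelines accepts any iterable of strings, a generator expression works here and avoids building a second list.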
Now that you have a good starting point, get scripting and discover what you can do when reading and writing files in Python — and don’t forget to utilize all of the extensive formatting methods Python has for strings!