To open a text file within your code, use Python's built-in open() function. Pass open() a string containing the path to your text file (in this case, open('C:/Users/mybringback/Desktop/pg16328.txt'); your path will of course look different) and assign the result to a variable, for example book = open('C:/Users/mybringback/Desktop/pg16328.txt'). After opening your text file, you can tell Python what to do with it by calling one of the file object's methods. For example, typing booktxt = book.readlines() reads the whole file into a list named "booktxt", with one string per line, so the text can be recalled within your program line by line.
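The open-then-read sequence described above can be sketched as follows. To stay runnable on any machine, this sketch first writes a small stand-in file to a temporary folder; the file name sample.txt and its three lines are assumptions for illustration, not the Beowulf file used in the tutorial.

```python
import os
import tempfile

# Create a small stand-in text file (hypothetical name and contents).
folder = tempfile.mkdtemp()
path = os.path.join(folder, 'sample.txt')
out = open(path, 'w', encoding='utf-8')
out.write('first line\nsecond line\nthird line\n')
out.close()

# Open the file, then read it into a list with one string per line.
book = open(path, encoding='utf-8')
booktxt = book.readlines()
book.close()

print(booktxt)
```

With your own file, you would replace path with the location of the file on your computer.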
After defining this variable, simply typing booktxt will display your entire text file, start to finish, within your IDLE window. While this feature can be useful in other circumstances, oftentimes your text will be too long and unwieldy to be recalled in this manner, as is the case with the large text file used in this tutorial. In that case, you can recall select lines from your text by placing the line's index within square brackets next to your variable. Indexes start at 0, so booktxt[0] will recall the first line of the text from your file, which in the text we are using is formatting information.
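Here is a short sketch of recalling single lines and small ranges of lines with brackets. The four-item list is a made-up stand-in for what readlines() would return, so the example runs without a file.

```python
# A stand-in for the list that readlines() returns (assumed contents).
booktxt = ['first line\n', 'second line\n', 'third line\n', 'fourth line\n']

first = booktxt[0]        # indexes start at 0, so this is the first line
last = booktxt[-1]        # negative indexes count back from the end
first_two = booktxt[0:2]  # a slice: the lines at index 0 and 1, as a list

print(first)
print(last)
print(first_two)
```

Slices are handy for peeking at the top of a long file without printing all of it.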
To recall the length of your text file in lines, type len(booktxt). Because booktxt is a list with one item per line, len() here counts lines, not words. For more information on the len() function, see Tutorial 11.
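Keep in mind that len() on a list produced by readlines() counts list items, which are lines. If you want a word count as well, one possible approach (a sketch, not the tutorial's method) is to split each line on whitespace and add up the pieces. The two-line list below is a made-up stand-in for your file's contents.

```python
# A stand-in for the list that readlines() returns (assumed contents).
booktxt = ['Mary had a little lamb,\n', 'His fleece was white as snow,\n']

# len() of the list is the number of lines.
line_count = len(booktxt)

# split() with no arguments breaks a string apart on any whitespace,
# so len(line.split()) is the number of words in that line.
word_count = 0
for line in booktxt:
    word_count = word_count + len(line.split())

print(line_count)
print(word_count)
```

Note that this simple split treats "lamb," (with its comma) as one word, which is usually fine for a rough count.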
If you are a Mac user, omit the drive letter "C:" from your file path. Mac's directory tree simply starts from "/", which is called the root.
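If you would rather not type the full path by hand, one option is to let Python build it for you. This is only a sketch: it assumes the file sits in a folder named Desktop inside your home folder, which may not match your setup.

```python
import os

# expanduser('~') gives the current user's home folder on Windows or Mac.
home = os.path.expanduser('~')

# join() glues the pieces together with the right separator for your system.
# 'Desktop' and 'pg16328.txt' are assumptions; adjust them to your own file.
path = os.path.join(home, 'Desktop', 'pg16328.txt')

print(path)
```

The printed string can be passed to open() in place of a hand-typed path.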
Even when following Ed's exact steps and using the exact Beowulf file, some of you (mostly on Windows) will get this surprise error message:
>>> book = open('C:/Users/narae/Desktop/pg16328.txt')
>>> booktxt = book.readlines()
Traceback (most recent call last):
  File "<pyshell#14>", line 1, in <module>
    booktxt = book.readlines()
  File "C:\Program Files (x86)\Python35-32\lib\encodings\cp1252.py", line 23, in decode
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 36593: character maps to <undefined>
>>>
That means the text file's encoding is different from your system's default encoding (cp1252 on this Windows machine, as the traceback shows). Ed didn't get this error because he is operating in Python 2.7, which handles text encoding differently. In Python 3, more often than not we need to specify the encoding when opening the file, using the encoding="xxx" keyword argument:
>>> book = open('C:/Users/narae/Desktop/pg16328.txt', encoding="utf-8")
>>> booktxt = book.readlines()
>>> booktxt[0]
'\ufeffThe Project Gutenberg EBook of Beowulf \n'
>>>
Here, the file is encoded in UTF-8 (8-bit Unicode, as opposed to UTF-16 or UTF-32), so encoding="utf-8" was specified.
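The following sketch reproduces the problem and the fix on a tiny file, so you can try it without the Beowulf text. The file name demo.txt and the word in it are made up for the demo. The sketch also shows the "utf-8-sig" encoding, which reads UTF-8 and drops the leading '\ufeff' (the byte order mark, or BOM) seen at the start of the first line above; whether you need it depends on your file.

```python
import os
import tempfile

# Write a small UTF-8 file that starts with a BOM, like the Gutenberg file.
# The name 'demo.txt' and the text are made up for this demo.
path = os.path.join(tempfile.mkdtemp(), 'demo.txt')
out = open(path, 'w', encoding='utf-8-sig')
out.write('g\u0101r\n')   # \u0101 is the letter a with a macron
out.close()

# Reading with the wrong encoding (cp1252) fails on this file.
f = open(path, encoding='cp1252')
try:
    f.readlines()
    failed = False
except UnicodeDecodeError:
    failed = True
f.close()

# Reading as UTF-8 works, but the BOM stays in the first line.
f = open(path, encoding='utf-8')
with_bom = f.readlines()
f.close()

# Reading as utf-8-sig works and leaves the BOM out.
f = open(path, encoding='utf-8-sig')
without_bom = f.readlines()
f.close()

print(failed)
print(with_bom)
print(without_bom)
```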
It was not done in the tutorial, but a file object, once opened and processed, should be closed. In the tutorial, a good time to close would have been right after book.readlines() was executed. This is done by calling book.close().
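Here is a sketch of two ways to make sure the file gets closed: calling close() yourself, or using a with block, which closes the file automatically when the indented block ends. The stand-in file sample.txt and its contents are assumptions so the example can run anywhere.

```python
import os
import tempfile

# Create a small stand-in text file (hypothetical name and contents).
path = os.path.join(tempfile.mkdtemp(), 'sample.txt')
with open(path, 'w', encoding='utf-8') as out:
    out.write('first line\nsecond line\n')

# Style 1: open, read, then close by hand.
book = open(path, encoding='utf-8')
booktxt = book.readlines()
book.close()

# Style 2: a with block closes the file for you when the block ends.
with open(path, encoding='utf-8') as book2:
    booktxt2 = book2.readlines()

# Both file objects are closed, and both reads gave the same list.
print(book.closed, book2.closed)
print(booktxt == booktxt2)
```

The with style is less typing in a script, while the close() style fits naturally when you are working step by step in the IDLE shell.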
There are more details to learn (and battle with) in dealing with files on your local drive. See this advanced topic page: "File Path and CWD".
Practice using the "mary-short.txt" file, linked on the left under "Code and Text Examples". First download and save it on your computer, and then read it in using the IDLE shell. Everyone's system is different, so you might need to refer to these two additional tutorials: "File Path and CWD" and "File Reading and Writing Methods".
Here is me reading in the file, which is saved on my desktop (Windows 7).
>>> f = open('C:/Users/narae/Desktop/mary-short.txt')
>>> lines = f.readlines()
>>> lines[0]
'Mary had a little lamb,\n'
>>> lines[1]
'His fleece was white as snow,\n'
>>> print(lines[2])
And everywhere that Mary went,

>>> lines
['Mary had a little lamb,\n', 'His fleece was white as snow,\n', 'And everywhere that Mary went,\n', 'The lamb was sure to go.\n']
>>> len(lines)
4
Anne Dawson has many sample scripts for File I/O. Search for "open". Note that she uses the "escaped backslash" style (see this page) of Windows file path reference.
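For comparison, here is a sketch of three ways to write the same Windows path as a Python string: forward slashes (used on this page), escaped backslashes, and a raw string. The path itself is just the example path from above.

```python
# Forward slashes: the style used on this page.
forward = 'C:/Users/narae/Desktop/mary-short.txt'

# Escaped backslashes: each \\ in the code stands for one real backslash.
escaped = 'C:\\Users\\narae\\Desktop\\mary-short.txt'

# Raw string: the r prefix tells Python to leave backslashes alone.
raw = r'C:\Users\narae\Desktop\mary-short.txt'

# A plain 'C:\Users...' without the r prefix is an error in Python 3,
# because \U is read as the start of a Unicode escape.

print(escaped)
print(escaped == raw)
```

The escaped and raw versions are the same string; they are only typed differently.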