Python Data Persistence: A Concise Tutorial
Python Data Persistence - File API
Python performs standard input and output with the built-in input() and print() functions. The input() function reads a line of text from the standard input stream, which is normally the keyboard.
The print() function, on the other hand, sends data to the standard output stream, which is normally the display. A Python program interacts with these I/O devices through the standard stream objects stdin and stdout defined in the sys module.
The input() function behaves like a wrapper around the readline() method of the sys.stdin object. It receives all keystrokes from the input stream until the Enter key is pressed.
>>> import sys
>>> x=sys.stdin.readline()
Welcome to TutorialsPoint
>>> x
'Welcome to TutorialsPoint\n'
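The relationship between input() and sys.stdin.readline() can be checked without typing anything. The sketch below is only an illustration: it temporarily replaces sys.stdin with an in-memory io.StringIO object (an assumption made here so the example runs unattended) and reads the same line both ways.

```python
import io
import sys

original_stdin = sys.stdin          # remember the real stream so it can be restored
try:
    # Simulated keyboard input for readline()
    sys.stdin = io.StringIO("Welcome to TutorialsPoint\n")
    via_readline = sys.stdin.readline()   # keeps the trailing '\n'

    # Fresh simulated input for input()
    sys.stdin = io.StringIO("Welcome to TutorialsPoint\n")
    via_input = input()                   # strips the trailing '\n'
finally:
    sys.stdin = original_stdin      # always put the real stdin back

print(repr(via_readline))
print(repr(via_input))
```

The two results differ only in the newline character that readline() keeps and input() removes.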
Note that readline() keeps the trailing '\n' character, whereas input() strips it. There is also a read() method, which reads from the standard input stream until end of file is signalled, with Ctrl+D on Unix-like systems or Ctrl+Z followed by Enter on Windows.
>>> x=sys.stdin.read()
Hello
Welcome to TutorialsPoint
>>> x
'Hello\nWelcome to TutorialsPoint\n'
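Similarly, print() is a convenience function built on the write() method of the stdout object. When write() is called in the interactive interpreter, its return value, the number of characters written (26 here), is echoed after the output.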
>>> x='Welcome to TutorialsPoint\n'
>>> sys.stdout.write(x)
Welcome to TutorialsPoint
26
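As a rough check of this equivalence, the following sketch captures standard output in an in-memory buffer and compares what print() and sys.stdout.write() produce. The use of contextlib.redirect_stdout and io.StringIO is a choice made for this illustration, not something the tutorial prescribes.

```python
import io
import sys
from contextlib import redirect_stdout

buffer = io.StringIO()              # in-memory stand-in for the console
with redirect_stdout(buffer):
    print("Welcome to TutorialsPoint")                        # print() appends '\n' itself
    count = sys.stdout.write("Welcome to TutorialsPoint\n")   # write() needs an explicit '\n'

lines = buffer.getvalue().splitlines()
print(lines[0] == lines[1])   # both calls produced the same line
print(count)                  # number of characters written by write()
```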
Like the predefined stream objects stdin and stdout, disk files and network sockets are also streams that a Python program can read from and write to. Any object that has a read() method is an input stream, and any object that has a write() method is an output stream. For a disk file, communication is established by obtaining a reference to the stream object with the built-in open() function.
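The idea that a stream is simply an object with read() or write() can be sketched with io.StringIO, an in-memory text stream from the standard library. The helper name copy_upper is hypothetical and exists only for this example.

```python
import io

def copy_upper(source, target):
    """Read everything from source and write it upper-cased to target.

    Works with any objects that provide read() and write() respectively.
    """
    text = source.read()
    return target.write(text.upper())   # returns the number of characters written

source = io.StringIO("hello stream\n")  # behaves as an input stream
target = io.StringIO()                  # behaves as an output stream
written = copy_upper(source, target)
result = target.getvalue()
print(repr(result), written)
```

The same function would accept file objects returned by open(), because it relies only on the read() and write() methods.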
open() function
The most commonly used arguments of this built-in function are:
f=open(name, mode, buffering)
The name parameter is the path of the disk file, given as a string or a bytes object. The optional mode parameter is a short string that specifies the type of operation to be performed (read, write, append and so on). The optional buffering parameter is 0 to switch buffering off (allowed only in binary mode), 1 to select line buffering (text mode only), or -1 to use the system default.
The file opening modes are listed in the table below. The default mode is 'r'.
Sr.No | Mode | Description
1 | r | Open for reading (default)
2 | w | Open for writing, truncating the file first
3 | x | Create a new file and open it for writing; fails if the file already exists
4 | a | Open for writing, appending to the end of the file if it exists
5 | b | Binary mode
6 | t | Text mode (default)
7 | + | Open a disk file for updating (reading and writing)
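The effect of the 'x', 'a' and 'w' modes can be compared side by side. The sketch below works in a temporary directory so it does not touch existing files; the file name modes_demo.txt is made up for the illustration, and the with statement is used so that each file is closed automatically.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")  # hypothetical file name

with open(path, "x") as f:          # 'x' creates the file; it must not exist yet
    f.write("first\n")

try:
    open(path, "x")                 # a second exclusive create fails
    exclusive_failed = False
except FileExistsError:
    exclusive_failed = True

with open(path, "a") as f:          # 'a' keeps the existing content
    f.write("second\n")
with open(path) as f:               # mode defaults to 'r' (text)
    after_append = f.read()

with open(path, "w") as f:          # 'w' truncates before writing
    f.write("third\n")
with open(path) as f:
    after_write = f.read()

print(exclusive_failed, repr(after_append), repr(after_write))
```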
To save data to a file, open it in 'w' mode.
f=open('test.txt','w')
This file object acts as an output stream and provides a write() method. The write() method sends a string to the object, which stores it in the underlying file.
string="Hello TutorialsPoint\n"
f.write(string)
It is important to close the stream so that any data remaining in the buffer is transferred to the file.
f.close()
Open 'test.txt' in any text editor (such as Notepad) to confirm that the file was created successfully.
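Because forgetting to call close() can leave buffered data unwritten, a common alternative is the with statement, which closes the file automatically when the block ends, even if an exception is raised. The following is a minimal sketch that writes to a temporary directory (an assumption made here to keep the example self-contained).

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "test.txt")

with open(path, "w") as f:
    f.write("Hello TutorialsPoint\n")
    closed_inside = f.closed        # the file is still open inside the block

closed_after = f.closed             # the file was closed on leaving the block

with open(path) as f:
    content = f.read()

print(closed_inside, closed_after, repr(content))
```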
To read the contents of 'test.txt' programmatically, open it in 'r' mode.
f=open('test.txt','r')
This object behaves as an input stream. Its read() method fetches data from the stream.
string=f.read()
print (string)
The contents of the file are displayed on the Python console. The file object also supports a readline() method, which reads one line at a time, up to and including the newline character, and returns an empty string once the end of the file is reached.
However, if the same file is opened in 'w' mode to store additional text, its earlier contents are erased. Whenever a file is opened in 'w' mode, it is treated as a new file. To add data to an existing file, use 'a' for append mode.
f=open('test.txt','a')
f.write('Python Tutorials\n')
f.close()
The file now contains the earlier string as well as the newly added one. The file object also supports a writelines() method, which writes each string in a list to the file. It does not add line separators, so each string must carry its own '\n'.
f=open('test.txt','a')
lines=['Java Tutorials\n', 'DBMS tutorials\n', 'Mobile development tutorials\n']
f.writelines(lines)
f.close()
Example
The readlines() method returns a list of strings, each representing one line of the file. It is also possible to read the file line by line until the end of the file is reached.
f=open('test.txt','r')
while True:
    line=f.readline()
    # readline() returns an empty string at end of file
    if line=='': break
    print(line, end='')
f.close()
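A file object is also its own iterator, so the explicit readline() loop is often written as a for loop. The sketch below first creates a small sample file in a temporary directory (the file name and contents are chosen only for illustration) and then reads it both by iteration and with readlines().

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "test.txt")
with open(path, "w") as f:
    f.writelines(["Java Tutorials\n", "DBMS tutorials\n"])

# Iterating over the file yields one line per step, newline included
with open(path) as f:
    stripped = [line.rstrip("\n") for line in f]

# readlines() returns all lines at once, newline characters preserved
with open(path) as f:
    all_lines = f.readlines()

print(stripped)
print(all_lines)
```

Iteration reads the file lazily, which is usually preferable to readlines() for large files.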
Binary mode
By default, read and write operations on a file object work with text strings. To handle other kinds of files, such as media (mp3), executables (exe) or pictures (jpg), append 'b' to the read/write mode.
The following statements write a bytes literal to a file.
f=open('test.bin', 'wb')
data=b"Hello World"
f.write(data)
f.close()
A text string can also be converted to bytes with the encode() method.
data="Hello World".encode('utf-8')
A binary file must be read in 'rb' mode. The bytes returned by read() are decoded before printing.
f=open('test.bin', 'rb')
data=f.read()
print (data.decode(encoding='utf-8'))
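The difference between text and binary mode becomes visible with non-ASCII text. In this sketch, which assumes UTF-8 as the encoding and uses a temporary file, the same file is read once as bytes and once as text.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "encoding_demo.txt")  # hypothetical file name
text = "na\u00efve caf\u00e9"       # 10 characters, two of them non-ASCII

with open(path, "w", encoding="utf-8") as f:
    chars_written = f.write(text)   # text mode counts characters

with open(path, "rb") as f:
    raw = f.read()                  # binary mode returns bytes, undecoded

with open(path, "r", encoding="utf-8") as f:
    decoded = f.read()              # text mode decodes to str

print(chars_written, len(raw), type(raw).__name__, type(decoded).__name__)
```

The byte count is larger than the character count because each accented character occupies two bytes in UTF-8.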
To write integer data to a binary file, convert the integer object to bytes with the to_bytes() method.
n=25
f=open('test.bin', 'wb')
# 8 bytes, most significant byte first
data=n.to_bytes(8,'big')
f.write(data)
f.close()
To read the value back from the binary file, convert the output of read() to an integer with the int.from_bytes() method.
f=open('test.bin', 'rb')
data=f.read()
n=int.from_bytes(data, 'big')
print (n)
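The same approach extends to several integers if each one is stored with a fixed width. The following sketch assumes a 4-byte big-endian layout (the constant WIDTH and the sample values are illustrative) and reads the file back one fixed-size chunk at a time.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "ints.bin")  # hypothetical file name
values = [25, 1024, 65535]
WIDTH = 4                           # bytes per integer, chosen for this example

with open(path, "wb") as f:
    for value in values:
        f.write(value.to_bytes(WIDTH, "big"))

restored = []
with open(path, "rb") as f:
    while True:
        chunk = f.read(WIDTH)       # read exactly one stored integer
        if not chunk:               # empty bytes object means end of file
            break
        restored.append(int.from_bytes(chunk, "big"))

print(restored)
```

The width and byte order used for reading must match those used for writing, otherwise the values are misinterpreted.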
For floating point data, use the struct module from Python's standard library.
import struct
x=23.50
# 'f' packs x as a 4-byte single-precision float
data=struct.pack('f',x)
f=open('test.bin', 'wb')
f.write(data)
f.close()
To retrieve the float from the binary file, unpack the bytes returned by read(). Note that struct.unpack() always returns a tuple, so the code below prints (23.5,).
f=open('test.bin', 'rb')
data=f.read()
x=struct.unpack('f', data)
print (x)
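A format string can describe more than one field, and the choice of 'f' versus 'd' affects precision. The sketch below is one possible layout, not the only one: it packs an integer and a double with the '<id' format (little-endian, no padding) and separately shows that a single-precision round trip of 0.1 is not exact.

```python
import os
import struct
import tempfile

path = os.path.join(tempfile.mkdtemp(), "record.bin")  # hypothetical file name
RECORD = "<id"                      # little-endian: 4-byte int, 8-byte double

with open(path, "wb") as f:
    f.write(struct.pack(RECORD, 7, 23.5))

with open(path, "rb") as f:
    number, value = struct.unpack(RECORD, f.read())    # unpack returns a tuple

size = struct.calcsize(RECORD)      # total bytes per record

# 'f' stores only single precision, so 0.1 does not survive unchanged
single = struct.unpack("f", struct.pack("f", 0.1))[0]

print(number, value, size, single == 0.1)
```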
Simultaneous read/write
When a file is opened for writing (with 'w' or 'a'), it is not possible to read from it, and a file opened for reading cannot be written to. Attempting to do so raises io.UnsupportedOperation. The file must be closed and reopened in the other mode before the other operation can be performed.
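The error can be reproduced in a few lines. This sketch uses a temporary file and records the exception class name for each direction; the variable names are illustrative.

```python
import io
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "test.txt")

with open(path, "w") as f:
    f.write("Hello world")
    try:
        f.read()                    # reading from a write-only file
        read_error = None
    except io.UnsupportedOperation as exc:
        read_error = type(exc).__name__

with open(path, "r") as f:
    try:
        f.write("more")             # writing to a read-only file
        write_error = None
    except io.UnsupportedOperation as exc:
        write_error = type(exc).__name__

print(read_error, write_error)
```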
To perform both operations on the same open file, add the '+' character to the mode parameter. The 'w+' and 'r+' modes allow both write() and read() to be used without closing the file. The file object also supports a seek() method, which moves the stream to any desired byte position, for example back to the beginning.
f=open('test.txt','w+')
f.write('Hello world')
f.seek(0,0)
data=f.read()
print (data)
f.close()
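Unlike 'w+', the 'r+' mode does not truncate the file, which makes it suitable for overwriting part of an existing file in place. The sketch below uses binary mode ('r+b') so that the positions passed to seek() and returned by tell() are plain byte counts; the file is created in a temporary directory for the illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "test.bin")  # hypothetical file name
with open(path, "wb") as f:
    f.write(b"Hello world")

with open(path, "r+b") as f:        # read and write, existing content kept
    f.seek(6)                       # jump to the byte where "world" starts
    f.write(b"Python")              # overwrite in place, extending the file
    end_position = f.tell()         # current position after the write
    f.seek(0)                       # rewind to the beginning
    content = f.read()

print(end_position, content)
```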
The following table summarizes the methods available on a file-like object.
Sr.No | Method | Description
1 | close() | Closes the file. A closed file cannot be read or written any more.
2 | flush() | Flushes the internal buffer.
3 | fileno() | Returns the integer file descriptor.
4 | next() | Returns the next line from the file each time it is called. In Python 3, use the built-in next() function on the file object.
5 | read([size]) | Reads at most size bytes (characters in text mode) from the file, or fewer if EOF is reached first.
6 | readline([size]) | Reads one entire line from the file. A trailing newline character is kept in the string.
7 | readlines([sizehint]) | Reads until EOF using readline() and returns a list containing the lines.
8 | seek(offset[, whence]) | Sets the file's current position. whence: 0 = beginning, 1 = current position, 2 = end.
9 | tell() | Returns the file's current position.
10 | truncate([size]) | Truncates the file to at most size bytes.
11 | write(str) | Writes a string to the file. In Python 3 it returns the number of characters written.
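Several of the less common methods from the table can be exercised together. The following sketch is only an illustration: it uses a temporary binary file so that sizes are plain byte counts, and it combines flush(), truncate(), seek(), the built-in next() function and fileno().

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "methods_demo.bin")  # hypothetical file name

with open(path, "w+b") as f:
    f.write(b"line one\nline two\n")
    f.flush()                       # push buffered data to the operating system
    size_before = os.path.getsize(path)

    f.truncate(9)                   # keep only the first 9 bytes ("line one\n")
    f.seek(0)                       # rewind before reading
    first = next(f)                 # built-in next() returns the next line
    remaining = f.read()            # nothing is left after truncation
    descriptor = f.fileno()         # integer handle used by the operating system

size_after = os.path.getsize(path)
print(size_before, size_after, first, remaining)
```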