Python Data Persistence: A Concise Tutorial

Python Data Persistence - CSV Module

CSV stands for comma-separated values. The format is commonly used for exporting and importing data to and from spreadsheets and database tables. The csv module was added to Python's standard library as a result of PEP 305. It provides classes and functions for reading and writing CSV files in line with the recommendations of PEP 305.

CSV is a preferred export format of Microsoft's Excel spreadsheet software. However, the csv module can also handle data written in other dialects.

The csv API consists of the following writer and reader functions and classes:

writer()

This function in the csv module returns a writer object that converts data into delimited strings and writes them to a file object. The function takes a file object opened for writing as its parameter. Every row written ends with a line terminator. To prevent blank lines from appearing between rows, open the file with newline='' (a parameter of open(), not of writer()).

The writer object has the following methods:

writerow()

This method writes the items of an iterable (list, tuple or string) as a single row, separating them with commas. A string is itself an iterable of characters, so passing a bare string writes each character as a separate field.
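The behaviour of writerow() with different iterables can be checked with a small sketch. It assumes an in-memory io.StringIO buffer in place of a real file, which is a choice made here for illustration and not part of the tutorial's own example.

```python
import csv
import io

# An in-memory text buffer stands in for a file opened with newline=''.
buffer = io.StringIO()
writer = csv.writer(buffer)

writer.writerow(('Lata', 22, 45))   # tuple: three fields
writer.writerow(['Anil', 21, 56])   # list: three fields
writer.writerow('John')             # string: each character becomes a field

# The default 'excel' dialect ends every row with '\r\n'.
lines = buffer.getvalue().splitlines()
print(lines)
```

The last row comes out as J,o,h,n, which is why a single string value is normally wrapped in a list before it is passed to writerow().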

writerows()

This method takes an iterable of rows (for example, a list of tuples) as its parameter and writes each item as one comma-separated line in the file.

Example

The following example shows the use of the writer() function. First, a file is opened in 'w' mode. This file object is used to obtain a writer object. Each tuple in a list of tuples is then written to the file with the writerow() method.

import csv

persons = [('Lata', 22, 45), ('Anil', 21, 56), ('John', 20, 60)]
csvfile = open('persons.csv', 'w', newline='')  # newline='' avoids blank lines between rows
obj = csv.writer(csvfile)
for person in persons:
   obj.writerow(person)  # one tuple per CSV row
csvfile.close()

Output

This creates the file 'persons.csv' in the current directory. It contains the following data.

Lata,22,45
Anil,21,56
John,20,60

Instead of iterating over the list to write each row individually, we can use the writerows() method.

csvfile = open('persons.csv', 'w', newline='')
persons = [('Lata', 22, 45), ('Anil', 21, 56), ('John', 20, 60)]
obj = csv.writer(csvfile)
obj.writerows(persons)
csvfile.close()  # close the file; the writer object has no close() method
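A with block is one way to make sure the file is closed even if writing fails. The sketch below is a variation on the example above, not the tutorial's own code. The temporary directory and the names path and written are assumptions introduced so that the sketch does not overwrite an existing persons.csv.

```python
import csv
import os
import tempfile

persons = [('Lata', 22, 45), ('Anil', 21, 56), ('John', 20, 60)]

# A temporary directory keeps this sketch from overwriting a real persons.csv.
path = os.path.join(tempfile.mkdtemp(), 'persons.csv')

# The with block closes the file even if writerows() raises an exception.
with open(path, 'w', newline='') as csvfile:
    csv.writer(csvfile).writerows(persons)

# Read the raw text back (newline='' disables newline translation).
with open(path, newline='') as csvfile:
    written = csvfile.read()
print(repr(written))
```

Because the file was opened with newline='', each row ends with exactly the '\r\n' terminator that the default dialect writes, on every platform.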

reader()

This function returns a reader object, which is an iterator over the lines of the CSV file. Each row is returned as a list of strings. Using a regular for loop, the following example displays all lines in the file:

Example

csvfile = open('persons.csv', 'r', newline='')
obj = csv.reader(csvfile)
for row in obj:
   print(row)  # each row is a list of strings
csvfile.close()

Output

['Lata', '22', '45']
['Anil', '21', '56']
['John', '20', '60']
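As the output shows, the reader returns every field as a string, including the numeric columns. A minimal sketch of converting them back to integers follows. It assumes the same three-column layout as persons.csv and uses an in-memory io.StringIO buffer in place of the file.

```python
import csv
import io

# Same content as persons.csv, held in memory for this sketch.
data = io.StringIO('Lata,22,45\r\nAnil,21,56\r\nJohn,20,60\r\n', newline='')

records = []
for name, age, marks in csv.reader(data):
    # Fields arrive as strings; convert the numeric ones explicitly.
    records.append((name, int(age), int(marks)))
print(records)
```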

The reader object is an iterator. Hence it supports the next() function, which can be used instead of a for loop to display all lines in the CSV file.

csvfile = open('persons.csv', 'r', newline='')
obj = csv.reader(csvfile)
while True:
   try:
      row = next(obj)
      print(row)
   except StopIteration:
      break  # the iterator is exhausted
csvfile.close()
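A common practical use of next() on a reader is to pull off a header row before looping over the data. The sketch below assumes a file whose first line is a header, here simulated with an in-memory buffer, and also shows the two-argument form of next(), which returns a default instead of raising StopIteration.

```python
import csv
import io

# Hypothetical input with a header line, held in memory for this sketch.
data = io.StringIO('name,age,marks\r\nLata,22,45\r\nAnil,21,56\r\n', newline='')
reader = csv.reader(data)

header = next(reader)        # consume the first row
rows = list(reader)          # the remaining rows
leftover = next(reader, None)  # iterator is exhausted, so the default is returned

print(header)
print(rows)
print(leftover)
```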

As mentioned earlier, the csv module uses Excel as its default dialect. The module also defines a Dialect class. A dialect is a set of formatting conventions (delimiter, quoting, line terminator and so on) that describes a particular flavour of CSV. The list of available dialects can be obtained with the list_dialects() function.

>>> csv.list_dialects()
['excel', 'excel-tab', 'unix']
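To see how the three built-in dialects differ, the same row can be written once with each of them. This is a small illustrative sketch. The rendered dictionary and the in-memory buffers are assumptions introduced here.

```python
import csv
import io

row = ['Lata', 22, 45]
rendered = {}

for name in ('excel', 'excel-tab', 'unix'):
    buffer = io.StringIO()
    # The dialect argument selects delimiter, quoting and line terminator.
    csv.writer(buffer, dialect=name).writerow(row)
    rendered[name] = buffer.getvalue()
    print(name, repr(rendered[name]))
```

The 'excel-tab' dialect swaps the comma for a tab, while 'unix' quotes every field and ends rows with '\n' instead of '\r\n'.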

In addition to sequences, the csv module can export dictionary objects to a CSV file and read the file back to populate Python dictionary objects. For this purpose, the module defines the following classes:

DictWriter()

DictWriter() returns a DictWriter object. It is similar to the writer object, but each row is supplied as a dictionary. It needs a file object opened for writing and a list of the dictionary keys as the fieldnames parameter. The fieldnames list fixes the column order and supplies the header line written as the first line of the file.

writeheader()

This method writes the field names as a comma-separated header row, the first line in the file.

In the following example, a list of dictionaries is defined. Each item in the list is a dictionary. Using the writerows() method, they are written to the file as comma-separated rows.

persons=[
   {'name':'Lata', 'age':22, 'marks':45},
   {'name':'Anil', 'age':21, 'marks':56},
   {'name':'John', 'age':20, 'marks':60}
]
csvfile=open('persons.csv','w', newline='')
fields=list(persons[0].keys())
obj=csv.DictWriter(csvfile, fieldnames=fields)
obj.writeheader()
obj.writerows(persons)
csvfile.close()

The persons.csv file now has the following contents:

name,age,marks
Lata,22,45
Anil,21,56
John,20,60
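The example above relies on every dictionary having exactly the keys listed in fieldnames. When that is not guaranteed, the restval and extrasaction parameters of DictWriter control what happens. The sketch below uses made-up records with a missing key and an extra 'city' key, both of which are hypothetical additions for illustration only.

```python
import csv
import io

persons = [
    {'name': 'Lata', 'age': 22, 'marks': 45},
    {'name': 'Anil', 'age': 21},                                # no 'marks' key
    {'name': 'John', 'age': 20, 'marks': 60, 'city': 'Pune'},   # extra key (hypothetical)
]

buffer = io.StringIO()
writer = csv.DictWriter(
    buffer,
    fieldnames=['name', 'age', 'marks'],
    restval='NA',            # written when a dictionary lacks a field
    extrasaction='ignore',   # silently drop keys not in fieldnames
)
writer.writeheader()
writer.writerows(persons)

lines = buffer.getvalue().splitlines()
print(lines)
```

With the default extrasaction='raise', the third record would raise a ValueError instead of being written.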

DictReader()

This function returns a DictReader object for the underlying CSV file. Like the reader object, it is an iterator through which the contents of the file are retrieved.

csvfile=open('persons.csv','r', newline='')
obj=csv.DictReader(csvfile)

The class provides a fieldnames attribute, which returns the dictionary keys taken from the file's header row.

print (obj.fieldnames)
['name', 'age', 'marks']

Loop over the DictReader object to fetch the individual dictionary objects.

for row in obj:
   print (row)

This results in the following output:

OrderedDict([('name', 'Lata'), ('age', '22'), ('marks', '45')])
OrderedDict([('name', 'Anil'), ('age', '21'), ('marks', '56')])
OrderedDict([('name', 'John'), ('age', '20'), ('marks', '60')])

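DictReader yields OrderedDict objects on Python 3.6 and 3.7. From Python 3.8 onward the rows are plain dict objects. An OrderedDict can be converted to a normal dictionary by passing it to dict(). The import from the collections module is needed only to build a sample OrderedDict by hand, as done here.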

>>> from collections import OrderedDict
>>> r = OrderedDict([('name', 'Lata'), ('age', '22'), ('marks', '45')])
>>> dict(r)
{'name': 'Lata', 'age': '22', 'marks': '45'}
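Putting these pieces together, the rows from a DictReader can be turned into plain dictionaries with typed values in one pass. This is a minimal sketch under the assumption that the file has the name, age and marks columns used throughout this section. The data is held in an in-memory buffer so the sketch does not depend on persons.csv being present.

```python
import csv
import io

# Same content as the persons.csv written by DictWriter, held in memory.
data = io.StringIO(
    'name,age,marks\r\nLata,22,45\r\nAnil,21,56\r\nJohn,20,60\r\n',
    newline='',
)

reader = csv.DictReader(data)
persons = [
    # Build a plain dict per row, converting numeric fields from str to int.
    {'name': row['name'], 'age': int(row['age']), 'marks': int(row['marks'])}
    for row in reader
]
print(persons)
```

Building each dictionary explicitly works the same way whether the reader yields OrderedDict or dict rows.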