A Concise Python Tutorial

Python - Serialization

Serialization in Python

Serialization is the process of converting an object into a format that can be stored, transmitted, and later reconstructed. In Python, this means converting data structures such as dictionaries or class instances into a byte stream or a text format.

Why Do We Use Serialization?

Serialization lets you save data to disk or send it over a network and later reconstruct it in its original form. This matters for tasks such as saving game state, storing user preferences, or exchanging data between different systems.

Serialization Libraries in Python

Python offers several libraries for serialization, each with its own advantages. Here is an overview of some commonly used ones −

  1. Pickle − Python’s built-in module for serializing and deserializing Python objects. It is simple to use, but its format is specific to Python, and unpickling data from an untrusted source can execute arbitrary code.

  2. JSON − JSON (JavaScript Object Notation) is a lightweight, human-readable data interchange format that is easy to parse. It is well suited to web APIs and cross-platform communication.

  3. YAML − YAML (YAML Ain’t Markup Language) is a human-readable data serialization standard. It supports complex data structures and is often used in configuration files.
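
To make the difference between the first two formats concrete, here is a minimal sketch that serializes the same dictionary with pickle and with json, using only the standard library. The variable names (record, pickled, json_text) are illustrative and not part of either module.

```python
import json
import pickle

record = {"name": "Alice", "age": 30, "city": "New York"}

# pickle produces a Python-specific byte string
pickled = pickle.dumps(record)

# json produces plain text that other languages can parse as well
json_text = json.dumps(record)

print(type(pickled).__name__)    # bytes
print(type(json_text).__name__)  # str

# Both formats restore a dictionary equal to the original
print(pickle.loads(pickled) == record)   # True
print(json.loads(json_text) == record)   # True
```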

Serialization Using Pickle Module

The pickle module serializes and deserializes Python objects. Serialization, also known as pickling, converts a Python object into a byte stream that can be stored in a file or transmitted over a network.

Deserialization, or unpickling, is the reverse process: it converts the byte stream back into a Python object.
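
As a small sketch of this round trip without touching the file system, pickle.dumps() and pickle.loads() work on in-memory bytes. The sample dictionary below is made up for illustration.

```python
import pickle

# A nested structure containing a tuple, a set and a list
original = {"point": (1, 2), "tags": {"a", "b"}, "scores": [90, 85]}

stream = pickle.dumps(original)   # pickling: object -> bytes
restored = pickle.loads(stream)   # unpickling: bytes -> object

print(restored == original)  # True: same contents, container types preserved
print(restored is original)  # False: unpickling builds a new object
```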

Serializing an Object

We can serialize an object and write it to a file using the dump() function. The file must be opened in binary write mode ('wb').

Example

In the following example, a dictionary is serialized and written to a file named "data.pkl" −

import pickle

data = {'name': 'Alice', 'age': 30, 'city': 'New York'}

# Open a file in binary write mode
with open('data.pkl', 'wb') as file:
   # Serialize the data and write it to the file
   pickle.dump(data, file)
   print("File created!!")

When the above code is executed, the byte representation of the dictionary is stored in the file data.pkl.

Deserializing an Object

To deserialize (unpickle) the object, use the load() function. The file must be opened in binary read mode ('rb'), as shown below −

import pickle

# Open the file in binary read mode
with open('data.pkl', 'rb') as file:
   # Deserialize the data
   data = pickle.load(file)
print(data)

This reads the byte stream from "data.pkl" and converts it back into the original dictionary, as shown below −

{'name': 'Alice', 'age': 30, 'city': 'New York'}

Pickle Protocols

A protocol is the convention used to convert Python objects to and from binary data.

The pickle module supports several serialization protocols; higher protocols generally offer more features and better performance. The module currently defines six protocols, numbered 0 through 5.

You can specify the protocol by passing it as the protocol argument to the pickle.dump() function.

To find the highest and default protocol versions supported by your Python installation, use the following constants defined in the pickle module (the values depend on the Python version) −

>>> import pickle
>>> pickle.HIGHEST_PROTOCOL
5
>>> pickle.DEFAULT_PROTOCOL
4
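
The following sketch shows one way to pass a protocol explicitly. It pickles the same dictionary once per protocol supported by the running interpreter and records the payload size; the names data and sizes are illustrative. No protocol is needed when loading, because pickle detects it from the data.

```python
import pickle

data = {"name": "Alice", "age": 30, "city": "New York"}

sizes = {}
for protocol in range(pickle.HIGHEST_PROTOCOL + 1):
    # Select the protocol with the keyword argument
    payload = pickle.dumps(data, protocol=protocol)
    # loads() detects the protocol from the payload itself
    assert pickle.loads(payload) == data
    sizes[protocol] = len(payload)

for protocol, size in sizes.items():
    print(f"protocol {protocol}: {size} bytes")
```

The printed sizes vary between Python versions, so none are shown here.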

Pickler and Unpickler Classes

The pickle module also defines the Pickler and Unpickler classes for finer control over serialization and deserialization. The "Pickler" class writes pickle data to a file, while the "Unpickler" class reads binary data from a file and reconstructs the original Python object.

Using the Pickler Class

To serialize a Python object with the Pickler class, create a Pickler for a file opened in binary write mode and call its dump() method −

from pickle import Pickler

# Open a file in binary write mode
with open("data.txt", "wb") as f:
   # Create a dictionary
   dct = {'name': 'Ravi', 'age': 23, 'Gender': 'M', 'marks': 75}
   # Create a Pickler object and write the dictionary to the file
   Pickler(f).dump(dct)
   print("Success!!")

After executing the above code, the byte representation of the dictionary is stored in the "data.txt" file. Despite the .txt extension, the file contains binary pickle data, not readable text.

Using the Unpickler Class

To deserialize the data from the binary file with the Unpickler class, create an Unpickler for a file opened in binary read mode and call its load() method −

from pickle import Unpickler

# Open the file in binary read mode
with open("data.txt", "rb") as f:
   # Create an Unpickler object and load the dictionary from the file
   dct = Unpickler(f).load()
   # Print the dictionary
   print(dct)

We get the following output −

{'name': 'Ravi', 'age': 23, 'Gender': 'M', 'marks': 75}
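
One situation where the two classes are convenient is writing several objects to the same stream and reading them back in order. The sketch below assumes an in-memory io.BytesIO buffer in place of a file on disk; the sample values are made up.

```python
import io
from pickle import Pickler, Unpickler

# An in-memory binary stream stands in for a file opened in 'wb' mode
buffer = io.BytesIO()

pickler = Pickler(buffer)
pickler.dump({'name': 'Ravi', 'age': 23})  # first object
pickler.dump([75, 82, 91])                 # second object, same stream

# Rewind and read the objects back in the order they were written
buffer.seek(0)
unpickler = Unpickler(buffer)
first = unpickler.load()
second = unpickler.load()

print(first)   # {'name': 'Ravi', 'age': 23}
print(second)  # [75, 82, 91]
```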

Pickling Custom Class Objects

The pickle module can also serialize and deserialize instances of custom classes. The class definition must be available both when pickling and when unpickling.

Example

In this example, an instance of the "Person" class is serialized and then deserialized, preserving the state of the object −

import pickle
class Person:
   def __init__(self, name, age, city):
      self.name = name
      self.age = age
      self.city = city

# Create an instance of the Person class
person = Person('Alice', 30, 'New York')

# Serialize the person object
with open('person.pkl', 'wb') as file:
   pickle.dump(person, file)

# Deserialize the person object
with open('person.pkl', 'rb') as file:
   person = pickle.load(file)

print(person.name, person.age, person.city)

After executing the above code, we get the following output −

Alice 30 New York
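
The reason the class definition must be available is that pickle stores the instance's attributes together with a reference to the class by name, not the class code itself. The sketch below illustrates this by inspecting the pickled bytes; the two-argument Person class is a simplified stand-in for the one above.

```python
import pickle

class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

payload = pickle.dumps(Person("Alice", 30))

# The class name and the attribute values are in the byte stream ...
print(b"Person" in payload)    # True
print(b"Alice" in payload)     # True
# ... but the method definitions are not
print(b"__init__" in payload)  # False

# Unpickling looks the class up by name and restores the attributes
restored = pickle.loads(payload)
print(isinstance(restored, Person), restored.name, restored.age)
```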

Using JSON for Serialization

JSON (JavaScript Object Notation) is a popular data interchange format. It is human-readable, easy to write, and language-independent, which makes it well suited to exchanging data between systems.

Python provides built-in support for JSON through the json module, which lets you serialize data to JSON and deserialize it back.

Serialization

Serialization is the process of converting a Python object into a JSON string or writing it to a file as JSON.

Example: Serialize Data to a JSON String

In the example below, we use the json.dumps() function to convert a Python dictionary to a JSON string −

import json

# Create a dictionary
data = {"name": "Alice", "age": 25, "city": "San Francisco"}

# Serialize the dictionary to a JSON string
json_string = json.dumps(data)
print(json_string)

Following is the output of the above code −

{"name": "Alice", "age": 25, "city": "San Francisco"}
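
json.dumps() also accepts formatting options. As a brief sketch, the indent and sort_keys parameters produce output that is easier to read and compare; the variable name pretty is illustrative.

```python
import json

data = {"name": "Alice", "age": 25, "city": "San Francisco"}

# indent pretty-prints the output; sort_keys orders keys alphabetically
pretty = json.dumps(data, indent=2, sort_keys=True)
print(pretty)
# {
#   "age": 25,
#   "city": "San Francisco",
#   "name": "Alice"
# }
```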

Example: Serialize Data and Write to a File

Here, we use the json.dump() function to write the serialized JSON data directly to a file −

import json

# Create a dictionary
data = {"name": "Alice", "age": 25, "city": "San Francisco"}

# Serialize the dictionary and write it to a file
with open("data.json", "w") as f:
   json.dump(data, f)
   print("Success!!")

Deserialization

Deserialization is the process of converting a JSON string, or JSON data read from a file, back into a Python object.

Example: Deserialize a JSON String

In the following example, we use the json.loads() function to convert a JSON string back into a Python dictionary −

import json

# JSON string
json_string = '{"name": "Alice", "age": 25, "city": "San Francisco"}'

# Deserialize the JSON string into a Python dictionary
loaded_data = json.loads(json_string)
print(loaded_data)

It will produce the following output −

{'name': 'Alice', 'age': 25, 'city': 'San Francisco'}

Example: Deserialize Data from a File

Here, we use the json.load() function to read JSON data from a file and convert it to a Python dictionary −

import json

# Open the file and load the JSON data into a Python dictionary
with open("data.json", "r") as f:
   loaded_data = json.load(f)
   print(loaded_data)

The output obtained is as follows −

{'name': 'Alice', 'age': 25, 'city': 'San Francisco'}
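
A JSON round trip does not always return exactly the object that went in, because JSON has fewer types than Python. The sketch below, using a made-up dictionary, shows two common changes: tuples come back as lists, and non-string dictionary keys come back as strings.

```python
import json

original = {"point": (1, 2), 1: "one"}

# Serialize to a JSON string and immediately parse it back
restored = json.loads(json.dumps(original))

print(restored)              # {'point': [1, 2], '1': 'one'}
print(restored == original)  # False
```

When exact Python types must survive the round trip, pickle is one alternative, with the security caveat noted earlier.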

Using YAML for Serialization

YAML (YAML Ain’t Markup Language) is a human-readable data serialization standard that is commonly used for configuration files and data interchange.

Python supports YAML serialization and deserialization through the third-party pyyaml package, which must be installed first, as shown below −

pip install pyyaml

Example: Serialize Data and Write to a YAML File

In the example below, the yaml.dump() function converts the Python dictionary data to YAML and writes it to the file "data.yaml".

Setting the "default_flow_style" parameter to False writes the output in block style, with one key per line, which is easier to read −

import yaml

# Create a Python dictionary
data = {"name": "Emily", "age": 35, "city": "Seattle"}

# Serialize the dictionary and write it to a YAML file
with open("data.yaml", "w") as f:
   yaml.dump(data, f, default_flow_style=False)
   print("Success!!")

Example: Deserialize Data from a YAML File

Here, the yaml.safe_load() function reads the YAML data from "data.yaml" and converts it into a Python dictionary (loaded_data). Because safe_load() resolves only basic YAML tags, it is the appropriate choice for input you do not fully trust −

import yaml

# Deserialize data from a YAML file
with open("data.yaml", "r") as f:
   loaded_data = yaml.safe_load(f)
   print(loaded_data)

The output is shown below. The keys appear in alphabetical order because yaml.dump() sorts keys by default when writing the file −

{'age': 35, 'city': 'Seattle', 'name': 'Emily'}