A Concise Python Tutorial

Python - Further Extensions

Code written in a compiled language such as C or C++ can be integrated with Python and imported from a Python script like any other module. Such code is called an "extension."

A Python extension module is nothing more than a normal C shared library. On Unix machines, these libraries usually end in .so (for shared object). On Windows machines, they are DLLs (dynamically linked libraries), and CPython expects an extension module to carry the .pyd suffix.
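To see which file suffixes your own interpreter accepts for extension modules, you can ask the standard library. This is a small sketch; the exact list it prints depends on the platform and Python build.

```python
import importlib.machinery

# Filename suffixes this interpreter recognizes for compiled extension modules.
suffixes = importlib.machinery.EXTENSION_SUFFIXES

for suffix in suffixes:
    print(suffix)
```

On Linux the list typically ends with ".so", while on Windows it ends with ".pyd".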

Pre-Requisites for Writing Extensions

To start writing your extension, you need the Python header files.

  1. On Unix machines, this usually requires installing a development package (for example, python-dev or python3-dev on Debian and Ubuntu).

  2. Windows users get these headers as part of the package when they use the binary Python installer.
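One way to check whether the headers are already installed is to ask the interpreter where it expects them. The sketch below only reports what it finds; the variable names are illustrative.

```python
import os
import sysconfig

# Directory where this interpreter expects its C header files.
include_dir = sysconfig.get_paths()["include"]

# Python.h is the header every extension module includes.
header = os.path.join(include_dir, "Python.h")
has_header = os.path.exists(header)

print(include_dir)
print("Python.h found" if has_header else "Python.h missing: install the development package")
```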

This chapter also assumes that you have a good knowledge of C or C++, since the extensions shown here are written in C.

First look at a Python Extension

For your first look at a Python extension module, group your code into four parts −

  1. The header file Python.h.

  2. The C functions you want to expose as the interface from your module.

  3. A table mapping the names of your functions, as Python developers see them, to the C functions inside the extension module.

  4. An initialization function.

The Header File Python.h

You need to include the Python.h header file in your C source file. It gives you access to the internal Python API used to hook your module into the interpreter.

Make sure to include Python.h before any other headers you might need, because it may define preprocessor macros that affect the standard headers on some systems. Follow the includes with the functions you want to call from Python.

The C Functions

The C implementation of each function always takes one of the following three signatures −

static PyObject *MyFunction(PyObject *self, PyObject *args);
static PyObject *MyFunctionWithKeywords(PyObject *self,
   PyObject *args,
   PyObject *kw);
/* For METH_NOARGS the second parameter is always NULL. */
static PyObject *MyFunctionWithNoArgs(PyObject *self, PyObject *unused);

Each one of the preceding declarations returns a Python object. There is no such thing as a void function in Python as there is in C. If you do not want your function to return a value, return the C equivalent of Python’s None value. The Python headers define a macro, Py_RETURN_NONE, that does this for us: it increments the reference count of None and returns it.
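The same rule is visible from the Python side: a function with no return statement still hands None back to its caller. A minimal illustration, with do_nothing as a made-up name:

```python
def do_nothing():
    """A function with no return statement."""
    pass

# The call still produces a value: the None object.
result = do_nothing()
print(result is None)
```

A C function that ends with Py_RETURN_NONE looks exactly like this to Python code that calls it.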

The names of your C functions can be whatever you like, as they are never seen outside of the extension module. They are defined as static functions.

Your C functions are usually named by combining the Python module and function names, as shown here −

static PyObject *module_func(PyObject *self, PyObject *args) {
   /* Do your stuff here. */
   Py_RETURN_NONE;
}

This implements a Python function called func inside a module named module. Pointers to your C functions go into the module's method table, which usually comes next in your source code.

The Method Mapping Table

The method table is a simple array of PyMethodDef structures. That structure looks something like this −

struct PyMethodDef {
   char *ml_name;
   PyCFunction ml_meth;
   int ml_flags;
   char *ml_doc;
};

Here is the description of the members of this structure −

  1. ml_name − This is the name of the function as the Python interpreter presents it when it is used in Python programs.

  2. ml_meth − This is the address of a function that has any one of the signatures described in the previous section.

  3. ml_flags − This tells the interpreter which of the three signatures ml_meth is using. This flag usually has a value of METH_VARARGS. It can be bitwise OR’ed with METH_KEYWORDS if you want to allow keyword arguments into your function. It can also have a value of METH_NOARGS, which indicates that the function accepts no arguments.

  4. ml_doc − This is the docstring for the function, which can be NULL if you do not feel like writing one.
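The ml_flags member is an ordinary bit field, so combining flags is plain bitwise arithmetic. The sketch below mirrors the macro values from CPython's methodobject.h as Python constants purely for illustration; in real extension code you use the C macros themselves.

```python
# Values mirror the METH_* macros in CPython's methodobject.h.
METH_VARARGS = 0x0001
METH_KEYWORDS = 0x0002
METH_NOARGS = 0x0004

# A function that takes positional and keyword arguments sets both bits.
flags = METH_VARARGS | METH_KEYWORDS

# Testing a bit tells you whether that calling convention is enabled.
accepts_keywords = bool(flags & METH_KEYWORDS)
takes_no_args = bool(flags & METH_NOARGS)

print(flags, accepts_keywords, takes_no_args)
```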

This table needs to be terminated with a sentinel entry that consists of NULL and 0 values for the appropriate members.

Example

For the function defined above, we have the following method mapping table −

static PyMethodDef module_methods[] = {
   { "func", (PyCFunction)module_func, METH_NOARGS, NULL },
   { NULL, NULL, 0, NULL }
};

The Initialization Function

The last part of your extension module is the initialization function. The Python interpreter calls this function when the module is loaded. The function must be named initModule, where Module is the name of the module. (This is the Python 2 convention used throughout this chapter; Python 3 instead looks for PyInit_Module and creates the module with PyModule_Create.)
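The naming rule can be written down as a one-line helper. This is only a sketch: init_symbol is a hypothetical function, and it assumes an ASCII module name.

```python
def init_symbol(module_name, python_major=2):
    """Return the C symbol the interpreter looks up when importing module_name."""
    # Python 2 uses init<name>; Python 3 uses PyInit_<name>.
    prefix = "init" if python_major == 2 else "PyInit_"
    return prefix + module_name

print(init_symbol("helloworld"))
print(init_symbol("helloworld", python_major=3))
```

If the exported symbol does not match the module's file name, the import fails even though the library itself loads.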

The initialization function needs to be exported from the library you will be building. The Python headers define PyMODINIT_FUNC to include the appropriate incantations for that to happen in the particular environment in which you are compiling. All you have to do is use it when defining the function.

Your C initialization function generally has the following overall structure −

PyMODINIT_FUNC initModule(void) {
   Py_InitModule3("Module", module_methods, "docstring...");
}

Here is the description of the Py_InitModule3 arguments −

  1. "Module" − This is the name of the module as a C string. It must match the Module part of the initialization function's name.

  2. module_methods − This is the mapping table name defined above.

  3. docstring − This is the docstring you want to give your extension module.

Putting all this together, it looks like the following −

#include <Python.h>
static PyObject *module_func(PyObject *self, PyObject *args) {
   /* Do your stuff here. */
   Py_RETURN_NONE;
}
static PyMethodDef module_methods[] = {
   { "func", (PyCFunction)module_func, METH_NOARGS, NULL },
   { NULL, NULL, 0, NULL }
};
PyMODINIT_FUNC initModule(void) {
   /* The first argument is the module name as a C string. */
   Py_InitModule3("Module", module_methods, "docstring...");
}

Example

A simple example that makes use of all the above concepts −

#include <Python.h>

/* Takes no arguments; the second parameter is always NULL for METH_NOARGS. */
static PyObject* helloworld(PyObject* self, PyObject* unused)
{
   return Py_BuildValue("s", "Hello, Python extensions!!");
}
static char helloworld_docs[] =
   "helloworld( ): Any message you want to put here!!\n";
static PyMethodDef helloworld_funcs[] = {
   {"helloworld", (PyCFunction)helloworld,
   METH_NOARGS, helloworld_docs},
   {NULL, NULL, 0, NULL}   /* sentinel */
};
/* Python 2 looks up init<modulename> when the module is imported. */
PyMODINIT_FUNC inithelloworld(void)
{
   Py_InitModule3("helloworld", helloworld_funcs,
      "Extension module example!");
}

Here the Py_BuildValue function is used to build a Python value; the "s" format code converts a C string into a Python string. Save the above code in a file named hello.c. The next section shows how to compile and install this module so that it can be called from a Python script.

Building and Installing Extensions

The distutils package makes it very easy to distribute Python modules, both pure Python and extension modules, in a standard way. (distutils was removed from the standard library in Python 3.12; setuptools provides the same setup and Extension names.) Modules are distributed in source form, then built and installed via a setup script, usually called setup.py.

For the above module, you need to prepare the following setup.py script −

from distutils.core import setup, Extension
setup(name='helloworld', version='1.0',
   ext_modules=[Extension('helloworld', ['hello.c'])])

Now run the following command, which performs all the needed compilation and linking steps with the right compiler and linker commands and flags, and copies the resulting dynamic library into an appropriate directory −

$ python setup.py install

On Unix-based systems, you will most likely need to run this command as root in order to have permission to write to the system-wide site-packages directory. This usually is not a problem on Windows.
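To see which directories are involved, you can ask the interpreter for its install locations. The sketch below prints the system-wide directory for platform-specific modules and the per-user alternative; the variable names are illustrative.

```python
import site
import sysconfig

# System-wide directory where compiled extension modules are installed.
system_dir = sysconfig.get_paths()["platlib"]

# Per-user site-packages directory, writable without root privileges.
user_dir = site.getusersitepackages()

print(system_dir)
print(user_dir)
```

Running python setup.py install --user targets the per-user directory, which avoids the need for root.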

Importing Extensions

Once you have installed your extension, you can import and call it in your Python script as follows −

import helloworld
print(helloworld.helloworld())

This would produce the following output −

Hello, Python extensions!!

Passing Function Parameters

As you will most likely want to define functions that accept arguments, you can use one of the other signatures for your C functions. For example, a function that accepts some number of parameters would be defined like this −

static PyObject *module_func(PyObject *self, PyObject *args) {
   /* Parse args and do something interesting here. */
   Py_RETURN_NONE;
}

The method table, with the entry for func now flagged METH_VARARGS, would look like this −

static PyMethodDef module_methods[] = {
   { "func", module_func, METH_VARARGS, NULL },
   { NULL, NULL, 0, NULL }
};

You can use the PyArg_ParseTuple API function to extract the arguments from the single PyObject pointer passed into your C function.

The first argument to PyArg_ParseTuple is the args argument. This is the object you will be parsing. The second argument is a format string describing the arguments as you expect them to appear. Each argument is represented by one or more characters in the format string, and the remaining arguments are the addresses of the C variables that receive the converted values, as follows −

static PyObject *module_func(PyObject *self, PyObject *args) {
   int i;
   double d;
   char *s;
   if (!PyArg_ParseTuple(args, "ids", &i, &d, &s)) {
      return NULL;
   }

   /* Do something interesting here. */
   Py_RETURN_NONE;
}

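After compiling the new version of your module and importing it, you can call the new function with exactly three positional arguments that match the format string "ids": an integer, a float, and a string −

module.func(1, 2.0, "three")
module.func(1, 2, "three")

Because the function is registered with METH_VARARGS and parsed with PyArg_ParseTuple, a call with a different number of arguments, or with keyword arguments such as module.func(i=1, d=2.0, s="three"), raises a TypeError. To accept keyword arguments, register the function with METH_VARARGS | METH_KEYWORDS, use the three-parameter signature, and parse with PyArg_ParseTupleAndKeywords.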
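The calling convention can be imitated in pure Python to make the behavior concrete. This is a rough stand-in, not how CPython parses arguments: the type checks are simplified, and func here is an ordinary Python function rather than the compiled one.

```python
def func(*args):
    """Rough Python stand-in for a METH_VARARGS function parsed with "ids"."""
    if len(args) != 3:
        raise TypeError("function takes exactly 3 arguments (%d given)" % len(args))
    i, d, s = args
    if not isinstance(i, int):
        raise TypeError("an integer is required")
    if not isinstance(d, (int, float)):
        raise TypeError("a float is required")
    if not isinstance(s, str):
        raise TypeError("a string is required")
    return None

# Three positional arguments in the declared order are accepted.
func(1, 2.0, "three")

# Keyword arguments are rejected because only positional arguments are declared.
try:
    func(1, s="three", d=2.0)
except TypeError as exc:
    keyword_error = exc
    print("rejected:", exc)
```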

The PyArg_ParseTuple Function

Here is the standard signature for the PyArg_ParseTuple function −

int PyArg_ParseTuple(PyObject* tuple,char* format,...)

This function returns 0 for errors, and a value not equal to 0 for success. Here tuple is the PyObject* that was the C function’s second argument, and format is a C string that describes mandatory and optional arguments.

The format codes used in this chapter are i (a Python integer stored in a C int), d (a Python float stored in a C double), and s (a Python string exposed as a C char*). A | in the format string marks all following arguments as optional.

Returning Values

Py_BuildValue takes in a format string much like PyArg_ParseTuple does. Instead of passing in the addresses of the values you are building, you pass in the actual values. Here is an example showing how to implement an add function −

static PyObject *foo_add(PyObject *self, PyObject *args) {
   int a;
   int b;
   if (!PyArg_ParseTuple(args, "ii", &a, &b)) {
      return NULL;
   }
   return Py_BuildValue("i", a + b);
}

This is what it would look like if implemented in Python −

def add(a, b):
   return (a + b)

You can return two values from your function as follows. Py_BuildValue packs them into a tuple, which the Python caller can unpack −

static PyObject *foo_add_subtract(PyObject *self, PyObject *args) {
   int a;
   int b;
   if (!PyArg_ParseTuple(args, "ii", &a, &b)) {
      return NULL;
   }
   return Py_BuildValue("ii", a + b, a - b);
}

This is what it would look like if implemented in Python −

def add_subtract(a, b):
   return (a + b, a - b)
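Calling the Python version shows what the caller receives from a two-value return. The sample arguments 7 and 3 are arbitrary.

```python
def add_subtract(a, b):
    return (a + b, a - b)

# The two results arrive as a single tuple...
result = add_subtract(7, 3)
print(type(result).__name__, result)

# ...which the caller can unpack into separate names.
total, difference = result
print(total, difference)
```

The C function foo_add_subtract hands back the same kind of object, because a Py_BuildValue format with two codes builds a tuple.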

The Py_BuildValue Function

Here is the standard signature for the Py_BuildValue function −

PyObject* Py_BuildValue(char* format,...)

Here format is a C string that describes the Python object to build. The following arguments of Py_BuildValue are C values from which the result is built. The PyObject* result is a new reference.

A format string joins zero or more format codes. An empty format string produces None, a single code produces that one object, and two or more codes produce a tuple. The codes used in this chapter are i (C int) and s (C string).

Code {…} builds dictionaries from an even number of C values, alternately keys and values. For example, Py_BuildValue("{issi}",23,"zig","zag",42) returns a dictionary like Python’s {23:'zig','zag':42}.
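The alternating key and value layout can be mimicked in Python to check the expected result. This sketch only imitates the pairing rule; it does not call the C API.

```python
# The C values in the order they are passed after the format string.
values = [23, "zig", "zag", 42]

# Even positions are keys, odd positions are the matching values.
built = dict(zip(values[0::2], values[1::2]))

print(built)
```

The pairing gives 23 mapped to 'zig' and 'zag' mapped to 42, matching the dictionary described above.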