Requests - Quick Guide

Requests - Overview

Requests is an HTTP library that makes it easy to handle HTTP requests and responses in your web application. The library is written in Python.

The official Python Requests website, available at https://2.python-requests.org/en/master/, defines Requests as follows −

Requests is an elegant and simple HTTP library for Python, built for human beings.

Features of Requests

The features of Requests are discussed below −

Request

The Python Requests library provides easy-to-use methods for handling HTTP requests. Passing parameters and using request types such as GET, POST, PUT, and DELETE is straightforward.

Response

You can get the response in the format you need. The supported formats are text, binary, JSON, and raw.

Headers

The library allows you to read, update, or send new headers as per your requirements.

Timeouts

Timeouts can easily be added to a request made with the Python Requests library. This matters when you call a third-party URL and have to wait for its response.

It is always good practice to set a timeout on a request, so that the call either returns a response or fails with a timeout error within a known period. Without a timeout, the program may wait on the request indefinitely.
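A minimal sketch of a request with a timeout is given below. The helper name fetch_status and the 5-second default are assumptions made for illustration; they are not part of the library.

```python
import requests


def fetch_status(url, timeout_seconds=5):
    """Return the HTTP status code, or None if the request fails."""
    try:
        # timeout bounds how long to wait for the connection and for data
        # from the server; it is given in seconds
        response = requests.get(url, timeout=timeout_seconds)
    except requests.exceptions.Timeout:
        print("The request timed out")
        return None
    except requests.exceptions.RequestException as error:
        # RequestException is the base class of the errors Requests raises
        print("The request failed:", error)
        return None
    return response.status_code
```

Calling fetch_status('https://jsonplaceholder.typicode.com/users') would return the status code of that URL when the server answers in time, and None otherwise.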

Error handling

The requests module supports error handling for conditions such as connection errors, timeouts, TooManyRedirects, and the HTTP errors raised by Response.raise_for_status().
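As a rough illustration of raise_for_status(), the sketch below turns an error status into a message. The helper describe_failure is a hypothetical name, and the Response object is built by hand only so that the sketch runs without a network call; in practice the response would come from requests.get() or a similar call.

```python
import requests


def describe_failure(response):
    """Return a short message for a 4xx/5xx response, or None otherwise."""
    try:
        # raise_for_status() raises HTTPError for 4xx and 5xx status codes
        response.raise_for_status()
    except requests.exceptions.HTTPError as error:
        return "HTTP error: {}".format(error)
    return None


# Hand-built response objects, used here for illustration only.
not_found = requests.Response()
not_found.status_code = 404
ok = requests.Response()
ok.status_code = 200

print(describe_failure(not_found))
print(describe_failure(ok))
```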

Cookies

The library allows you to read, write, and update cookies for the requested URL.

Sessions

Sessions let you keep data, such as cookies, across requests. In addition, if the same host is called again and again, a session can reuse the underlying TCP connection, which improves performance.
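The sketch below shows one way a session carries settings from request to request. The x-user header is a made-up example, and prepare_request() is used so that nothing is actually sent.

```python
import requests

# Settings placed on a Session apply to every request made through it.
session = requests.Session()
session.headers.update({"x-user": "test123"})  # hypothetical header

# prepare_request() merges the session settings into the request
# without sending it.
request = requests.Request("GET", "https://jsonplaceholder.typicode.com/users")
prepared = session.prepare_request(request)
print(prepared.headers["x-user"])

session.close()
```

Sending requests with session.get() instead of requests.get() is what allows the connection to the host to be reused.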

SSL certificates

SSL certificates are a security feature of secure (https) URLs. When you use Requests, it verifies the SSL certificate of the given https URL. SSL verification is enabled by default in the Requests library, and an error is raised if the certificate is missing or cannot be verified.

Authentication

With HTTP authentication, the server asks for credentials such as a username and password when the client requests a URL. This adds a layer of security to the requests and responses exchanged between the client and the server.
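As a small sketch of how credentials travel with a request, the example below prepares a request with HTTP Basic authentication without sending it. The credentials are placeholders, and the URL is only a stand-in, since that server does not ask for authentication.

```python
import requests

# A (username, password) tuple selects HTTP Basic authentication.
request = requests.Request(
    "GET",
    "https://jsonplaceholder.typicode.com/users",
    auth=("user", "pass"),  # placeholder credentials
)
prepared = request.prepare()

# Basic auth sends "username:password", base64-encoded,
# in the Authorization header.
print(prepared.headers["Authorization"])
```

Because base64 is an encoding and not encryption, Basic authentication is normally combined with an https URL.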

Advantages of using Python Requests Library

Following are the advantages of using the Python Requests library −

  1. It is easy to use and to fetch data from a given URL.

  2. The Requests library can be used to scrape data from a website.

  3. Using requests, you can get, post, delete, and update the data for a given URL.

  4. The handling of cookies and sessions is very easy.

  5. Security is also taken care of with the help of the authentication module support.

Requests - Environment Setup

In this chapter, we will install Requests. To start working with the Requests module, we first need to install Python. So we are going to cover the following −

  1. Install Python

  2. Install Requests

Installing Python

Go to the official Python site, https://www.python.org/downloads/, as shown below, and click on the latest version available for Windows, Linux/Unix, or Mac OS. Download the 64-bit or 32-bit installer that matches your operating system.

[Image: python download]

Once the download completes, run the .exe file and follow the steps to install Python on your system.

[Image: python for windows]

The Python package manager, pip, is installed by default along with Python. To make it work globally on your system, add the location of Python to the PATH variable. The installer offers this at the start of the installation, so remember to check the checkbox that says Add to PATH. If you forget to check it, follow the steps given below to add Python to PATH.

To add Python to PATH, follow these steps −

Right-click on your Computer icon and click on Properties > Advanced System Settings.

It will display the screen shown below −

[Image: system properties]

Click on Environment Variables in the screen shown above. It will display the screen shown below −

[Image: environment variables]

Select Path, click on the Edit button, and add the location of your Python installation at the end. Now, let us check the Python version.

Checking the python version

E:\prequests>python --version
Python 3.7.3

Install Requests

Now that we have Python installed, we are going to install Requests.

Once Python is installed, the Python package manager, pip, is installed as well. Following is the command to check the pip version.

E:\prequests>pip --version
pip 19.1.1 from c:\users\xxxxx\appdata\local\programs\python\python37\lib\site-packages\pip (python 3.7)

We have pip installed, and its version is 19.1.1. Now we will use pip to install the Requests module.

The command is given below −

pip install requests
E:\prequests>pip install requests
Requirement already satisfied: requests in c:\users\xxxx\appdata\local\programs\python\python37\lib\site-packages (2.22.0)
Requirement already satisfied: certifi>=2017.4.17 in c:\users\kamat\appdata\local\programs\python\python37\lib\site-packages (from requests) (2019.3.9)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in c:\users\xxxxx\appdata\local\programs\python\python37\lib\site-packages (from requests) (1.25.3)
Requirement already satisfied: idna<2.9,>=2.5 in c:\users\xxxxxxx\appdata\local\programs\python\python37\lib\site-packages (from requests) (2.8)
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in c:\users\xxxxx\appdata\local\programs\python\python37\lib\site-packages (from requests) (3.0.4)

We already have the module installed, so the command prompt says Requirement already satisfied. If it were not installed, pip would have downloaded and installed the required packages.

To check the details of the installed requests module, you can use the following command −

pip show requests
E:\prequests>pip show requests
Name: requests
Version: 2.22.0
Summary: Python HTTP for Humans.
Home-page: http://python-requests.org
Author: Kenneth Reitz
Author-email: me@kennethreitz.org
License: Apache 2.0
Location: c:\users\xxxxx\appdata\local\programs\python\python37\lib\site-packages
Requires: certifi, idna, urllib3, chardet
Required-by:

The version of the Requests module is 2.22.0.

Requests - How Http Requests Work?

Python's Requests is an HTTP library that helps us exchange data between the client and the server. Suppose you have a UI with a form in which you enter user details. When you submit the form, the data is sent from the client to the server in an HTTP POST or PUT request so that it can be saved.

When you want the data back, you fetch it from the server with an HTTP GET request. This exchange, in which the client requests data and the server responds with it, is the core of the relationship between the client and the server.

The request is made to the given URL, which can be a secure (https) or non-secure (http) URL.

The request to the URL can be made using GET, POST, PUT, or DELETE. The most commonly used method is GET, which is mainly used when you want to fetch data from the server.

You can also send data to the URL as a query string, for example −

https://jsonplaceholder.typicode.com/users?id=9&username=Delphine

So here, we are passing id = 9 and username = Delphine to the URL. All the values are sent as key/value pairs after the question mark (?), and multiple parameters are separated by &.

With the Requests library, the data is sent to the URL as a dictionary of strings. If you want to pass id=9 and username=Delphine, you can do as follows −

payload = {'id': '9', 'username': 'Delphine'}

The requests library is then called as follows −

res = requests.get('https://jsonplaceholder.typicode.com/users', params=payload)

Using POST, we can do as follows−

res = requests.post('https://jsonplaceholder.typicode.com/users', data = {'id':'9', 'username':'Delphine'})

Using PUT

res = requests.put('https://jsonplaceholder.typicode.com/users', data = {'id':'9', 'username':'Delphine'})

Using DELETE

res = requests.delete('https://jsonplaceholder.typicode.com/users')

The response to the HTTP request can be read as text, as binary data, as JSON, or as a raw response. The details of the request and response are explained in the following chapters.

Requests - Working with Requests

In this chapter, we will understand how to work with the requests module. We will look into the following −

  1. Making HTTP Requests.

  2. Passing Parameters to HTTP Requests.

Making HTTP Requests

To make an HTTP request, we first need to import the requests module as shown below −

import requests

Let us now see how to make a call to a URL using the requests module.

Let us use the URL https://jsonplaceholder.typicode.com/users in the code to test the Requests module.

Example

import requests
getdata = requests.get('https://jsonplaceholder.typicode.com/users')
print(getdata)

The URL https://jsonplaceholder.typicode.com/users is called using the requests.get() method. The response object for the URL is stored in the getdata variable. The response has the status code 200, which means that we have got the response successfully.

Output

E:\prequests>python makeRequest.py
<Response [200]>

To get the content from the response, we can use getdata.content as shown below −

Example

import requests
getdata = requests.get('https://jsonplaceholder.typicode.com/users')
print(getdata.content)

getdata.content holds all the data available in the response, as bytes.

Output

E:\prequests>python makeRequest.py
b'[\n {\n  "id": 1,\n  "name": "Leanne Graham",\n  "username": "Bret",\n
"email": "Sincere@april.biz",\n  "address": {\n  "street": "Kulas Light
",\n  "suite": "Apt. 556",\n  "city": "Gwenborough",\n  "zipcode": "
92998-3874",\n  "geo": {\n "lat": "-37.3159",\n  "lng": "81.149
6"\n }\n },\n  "phone": "1-770-736-8031 x56442",\n  "website": "hild
egard.org",\n  "company": {\n "name": "Romaguera-Crona",\n  "catchPhr
ase": "Multi-layered client-server neural-net",\n  "bs": "harness real-time
e-markets"\n }\n }

Passing Parameters to HTTP Requests

Just requesting the URL is often not sufficient; we also need to pass parameters to the URL.

The parameters are mostly passed as key/value pairs, for example −

 https://jsonplaceholder.typicode.com/users?id=9&username=Delphine

So, we have id=9 and username=Delphine. Now we will see how to pass such data using the requests module.

Example

import requests
payload = {'id': 9, 'username': 'Delphine'}
getdata = requests.get('https://jsonplaceholder.typicode.com/users', params=payload)
print(getdata.content)

The details are stored as key/value pairs in the payload dictionary and passed to the params argument of the get() method.

Output

E:\prequests>python makeRequest.py
b'[\n {\n "id": 9,\n "name": "Glenna Reichert",\n "username": "Delphin
e",\n "email": "Chaim_McDermott@dana.io",\n "address": {\n "street":
"Dayna Park",\n "suite": "Suite 449",\n "city": "Bartholomebury",\n
"zipcode": "76495-3109",\n "geo": {\n "lat": "24.6463",\n
"lng": "-168.8889"\n }\n },\n "phone": "(775)976-6794 x41206",\n "
website": "conrad.com",\n "company": {\n "name": "Yost and Sons",\n
"catchPhrase": "Switchable contextually-based project",\n "bs": "aggregate
real-time technologies"\n }\n }\n]'

The response now contains the details of the user with id=9 and username=Delphine.

If you want to see how the URL looks after the parameters are added, you can read it from the url attribute of the response object.

Example

import requests
payload = {'id': 9, 'username': 'Delphine'}
getdata = requests.get('https://jsonplaceholder.typicode.com/users', params=payload)
print(getdata.url)

Output

E:\prequests>python makeRequest.py
https://jsonplaceholder.typicode.com/users?id=9&username=Delphine
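The way the query string is built can also be explored without sending a request. The sketch below prepares requests and reads the resulting URL; the helper name build_url is an assumption made for illustration.

```python
import requests

url = "https://jsonplaceholder.typicode.com/users"


def build_url(params):
    """Return the final URL Requests would call, without sending anything."""
    return requests.Request("GET", url, params=params).prepare().url


print(build_url({"id": 9, "username": "Delphine"}))
print(build_url({"id": [1, 2]}))               # a list repeats the key
print(build_url({"id": 9, "username": None}))  # None values are left out
```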

Handling Response for HTTP Requests

In this chapter, we will look in more detail at the response received from the requests module. We will discuss the following −

  1. Getting Response

  2. JSON Response

  3. RAW Response

  4. Binary Response

Getting Response

We will make a request to the URL using the requests.get() method.

import requests
getdata = requests.get('https://jsonplaceholder.typicode.com/users')

getdata holds the response object, which has all the details of the response. We can read the body in two ways, using .text and .content. Using response.text gives you the data back in text format, as shown below −

Example

E:\prequests>python makeRequest.py
[
 {
   "id": 1,
   "name": "Leanne Graham",
   "username": "Bret",
   "email": "Sincere@april.biz",
   "address": {
   "street": "Kulas Light",
   "suite": "Apt. 556",
   "city": "Gwenborough",
   "zipcode": "92998-3874",
   "geo": {
   "lat": "-37.3159",
   "lng": "81.1496"
   }
  },
   "phone": "1-770-736-8031 x56442",
   "website": "hildegard.org",
   "company": {
   "name": "Romaguera-Crona",
   "catchPhrase": "Multi-layered client-server neural-net",
   "bs": "harness real-time e-markets"
  }
},

You will see that the response is the same as what appears in the browser when you view the source for the URL, as shown below −

[Image: typicode]

You can also try a .html URL and look at the content using response.text; it will be the same as the view-source content for that URL in the browser.

Now, let us try response.content for the same URL and see the output.

Example

import requests
getdata = requests.get('https://jsonplaceholder.typicode.com/users')
print(getdata.content)

Output

E:\prequests>python makeRequest.py
b'[\n {\n "id": 1,\n "name": "Leanne Graham",\n "username": "Bret",\n
"email": "Sincere@april.biz",\n "address": {\n "street": "Kulas Light
",\n "suite": "Apt. 556",\n "city": "Gwenborough",\n "zipcode": "
92998-3874",\n "geo": {\n "lat": "-37.3159",\n "lng": "81.149
6"\n }\n },\n "phone": "1-770-736-8031 x56442",\n "website": "hild
egard.org",\n "company": {\n "name": "Romaguera-Crona",\n "catchPhr
ase": "Multi-layered client-server neural-net",\n "bs": "harness real-time
e-markets"\n }\n },\n {\n "id": 2,\n "name": "Ervin Howell",\n "us
ername": "Antonette",\n "email": "Shanna@melissa.tv",\n "address": {\n
"street": "Victor Plains",\n "suite": "Suite 879",\n "city": "Wisoky
burgh",\n "zipcode": "90566-7771",\n "geo": {\n "lat": "-43.950
9",\n "lng": "-34.4618"\n }\n },\n "phone": "010-692-6593 x091
25",\n "website": "anastasia.net",\n "company": {\n "name": "Deckow-C
rist",\n "catchPhrase": "Proactive didactic contingency",\n "bs": "syn
ergize scalable supply-chains"\n }\n },\n {\n "id": 3,\n "name": "Cle
mentine Bauch",\n "username": "Samantha",\n "email": "Nathan@yesenia.net",
\n "address": {\n "street": "Douglas Extension",\n "suite": "Suite
847",\n "city": "McKenziehaven",\n "zipcode": "59590-4157",\n "ge
o": {\n "lat": "-68.6102",\n "lng": "-47.0653"\n }\n },\n

The response is given in bytes, which is why the output starts with the letter b. With the requests module, you can get the encoding used and also change it if required. For example, to get the encoding you can use response.encoding.

print(getdata.encoding)

Output

utf-8

You can change the encoding to one of your choice as follows. The new value is used the next time response.text is read.

getdata.encoding = 'ISO-8859-1'
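response.text is the content of the response decoded with response.encoding, so the choice of encoding matters for non-ASCII characters. The sketch below illustrates this with plain Python bytes; the sample body is made up and does not come from the URL used above.

```python
# The same bytes decoded with two different encodings.
body = '{"name": "Zoë"}'.encode("utf-8")  # made-up sample body

print(body)                       # bytes, as in response.content
print(body.decode("utf-8"))       # the intended text
print(body.decode("ISO-8859-1"))  # a wrong encoding garbles the ë
```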

JSON Response

You can also get the response for the Http request in json format by using response.json() method as follows−

Example

import requests
getdata = requests.get('https://jsonplaceholder.typicode.com/users')
print(getdata.json())

Output

E:\prequests>python makeRequest.py
[{'id': 1, 'name': 'Leanne Graham', 'username': 'Bret', 'email': 'Sincere@april.
biz', 'address': {'street': 'Kulas Light', 'suite': 'Apt. 556', 'city': 'Gwenbor
ough', 'zipcode': '92998-3874', 'geo': {'lat': '-37.3159', 'lng': '81.1496'}},
'phone': '1-770-736-8031 x56442', 'website': 'hildegard.org', 'company': {'name':
'Romaguera-Crona', 'catchPhrase': 'Multi-layered client-server neural-net', 'bs
': 'harness real-time e-markets'}}]
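The value returned by response.json() is ordinary Python data, here a list of dictionaries. As a sketch of how such data can be read, the example below parses a shortened copy of the body with the standard json module; the shortened body is an assumption made to keep the example small.

```python
import json

# A shortened copy of the body shown above, as text.
body = '[{"id": 1, "name": "Leanne Graham", "address": {"city": "Gwenborough"}}]'

users = json.loads(body)  # a list of dictionaries, like response.json()
first = users[0]
print(first["name"])
print(first["address"]["city"])
```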

RAW Response

If you need the raw response for the URL, you can use response.raw. You must also pass stream=True to the get() method, as shown below −

Example

import requests
getdata = requests.get('https://jsonplaceholder.typicode.com/users', stream=True)
print(getdata.raw)

Output

E:\prequests>python makeRequest.py
<urllib3.response.HTTPResponse object at 0x000000A8833D7B70>

To read content from the raw response, you can do as follows −

print(getdata.raw.read(50))

Output

E:\prequests>python makeRequest.py
b'\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\x03\x95\x98[o\xe38\x12\x85\xdf\xe7W\x10y\
xda\x01F\x82.\xd4m\x9f\xdc\x9dd\xba\xb7\x93\xf4\x06q\xef4\x06\x83A@K\x15\x89m'
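The raw bytes above are not readable because they are still compressed: they begin with \x1f\x8b, the marker that gzip data starts with. The sketch below shows the same effect with the standard gzip module on a made-up sample body, without calling the URL.

```python
import gzip

original = b'[{"id": 1, "name": "Leanne Graham"}]'  # made-up sample body
compressed = gzip.compress(original)

print(compressed[:2])               # gzip data starts with the bytes 1f 8b
print(gzip.decompress(compressed))  # back to the readable body
```

response.content and response.text return the body after decompression, which is why they are readable while response.raw is not.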

Binary Response

To get a binary response, we can use response.content.

Example

import requests
getdata = requests.get('https://jsonplaceholder.typicode.com/users')
print(getdata.content)

Output

E:\prequests>python makeRequest.py
b'[\n {\n "id": 1,\n "name": "Leanne Graham",\n "username": "Bret",\n
"email": "Sincere@april.biz",\n "address": {\n "street": "Kulas Light
",\n "suite": "Apt. 556",\n "city": "Gwenborough",\n "zipcode": "
92998-3874",\n "geo": {\n "lat": "-37.3159",\n "lng": "81.149
6"\n }\n },\n "phone": "1-770-736-8031 x56442",\n "website": "hild
egard.org",\n "company": {\n "name": "Romaguera-Crona",\n "catchPhr
ase": "Multi-layered client-server neural-net",\n "bs": "harness real-time
e-markets"\n }\n },\n {\n "id": 2,\n "name": "Ervin Howell",\n "us
ername": "Antonette",\n "email": "Shanna@melissa.tv",\n "address": {\n
"street": "Victor Plains",\n "suite": "Suite 879",\n "city": "Wisoky
burgh",\n "zipcode": "90566-7771",\n "geo": {\n "lat": "-43.950
9",\n "lng": "-34.4618"\n }\n },\n "phone": "010-692-6593 x091
25",\n "website": "anastasia.net",\n "company": {\n "name": "Deckow-C
rist",\n "catchPhrase": "Proactive didactic contingency",\n "bs": "syn
ergize scalable supply-chains"\n }\n },\n {\n "id": 3,\n "name": "Cle
mentine Bauch",\n "username": "Samantha",\n "email": "Nathan@yesenia.net",
\n "address": {\n "street": "Douglas Extension",\n "suite": "Suite
847",\n "city": "McKenziehaven",\n "zipcode": "59590-4157",\n "ge
o": {\n "lat": "-68.6102",\n "lng": "-47.0653"\n }\n },\n

The response is given in bytes, which is why the output starts with the letter b. The binary response is mostly used for non-text content such as images.

Requests - HTTP Requests Headers

In the previous chapter, we saw how to make a request and get the response. This chapter explores the headers that are sent with a request and returned with a response. We are going to look into the following −

  1. Understanding Request Headers

  2. Custom Headers

  3. Response Headers

Understanding Request Headers

Open any URL in the browser, inspect it, and look at the Network tab of the developer tools.

You will see the response headers, request headers, payload, and so on.

For example, consider the following URL −

[Image: view source]

You can get the header details as follows −

Example

import requests
getdata = requests.get('https://jsonplaceholder.typicode.com/users', stream=True)
print(getdata.headers)

Output

E:\prequests>python makeRequest.py
{'Date': 'Sat, 30 Nov 2019 05:15:00 GMT', 'Content-Type': 'application/json; cha
rset=utf-8', 'Transfer-Encoding': 'chunked', 'Connection': 'keep-alive', 'Set-Co
okie': '__cfduid=d2b84ccf43c40e18b95122b0b49f5cf091575090900; expires=Mon, 30-De
c-19 05:15:00 GMT; path=/; domain=.typicode.com; HttpOnly', 'X-Powered-By': 'Exp
ress', 'Vary': 'Origin, Accept-Encoding', 'Access-Control-Allow-Credentials': 't
rue', 'Cache-Control': 'max-age=14400', 'Pragma': 'no-cache', 'Expires': '-1', '
X-Content-Type-Options': 'nosniff', 'Etag': 'W/"160d-1eMSsxeJRfnVLRBmYJSbCiJZ1qQ
"', 'Content-Encoding': 'gzip', 'Via': '1.1 vegur', 'CF-Cache-Status': 'HIT', 'A
ge': '2271', 'Expect-CT': 'max-age=604800, report-uri="https://report-uri.cloudf
lare.com/cdn-cgi/beacon/expect-ct"', 'Server': 'cloudflare', 'CF-RAY': '53da574f
f99fc331-SIN'}

To read any HTTP header, you can do so as follows −

getdata.headers["Content-Encoding"] # gzip

Custom Headers

You can also send headers to the URL being called, as shown below.

Example

import requests
headers = {'x-user': 'test123'}
getdata = requests.get('https://jsonplaceholder.typicode.com/users', headers=headers)

The header values passed must be strings, bytestrings, or Unicode. Requests does not change its behavior based on the custom headers passed; they are simply sent along with the request.
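To check what a custom header looks like on the outgoing request, the request can be prepared without being sent. The sketch below reuses the x-user header from the example above.

```python
import requests

headers = {"x-user": "test123"}
request = requests.Request(
    "GET", "https://jsonplaceholder.typicode.com/users", headers=headers
)
prepared = request.prepare()

# The custom header is part of the outgoing request,
# and header lookups ignore case.
print(prepared.headers["x-user"])
print(prepared.headers["X-User"])
```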

Response Headers

The response headers look like this when you check the URL in the Network tab of the browser developer tools −

[Image: view sourcecode]

To get the details of the headers with the requests module, use response.headers as shown below −

Example

import requests
getdata = requests.get('https://jsonplaceholder.typicode.com/users')
print(getdata.headers)

Output

E:\prequests>python makeRequest.py
{'Date': 'Sat, 30 Nov 2019 06:08:10 GMT', 'Content-Type': 'application/json; cha
rset=utf-8', 'Transfer-Encoding': 'chunked', 'Connection': 'keep-alive', 'Set-Co
okie': '__cfduid=de1158f1a5116f3754c2c353055694e0d1575094090; expires=Mon, 30-De
c-19 06:08:10 GMT; path=/; domain=.typicode.com; HttpOnly', 'X-Powered-By': 'Exp
ress', 'Vary': 'Origin, Accept-Encoding', 'Access-Control-Allow-Credentials': 't
rue', 'Cache-Control': 'max-age=14400', 'Pragma': 'no-cache', 'Expires': '-1', '
X-Content-Type-Options': 'nosniff', 'Etag': 'W/"160d-1eMSsxeJRfnVLRBmYJSbCiJZ1qQ
"', 'Content-Encoding': 'gzip', 'Via': '1.1 vegur', 'CF-Cache-Status': 'HIT', 'A
ge': '5461', 'Expect-CT': 'max-age=604800, report-uri="https://report-uri.cloudf
lare.com/cdn-cgi/beacon/expect-ct"', 'Server': 'cloudflare', 'CF-RAY': '53daa52f
3b7ec395-SIN'}

You can get any specific header you want as follows −

print(getdata.headers["Expect-CT"])

Output

max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"

You can also get the header details by using the get() method.

print(getdata.headers.get("Expect-CT"))

Output

max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
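The two ways of reading a header differ when the header is absent: indexing raises a KeyError, while get() returns None or a default. The sketch below shows this on a CaseInsensitiveDict, the dictionary type that Requests uses for headers; the sample values are copied from the output above and X-Missing is a made-up header name.

```python
from requests.structures import CaseInsensitiveDict

# Sample values copied from the output above.
headers = CaseInsensitiveDict({"Content-Encoding": "gzip", "Server": "cloudflare"})

print(headers["content-encoding"])      # lookups are case-insensitive
print(headers.get("X-Missing"))         # get() returns None if absent
print(headers.get("X-Missing", "n/a"))  # or a default of your choice
```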

Requests - Handling GET Requests

This chapter concentrates on GET requests, the most common and most frequently used request type. Making GET requests with the requests module is very easy. Here is a simple example that calls a URL using the GET method.

Example

import requests
getdata = requests.get('https://jsonplaceholder.typicode.com/users')
print(getdata.content)

getdata.content holds all the data available in the response.

Output

E:\prequests>python makeRequest.py
b'[\n {\n "id": 1,\n "name": "Leanne Graham",\n "username": "Bret",\n
"email": "Sincere@april.biz",\n "address": {\n "street": "Kulas Light
",\n "suite": "Apt. 556",\n "city": "Gwenborough",\n "zipcode": "
92998-3874",\n "geo": {\n "lat": "-37.3159",\n "lng": "81.149
6"\n }\n },\n "phone": "1-770-736-8031 x56442",\n "website": "hild
egard.org",\n "company": {\n "name": "Romaguera-Crona",\n "catchPhr
ase": "Multi-layered client-server neural-net",\n "bs": "harness real-time
e-markets"\n }\n }

You can also pass parameters to the get method using the params argument as shown below −

import requests
payload = {'id': 9, 'username': 'Delphine'}
getdata = requests.get('https://jsonplaceholder.typicode.com/users',
params=payload)
print(getdata.content)

The details are stored as key/value pairs in the payload dictionary and passed to the params argument of the get() method.

Output

E:\prequests>python makeRequest.py
b'[\n {\n "id": 9,\n "name": "Glenna Reichert",\n "username": "Delphin
e",\n "email": "Chaim_McDermott@dana.io",\n "address": {\n "street":
"Dayna Park",\n "suite": "Suite 449",\n "city": "Bartholomebury",\n
"zipcode": "76495-3109",\n "geo": {\n "lat": "24.6463",\n
"lng": "-168.8889"\n }\n },\n "phone": "(775)976-6794 x41206",\n "
website": "conrad.com",\n "company": {\n "name": "Yost and Sons",\n
"catchPhrase": "Switchable contextually-based project",\n "bs": "aggregate
real-time technologies"\n }\n }\n]'

Handling POST, PUT, PATCH and DELETE Requests

In this chapter, we will understand how to use the POST method with the requests library and how to pass parameters to the URL.

Using POST

For a POST request, the Requests library has the requests.post() method. An example is shown below:

import requests

myurl = 'https://postman-echo.com/post'
myparams = {'name': 'ABC', 'email':'xyz@gmail.com'}
res = requests.post(myurl, data=myparams)
print(res.text)

Output

E:\prequests>python makeRequest.py
{"args":{},"data":"","files":{},"form":{"name":"ABC","email":"xyz@gmail.com"},"headers":{"x-forwarded-proto":"https","host":"postman-echo.com","content-length":"30","accept":"*/*","accept-encoding":"gzip,deflate","content-type":"application/x-www-form-urlencoded","user-agent":"python-requests/2.22.0","x-forwarded-port":"443"},"json":{"name":"ABC","email":"xyz@gmail.com"},"url":"https://postman-echo.com/post"}

In the example shown above, the form data is passed as key-value pairs to the data parameter of requests.post(). We will also see how to work with PUT, PATCH, and DELETE in the requests module.
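To see what the data parameter puts on the wire, the request can be prepared without being sent, as sketched below. The json parameter is an addition that the example above does not use; it is shown only for comparison.

```python
import requests

myurl = "https://postman-echo.com/post"
myparams = {"name": "ABC", "email": "xyz@gmail.com"}

# data= sends the dictionary as a URL-encoded form body.
form_request = requests.Request("POST", myurl, data=myparams).prepare()
print(form_request.headers["Content-Type"])
print(form_request.body)

# json= serializes the dictionary to JSON instead.
json_request = requests.Request("POST", myurl, json=myparams).prepare()
print(json_request.headers["Content-Type"])
print(json_request.body)
```

The form body is 30 characters long, which matches the content-length reported in the output above.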

Using PUT

For a PUT request, the Requests library has the requests.put() method. An example is shown below.

import requests
myurl = 'https://postman-echo.com/put'
myparams = {'name': 'ABC', 'email':'xyz@gmail.com'}
res = requests.put(myurl, data=myparams)
print(res.text)

Output

E:\prequests>python makeRequest.py
{"args":{},"data":"","files":{},"form":{"name":"ABC","email":"xyz@gmail.com"},"h
eaders":{"x-forwarded-proto":"https","host":"postman-echo.com","content-length":
"30","accept":"*/*","accept-encoding":"gzip, deflate","content-type":"applicatio
n/x-www-form-urlencoded","user-agent":"python-requests/2.22.0","x-forwarded-port
":"443"},"json":{"name":"ABC","email":"xyz@gmail.com"},"url":"https://postman-ec
ho.com/put"}

Using PATCH

For a PATCH request, the Requests library has the requests.patch() method. An example is shown below.

import requests
myurl = 'https://postman-echo.com/patch'
res = requests.patch(myurl, data="testing patch")
print(res.text)

Output

E:\prequests>python makeRequest.py
{"args":{},"data":{},"files":{},"form":{},"headers":{"x-forwarded-proto":"https"
,"host":"postman-echo.com","content-length":"13","accept":"*/*","accept-encoding
":"gzip, deflate","user-agent":"python-requests/2.22.0","x-forwarded-port":"443"
},"json":null,"url":"https://postman-echo.com/patch"}

Using DELETE

For a DELETE request, the Requests library has the requests.delete() method. An example is shown below.

import requests
myurl = 'https://postman-echo.com/delete'
res = requests.delete(myurl, data="testing delete")
print(res.text)

Output

E:\prequests>python makeRequest.py
{"args":{},"data":{},"files":{},"form":{},"headers":{"x-forwarded-proto":"https"
,"host":"postman-echo.com","content-length":"14","accept":"*/*","accept-encoding
":"gzip, deflate","user-agent":"python-requests/2.22.0","x-forwarded-port":"443"
},"json":null,"url":"https://postman-echo.com/delete"}

Requests - File Upload

In this chapter, we will upload a file using Requests and read the contents of the uploaded file. We can do it using the files parameter, as shown in the example below.

We will use https://httpbin.org/post to upload the file.

Example

import requests
myurl = 'https://httpbin.org/post'
# Open the file in binary mode and close it once the upload is done
with open('test.txt', 'rb') as f:
   getdata = requests.post(myurl, files={'file': f})
print(getdata.text)

test.txt

File upload test using Requests

Output

E:\prequests>python makeRequest.py
{
  "args": {},
  "data": "",
  "files": {
   "file": "File upload test using Requests"
  },
  "form": {},
  "headers": {
   "Accept": "*/*",
   "Accept-Encoding": "gzip, deflate",
   "Content-Length": "175",
   "Content-Type": "multipart/form-data; boundary=28aee3a9d15a3571fb80d4d2a94bfd33",
   "Host": "httpbin.org",
   "User-Agent": "python-requests/2.22.0"
  },
  "json": null,
  "origin": "117.223.63.135, 117.223.63.135",
  "url": "https://httpbin.org/post"
}

It is also possible to send the file contents directly as a string together with a file name, as shown below:

Example

import requests
myurl = 'https://httpbin.org/post'
files = {'file': ('test1.txt', 'Welcome to TutorialsPoint')}
getdata = requests.post(myurl, files=files)
print(getdata.text)

Output

E:\prequests>python makeRequest.py
{
  "args": {},
  "data": "",
  "files": {
   "file": "Welcome to TutorialsPoint"
  },
  "form": {},
  "headers": {
   "Accept": "*/*",
   "Accept-Encoding": "gzip, deflate",
   "Content-Length": "170",
   "Content-Type": "multipart/form-data; boundary=f2837238286fe40e32080aa7e172be4f",
   "Host": "httpbin.org",
   "User-Agent": "python-requests/2.22.0"
  },
  "json": null,
  "origin": "117.223.63.135, 117.223.63.135",
  "url": "https://httpbin.org/post"
}
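
The tuple passed in files can also carry a content type as a third element. The sketch below prepares such an upload without sending it, to show how the file name and content type end up in the multipart body. The 'text/plain' value is an assumption chosen for illustration.

```python
import requests

# (filename, content, content_type): the third element is optional
files = {'file': ('test1.txt', 'Welcome to TutorialsPoint', 'text/plain')}

# Prepare the request without sending it, to inspect the multipart body
prepared = requests.Request('POST', 'https://httpbin.org/post', files=files).prepare()

print(prepared.headers['Content-Type'])              # multipart/form-data; boundary=...
print(b'filename="test1.txt"' in prepared.body)      # file name is part of the body
print(b'Content-Type: text/plain' in prepared.body)  # so is the content type
```

The boundary value is generated for each request, which is why it differs between the two outputs shown above.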

Requests - Working with Cookies

This chapter will discuss how to deal with cookies. You can get the cookies as well as send your cookies while calling the URL using the requests library.

When the URL https://jsonplaceholder.typicode.com/users is opened in the browser, we can see the details of its cookies, as shown below:

[Image: typicode sourcecode]

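You can read the cookies as shown below. The cookie names depend on what the server sets; __cfduid is the cookie that was returned when this example was run.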

Example

import requests
getdata = requests.get('https://jsonplaceholder.typicode.com/users')
print(getdata.cookies["__cfduid"])

Output

E:\prequests>python makeRequest.py
d1733467caa1e3431fb7f768fa79ed3741575094848

You can also send cookies when making a request.

Example

import requests
cookies = dict(test='test123')
getdata = requests.get('https://httpbin.org/cookies',cookies=cookies)
print(getdata.text)

Output

E:\prequests>python makeRequest.py
{
   "cookies": {
   "test": "test123"
}
}
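
A plain dict sends the cookie regardless of the target URL. When a cookie should only apply to one domain and path, a RequestsCookieJar can be used instead. The following is a minimal sketch that prepares two requests without sending them; the example.com URL is an assumption used only to show a non-matching host.

```python
import requests

# A cookie jar lets each cookie carry a domain and path, unlike a plain dict
jar = requests.cookies.RequestsCookieJar()
jar.set('test', 'test123', domain='httpbin.org', path='/cookies')

# Prepare (but do not send) two requests to see where the cookie applies
matching = requests.Request('GET', 'https://httpbin.org/cookies', cookies=jar).prepare()
other = requests.Request('GET', 'https://example.com/', cookies=jar).prepare()

print(matching.headers.get('Cookie'))   # cookie header for the matching URL
print(other.headers.get('Cookie'))      # no cookie header for the other host
```

The same jar can be passed to requests.get() through the cookies parameter in place of the dict used above.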

Requests - Working with Errors

This chapter discusses how to deal with the errors that can occur when working with the Requests library. It is always good practice to handle errors for all possible cases.

Error Exception

The requests module raises the following types of exceptions:

ConnectionError − Raised when there is a connection problem, for example a network failure or a DNS error.

Response.raise_for_status() − Raises an HTTPError for the requested URL when the status code indicates an error, such as 401 or 404.

HTTPError − Raised for an invalid HTTP response to the request made.

Timeout − Raised when the request for the URL times out.

TooManyRedirects − Raised when the request exceeds the maximum number of redirections allowed.

Example

Here is an example of the error raised when a request times out:

import requests
getdata = requests.get('https://jsonplaceholder.typicode.com/users',timeout=0.001)
print(getdata.text)

Output

raise ConnectTimeout(e, request=request)
requests.exceptions.ConnectTimeout: HTTPSConnectionPool(host='jsonplaceholder.ty
picode.com', port=443): Max retries exceeded with url: /users (Caused by Connect
TimeoutError(<urllib3.connection.VerifiedHTTPSConnection object at 0x000000B02AD
E76A0>, 'Connection to jsonplaceholder.typicode.com timed out. (connect timeout=
0.001)'))
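
The example above lets the exception end the program. A minimal sketch of catching the exception types listed in this chapter might look as follows. The helper name fetch and its get parameter are assumptions introduced for illustration; the parameter exists only so the function can be exercised without network access.

```python
import requests

def fetch(url, timeout=5, get=requests.get):
    """Return the response body, or a short description of what went wrong.

    fetch is a hypothetical helper, not part of the Requests API.
    """
    try:
        response = get(url, timeout=timeout)
        response.raise_for_status()          # raises HTTPError on 4xx/5xx
        return response.text
    except requests.exceptions.Timeout:
        # Checked before ConnectionError: ConnectTimeout inherits from both
        return 'timed out'
    except requests.exceptions.TooManyRedirects:
        return 'too many redirects'
    except requests.exceptions.HTTPError as err:
        return 'bad status: %s' % err.response.status_code
    except requests.exceptions.ConnectionError:
        return 'connection failed'
    except requests.exceptions.RequestException as err:
        # Base class of every exception raised by Requests
        return 'request failed: %s' % err
```

Calling fetch('https://jsonplaceholder.typicode.com/users', timeout=0.001) would then return a short message rather than a traceback.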

Requests - Handling Timeouts

Timeouts can be easily added to the URL you are requesting. When you call a third-party URL, you have no control over how long it takes to respond. It is always good practice to set a timeout, so that the call either returns a response or fails with an error within a known timespan. Without one, the request can wait indefinitely.

We can set a timeout on the request with the timeout parameter, whose value is given in seconds, as shown in the example below:

Example

import requests
getdata = requests.get('https://jsonplaceholder.typicode.com/users',timeout=0.001)
print(getdata.text)

Output

raise ConnectTimeout(e, request=request)
requests.exceptions.ConnectTimeout: HTTPSConnectionPool(host='jsonplaceholder.ty
picode.com', port=443): Max retries exceeded with url: /users (Caused by Connect
TimeoutError(<urllib3.connection.VerifiedHTTPSConnection object at 0x000000B02AD
E76A0>, 'Connection to jsonplaceholder.typicode.com timed out. (connect timeout=
0.001)'))

The timeout given is as follows−

getdata = requests.get('https://jsonplaceholder.typicode.com/users',timeout=0.001)

The execution throws a connection timeout error, as shown in the output. The timeout given is 0.001 seconds, which is too short for the request to get a response, so an error is thrown. Now, we will increase the timeout and check.

Example

import requests
getdata = requests.get('https://jsonplaceholder.typicode.com/users',timeout=1.000)
print(getdata.text)

Output

E:\prequests>python makeRequest.py
[
 {
   "id": 1,
   "name": "Leanne Graham",
   "username": "Bret",
   "email": "Sincere@april.biz",
   "address": {
   "street": "Kulas Light",
   "suite": "Apt. 556",
   "city": "Gwenborough",
   "zipcode": "92998-3874",
   "geo": {
   "lat": "-37.3159",
   "lng": "81.1496"
   }
  },
   "phone": "1-770-736-8031 x56442",
   "website": "hildegard.org",
   "company": {
   "name": "Romaguera-Crona",
   "catchPhrase": "Multi-layered client-server neural-net",
   "bs": "harness real-time e-markets"
 }

With a timeout of 1 second, the request has enough time to complete and we get the response for the requested URL.
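
The timeout parameter also accepts a (connect, read) tuple, which sets separate limits for establishing the connection and for waiting on data. The sketch below assumes illustrative values of 3.05 and 10 seconds; get_with_timeout and its get parameter are hypothetical names used so the logic can be run without a network.

```python
import requests

# (connect timeout, read timeout) in seconds; the values are illustrative
DEFAULT_TIMEOUT = (3.05, 10)

def get_with_timeout(url, timeout=DEFAULT_TIMEOUT, get=requests.get):
    """Hypothetical helper: GET a URL and report which phase timed out."""
    try:
        return get(url, timeout=timeout).status_code
    except requests.exceptions.ConnectTimeout:
        return 'connect timeout'   # no connection within timeout[0] seconds
    except requests.exceptions.ReadTimeout:
        return 'read timeout'      # server sent no data for timeout[1] seconds
```

A single number, as in the examples above, applies the same limit to both phases.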

Requests - Handling Redirection

This chapter looks at how the Requests library handles URL redirection.

Example

import requests
getdata = requests.get('http://google.com/')
print(getdata.status_code)
print(getdata.history)

The URL http://google.com/ is redirected with status code 301 (Moved Permanently) to the www.google.com host. The redirect is saved in the response history.

Output

When the above code is executed, we get the following result:

E:\prequests>python makeRequest.py
200
[<Response [301]>]

You can stop Requests from following redirects by passing allow_redirects=False. This works with the GET, POST, OPTIONS, PUT, DELETE, and PATCH methods.

Example

Here is an example.

import requests
getdata = requests.get('http://google.com/', allow_redirects=False)
print(getdata.status_code)
print(getdata.history)
print(getdata.text)

If you check the output now, the redirect is not followed, the status code is 301, and the history is empty.

Output

E:\prequests>python makeRequest.py
301
[]
<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>301 Moved</TITLE></HEAD><BODY>
<H1>301 Moved</H1>
The document has moved
<A HREF="http://www.google.com/">here</A>.
</BODY></HTML>
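
When redirects are not followed, the target of the redirect is still available in the Location header of the response. A minimal sketch, assuming a hypothetical helper named redirect_target:

```python
import requests

def redirect_target(response):
    """Return the Location header of a redirect response, else None.

    redirect_target is a hypothetical helper used only for illustration.
    """
    # is_redirect is True for redirect status codes that carry a Location header
    if response.is_redirect:
        return response.headers['Location']
    return None
```

For instance, redirect_target(requests.get('http://google.com/', allow_redirects=False)) would return the address that the 301 response points to.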

Requests - Handling History

You can get the history of a given URL using response.history. If the URL was redirected, each redirect response is stored in the history.

For history

import requests
getdata = requests.get('http://google.com/')
print(getdata.status_code)
print(getdata.history)

Output

E:\prequests>python makeRequest.py
200
[<Response [301]>]

The response.history property holds the Response objects that were created to complete the request, sorted from oldest to newest. It tracks every redirect followed for the requested URL.
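
Each entry in response.history is itself a Response object, so its status code and URL can be read like those of the final response. The sketch below collects the whole chain; redirect_chain is a hypothetical helper name.

```python
import requests

def redirect_chain(response):
    """Hypothetical helper: list (status_code, url) for every hop, oldest first."""
    hops = [(r.status_code, r.url) for r in response.history]
    hops.append((response.status_code, response.url))   # the final response
    return hops
```

Passing the result of requests.get('http://google.com/') to this function would list the 301 hop followed by the final 200 response.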

Requests - Handling Sessions

To maintain data between requests, you need a session. A session also reuses the underlying TCP connection when the same host is called again and again, which improves performance. Let us now see how to work with cookies on requests made through a session.

Adding cookies using session

import requests
req = requests.Session()
cookies = dict(test='test123')
getdata = req.get('https://httpbin.org/cookies',cookies=cookies)
print(getdata.text)

Output

E:\prequests>python makeRequest.py
{
   "cookies": {
   "test": "test123"
}
}

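Using a session, you can preserve cookie data across requests: cookies stored on the session, including those set by the server, are sent with later requests, whereas cookies passed to a single call (as above) apply to that call only. It is also possible to pass header data using the session, as shown below: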

Example

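import requests
req = requests.Session()
# Headers set on the session are sent with every request made through it
req.headers.update({'x-user1': 'ABC'})
# Headers passed to a single call are merged with the session headers
headers = {'x-user2': 'XYZ'}
getdata = req.get('https://httpbin.org/headers', headers=headers)
# httpbin echoes the request headers it received in the response body
print(getdata.text)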
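
To see how session-level and request-level settings combine without contacting a server, the request can be prepared through the session. This is a minimal sketch; the header and cookie values are the ones used in this chapter, and nothing is sent.

```python
import requests

session = requests.Session()
session.headers.update({'x-user1': 'ABC'})   # sent with every request
session.cookies.set('test', 'test123')       # stored on the session

# Merge session-level and request-level settings without sending anything
request = requests.Request('GET', 'https://httpbin.org/headers',
                           headers={'x-user2': 'XYZ'})
prepared = session.prepare_request(request)

print(prepared.headers['x-user1'])   # ABC
print(prepared.headers['x-user2'])   # XYZ
print(prepared.headers['Cookie'])    # test=test123
```

The prepared request could then be sent with session.send(prepared).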

Requests - SSL Certification

An SSL certificate is a security feature that comes with secure URLs. When you use the Requests library, it verifies the SSL certificate of the given https URL. SSL verification is enabled by default in the requests module, and an error is thrown if the certificate is missing or cannot be verified.

Working with secure URL

The following example works with a secure URL:

import requests
getdata = requests.get('https://jsonplaceholder.typicode.com/users')
print(getdata.text)

Output

E:\prequests>python makeRequest.py
[
   {
   "id": 1,
   "name": "Leanne Graham",
   "username": "Bret",
   "email": "Sincere@april.biz",
   "address": {
   "street": "Kulas Light",
   "suite": "Apt. 556",
   "city": "Gwenborough",
   "zipcode": "92998-3874",
   "geo": {
   "lat": "-37.3159",
   "lng": "81.1496"
   }
  },
   "phone": "1-770-736-8031 x56442",
   "website": "hildegard.org",
   "company": {
   "name": "Romaguera-Crona",
   "catchPhrase": "Multi-layered client-server neural-net",
   "bs": "harness real-time e-markets"
   }
  }
]

We get a response from the above https URL without any extra effort because the requests module was able to verify the SSL certificate.

You can disable SSL verification by adding verify=False, as shown in the example below.

Example

import requests
getdata = requests.get('https://jsonplaceholder.typicode.com/users', verify=False)
print(getdata.text)

You will get the output, along with a warning that the SSL certificate has not been verified and that adding certificate verification is advised.

Output

E:\prequests>python makeRequest.py
connectionpool.py:851: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecureRequestWarning)
[
 {
  "id": 1,
   "name": "Leanne Graham",
   "username": "Bret",
   "email": "Sincere@april.biz",
   "address": {
   "street": "Kulas Light",
   "suite": "Apt. 556",
   "city": "Gwenborough",
   "zipcode": "92998-3874",
   "geo": {
   "lat": "-37.3159",
   "lng": "81.1496"
   }
  },
   "phone": "1-770-736-8031 x56442",
   "website": "hildegard.org",
   "company": {
   "name": "Romaguera-Crona",
   "catchPhrase": "Multi-layered   client-server neural-net",
   "bs": "harness real-time e-markets"
  }
 }
]

You can also verify the SSL certificate against a certificate file stored on your machine by giving its path in the verify parameter, as shown below.

Example

import requests
# Raw string, so the backslashes in the Windows path are not treated as escapes
getdata = requests.get('https://jsonplaceholder.typicode.com/users', verify=r'C:\Users\AppData\Local\certificate.txt')
print(getdata.text)

Output

E:\prequests>python makeRequest.py
[
  {
   "id": 1,
   "name": "Leanne Graham",
   "username": "Bret",
   "email": "Sincere@april.biz",
   "address": {
   "street": "Kulas Light",
   "suite": "Apt. 556",
   "city": "Gwenborough",
   "zipcode": "92998-3874",
   "geo": {
   "lat": "-37.3159",
   "lng": "81.1496"
   }
  },
   "phone": "1-770-736-8031 x56442",
   "website": "hildegard.org",
   "company": {
   "name": "Romaguera-Crona",
   "catchPhrase": "Multi-layered   client-server neural-net",
   "bs": "harness real-time e-markets"
   }
  }
]
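
Instead of passing verify on every call, the setting can be stored once on a session. The following is a minimal sketch that assumes the CA bundle from the certifi package, which is installed as a dependency of Requests; in practice the path would point to your own certificate file.

```python
import os
import certifi
import requests

session = requests.Session()
# Point verification at an explicit CA bundle; certifi.where() returns the
# path of the bundle installed alongside Requests (used here as a stand-in)
session.verify = certifi.where()

print(os.path.isfile(session.verify))
```

Every request made through this session is then verified against that bundle unless a call overrides verify.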

Requests - Authentication

This chapter discusses the types of authentication available in the Requests module.

We are going to discuss the following:

  1. Working of Authentication in HTTP Requests

  2. Basic Authentication

  3. Digest Authentication

  4. OAuth2 Authentication

Working of Authentication in HTTP Requests

HTTP authentication happens on the server side: when the client requests a URL, the server asks for authentication information such as a username and password. This adds security to the requests and responses exchanged between the client and the server.

On the client side, this authentication information (the username and password) can be sent in the headers, to be validated on the server side. The server delivers the response only when the authentication is valid.

The Requests library provides the most commonly used authentication schemes in requests.auth: Basic Authentication (HTTPBasicAuth) and Digest Authentication (HTTPDigestAuth).

Basic Authentication

This is the simplest form of authenticating with a server. To work with basic authentication, we use the HTTPBasicAuth class available in the requests library.

Example

Here is a working example of how to use it.

import requests
from requests.auth import HTTPBasicAuth
response_data = requests.get('https://httpbin.org/basic-auth/admin/admin123', auth=HTTPBasicAuth('admin', 'admin123'))
print(response_data.text)

We are calling the URL https://httpbin.org/basic-auth/admin/admin123 with admin as the user and admin123 as the password.

This URL will not work without authentication, i.e. a user and password. The server gives back the response only when you supply the credentials through the auth parameter.

Output

E:\prequests>python makeRequest.py
{
   "authenticated": true,
   "user": "admin"
}
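
To make clear what basic authentication puts on the wire, the sketch below prepares the same request without sending it and compares the Authorization header with a value computed by hand. It also uses the (user, password) tuple, which Requests accepts as shorthand for HTTPBasicAuth.

```python
import base64
import requests

# A (user, password) tuple is shorthand for HTTPBasicAuth(user, password)
request = requests.Request('GET', 'https://httpbin.org/basic-auth/admin/admin123',
                           auth=('admin', 'admin123'))
prepared = request.prepare()   # builds the request without sending it

# Basic authentication is the base64 encoding of "user:password"
expected = 'Basic ' + base64.b64encode(b'admin:admin123').decode('ascii')
print(prepared.headers['Authorization'] == expected)   # True
```

Because base64 is an encoding and not encryption, basic authentication should only be used over https URLs.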

Digest Authentication

This is another form of authentication available in requests. We use the HTTPDigestAuth class from requests.auth.

Example

import requests
from requests.auth import HTTPDigestAuth
response_data = requests.get('https://httpbin.org/digest-auth/auth/admin/admin123', auth=HTTPDigestAuth('admin', 'admin123'))
print(response_data.text)

Output

E:\prequests>python makeRequest.py
{
   "authenticated": true,
   "user": "admin"
}

OAuth2 Authentication

To use OAuth2 authentication, we need the "requests_oauth2" library. To install it, run the following:

pip install requests_oauth2

The output in your terminal while installing will look similar to the following:

E:\prequests>pip install requests_oauth2
Collecting requests_oauth2
Downloading https://files.pythonhosted.org/packages/52/dc/01c3c75e6e7341a2c7a9
71d111d7105df230ddb74b5d4e10a3dabb61750c/requests-oauth2-0.3.0.tar.gz
Requirement already satisfied: requests in c:\users\xyz\appdata\local\programs
\python\python37\lib\site-packages (from requests_oauth2) (2.22.0)
Requirement already satisfied: six in c:\users\xyz\appdata\local\programs\pyth
on\python37\lib\site-packages (from requests_oauth2) (1.12.0)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in c:\use
rs\xyz\appdata\local\programs\python\python37\lib\site-packages (from requests
->requests_oauth2) (1.25.3)
Requirement already satisfied: certifi>=2017.4.17 in c:\users\xyz\appdata\loca
l\programs\python\python37\lib\site-packages (from requests->requests_oauth2) (2
019.3.9)
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in c:\users\xyz\appdata\l
ocal\programs\python\python37\lib\site-packages (from requests->requests_oauth2)
(3.0.4)
Requirement already satisfied: idna<2.9,>=2.5 in c:\users\xyz\appdata\local\pr
ograms\python\python37\lib\site-packages (from requests->requests_oauth2) (2.8)
Building wheels for collected packages: requests-oauth2
Building wheel for requests-oauth2 (setup.py) ... done
Stored in directory: C:\Users\xyz\AppData\Local\pip\Cache\wheels\90\ef\b4\43
3743cbbc488463491da7df510d41c4e5aa28213caeedd586
Successfully built requests-oauth2

We are done installing "requests-oauth2". To use the APIs of Google or Twitter, we need consent to access them, and this is obtained using OAuth2 authentication.

For OAuth2 authentication we need a client ID and a secret key. Details on how to get them are available at https://developers.google.com/identity/protocols/OAuth2.

Then log in to the Google API Console at https://console.developers.google.com/ and get the client ID and secret key.

Example

Here is an example of how to use "requests-oauth2".

import requests
from requests_oauth2.services import GoogleClient
google_auth = GoogleClient(
   client_id="xxxxxxxxxxxxxxxxxxxxxxxxxx.apps.googleusercontent.com",
   redirect_uri="http://localhost/auth/success.html",
)
a = google_auth.authorize_url(
   scope=["profile", "email"],
   response_type="code",
)
res = requests.get(a)
print(res.url)

We cannot follow the redirect to the given URL, because it requires logging in to a Gmail account. The example does show, however, that google_auth works and that the authorization URL is generated.

Output

E:\prequests>python oauthRequest.py
https://accounts.google.com/o/oauth2/auth?redirect_uri=
http%3A%2F%2Flocalhost%2Fauth%2Fsuccess.html&
client_id=xxxxxxxxxxxxxxxxxxxxx.apps.googleusercontent.com&
scope=profile+email&response_type=code

Requests - Event Hooks

Event hooks let us attach callbacks to a request. In the example below, we add a callback function that is called when the response is available.

Example

To add the callback, we use the hooks parameter, as shown in the example below:

import requests
def printData(r, *args, **kwargs):
   print(r.url)
   print(r.text)
getdata = requests.get('https://jsonplaceholder.typicode.com/users',
hooks={'response': printData})

Output

E:\prequests>python makeRequest.py
https://jsonplaceholder.typicode.com/users
[
{
   "id": 1,
   "name": "Leanne Graham",
   "username": "Bret",
   "email": "Sincere@april.biz",
   "address": {
   "street": "Kulas Light",
   "suite": "Apt. 556",
   "city": "Gwenborough",
   "zipcode": "92998-3874",
   "geo": {
    "lat": "-37.3159",
    "lng": "81.1496"
   }
  },
   "phone": "1-770-736-8031 x56442",
   "website": "hildegard.org",
   "company": {
   "name": "Romaguera-Crona",
   "catchPhrase": "Multi-layered client-server neural-net",
   "bs": "harness real-time e-markets"
   }
  }
]

You can also register multiple callback functions, as shown below:

Example

import requests
def printRequestedUrl(r, *args, **kwargs):
   print(r.url)
def printData(r, *args, **kwargs):
   print(r.text)
getdata = requests.get('https://jsonplaceholder.typicode.com/users', hooks={'response': [printRequestedUrl, printData]})

Output

E:\prequests>python makeRequest.py
https://jsonplaceholder.typicode.com/users
[
  {
   "id": 1,
   "name": "Leanne Graham",
   "username": "Bret",
   "email": "Sincere@april.biz",
   "address": {
   "street": "Kulas Light",
   "suite": "Apt. 556",
   "city": "Gwenborough",
   "zipcode": "92998-3874",
   "geo": {
   "lat": "-37.3159",
   "lng": "81.1496"
 }
  },
   "phone": "1-770-736-8031 x56442",
   "website": "hildegard.org",
   "company": {
   "name": "Romaguera-Crona",
   "catchPhrase": "Multi-layered client-server neural-net",
    "bs": "harness real-time e-markets"
   }
  }
]

You can also add the hook to a Session, as shown below:

Example

import requests
def printData(r, *args, **kwargs):
   print(r.text)
s = requests.Session()
s.hooks['response'].append(printData)
s.get('https://jsonplaceholder.typicode.com/users')

Output

E:\prequests>python makeRequest.py
[
 {
   "id": 1,
   "name": "Leanne Graham",
   "username": "Bret",
   "email": "Sincere@april.biz",
   "address": {
   "street": "Kulas Light",
   "suite": "Apt. 556",
   "city": "Gwenborough",
   "zipcode": "92998-3874",
   "geo": {
   "lat": "-37.3159",
   "lng": "81.1496"
   }
  },
   "phone": "1-770-736-8031 x56442",
   "website": "hildegard.org",
   "company": {
   "name": "Romaguera-Crona",
   "catchPhrase": "Multi-layered client-server neural-net",
   "bs": "harness real-time e-markets"
   }
  }
]
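
Hooks are not limited to printing. As a minimal sketch, the response hook below turns error status codes into exceptions for every request made through a session; raise_on_error is a hypothetical name chosen for this illustration.

```python
import requests

def raise_on_error(response, *args, **kwargs):
    """Response hook: turn 4xx/5xx responses into HTTPError exceptions."""
    response.raise_for_status()

session = requests.Session()
# Every response received through this session now passes through the hook
session.hooks['response'].append(raise_on_error)
```

With this in place, a call such as session.get(url) would raise requests.exceptions.HTTPError for an error response, without a separate raise_for_status() call after each request.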

Requests - Proxy

So far, we have seen clients connecting and talking to the server directly. With a proxy, the interaction happens as follows:

  1. The client sends a request to the proxy.

  2. The proxy sends the request to the server.

  3. The server sends back the response to the proxy.

  4. The proxy will send a response back to the client.

An HTTP proxy adds a layer of security for managing the data exchanged between client and server. The Requests library supports proxies through the proxies parameter, as shown below:

Example

import requests
proxies = {
'http': 'http://localhost:8080'
}
res = requests.get('http://httpbin.org/', proxies=proxies)
print(res.status_code)

The request is routed through the proxy at http://localhost:8080.

Output

200
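
The keys of the proxies dict are matched against the scheme of the URL being requested, so an entry for 'http' alone does not cover https URLs. The sketch below stores proxies on a session and uses select_proxy from requests.utils to show which entry would be chosen; the proxy addresses are placeholders, and no request is sent.

```python
import requests
from requests.utils import select_proxy

# Proxy URLs are placeholders; keys are the scheme of the *target* URL
proxies = {
    'http': 'http://localhost:8080',
    'https': 'http://localhost:8080',
}

session = requests.Session()
session.proxies.update(proxies)   # applies to every request on the session

# select_proxy reports which entry Requests would pick for a given URL
print(select_proxy('http://httpbin.org/', session.proxies))
print(select_proxy('https://httpbin.org/', {'http': 'http://localhost:8080'}))
```

The first lookup finds the 'http' entry, while the second finds nothing because the dict passed to it has no 'https' key.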

Requests - Web Scraping using Requests

We have already seen how to get data from a given URL using the Python requests library. We will now try to scrape data from the Tutorialspoint site at https://www.tutorialspoint.com/tutorialslibrary.htm using the following:

  1. Requests Library

  2. Beautiful soup library from python

We have already installed the Requests library, so let us now install the Beautiful Soup package. The official Beautiful Soup documentation is available at https://www.crummy.com/software/BeautifulSoup/bs4/doc/ in case you want to explore more of its functionality.

Installing Beautifulsoup

Beautiful Soup can be installed as shown below:

E:\prequests>pip install beautifulsoup4
Collecting beautifulsoup4
Downloading https://files.pythonhosted.org/packages/3b/c8/a55eb6ea11cd7e5ac4ba
cdf92bac4693b90d3ba79268be16527555e186f0/beautifulsoup4-4.8.1-py3-none-any.whl (
101kB)
|████████████████████████████████| 102kB 22kB/s
Collecting soupsieve>=1.2 (from beautifulsoup4)
Downloading https://files.pythonhosted.org/packages/81/94/03c0f04471fc245d08d0
a99f7946ac228ca98da4fa75796c507f61e688c2/soupsieve-1.9.5-py2.py3-none-any.whl
Installing collected packages: soupsieve, beautifulsoup4
Successfully installed beautifulsoup4-4.8.1 soupsieve-1.9.5

We now have the Python requests library and Beautiful Soup installed.

Let us now write the code that will scrape the data from the given URL.

Web scraping

import requests
from bs4 import BeautifulSoup
res = requests.get('https://www.tutorialspoint.com/tutorialslibrary.htm')
print("The status code is ", res.status_code)
print("\n")
soup_data = BeautifulSoup(res.text, 'html.parser')
print(soup_data.title)
print("\n")
print(soup_data.find_all('h4'))

Using the requests library, we can fetch the content from the given URL, and the Beautiful Soup library helps to parse it and extract the details we want.

You can use Beautiful Soup to fetch data by HTML tag, class, id, CSS selector, and in many other ways. Following is the output we get, in which we have printed the title of the page and all the h4 tags on the page.

Output

E:\prequests>python makeRequest.py
The status code is 200
<title>Free Online Tutorials and Courses</title>
[<h4>Academic</h4>, <h4>Computer Science</h4>, <h4>Digital Marketing</h4>, <h4>M
onuments</h4>,<h4>Machine Learning</h4>, <h4>Mathematics</h4>, <h4>Mobile Devel
opment</h4>,<h4>SAP</h4>, <h4>Software Quality</h4>, <h4>Big Data & Analyti
cs</h4>, <h4>Databases</h4>, <h4>Engineering Tutorials</h4>, <h4>Mainframe Devel
opment</h4>, <h4>Microsoft Technologies</h4>, <h4>Java Technologies</h4>,<h4>XM
L Technologies</h4>, <h4>Python Technologies</h4>, <h4>Sports</h4>, <h4>Computer
Programming</h4>,<h4>DevOps</h4>, <h4>Latest Technologies</h4>, <h4>Telecom</h4>, <h4>Exams Syllabus</h4>, <h4>UPSC IAS Exams</h4>, <h4>Web Development</h4>,
<h4>Scripts</h4>, <h4>Management</h4>,<h4>Soft Skills</h4>, <h4>Selected Readin
g</h4>, <h4>Misc</h4>]
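
The lookups by tag, class, id, and CSS selector mentioned above can be sketched as follows. To keep the example independent of the live page, a small inline document stands in for res.text; its markup is made up for illustration and does not reflect the real Tutorialspoint page.

```python
from bs4 import BeautifulSoup

# A small inline document stands in for res.text; its markup is made up
html = """
<html><head><title>Sample Page</title></head>
<body>
  <h4 class="topic">Academic</h4>
  <h4 class="topic">Computer Science</h4>
  <p id="intro">Pick a <a href="/python/index.htm">tutorial</a>.</p>
</body></html>
"""

soup = BeautifulSoup(html, 'html.parser')

print(soup.title.string)                                      # Sample Page
print([h.get_text() for h in soup.find_all('h4')])            # by tag
print([h.get_text() for h in soup.find_all(class_='topic')])  # by class
print(soup.find(id='intro').get_text())                       # by id
print(soup.select_one('p#intro a')['href'])                   # by CSS selector
```

In the scraping script above, the same calls would be applied to soup_data in place of soup.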