Requests: A Concise Tutorial
Requests - Overview
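Requests is an HTTP library that makes it easy to handle HTTP requests and responses in web applications. The library is written in Python.
The official Python Requests website, available at https://2.python-requests.org/en/master/, describes Requests as follows:
Requests is an elegant and simple HTTP library for Python, built for human beings.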
Features of Requests
The features of Requests are discussed below:
Timeouts
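With the Python requests library, you can easily add a timeout to the URL you request. This matters when you call a third-party URL and wait for its response.
Setting a timeout is always good practice, because we usually want the URL to answer with a response or an error within a known time span. Without one, the request can wait indefinitely.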
Requests - Environment Setup
In this chapter, we will install Requests. To start using the Requests module, we first need to install Python. We will therefore cover the following:
- Install Python
- Install Requests
Installing Python
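Go to the official Python website, https://www.python.org/downloads/, and click the latest available version for Windows, Linux/Unix, or Mac OS. Download the 64-bit or 32-bit build that matches your operating system.
Once the download completes, run the .exe file and follow the steps to install Python on your system.
The Python package manager, pip, is installed by default along with Python. To make it work globally on your system, add the Python location to the PATH variable. The installer offers this at the start, so remember to tick the "Add to PATH" checkbox. If you forget to tick it, follow the steps below to add it to PATH.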
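To add Python to PATH, follow these steps:
Right-click the Computer icon and click Properties > Advanced system settings.
Click Environment Variables.
Select Path, click the Edit button, and append the location of your Python installation at the end. Now, let us check the Python version.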
Install Requests
Now that Python is installed, we will install Requests.
Once Python is installed, the Python package manager (pip) is installed as well. The following command checks the pip version.
E:\prequests>pip --version
pip 19.1.1 from c:\users\xxxxx\appdata\local\programs\python\python37\lib\site-p
ackages\pip (python 3.7)
pip is installed, and its version is 19.1.1. Now we will use pip to install the Requests module.
The command is given below:
pip install requests
E:\prequests>pip install requests
Requirement already satisfied: requests in c:\users\xxxx\appdata\local\programs
\python\python37\lib\site-packages (2.22.0)
Requirement already satisfied: certifi>=2017.4.17 in c:\users\kamat\appdata\loca
l\programs\python\python37\lib\site-packages (from requests) (2019.3.9)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in c:\use
rs\xxxxx\appdata\local\programs\python\python37\lib\site-packages (from requests
) (1.25.3)
Requirement already satisfied: idna<2.9,>=2.5 in c:\users\xxxxxxx\appdata\local\pr
ograms\python\python37\lib\site-packages (from requests) (2.8)
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in c:\users\xxxxx\appdata\l
ocal\programs\python\python37\lib\site-packages (from requests) (3.0.4)
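The module was already installed here, so the command prompt shows Requirement already satisfied; had it not been installed, pip would have downloaded the packages required for installation.
To see the details of the installed requests module, you can use the following command: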
pip show requests
E:\prequests>pip show requests
Name: requests
Version: 2.22.0
Summary: Python HTTP for Humans.
Home-page: http://python-requests.org
Author: Kenneth Reitz
Author-email: me@kennethreitz.org
License: Apache 2.0
Location: c:\users\xxxxx\appdata\local\programs\python\python37\lib\site-package
s
Requires: certifi, idna, urllib3, chardet
Required-by:
The version of the Requests module is 2.22.0.
Requests - How Http Requests Work?
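Python's Requests is an HTTP library that helps us exchange data between a client and a server. Imagine a UI with a form where you enter user details; once you have entered them, you submit the data, which is simply an HTTP POST or PUT request from the client to the server to save the data.
When you want data, you fetch it from the server, which is an HTTP GET request. The client requests data and the server responds with the required data; this exchange is the core of the client-server relationship.
The request is sent to a given URL, which can be a secure (https) or non-secure (http) URL.
A request to a URL can use GET, POST, PUT, or DELETE. GET is the most widely used method and is mainly used when you want to fetch data from the server.
You can also send data to the URL as a query string, for example:
https://jsonplaceholder.typicode.com/users?id=9&username=Delphine
Here, we pass id=9 and username=Delphine to the URL. All values are sent as key/value pairs after the question mark (?), and multiple parameters are separated by the & symbol.
With the requests library, the data is passed to the URL as a dictionary of strings. If you want to pass id=9 and username=Delphine, you can do the following: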
payload = {'id': '9', 'username': 'Delphine'}
The requests library is then called as follows:
res = requests.get('https://jsonplaceholder.typicode.com/users', params=payload)
Using POST, we can do as follows−
res = requests.post('https://jsonplaceholder.typicode.com/users', data = {'id':'9', 'username':'Delphine'})
Requests - Working with Requests
In this chapter, we will see how to work with the requests module. We will cover the following:
- Making HTTP Requests
- Passing Parameters to HTTP Requests
Making HTTP Requests
To make an HTTP request, we first need to import the requests module, as shown below:
import requests
Now let us see how to call a URL using the requests module.
We will use the URL https://jsonplaceholder.typicode.com/users in the code to test the Requests module.
Example
import requests
getdata = requests.get('https://jsonplaceholder.typicode.com/users')
print(getdata.status_code)
The URL https://jsonplaceholder.typicode.com/users is called using the requests.get() method. The response object for the URL is stored in the getdata variable. Printing its status_code gives 200, which means the request succeeded.
Example
import requests
getdata = requests.get('https://jsonplaceholder.typicode.com/users')
print(getdata.content)
getdata.content prints all the data available in the response.
Output
E:\prequests>python makeRequest.py
b'[\n {\n "id": 1,\n "name": "Leanne Graham",\n "username": "Bret",\n
"email": "Sincere@april.biz",\n "address": {\n "street": "Kulas Light
",\n "suite": "Apt. 556",\n "city": "Gwenborough",\n "zipcode": "
92998-3874",\n "geo": {\n "lat": "-37.3159",\n "lng": "81.149
6"\n }\n },\n "phone": "1-770-736-8031 x56442",\n "website": "hild
egard.org",\n "company": {\n "name": "Romaguera-Crona",\n "catchPhr
ase": "Multi-layered client-server neural-net",\n "bs": "harness real-time
e-markets"\n }\n }
Passing Parameters to HTTP Requests
Requesting a URL is often not enough; we also need to pass parameters to it.
Parameters are usually passed as key/value pairs, for example:
https://jsonplaceholder.typicode.com/users?id=9&username=Delphine
So we have id=9 and username=Delphine. Now let us see how to pass such data to the requests module.
Example
import requests
payload = {'id': 9, 'username': 'Delphine'}
getdata = requests.get('https://jsonplaceholder.typicode.com/users', params=payload)
print(getdata.content)
The details are stored as key/value pairs in the payload object and passed to params inside the get() method.
Output
E:\prequests>python makeRequest.py
b'[\n {\n "id": 9,\n "name": "Glenna Reichert",\n "username": "Delphin
e",\n "email": "Chaim_McDermott@dana.io",\n "address": {\n "street":
"Dayna Park",\n "suite": "Suite 449",\n "city": "Bartholomebury",\n
"zipcode": "76495-3109",\n "geo": {\n "lat": "24.6463",\n
"lng": "-168.8889"\n }\n },\n "phone": "(775)976-6794 x41206",\n "
website": "conrad.com",\n "company": {\n "name": "Yost and Sons",\n
"catchPhrase": "Switchable contextually-based project",\n "bs": "aggregate
real-time technologies"\n }\n }\n]'
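The response now contains the details for id=9 and username=Delphine.
If you want to see what the URL looks like after the parameters are added, you can read it from the response object.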
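As a minimal sketch of how the parameters end up in the URL, the request below is only prepared, not sent, so it runs without network access. On a real response, the same value should be available as getdata.url; the variable names prepared and final_url are illustrative.

```python
import requests

payload = {'id': 9, 'username': 'Delphine'}

# Build the request without sending it; Requests encodes params into the URL.
prepared = requests.Request(
    'GET', 'https://jsonplaceholder.typicode.com/users', params=payload
).prepare()

final_url = prepared.url
print(final_url)
```

After a live call such as requests.get(url, params=payload), print(getdata.url) is the usual way to inspect the final URL.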
Handling Response for HTTP Requests
In this chapter, we will look in more detail at the response received from the requests module. We will discuss the following:
- Getting Response
- JSON Response
- RAW Response
- Binary Response
Getting Response
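We will make a request to the URL using the requests.get() method.
import requests
getdata = requests.get('https://jsonplaceholder.typicode.com/users')
getdata holds the response object, which contains all the details of the response. We can read the body in two ways: .text and .content. response.text returns the data as text, as shown below: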
Example
E:\prequests>python makeRequest.py
[
{
"id": 1,
"name": "Leanne Graham",
"username": "Bret",
"email": "Sincere@april.biz",
"address": {
"street": "Kulas Light",
"suite": "Apt. 556",
"city": "Gwenborough",
"zipcode": "92998-3874",
"geo": {
"lat": "-37.3159",
"lng": "81.1496"
}
},
"phone": "1-770-736-8031 x56442",
"website": "hildegard.org",
"company": {
"name": "Romaguera-Crona",
"catchPhrase": "Multi-layered client-server neural-net",
"bs": "harness real-time e-markets"
}
},
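The response is the same as what you see when viewing the source of that URL in a browser.
You can also try a .html URL and view its content with response.text; it will match the view-source content of that .html URL in the browser.
Now, let us try response.content for the same URL and look at the output.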
Example
import requests
getdata = requests.get('https://jsonplaceholder.typicode.com/users')
print(getdata.content)
Output
E:\prequests>python makeRequest.py
b'[\n {\n "id": 1,\n "name": "Leanne Graham",\n "username": "Bret",\n
"email": "Sincere@april.biz",\n "address": {\n "street": "Kulas Light
",\n "suite": "Apt. 556",\n "city": "Gwenborough",\n "zipcode": "
92998-3874",\n "geo": {\n "lat": "-37.3159",\n "lng": "81.149
6"\n }\n },\n "phone": "1-770-736-8031 x56442",\n "website": "hild
egard.org",\n "company": {\n "name": "Romaguera-Crona",\n "catchPhr
ase": "Multi-layered client-server neural-net",\n "bs": "harness real-time
e-markets"\n }\n },\n {\n "id": 2,\n "name": "Ervin Howell",\n "us
ername": "Antonette",\n "email": "Shanna@melissa.tv",\n "address": {\n
"street": "Victor Plains",\n "suite": "Suite 879",\n "city": "Wisoky
burgh",\n "zipcode": "90566-7771",\n "geo": {\n "lat": "-43.950
9",\n "lng": "-34.4618"\n }\n },\n "phone": "010-692-6593 x091
25",\n "website": "anastasia.net",\n "company": {\n "name": "Deckow-C
rist",\n "catchPhrase": "Proactive didactic contingency",\n "bs": "syn
ergize scalable supply-chains"\n }\n },\n {\n "id": 3,\n "name": "Cle
mentine Bauch",\n "username": "Samantha",\n "email":
"Nathan@yesenia.net",
\n "address": {\n "street": "Douglas Extension",\n "suite": "Suite
847",\n "city": "McKenziehaven",\n "zipcode": "59590-4157",\n "ge
o": {\n "lat": "-68.6102",\n "lng": "-47.0653"\n }\n },\n
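The response is returned as bytes, which is why it starts with the character b. With the requests module, you can find out which encoding is used and change it if needed. For example, to get the encoding, use response.encoding.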
print(getdata.encoding)
You can also get the response for the Http request in json format by using response.json() method as follows−
Example
import requests
getdata = requests.get('https://jsonplaceholder.typicode.com/users')
print(getdata.json())
Output
E:\prequests>python makeRequest.py
[{'id': 1, 'name': 'Leanne Graham', 'username': 'Bret', 'email': 'Sincere@april.
biz', 'address': {'street': 'Kulas Light', 'suite': 'Apt. 556', 'city': 'Gwenbor
ough', 'zipcode': '92998-3874', 'geo': {'lat': '-37.3159', 'lng': '81.1496'}},
'
phone': '1-770-736-8031 x56442', 'website': 'hildegard.org', 'company': {'name':
'Romaguera-Crona', 'catchPhrase': 'Multi-layered client-server neural-net', 'bs
': 'harness real-time e-markets'}}]
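The three views of a response body (.content, .text and .json()) are related roughly as sketched below. This uses only the standard library and a hand-written byte string as a stand-in for response.content, so it runs without a network call; the UTF-8 encoding is an assumption taken from the charset shown in the response headers.

```python
import json

# Stand-in for response.content: the raw bytes of the body.
content = b'[{"id": 1, "name": "Leanne Graham", "username": "Bret"}]'

# response.text is, in essence, the bytes decoded with response.encoding.
text = content.decode('utf-8')

# response.json() parses that text into Python lists and dictionaries.
data = json.loads(text)

print(type(content).__name__, type(text).__name__, type(data).__name__)
print(data[0]['username'])
```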
In case you need the raw response for the Http URL you can make use of response.raw, also add stream=True inside the get method as shown below−
Example
import requests
getdata = requests.get('https://jsonplaceholder.typicode.com/users', stream=True)
print(getdata.raw)
Binary Response
To get a binary response, we can use response.content.
Example
import requests
getdata = requests.get('https://jsonplaceholder.typicode.com/users')
print(getdata.content)
Output
E:\prequests>python makeRequest.py
b'[\n {\n "id": 1,\n "name": "Leanne Graham",\n "username": "Bret",\n
"email": "Sincere@april.biz",\n "address": {\n "street": "Kulas Light
",\n "suite": "Apt. 556",\n "city": "Gwenborough",\n "zipcode": "
92998-3874",\n "geo": {\n "lat": "-37.3159",\n "lng": "81.149
6"\n }\n },\n "phone": "1-770-736-8031 x56442",\n "website": "hild
egard.org",\n "company": {\n "name": "Romaguera-Crona",\n "catchPhr
ase": "Multi-layered client-server neural-net",\n "bs": "harness real-time
e-markets"\n }\n },\n {\n "id": 2,\n "name": "Ervin Howell",\n "us
ername": "Antonette",\n "email": "Shanna@melissa.tv",\n "address": {\n
"street": "Victor Plains",\n "suite": "Suite 879",\n "city": "Wisoky
burgh",\n "zipcode": "90566-7771",\n "geo": {\n "lat": "-43.950
9",\n "lng": "-34.4618"\n }\n },\n "phone": "010-692-6593 x091
25",\n "website": "anastasia.net",\n "company": {\n "name": "Deckow-C
rist",\n "catchPhrase": "Proactive didactic contingency",\n "bs": "syn
ergize scalable supply-chains"\n }\n },\n {\n "id": 3,\n "name": "Cle
mentine Bauch",\n "username": "Samantha",\n "email": "Nathan@yesenia.net",
\n "address": {\n "street": "Douglas Extension",\n "suite": "Suite
847",\n "city": "McKenziehaven",\n "zipcode": "59590-4157",\n "ge
o": {\n "lat": "-68.6102",\n "lng": "-47.0653"\n }\n },\n
The response is returned as bytes, which is why it starts with the character b. A binary response is mostly used for non-text content.
Requests - HTTP Requests Headers
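In the previous chapter, we saw how to make a request and get the response. This chapter explores the header section of the URL in more depth. We will look at the following:
- Understanding Request Headers
- Custom Headers
- Response Headers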
Understanding Request Headers
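Open any URL in the browser, inspect it, and look at the Network tab of the developer tools.
You will see the response headers, request headers, payload, and so on.
For example, you can get the header details of a URL as follows: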
Example
import requests
getdata = requests.get('https://jsonplaceholder.typicode.com/users', stream=True)
print(getdata.headers)
Output
E:\prequests>python makeRequest.py
{'Date': 'Sat, 30 Nov 2019 05:15:00 GMT', 'Content-Type': 'application/json; cha
rset=utf-8', 'Transfer-Encoding': 'chunked', 'Connection': 'keep-alive', 'Set-Co
okie': '__cfduid=d2b84ccf43c40e18b95122b0b49f5cf091575090900; expires=Mon, 30-De
c-19 05:15:00 GMT; path=/; domain=.typicode.com; HttpOnly', 'X-Powered-By': 'Exp
ress', 'Vary': 'Origin, Accept-Encoding', 'Access-Control-Allow-Credentials': 't
rue', 'Cache-Control': 'max-age=14400', 'Pragma': 'no-cache', 'Expires': '-1', '
X-Content-Type-Options': 'nosniff', 'Etag': 'W/"160d-1eMSsxeJRfnVLRBmYJSbCiJZ1qQ
"', 'Content-Encoding': 'gzip', 'Via': '1.1 vegur', 'CF-Cache-Status': 'HIT', 'A
ge': '2271', 'Expect-CT': 'max-age=604800, report-uri="https://report-uri.cloudf
lare.com/cdn-cgi/beacon/expect-ct"', 'Server': 'cloudflare', 'CF-RAY': '53da574f
f99fc331-SIN'}
To read any single HTTP header, you can do the following:
getdata.headers["Content-Encoding"]  # gzip
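The chapter outline lists custom headers, so here is a minimal sketch of how they might be sent with the headers parameter. The header values (my-tutorial-client/1.0 and the Accept type) are made-up examples, and the request is only prepared rather than sent, so the headers can be inspected without network access.

```python
import requests

# Hypothetical custom headers, chosen only for illustration.
custom_headers = {
    'User-Agent': 'my-tutorial-client/1.0',
    'Accept': 'application/json',
}

prepared = requests.Request(
    'GET', 'https://jsonplaceholder.typicode.com/users', headers=custom_headers
).prepare()

# These are the headers that would go out with the request.
sent_user_agent = prepared.headers['User-Agent']
print(dict(prepared.headers))
```

In a live call, the same dictionary is passed as requests.get(url, headers=custom_headers).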
Response Headers
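The response headers for a URL can be viewed in the Network tab of the browser developer tools.
To get the header details from the requests module, use response.headers as shown below: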
Example
import requests
getdata = requests.get('https://jsonplaceholder.typicode.com/users')
print(getdata.headers)
Output
E:\prequests>python makeRequest.py
{'Date': 'Sat, 30 Nov 2019 06:08:10 GMT', 'Content-Type': 'application/json; cha
rset=utf-8', 'Transfer-Encoding': 'chunked', 'Connection': 'keep-alive', 'Set-Co
okie': '__cfduid=de1158f1a5116f3754c2c353055694e0d1575094090; expires=Mon, 30-De
c-19 06:08:10 GMT; path=/; domain=.typicode.com; HttpOnly', 'X-Powered-By': 'Exp
ress', 'Vary': 'Origin, Accept-Encoding', 'Access-Control-Allow-Credentials': 't
rue', 'Cache-Control': 'max-age=14400', 'Pragma': 'no-cache', 'Expires': '-1', '
X-Content-Type-Options': 'nosniff', 'Etag': 'W/"160d-1eMSsxeJRfnVLRBmYJSbCiJZ1qQ
"', 'Content-Encoding': 'gzip', 'Via': '1.1 vegur', 'CF-Cache-Status': 'HIT', 'A
ge': '5461', 'Expect-CT': 'max-age=604800, report-uri="https://report-uri.cloudf
lare.com/cdn-cgi/beacon/expect-ct"', 'Server': 'cloudflare', 'CF-RAY': '53daa52f
3b7ec395-SIN'}
You can read any specific header you want as follows:
print(getdata.headers["Expect-CT"])
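Header lookups in Requests are case-insensitive, because response.headers is a CaseInsensitiveDict. The sketch below builds such a dictionary by hand (with two values copied from the output above) to show the behaviour offline; X-Not-Present is a made-up header name used to show a safe lookup with a default.

```python
from requests.structures import CaseInsensitiveDict

# Hand-built stand-in for response.headers.
headers = CaseInsensitiveDict({
    'Content-Type': 'application/json; charset=utf-8',
    'Content-Encoding': 'gzip',
})

# The key matches regardless of letter case.
encoding = headers['content-encoding']

# .get() avoids a KeyError when a header is absent.
missing = headers.get('X-Not-Present', 'n/a')

print(encoding, missing)
```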
Requests - Handling GET Requests
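This chapter focuses on GET requests, which are the most common and most frequently used. Working with GET in the requests module is straightforward. Below is a simple example that calls a URL with the GET method.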
Example
import requests
getdata = requests.get('https://jsonplaceholder.typicode.com/users')
print(getdata.content)
getdata.content prints all the data available in the response.
Output
E:\prequests>python makeRequest.py
b'[\n {\n "id": 1,\n "name": "Leanne Graham",\n "username": "Bret",\n
"email": "Sincere@april.biz",\n "address": {\n "street": "Kulas Light
",\n "suite": "Apt. 556",\n "city": "Gwenborough",\n "zipcode": "
92998-3874",\n "geo": {\n "lat": "-37.3159",\n "lng": "81.149
6"\n }\n },\n "phone": "1-770-736-8031 x56442",\n "website": "hild
egard.org",\n "company": {\n "name": "Romaguera-Crona",\n "catchPhr
ase": "Multi-layered client-server neural-net",\n "bs": "harness real-time
e-markets"\n }\n }
You can also pass parameters to the get method using the params argument, as shown below:
import requests
payload = {'id': 9, 'username': 'Delphine'}
getdata = requests.get('https://jsonplaceholder.typicode.com/users',
params=payload)
print(getdata.content)
The details are stored as key/value pairs in the payload object and passed to params inside the get() method.
Output
E:\prequests>python makeRequest.py
b'[\n {\n "id": 9,\n "name": "Glenna Reichert",\n "username": "Delphin
e",\n "email": "Chaim_McDermott@dana.io",\n "address": {\n "street":
"Dayna Park",\n "suite": "Suite 449",\n "city": "Bartholomebury",\n
"zipcode": "76495-3109",\n "geo": {\n "lat": "24.6463",\n
"lng": "-168.8889"\n }\n },\n "phone": "(775)976-6794 x41206",\n "
website": "conrad.com",\n "company": {\n "name": "Yost and Sons",\n
"catchPhrase": "Switchable contextually-based project",\n "bs": "aggregate
real-time technologies"\n }\n }\n]'
Handling POST, PUT, PATCH and DELETE Requests
In this chapter, we will see how to use the POST method with the requests library and how to pass parameters to the URL.
Using POST
For a POST request, the Requests library provides the requests.post() method, shown in the example below:
import requests
myurl = 'https://postman-echo.com/post'
myparams = {'name': 'ABC', 'email':'xyz@gmail.com'}
res = requests.post(myurl, data=myparams)
print(res.text)
Output
E:\prequests>python makeRequest.py
{"args":{},"data":"","files":{},"form":{"name":"ABC","email":"xyz@gmail.com"},"headers":{"x-forwarded-proto":"https","host":"postman-echo.com","content-length":"30","accept":"*/*","accept-encoding":"gzip,deflate","content-type":"application/x-www-form-urlencoded","user-agent":"python-requests/2.22.0","x-forwarded-port":"443"},"json":{"name":"ABC","email":"xyz@gmail.com"},"url":"https://postman-echo.com/post"}
In the example above, the form data is passed as key/value pairs to the data parameter of requests.post(). We will also see how to use PUT, PATCH, and DELETE with the requests module.
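To see what the data parameter actually produces, the sketch below prepares the same POST without sending it and inspects the encoded body and Content-Type. For comparison it also prepares a request with the json parameter, which is not used in the original example and is shown here only as an alternative way to send the same dictionary.

```python
import requests

myurl = 'https://postman-echo.com/post'
myparams = {'name': 'ABC', 'email': 'xyz@gmail.com'}

# data= sends the dictionary as a URL-encoded form body.
form_request = requests.Request('POST', myurl, data=myparams).prepare()
form_type = form_request.headers['Content-Type']
form_body = form_request.body

# json= serializes the dictionary as JSON instead.
json_request = requests.Request('POST', myurl, json=myparams).prepare()
json_type = json_request.headers['Content-Type']

print(form_type, form_body)
print(json_type)
```

The form body is 30 characters long, which matches the content-length of 30 echoed back in the output above.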
Using PUT
For a PUT request, the Requests library provides the requests.put() method, shown in the example below.
import requests
myurl = 'https://postman-echo.com/put'
myparams = {'name': 'ABC', 'email':'xyz@gmail.com'}
res = requests.put(myurl, data=myparams)
print(res.text)
Output
E:\prequests>python makeRequest.py
{"args":{},"data":"","files":{},"form":{"name":"ABC","email":"xyz@gmail.com"},"h
eaders":{"x-forwarded-proto":"https","host":"postman-echo.com","content-length":
"30","accept":"*/*","accept-encoding":"gzip, deflate","content-type":"applicatio
n/x-www-form-urlencoded","user-agent":"python-requests/2.22.0","x-forwarded-port
":"443"},"json":{"name":"ABC","email":"xyz@gmail.com"},"url":"https://postman-ec
ho.com/put"}
Using PATCH
For a PATCH request, the Requests library provides the requests.patch() method, shown in the example below.
import requests
myurl = 'https://postman-echo.com/patch'
res = requests.patch(myurl, data="testing patch")
print(res.text)
Output
E:\prequests>python makeRequest.py
{"args":{},"data":{},"files":{},"form":{},"headers":{"x-forwarded-proto":"https"
,"host":"postman-echo.com","content-length":"13","accept":"*/*","accept-encoding
":"gzip, deflate","user-agent":"python-requests/2.22.0","x-forwarded-port":"443"
},"json":null,"url":"https://postman-echo.com/patch"}
Using DELETE
For a DELETE request, the Requests library provides the requests.delete() method, shown in the example below.
import requests
myurl = 'https://postman-echo.com/delete'
res = requests.delete(myurl, data="testing delete")
print(res.text)
Output
E:\prequests>python makeRequest.py
{"args":{},"data":{},"files":{},"form":{},"headers":{"x-forwarded-proto":"https"
,"host":"postman-echo.com","content-length":"14","accept":"*/*","accept-encoding
":"gzip, deflate","user-agent":"python-requests/2.22.0","x-forwarded-port":"443"
},"json":null,"url":"https://postman-echo.com/delete"}
Requests - File Upload
In this chapter, we will upload a file using Requests and read the contents of the uploaded file. We can do this with the files parameter, as shown in the example below.
We will use https://httpbin.org/post to upload the file.
Example
import requests
myurl = 'https://httpbin.org/post'
files = {'file': open('test.txt', 'rb')}
getdata = requests.post(myurl, files=files)
print(getdata.text)
Output
E:\prequests>python makeRequest.py
{
"args": {},
"data": "",
"files": {
"file": "File upload test using Requests"
},
"form": {},
"headers": {
"Accept": "*/*",
"Accept-Encoding": "gzip, deflate",
"Content-Length": "175",
"Content-Type": "multipart/form-data;
boundary=28aee3a9d15a3571fb80d4d2a94bf
d33",
"Host": "httpbin.org",
"User-Agent": "python-requests/2.22.0"
},
"json": null,
"origin": "117.223.63.135, 117.223.63.135",
"url": "https://httpbin.org/post"
}
It is also possible to send the contents of the file as shown below−
Example
import requests
myurl = 'https://httpbin.org/post'
files = {'file': ('test1.txt', 'Welcome to TutorialsPoint')}
getdata = requests.post(myurl, files=files)
print(getdata.text)
Output
E:\prequests>python makeRequest.py
{
"args": {},
"data": "",
"files": {
"file": "Welcome to TutorialsPoint"
},
"form": {},
"headers": {
"Accept": "*/*",
"Accept-Encoding": "gzip, deflate",
"Content-Length": "170",
"Content-Type": "multipart/form-data; boundary=f2837238286fe40e32080aa7e172b
e4f",
"Host": "httpbin.org",
"User-Agent": "python-requests/2.22.0"
},
"json": null,
"origin": "117.223.63.135, 117.223.63.135",
"url": "https://httpbin.org/post"
}
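The sketch below prepares the same in-memory upload without sending it, to show that the files parameter produces a multipart/form-data body. The commented lines show one way a file on disk might be uploaded with a with block so that it is closed afterwards; that variant assumes test.txt exists and is not executed here.

```python
import requests

myurl = 'https://httpbin.org/post'
files = {'file': ('test1.txt', 'Welcome to TutorialsPoint')}

# Prepare (do not send) the upload to inspect what would go over the wire.
prepared = requests.Request('POST', myurl, files=files).prepare()
content_type = prepared.headers['Content-Type']
body = prepared.body

print(content_type)

# For a file on disk, a with block closes the handle after the upload:
# with open('test.txt', 'rb') as fh:
#     getdata = requests.post(myurl, files={'file': fh})
```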
Requests - Working with Cookies
This chapter discusses how to deal with cookies. With the requests library, you can read the cookies a URL returns and send your own cookies when calling it.
When the URL https://jsonplaceholder.typicode.com/users is opened in the browser, its cookies can be inspected in the developer tools.
You can read the cookies as shown below:
Example
import requests
getdata = requests.get('https://jsonplaceholder.typicode.com/users')
print(getdata.cookies["__cfduid"])
Output
E:\prequests>python makeRequest.py
d1733467caa1e3431fb7f768fa79ed3741575094848
You can also send cookies when making a request.
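A minimal sketch of sending cookies with the cookies parameter follows. The cookie name and value (test, test123) and the httpbin.org/cookies URL are borrowed from the session example later in this tutorial; the request is only prepared, so the resulting Cookie header can be checked offline.

```python
import requests

cookies = {'test': 'test123'}

# Prepare the request to see the Cookie header that would be sent.
prepared = requests.Request(
    'GET', 'https://httpbin.org/cookies', cookies=cookies
).prepare()

cookie_header = prepared.headers['Cookie']
print(cookie_header)
```

In a live call, the same dictionary is passed as requests.get(url, cookies=cookies).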
Requests - Working with Errors
This chapter discusses how to deal with errors that occur when working with the Requests library. It is always good practice to handle errors for all possible cases.
Error Exception
The requests module raises the following types of exception:
`ConnectionError`: raised when there is a connection error. For example, if the network fails or a DNS lookup fails, the Requests library raises a `ConnectionError` exception.
`Response.raise_for_status()`: based on the status code (for example 401 or 404), it raises `HTTPError` for the requested URL.
`HTTPError`: raised for an invalid response to the request made.
`Timeout`: raised when the requested URL times out.
`TooManyRedirects`: raised when the maximum number of redirections is exceeded.
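The exceptions listed above can be handled together in one place. The helper below, fetch_status, is a hypothetical name and only one possible arrangement: it catches the specific exceptions first and falls back to RequestException, the base class of all Requests exceptions. To keep the sketch runnable without network access, it is called with a URL that has no scheme, which Requests rejects before connecting.

```python
import requests


def fetch_status(url, timeout=5):
    """Return the status code, or a short label describing the failure."""
    try:
        response = requests.get(url, timeout=timeout)
        response.raise_for_status()  # turns 4xx/5xx responses into HTTPError
        return response.status_code
    except requests.exceptions.Timeout:
        return "timeout"
    except requests.exceptions.TooManyRedirects:
        return "too many redirects"
    except requests.exceptions.ConnectionError:
        return "connection error"
    except requests.exceptions.HTTPError as err:
        return f"http error {err.response.status_code}"
    except requests.exceptions.RequestException as err:
        # Base class: catches anything else, such as a malformed URL.
        return f"request failed: {type(err).__name__}"


# A URL without http:// or https:// fails before any connection is made.
result = fetch_status("jsonplaceholder.typicode.com/users")
print(result)
```

Timeout is caught before ConnectionError because ConnectTimeout inherits from both, and a timeout is usually the more useful label.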
Example
Here is an example of errors shown for timeout−
import requests
getdata = requests.get('https://jsonplaceholder.typicode.com/users',timeout=0.001)
print(getdata.text)
Output
raise ConnectTimeout(e, request=request)
requests.exceptions.ConnectTimeout: HTTPSConnectionPool(host='jsonplaceholder.ty
picode.com', port=443): Max retries exceeded with url: /users (Caused by Connect
TimeoutError(<urllib3.connection.VerifiedHTTPSConnection object at 0x000000B02AD
E76A0>, 'Connection to jsonplaceholder.typicode.com timed out. (connect timeout=
0.001)'))
Requests - Handling Timeouts
A timeout can easily be added to the URL you are requesting. This matters when you call a third-party URL and wait for its response. Setting a timeout is always good practice, because we usually want the URL to answer with a response or an error within a known time span. Without one, the request can wait indefinitely.
We can set a timeout with the timeout parameter, whose value is given in seconds, as shown in the example below:
Example
import requests
getdata = requests.get('https://jsonplaceholder.typicode.com/users',timeout=0.001)
print(getdata.text)
Output
raise ConnectTimeout(e, request=request)
requests.exceptions.ConnectTimeout: HTTPSConnectionPool(host='jsonplaceholder.ty
picode.com', port=443): Max retries exceeded with url: /users (Caused by Connect
TimeoutError(<urllib3.connection.VerifiedHTTPSConnection object at 0x000000B02AD
E76A0>, 'Connection to jsonplaceholder.typicode.com timed out. (connect timeout=
0.001)'))
The timeout given is as follows−
getdata = requests.get('https://jsonplaceholder.typicode.com/users',timeout=0.001)
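Execution raises a connection timeout error, as shown in the output. The timeout given is 0.001 seconds, which is too short for the request to get a response, so an error is raised. Now we will increase the timeout and check again.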
Example
import requests
getdata = requests.get('https://jsonplaceholder.typicode.com/users',timeout=1.000)
print(getdata.text)
Output
E:\prequests>python makeRequest.py
[
{
"id": 1,
"name": "Leanne Graham",
"username": "Bret",
"email": "Sincere@april.biz",
"address": {
"street": "Kulas Light",
"suite": "Apt. 556",
"city": "Gwenborough",
"zipcode": "92998-3874",
"geo": {
"lat": "-37.3159",
"lng": "81.1496"
}
},
"phone": "1-770-736-8031 x56442",
"website": "hildegard.org",
"company": {
"name": "Romaguera-Crona",
"catchPhrase": "Multi-layered client-server neural-net",
"bs": "harness real-time e-markets"
}
With a timeout of 1 second, we get the response for the requested URL.
Requests - Handling Redirection
This chapter looks at how the Requests library handles URL redirection.
Example
import requests
getdata = requests.get('http://google.com/')
print(getdata.status_code)
print(getdata.history)
The URL http://google.com is redirected to https://www.google.com/ with status code 301 (Moved Permanently). The redirect is saved in the history.
Output
Executing the above code gives the following result:
E:\prequests>python makeRequest.py
200
[<Response [301]>]
You can stop URL redirection with allow_redirects=False. It can be used with the GET, POST, OPTIONS, PUT, DELETE, and PATCH methods.
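With allow_redirects=False, the 301 response itself is returned instead of being followed, and its Location header shows where the redirect points. The sketch below uses a hand-built Response object as a stand-in so that it runs without network access; redirect_target is a hypothetical helper, and the live call is shown only as a comment.

```python
import requests


def redirect_target(response):
    """Return the Location header if the response is a redirect, else None."""
    if response.is_redirect:
        return response.headers["Location"]
    return None


# Live usage (needs network access):
# response = requests.get('http://google.com/', allow_redirects=False)

# Hand-built stand-in for the 301 response described above.
response = requests.Response()
response.status_code = 301
response.headers["Location"] = "https://www.google.com/"

target = redirect_target(response)
print(response.status_code, target)
```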
Requests - Handling History
Requests - Handling Sessions
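Sessions let you persist data across requests. If you call the same host repeatedly, a session also reuses the TCP connection, which improves performance. Now let us see how to use a session to maintain cookies between requests.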
Adding cookies using session
import requests
req = requests.Session()
cookies = dict(test='test123')
getdata = req.get('https://httpbin.org/cookies',cookies=cookies)
print(getdata.text)
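To illustrate persistence, the sketch below stores a cookie on the session itself and then prepares two different requests through it; both carry the cookie. Nothing is sent, so it runs offline. The second URL (https://httpbin.org/headers) is an assumption added only to show a second request.

```python
import requests

session = requests.Session()

# A cookie stored on the session is attached to every request made through it.
session.cookies.set('test', 'test123')

first = session.prepare_request(
    requests.Request('GET', 'https://httpbin.org/cookies')
)
second = session.prepare_request(
    requests.Request('GET', 'https://httpbin.org/headers')
)

print(first.headers['Cookie'])
print(second.headers['Cookie'])
```

In live use, cookies that a server sets in a response are stored on the session in the same way and sent on later requests.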
Requests - SSL Certification
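An SSL certificate is a security feature that comes with secure URLs. When you use the Requests library, it verifies the SSL certificate of the given https URL. SSL verification is enabled by default in the requests module, and an error is raised if the certificate is missing or cannot be verified.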
Working with secure URL
The following is an example of working with a secure URL:
import requests
getdata = requests.get('https://jsonplaceholder.typicode.com/users')
print(getdata.text)
Output
E:\prequests>python makeRequest.py
[
{
"id": 1,
"name": "Leanne Graham",
"username": "Bret",
"email": "Sincere@april.biz",
"address": {
"street": "Kulas Light",
"suite": "Apt. 556",
"city": "Gwenborough",
"zipcode": "92998-3874",
"geo": {
"lat": "-37.3159",
"lng": "81.1496"
}
},
"phone": "1-770-736-8031 x56442",
"website": "hildegard.org",
"company": {
"name": "Romaguera-Crona",
"catchPhrase": "Multi-layered client-server neural-net",
"bs": "harness real-time e-markets"
}
}
]
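We get a response from the above https URL without trouble because the requests module was able to verify the SSL certificate.
You can disable SSL verification by adding verify=False, as shown in the example below.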
Example
import requests
getdata = requests.get('https://jsonplaceholder.typicode.com/users', verify=False)
print(getdata.text)
You will get the output, but also a warning that the SSL certificate has not been verified and that adding certificate verification is advised.
Output
E:\prequests>python makeRequest.py
connectionpool.py:851: InsecureRequestWarning: Unverified HTTPS request is being
made. Adding certificate verification is strongly advised. See: https://urllib3
.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
InsecureRequestWarning)
[
{
"id": 1,
"name": "Leanne Graham",
"username": "Bret",
"email": "Sincere@april.biz",
"address": {
"street": "Kulas Light",
"suite": "Apt. 556",
"city": "Gwenborough",
"zipcode": "92998-3874",
"geo": {
"lat": "-37.3159",
"lng": "81.1496"
}
},
"phone": "1-770-736-8031 x56442",
"website": "hildegard.org",
"company": {
"name": "Romaguera-Crona",
"catchPhrase": "Multi-layered client-server neural-net",
"bs": "harness real-time e-markets"
}
}
]
You can also verify the SSL certificate against a certificate hosted on your side by giving its path with the verify parameter, as shown below.
Example
import requests
getdata = requests.get('https://jsonplaceholder.typicode.com/users', verify=r'C:\Users\AppData\Local\certificate.txt')  # raw string keeps the backslashes literal
print(getdata.text)
Output
E:\prequests>python makeRequest.py
[
{
"id": 1,
"name": "Leanne Graham",
"username": "Bret",
"email": "Sincere@april.biz",
"address": {
"street": "Kulas Light",
"suite": "Apt. 556",
"city": "Gwenborough",
"zipcode": "92998-3874",
"geo": {
"lat": "-37.3159",
"lng": "81.1496"
}
},
"phone": "1-770-736-8031 x56442",
"website": "hildegard.org",
"company": {
"name": "Romaguera-Crona",
"catchPhrase": "Multi-layered client-server neural-net",
"bs": "harness real-time e-markets"
}
}
]
Requests - Authentication
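This chapter discusses the types of authentication available in the Requests module.
We will discuss the following:
- Working of Authentication in HTTP Requests
- Basic Authentication
- Digest Authentication
- OAuth2 Authentication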
Working of Authentication in HTTP Requests
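HTTP authentication happens on the server side: when a client requests a URL, the server asks for authentication details such as a username and password. This is an extra layer of security for the requests and responses exchanged between client and server.
From the client side, these authentication details (the username and password) can be sent in the headers and are later validated on the server. The server delivers the response only if the authentication is valid.
The Requests library provides the most commonly used authentication schemes in requests.auth: basic authentication (HTTPBasicAuth) and digest authentication (HTTPDigestAuth).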
Basic Authentication
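This is the simplest form of authenticating with a server. To use basic authentication, we will use the HTTPBasicAuth class provided by the requests library.
Example
Here is a working example of how to use it.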
import requests
from requests.auth import HTTPBasicAuth
response_data = requests.get('https://httpbin.org/basic-auth/admin/admin123', auth=HTTPBasicAuth('admin', 'admin123'))
print(response_data.text)
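We call the URL https://httpbin.org/basic-auth/admin/admin123 with the user admin and the password admin123.
Without authentication (the username and password), this URL will not work. The server returns the response only once you supply the credentials through the auth parameter.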
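To show how the credentials travel in the headers, the sketch below prepares the same request without sending it and inspects the Authorization header that HTTPBasicAuth adds. The expected value is rebuilt with the standard base64 module purely for comparison.

```python
import base64

import requests
from requests.auth import HTTPBasicAuth

prepared = requests.Request(
    'GET',
    'https://httpbin.org/basic-auth/admin/admin123',
    auth=HTTPBasicAuth('admin', 'admin123'),
).prepare()

auth_header = prepared.headers['Authorization']

# Basic auth is "Basic " followed by base64 of "username:password".
expected = 'Basic ' + base64.b64encode(b'admin:admin123').decode('ascii')

print(auth_header)
```

Base64 is an encoding, not encryption, so basic authentication is best used over https URLs.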
OAuth2 Authentication
To use OAuth2 authentication, we need the "requests_oauth2" library. To install "requests_oauth2", run the following:
pip install requests_oauth2
The output shown in the terminal during installation is as follows:
E:\prequests>pip install requests_oauth2
Collecting requests_oauth2
Downloading https://files.pythonhosted.org/packages/52/dc/01c3c75e6e7341a2c7a9
71d111d7105df230ddb74b5d4e10a3dabb61750c/requests-oauth2-0.3.0.tar.gz
Requirement already satisfied: requests in c:\users\xyz\appdata\local\programs
\python\python37\lib\site-packages (from requests_oauth2) (2.22.0)
Requirement already satisfied: six in c:\users\xyz\appdata\local\programs\pyth
on\python37\lib\site-packages (from requests_oauth2) (1.12.0)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in c:\use
rs\xyz\appdata\local\programs\python\python37\lib\site-packages (from requests
->requests_oauth2) (1.25.3)
Requirement already satisfied: certifi>=2017.4.17 in c:\users\xyz\appdata\loca
l\programs\python\python37\lib\site-packages (from requests->requests_oauth2) (2
019.3.9)
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in c:\users\xyz\appdata\l
ocal\programs\python\python37\lib\site-packages (from requests->requests_oauth2)
(3.0.4)
Requirement already satisfied: idna<2.9,>=2.5 in c:\users\xyz\appdata\local\pr
ograms\python\python37\lib\site-packages (from requests->requests_oauth2) (2.8)
Building wheels for collected packages: requests-oauth2
Building wheel for requests-oauth2 (setup.py) ... done
Stored in directory: C:\Users\xyz\AppData\Local\pip\Cache\wheels\90\ef\b4\43
3743cbbc488463491da7df510d41c4e5aa28213caeedd586
Successfully built requests-oauth2
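We have finished installing "requests-oauth2". To use the APIs of Google or Twitter, we need their consent, which is obtained through OAuth2 authentication.
For OAuth2 authentication, we need a client ID and a secret key. Details on how to get them are available at https://developers.google.com/identity/protocols/OAuth2.
Then log in to the Google API Console, available at https://console.developers.google.com/, and get the client ID and secret key.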
Example
Here is an example of how to use "requests-oauth2".
import requests
from requests_oauth2.services import GoogleClient
google_auth = GoogleClient(
client_id="xxxxxxxxxxxxxxxxxxxxxxxxxx.apps.googleusercontent.com",
redirect_uri="http://localhost/auth/success.html",
)
a = google_auth.authorize_url(
scope=["profile", "email"],
response_type="code",
)
res = requests.get(a)
print(res.url)
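We cannot follow the redirect to the given URL because it requires logging in to a Gmail account, but the example shows that google_auth works and produces the authorization URL.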
Requests - Event Hooks
We can use event hooks to attach events to the requested URL. In the example below, we add a callback function that is called when the response is available.
Example
To add a callback, we use the hooks parameter as shown in the example below:
import requests
def printData(r, *args, **kwargs):
    print(r.url)
    print(r.text)
getdata = requests.get('https://jsonplaceholder.typicode.com/users',
    hooks={'response': printData})
Output
E:\prequests>python makeRequest.py
https://jsonplaceholder.typicode.com/users
[
{
"id": 1,
"name": "Leanne Graham",
"username": "Bret",
"email": "Sincere@april.biz",
"address": {
"street": "Kulas Light",
"suite": "Apt. 556",
"city": "Gwenborough",
"zipcode": "92998-3874",
"geo": {
"lat": "-37.3159",
"lng": "81.1496"
}
},
"phone": "1-770-736-8031 x56442",
"website": "hildegard.org",
"company": {
"name": "Romaguera-Crona",
"catchPhrase": "Multi-layered client-server neural-net",
"bs": "harness real-time e-markets"
}
}
]
You can also call multiple callback functions, as shown below:
Example
import requests
def printRequestedUrl(r, *args, **kwargs):
    print(r.url)
def printData(r, *args, **kwargs):
    print(r.text)
getdata = requests.get('https://jsonplaceholder.typicode.com/users', hooks={'response': [printRequestedUrl, printData]})
Output
E:\prequests>python makeRequest.py
https://jsonplaceholder.typicode.com/users
[
{
"id": 1,
"name": "Leanne Graham",
"username": "Bret",
"email": "Sincere@april.biz",
"address": {
"street": "Kulas Light",
"suite": "Apt. 556",
"city": "Gwenborough",
"zipcode": "92998-3874",
"geo": {
"lat": "-37.3159",
"lng": "81.1496"
}
},
"phone": "1-770-736-8031 x56442",
"website": "hildegard.org",
"company": {
"name": "Romaguera-Crona",
"catchPhrase": "Multi-layered client-server neural-net",
"bs": "harness real-time e-markets"
}
}
]
You can also add a hook to a session you have created, as shown below:
Example
import requests
def printData(r, *args, **kwargs):
    print(r.text)
s = requests.Session()
s.hooks['response'].append(printData)
s.get('https://jsonplaceholder.typicode.com/users')
Output
E:\prequests>python makeRequest.py
[
{
"id": 1,
"name": "Leanne Graham",
"username": "Bret",
"email": "Sincere@april.biz",
"address": {
"street": "Kulas Light",
"suite": "Apt. 556",
"city": "Gwenborough",
"zipcode": "92998-3874",
"geo": {
"lat": "-37.3159",
"lng": "81.1496"
}
},
"phone": "1-770-736-8031 x56442",
"website": "hildegard.org",
"company": {
"name": "Romaguera-Crona",
"catchPhrase": "Multi-layered client-server neural-net",
"bs": "harness real-time e-markets"
}
}
]
Requests - Proxy
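So far, we have seen clients connect to and talk to the server directly. With a proxy, the interaction happens as follows:
- The client sends a request to the proxy.
- The proxy sends the request to the server.
- The server sends the response back to the proxy.
- The proxy sends the response to the client.
An HTTP proxy adds an extra layer of security by managing the data exchange between client and server. The requests library supports proxies through the proxies parameter, as shown below: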
Example
import requests
proxies = {
'http': 'http://localhost:8080'
}
res = requests.get('http://httpbin.org/', proxies=proxies)
print(res.status_code)
The request will be routed through the proxy at http://localhost:8080.
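The keys of the proxies dictionary are matched against the scheme of the requested URL, so an 'http' entry applies only to http:// URLs. The sketch below illustrates this with select_proxy, a helper from requests.utils that Requests uses internally; it is called here only to show the matching rule offline and is not something the original example relies on.

```python
from requests.utils import select_proxy

proxies = {
    'http': 'http://localhost:8080'
}

# An http:// URL matches the 'http' key.
http_proxy = select_proxy('http://httpbin.org/', proxies)

# An https:// URL has no matching key here, so no proxy is selected.
https_proxy = select_proxy('https://httpbin.org/', proxies)

print(http_proxy, https_proxy)
```

To route https:// URLs through a proxy as well, an 'https' key would need to be added to the dictionary.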
Requests - Web Scraping using Requests
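We have already seen how to fetch data from a given URL using the Python requests library. We will now try to extract data from the Tutorialspoint site, available at https://www.tutorialspoint.com/tutorialslibrary.htm, using the following:
- Requests library
- BeautifulSoup library for Python
We have already installed the Requests library; now let us install the BeautifulSoup package. If you want to explore more of what BeautifulSoup can do, its official documentation is available at https://www.crummy.com/software/BeautifulSoup/bs4/doc/.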
Installing Beautifulsoup
The following shows how to install Beautiful Soup:
E:\prequests>pip install beautifulsoup4
Collecting beautifulsoup4
Downloading https://files.pythonhosted.org/packages/3b/c8/a55eb6ea11cd7e5ac4ba
cdf92bac4693b90d3ba79268be16527555e186f0/beautifulsoup4-4.8.1-py3-none-any.whl (
101kB)
|████████████████████████████████| 102kB 22kB/s
Collecting soupsieve>=1.2 (from beautifulsoup4)
Downloading https://files.pythonhosted.org/packages/81/94/03c0f04471fc245d08d0
a99f7946ac228ca98da4fa75796c507f61e688c2/soupsieve-1.9.5-py2.py3-none-any.whl
Installing collected packages: soupsieve, beautifulsoup4
Successfully installed beautifulsoup4-4.8.1 soupsieve-1.9.5
We now have both the Python requests library and Beautiful Soup installed.
Now let us write the code that extracts data from the given URL.
Web scraping
import requests
from bs4 import BeautifulSoup
res = requests.get('https://www.tutorialspoint.com/tutorialslibrary.htm')
print("The status code is ", res.status_code)
print("\n")
soup_data = BeautifulSoup(res.text, 'html.parser')
print(soup_data.title)
print("\n")
print(soup_data.find_all('h4'))
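With the requests library, we can fetch the content of the given URL, and the Beautiful Soup library helps parse it and extract the details we want.
Beautiful Soup can extract data by HTML tag, class, id, CSS selector, and more. Below is the output we get, where we print the page title and all the h4 tags on the page.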
Output
E:\prequests>python makeRequest.py
The status code is 200
<title>Free Online Tutorials and Courses</title>
[<h4>Academic</h4>, <h4>Computer Science</h4>, <h4>Digital Marketing</h4>, <h4>M
onuments</h4>,<h4>Machine Learning</h4>, <h4>Mathematics</h4>, <h4>Mobile Devel
opment</h4>,<h4>SAP</h4>, <h4>Software Quality</h4>, <h4>Big Data & Analyti
cs</h4>, <h4>Databases</h4>, <h4>Engineering Tutorials</h4>, <h4>Mainframe Devel
opment</h4>, <h4>Microsoft Technologies</h4>, <h4>Java Technologies</h4>,<h4>XM
L Technologies</h4>, <h4>Python Technologies</h4>, <h4>Sports</h4>, <h4>Computer
Programming</h4>,<h4>DevOps</h4>, <h4>Latest Technologies</h4>, <h4>Telecom</h4>, <h4>Exams Syllabus</h4>, <h4>UPSC IAS Exams</h4>, <h4>Web Development</h4>,
<h4>Scripts</h4>, <h4>Management</h4>,<h4>Soft Skills</h4>, <h4>Selected Readin
g</h4>, <h4>Misc</h4>]