Python Web Scraping Tutorial

Web scraping, also called web data mining or web harvesting, is the process of building an agent that automatically extracts, parses, downloads, and organizes useful information from the web.
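As a rough illustration of the extract, parse, and organize steps named above, the following minimal sketch uses only Python's standard-library `html.parser` module. The sample HTML, the `LinkCollector` class, and the `scrape_links` function are hypothetical names made up for this example; a real scraper would first download the page (the "download" step), which is skipped here so the sketch runs without network access.

```python
from html.parser import HTMLParser

# Hypothetical sample page; a real scraper would download this HTML first.
SAMPLE_HTML = """
<html><body>
  <h1>Books</h1>
  <a href="/book/1">Dune</a>
  <a href="/book/2">Emma</a>
</body></html>
"""


class LinkCollector(HTMLParser):
    """Collects the text and href of every <a> tag (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None  # href of the <a> tag currently open, if any

    def handle_starttag(self, tag, attrs):
        # Extract: remember the link target when an anchor opens.
        if tag == "a":
            self._href = dict(attrs).get("href")

    def handle_data(self, data):
        # Parse and organize: store non-empty anchor text with its URL.
        if self._href is not None and data.strip():
            self.links.append({"title": data.strip(), "url": self._href})

    def handle_endtag(self, tag):
        if tag == "a":
            self._href = None


def scrape_links(html):
    """Return a list of {"title", "url"} records found in the HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


records = scrape_links(SAMPLE_HTML)
for record in records:
    print(record["title"], record["url"])
```

The result is a plain list of dictionaries, which is one simple way to organize scraped data before saving it to a file or database.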

This tutorial teaches the core concepts of web scraping and aims to make you comfortable scraping various types of websites and their data.

Audience

This tutorial will be useful for graduates, postgraduates, and research students who either have an interest in this subject or have it as part of their curriculum. It suits the learning needs of both beginners and advanced learners.

Prerequisites

Readers should have basic knowledge of HTML, CSS, and JavaScript. They should also be familiar with the basic terminology of web technology and with Python programming concepts. If you are not familiar with these topics, we suggest going through tutorials on them first.