Natural Language Toolkit: A Concise Tutorial
Natural Language Toolkit - Getting Started
To install NLTK, Python must already be installed on your computer. Go to www.python.org/downloads and select the latest version for your operating system (Windows, Mac, or Linux/Unix). For a basic Python tutorial, see www.tutorialspoint.com/python3/index.htm.
Once Python is installed on your computer, you can install NLTK as described below.
Installing NLTK
NLTK can be installed on various operating systems as follows:
On Windows
To install NLTK on Windows, follow these steps:
- First, open the Windows command prompt and navigate to the folder that contains pip.
- Next, enter the following command to install NLTK:
pip3 install nltk
Now, open the Python shell from the Windows Start menu and type the following command to verify the NLTK installation:
import nltk
If no error appears, NLTK has been installed successfully on your Windows system with Python 3.
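The import check above only shows that the module can be loaded. As a slightly more informative alternative, the sketch below reports the installed version using the standard library's importlib.metadata (Python 3.8 or later). The helper name nltk_version is illustrative and not part of NLTK.

```python
from importlib import metadata


def nltk_version():
    """Return the installed NLTK version string, or None if NLTK is absent."""
    try:
        return metadata.version("nltk")
    except metadata.PackageNotFoundError:
        return None


version = nltk_version()
if version:
    print(f"NLTK {version} is installed")
else:
    print("NLTK is not installed")
```

The same check works on Windows, Mac, and Linux, because it only asks Python which packages are installed.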
On Mac/Linux
To install NLTK on Mac/Linux, run the following command:
sudo pip install -U nltk
If pip is not installed on your computer, install it first by following the instructions below:
First, update the package index with the following command (apt is the package manager on Debian- and Ubuntu-based Linux systems):
sudo apt update
Now, type the following command to install pip for Python 3:
sudo apt install python3-pip
Through Anaconda
To install NLTK through Anaconda, follow these steps:
First, install Anaconda: go to https://www.anaconda.com/download and select the version of Python you need.
Once Anaconda is installed on your computer, open its command prompt and enter the following command:
conda install -c anaconda nltk
Review the output and enter 'yes' to confirm. NLTK will then be downloaded and installed into your Anaconda installation.
Downloading NLTK’s Dataset and Packages
NLTK is now installed, but to use it we also need to download its datasets (corpora). Some of the important datasets are stopwords, gutenberg, framenet_v15, and so on.
The following commands open the NLTK downloader, from which all the NLTK datasets can be downloaded:
import nltk
nltk.download()
The NLTK downloader window will then appear.
Now, click the Download button to download the datasets.
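The downloader window is convenient for browsing, but individual datasets can also be fetched by name without opening it, because nltk.download accepts a dataset identifier. The following is a minimal sketch under that assumption. The helper download_datasets and the DATASETS list are illustrative names, not part of NLTK.

```python
# Dataset identifiers as they appear in the NLTK downloader.
DATASETS = ["stopwords", "gutenberg"]


def download_datasets(names):
    """Download each named dataset and return {name: True/False}."""
    # Imported inside the function so this file loads even before NLTK
    # is installed; a top-level import works equally well.
    import nltk

    # nltk.download returns True on success and False on failure;
    # quiet=True suppresses the progress messages.
    return {name: bool(nltk.download(name, quiet=True)) for name in names}
```

Calling download_datasets(DATASETS) needs a network connection, and any dataset that could not be fetched is mapped to False in the returned dictionary.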
How to Run an NLTK Script
The following example implements the Porter stemming algorithm using NLTK's PorterStemmer class. Working through it also shows how to run an NLTK script.
First, import the Natural Language Toolkit (nltk).
import nltk
Now, import the PorterStemmer class, which implements the Porter stemming algorithm.
from nltk.stem import PorterStemmer
Next, create an instance of the PorterStemmer class as follows:
word_stemmer = PorterStemmer()
Now, pass in the word you want to stem:
word_stemmer.stem('writing')
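The steps above can be collected into a single file and run from the command line. The sketch below is one way to do that. The file name stem_demo.py and the helper stem_words are illustrative names, not part of NLTK.

```python
# stem_demo.py (hypothetical file name)


def stem_words(words):
    """Return the Porter stem of each word in `words`."""
    # Imported here so that a missing NLTK install can be reported cleanly
    # by the caller; a plain top-level import works equally well.
    from nltk.stem import PorterStemmer

    word_stemmer = PorterStemmer()
    return [word_stemmer.stem(word) for word in words]


if __name__ == "__main__":
    words = ["writing"]
    try:
        for word, stem in zip(words, stem_words(words)):
            print(f"{word} -> {stem}")
    except ImportError:
        print("NLTK is not installed; install it first with: pip3 install nltk")
```

Save the file and run it with python3 stem_demo.py. For the word 'writing', the Porter stemmer returns 'write'.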