Natural Language Toolkit Tutorial

Language is a method of communication through which we speak, read, and write. Natural Language Processing (NLP) is a subfield of computer science, and of Artificial Intelligence (AI) in particular, concerned with enabling computers to understand and process human language. Many open-source NLP tools exist, but NLTK (Natural Language Toolkit) scores very high on ease of use and on how clearly it explains concepts. Python is quick to learn, and because NLTK is written in Python, it also makes a very good learning toolkit. NLTK covers most common tasks, such as tokenization, stemming, lemmatization, punctuation handling, character counting, and word counting. It is elegant and easy to work with.
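As a rough illustration of a few of the tasks listed above, here is a minimal sketch using NLTK. It assumes NLTK is installed (for example with pip) and deliberately uses only components that need no extra corpus downloads: the regex-based wordpunct_tokenize, the algorithmic PorterStemmer, and FreqDist. Lemmatization is left out because NLTK's WordNet lemmatizer requires the WordNet corpus to be downloaded first. The sample sentence is made up for this example.

```python
from nltk import FreqDist
from nltk.stem import PorterStemmer
from nltk.tokenize import wordpunct_tokenize

# Made-up sample text for illustration only.
text = "NLTK makes tokenizing, stemming and counting words easy. Counting words is easy."

# Tokenization: split the text into word and punctuation tokens.
tokens = wordpunct_tokenize(text)

# Punctuation handling: keep alphabetic tokens only, lowercased.
words = [t.lower() for t in tokens if t.isalpha()]

# Stemming: reduce each word to its stem with the Porter algorithm.
stemmer = PorterStemmer()
stems = [stemmer.stem(w) for w in words]

# Character count and word count.
char_count = len(text)
word_count = len(words)

# Word frequencies.
freq = FreqDist(words)

print("Tokens:", tokens)
print("Stems:", stems)
print("Characters:", char_count, "Words:", word_count)
print("Most common:", freq.most_common(3))
```

Other tokenizers and stemmers in NLTK could be swapped in here; the choice above simply keeps the sketch free of data downloads.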

Audience

This tutorial is useful for graduates, postgraduates, and research students who are interested in NLP or study it as part of their curriculum. It is suitable for both beginners and advanced learners.

Prerequisites

Readers should have a basic knowledge of artificial intelligence. They should also be familiar with basic English grammar terminology and with Python programming concepts.