Artificial Intelligence With Python: A Concise Tutorial
AI with Python – Natural Language Processing
Natural Language Processing (NLP) is the branch of AI concerned with communicating with intelligent systems in a natural language such as English.
Natural language processing is required when you want an intelligent system such as a robot to act on your instructions, or when you want to hear a decision from a dialogue-based clinical expert system.
The field of NLP involves making computers perform useful tasks with the natural languages humans use. The input and output of an NLP system can be:
- Speech
- Written Text
Components of NLP
NLP has two components, which are described below.
Natural Language Understanding (NLU)
It involves the following tasks:
- Mapping the given input in natural language into useful representations.
- Analyzing different aspects of the language.
Natural Language Generation (NLG)
It is the process of producing meaningful phrases and sentences in natural language from some internal representation. It involves:
- Text planning: retrieving the relevant content from the knowledge base.
- Sentence planning: choosing the required words, forming meaningful phrases, and setting the tone of the sentence.
- Text realization: mapping the sentence plan into sentence structure.
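The three NLG stages can be sketched as a small pipeline. This is only an illustration under stated assumptions: the knowledge base, the function names (`plan_text`, `plan_sentence`, `realize`, `generate`) and the sentence template are hypothetical and do not come from any particular library.

```python
# Toy knowledge base; the entries and field names are illustrative assumptions.
KNOWLEDGE_BASE = {
    "Paris": {"type": "city", "country": "France"},
    "Lyon": {"type": "city", "country": "France"},
}


def plan_text(entity):
    """Text planning: retrieve the relevant content from the knowledge base."""
    facts = KNOWLEDGE_BASE[entity]
    return {"subject": entity, "category": facts["type"], "location": facts["country"]}


def plan_sentence(content, formal=True):
    """Sentence planning: choose words, form phrases, and set the tone."""
    verb = "is located in" if formal else "is in"
    return {
        "subject": content["subject"],
        "description": "a " + content["category"],
        "verb_phrase": verb + " " + content["location"],
    }


def realize(plan):
    """Text realization: map the sentence plan onto a sentence structure."""
    return "{subject}, {description}, {verb_phrase}.".format(**plan)


def generate(entity, formal=True):
    """Run the three stages in order."""
    return realize(plan_sentence(plan_text(entity), formal))


print(generate("Paris"))
print(generate("Lyon", formal=False))
```

Keeping the stages as separate functions mirrors the list above: changing the tone only touches sentence planning, while the retrieved content and the final template stay the same.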
Difficulties in NLU
Natural language is very rich in form and structure, but it is also ambiguous, which is what makes NLU difficult. Ambiguity can occur at different levels:
Lexical ambiguity
It occurs at a very primitive level, the level of the individual word. For example, should the word “board” be treated as a noun or as a verb?
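A minimal sketch of how lexical ambiguity shows up in a program, assuming a hand-written lexicon: a word is ambiguous when it maps to more than one part of speech. The `LEXICON` entries and the helper names are hypothetical, chosen only for this illustration.

```python
# Hypothetical mini-lexicon: each word maps to the parts of speech it can take.
LEXICON = {
    "board": {"noun", "verb"},
    "the": {"determiner"},
    "passengers": {"noun"},
}


def possible_tags(word):
    """Return the sorted parts of speech listed for a word."""
    return sorted(LEXICON.get(word.lower(), {"unknown"}))


def is_lexically_ambiguous(word):
    """A word is lexically ambiguous if the lexicon lists several tags for it."""
    return len(LEXICON.get(word.lower(), ())) > 1


print(possible_tags("board"))
print(is_lexically_ambiguous("board"))
print(is_lexically_ambiguous("the"))
```

Resolving the ambiguity requires context: in "the board" the word is a noun, while in "passengers board" it is a verb, which is why word-level lookup alone is not enough.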
NLP Terminology
Let us now look at a few important terms used in NLP.
- Phonology: the study of how sounds are organized systematically.
- Morphology: the study of how words are constructed from primitive meaningful units.
- Morpheme: a primitive unit of meaning in a language.
- Syntax: the arrangement of words to make a sentence. It also involves determining the structural role of words in the sentence and in phrases.
- Semantics: concerned with the meaning of words and how to combine words into meaningful phrases and sentences.
- Pragmatics: deals with using and understanding sentences in different situations and how the situation affects the interpretation of the sentence.
- Discourse: deals with how the immediately preceding sentence can affect the interpretation of the next sentence.
- World Knowledge: general knowledge about the world.
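To make the morphology and morpheme entries concrete, here is a deliberately naive sketch that splits a word into a prefix, a stem, and suffixes. The affix lists and the function name `split_morphemes` are assumptions made for this example; real morphological analyzers rely on much richer rules and dictionaries.

```python
# Illustrative affix lists; these are assumptions, not a complete inventory.
PREFIXES = ("un", "re")
SUFFIXES = ("ness", "ing", "er", "s")


def split_morphemes(word):
    """Naively split a word into [prefix] + stem + [suffixes]."""
    word = word.lower()
    prefix_parts = []
    for prefix in PREFIXES:
        # Keep at least three characters of the word after removing the prefix.
        if word.startswith(prefix) and len(word) > len(prefix) + 2:
            prefix_parts.append(prefix)
            word = word[len(prefix):]
            break
    suffix_parts = []
    stripped = True
    while stripped:
        stripped = False
        for suffix in SUFFIXES:
            # Keep at least three characters of stem after removing the suffix.
            if word.endswith(suffix) and len(word) > len(suffix) + 2:
                suffix_parts.insert(0, suffix)
                word = word[:-len(suffix)]
                stripped = True
                break
    return prefix_parts + [word] + suffix_parts


print(split_morphemes("unkindness"))
print(split_morphemes("players"))
```

Because it only matches strings, this sketch over-segments words such as "under" (split as "un" + "der"), which shows why morphology needs knowledge of the language rather than pattern matching alone.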
Steps in NLP
This section describes the different steps in NLP.
Lexical Analysis
It involves identifying and analyzing the structure of words. The lexicon of a language is the collection of words and phrases in that language. Lexical analysis divides the whole chunk of text into paragraphs, sentences, and words.
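The division into paragraphs, sentences, and words can be sketched with the standard `re` module. The splitting rules here are simplifying assumptions (blank lines separate paragraphs; `.`, `!` or `?` followed by whitespace ends a sentence), so abbreviations such as "Dr." would be split incorrectly.

```python
import re


def split_paragraphs(text):
    """Paragraphs are assumed to be separated by one or more blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def split_sentences(paragraph):
    """Split after '.', '!' or '?' when followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", paragraph.strip()) if s]


def split_words(sentence):
    """Extract words, keeping internal apostrophes and hyphens."""
    return re.findall(r"[A-Za-z0-9]+(?:['-][A-Za-z0-9]+)*", sentence)


def lexical_analysis(text):
    """Return a nested list: paragraphs -> sentences -> words."""
    return [
        [split_words(sentence) for sentence in split_sentences(paragraph)]
        for paragraph in split_paragraphs(text)
    ]


sample = "NLP is fun. It has many steps!\n\nLexical analysis comes first."
for paragraph in lexical_analysis(sample):
    print(paragraph)
```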
Syntactic Analysis (Parsing)
It involves analyzing the words in a sentence for grammar and arranging them in a way that shows the relationships among them. A sentence such as “The school goes to boy” is rejected by an English syntactic analyzer.
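A minimal sketch of why a syntactic analyzer rejects “The school goes to boy”, assuming a toy grammar (S → NP VP, NP → Det N, VP → V or V PP, PP → P NP) and a six-word lexicon. Both the grammar and the function names are hypothetical; a real English parser covers far more constructions.

```python
# Toy lexicon mapping words to grammatical categories (an assumption).
LEXICON = {
    "the": "Det", "a": "Det",
    "boy": "N", "school": "N",
    "goes": "V",
    "to": "P",
}


def parse_np(tags, i):
    """NP -> Det N. Return the index after the phrase, or None on failure."""
    if tags[i:i + 2] == ["Det", "N"]:
        return i + 2
    return None


def parse_pp(tags, i):
    """PP -> P NP."""
    if tags[i:i + 1] == ["P"]:
        return parse_np(tags, i + 1)
    return None


def parse_vp(tags, i):
    """VP -> V PP | V. Prefer the longer match when a PP follows the verb."""
    if tags[i:i + 1] != ["V"]:
        return None
    after_pp = parse_pp(tags, i + 1)
    return after_pp if after_pp is not None else i + 1


def is_grammatical(sentence):
    """S -> NP VP, and the parse must consume every word."""
    words = sentence.lower().rstrip(".").split()
    tags = [LEXICON.get(word) for word in words]
    if None in tags:
        return False  # unknown word
    i = parse_np(tags, 0)
    if i is None:
        return False
    return parse_vp(tags, i) == len(tags)


print(is_grammatical("The boy goes to the school"))
print(is_grammatical("The school goes to boy"))
```

The second sentence fails because “boy” has no determiner, so no noun phrase can be built after “to”. Note that this check is purely syntactic: “The school goes to the boy” would pass, and ruling it out is the job of semantic analysis.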
Semantic Analysis
It draws the exact meaning, or the dictionary meaning, from the text. The text is checked for meaningfulness by mapping syntactic structures to objects in the task domain. The semantic analyzer rejects phrases such as “hot ice-cream”.
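The “hot ice-cream” check can be sketched as a feature-compatibility test, assuming a tiny task domain in which each object has known properties and each adjective asserts a value for one property. The dictionaries and the function name `is_meaningful` are hypothetical and exist only for this illustration.

```python
# Objects in the task domain and their known properties (illustrative).
OBJECT_PROPERTIES = {
    "ice-cream": {"temperature": "cold"},
    "fire": {"temperature": "hot"},
}

# Each adjective asserts a value for one property (illustrative).
ADJECTIVE_CLAIMS = {
    "hot": ("temperature", "hot"),
    "cold": ("temperature", "cold"),
}


def is_meaningful(phrase):
    """Check an 'adjective noun' phrase against the domain's properties."""
    adjective, noun = phrase.lower().split()
    feature, claimed = ADJECTIVE_CLAIMS[adjective]
    known = OBJECT_PROPERTIES[noun].get(feature)
    # The phrase is accepted if the domain has no conflicting information.
    return known is None or known == claimed


print(is_meaningful("hot ice-cream"))
print(is_meaningful("cold ice-cream"))
```

The phrase is syntactically fine (adjective followed by noun), but the adjective's claim conflicts with what the domain knows about the object, so the semantic check rejects it.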