Artificial Intelligence With Python: A Concise Tutorial

AI with Python – Natural Language Processing

Natural Language Processing (NLP) is the AI method of communicating with intelligent systems using a natural language such as English.

Natural language processing is required when you want an intelligent system such as a robot to act on your instructions, or when you want to hear a decision from a dialogue-based clinical expert system.

The field of NLP involves making computers perform useful tasks with the natural languages humans use. The input and output of an NLP system can be −

  1. Speech

  2. Written Text

Components of NLP

In this section, we will learn about the two components of NLP, which are described below −

Natural Language Understanding (NLU)

It involves the following tasks −

  1. Mapping the given input in natural language into useful representations.

  2. Analyzing different aspects of the language.
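As a rough illustration of the first task, the sketch below maps a typed robot command onto a small dictionary representation. The command pattern, the field names, and the function `understand` are assumptions made up for this example; a real NLU component would be far more flexible.

```python
import re

# Hypothetical command pattern for a toy robot; not a real NLU model.
COMMAND_PATTERN = re.compile(
    r"^(?P<action>move|turn)\s+(?P<direction>forward|back|left|right)"
    r"(?:\s+(?P<amount>\d+)\s+(?P<unit>steps?|degrees?))?$"
)

def understand(utterance):
    """Map an utterance to a dict representation, or None if not understood."""
    match = COMMAND_PATTERN.match(utterance.strip().lower())
    if match is None:
        return None
    frame = match.groupdict()
    if frame["amount"] is not None:
        frame["amount"] = int(frame["amount"])  # turn "3" into the number 3
    return frame

print(understand("Move forward 3 steps"))
print(understand("Sing a song"))  # outside the toy pattern, so None
```

The dictionary is the "useful representation": a program can act on its fields directly, which it cannot do with the raw sentence.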

Natural Language Generation (NLG)

It is the process of producing meaningful phrases and sentences in natural language from some internal representation. It involves −

  1. Text planning − This includes retrieving the relevant content from the knowledge base.

  2. Sentence planning − This includes choosing the required words, forming meaningful phrases, and setting the tone of the sentence.

  3. Text Realization − This maps the sentence plan into sentence structure.
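The three stages above can be sketched as a tiny pipeline. Everything here is an assumption for illustration: the knowledge base contents, the temperature threshold used to pick a word, and the function names `plan_text`, `plan_sentence`, and `realize`.

```python
# Hypothetical knowledge base: facts about a single city.
KNOWLEDGE_BASE = {
    "Pune": {"temperature_c": 31, "condition": "sunny"},
}

def plan_text(city):
    """Text planning: retrieve the relevant content from the knowledge base."""
    facts = KNOWLEDGE_BASE[city]
    return {"city": city, **facts}

def plan_sentence(content):
    """Sentence planning: choose words and phrases for the content."""
    # Assumed threshold: 25 degrees or more is described as "warm".
    tone = "warm" if content["temperature_c"] >= 25 else "cool"
    return {
        "subject": content["city"],
        "verb": "is",
        "complement": f"{tone} and {content['condition']}",
        "detail": f"at {content['temperature_c']} degrees Celsius",
    }

def realize(plan):
    """Text realization: map the sentence plan onto a sentence structure."""
    return f"{plan['subject']} {plan['verb']} {plan['complement']} {plan['detail']}."

print(realize(plan_sentence(plan_text("Pune"))))
```

Keeping the stages as separate functions mirrors the list above: the content, the word choice, and the final word order can each be changed without touching the other two.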

Difficulties in NLU

Natural language is very rich in form and structure; however, it is also ambiguous, which is what makes NLU difficult. There can be different levels of ambiguity −

Lexical ambiguity

It occurs at a very primitive level, such as the word level. For example, should the word “board” be treated as a noun or a verb?
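A minimal sketch of how this shows up in a program: a lexicon that lists more than one part of speech for a word. The mini-lexicon and the helper names below are hypothetical and cover only the example sentence.

```python
# Hypothetical mini-lexicon: each word maps to the parts of speech it can take.
LEXICON = {
    "board": {"noun", "verb"},
    "the": {"determiner"},
    "plane": {"noun"},
    "passengers": {"noun"},
}

def possible_tags(word):
    """Return the sorted parts of speech a word can take in the toy lexicon."""
    return sorted(LEXICON.get(word.lower(), set()))

def is_lexically_ambiguous(word):
    """A word is ambiguous here if the lexicon lists more than one tag for it."""
    return len(possible_tags(word)) > 1

for word in "Passengers board the plane".split():
    print(word, possible_tags(word), is_lexically_ambiguous(word))
```

Only "board" is flagged; deciding which tag applies in a given sentence requires looking at the surrounding words.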

Syntax level ambiguity

A sentence can be parsed in different ways. For example, “He lifted the beetle with red cap.” − Did he use a cap to lift the beetle, or did he lift a beetle that had a red cap?
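The two readings correspond to two different parse trees. The sketch below writes them out as nested tuples; the function `pp_attachment_readings` and the tree labels are illustrative assumptions, not the output of a real parser.

```python
def pp_attachment_readings(subject, verb, obj, pp):
    """Return both readings of 'subject verb obj pp' as nested tuples."""
    # Reading 1: the prepositional phrase modifies the verb (the instrument).
    verb_attach = ("S", subject, ("VP", verb, ("NP", obj), ("PP", pp)))
    # Reading 2: the prepositional phrase modifies the object noun.
    noun_attach = ("S", subject, ("VP", verb, ("NP", obj, ("PP", pp))))
    return [verb_attach, noun_attach]

readings = pp_attachment_readings("He", "lifted", "the beetle", "with red cap")
for tree in readings:
    print(tree)
```

The words are identical in both trees; only the position of the "PP" node differs, and that is the whole of the ambiguity.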

Referential ambiguity

It arises when something is referred to using pronouns. For example, Rima went to Gauri. She said, “I am tired.” − Exactly who is tired?
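A naive resolver makes the problem concrete: it collects every name in the previous sentence that the pronoun could refer to. The name table and the function `candidate_antecedents` are assumptions for this sketch only.

```python
# Hypothetical table of names that the pronoun "she" may refer to.
FEMALE_NAMES = {"Rima", "Gauri"}

def candidate_antecedents(pronoun, previous_sentence):
    """List the names in the previous sentence that the pronoun could refer to."""
    # Treat every capitalized word as a name; good enough for this example only.
    names = [w.strip(".,") for w in previous_sentence.split() if w[0].isupper()]
    if pronoun.lower() == "she":
        return [n for n in names if n in FEMALE_NAMES]
    return names

candidates = candidate_antecedents("She", "Rima went to Gauri.")
print(candidates)
print("ambiguous" if len(candidates) > 1 else "resolved")
```

Both names survive the filter, so nothing in the text itself settles who is tired.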

NLP Terminology

Let us now look at a few important terms in NLP.

  1. Phonology − It is the study of organizing sound systematically.

  2. Morphology − It is the study of the construction of words from primitive meaningful units.

  3. Morpheme − It is a primitive unit of meaning in a language.

  4. Syntax − It refers to arranging words to make a sentence. It also involves determining the structural role of words in the sentence and in phrases.

  5. Semantics − It is concerned with the meaning of words and how to combine words into meaningful phrases and sentences.

  6. Pragmatics − It deals with using and understanding sentences in different situations and how the interpretation of the sentence is affected.

  7. Discourse − It deals with how the immediately preceding sentence can affect the interpretation of the next sentence.

  8. World Knowledge − It includes the general knowledge about the world.
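To make the Morphology and Morpheme entries concrete, here is a rough sketch that splits a word into a prefix, a stem, and a suffix. The affix lists and the function `split_morphemes` are hypothetical and deliberately tiny.

```python
# Hypothetical, tiny affix lists; real morphological analysis needs far more.
PREFIXES = ("un", "re", "dis")
SUFFIXES = ("able", "ness", "ing", "ed", "s")

def split_morphemes(word):
    """Split a word into [prefix, stem, suffix], omitting missing parts."""
    word = word.lower()
    parts = []
    for prefix in PREFIXES:
        # Require at least three letters to remain after stripping the prefix.
        if word.startswith(prefix) and len(word) > len(prefix) + 2:
            parts.append(prefix)
            word = word[len(prefix):]
            break
    suffix_found = None
    for suffix in SUFFIXES:
        # Require at least three letters to remain after stripping the suffix.
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            suffix_found = suffix
            word = word[: -len(suffix)]
            break
    parts.append(word)
    if suffix_found:
        parts.append(suffix_found)
    return parts

for word in ["unbreakable", "walked", "cats", "red"]:
    print(word, split_morphemes(word))
```

Pure string matching over-splits words whose spelling merely resembles an affix, for example "reading", which is why practical systems rely on a dictionary of stems.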

Steps in NLP

This section describes the different steps in NLP.

Lexical Analysis

It involves identifying and analyzing the structure of words. The lexicon of a language is the collection of words and phrases in that language. Lexical analysis divides the whole chunk of text into paragraphs, sentences, and words.
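The division into paragraphs, sentences, and words can be sketched with the standard `re` module. The splitting rules below (blank lines end paragraphs; ".", "!" or "?" followed by whitespace ends a sentence) are simplifying assumptions and will mishandle abbreviations such as "Dr.".

```python
import re

def lexical_analysis(text):
    """Split text into paragraphs, then sentences, then word tokens."""
    result = []
    # A blank line separates paragraphs.
    for paragraph in re.split(r"\n\s*\n", text.strip()):
        # Whitespace that follows sentence-ending punctuation separates sentences.
        sentences = re.split(r"(?<=[.!?])\s+", paragraph.strip())
        # Words are runs of letters, digits, and apostrophes.
        result.append([re.findall(r"[A-Za-z0-9']+", s) for s in sentences if s])
    return result

sample = "NLP is fun. It has many steps.\n\nThis is a second paragraph."
for paragraph in lexical_analysis(sample):
    print(paragraph)
```

The result is a nested list: paragraphs contain sentences, and sentences contain words, which is the structure later steps work on.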

Syntactic Analysis (Parsing)

It involves analyzing the words in a sentence for grammar and arranging them in a manner that shows the relationships among them. A sentence such as “The school goes to boy” is rejected by an English syntactic analyzer.
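A very small recursive-descent checker shows what "rejected" can mean in practice. The lexicon and the grammar (S -> NP VP, NP -> Det N, VP -> V [PP], PP -> P NP) are assumptions chosen to fit the example; they cover only a handful of words.

```python
# Hypothetical toy lexicon and grammar:
#   S -> NP VP ; NP -> Det N ; VP -> V [PP] ; PP -> P NP
LEXICON = {
    "the": "Det", "a": "Det",
    "boy": "N", "school": "N",
    "goes": "V", "walks": "V",
    "to": "P",
}

def parse_np(tags, i):
    """Return the index after a noun phrase starting at i, or None."""
    if tags[i:i + 2] == ["Det", "N"]:
        return i + 2
    return None

def parse_vp(tags, i):
    """Return the index after a verb phrase starting at i, or None."""
    if tags[i:i + 1] != ["V"]:
        return None
    i += 1
    if tags[i:i + 1] == ["P"]:  # optional prepositional phrase
        return parse_np(tags, i + 1)
    return i

def is_grammatical(sentence):
    """Accept the sentence only if the toy grammar covers all of its words."""
    words = sentence.lower().rstrip(".").split()
    if any(w not in LEXICON for w in words):
        return False
    tags = [LEXICON[w] for w in words]
    i = parse_np(tags, 0)
    if i is None:
        return False
    return parse_vp(tags, i) == len(tags)

print(is_grammatical("The boy goes to the school"))
print(is_grammatical("The school goes to boy"))
```

Under this toy grammar the second sentence fails only because "boy" has no determiner. "The school goes to the boy" would pass, because whether a school can go anywhere is a question of meaning rather than syntax.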

Semantic Analysis

It draws the exact meaning, or the dictionary meaning, from the text. The text is checked for meaningfulness. This is done by mapping syntactic structures to objects in the task domain. The semantic analyzer disregards phrases such as “hot ice-cream”.
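One simple way to picture the meaningfulness check is a feature clash between an adjective and a noun. The feature tables and the function `is_meaningful` below are hypothetical; real semantic analyzers use much richer knowledge.

```python
# Hypothetical semantic features for a few nouns and adjectives.
NOUN_FEATURES = {
    "ice-cream": {"temperature": "cold"},
    "coffee": {"temperature": "hot"},
    "water": {},  # no fixed temperature
}
ADJECTIVE_FEATURES = {
    "hot": {"temperature": "hot"},
    "cold": {"temperature": "cold"},
}

def is_meaningful(adjective, noun):
    """Reject a phrase when the adjective contradicts a fixed noun feature."""
    noun_features = NOUN_FEATURES.get(noun, {})
    for feature, value in ADJECTIVE_FEATURES.get(adjective, {}).items():
        if feature in noun_features and noun_features[feature] != value:
            return False
    return True

for adjective, noun in [("hot", "ice-cream"), ("cold", "ice-cream"), ("hot", "water")]:
    print(adjective, noun, is_meaningful(adjective, noun))
```

"hot ice-cream" is grammatical, so the syntactic step accepts it; it is rejected here only because the assumed features of the two words contradict each other.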

Discourse Integration

The meaning of any sentence depends upon the meaning of the sentence just before it. In addition, it also influences the meaning of the sentence that immediately follows it.

Pragmatic Analysis

During this step, what was said is re-interpreted in terms of what was actually meant. It involves deriving those aspects of language that require real-world knowledge.