Natural Language Processing: A Concise Tutorial
Natural Language Processing - Semantic Analysis
The purpose of semantic analysis is to draw the exact meaning, or dictionary meaning, from a text. A semantic analyzer checks the text for meaningfulness.
Lexical analysis also deals with the meaning of words, so how does semantic analysis differ from it? Lexical analysis works on smaller tokens, whereas semantic analysis focuses on larger chunks. For that reason, semantic analysis can be divided into the following two parts:
Studying the meaning of individual words
The first part of semantic analysis studies the meaning of individual words. This part is called lexical semantics.
Studying the combination of individual words
In the second part, the individual words are combined to give meaning to sentences.
The most important task of semantic analysis is to get the proper meaning of a sentence. For example, consider the sentence “Ram is great.” The speaker may be talking either about Lord Ram or about a person whose name is Ram. Resolving this kind of ambiguity is why the semantic analyzer's job of finding the proper meaning of a sentence is important.
Elements of Semantic Analysis
The following are some important elements of semantic analysis:
Hyponymy
Hyponymy is the relationship between a generic term and instances of that generic term. The generic term is called the hypernym and its instances are called hyponyms. For example, the word “color” is a hypernym, and “blue”, “yellow”, etc. are its hyponyms.
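The hypernym/hyponym relation can be sketched as a simple lookup table. This is a minimal illustration, not a real lexicon: the `HYPONYMS` table and the helper names `hypernym_of` and `is_hyponym` are assumptions made up for this example, and the "animal" entry is added only to show a second hypernym.

```python
# Toy hypernym -> hyponyms table (illustrative entries, not a real lexicon).
HYPONYMS = {
    "color": ["blue", "yellow"],
    "animal": ["dog", "bat"],
}


def hypernym_of(word):
    """Return the generic term whose instances include `word`, or None."""
    for hypernym, hyponyms in HYPONYMS.items():
        if word in hyponyms:
            return hypernym
    return None


def is_hyponym(word, generic):
    """True if `word` is listed as an instance of the generic term."""
    return word in HYPONYMS.get(generic, [])
```

For instance, `hypernym_of("blue")` looks up the generic term for "blue", while `is_hyponym("yellow", "color")` checks the relation in the other direction.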
Homonymy
Homonymy is the relation between words that have the same spelling or form but different, unrelated meanings. For example, the word “bat” is a homonym, because a bat can be an implement used to hit a ball or a nocturnal flying mammal.
Polysemy
Polysemy comes from Greek and means “many signs”. A polysemous word or phrase has different but related senses. In other words, the spelling is the same while the meanings are distinct yet related. For example, the word “bank” is polysemous, with the following meanings:
- A financial institution.
- The building in which such an institution is located.
- A synonym for “to rely on”.
Difference between Polysemy and Homonymy
Polysemous and homonymous words both share the same form or spelling. The main difference is that in polysemy the meanings of the word are related, whereas in homonymy they are not. For example, the same word “bank” can mean ‘a financial institution’ or ‘a river bank’. Taken together, these two senses are an example of homonymy, because the meanings are unrelated to each other.
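The distinction can be sketched with a toy sense inventory in which each sense carries a "family" label: senses in the same family count as related (polysemy), and senses in different families count as unrelated (homonymy). The `SENSES` table, the family labels, and the `relation` function are hypothetical names for this illustration. Real lexical resources decide relatedness in far more nuanced ways.

```python
# Hypothetical sense inventory: word -> list of (gloss, family) pairs.
# Two senses with the same family label are treated as related.
SENSES = {
    "bank": [
        ("a financial institution", "finance"),
        ("the building housing such an institution", "finance"),
        ("the land alongside a river", "river"),
    ],
    "bat": [
        ("an implement used to hit a ball", "sport"),
        ("a nocturnal flying mammal", "animal"),
    ],
}


def relation(word, gloss_a, gloss_b):
    """Classify two senses of `word` as 'polysemy' or 'homonymy'."""
    family = dict(SENSES[word])  # gloss -> family label
    return "polysemy" if family[gloss_a] == family[gloss_b] else "homonymy"
```

Under these assumed labels, the institution and its building are polysemous senses of "bank", while the institution and the river bank are homonyms.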
Synonymy
Synonymy is the relation between two lexical items that have different forms but express the same or a close meaning. Examples are ‘author/writer’ and ‘fate/destiny’.
Antonymy
Antonymy is the relation between two lexical items whose semantic components are symmetric relative to an axis. The scope of antonymy is as follows:
- Application of a property or not: examples are ‘life/death’ and ‘certitude/incertitude’.
- Application of a scalable property: examples are ‘rich/poor’ and ‘hot/cold’.
- Application of a usage: examples are ‘father/son’ and ‘moon/sun’.
Meaning Representation
Semantic analysis creates a representation of the meaning of a sentence. Before getting into the concepts and approaches related to meaning representation, we need to understand the building blocks of a semantic system.
Building Blocks of Semantic System
In representing the meaning of words, the following building blocks play an important role:
- Entities: these represent individuals, such as a particular person or location. For example, Haryana, India, and Ram are all entities.
- Concepts: these represent general categories of individuals, such as person or city.
- Relations: these represent the relationships between entities and concepts. For example, “Ram is a person”.
- Predicates: these represent verb structures. For example, semantic roles and case grammar are examples of predicates.
Meaning representation, then, shows how to put together the building blocks of a semantic system. In other words, it shows how entities, concepts, relations, and predicates combine to describe a situation. It also enables reasoning about the semantic world.
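As a rough sketch of how the four building blocks might be combined to describe a situation, the example below models them as small Python data classes. The class names, the `is_a` label, the role names, and the example situation ("Ram lives in Haryana") are all assumptions made for illustration and are not a standard representation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    name: str  # a general category, e.g. "person"


@dataclass(frozen=True)
class Entity:
    name: str  # a particular individual, e.g. "Ram"


@dataclass(frozen=True)
class Relation:
    entity: Entity
    label: str  # e.g. "is_a"
    concept: Concept


@dataclass(frozen=True)
class Predicate:
    verb: str
    roles: tuple  # (role name, filler) pairs, e.g. ("agent", Entity("Ram"))


ram = Entity("Ram")
haryana = Entity("Haryana")

# Relations tie entities to concepts: "Ram is a person".
facts = {
    Relation(ram, "is_a", Concept("person")),
    Relation(haryana, "is_a", Concept("location")),
}

# A predicate describes a situation through a verb and its semantic roles
# (hypothetical example sentence: "Ram lives in Haryana").
event = Predicate("live", (("agent", ram), ("location", haryana)))


def concepts_of(entity, facts):
    """A very simple form of reasoning: which concepts does an entity belong to?"""
    return {r.concept.name for r in facts if r.entity == entity and r.label == "is_a"}
```

Calling `concepts_of(ram, facts)` retrieves the categories asserted for Ram, which hints at how a representation supports reasoning as well as description.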
Approaches to Meaning Representations
Semantic analysis uses the following approaches to represent meaning:
- First order predicate logic (FOPL)
- Semantic Nets
- Frames
- Conceptual dependency (CD)
- Rule-based architecture
- Case Grammar
- Conceptual Graphs
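To give a feel for one of these approaches, here is a minimal sketch of a semantic net: nodes connected by labeled links, with properties inherited along `is_a` links. The `NET` contents and the `lookup` function are hypothetical and chosen only for illustration. In first order predicate logic, the fact "Ram is a person" would typically be written as a predicate applied to a constant, such as `person(Ram)`.

```python
# A tiny semantic net stored as (node, relation) -> value.
# All nodes and links here are made up for illustration.
NET = {
    ("Ram", "is_a"): "person",
    ("person", "is_a"): "living_being",
    ("person", "has"): "name",
    ("living_being", "can"): "breathe",
}


def lookup(node, relation):
    """Follow is_a links upward until `relation` is found; return None otherwise."""
    seen = set()  # guards against cycles in the is_a chain
    while node is not None and node not in seen:
        seen.add(node)
        if (node, relation) in NET:
            return NET[(node, relation)]
        node = NET.get((node, "is_a"))  # climb to the more general node
    return None
```

Here `lookup("Ram", "can")` finds nothing stored on "Ram" or "person" and inherits the value from "living_being", which is the kind of inference a semantic net is meant to support.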
Need of Meaning Representations
A question that arises here is why we need meaning representation. The reasons include the following:
Linking of linguistic elements to non-linguistic elements
The first reason is that meaning representation makes it possible to link linguistic elements to non-linguistic elements.
Lexical Semantics
The first part of semantic analysis, studying the meaning of individual words, is called lexical semantics. It covers words, sub-words, affixes (sub-units), compound words, and phrases, which are collectively called lexical items. In other words, lexical semantics is the relationship between lexical items, the meaning of sentences, and the syntax of sentences.
The following steps are involved in lexical semantics:
- Classification of lexical items such as words, sub-words, and affixes.
- Decomposition of lexical items such as words, sub-words, and affixes.
- Analysis of the differences and similarities between various lexical semantic structures.
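The classification and decomposition steps above can be sketched with naive string rules. This is a toy under stated assumptions: the `PREFIXES` and `SUFFIXES` lists, the convention of writing affixes with a hyphen, and the functions `classify` and `decompose` are all invented for this example, and real morphological analysis is considerably more involved.

```python
# Hypothetical affix lists; a real analyzer would use a much larger inventory.
PREFIXES = ("un", "re")
SUFFIXES = ("ness", "ing", "ly")


def classify(item):
    """Label a lexical item as 'phrase', 'affix', or 'word' using surface cues."""
    if " " in item:
        return "phrase"
    if item.startswith("-") or item.endswith("-"):
        return "affix"  # assumed convention: affixes are written like "un-" or "-ness"
    return "word"


def decompose(word):
    """Split `word` into (prefix, stem, suffix); a missing part is ''."""
    # Require at least three characters to remain, so "red" is not split as "re" + "d".
    prefix = next(
        (p for p in PREFIXES if word.startswith(p) and len(word) > len(p) + 2), ""
    )
    rest = word[len(prefix):]
    suffix = next(
        (s for s in SUFFIXES if rest.endswith(s) and len(rest) > len(s) + 2), ""
    )
    stem = rest[: len(rest) - len(suffix)] if suffix else rest
    return prefix, stem, suffix
```

With these rules, `decompose("unkindness")` separates the prefix, stem, and suffix, while short words such as "red" are left whole because of the minimum-length guard.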