XML Concise Tutorial

XML - Character Entities

This chapter describes XML character entities. Before looking at character entities, it helps to understand what an XML entity is.

The W3C defines an entity as follows −

This means that entities are placeholders in XML. They can be declared in the document prolog or in a DTD. There are several types of entities; this chapter discusses character entities.

Both HTML and XML reserve some symbols for their own use, and these symbols cannot be written literally as content in an XML document. For example, the < and > signs open and close XML tags. Character entities are used to represent these special characters in content.
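As a minimal sketch of the idea, the Python snippet below escapes reserved symbols before placing text inside an element and then parses the result back. It uses the standard library's xml.sax.saxutils.escape and xml.etree.ElementTree; the element name expr and the sample text are made up for illustration.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

# Sample text containing reserved symbols (hypothetical content).
raw = "if a < b && b > c"

# escape() replaces &, < and > with their entity references.
escaped = escape(raw)
print(escaped)

# The escaped text can be embedded safely; <expr> is an illustrative name.
document = f"<expr>{escaped}</expr>"
parsed = ET.fromstring(document).text

# The parser turns the entity references back into the original characters.
print(parsed == raw)
```

The parser returns the original characters, so escaping changes only how the text is written in the file, not what it means.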

Some special characters and symbols cannot be typed directly from the keyboard. Character entities can also be used to represent those symbols.

Types of Character Entities

There are three types of character entities −

  1. Predefined Character Entities

  2. Numeric Character Entities

  3. Named Character Entities

Predefined Character Entities

Predefined character entities were introduced to avoid ambiguity when using certain symbols. For example, a literal less than ( < ) or greater than ( > ) sign in content is ambiguous, because the same characters delimit tags in XML. The following is the list of predefined character entities from the XML specification. They can be used to express these characters without ambiguity.

  1. Ampersand ( & ) − &amp;

  2. Single quote ( ' ) − &apos;

  3. Greater than ( > ) − &gt;

  4. Less than ( < ) − &lt;

  5. Double quote ( " ) − &quot;
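The five predefined entities can be seen in action with a short sketch. It assumes Python's standard xml.etree.ElementTree parser; the note element and its title attribute are hypothetical names chosen for the example.

```python
import xml.etree.ElementTree as ET

# A document that uses all five predefined entities.
# The <note> element and its title attribute are illustrative only.
doc = '<note title="&quot;quoted&quot;">&lt;tag&gt; &amp; &apos;text&apos;</note>'

root = ET.fromstring(doc)

# The parser replaces each entity reference with the character it stands for.
content = root.text
title = root.get("title")
print(content)
print(title)
```

No declaration is needed for these five names, because every XML parser is required to recognize them.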

Numeric Character Entities

A numeric reference refers to a character by its number in the Unicode character set. It can be written in either decimal or hexadecimal format. Because there are thousands of possible numeric references, they are hard to remember.

General syntax for a decimal numeric reference is −

&#decimal-number;

General syntax for a hexadecimal numeric reference is −

&#xhexadecimal-number;
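The two forms can be generated from a character's Unicode number. The following is a minimal sketch in Python; the helper names decimal_ref and hex_ref and the element name c are hypothetical, and the parse step uses the standard xml.etree.ElementTree module to confirm that both forms resolve to the same character.

```python
import xml.etree.ElementTree as ET


def decimal_ref(ch: str) -> str:
    """Build a decimal numeric reference from the character's code point."""
    return f"&#{ord(ch)};"


def hex_ref(ch: str) -> str:
    """Build a hexadecimal numeric reference from the character's code point."""
    return f"&#x{ord(ch):X};"


for ch in ["<", "\u00a9", "\u20ac"]:  # less than, copyright sign, euro sign
    dec, hexa = decimal_ref(ch), hex_ref(ch)
    # Both references parse back to the same character.
    from_dec = ET.fromstring(f"<c>{dec}</c>").text
    from_hex = ET.fromstring(f"<c>{hexa}</c>").text
    print(dec, hexa, from_dec == from_hex == ch)
```

There must be no spaces between the &# (or &#x) prefix, the number, and the closing semicolon.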

The following table lists the predefined character entities with their numeric values −

Entity name   Character   Decimal reference   Hexadecimal reference
quot          "           &#34;               &#x22;
amp           &           &#38;               &#x26;
apos          '           &#39;               &#x27;
lt            <           &#60;               &#x3C;
gt            >           &#62;               &#x3E;

Named Character Entities

Because numeric references are hard to remember, named character entities are often preferred. Here, each entity is identified by a name. Apart from the five predefined entities, XML itself defines no names, so a named entity such as those below must be declared in a DTD before it can be used.

For example −

  1. 'Aacute' represents the capital letter A with an acute accent (Á).

  2. 'ugrave' represents the small letter u with a grave accent (ù).
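Names such as Aacute and ugrave are familiar from HTML, but an XML parser only accepts them if they are declared. The sketch below, which assumes Python's standard xml.etree.ElementTree and a hypothetical name element, declares both entities in an internal DTD subset and then shows that the same reference is rejected when the declaration is missing.

```python
import xml.etree.ElementTree as ET

# Both entities are declared in the internal DTD subset.
# The <name> element and its content are made up for illustration.
declared = """<!DOCTYPE name [
  <!ENTITY Aacute "&#193;">
  <!ENTITY ugrave "&#249;">
]>
<name>&Aacute;vila, o&ugrave;</name>"""

text = ET.fromstring(declared).text
print(text)

# Without a declaration, the parser reports an undefined entity.
undeclared = "<name>&Aacute;vila</name>"
try:
    ET.fromstring(undeclared)
    undeclared_ok = True
except ET.ParseError:
    undeclared_ok = False
print(undeclared_ok)
```

Each declaration maps the name to a numeric reference, so a named entity is in effect a readable alias for a Unicode number.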