XML Concise Tutorial

XML - Syntax

In this chapter, we discuss the simple syntax rules for writing an XML document. The following is a complete XML document −

<?xml version = "1.0"?>
<contact-info>
   <name>Tanmay Patil</name>
   <company>TutorialsPoint</company>
   <phone>(011) 123-4567</phone>
</contact-info>

You will notice two kinds of information in the above example −

  1. Markup, like <contact-info>

  2. The text, or character data, such as Tanmay Patil, TutorialsPoint and (011) 123-4567.
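The split between markup and character data can be seen by running the sample through a parser. The following is a minimal sketch that assumes Python's standard xml.etree.ElementTree module; the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# The complete sample document from the start of this chapter.
DOCUMENT = """<?xml version = "1.0"?>
<contact-info>
   <name>Tanmay Patil</name>
   <company>TutorialsPoint</company>
   <phone>(011) 123-4567</phone>
</contact-info>"""

root = ET.fromstring(DOCUMENT)

# Markup: the element names that give the document its structure.
tag_names = [root.tag] + [child.tag for child in root]

# Character data: the text carried inside each child element.
texts = [child.text for child in root]

print(tag_names)
print(texts)
```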

The following diagram depicts the syntax rules for writing different types of markup and text in an XML document.

[Figure: XML syntax rules]

Let us look at each component of the above diagram in detail.

XML Declaration

An XML document can optionally begin with an XML declaration. It is written as follows −

<?xml version = "1.0" encoding = "UTF-8"?>

Here, version is the XML version and encoding specifies the character encoding used in the document.
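To illustrate what the encoding value does, the sketch below serializes a document as ISO-8859-1 bytes and lets the parser decode it from the declaration. It assumes Python's standard xml.etree.ElementTree module; the sample name "José" is made up for the example.

```python
import xml.etree.ElementTree as ET

# A document whose bytes are ISO-8859-1, as announced in its declaration.
# "\u00e9" is the letter e with an acute accent; it is one byte in this encoding.
text = '<?xml version = "1.0" encoding = "ISO-8859-1"?><name>Jos\u00e9</name>'
raw = text.encode("iso-8859-1")

# The parser reads the encoding value and decodes the bytes accordingly.
name = ET.fromstring(raw)
print(name.text)
```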

Syntax Rules for XML Declaration

  1. The XML declaration is case sensitive and must begin with "<?xml", where "xml" is written in lower case.

  2. If the document contains an XML declaration, it must be the very first thing in the document; nothing, not even whitespace, may precede it.

  3. A transport protocol such as HTTP can supply its own encoding information (for example, a charset parameter in the Content-Type header), which can override the encoding value given in the XML declaration.
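The position rule is easy to check with a parser. This is a minimal sketch, assuming Python's standard xml.etree.ElementTree module; the helper name parses is hypothetical.

```python
import xml.etree.ElementTree as ET

def parses(text):
    """Return True if the parser accepts text, False on a syntax error."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# The declaration is the first thing in the document.
declaration_first = '<?xml version = "1.0"?>\n<note>ok</note>'

# A single line break precedes the declaration, which is not allowed.
declaration_late = '\n<?xml version = "1.0"?>\n<note>ok</note>'

print(parses(declaration_first))
print(parses(declaration_late))
```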

Tags and Elements

An XML file is structured as a set of XML elements, also loosely called XML nodes or XML tags. The names of XML elements are enclosed in angle brackets < > as shown below −

<element>

Syntax Rules for Tags and Elements

Element Syntax − Every XML element must be closed. An element with content uses a start tag and a matching end tag, as shown below −

<element>....</element>

or, for an element with no content, the shorter self-closing form −

<element/>
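The two ways of closing an element can be compared with a parser. The sketch below assumes Python's standard xml.etree.ElementTree module and uses illustrative variable names.

```python
import xml.etree.ElementTree as ET

long_form = ET.fromstring("<element></element>")
short_form = ET.fromstring("<element/>")

# Both forms yield an element with the same name, no text and no children.
for parsed in (long_form, short_form):
    print(parsed.tag, parsed.text, len(parsed))

# An element that is never closed is rejected by the parser.
try:
    ET.fromstring("<element>")
    unclosed_ok = True
except ET.ParseError:
    unclosed_ok = False

print(unclosed_ok)
```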

Nesting of Elements − An XML element can contain multiple XML elements as its children, but the child elements must not overlap. That is, the end tag of an element must have the same name as the most recent unmatched start tag.

The following example shows incorrectly nested tags −

<?xml version = "1.0"?>
<contact-info>
<company>TutorialsPoint
</contact-info>
</company>

The following example shows correctly nested tags −

<?xml version = "1.0"?>
<contact-info>
   <company>TutorialsPoint</company>
</contact-info>
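A parser enforces the nesting rule directly. This is a minimal sketch assuming Python's standard xml.etree.ElementTree module; the helper name is_well_formed is hypothetical.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts text, False on a syntax error."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# The end tag of contact-info arrives while company is still open.
overlapping = """<?xml version = "1.0"?>
<contact-info>
<company>TutorialsPoint
</contact-info>
</company>"""

# Each end tag matches the most recent unmatched start tag.
nested = """<?xml version = "1.0"?>
<contact-info>
   <company>TutorialsPoint</company>
</contact-info>"""

print(is_well_formed(overlapping))
print(is_well_formed(nested))
```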

Root Element − An XML document can have only one root element. For example, the following is not a correct XML document, because both the x and y elements occur at the top level without a root element −

<x>...</x>
<y>...</y>

The following example shows a correctly formed XML document −

<root>
   <x>...</x>
   <y>...</y>
</root>
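The single-root rule can be checked the same way. The sketch below assumes Python's standard xml.etree.ElementTree module; has_single_root is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

def has_single_root(text):
    """Return True if text parses as a document with exactly one root element."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

two_top_level = "<x>...</x>\n<y>...</y>"
wrapped = "<root>\n   <x>...</x>\n   <y>...</y>\n</root>"

print(has_single_root(two_top_level))
print(has_single_root(wrapped))

# In the accepted document, x and y are children of the root element.
child_names = [child.tag for child in ET.fromstring(wrapped)]
print(child_names)
```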

Case Sensitivity − The names of XML elements are case-sensitive. That means the names in the start tag and the end tag must match exactly, including case.

For example, <contact-info> is different from <Contact-Info>.
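As a quick check of case sensitivity, the sketch below (assuming Python's standard xml.etree.ElementTree module, with a made-up wrapper element named list) shows that the two spellings are distinct names and that they cannot close each other.

```python
import xml.etree.ElementTree as ET

# The two spellings are treated as two different element names.
root = ET.fromstring("<list><contact-info/><Contact-Info/></list>")
names = [child.tag for child in root]
print(names)

# A start tag cannot be closed by an end tag that differs only in case.
try:
    ET.fromstring("<contact-info>text</Contact-Info>")
    mixed_case_ok = True
except ET.ParseError:
    mixed_case_ok = False

print(mixed_case_ok)
```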

XML Attributes

An attribute specifies a single property of an element as a name/value pair. An XML element can have one or more attributes. For example −

<a href = "http://www.tutorialspoint.com/">Tutorialspoint!</a>

Here href is the attribute name and http://www.tutorialspoint.com/ is the attribute value.

Syntax Rules for XML Attributes

  1. Attribute names in XML (unlike HTML) are case sensitive. That is, HREF and href are considered two different XML attributes.

  2. The same attribute cannot appear twice in one start tag. The following example shows incorrect syntax because the attribute b is specified twice −

<a b = "x" c = "y" b = "z">....</a>

  3. Attribute names are written without quotation marks, whereas attribute values must always appear in quotation marks (single or double). The following example demonstrates incorrect XML syntax −

<a b = x>....</a>

In the above syntax, the attribute value is not enclosed in quotation marks.
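The three attribute rules can be exercised against a parser. This is a minimal sketch assuming Python's standard xml.etree.ElementTree module; the helper name parses and the attribute values are illustrative.

```python
import xml.etree.ElementTree as ET

def parses(text):
    """Return True if the parser accepts text, False on a syntax error."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Rule 1: href and HREF are two different attributes and may coexist.
link = ET.fromstring('<a href = "lower" HREF = "upper">text</a>')
print(link.attrib)

# Rule 2: the attribute b is specified twice.
duplicate = '<a b = "x" c = "y" b = "z">....</a>'

# Rule 3: the attribute value is not quoted.
unquoted = "<a b = x>....</a>"

print(parses(duplicate))
print(parses(unquoted))
print(parses('<a b = "x">....</a>'))
```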

XML References

References let you add or include additional text or markup in an XML document. A reference always begins with the symbol "&", which is a reserved character, and ends with the symbol ";". XML has two types of references −

  1. Entity References − An entity reference contains a name between the start and end delimiters. For example, &amp; where amp is the name. The name refers to a predefined string of text and/or markup.

  2. Character References − These contain a hash mark ("#") followed by a number, as in &#65;. The number always refers to the Unicode code point of a character. In this case, 65 refers to the letter "A".
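A parser resolves both kinds of reference while reading the text. The sketch below assumes Python's standard xml.etree.ElementTree module. It also uses the hexadecimal form &#x41;, which the text above does not mention but which XML accepts as an alternative to the decimal form.

```python
import xml.etree.ElementTree as ET

# One entity reference and two character references (decimal and hexadecimal).
element = ET.fromstring("<t>&amp; &#65; &#x41;</t>")

# The parser hands back the referenced characters, not the references.
resolved = element.text
print(resolved)

# The number in a character reference is the Unicode code point.
print(ord("A"))
```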

XML Text

The names of XML elements and XML attributes are case-sensitive, which means the names in the start and end tags must be written in the same case. To avoid character encoding problems, XML files should be saved as Unicode, in UTF-8 or UTF-16.

Whitespace characters such as blanks, tabs and line breaks between XML attributes are not significant. Whitespace between XML elements is still passed on by the parser, but applications commonly ignore it when it serves only as indentation.
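How whitespace is treated depends on where it appears. The sketch below assumes Python's standard xml.etree.ElementTree module and a made-up item element. With this parser, whitespace between attributes leaves no trace, while whitespace between elements is still reported, so it is up to the application to discard it.

```python
import xml.etree.ElementTree as ET

document = '<root>\n   <item   id = "1"\n          kind = "a" />\n</root>'
root = ET.fromstring(document)
item = root[0]

# Blanks and line breaks between attributes do not affect the result.
print(item.attrib)

# Whitespace between elements is reported as text by this parser.
print(repr(root.text))
print(repr(item.tail))

# A typical application treats whitespace-only text as insignificant.
is_indentation_only = root.text.strip() == ""
print(is_indentation_only)
```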

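Some characters are reserved by the XML syntax itself and cannot be used directly in text. To include them, use the replacement entities listed below. Strictly, only < and & must always be replaced in element content; the others need replacing only in certain positions, such as a quotation mark inside an attribute value delimited by the same kind of quote −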

Not Allowed Character    Replacement Entity    Character Description
<                        &lt;                  less than
>                        &gt;                  greater than
&                        &amp;                 ampersand
'                        &apos;                apostrophe
"                        &quot;                quotation mark
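In practice the replacement is usually left to a library rather than done by hand. This is a minimal sketch assuming Python's standard xml.sax.saxutils and xml.etree.ElementTree modules; the sample string and the QUOTES mapping are illustrative. By default escape handles &, < and >, and the extra mapping adds the two quote characters.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Extra replacements for the apostrophe and the quotation mark.
QUOTES = {"'": "&apos;", '"': "&quot;"}

raw = 'Tom & "Jerry" <cat> \'mouse\''
escaped = escape(raw, QUOTES)
print(escaped)

# Embedding the escaped text in an element and parsing it restores the original.
round_trip = ET.fromstring("<t>" + escaped + "</t>").text
print(round_trip == raw)
```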