Automata Theory 简明教程
Automata Theory Tutorial
Automata Theory 是计算机科学的一个分支,它涉及设计遵循规定操作顺序自动执行的抽象自推进计算设备。
Automata Theory is a branch of computer science that deals with designing abstract self-propelled computing devices that follow a predetermined sequence of operations automatically.
具有有限状态数的自动机称为 Finite Automaton 。这是一个简短而简洁的自学教程,介绍有限自动机、正规语言和下推自动机的基本概念,然后进入图灵机和可判定的学习。
An automaton with a finite number of states is called a Finite Automaton. This is a brief and concise tutorial that introduces the fundamental concepts of Finite Automata, Regular Languages, and Pushdown Automata before moving onto Turing machines and Decidability.
Who Should Learn Automata Theory?
本教程旨在为攻读任何信息技术或计算机科学相关专业的学生准备。它旨在帮助学生掌握自动机理论中的重要概念。
This tutorial has been prepared for students pursuing a degree in any information technology or computer science related field. It attempts to help students grasp the essential concepts involved in automata theory.
Prerequisites to Learn Automata Theory
本教程很好地平衡了理论和数学严谨性。读者应具备离散数学结构的基本理解。
This tutorial has a good balance between theory and mathematical rigor. The readers are expected to have a basic understanding of discrete mathematical structures.
What is Automata Theory?
在数学和计算机科学中, * automata theory* 涉及抽象机器和计算问题。自动机是遵循预定操作自动执行的自推进计算设备。
In mathematics and computer science, automata theory deals with abstract machines and computational problems. Automatons are self-propelled computing devices that follow predetermined operations automatically.
约翰·冯·诺伊曼说:“自动机理论是一门逻辑理论,它涉及自动机和信息的学习,强调需要一个详细的、数学的和分析性的理论来理解复杂的自动机。”
According to John von Neumann, "Automata theory is a logical theory that deals with the study of automata and information, emphasizing the need for a detailed, mathematical, and analytical theory to understand complex automata."
What is an Automaton?
自动机是一个有限尺寸的设备,具有指定的输入和输出,其行为由输入决定。换句话说,自动机是指根据预先确定的指令或过程转换信息的自操作机械对象。
An automaton is a finite-sized device with specified inputs and outputs, whose behavior is determined by inputs. In other words, an automaton refers to self-operating mechanical objects that transform information based on predetermined instructions or procedures.
自动机还可以指一类机电设备,既有理论上也有实际上的,它们在运动后转换信息。
An automaton can also refer to a class of electromechanical devices, both theoretical and real, that transform information after being set in motion.
What are the Main Types of Automata?
自动机可分类如下:
The automata could be classified into different types as follows −
-
Finite Automata − A Finite automaton is an automaton with a finite number of states that changes its state based on the input string of symbols. When the desiring symbol is search, then the transition occurs.
-
Pushdown Automata − A pushdown automaton (PDA) is a stack-based automaton used in the theory of computation. It is more capable than finite-state machines to solve little complex problem than FA.
-
Turing Machine − A Turing machine is a theoretical device that uses a table of rules to manipulate symbols on a tape strip, consisting of an infinitely long tape divided into cells with 1’s, 0’s, or empty spaces.
-
Linear Bounded Automata − A linear-bounded automaton (LBA) is a Turing machine that uses only the input tape space, unlike a Turing machine. Its length is a linear function of the input’s length, and only a finite contiguous portion of the tape can be accessed by the read/write head.
-
Cellular Automata − Cellular automata are deterministic systems that evolve in discrete time and space, typically a grid. They consist of locally updated cells that follow a global time scale and recursive rule, updating synchronously across the grid.
What is a Finite Automaton?
A * finite-state machine (FSM)* 是一个数学计算模型,它在任何给定时间都能处于有限数量的状态之一。它可以根据输入(称为 transition )从一个状态转换为另一个状态。
A finite-state machine (FSM) is a mathematical model of computation that can be in one of a finite number of states at any given time. It can change from one state to another in response to inputs, called a transition.

FSM 由其状态、初始状态和输入定义。有两种类型的 FSM: deterministic 和 non-deterministic 。对于任何非确定性 FSM,都可以构建一个等效的确定性 FSM。
An FSM is defined by its states, initial state, and inputs. There are two types of FSM: deterministic and non-deterministic. For any non-deterministic FSM, an equivalent deterministic one can be constructed.
Difference between Deterministic and Non-deterministic Finite Automata
下表突出了确定性和非确定性有限自动机之间的主要区别:
The following table highlights the major differences between Deterministic and Non-deterministic Finite Automate −
Aspect |
Deterministic Finite Automaton |
Nondeterministic Finite Automaton |
State transitions |
Each input determines the resulting state uniquely |
Some inputs may allow a choice of resulting states |
Predictability |
Final state is completely determined by input word |
Several possible final states for a given input word |
Empty transitions |
Can change state only after reading an input |
May change state without reading any input |
Initial state |
One initial state |
May have more than one initial state |
Word acceptance |
Word is accepted if the final state is an acceptor state |
Word is accepted if at least one possible final state is an acceptor state |
State occupancy |
In one state at a time |
Can be thought of as being in multiple states at once |
What is a Pushdown Automaton?
-
Pushdown automata* 是具有栈形式附加内存的非确定性有限状态机。它比 FSM 更强大,但比图灵机弱。它们接受上下文无关语言。
Pushdown automata are nondeterministic finite state machines with additional memory in the form of a stack. It is more powerful than a FSM but less than a Turing machine. They accept context-free languages.

例如,为了确保代码有效,程序员可以将代码输入到下推自动机中,下推自动机会用为平衡括号语言实现上下文无关语法的过渡函数对其进行编程。如果代码有效且所有括号都匹配,则下推自动机将接受该代码。如果存在不平衡的括号,则自动机可以返回代码的无效性。
For example, to ensure a valid code, a programmer can feed the code into a pushdown automaton programmed with transition functions that implement the context-free grammar for the language of balanced parentheses. If the code is valid and all parentheses are matched, the pushdown automaton will accept the code. If unbalanced parentheses are present, the automaton can return the code’s invalidity.
What is a Turing Machine?
-
Turing machine* 是一种抽象计算模型,它通过读写无限胶带执行计算。它为解决计算机科学问题和测试计算限制提供了强大的计算模型。图灵机是由艾伦·图灵在 1936 年发明的。
A Turing machine is an abstract computational model that performs computations by reading and writing to an infinite tape. It provides a powerful computational model for solving computer science problems and testing the limits of computation. Turing machines was invented by Alan Turing in 1936.

图灵机由无限胶带、磁带头和状态转换表组成。它们对位串执行,磁带头处于特定状态,表控制其行为。
A Turing Machine consists of an infinite tape, a tape head, and a state transition table. They execute on an input string of bits, with the tape head in a certain state and the table governing its behavior.
图灵机可以模拟程序的复杂性和它对内存中不同数据的反应。
Turing machines can simulate the complexity of a program and its reactions to different data in memory.
What is the Chomsky Hierarchy?
-
Chomsky* 等级制度是计算机科学和语言学中的分类系统,由四个级别组成 −
The Chomsky hierarchy is a classification system in computer science and linguistics that consists of four levels −
-
Type 0 (unrestricted)
-
Type 1 (context-sensitive)
-
Type 2 (context-free)
-
Type 3 (regular)
它以诺姆·乔姆斯基的名字命名,描述了不同类型形式语言和语法的表达能力。它是形式语言研究中的一个基本概念,并用于解析算法和其他用于处理形式语言的工具的开发。
Named after Noam Chomsky, it characterizes the expressive power of different types of formal languages and grammars. It is a fundamental concept in the study of formal languages and is used in the development of parsing algorithms and other tools for working with formal languages.

What are Context-Free Languages?
-
Context-free languages (CFLs)* 由上下文无关语法生成,这与下推自动机接受的语言集相同。正规语言是上下文无关语言的子集,如果输入语言以接受的最终状态结束,则由计算模型接受。
Context-free languages (CFLs) are generated by context-free grammars, which are identical to the set of languages accepted by pushdown automata. Regular languages are a subset of context-free languages, and inputted languages are accepted by computational models if they end in an accepting final state.
形式语言理论将语言定义为一组受特定规则约束的符号,例如书面英语语言。一个有效的句子必须遵循特定的语法规则。上下文无关语言是通过上下文无关语法生成的 * language generated* ,它可以由多个上下文无关语法生成,这使其更加通用和包容。
Formal language theory defines a language as a set of symbols constrained by specific rules, like the written English language. A valid sentence must follow specific grammar rules. A context-free language is a language generated by a context-free grammar, which can be generated by multiple context-free grammars, making it more general and inclusive.
$\mathrm{\lbrace a {n}b {n} \: | \: n > 0\rbrace}$ 是上下文无关语法的示例,它表示所有具有相等数量 a 和 b 的字符串。例如 $\mathrm{a {4}b {4}}$ 是 aaaabbbb 。
$\mathrm{\lbrace a{n}b{n} \: | \: n > 0\rbrace}$ is an example of a Context Free Grammar, which represents all strings with equal numbers of a’s and b’s. For example $\mathrm{a{4}b{4}}$ is aaaabbbb.
What are Context-Sensitive Languages?
上下文相关语法比上下文无关语法更强大,因为它们能够描述某些不能由上下文无关语法描述的语言。它们位于乔姆斯基等级制度中上下文无关语法和不受限语法之间。
Context-sensitive grammars are more powerful than context-free grammars due to their ability to describe certain languages that cannot be described by context-free grammars. They are positioned between context-free and unrestricted grammars in the Chomsky hierarchy.
上下文相关语法 (CSG) 允许生成规则被终结符和非终结符包围。
A Context-Sensitive Grammar (CSG) allows production rules to be surrounded by terminal and nonterminal symbols.
$\mathrm{\lbrace a {n}b {n}c^{n} \: | \: n > 0\rbrace}$ 是 * CFG* 的示例,表示所有具有相等数量的 a、b 和 c 的字符串。例如 $\mathrm{a {4}b {4}c^{4}}$ 是 aaaabbbbcccc 。为了通过 CSG 生成这种语言,有必要在生成时跟踪左右上下文。
$\mathrm{\lbrace a{n}b{n}c^{n} \: | \: n > 0\rbrace}$ is an example of CFG, which represents all strings with equal numbers of a’s and b’s and c’s. For example $\mathrm{a{4}b{4}c^{4}}$ is aaaabbbbcccc. To generate such language through CSG, it is necessary to keep track of the left and right contexts while generating.
What are Recursively Enumerable Languages?
-
Recursive Enumerable (RE)* 语言也称为 0 型语言,由 0 型语法生成,并且可以被图灵机接受或识别。
Recursive Enumerable (RE) languages, also known as Type-0 languages, are generated by type-0 grammars, and can be accepted or recognized by a Turing machine.
RE 语言可以进入语言字符串的最终状态,也可能不会进入非语言字符串的拒绝状态。TM 可以永远循环非语言字符串,使其成为图灵可识别的语言。
RE languages can enter the final state for the language’s strings and may or may not enter the rejecting state for non-language strings. TM can loop forever for non-language strings, making them Turing recognizable languages.
这些字符串可能执行以下操作之一:
These strings may do one of the following −
-
Halt and accept
-
Halt and reject
-
Loop forever
可递归枚举的另一个子集是递归语言,它可以被全图灵机所接受并且可以在这两种情况下停止:“停止并接受”或“停止并拒绝”。
Another subset of Recursive Enumerable is Recursive Languages which can be accepted by Total Turing Machine, and it will halt in either of the two cases: "Halt and accept" or "Halt and reject".
What is the Significance of Pumping Lemma?
-
Pumping lemma* 是一种定理(通常较小的、被证明的命题,可以用作迈向更大的结果的垫脚石),其中我们通过矛盾来证明一些东西。在 TOC 中,抽送引理用于反驳一种语言是有规律的或没有上下文的(如果我们对 CFG 使用抽送引理)。
Pumping lemma is a kind of Lemma (Generally minor, proven proposition which can be used as a stepping stone to a larger result) where we prove something through contradiction. In TOC pumping lemma are used to disprove a language is regular or context free (if we use pumping lemma for CFG).
抽送引理可以作为证明某些语言不是正则的或没有上下文的工具。通过应用抽送引理,我们可以表明某些语言不能被有限状态自动机或下推自动机识别。这在理论计算机科学和形式语言理论中特别有用,可以确立这些计算模型的局限性。
Pumping lemma could be serves as a tool for proving that certain languages are not regular or not context-free. By applying the pumping lemma, we can show that some languages cannot be recognized by finite automata or pushdown automata. This is particularly useful in theoretical computer science and formal language theory to establish the limitations of these computational models.
抽送引理有助于对语言进行分类,了解不同类型自动机的功能和约束。
The pumping lemma helps in classifying languages and understanding the capabilities and constraints of different types of automata.
What is a Grammar in the Context of Formal Languages?
形式语法是一组用于形式语言中字符串的产生规则,它定义了哪些字符串根据该语言的语法是有效的,而不是根据它们的含义或语境。它专注于这些形式语言中字符串的形式。
A formal grammar is a set of production rules for strings in a formal language, defining which strings are valid according to the language’s syntax, rather than their meaning or context. It focuses on the form of these strings in a formal language.
它被表示为 G(V, N, P, S) 。它生成整个字母上的语法正确的字符串,主要用于语法分析阶段,尤其是在编译过程中,在编译阶段。
It is represented as G(V, N, P, S). It generates all syntactically correct strings over the alphabet, primarily used in the syntactic analysis phase, particularly during compilation, during the compilation phase.
G = (V, N, P, S) 是一门语法,其中:
G = (V, N, P, S) is a grammar, where −
-
N is a finite set of non-terminal symbols
-
V is a finite set of terminal symbols
-
P describes a set of production rules
-
S is the start symbol
Difference Between a Language and a Grammar
语法是一组用于生成语言字符串的产生规则。 * Grammars* 不过是符号、产生规则和起始符号的集合,通过四元组 G = (V, N, P, S) 表示。
A grammar is a set of production rules which are used to generate strings of a language. Grammars are nothing but a collection of symbols, production rules and a start symbol, represented through four-tuples G = (V, N, P, S).
另一方面,语言是语法的产物。如果未定义语法,则无法创建语言。在以下示例中显示了该语法使用的语法和语言的概念。
Language on the other hand the product of a grammar. If a grammar is not defined, we cannot create a language. In the following example it is shown the concept of grammar and languages used by that grammar.
\mathrm{G \: = \: (V,N,P,S) \: = \: (\lbrace S,A,B,C \rbrace , \: \lbrace a,b,c \rbrace , \: \lbrace S \rightarrow ABC, \: A\rightarrow a, \: B\rightarrow b, \: C\rightarrow c \rbrace , \: \lbrace S \rbrace)}
可能的语言 L(G) 可以通过该语法形成, L(G) = {abc}
A possible language L(G) could be formed through this grammar, L(G) = {abc}
S → ABC 来自 A → a, B → b 和 C → c ,它会生成“ abc ”,这是一种语言。
S → ABC from A → a, B → b and C → c, it will produce "abc" which is a language.
Difference Between a Decidable Problem and an Undecidable Problem
对于可判定问题,算法可以针对任何给定的实例提供明确的答案,例如“是”或“否”,而不可判定性指的是在其中没有算法可以针对所有可能的实例提供明确答案的问题。
For decidable problems an algorithm can provide a definite answer, like "yes" or "no" for any given instance, while undecidability refers to problems where no algorithm can provide a definite answer for all possible instances.
Aspect |
Decidable Problems |
Undecidable Problems |
Definition |
Algorithm exists for definite answer |
No algorithm for definite answer in all cases |
Answer |
Definite "yes" or "no" |
Lack definite answer for all instances |
Algorithm |
Efficient solving algorithm exists |
No algorithm for all instances |
Time |
Finite answer within reasonable time |
May not terminate for all inputs |
Proof |
Proven by existence of algorithm |
Proven through rigorous mathematical proofs |
Impact |
Allows efficient problem-solving |
Poses fundamental challenges |
Example |
Language membership, sorting |
Halting problem, Collatz Conjecture |
What is the Halting Problem?
终止问题是可计算性理论中的一个决策问题,用于确定计算机程序将终止还是永久运行。
The halting problem is a decision problem in computability theory that determines whether a computer program will terminate or run forever.
例如,如果我们从用户那里获取一个正数“@ {s0}”(x> 0),并在 while 循环内检查“x”是否为 0,在循环内我们不更新“x”的值。这将永远不会停止。
For example, if we take a positive number "x" (x > 0) from the user and inside a while loop, check "x" is 0 or not, inside the loop we are not updating the value of "x". This will never halt.
终止问题是一个众所周知的问题,已被证明是不可判定的,这意味着没有程序可以为足够通用的计算机程序解决它。
Halting problem is a well-known problem that has been proven to be undecidable, meaning no program can solve it for general enough computer programs.
图灵机与“普通计算机”一样强大,经常用于计算理论。 1936 年,艾伦·图灵证明了使用图灵机的图灵机上的终止问题是不可判定的,这意味着没有图灵机可以正确地为所有可能的程序/输入对做出决定。
Turing machines, which are as strong as "usual computers," are often used in computation theory. In 1936, Alan Turing proved that the halting problem over Turing machines is undecidable using a Turing machine, meaning no Turing machine can correctly decide for all possible program/input pairs.
What are P and NP Classes in Computational Complexity?
P 是一个复杂性类,其中包括可以使用确定性图灵机在多项式时间内解决的决策问题。它通常被认为是“可有效解决的”或“易于处理的”计算问题。
P is a complexity class that includes decision problems that can be solved by a deterministic Turing machine using polynomial time. It is often considered "efficiently solveable" or "tractable" computational problems.
难以处理的问题在理论上是可以解决的,但实际上无法解决。 P 包含自然问题,比如计算最大公约数,以及寻找最大匹配,确定一个数是否是质数也是 P 的一部分。
Intractable problems are theoretically solvable but cannot be solved in practice. P contains natural problems like calculating the greatest common divisor, and finding maximum matching, determining if a number is prime is also part of P.
NP 或非确定性多项式时间,是一组可以在非确定性图灵机上在多项式时间内解决的决策问题。这些问题可以通过确定性图灵机在多项式时间内进行验证。一些示例包括布尔可满足性问题 (SAT)、哈密顿路径问题(TSP 的特例)、顶点覆盖问题等。
NP, or Non-deterministic Polynomial Time, is a set of decision problems solvable in polynomial time on a non-deterministic Turing machine. These problems can be verified by a deterministic Turing machine in polynomial time. Some examples include Boolean satisfiability problem (SAT), the Hamiltonian path problem (special case of TSP), the Vertex cover problem, etc.
Difference Between NP-Complete and NP-Hard Problems
在计算理论中,问题 X 可以称为 NP-Hard,如果存在一个现有的 NP-Complete 问题 Y 可以在多项式时间内简化为 X 。
In computation theory, a problem X can be termed as NP-Hard if there is an existing NP-Complete problem Y that can be reduced to X within polynomial time.
NP-Hard 问题的难度等级与 NP-Complete 问题相同。然而,NP-Hard 问题不一定属于 NP 类。一个问题只有在属于 NP-Hard 和 NP 问题的情况下才能成为 NP-Complete。
The difficulty level of an NP-Hard problem is like that of an NP-Complete problem. However, NP-Hard problems do not necessarily have to fall under the NP Class. A problem can only be NP-Complete if it’s part of both NP-Hard and NP Problems.
Aspect |
NP-Hard |
NP-Complete |
Meaning |
The solution to an NP-Hard Problem X requires the existence of an NP-Complete Problem Y, which can be reduced to the problem X in polynomial time. |
A problem X is NP-Complete when there is an NP problem Y, which is reducible to X in a polynomial line. |
NP Class |
Doesn’t need to be in NP class |
Must be in NP class |
Relationship |
Subset of NP-Hard is NP-Complete |
Must be both NP and NP-Hard |
Examples |
Circuit-satisfactory Vertex Cover |
Hamiltonian cycle determination in a graph Boolean formula satisfiability problem. |
What is a Transition Function in the Context of Automata?
在自动机理论的语境中,有一些表格包括状态、输入符号、初始状态、终态集合和转移函数。
In the context of automata theory, there are certain tables which includes states, input symbols, starting state, set of final states and transition functions.
(Q, Σ, q0, F, T) ,其中 Q 是状态集合,而 Σ 是输入符号。
(Q, Σ, q0, F, T), where Q is the set of states and Σ are the input symbols.
当状态 q 从集合 Σ 中获取输入时,它会从 q 发送到另一个集合 q' 或 q 本身。基于输入的转移在转移函数中定义。转移函数集合在集合 T 中提及。
When a state q takes an input from the set Σ, it sends from q to another set q' or the q itself. The transition based on inputs are defined in the transition functions. The set of transition functions are mentioned in the set T.

在这个状态图中,有以下几个转移,其中一些可能为:
In this state diagram, there are several transitions, some of them could be −
\mathrm{\delta(q1, 0) \: \rightarrow \: q2}
\mathrm{\delta(q3, 1) \: \rightarrow \: qf}
What is a State Diagram?
自动机理论中的状态图是对有限自动机行为的可视化表示。
A state diagram in automata theory is a visual representation of a finite automaton’s behavior.
它由表示状态的顶点(集合 Q )、输入符号( Σ )和输出符号 (Z) 组成,如果不存在输出集合,它可以具有终态或确认状态集合 Q, F 的子集,以及转移集合 T 。
It consisting of vertices (the set Q) representing states, input symbols (Σ) and output symbols (Z), if there is no output set, it could have a subset of Q, F the set of Final states or acceptance states, and a set of transitions T.
它用于描述和分析系统行为,方法是显示这些状态之间可能的各个状态和转移。该状态图提供了一种清晰简洁的方法来理解有限自动机的可能状态和转移。
It is used to describe and analyze the system behavior by showing possible states and transitions between these states. The state diagram provides a clear and concise way to understand the possible states and transitions of a finite automaton.
这是一个状态图示例:
Here is a sample state diagram −

这里我们有四个状态:
Here we have four states:
-
Q = {q1, q2, q3, qf}
-
Final state F = {qf}
-
Inputs Σ = {0, 1}
-
Initial state q1
并且,
And,
\mathrm{T \: = \: \lbrace \delta(q1, 0) \: \rightarrow \: q2, \: \delta(q1, 0) \: \rightarrow \: q3, \: \delta(q2, 1) \: \rightarrow \: q1, \: \delta(q2, 1) \: \rightarrow \: q3, \: \delta(q3, 1) \: \rightarrow \: qf \rbrace}
What is the Significance of States in Automata?
自动机理论用于概念化一个系统,以便帮助在实际生活中设计它。它必须具有与输入相关的输入、输出和过渡。表示有限数量的条件或情况的系统中的基本组成部分是状态图中的状态
Automata theory is used to conceptualize a system which helps to design it in real life. It must have the inputs, outputs and transitions associated with inputs. The fundamental components that represent a finite number of conditions or situations a system are the states in state diagrams
状态对于分析和表示系统的行为、跟踪对象的不同条件至关重要。有限的状态集描述了自动机对输入符号的响应行为。这种基于状态的方法对于可以抽象为有限数量的不同条件的系统特别有用。
States are crucial for analyzing and representing a system’s behavior, tracking different conditions of objects. A finite set of states describes an automaton’s behavior in response to input symbols. This state-based approach is particularly useful for systems that can be abstracted into a finite number of distinct conditions.
What is a Deterministic Pushdown Automaton?
确定性下推自动机 (DPDA 或 DPA) 是自动机理论中下推自动机的变体,接受确定性无上下文语言。
A deterministic pushdown automaton (DPDA or DPA) is a variation of the pushdown automaton in automata theory, accepting deterministic context-free languages.
这里,机器过渡基于当前状态、输入符号和栈的最顶层符号,较低层符号没有直接影响。机器动作包括压入、弹出或替换栈顶。
Here the machine transitions are based on the current state, input symbol, and topmost symbol of the stack, with lower symbols having no immediate effect. Machine actions include pushing, popping, or replacing the stack top.
正式地说,对于 PDA 机器, M = (Q, Σ, Γ, δ, q0, z, F) 是确定性的,如果对于每个 q ∈ Q, a ∈ Σ ∪ {λ}, b ∈ Γ
Formally we can say, for a PDA machine, M = (Q, Σ, Γ, δ, q0, z, F) is deterministic if for every q ∈ Q, a ∈ Σ ∪ {λ}, b ∈ Γ
-
δ(q, a, b) has at most one element
-
If δ(q, λ, b) is non-empty and δ(q, c, b) must be empty for every c ∈ Σ
What is a Non-deterministic Pushdown Automaton?
非确定性下推自动机 (NPDA) 是一种具有 NFA 和栈的所有特征的机器。它的程序使用其当前状态、读写头下的当前符号和栈顶部的符号来做出决策。NPDA 比确定性 PDA 更强大。
A Non-deterministic Pushdown Automaton (NPDA) is a machine that has all the features of an NFA and a stack. Its program uses its current state, current symbol under the read head, and the symbol at the top of the stack to make decisions. NPDAs are more powerful than deterministic PDAs.
NPDA 中的栈不同于输入带的“语言”字母表。栈在控制自动机的启动状态中使用,栈上只有初始符号。
The stack in NPDAs is distinct from the input tape’s "language" alphabet. The stack is used in the start state of the control automaton, with only initial symbols on the stack.
状态、输入元素和栈中的顶部符号确定每一步的下一步(过渡)。过渡步骤包括进行状态更改、从输入带读取符号、移至右侧的下一个符号以及修改栈。
The state, input element, and top symbol in the stack determine the next step (transition) at each step. Transition steps include making a state change, reading a symbol from the input tape, shifting to the next symbol on the right, and modifying the stack.
What is a Universal Turing Machine?
图灵机是一种与数字计算机等效的数学工具(图灵,1930 年代)。它由机器计算的输入输出关系组成,输入以二进制形式给定在机器的磁带上。输出由机器停止时的磁带内容组成。磁带的内容由图灵机内部的有限状态机 (FSM) 确定
A Turing Machine is a mathematical tool that is equivalent to a digital computer (Turing, 1930s). It consists of an input output relation that the machine computes, with the input given in binary form on the machine’s tape. The output consists of the tape’s contents when the machine halts. The contents of the tape are determined by a finite state machine (FSM) inside the Turing Machine
然而,图灵机需要为每个新的计算和输入输出关系构建一个不同的图灵机。这导致了通用图灵机 (UTM) 的发明,它接受机器 M 的描述,并可以在输入磁带的其余内容上模拟它。
However, Turing Machines require constructing a different one for every new computation and input output relation. This led to the invention of the universal Turing machine (UTM), which takes in the description of a machine M and can simulate it on the rest of the input tape’s contents.
看看以下通用图灵机的概念图 −
Take a look at the following conceptual diagram of Universal Turning Machine −

在这里,我们正在使用无限带,UTM 在其上工作,但此外,它还将机器描述或状态转换从一个单独的块传输到 UTM 模块,在实现中,磁带本身将机器描述保存在一个单独的区域。
Here, we are using the infinite tape, the UTM is working on them, but additionally it is taking the machine description or state transition from a separate block to the UTM module, in implementation, the tape itself holds the machine description at a separate zone.
Why Formal Languages are used in Computer Science?
计算机科学中的形式语言是一组字符串,它们是基于有限符号集(称为字母表),并且使用此字母表创建的结构化字符串,基于定义的语法规则,构成形式语言。
A formal language in computer science is a set of strings over a finite set of symbols, called an alphabet, and structured strings created using this alphabet, based on defined grammar rules, constitute the formal language.
形式语言围绕三个关键概念构建:字母表、字符串和语法。
Formal languages are structured around three key concepts: Alphabet, String, and Grammar.
-
Alphabet is a finite set of distinct symbols,
-
String is a finite sequence of symbols selected from an alphabet. The order of symbols matters in a string, and empty strings have zero symbols.
-
Grammar is a set of formal rules that govern the combination of symbols to compose strings in a formal language.
对于不同类别形式语言,这些规则已发生变化,如常规、上下文无关、上下文相关和递归可枚举。
These rules are changed for different classes of formal languages as regular, context-free, context-sensitive, and recursively enumerable.
What is the Role of Automata in Parsing?
-
Parsing* 是基于语法,使用产生规则来推导字符串和检查其可接受性的过程。它是一个用于检查字符串是否在语法上正确的工具。
Parsing is a grammar-based process that uses production rules to derive a string and check its acceptability. It is a tool used to check if a string is syntactically correct or not.
解析器可以自顶向下或自底向上,从顶端开始使用起始符号,使用解析树来推导字符串。
A parser can be top-down or bottom-up, starting from the top with the start symbol and using a parse tree to derive the string.
Top-Down Parsing
-
The PDA uses top-down parsing to match the start symbol of a grammar on its stack with the input.
-
If it’s a terminal, then compare it with the current input symbol,
-
If it is non-terminal, then replace by production rules.
-
The stack tracks the parsing context and expected symbols, and input is accepted if the stack empties and all input is consumed.
What is a Linear-bounded Automaton?
-
linear bounded automaton* 是带有一段有界有限长度磁带的多轨道非确定性图灵机。在有界线性自动机中的计算限制在有界常量区域,其中两个特殊符号作为左右结束标记。
A linear bounded automaton is a multi-track non-deterministic Turing machine with a bounded finite length of tape. The computation in linear bounded automata is restricted to the constant bounded area, with two special symbols serving as left and right end markers.
一个确定性线性有界自动机是上下文相关的,并且一个线性有界自动机具有一个空 * language is undecidable* 。
A deterministic linear bounded automaton is context-sensitive, and a linear bounded automaton with an empty language is undecidable.
非确定性线性边界算法 (LBA) 与上下文无关语言的关联关系与下推自动机与上下文无关语言的关联关系类似。
Nondeterministic Linear Boundary Algorithms (LBAs) bear the same relationship to context-sensitive languages as pushdown automata do to context-free languages.
LBA 的确定性版本是否拥有与非确定性版本同样的能力这一点未知。
It is unknown whether deterministic versions of LBA have equivalent power to the nondeterministic version.
它是一个 9 元组自动机:G = (Q, Σ, Γ, δ, q0, B, F, GL, GR)
It is a 9 tuple automata: G = (Q, Σ, Γ, δ, q0, B, F, GL, GR)
-
Q − the set of states
-
Σ − Set of input alphabets
-
Γ − Tape alphabets Σ ⊂ Γ
-
δ − set of transition functions
-
q0 − Initial state
-
B − Set of blank symbols, B ∈ Γ – Σ
-
F − Set of final states F ⊂ Q
-
GL − Left terminal GL ∈ Σ
-
GR − Right terminal GR ∈ Σ
What is the Significance of Myhill-Nerode Theorem?
-
Myhill-Nerode theorem* 声明,如果语言 L 是常规的,则 L~ ( L~ 是表示为字符串的 x 和 y 的关系,其中没有 x 和 y 的可区分拓展)具有有限数量的等价类别,且如果此数量为 N ,则在最简 * Deterministic Finite Automaton (DFA)* 中有 N 个状态可识别 L。
The Myhill-Nerode theorem states that a language L is regular if L~ (L~ is relation on strings x and y, where no distinguishable extension for x and y) has a finite number of equivalence classes, and if this number is N, then there are N states in a minimal Deterministic Finite Automaton (DFA) recognizing L.
除定义外,Myhill-Nerode 定理用于解释概念,即简单机器或“有限状态自动机”可以识别哪些语言。它有助于理解这些简单机器可以处理哪些语言,或者在机器上可以做多少简化以接受这些语言。
Apart from the definition, the Myhill-Nerode theorem is used to explain the concept that which languages simple machines, or "finite state automata," can recognize. It helps to understand which languages these simple machines can handle or how much simplification can be done on a machine to accept the languages.
What is the Equivalence of Automata and Grammars?
等价性指示不同类型自动机和它们可以识别的语法类别之间的关系。根据乔姆斯基分类,语法有四种类型。
The equivalence indicates the relationship between different types of automata and the classes of grammars they can recognize. According to Chomsky’s classification there are four types of grammar.
Languages |
Automata |
Recursively Enumerable Languages: Equivalent to Type-0 grammars |
Recognized by Turing Machines |
Context-Sensitive Languages: Equivalent to Type-1 grammars |
Recognized by Linear Bounded Automata (LBA) |
Context-Free Languages: Equivalent to Type-2 grammars |
Recognized by Pushdown Automata (PDA) |
Regular Languages: Equivalent to Type-3 grammars |
Recognized by Finite State Automata (FSA) |
What is a Regular Expression?
乔姆斯基分类中的 3 型语法也称作正则语法。有限自动机可以通过简单表达式接受正则语言,这种表达式称为 * Regular Expressions*
The Type-3 Grammar in Chomsky’s classification is also termed as regular grammar. A Finite automaton can accept regular languages through simple expressions which is known as Regular Expressions
正则表达式是表示任何语言的最有效方法。这些表达式定义字符串并匹配字符组合。字符串搜索算法使用此模式在字符串上查找操作,从而可以轻松地检查语言。
Regular expressions are the most effective way to represent any language. These expressions define a string and match character combinations. The string searching algorithm uses this pattern to find operations on a string, making it easy to check the language.
正则表达式用 x * 表示,表明 x 发生零次或多次,而 x+ 表示 x 发生一次或多次,从而生成一系列字符。
A regular expression, denoted by x*, indicates zero or more occurrences of x, while x+ signifies one or more occurrences of x, generating a range of characters.
(a | b) * 表示 a 和 b 所有可能的任意大小的组合。[此处符号“ | ”表示 OR ]
(a | b)* denotes all possible combination of a and b of any size. [Here the symbol "|" denotes OR]
How do Regular Expressions Relate to Finite Automata?
正则表达式是有限自动机接受的语言描述语言,提供了任何语言的最有效表示方式,其中 Σ 将输入集表示为字母表。它可以定义如下 −
Regular expressions are a language-describing language accepted by finite automata, providing the most effective representation of any language, with Σ denoting the input set as an alphabet. And it can be defined as follows −
-
Φ represents the empty set,
-
ε denotes the null string,
-
For each 'a' in Σ, is a regular expression and denotes the set 'a'.
如果“ r ”和“ s ”是表示语言 L1 和 L2 的正则表达式,则
If "r" and "s" are regular expressions representing languages L1 and L2, then
-
"r + s" is same as (L1 ∪ L2)
-
rs is same as L1L2 concatenation
-
r* are equivalent L1 *closure respectively

该图展示了使用 epsilon 移动轻松转换正则表达式到 * Non-deterministic Finite Automata (NFA)* 的过程,然后转换为确定性有限自动机 (DFA),后者可以轻松转换为正则表达式。
The figure demonstrates the easy conversion of RE to Non-deterministic Finite Automata (NFA) with epsilon moves, then to Deterministic Finite Automata (DFA), which can be easily converted to RE.
What is the Importance of Closure Properties in Automata Theory?
自动机理论和形式语言中的 * closure property* 指成员进行某些操作后仍保留在该类别中的语言类别。
The closure property in automata theory and formal languages refers to a class of languages remaining within that class after certain operations are performed on its members.
在正则语言的上下文中,当通过并集或连接等操作执行时,它们具有相同的特征和限制,因此我们可以说正则语言在并集下是闭合的。
In the context of regular languages, they share the same characteristics and limitations when performed through operations like union or concatenation, so we can say regular languages are closed under Union.
正则语言在并集、交集、连接、补集和 Kleene 闭包等操作下是闭合的。相比之下,确定性上下文无关语言在补集、与正则语言的交集和逆同态下是闭合的。
Regular languages are closed under union, intersection, concatenation, complement and Kleene closure, etc. In contrast, Deterministic CFLs are closed under complementation, intersection with regulars and inverse homomorphism.
What is a Non-deterministic Turing Machine?
非确定性图灵机 ™ 为每个状态和符号设置了一组动作,使转换变得非确定性。TM 的计算是从开始配置开始的配置树。
A Non-Deterministic Turing Machine ™ has a group of actions for every state and symbol, making transitions non-deterministic. The computation of a TM is a tree of configurations, starting from the start configuration.
像 NTM 这样的机器可以在每次步骤中执行多个转换,例如从输入符号转换到状态 Qi、Qj、Qk 等。NTM“猜测”最终导致可接受状态 Qf 的最佳转换,因为在每个步骤中都没有建立良好的转换。这个概念就像迷宫,机器总能猜出正确的选择,如果有一条路,它可以找到最快的出路。
A machine like the NTM can make multiple transitions in every step such as transitioning from input symbols to state Qi, Qj, Qk, and so on. The NTM "guesses" the best transition that eventually leads to an accepting state Qf, as no transition is well established in every step. This concept is like a maze, where a machine can always guess the correct choice, finding the quickest way out if there is one.
NTM 是计算机科学家了解计算限制、复杂性理论和高效算法求解类别的宝贵工具。它们在研究复杂性类别(如 NP(非确定性多项式时间)和 NP-完全问题)时特别有用,使研究人员能够推理解决方案可以被有效验证但最初可能难以找到的问题。然而,像 DTM 一样,设计 NTM 在实践中是不可行的。
NTMs are valuable tools for computer scientists to understand computational limits, complexity theory, and efficient algorithm-solving classes. They are particularly useful in studying complexity classes like NP (nondeterministic polynomial time) and NP-complete problems, allowing researchers to reason about problems where the solution can be verified efficiently but may be hard to find initially. However, designing NTM is not practically possible like DTM.