Automata Theory 简明教程
Regular Sets
任何表示正则表达式值的集合都被称为 Regular Set.
Any set that represents the value of the Regular Expression is called a Regular Set.
Properties of Regular Sets
Property 1 . 两个正则集的并集是正则。
Property 1. The union of two regular set is regular.
Proof −
Proof −
让我们取两个正则表达式
Let us take two regular expressions
RE1 = a(aa),RE2 = (aa)
RE1 = a(aa)* and RE2 = (aa)*
所以,L1 = {a, aaa, aaaaa,…..}(不包括 Null 的奇数长度字符串)
So, L1 = {a, aaa, aaaaa,…..} (Strings of odd length excluding Null)
且 L2 ={ ε, aa, aaaa, aaaaaa,…….} (包括 Null 的偶数长度字符串)
and L2 ={ ε, aa, aaaa, aaaaaa,…….} (Strings of even length including Null)
L1 ∪ L2 = { ε, a, aa, aaa, aaaa, aaaaa, aaaaaa,…….}
(包括 Null 的所有可能长度的字符串)
(Strings of all possible lengths including Null)
RE (L1 ∪ L2) = a* (本身就是一个正则表达式)
RE (L1 ∪ L2) = a* (which is a regular expression itself)
Hence, proved.
Hence, proved.
Property 2. 两个正则集的交集是正则。
Property 2. The intersection of two regular set is regular.
Proof −
Proof −
让我们取两个正则表达式
Let us take two regular expressions
RE1 = a(a*),RE2 = (aa)*
RE1 = a(a*) and RE2 = (aa)*
所以,L1 = { a,aa, aaa, aaaa, ….} (不包括 Null 的所有可能长度字符串)
So, L1 = { a,aa, aaa, aaaa, ….} (Strings of all possible lengths excluding Null)
L2 = { ε, aa, aaaa, aaaaaa,…}(包含空字符串的偶数长度字符串)
L2 = { ε, aa, aaaa, aaaaaa,…….} (Strings of even length including Null)
L1 ∩ L2 = { aa, aaaa, aaaaaa,…}(不包含空字符串的偶数长度字符串)
L1 ∩ L2 = { aa, aaaa, aaaaaa,…….} (Strings of even length excluding Null)
RE (L1 ∩ L2) = aa(aa)*,本身是一个正则表达式。
RE (L1 ∩ L2) = aa(aa)* which is a regular expression itself.
Hence, proved.
Hence, proved.
Property 3. 一个正则集合的补集是正则的。
Property 3. The complement of a regular set is regular.
Proof −
Proof −
我们取正则表达式−
Let us take a regular expression −
RE = (aa)*
所以,L = {ε, aa, aaaa, aaaaaa, …}(包含空字符串的偶数长度字符串)
So, L = {ε, aa, aaaa, aaaaaa, …….} (Strings of even length including Null)
L 的补集是不在 L 中的所有字符串。
Complement of L is all the strings that is not in L.
所以,L’ = {a, aaa, aaaaa, …}(不包含空字符串的奇数长度字符串)
So, L’ = {a, aaa, aaaaa, …..} (Strings of odd length excluding Null)
RE (L’) = a(aa)*,本身是一个正则表达式。
RE (L’) = a(aa)* which is a regular expression itself.
Hence, proved.
Hence, proved.
Property 4. 两个正则集合的差是正则的。
Property 4. The difference of two regular set is regular.
Proof −
Proof −
我们取两个正则表达式−
Let us take two regular expressions −
RE1 = a (a*) 和 RE2 = (aa)*
RE1 = a (a*) and RE2 = (aa)*
所以,L1 = {a, aa, aaa, aaaa, …}(包含所有可能的长度且不包含空字符串的字符串)
So, L1 = {a, aa, aaa, aaaa, ….} (Strings of all possible lengths excluding Null)
L2 = { ε, aa, aaaa, aaaaaa,…}(包含空字符串的偶数长度字符串)
L2 = { ε, aa, aaaa, aaaaaa,…….} (Strings of even length including Null)
L1 – L2 = {a, aaa, aaaaa, aaaaaaa, …}
L1 – L2 = {a, aaa, aaaaa, aaaaaaa, ….}
(不包含空字符串的所有奇数长度字符串)
(Strings of all odd lengths excluding Null)
RE (L1 – L2) = a (aa)*,是一个正则表达式。
RE (L1 – L2) = a (aa)* which is a regular expression.
Hence, proved.
Hence, proved.
Property 5. 一个正则集合的反转是正则的。
Property 5. The reversal of a regular set is regular.
Proof −
Proof −
我们要证明 LR 也是正则的,如果 L 是一个正则集合。
We have to prove LR is also regular if L is a regular set.
令 L = {01,10,11,10}
Let, L = {01, 10, 11, 10}
RE (L) = 01 + 10 + 11 + 10
LR = {10,01,11,01}
LR = {10, 01, 11, 01}
RE (LR) = 01 + 10 + 11 + 10 为正则语言
RE (LR) = 01 + 10 + 11 + 10 which is regular
Hence, proved.
Hence, proved.
Property 6. 正则集合的闭包为正则语言。
Property 6. The closure of a regular set is regular.
Proof −
Proof −
如果 L = {a, aaa, aaaaa, …….} (长度为奇数且不含空集合的字符串),
If L = {a, aaa, aaaaa, …….} (Strings of odd length excluding Null)
即 RE (L) = a (aa)*
i.e., RE (L) = a (aa)*
L* = {a, aa, aaa, aaaa , aaaaa,……………} (长度可为任意正整数且不含空集合的字符串)
L* = {a, aa, aaa, aaaa , aaaaa,……………} (Strings of all lengths excluding Null)
RE (L*) = a (a)*
Hence, proved.
Hence, proved.
Property 7. 两个正则集合的连接为正则语言。
Property 7. The concatenation of two regular sets is regular.
Proof −
Proof −
令 RE1 = (0+1) 0 and RE2 = 01(0+1)
Let RE1 = (0+1)0 and RE2 = 01(0+1)
此处,L1 = {0, 00, 10, 000, 010, ……} (以 0 结尾的字符串集合)
Here, L1 = {0, 00, 10, 000, 010, ……} (Set of strings ending in 0)
且 L2 = {01, 010,011,…..} (以 01 开头的字符串集合)
and L2 = {01, 010,011,…..} (Set of strings beginning with 01)
则 L1 L2 = {001,0010,0011,0001,00010,00011,1001,10010,………….}
Then, L1 L2 = {001,0010,0011,0001,00010,00011,1001,10010,………….}
包含 001 作为子串的字符串集合,可以用正则表达式表示为 − (0 + 1) 001(0 + 1)
Set of strings containing 001 as a substring which can be represented by an RE − (0 + 1)001(0 + 1)
因此得证。
Hence, proved.
Identities Related to Regular Expressions
给出正则表达式 R, P, L, Q,则有如下恒等式:
Given R, P, L, Q as regular expressions, the following identities hold −
-
∅* = ε
-
ε* = ε
-
RR* = R*R
-
R*R* = R*
-
(R*)* = R*
-
RR* = R*R
-
(PQ)P =P(QP)
-
(a+b)* = (a*b*)* = (a*+b*)* = (a+b*)* = a*(ba*)*
-
R + ∅ = ∅ + R = R (The identity for union)
-
R ε = ε R = R (The identity for concatenation)
-
∅ L = L ∅ = ∅ (The annihilator for concatenation)
-
R + R = R (Idempotent law)
-
L (M + N) = LM + LN (Left distributive law)
-
(M + N) L = ML + NL (Right distributive law)
-
ε + RR* = ε + R*R = R*