Automata Theory 简明教程

CFG Simplification

在一个 CFG 中,所有产生式和符号可能不一定都用于推导字符串。此外,可能存在一些空产生式和单位产生式。消除这些产生式和符号被称为 simplification of CFGs 。简化主要包括以下步骤 −

In a CFG, it may happen that all the production rules and symbols are not needed for the derivation of strings. Besides, there may be some null productions and unit productions. Elimination of these productions and symbols is called simplification of CFGs. Simplification essentially comprises of the following steps −

  1. Reduction of CFG

  2. Removal of Unit Productions

  3. Removal of Null Productions

Reduction of CFG

CFG 分为两个阶段来简化 −

CFGs are reduced in two phases −

Phase 1 − 从 CFG G 推导出一个等价文法 G’ ,使得每个变量都推导出某个终结符字符串。

Phase 1 − Derivation of an equivalent grammar, G’, from the CFG, G, such that each variable derives some terminal string.

Derivation Procedure

Derivation Procedure

步骤 1 − 包含所有推导出某个终结符的符号 W1 ,并初始化 i=1

Step 1 − Include all symbols, W1, that derive some terminal and initialize i=1.

步骤 2 − 包含所有推导出 Wi 的符号 Wi+1

Step 2 − Include all symbols, Wi+1, that derive Wi.

步骤 3 − 增加 i 并重复步骤 2,直到 Wi+1 = Wi

Step 3 − Increment i and repeat Step 2, until Wi+1 = Wi.

步骤 4 − 包含所有包含 Wi 的产生式。

Step 4 − Include all production rules that have Wi in it.

Phase 2 − 从 CFG G’ 推导出一个等价文法 G” ,使得每个符号都出现在句式中。

Phase 2 − Derivation of an equivalent grammar, G”, from the CFG, G’, such that each symbol appears in a sentential form.

Derivation Procedure

Derivation Procedure

步骤 1 − 将起始符号包含在 Y1 中,并初始化 i = 1

Step 1 − Include the start symbol in Y1 and initialize i = 1.

步骤 2 − 包含所有可以从 Yi 推导出的符号 Yi+1 ,并包含所有已经应用的产生式。

Step 2 − Include all symbols, Yi+1, that can be derived from Yi and include all production rules that have been applied.

步骤 3 − 增加 i 并重复步骤 2,直到 Yi+1 = Yi

Step 3 − Increment i and repeat Step 2, until Yi+1 = Yi.

Problem

找到与语法 G 等价的简化语法,其中产生式为 P:S → AC | B,A → a,C → c | BC,E → aA | e

Find a reduced grammar equivalent to the grammar G, having production rules, P: S → AC | B, A → a, C → c | BC, E → aA | e

Solution

Phase 1

Phase 1

T = {a, c, e}

T = { a, c, e }

根据产生式 A → a, C → c 和 E →aAa,W1 = {A, C, E}

W1 = { A, C, E } from rules A → a, C → c and E → aA

根据产生式 S → AC,W2 = { A, C, E } U { S }

W2 = { A, C, E } U { S } from rule S → AC

W3 = {A, C, E, S} U ∅

W3 = { A, C, E, S } U ∅

由于 W2 = W3,我们可以推导出 G' 为:

Since W2 = W3, we can derive G’ as −

G' = {{A, C, E, S}, {a, c, e}, P, {S}}

G’ = { { A, C, E, S }, { a, c, e }, P, {S}}

其中 P:S → AC,A → a,C → c,E → aA | e

where P: S → AC, A → a, C → c , E → aA | e

Phase 2

Phase 2

根据产生式 S → AC,Y1 = {S}

Y1 = { S }

Y2 = {S, A, C}

Y2 = { S, A, C } from rule S → AC

根据产生式 A → a 和 C → c,Y3 = {S, A, C, a, c}

Y3 = { S, A, C, a, c } from rules A → a and C → c

Y4 = {S, A, C, a, c}

Y4 = { S, A, C, a, c }

由于 Y3 = Y4,我们可以推导出 G'' 为:

Since Y3 = Y4, we can derive G” as −

G'' = {{A, C, S}, {a, c}, P, {S}}

G” = { { A, C, S }, { a, c }, P, {S}}

其中 P:S → AC,A → a,C → c

where P: S → AC, A → a, C → c

Removal of Unit Productions

任何形式为 A → B 的产生式,其中 A, B ∈ 非终结符被称作 unit production.

Any production rule in the form A → B where A, B ∈ Non-terminal is called unit production..

Removal Procedure −

Step 1 − 要消除 A → B ,只要在语法中出现 B → x 时,就添加到语法规则中产生式 A → x 。[x ∈ 终结符,x 可以为空]

Step 1 − To remove A → B, add production A → x to the grammar rule whenever B → x occurs in the grammar. [x ∈ Terminal, x can be Null]

Step 2 ——从语法中删除 A → B

Step 2 − Delete A → B from the grammar.

Step 3 ——从步骤 1 开始重复,直到删除所有单位产生式。

Step 3 − Repeat from step 1 until all unit productions are removed.

Problem

Problem

从以下内容中删除单位产生式 -.

Remove unit production from the following −

S → XY、X → a、Y → Z | b、Z → M、M → N、N → a

S → XY, X → a, Y → Z | b, Z → M, M → N, N → a

Solution

Solution

语法中有 3 个单位产生式-

There are 3 unit productions in the grammar −

Y → Z、Z → M和 M → N

Y → Z, Z → M, and M → N

At first, we will remove M → N.

At first, we will remove M → N.

由于 N → a,我们添加 M → a,并删除 M → N。

As N → a, we add M → a, and M → N is removed.

产生式集变为

The production set becomes

S → XY、X → a、Y → Z | b、Z → M、M → a、N → a

S → XY, X → a, Y → Z | b, Z → M, M → a, N → a

Now we will remove Z → M.

Now we will remove Z → M.

由于 M → a,我们添加 Z→ a,并删除 Z → M。

As M → a, we add Z→ a, and Z → M is removed.

产生式集变为

The production set becomes

S → XY、X → a、Y → Z | b、Z → a、M → a、N → a

S → XY, X → a, Y → Z | b, Z → a, M → a, N → a

Now we will remove Y → Z.

Now we will remove Y → Z.

由于 Z → a,我们添加 Y→ a,并删除 Y → Z。

As Z → a, we add Y→ a, and Y → Z is removed.

产生式集变为

The production set becomes

S → XY、X → a、Y → a | b、Z → a、M → a、N → a

S → XY, X → a, Y → a | b, Z → a, M → a, N → a

现在 Z、M和 N 不可达,因此我们可以删除它们。

Now Z, M, and N are unreachable, hence we can remove those.

最终的 CFG 没有单位产生式 -.

The final CFG is unit production free −

S → XY、X → a、Y → a | b

S → XY, X → a, Y → a | b

Removal of Null Productions

在 CFG 中,非终结符 ‘A’ 是一个空值变量,如果存在产生式 A → ε 或存在从 A 开始最终以

In a CFG, a non-terminal symbol ‘A’ is a nullable variable if there is a production A → ε or there is a derivation that starts at A and finally ends up with

ε: A → …​…​.… → ε

Removal Procedure

Step 1 结束的推导 - 找出导出 ε 的空值非终结符变量。

Step 1 − Find out nullable non-terminal variables which derive ε.

Step 2 - 对于每个产生式 A → a ,构建所有产生式 A → x ,其中 x 是通过从步骤 1 中移除一个或多个非终结符从 ‘a’ 获得的。

Step 2 − For each production A → a, construct all productions A → x where x is obtained from ‘a’ by removing one or multiple non-terminals from Step 1.

Step 3 - 将原始产生式与步骤 2 的结果合并,并移除 ε - productions

Step 3 − Combine the original productions with the result of step 2 and remove ε - productions.

Problem

Problem

从以下内容中移除空值产生式 -

Remove null production from the following −

S → ASA | aB | b, A → B, B → b | ∈

Solution

Solution

有两个空值变量 - AB

There are two nullable variables − A and B

At first, we will remove B → ε.

At first, we will remove B → ε.

移除 B → ε 后,产生式集变为 -

After removing B → ε, the production set becomes −

S→ASA | aB | b | a,A ε B| b | &epsilon,B → b

S→ASA | aB | b | a, A ε B| b | &epsilon, B → b

Now we will remove A → ε.

Now we will remove A → ε.

去掉 A → ε 后,产生式集变为−

After removing A → ε, the production set becomes −

S→ASA | aB | b | a | SA | AS | S,A → B| b,B → b

S→ASA | aB | b | a | SA | AS | S, A → B| b, B → b

这是去除空串转换的最终产生式集。

This is the final production set without null transition.