
(Notes from Chapter 7, part of Section 5)

Grammars

A (phrase structure) grammar is a semi-Thue process in which the letters of the alphabet are separated into two disjoint sets called variables and terminals.

One of the variables is singled out as the start symbol, often denoted \(S\).

Let \(\Gamma\) be a grammar with start symbol \(S\) and let \({\cal V}\) and \(T\) be the sets of variables and terminals of \(\Gamma\), respectively. Define \[ L(\Gamma) = \{u \in T^* \mid S \mathop{\Rightarrow}^* u \},\] the language generated by \(\Gamma\).
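To make the definition concrete, here is a minimal Python sketch (not from the text; the grammar, function names, and the use of strings in place of coded words are illustrative). It takes the grammar with productions \(S \rightarrow aSb \mid ab\), repeatedly applies productions, and collects the terminal strings reached, approximating \(L(\Gamma) = \{a^n b^n \mid n \geq 1\}\) up to a step bound.

```python
def successors(w, productions):
    """All strings obtainable from w by one application of a production g -> h."""
    out = set()
    for g, h in productions:
        i = w.find(g)
        while i != -1:
            out.add(w[:i] + h + w[i + len(g):])
            i = w.find(g, i + 1)
    return out

def generated(productions, start, terminals, max_steps):
    """Terminal strings derivable from the start symbol in at most max_steps steps."""
    seen, frontier = {start}, {start}
    for _ in range(max_steps):
        frontier = {v for w in frontier for v in successors(w, productions)} - seen
        seen |= frontier
    return sorted((w for w in seen if set(w) <= set(terminals)), key=len)

# Illustrative grammar: variables {S}, terminals {a, b}, productions S -> aSb | ab.
prods = [("S", "aSb"), ("S", "ab")]
print(generated(prods, "S", "ab", 4))  # ['ab', 'aabb', 'aaabbb', 'aaaabbbb']
```

Note that only a step-bounded fragment of \(L(\Gamma)\) can be listed this way; the full language is in general only recursively enumerable, not recursive.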

Theorem 5.1. Let \(U\) be a language accepted by a nondeterministic Turing machine. Then there is a grammar \(\Gamma\) such that \(U = L(\Gamma)\).

An important result is the following:

Theorem 5.3. Let \(L\) be a given language. Then the following are equivalent:

  1. \(L\) is r.e.;
  2. \(L\) is accepted by a deterministic Turing machine;
  3. \(L\) is accepted by a nondeterministic Turing machine;
  4. there is a grammar \(\Gamma\) such that \(L = L(\Gamma)\).

The proof of the above theorem goes as follows: \( (1) \Rightarrow (2)\) follows from Theorem 3.1 in Chapter 6. \( (2) \Rightarrow (3)\) is trivial. \( (3) \Rightarrow (4)\) is given by Theorem 5.1 above. \((4) \Rightarrow (1)\) follows from the next result.

Theorem 5.2. A language \(U\) is r.e. if and only if there is a grammar \(\Gamma\) such that \(U = L(\Gamma)\).

Necessity follows from Theorem 5.1 in Chapter 6 and Theorem 5.1 above.

Sufficiency is proved by first defining a primitive recursive predicate \(\textbf{deriv}(u,y)\) that holds if and only if \(y\) is the Gödel number of a sequence \([u_1,\ldots,u_m,1]\) such that \(u_1,\ldots,u_m\) is a derivation of \(u\) from \(S\) in \(\Gamma\).

Then we have \[S \mathop{\Rightarrow}_{\Gamma}^* u \iff \min_{y} \textbf{deriv}(u,y)\downarrow.\] Hence, \(\{u \mid S \displaystyle\mathop{\Rightarrow}_{\Gamma}^* u \}\) is r.e. by Theorem 7.2 in Chapter 3.

Note that \[L(\Gamma) = T^* \cap \{u \mid S \mathop{\Rightarrow}_{\Gamma}^* u \}.\] Since \(T^*\) is r.e. and the intersection of two r.e. sets is again r.e., it follows that \(L(\Gamma)\) is r.e.

With the alphabet of \(\Gamma\) being \[\{s_1,\ldots,s_n, V_1,\ldots, V_k\}\] where \(S = V_1\) has value \(n+1\) in base \(n+k\), \(\textbf{deriv}(u,y)\) can be defined as follows: \begin{eqnarray*} \textbf{deriv}(u,y) & \iff & (\exists m)_{\leq y} \left( \begin{array}{l} m+1 = \operatorname{Lt}(y)~ \&~ (y)_1 = n+1 \\ \&~(y)_m = u~ \&~(y)_{m+1} = 1 \\ \&~(\forall j)_{\lt m} \left (j = 0 \vee \left [ (y)_j \displaystyle\mathop{\Rightarrow}_{\Gamma} (y)_{j+1} \right ] \right ) \end{array} \right) \end{eqnarray*}
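The clauses of \(\textbf{deriv}\) can be mirrored in a Python sketch (illustrative only: a Python list whose last entry is the sentinel \(1\) stands in for the Gödel-numbered sequence \([u_1,\ldots,u_m,1]\), and strings stand in for the numbers coding words in base \(n+k\)).

```python
def step(u, v, productions):
    """u =>_Gamma v : v arises from u by one application of some production g -> h."""
    return any(
        u[i:i + len(g)] == g and v == u[:i] + h + u[i + len(g):]
        for g, h in productions
        for i in range(len(u) - len(g) + 1)
    )

def deriv(u, y, productions, start="S"):
    """y plays the role of [u_1, ..., u_m, 1]: checks that u_1 = start,
    u_m = u, the sentinel 1 is present, and each u_j => u_{j+1}."""
    if len(y) < 2 or y[-1] != 1:
        return False
    seq = y[:-1]
    return (seq[0] == start and seq[-1] == u
            and all(step(seq[j], seq[j + 1], productions)
                    for j in range(len(seq) - 1)))

prods = [("S", "aSb"), ("S", "ab")]  # illustrative grammar: S -> aSb | ab
print(deriv("aabb", ["S", "aSb", "aabb", 1], prods))  # True
```

Searching for the least \(y\) with \(\textbf{deriv}(u,y)\), as in the minimalization above, then gives a semi-decision procedure for membership in \(\{u \mid S \mathop{\Rightarrow}_{\Gamma}^* u\}\).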

That \(\textbf{deriv}\) is primitive recursive is easily seen from the following result:

Lemma 1. The predicate \(u \displaystyle\mathop{\Rightarrow}_\Gamma v\) is primitive recursive.

Proof. Let the productions in \(\Gamma\) be \(g_i \rightarrow h_i\), \(i = 1,2,\ldots,k\). Then \[ u\mathop{\Rightarrow}_\Gamma v \iff \textbf{prod}_1(u,v) \vee \textbf{prod}_2(u,v) \vee \cdots \vee \textbf{prod}_k(u,v)\] where \(\textbf{prod}_i(u,v)\) is the primitive recursive predicate \[(\exists r,s)_{\leq u} [ u = \textbf{concat}(r,g_i,s) ~\&~v=\textbf{concat}(r,h_i,s) ]. \] Recall that \(\textbf{concat}\) is primitive recursive as discussed in Chapter 5, Section 1.
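The decomposition in the proof can be sketched in Python (names are illustrative; strings stand in for the coded words): \(\textbf{prod}_i\) checks a single production against every split \(u = r\,g_i\,s\), and \(u \Rightarrow_\Gamma v\) is the finite disjunction over all productions.

```python
def prod(u, v, g, h):
    """prod_i for the production g -> h: u = r+g+s and v = r+h+s for some r, s."""
    return any(
        u[i:i + len(g)] == g and v == u[:i] + h + u[i + len(g):]
        for i in range(len(u) - len(g) + 1)
    )

def one_step(u, v, productions):
    """u =>_Gamma v : the disjunction prod_1(u,v) v ... v prod_k(u,v)."""
    return any(prod(u, v, g, h) for g, h in productions)

prods = [("S", "aSb"), ("S", "ab")]  # illustrative grammar: S -> aSb | ab
print(one_step("aSb", "aabb", prods))  # True  (apply S -> ab inside aSb)
print(one_step("ab", "aabb", prods))   # False (no variable left to rewrite)
```

The bounded existential quantifier \((\exists r,s)_{\leq u}\) corresponds here to the finite loop over split positions, which is why the whole predicate stays primitive recursive.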