How to prove that a language is not context-free?

Question Detail: 

We learned about the class of context-free languages $\mathrm{CFL}$. It is characterised by both context-free grammars and pushdown automata so it is easy to show that a given language is context-free.

How do I show the opposite, though? My TA has been adamant that in order to do so, we would have to show for all grammars (or automata) that they can not describe the language at hand. This seems like a big task!

I have read about some pumping lemma but it looks really complicated.

Asked By : Raphael
Best Answer from StackOverflow

Question Source : http://cs.stackexchange.com/questions/265

Answered By : jmad

To my knowledge the pumping lemma is by far the simplest and most-used technique. If you find it hard, try the regular version first, it's not that bad. There are some other means for languages that are far from context free. For example undecidable languages are trivially not context free.

That said, I am also interested in other techniques than the pumping lemma if there are any.

EDIT: Here is an example for the pumping lemma: suppose the language $L=\{ a^k \mid k ∈ P\}$ is context free ($P$ is the set of prime numbers). The pumping lemma has a lot of $∃/∀$ quantifiers, so I will make this a bit like a game:

  1. The pumping lemma gives you a $p$
  2. You give a word $s$ of the language of length at least $p$
  3. The pumping lemma rewrites it like this: $s=uvxyz$ with some conditions ($|vxy|≤p$ and $|vy|≥1$)
  4. You give an integer $n≥0$
  5. If $uv^nxy^nz$ is not in $L$, you win, $L$ is not context free.

For this particular language for $s$ any $a^k$ (with $k≥p$ and $k$ is a prime number) will do the trick. Then the pumping lemma gives you $uvxyz$ with $|vy|≥1$. Do disprove the context-freeness, you need to find $n$ such that $|uv^nxy^nz|$ is not a prime number.

$$|uv^nxy^nz|=|s|+(n-1)|vy|=k+(n-1)|vy|$$

And then $n=k+1$ will do: $k+k|vy|=k(1+|vy|)$ is not prime so $uv^nxy^nz\not\in L$. The pumping lemma can't be applied so $L$ is not context free.

A second example is the language $\{ww \mid w \in \{a,b\}^{\ast}\}$. We (of course) have to choose a string and show that there's no possible way it can be broken into those five parts and have every derived pumped string remain in the language.

The string $s=a^{p}b^{p}a^{p}b^{p}$ is a suitable choice for this proof. Now we just have to look at where $v$ and $y$ can be. The key parts are that $v$ or $y$ has to have something in it (perhaps both), and that both $v$ and $y$ (and $x$) are contained in a length $p$ substring - so they can't be too far apart.

This string has a number of possibilities for where $v$ and $y$ might be, but it turns out that several of the cases actually look pretty similar.

  1. $vy \in a^{\ast}$ or $vy \in b^{\ast}$. So then they're both contained in one of the sections of continguous $a$s or $b$s. This is the relatively easy case to argue, as it kind of doesn't matter which they're in. Assume that $|vy| = k \leq p$.
    • If they're in the first section of $a$s, then when we pump, the first half of the new string is $a^{p+k}b^{p-k/2}$, and the second is $b^{k/2}a^{p}b^{p}$. Obviously this is not of the form $ww$.
    • The argument for any of the three other sections runs pretty much the same, it's just where the $k$ and $k/2$ ends up in the indices.
  2. $vxy$ straddles two of the sections. In this case pumping down is your friend. Again there's several places where this can happen (3 to be exact), but I'll just do one illustrative one, and the rest should be easy to figure out from there.
    • Assume that $vxy$ straddles the border between the first $a$ section and the first $b$ section. Let $vy = a^{k_{1}}b^{k_{2}}$ (it doesn't matter precisely where the $a$s and $b$s are in $v$ and $y$, but we know that they're in order). Then when we pump down (i.e. the $i=0$ case), we get the new string $s'=a^{p-k_{1}}b^{p-k_{2}}a^{p}b^{p}$, but then if $s'$ could be split into $ww$, the midpoint must be somewhere in the second $a$ section, so the first half is $a^{p-k_{1}}b^{p-k_{2}}a^{(k_{1}+k_{2})/2}$, and the second half is $a^{p-(k_{1}+k_{2})/2}b^{p}$. Clearly these are not the same string, so we can't put $v$ and $y$ there.

The remaining cases should be fairly transparent from there - they're the same ideas, just putting $v$ and $y$ in the other 3 spots in the first instance, and 2 spots in the second instance. In all cases though, you can pump it in such a way that the ordering is clearly messed up when you split the string in half.

No comments

Powered by Blogger.