CS173
Thursday Oct 3, 2002

ACM local programming contest Friday Oct 4
Project 3 is out. First part due Tuesday, noon. Next part due Oct 21.
Today 4:45, CSB209, Freshman welcome reception

------------------------------------------------------------------------
Programs from FA

Consider a 5-state FA to recognize "main" in a program.

  * Let FA = {S,C,T,s0,F}
  * S = {ss, sm, sa, si, sn}
  * C = {a,b,..z,A,B,..Z,0,1,..9,+,-,*,/,etc}
  * s0 = ss
  * F = {sn}
  * T = {(ss,m,sm), (ss,C-m,ss),
         (sm,a,sa), (sm,m,sm), (sm,C-a-m,ss),
         (sa,i,si), (sa,m,sm), (sa,C-i-m,ss),
         (si,n,sn), (si,m,sm), (si,C-n-m,ss),
         (sn,C,sn)}

Can use GOTOs like the book (less structured, not good style)
Preferable: use a two-level nested switch
Note: always consume the whole input

    #include <stdbool.h>
    #include <stdio.h>

    int main(void)
    {
        enum { ss, sm, sa, si, sn } state = ss;
        int c;                      // int, not char, so EOF can be detected

        while ((c = getchar()) != EOF) {
            switch (state) {
            case ss:
                if (c == 'm') state = sm;
                break;
            case sm:
                switch (c) {
                case 'm': break;                  // stay in sm
                case 'a': state = sa; break;
                default:  state = ss; break;
                }
                break;
            case sa:
                switch (c) {
                case 'm': state = sm; break;
                case 'i': state = si; break;
                default:  state = ss; break;
                }
                break;
            case si:
                switch (c) {
                case 'm': state = sm; break;
                case 'n': state = sn; break;
                default:  state = ss; break;
                }
                break;
            case sn:
                break;                            // stay in sn, keep consuming
            }
        }

        // decide after the loop, so input that ends in "main" is accepted too
        bool accept = (state == sn);
        puts(accept ? "yes" : "no");
        return 0;
    }

------------------------------------------------------------------------
Nondeterministic Automata

Previously, unique next state for a transition => deterministic (DFA)
If not => non-deterministic (NDFA)

NDFA - same symbol on more than one arc
epsilon transition - spontaneously move from one state to another

This means the delta "function" is no longer a function, but a relation
between (state,input) pairs and states
    delta \subset S x (C U {epsilon}) x S
or, equivalently, a function from (state,input) to sets of states
    delta : S x (C U {epsilon}) --> 2^S

What does it mean for an FA to have epsilon transitions, or more than
one transition from a given state on the same input symbol?
How do we translate such an FA into a program?
How can we "goto" more than one place at a time?
FA accepts if there exists a series of valid transitions that reaches
an accepting state
    follow all paths simultaneously
    or always guess the right path to take.

------------------------------------------------------------------------
Example of an NDFA

An NDFA to accept strings containing the word "main":

    -> s0 -m-> s1 -a-> s2 -i-> s3 -n-> (s4)
       s0 -any symbol-> s0
       (s4) -any symbol-> (s4)

What makes this an NDFA?
    when in state s0 and reading "m", can choose between s0 and s1
    effectively guess whether "m" is the beginning of "main" or not
    can also use epsilon transitions (more later)

Simulate on "mmainm":
    after first "m", in s0 or s1.
    after 2nd "m", the s1 branch has nowhere to go -> wrong guess
        but can still be in s0 or s1!
    ...
    at end of simulation, can be in:
        s0, still waiting for start of "main"
        s1, guessing the last "m" is the beginning of "main"
        s4, in this case, the right guess, and a final state. ACCEPT

------------------------------------------------------------------------
Equivalence of Automata

Two automata A and B are said to be equivalent if both accept exactly
the same set of input strings. Formally:

  * if there is a path from the start state of A to a final state of A
    labeled a1a2..ak, then there is a path from the start state of B to
    a final state of B labeled a1a2..ak.
  * if there is a path from the start state of B to a final state of B
    labeled b1b2..bj, then there is a path from the start state of A to
    a final state of A labeled b1b2..bj.
------------------------------------------------------------------------
Equivalence of Deterministic and Nondeterministic Automata

For every NDFA, we have a construction that converts it to a DFA.
Therefore, there is nothing we can express with an NDFA that cannot also
be expressed with a DFA.

Subset construction : NDFA->DFA
  - each state in the generated DFA corresponds to a subset of states in
    the NDFA
  - main idea: as we trace the set of possible paths thru an NDFA, we
    must record all possible states that we could be in as a result of
    the input seen so far. We create a DFA which encodes the set of
    states of the NDFA that we could be in within a single state of the
    DFA.

Put another way, each state of the DFA represents the *set* of states
that would have pennies on them in the direct NDFA simulation.

------------------------------------------------------------------------
Subset Construction for NDFA

To create a DFA that accepts the same strings as a given NDFA, we create
a state to represent each combination of states that the NDFA can be in.

From the previous example (the 5-state NDFA that recognizes input strings
containing the word "main"), we can create a corresponding DFA (with up
to 2^5 = 32 states) whose states correspond to all possible combinations
of states in the NDFA:

    {},
    {s0}, {s1}, {s2}, {s3}, {s4},
    {s0, s1}, {s0, s2}, {s0, s3}, {s0, s4}, {s1, s2},
    {s1, s3}, {s1, s4}, {s2, s3}, {s2, s4}, {s3, s4},
    {s0, s1, s2}, {s0, s1, s3}, {s0, s1, s4}, {s0, s2, s3}, {s0, s2, s4},
    {s0, s3, s4}, {s1, s2, s3}, {s1, s2, s4}, {s1, s3, s4}, {s2, s3, s4},
    {s0, s1, s2, s3}, {s0, s1, s2, s4}, {s0, s1, s3, s4},
    {s0, s2, s3, s4}, {s1, s2, s3, s4},
    {s0, s1, s2, s3, s4}

Note that many of these states won't be needed in our DFA because there
is no way to enter that combination of states in the NDFA. However, in
some cases, we might need all of these states in the DFA to capture all
possible combinations of states in the NDFA.
The "empty" DFA state handles the case where the NDFA gets completely
stuck, and has no transitions on the input symbol from any of the states
it might be in.

------------------------------------------------------------------------
A DFA accepting the same strings as our example NDFA has the following
transitions:

    {s0}    -m->         {s0,s1}
    {s0}    -not m->     {s0}
    {s0,s1} -m->         {s0,s1}
    {s0,s1} -a->         {s0,s2}
    {s0,s1} -not m,a->   {s0}
    {s0,s2} -m->         {s0,s1}
    {s0,s2} -i->         {s0,s3}
    {s0,s2} -not m,i->   {s0}
    {s0,s3} -m->         {s0,s1}
    {s0,s3} -n->         {s0,s4}
    {s0,s3} -not m,n->   {s0}

There are also a bunch of less interesting transitions after we've
already seen the word main:

    {s0,s4}    -m-> {s0,s1,s4}
    {s0,s1,s4} -a-> {s0,s2,s4}
    {s0,s2,s4} -i-> {s0,s3,s4}
    {s0,s3,s4} -n-> {s0,s4}

The start state is {s0} and the final states are {s0,s4}, {s0,s1,s4},
{s0,s2,s4}, and {s0,s3,s4} -- the ones containing a final state of the
NDFA.

This is an 8-state DFA. We know, of course, that a 5-state DFA exists
(it was our original example), and we can see in this case by inspection
that the four final states really "ought" to be combined. We'll see
later how to do this automatically.

By coincidence in this case, the minimal DFA has the same number of
states (5) as the original NDFA. THIS IS NOT ALWAYS TRUE. For example,
if we start with the "obvious" 5-state NDFA to accept strings that
contain at least two 'a's or at least two 'b's or at least two 'c's, the
subset construction gives us a DFA with 15 states, and the minimal DFA
has 9 states. (I RECOMMEND WORKING THIS OUT AS A NOT-TO-BE-TURNED-IN
HOMEWORK EXERCISE.)

A much messier example is given in the book. Consider an NDFA that
accepts all strings that cannot be created out of the letters in the
word "washington". It's a relatively simple 19-state machine (fig.
10.14, with a few corrections: (a) the various Lambda-x transitions can
simply be Lambda; (b) the final states need self-loops on Lambda to
consume the whole input). The minimal DFA has nearly 5000 states.
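
The reachable part of the subset construction for the "main" NDFA can
also be computed mechanically. The C sketch below is one possible way to
do it, not the book's or the course's code: the names ndfa_move and
subset_construction, the bitmask encoding (bit k stands for state s_k),
and the use of 'x' as a stand-in for every symbol other than m, a, i, n
are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { NDFA_STATES = 5, MAX_SUBSETS = 1 << NDFA_STATES };

/* NDFA move for the "main" example: bit k of a mask stands for state s_k. */
unsigned ndfa_move(unsigned set, char c)
{
    unsigned next = 0;
    if (set & 1u) {                          /* s0: loop, maybe guess "m" */
        next |= 1u;
        if (c == 'm') next |= 2u;
    }
    if ((set & 2u) && c == 'a') next |= 4u;  /* s1 -a-> s2 */
    if ((set & 4u) && c == 'i') next |= 8u;  /* s2 -i-> s3 */
    if ((set & 8u) && c == 'n') next |= 16u; /* s3 -n-> s4 */
    if (set & 16u) next |= 16u;              /* s4 loops on any symbol */
    return next;
}

/*
 * Hypothetical helper: explores the DFA states reachable from {s0}.
 * Fills dfa_states[] with their bitmasks and returns how many were found.
 */
size_t subset_construction(unsigned dfa_states[MAX_SUBSETS])
{
    /* 'x' represents all symbols other than m, a, i, n; this shortcut is
       valid only because this NDFA treats all such symbols alike. */
    static const char alphabet[] = { 'm', 'a', 'i', 'n', 'x' };
    bool seen[MAX_SUBSETS] = { false };
    size_t count = 0;

    assert(dfa_states != NULL);
    dfa_states[count++] = 1u;                /* start state {s0} */
    seen[1u] = true;

    /* dfa_states doubles as the worklist: entries from i on are unprocessed */
    for (size_t i = 0; i < count; i++) {
        for (size_t k = 0; k < sizeof alphabet; k++) {
            unsigned target = ndfa_move(dfa_states[i], alphabet[k]);
            if (!seen[target]) {             /* a new DFA state was discovered */
                seen[target] = true;
                dfa_states[count++] = target;
            }
        }
    }
    return count;
}
```

Run on this NDFA, subset_construction finds the 8 DFA states listed
above, 4 of which contain s4. The other 24 subsets, including the empty
one, are never entered.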
------------------------------------------------------------------------
Limitations of Finite Automata

The defining characteristic of FA is that they have only a finite number
of states. Hence, a finite automaton can only "count" (that is, maintain
a counter, where different states correspond to different values of the
counter) up to a fixed bound: it can distinguish only a finite number of
input scenarios.

There is no finite automaton that recognizes these sets of strings:

  * The set of binary strings consisting of an equal number of 1's and
    0's
  * The set of strings over '(' and ')' that have "balanced" parentheses

The 'pumping lemma' can be used to prove that no such FA exists for
these examples. Take 280 and you'll see.
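
To see where the finite-state restriction bites, compare with a
recognizer that is allowed a counter. The sketch below is illustrative
only (the function name balanced is an assumption). It checks balanced
parentheses by tracking the nesting depth. An FA would need a separate
state for each possible depth, and no fixed number of states covers
every input. A size_t is of course finite on a real machine as well;
the point is that the depth is not bounded by any constant chosen in
advance.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if s, a string over '(' and ')', is balanced. */
bool balanced(const char *s)
{
    assert(s != NULL);
    size_t depth = 0;               /* number of currently unmatched '(' */
    for (; *s != '\0'; s++) {
        if (*s == '(') {
            depth++;                /* one more state an FA would need */
        } else if (*s == ')') {
            if (depth == 0)
                return false;       /* ')' with nothing left to match */
            depth--;
        } else {
            return false;           /* symbol outside the alphabet */
        }
    }
    return depth == 0;              /* every '(' must have been closed */
}
```

For example, balanced("(())()") is true, while balanced("(()") and
balanced(")(") are false.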