J. Xue
Tutorials
• Tutorials to start in week 3 (i.e., next week)
• Tutorial questions are already available on-line
COMP3131/9102 Page 65 March 4, 2018
Assignment 1: Scanner
• +5 ⇒ two tokens: + and 5
the scanner understands how tokens are formed, but nothing else
COMP3131/9102 Page 66 March 4, 2018
COMP3131/9102: Programming Languages and Compilers
Jingling Xue
School of Computer Science and Engineering
The University of New South Wales
Sydney, NSW 2052, Australia
http://www.cse.unsw.edu.au/~cs3131
http://www.cse.unsw.edu.au/~cs9102
Copyright © 2018, Jingling Xue
COMP3131/9102 Page 67 March 4, 2018
The Big Picture
[Figure: REs, NFA, DFA and minimal-state DFA, connected by conversion arrows]
The two conversions in dashed arrows are not covered:
• REs → DFA (pages 135–141, Red Dragon/§3.7, Purple Dragon)
• DFA → REs: Chapter 3, J. Hopcroft, R. Motwani and J. Ullman, Introduction to Automata Theory, Languages, and Computation, Addison-Wesley, 2nd Edition, 2001. See www-db.stanford.edu/~ullman/ullman-books.html.
• DFA → minimal-state DFA (pages 141–144, Red Dragon/§3.9.6, Purple Dragon)
• Tools: http://www.jflap.org/
COMP3131/9102 Page 68 March 4, 2018
Week 2: Regular Expressions, DFA and NFA
1. Definitions of REs, DFA and NFA
2. REs ⇒ NFA (Thompson’s construction, Algorithm 3.3, Red Dragon/Algorithm 3.23, Purple Dragon)
3. NFA ⇒ DFA (subset construction, Algorithm 3.2, Red Dragon/Algorithm 3.20, Purple Dragon)
4. DFA ⇒ minimal-state DFA (state minimisation, Algorithm 3.6, Red Dragon/Algorithm 3.39, Purple Dragon)
5. Scanner generators
• How to use them (straightforward)
• How to write them (most of the techniques are introduced today)
COMP3131/9102 Page 69 March 4, 2018
Applications of Regular Expressions
• Anywhere patterns of text need to be specified
– Specifying restriction enzymes
– Google analytics
• Unix system, database and networking administration:
grep, fgrep, egrep, sed, awk
• HTML documents: JavaScript and VBScript
• Perl:
J. Friedl, Mastering Regular Expressions, O’Reilly, 1997
• Token specs for scanner generators (lex, Jflex, etc.)
• http://www.zytrax.com/tech/web/regex.htm
COMP3131/9102 Page 70 March 4, 2018
Applications of Finite Automata (i.e., Finite State Machines)
• Hardware design (minimising states ⇒ minimising cost)
• Language theory
• Computational complexity
• Scanner generators (lex and Jflex)
• Automata tools:
http://research.microsoft.com/en-us/downloads/39c51620-548c-49a3-ac9c-40d807010c07/
COMP3131/9102 Page 71 March 4, 2018
Alphabet, Strings and Languages
• Alphabet denoted Σ: any finite set of symbols
– The binary alphabet {0,1} (for machine languages)
– The ASCII alphabet (for high-level languages)
• String: a finite sequence of symbols drawn from Σ:
– Length |s| of a string s: the number of symbols in s
– ε: the empty string (|ε| = 0)
• Language: any set of strings over Σ; its two special cases:
– ∅: the empty set (the language with no strings)
– {ε}: the language containing only the empty string
COMP3131/9102 Page 72 March 4, 2018
Examples of Languages
• Σ = {0, 1} – a string is an instruction
– The set of M68K instructions
– The set of Pentium instructions
– The set of MIPS instructions
• Σ = the ASCII set – a string is a program
– the set of Haskell programs
– the set of C programs
– the set of VC programs
COMP3131/9102 Page 73 March 4, 2018
Terms for Parts of a String (Figure 3.7 of Text)
TERM                                   DEFINITION
prefix of s                            a string obtained by removing 0 or more trailing symbols of s
suffix of s                            a string obtained by removing 0 or more leading symbols of s
substring of s                         a string obtained by deleting a prefix and a suffix from s
proper prefix, suffix, substring of s  any nonempty string x that is, respectively, a prefix, suffix or substring of s such that s ≠ x
COMP3131/9102 Page 74 March 4, 2018
String Concatenation
• If x and y are strings, xy is the string formed by
appending y to x
• Examples:
x      y        xy
key    word     keyword
java   script   javascript
• ε is the identity: εx = xε = x
COMP3131/9102 Page 75 March 4, 2018
Operations on Languages (Figure 3.8 of Text)
OPERATION             DEFINITION
union: L ∪ M          L ∪ M = {s | s ∈ L or s ∈ M}
concatenation: LM     LM = {st | s ∈ L and t ∈ M}
Kleene Closure: L∗    $L^* = \bigcup_{i=0}^{\infty} L^i = L^0 \cup L \cup LL \cup LLL \cup \cdots$, where $L^0 = \{\epsilon\}$
                      (0 or more concatenations of L)
Positive Closure: L+  $L^+ = \bigcup_{i=1}^{\infty} L^i = L \cup LL \cup LLL \cup \cdots$
                      (1 or more concatenations of L)
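The operations in the table can be tried out on small finite languages. The following is a minimal Java sketch; the class and method names are illustrative only, and since the Kleene closure is infinite in general, `closureUpTo` (a hypothetical helper) only computes L^0 ∪ ... ∪ L^max.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

class LanguageOps {
    // Union: the strings that are in either language.
    static Set<String> union(Set<String> l, Set<String> m) {
        Set<String> out = new TreeSet<>(l);
        out.addAll(m);
        return out;
    }

    // Concatenation: every string of l followed by every string of m.
    static Set<String> concat(Set<String> l, Set<String> m) {
        Set<String> out = new TreeSet<>();
        for (String s : l)
            for (String t : m)
                out.add(s + t);
        return out;
    }

    // L^i, with L^0 = {""} (the empty Java string plays the role of epsilon).
    static Set<String> power(Set<String> l, int i) {
        Set<String> out = new TreeSet<>();
        out.add("");
        for (int k = 0; k < i; k++) out = concat(out, l);
        return out;
    }

    // Finite approximation of the Kleene closure: L^0, L^1, ..., L^max combined.
    static Set<String> closureUpTo(Set<String> l, int max) {
        Set<String> out = new TreeSet<>();
        for (int i = 0; i <= max; i++) out.addAll(power(l, i));
        return out;
    }

    public static void main(String[] args) {
        Set<String> l = new TreeSet<>(Arrays.asList("a", "b"));
        Set<String> d = new TreeSet<>(Arrays.asList("0", "1"));
        System.out.println(union(l, d));
        System.out.println(concat(l, d));
        System.out.println(closureUpTo(d, 2));
    }
}
```

Dropping L^0 from `closureUpTo` gives the corresponding approximation of the positive closure.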
COMP3131/9102 Page 76 March 4, 2018
Examples: Operations on Languages
• L = {a, . . . , z, A, . . . , Z, _}
• D = {0, . . . , 9}
EXAMPLE      LANGUAGE (THE SET OF ...)
L ∪ D
L³
LD
L∗
L(L ∪ D)∗
D+
COMP3131/9102 Page 77 March 4, 2018
Examples: Operations on Languages
• L = {a, . . . , z, A, . . . , Z, _}
• D = {0, . . . , 9}
EXAMPLE      LANGUAGE
L ∪ D        letters and digits
L³           all 3-letter strings
LD           strings consisting of a letter followed by a digit
L∗           all strings of letters, including the empty string ε
L(L ∪ D)∗    all strings of letters and digits beginning with a letter
D+           all strings of one or more digits
COMP3131/9102 Page 78 March 4, 2018
Regular Expressions (REs) Over Alphabet Σ
• Inductive Base:
1. ε is a RE, denoting the RL {ε}
2. a ∈ Σ is a RE, denoting the RL {a}
• Inductive Step: Suppose r and s are REs, denoting the
RLs L(r) and L(s). Then:
1. (r)|(s) is a RE, denoting the RL L(r) ∪ L(s)
2. (r)(s) is a RE, denoting the RL L(r)L(s)
3. (r)∗ is a RE, denoting the RL (L(r))∗
4. (r) is a RE, denoting the RL L(r)
REs define regular languages (RLs) or regular sets
COMP3131/9102 Page 79 March 4, 2018
Precedence and Associativity of “Regular” Operators
• Precedence:
– “∗” has the highest precedence
– “Concatenation” has the second highest precedence
– “|” has the lowest precedence
• Associativity: all operators are left-associative
• Example:
(a)|((b)∗(c)) ≡ a|b∗c
Unnecessary parentheses can be avoided!
COMP3131/9102 Page 80 March 4, 2018
An Example (Following the Definition of REs)
• Alphabet: Σ = {0, 1}
• RE: 0(0|1)∗
• Question: What is the language defined by the RE?
• Answer:
L(0(0|1)∗) = L(0) L((0|1)∗)
           = {0} (L(0|1))∗
           = {0} (L(0) ∪ L(1))∗
           = {0} ({0} ∪ {1})∗
           = {0} {0, 1}∗
           = {0} {ε, 0, 1, 00, 01, 10, 11, . . . }
           = {0, 00, 01, 000, 001, 010, 011, . . . }
The RE describes the set of strings of 0’s and 1’s beginning with a 0.
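The same RE can be checked mechanically with `java.util.regex`, whose syntax for this expression coincides with the notation used here. This is only a quick sanity check (the class name is illustrative), not how a scanner would be built.

```java
import java.util.regex.Pattern;

class BeginsWithZero {
    // The RE 0(0|1)* written in java.util.regex syntax.
    static final Pattern RE = Pattern.compile("0(0|1)*");

    // matches() succeeds only if the whole string is in the language.
    static boolean inLanguage(String s) {
        return RE.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"0", "01", "0110", "", "1", "10"})
            System.out.println("\"" + s + "\" -> " + inLanguage(s));
    }
}
```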
COMP3131/9102 Page 81 March 4, 2018
More Example Regular Expressions: Σ = {0, 1}
RE      LANGUAGE
1       {1}
0|1     {0, 1}
1∗      {ε, 1, 11, 111, . . . }
1∗1     {1, 11, 111, . . . }
0|0∗1   the set containing 0 and all strings consisting of zero or more 0’s followed by a 1
COMP3131/9102 Page 82 March 4, 2018
Notational Shorthands
• One or more instances (+): r+ = rr∗
– denotes the language (L(r))+
– has the same precedence and associativity as ∗
• Zero or one instance (?): r? = r|ε
– denotes the language L(r) ∪ {ε}
– written as (r)? to indicate grouping (e.g., (12)?)
• Character classes:
[A-Za-z_][A-Za-z0-9_]∗
COMP3131/9102 Page 83 March 4, 2018
Regular Expressions for VC (or C)
TOKEN        RE
Identifiers  letter(letter|digit)∗
Integers     digit+
Reals        a bit long, but can be obtained from the following page by substitution
• In the VC spec, letter includes “_”
• In Java, letters and digits may be drawn from the entire
Unicode character set. Examples of identifiers are:
abc αβγ
COMP3131/9102 Page 84 March 4, 2018
Regular Grammars for Integers and Reals in VC
• Integers:
digit: 0|1|2|...|9
intLiteral: digit+
• Reals:
digit: 0|1|2|...|9
fraction: .digit+
exponent: (E|e)(+|-)?digit+
floatLiteral: digit∗ fraction exponent?
| digit+.
| digit+.?exponent
Regular grammars are a special case of CFGs (Week 3).
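Substituting the definitions of digit, fraction and exponent into floatLiteral gives a single RE. A minimal sketch in Java, assuming `java.util.regex` syntax and illustrative class and constant names:

```java
import java.util.regex.Pattern;

class VcLiterals {
    // Building blocks, named after the definitions on the slide.
    static final String DIGIT = "[0-9]";
    static final String FRACTION = "\\." + DIGIT + "+";
    static final String EXPONENT = "[Ee][+-]?" + DIGIT + "+";

    // intLiteral: digit+
    static final Pattern INT_LITERAL = Pattern.compile(DIGIT + "+");

    // floatLiteral: digit* fraction exponent? | digit+ . | digit+ .? exponent
    static final Pattern FLOAT_LITERAL = Pattern.compile(
            DIGIT + "*" + FRACTION + "(" + EXPONENT + ")?"
            + "|" + DIGIT + "+\\."
            + "|" + DIGIT + "+\\.?" + EXPONENT);

    static boolean isInt(String s)   { return INT_LITERAL.matcher(s).matches(); }
    static boolean isFloat(String s) { return FLOAT_LITERAL.matcher(s).matches(); }

    public static void main(String[] args) {
        for (String s : new String[] {"1.2", "1.", ".5e-3", "1e5", "12", "."})
            System.out.println(s + "  float? " + isFloat(s) + "  int? " + isInt(s));
    }
}
```

Note that a plain digit string such as 12 matches intLiteral only: every floatLiteral alternative requires a fraction, a trailing dot or an exponent.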
COMP3131/9102 Page 85 March 4, 2018
Finite Automata (or Finite State Machines)
A finite automaton is a 5-tuple:
(Σ, S, T, F, I)
where
• Σ is an alphabet
• S is a finite set of states
• T is a state transition function: T : S × Σ → S
• F ⊆ S is the set of final or accepting states
• I is the start state: I ∈ S.
COMP3131/9102 Page 86 March 4, 2018
Representation and Acceptance
• Transition graph:
[Figure: graphical notation for a state A, a transition from A to B on input a, a start state S (marked “start”) and a final state A]
• Acceptance: An FA accepts an input string x iff there is
some path in the transition graph from the start state to
some accepting state such that the edge labels spell out x.
COMP3131/9102 Page 87 March 4, 2018
What Language does this FA accept?
[Figure: transition graph with start state S and state A; edges labelled 0 and 1]
COMP3131/9102 Page 88 March 4, 2018
Example 1
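• The language: strings of 0’s and 1’s with an odd number of 1’s
(ε not included)
[Figure: transition graph with start state S and final state A; each state loops to itself on 0, and the edges on 1 go from S to A and from A to S]
S: even number of 1’s seen
A: odd number of 1’s seen
Σ   {0, 1}
S   {S, A}
T   T(S, 0) = S, T(S, 1) = A, T(A, 0) = A, T(A, 1) = S
F   {A}
I   S
• 01011 is recognised because
$S \xrightarrow{0} S \xrightarrow{1} A \xrightarrow{0} A \xrightarrow{1} S \xrightarrow{1} A$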
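The 5-tuple of Example 1 translates almost directly into a transition table and a loop. A minimal sketch, assuming the states S and A are numbered 0 and 1 (the numbering and the class name are illustrative):

```java
class OddOnesDfa {
    static final int S = 0, A = 1;   // S: even number of 1's seen, A: odd number

    // T[state][symbol] for the symbols 0 and 1, as listed in the table of Example 1.
    static final int[][] T = {
        {S, A},   // T(S,0) = S, T(S,1) = A
        {A, S},   // T(A,0) = A, T(A,1) = S
    };

    static boolean accepts(String input) {
        int state = S;                                   // I = S
        for (char c : input.toCharArray()) {
            if (c != '0' && c != '1') return false;      // symbol outside the alphabet
            state = T[state][c - '0'];
        }
        return state == A;                               // F = {A}
    }

    public static void main(String[] args) {
        System.out.println(accepts("01011"));
        System.out.println(accepts("0110"));
    }
}
```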
COMP3131/9102 Page 89 March 4, 2018
Implicit Error State
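• By definition, T is a function from S × Σ to S, but ...
[Figure: transition graph with start state S and state A; edges labelled 1, 1 and 0]
• If T (s, a) is undefined at the state s on input a, then
T (s, a) = error
[Figure: the same transition graph with an explicit error state and an additional edge labelled 0]
• The error state and transitions to it aren’t drawn (by convention)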
COMP3131/9102 Page 90 March 4, 2018
Deterministic FA (DFA) and Nondeterministic FA (NFA)
An FA is a DFA if
• no state has an ε-transition, i.e., a transition on input ε,
and
• for each state s and input symbol a, there is
at most one edge labeled a leaving s
An FA is an NFA if it is not a DFA:
• Nondeterministic: can make several parallel transitions on
a given input
• Acceptance: the existence of some path as per Slide 87
COMP3131/9102 Page 91 March 4, 2018
DFA or NFA? What are the Languages Recognised?
[Figure: FA with states 0–5 and start state 0; all transitions labelled a]
[Figure: FA with states 0–5 and start state 0; transitions labelled a and ε]
COMP3131/9102 Page 92 March 4, 2018
Two Examples
• NFA 1:
[Figure: NFA with states 0–5 and start state 0; all transitions labelled a]
• NFA 2:
[Figure: NFA with states 0–5 and start state 0; transitions labelled a and ε]
• The same language:
the set of all strings of a’s such that the length of each of
these strings is a multiple of 2 or 3 (ε included)
COMP3131/9102 Page 93 March 4, 2018
Real-Life DFAs
The ghosts in Pac-Man have four behaviors:
1. Randomly wander the maze
2. Chase Pac-Man, when he is within line of sight
3. Flee Pac-Man, after Pac-Man has consumed a power pellet
4. Return to the central base to regenerate
COMP3131/9102 Page 94 March 4, 2018
Real-Life DFAs
The behavior of a vending machine:
it accepts dollars and 25-cent coins, and charges $1.25 per Coke.
COMP3131/9102 Page 95 March 4, 2018
What About this Non-Real-Life NFA?
[Figure: an NFA with an ε-transition]
COMP3131/9102 Page 96 March 4, 2018
Week 2: Regular Expressions, DFA and NFA
1. Definitions of REs, DFA and NFA √
2. REs ⇒ NFA (Thompson’s construction, Algorithm 3.3, Red Dragon/Algorithm 3.23, Purple Dragon)
3. NFA ⇒ DFA (subset construction, Algorithm 3.2, Red Dragon/Algorithm 3.20, Purple Dragon)
4. DFA ⇒ minimal-state DFA (state minimisation, Algorithm 3.6, Red Dragon/Algorithm 3.39, Purple Dragon)
5. Scanner generators
• How to use them (straightforward)
• How to write them (most of the techniques are introduced today)
COMP3131/9102 Page 97 March 4, 2018
Thompson’s Construction of NFA from REs
• Syntax-driven
• Inductive: The cases in the construction of the NFA follow
the cases in the definition of REs
• Important: if a symbol a occurs several times in a RE r, a
separate NFA is constructed for each occurrence
• Thompson’s method is one of many available
COMP3131/9102 Page 98 March 4, 2018
Thompson’s Construction
• Inductive Base:
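1. For ε, construct the NFA
[Figure: start state S with an ε-transition to accepting state A]
2. For a ∈ Σ, construct the NFA
[Figure: start state S with a transition on a to accepting state A]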
• Inductive step: suppose N(r) and N(s) are NFAs for REs
r and s. Then
COMP3131/9102 Page 99 March 4, 2018
Thompson’s Construction (Cont’d)
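RE r|s : [Figure: a new start state S with ε-transitions into N(r) and N(s), and ε-transitions from both into a new accepting state A]
RE rs : [Figure: N(r) followed by N(s); the start state is that of N(r) and the accepting state is that of N(s)]
RE r∗ : [Figure: a new start state S and a new accepting state A; ε-transitions from S into N(r) and directly to A, and from the end of N(r) back to its start and on to A]
RE (r) : N((r)) is the same as N(r)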
COMP3131/9102 Page 100 March 4, 2018
Example: RE ⇒ NFA
Converting (0|10∗1)∗10∗ to an NFA
COMP3131/9102 Page 101 March 4, 2018
Example: RE ⇒ NFA
• Regular expression: (0|10∗1)∗10∗
• NFA:
[Figure: the NFA built by Thompson’s construction, with states 0–15, start state 0, and transitions labelled 0, 1 and ε]
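Thompson’s construction can be sketched as one small method per case of the RE definition. The Java below is an illustration under stated assumptions, not the course’s implementation: all names are made up, the RE is assembled by hand rather than parsed, and concatenation links the two fragments with an ε-edge instead of merging states, so state numbers differ from the figure. A simple set-of-states simulation is included to check the result.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class Thompson {
    static final char EPS = 0;   // label used for epsilon edges (an assumption of this sketch)

    static final class Edge {
        final int from, to; final char label;
        Edge(int from, char label, int to) { this.from = from; this.label = label; this.to = to; }
    }

    // An NFA fragment with exactly one start state and one accepting state.
    static final class Frag {
        final int start, accept;
        Frag(int start, int accept) { this.start = start; this.accept = accept; }
    }

    final List<Edge> edges = new ArrayList<>();
    int states = 0;

    int newState() { return states++; }
    void edge(int from, char label, int to) { edges.add(new Edge(from, label, to)); }

    // Base cases: a single edge labelled a (or epsilon).
    Frag symbol(char a) {
        Frag f = new Frag(newState(), newState());
        edge(f.start, a, f.accept);
        return f;
    }
    Frag epsilon() { return symbol(EPS); }

    // r|s: new start and accept states, epsilon edges into and out of both operands.
    Frag alt(Frag r, Frag s) {
        Frag f = new Frag(newState(), newState());
        edge(f.start, EPS, r.start);   edge(f.start, EPS, s.start);
        edge(r.accept, EPS, f.accept); edge(s.accept, EPS, f.accept);
        return f;
    }

    // rs: N(r) followed by N(s), linked here by an epsilon edge.
    Frag concat(Frag r, Frag s) {
        edge(r.accept, EPS, s.start);
        return new Frag(r.start, s.accept);
    }

    // r*: epsilon edges allow skipping N(r) or repeating it.
    Frag star(Frag r) {
        Frag f = new Frag(newState(), newState());
        edge(f.start, EPS, r.start);  edge(f.start, EPS, f.accept);
        edge(r.accept, EPS, r.start); edge(r.accept, EPS, f.accept);
        return f;
    }

    // All states reachable from the given set on epsilon edges alone.
    Set<Integer> closure(Set<Integer> in) {
        Set<Integer> out = new HashSet<>(in);
        Deque<Integer> work = new ArrayDeque<>(in);
        while (!work.isEmpty()) {
            int s = work.pop();
            for (Edge e : edges)
                if (e.from == s && e.label == EPS && out.add(e.to)) work.push(e.to);
        }
        return out;
    }

    // Simulates the NFA by tracking the set of states it could be in.
    boolean accepts(Frag nfa, String input) {
        Set<Integer> cur = new HashSet<>();
        cur.add(nfa.start);
        cur = closure(cur);
        for (char c : input.toCharArray()) {
            Set<Integer> next = new HashSet<>();
            for (Edge e : edges)
                if (e.label != EPS && e.label == c && cur.contains(e.from)) next.add(e.to);
            cur = closure(next);
        }
        return cur.contains(nfa.accept);
    }

    // Builds the NFA for (0|10*1)*10* bottom-up and runs it on the input.
    static boolean matchesExample(String input) {
        Thompson t = new Thompson();
        Frag inner = t.concat(t.concat(t.symbol('1'), t.star(t.symbol('0'))), t.symbol('1'));
        Frag loop = t.star(t.alt(t.symbol('0'), inner));
        Frag nfa = t.concat(t.concat(loop, t.symbol('1')), t.star(t.symbol('0')));
        return t.accepts(nfa, input);
    }
}
```

Each occurrence of a symbol gets its own `symbol(...)` call, which mirrors the rule that a separate NFA is constructed for every occurrence.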
COMP3131/9102 Page 102 March 4, 2018
Week 2: Regular Expressions, DFA and NFA
1. Definitions of REs, DFA and NFA √
2. REs ⇒ NFA (Thompson’s construction, Algorithm 3.3, Red Dragon/Algorithm 3.23, Purple Dragon) √
3. NFA ⇒ DFA (subset construction, Algorithm 3.2, Red Dragon/Algorithm 3.20, Purple Dragon)
4. DFA ⇒ minimal-state DFA (state minimisation, Algorithm 3.6, Red Dragon/Algorithm 3.39, Purple Dragon)
5. Scanner generators
• How to use them (straightforward)
• How to write them (most of the techniques are introduced today)
COMP3131/9102 Page 103 March 4, 2018
Example: DFA (Cont’d)
[Figure: DFA whose states are the sets of NFA states {0,1,2,5,10}, {1,2,3,4,5,10}, {6,7,9,11,12,14}, {7,8,9,12,13,14} and {1,2,4,5,10,15}; transitions labelled 0 and 1]
• The algorithm used is known as the subset construction,
because a DFA state corresponds to a subset of NFA states
• There are at most $2^n$ DFA states, where n is the total
number of the NFA states
COMP3131/9102 Page 104 March 4, 2018
Subset Construction: The Operations Used
OPERATION      DESCRIPTION
ε-closure(s)   Set of NFA states reachable from NFA state s on ε-transitions alone
ε-closure(T)   Set of NFA states reachable from some state s in T on ε-transitions alone
move(T, a)     Set of NFA states to which there is a transition on input a from some state s in T
• s: an NFA state
• T: a set of NFA states
COMP3131/9102 Page 105 March 4, 2018
Subset Construction: The Algorithm
Let s0 be the start state of the NFA;
Initially, DFAstates contains only the unmarked state ε-closure(s0);
while there is an unmarked state T in DFAstates do begin
    mark T;
    for each input symbol a do begin
        U := ε-closure(move(T, a));
        if U is not in DFAstates then
            add U as an unmarked state to DFAstates;
        DFATrans[T, a] := U;
    end;
end;
COMP3131/9102 Page 106 March 4, 2018
Subset Construction: The Definition of the DFA
Let (Σ, S, T, F, s0) be the original NFA. The DFA is:
• The alphabet: Σ
• The states: all states in DFAstates
• The start state: ε-closure(s0)
• The accepting states: all states in DFAstates containing at
least one accepting state in F of the NFA
• The transitions: DFATrans
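The subset construction and the resulting DFA definition can be sketched in Java as follows. The NFA edge list is an assumption: it is a reconstruction of the NFA for (0|10∗1)∗10∗ with states 0–15 and accepting state 9, chosen so that the construction yields the five state sets of the DFA example. Class and method names are illustrative.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class SubsetConstruction {
    static final int EPS = -1;        // label of an epsilon edge
    static final int NFA_FINAL = 9;   // assumed accepting state of the NFA

    // Assumed NFA edges, each written as {from, label, to}.
    static final int[][] EDGES = {
        {0, EPS, 1}, {0, EPS, 5}, {1, EPS, 2}, {1, EPS, 10}, {2, 0, 3}, {3, EPS, 4},
        {4, EPS, 1}, {4, EPS, 5}, {5, 1, 6}, {6, EPS, 7}, {6, EPS, 9}, {7, 0, 8},
        {8, EPS, 7}, {8, EPS, 9}, {10, 1, 11}, {11, EPS, 12}, {11, EPS, 14},
        {12, 0, 13}, {13, EPS, 12}, {13, EPS, 14}, {14, 1, 15}, {15, EPS, 4},
    };

    // epsilon-closure(T): states reachable from T on epsilon edges alone.
    static Set<Integer> closure(Set<Integer> t) {
        Set<Integer> out = new TreeSet<>(t);
        Deque<Integer> work = new ArrayDeque<>(t);
        while (!work.isEmpty()) {
            int s = work.pop();
            for (int[] e : EDGES)
                if (e[0] == s && e[1] == EPS && out.add(e[2])) work.push(e[2]);
        }
        return out;
    }

    // move(T, a): states with an a-transition from some state in T.
    static Set<Integer> move(Set<Integer> t, int a) {
        Set<Integer> out = new TreeSet<>();
        for (int[] e : EDGES)
            if (e[1] == a && t.contains(e[0])) out.add(e[2]);
        return out;
    }

    // Returns DFATrans: each DFA state mapped to its successors on inputs 0 and 1.
    static Map<Set<Integer>, List<Set<Integer>>> build() {
        Map<Set<Integer>, List<Set<Integer>>> trans = new LinkedHashMap<>();
        Deque<Set<Integer>> unmarked = new ArrayDeque<>();
        Set<Integer> start = closure(Collections.singleton(0));
        trans.put(start, new ArrayList<>());
        unmarked.add(start);
        while (!unmarked.isEmpty()) {
            Set<Integer> t = unmarked.remove();          // mark T
            for (int a = 0; a <= 1; a++) {
                Set<Integer> u = closure(move(t, a));
                if (!trans.containsKey(u)) {             // U is a new DFA state
                    trans.put(u, new ArrayList<>());
                    unmarked.add(u);
                }
                trans.get(t).add(u);                     // DFATrans[T, a] := U
            }
        }
        return trans;
    }

    // A DFA state is accepting if it contains an accepting NFA state.
    static boolean accepting(Set<Integer> dfaState) {
        return dfaState.contains(NFA_FINAL);
    }

    public static void main(String[] args) {
        for (Map.Entry<Set<Integer>, List<Set<Integer>>> e : build().entrySet())
            System.out.println(e.getKey() + (accepting(e.getKey()) ? " (accepting)" : "")
                    + "  on 0 -> " + e.getValue().get(0) + "  on 1 -> " + e.getValue().get(1));
    }
}
```

If some `move(T, a)` were empty, the empty set would simply become one more DFA state, which plays the role of the explicit error state.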
COMP3131/9102 Page 107 March 4, 2018
Week 2: Regular Expressions, DFA and NFA
1. Definitions of REs, DFA and NFA √
2. REs ⇒ NFA (Thompson’s construction, Algorithm 3.3, Red Dragon/Algorithm 3.23, Purple Dragon) √
3. NFA ⇒ DFA (subset construction, Algorithm 3.2, Red Dragon/Algorithm 3.20, Purple Dragon) √
4. DFA ⇒ minimal-state DFA (state minimisation, Algorithm 3.6, Red Dragon/Algorithm 3.39, Purple Dragon)
5. Scanner generators
• How to use them (straightforward)
• How to write them (most of the techniques are introduced today)
COMP3131/9102 Page 108 March 4, 2018
An Algorithm to Minimise DFA States
Initially, let Π be the partition with the two groups:
(1) one is the set of all final states
(2) the other is the set of all non-final states
repeat {
    Let Πnew = Π
    for (each group G in Π) {
        partition G into subgroups such that two states s and t
        are in the same subgroup iff for all input symbols
        a, states s and t have transitions on a to
        states in the same group of Π
        replace G in Πnew by the set of subgroups formed
    }
    if (Πnew = Π) stop; otherwise let Π = Πnew
}
• Begins with the most optimistic assumption
• Also used in global value numbering (COMP4133)
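The partition refinement can be sketched in a few lines of Java. The DFA below is the five-state DFA for (0|10∗1)∗10∗ with its states named A–E; which of the non-final states is called D and which E is an assumption, as are the class and method names. Each pass regroups states by a signature made of their current group and the groups their transitions lead to, and stops when no group has been split.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class MinimiseDfa {
    // T[state][symbol] with states A..E numbered 0..4 (labelling of D and E assumed).
    static final int[][] T = {
        {3, 1},   // A: on 0 -> D, on 1 -> B
        {2, 4},   // B: on 0 -> C, on 1 -> E
        {2, 4},   // C: on 0 -> C, on 1 -> E
        {3, 1},   // D: on 0 -> D, on 1 -> B
        {3, 1},   // E: on 0 -> D, on 1 -> B
    };
    static final boolean[] FINAL = {false, true, true, false, false};

    // Returns group[s] = index of the group that state s ends up in.
    static int[] minimise() {
        int n = T.length;
        int[] group = new int[n];
        for (int s = 0; s < n; s++) group[s] = FINAL[s] ? 1 : 0;   // initial partition
        int groups = 0;   // number of groups after the previous pass (0 = no pass yet)
        while (true) {
            Map<List<Integer>, Integer> ids = new LinkedHashMap<>();
            int[] next = new int[n];
            for (int s = 0; s < n; s++) {
                // Signature: own group, then the group reached on each input symbol.
                List<Integer> sig = new ArrayList<>();
                sig.add(group[s]);
                for (int target : T[s]) sig.add(group[target]);
                Integer id = ids.get(sig);
                if (id == null) { id = ids.size(); ids.put(sig, id); }
                next[s] = id;
            }
            // Passes only ever split groups, so an unchanged count means no change.
            if (ids.size() == groups) return next;
            groups = ids.size();
            group = next;
        }
    }

    public static void main(String[] args) {
        int[] g = minimise();
        for (int s = 0; s < g.length; s++)
            System.out.println("ABCDE".charAt(s) + " -> group " + g[s]);
    }
}
```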
COMP3131/9102 Page 109 March 4, 2018
Example (Cont’d): States Re-Labeled
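[Figure: the DFA with its states re-labelled A–E; transitions labelled 0 and 1]
state minimisation ⇒
[Figure: minimal-state DFA with the two states {A,D,E} and {B,C}; transitions labelled 0 and 1]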
Theoretical Result: every regular language can be recognised
by a minimal-state DFA that is unique up to state names
COMP3131/9102 Page 110 March 4, 2018
Various Conversions for an Example
• RE: (0|10∗1)∗10∗
• NFA: in Slide 102
• DFA: in Slide 110
• minimal-state DFA: in Slide 110
However, the conversions in dashed arrows are not covered.
COMP3131/9102 Page 111 March 4, 2018
Week 2: Regular Expressions, DFA and NFA
1. Definitions of REs, DFA and NFA √
2. REs ⇒ NFA (Thompson’s construction, Algorithm 3.3, Red Dragon/Algorithm 3.23, Purple Dragon) √
3. NFA ⇒ DFA (subset construction, Algorithm 3.2, Red Dragon/Algorithm 3.20, Purple Dragon) √
4. DFA ⇒ minimal-state DFA (state minimisation, Algorithm 3.6, Red Dragon/Algorithm 3.39, Purple Dragon) √
5. Scanner generators
• How to use them (straightforward)
• How to write them (most of the techniques are introduced today)
COMP3131/9102 Page 112 March 4, 2018
Scanner Generators
• Scanners generated in C
– lex (UNIX)
– flex – GNU’s fast lex (UNIX)
– mks lex (MS-DOS and OS/2)
• Scanners generated in Java
– Jflex
– JavaCC (SUN Microsystems)
COMP3131/9102 Page 113 March 4, 2018
The Scanner Spec in Jflex
user code -- copied verbatim to the scanner file
%%
Jflex directives
%%
regular expression rules
COMP3131/9102 Page 114 March 4, 2018
How a Scanner Generator Works
Token Spec using REs ⇒ NFA ⇒ DFA ⇒ minimal-state DFA ⇒
table-driven code (Jflex) – simulating a DFA on an input
or hard-wired code (Assignment 1 or §3.4 of either Dragon Book)
COMP3131/9102 Page 115 March 4, 2018
An Example: Spec
some user code
%%
LETTER=[A-Za-z_]
DIGIT=[0-9]
%%
"if" { return new Token(Token.IF, "if", pos); }
"<" { return new Token(Token.LT, "<", pos); }
"<=" { return new Token(Token.LE, "<=", pos); }
{LETTER}({LETTER}|{DIGIT})*
{ return new Token(Token.ID, "itsSpelling", pos); }
Two rules:
• The first pattern listed is used when more than one pattern matches: “if” is a keyword, not an id
• The longest prefix of the input is always matched: “<=” is one token
COMP3131/9102 Page 116 March 4, 2018
An Example: NFA
[Figure: a new start state S with ε-transitions into N(if), N(<), N(<=) and N(id)]
A DFA can also be used for each pattern.
COMP3131/9102 Page 117 March 4, 2018
An Example: NFA
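[Figure: the combined NFA. Start state 0 has ε-transitions to states 1, 4, 6 and 9; state 1 goes to 2 on i and 2 goes to 3 on f; state 4 goes to 5 on <; state 6 goes to 7 on < and 7 goes to 8 on =; state 9 goes to 10 on LETTER, and state 10 loops on LETTER, DIGIT]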
COMP3131/9102 Page 118 March 4, 2018
An Example: DFA
[Figure: DFA with start state {0,1,4,6,9} and states {2,10}, {3,10}, {5,7}, {8} and {10}; transitions labelled i, f, <, =, “LETTER but i”, “LETTER but f”, DIGIT and “LETTER, DIGIT”]
Already a minimal-state DFA!
COMP3131/9102 Page 119 March 4, 2018
A DFA Represented as a Transition Table
State         <       =     i       f       LETTER but i  LETTER but f  DIGIT
(0,1,4,6,9)   (5,7)         (2,10)          (10)
(2,10)                              (3,10)                (10)          (10)
(5,7)                 (8)
(8)
(10)                        (10)    (10)    (10)          (10)          (10)
(3,10)                      (10)    (10)    (10)          (10)          (10)
• LETTER = {i} ∪ “LETTER but i”
• Character classes reduce the table size
• The blank entries are errors
• The tables are usually sparse (pages 146 – 177 of text for compression techniques)
COMP3131/9102 Page 120 March 4, 2018
The Scanner Driver for Simulating a DFA
state = initial_state;
while (TRUE) {
    next_state = T[state][current_char];
    if (next_state == ERROR)      // cannot move any further
        break;
    state = next_state;
    if (current_char == EOF)      // input exhausted
        break;
    current_char = getchar();     // fetch the next char
}
// Backtrack to the most recent accepting state
if (such a state exists)
    /* return the corresponding token and
       reset current_char to the first char after the token */
else
    lexical_error(state);
• There should be a column in the transition table for EOF
• Need to backtrack
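The driver can be made concrete for the four-pattern example (if, <, <=, identifiers). The Java below is a sketch under assumptions: the state numbering, the token names in `ACCEPT`, the skipping of blanks and the use of the end of the string in place of an EOF column are all choices made for this illustration. It remembers the most recent accepting state while running the DFA and falls back to it when the DFA gets stuck.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TableScanner {
    // DFA states; the comments give the corresponding sets of NFA states.
    static final int START = 0;   // (0,1,4,6,9)
    static final int I = 1;       // (2,10)
    static final int IF = 2;      // (3,10)
    static final int LT = 3;      // (5,7)
    static final int LE = 4;      // (8)
    static final int ID = 5;      // (10)
    static final int ERROR = -1;

    // Token recognised in each state (null = not accepting). State (3,10) reports
    // IF rather than ID because the "if" rule is listed first.
    static final String[] ACCEPT = {null, "ID", "IF", "LT", "LE", "ID"};

    static final int[][] T = new int[6][128];
    static {
        for (int[] row : T) Arrays.fill(row, ERROR);      // blank entries are errors
        for (char c = 0; c < 128; c++) {
            boolean letter = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || c == '_';
            boolean digit = c >= '0' && c <= '9';
            if (letter) T[START][c] = ID;
            if (letter || digit) { T[I][c] = ID; T[IF][c] = ID; T[ID][c] = ID; }
        }
        T[START]['i'] = I;  T[START]['<'] = LT;
        T[I]['f'] = IF;     T[LT]['='] = LE;
    }

    static List<String> scan(String in) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < in.length()) {
            if (in.charAt(pos) == ' ') { pos++; continue; }   // skip blanks (assumption)
            int state = START, lastAccept = ERROR, lastEnd = pos;
            for (int i = pos; i < in.length(); i++) {
                char c = in.charAt(i);
                state = c < 128 ? T[state][c] : ERROR;
                if (state == ERROR) break;                    // cannot move any further
                if (ACCEPT[state] != null) { lastAccept = state; lastEnd = i + 1; }
            }
            if (lastAccept == ERROR)
                throw new IllegalArgumentException("lexical error at position " + pos);
            tokens.add(ACCEPT[lastAccept] + ":" + in.substring(pos, lastEnd));
            pos = lastEnd;                                    // resume just after the token
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(scan("if iff <= < x1"));
    }
}
```

In this small DFA every state other than the start state is accepting, so the fallback never discards characters; the bookkeeping only matters for DFAs that pass through non-accepting states, such as one for the real literals.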
COMP3131/9102 Page 121 March 4, 2018
The Output of Running Jflex on a Sample Scanner Spec
• Scanner.l: the spec for the scanner generator Jflex
jflex Scanner.l
Constructing NFA : 267 states in NFA
Converting NFA to DFA :
139 states before minimization, 106 states in minimized DFA
Old file "Scanner.java" saved as "Scanner.java~"
Writing code to "Scanner.java"
• Scanner.java: the scanner generated
javac Scanner.java
• java Scanner < test.vc
COMP3131/9102 Page 122 March 4, 2018
Limitations of Regular Expressions (or FAs)
• Cannot “count”
• Cannot recognise palindromes (e.g., racecar and rotator)
• The language of balanced parentheses
$\{\, (^n\, )^n \mid n \ge 1 \,\}$
is not a regular language
– cannot build an FA that recognises the language for all n
(can trivially build an FA for n = 3, for example)
– but can be specified by a CFG (Week 3):
P → (P ) | ( )
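The grammar can be turned into a recogniser with one recursive method, where the recursion depth provides the unbounded “counter” that an FA lacks. A minimal Java sketch with illustrative names:

```java
class Parens {
    // Recognises one P starting at index pos, following P -> ( P ) | ( ).
    // Returns the index just after the P, or -1 if there is no P at pos.
    static int parseP(String s, int pos) {
        if (pos >= s.length() || s.charAt(pos) != '(') return -1;
        pos++;                                            // consumed "("
        if (pos < s.length() && s.charAt(pos) == '(') {   // alternative ( P )
            pos = parseP(s, pos);
            if (pos < 0) return -1;
        }
        if (pos >= s.length() || s.charAt(pos) != ')') return -1;
        return pos + 1;                                   // consumed ")"
    }

    // The whole string must be exactly one P.
    static boolean accepts(String s) {
        return parseP(s, 0) == s.length();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"()", "((()))", "(()", "()()", ""})
            System.out.println("\"" + s + "\" -> " + accepts(s));
    }
}
```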
COMP3131/9102 Page 123 March 4, 2018
Chomsky’s Hierarchy
Depending on the form of production α → β, four types of grammars (and accordingly, languages) are distinguished:
GRAMMAR  KNOWN AS                          DEFINITION   LANGUAGE  MACHINE
Type 0   unrestricted grammar              α ≠ ε        Type 0    Turing machine
Type 1   context-sensitive grammar (CSGs)  |α| ≤ |β|    Type 1    linear bounded automaton
Type 2   context-free grammar (CFGs)       A → α        Type 2    stack automaton
Type 3   regular grammars                  A → w | Bw   Type 3    finite state automaton
COMP3131/9102 Page 124 March 4, 2018
Reading
• Sections 3.3 – 3.7 of either Dragon Book
• Week 3 tutorial questions (available on-line)
Week 3: Context-Free Grammars
COMP3131/9102 Page 125 March 4, 2018