J. Xue
COMP3131/9102: Programming Languages and Compilers
Jingling Xue
School of Computer Science and Engineering
The University of New South Wales
Sydney, NSW 2052, Australia
http://www.cse.unsw.edu.au/~cs3131
http://www.cse.unsw.edu.au/~cs9102
Copyright © 2018, Jingling Xue
COMP3131/9102 Page 530 May 13, 2018
Lectures on Java Byte Code Generation
1. Lecture 10: Code generation (Assignment 5)
2. Lecture 11: Code generation + Revision
Understanding Jasmin Assembly Language
1. Read syntax.bnf to understand Jasmin’s syntax
2. Read the Jasmin User Guide to understand Jasmin’s syntax
3. Read the Jasmin instruction reference manual to understand its
instructions (mapped 1-to-1 to JVM instructions)
4. To understand a particular feature, do the following:
(a) Design a Java program Test.java
(b) javac -g Test.java
(c) javap -c Test > Test.j (jasmind Test.class)
(d) Study the Jasmin code in Test.j
Lecture 10: Java Byte Code Generation
1. Translation:
• Expressions (including actual parameters)
• Statements
• Declarations (including formal parameters)
2. Allocating variable indices for local variables
3. Some special code generation issues:
• lvalue (store) vs. rvalue (load)
• assignment expressions such as “a = b[1+i] = 1”
• Expression statements such as “1 + (a = 2);”
• Short-circuit evaluations
• break and continue
• return
4. Generating Jasmin assembler directives
• .limit stack
• .limit locals
• .var
• .line
The VC Compiler
Source Code (gcd.vc)
  → Assignment 1: Scanner → Tokens
  → Assignment 2 & 3: Parser → AST
  → Assignment 4: Semantic Analyser → Decorated AST
  → Assignment 5: Code Generator → Jasmin Assembly Code (gcd.j)
  → Jasmin Assembler → Java Byte Code (gcd.class)
  → Java Virtual Machine (together with VC.lang.System.class) → Results
Object Code Generation for Register-Based Machines
• Three issues:
– register allocation
– instruction selection
– code scheduling
• These are simpler for stack-based machines, especially if
the quality of the generated code is not a concern
• Advanced Compiler Construction:
– Code Optimisations
– Object code generation for register-based machines
– Research-oriented topics
Example 1: gcd.vc
int i = 2;
int j = 4;
int gcd(int a, int b) {
    if (b == 0)
        return a;
    else
        return gcd(b, a - (a/b) * b);
}
int main() {
    putIntLn(gcd(i, j));
    return 0; // optional in VC or C/C++
}
Example 1: gcd.vc (Red: Assumed by the VC Compiler)
public class gcd {
    static int i = 2;
    static int j = 4;
    public gcd() { } // the default constructor
    int gcd(int a, int b) {
        if (b == 0)
            return a;
        else
            return gcd(b, a - (a/b) * b);
    }
    void main(String argv[]) {
        gcd vc$;
        vc$ = new gcd();
        System.putIntLn(vc$.gcd(i, j));
        return;
    }
}
Example 1: gcd.vc (cont’d)
• int main() is assumed to be:
public static void main(String argv[]) { ... }
– visitFuncDecl: a return is always emitted in case no
“return expr” is present in the main of a VC program
– visitReturnStmt: emits a RETURN rather than an
IRETURN, even if a return statement such as “return expr”
is present in the main of a VC program
• All VC functions are assumed to be instance methods with
package access
• All global variables are assumed to be static fields
with package access
• All built-in VC functions are static
Example 1: gcd.j
.class public gcd
.super java/lang/Object
.field static i I
.field static j I
; standard class static initializer
.method static <clinit>()V
iconst_2
putstatic gcd/i I
iconst_4
putstatic gcd/j I
; set limits used by this method
.limit locals 0
.limit stack 1
return
.end method
; standard constructor initializer
.method public <init>()V
.limit stack 1
.limit locals 1
aload_0
invokespecial java/lang/Object/<init>()V
return
.end method
.method gcd(II)I
L0:
.var 0 is this Lgcd; from L0 to L1
.var 1 is a I from L0 to L1
.var 2 is b I from L0 to L1
iload_2
iconst_0
if_icmpeq L5
iconst_0
goto L6
L5:
iconst_1
L6:
ifeq L3
iload_1
ireturn
goto L4
L3:
aload_0
iload_2
iload_1
iload_1
iload_2
idiv
iload_2
imul
isub
invokevirtual gcd/gcd(II)I
ireturn
L4:
L1:
nop
; set limits used by this method
.limit locals 3
.limit stack 5
.end method
.method public static main([Ljava/lang/String;)V
L0:
.var 0 is argv [Ljava/lang/String; from L0 to L1
.var 1 is vc$ Lgcd; from L0 to L1
new gcd
dup
invokenonvirtual gcd/<init>()V
astore_1
aload_1
getstatic gcd/i I
getstatic gcd/j I
invokevirtual gcd/gcd(II)I
invokestatic VC/lang/System/putIntLn(I)V
L1:
; The following return inserted by the VC compiler
return
; set limits used by this method
.limit locals 2
.limit stack 3
.end method
The Translation of the gcd Method by the Java Compiler
.method gcd(II)I // more optimised!
.limit stack 5
.limit locals 3
.var 0 is this Lgcd; from Label1 to Label2
.var 1 is arg0 I from Label1 to Label2
.var 2 is arg1 I from Label1 to Label2
Label1:
.line 10
iload_2
ifne Label0
.line 11
iload_1
ireturn
Label0:
.line 13
aload_0
iload_2
iload_1
iload_1
iload_2
idiv
iload_2
imul
isub
invokevirtual gcd/gcd(II)I
Label2:
ireturn
.end method
Code Generator as a Visitor Object
• Visitor (as an Object): implementing VC.ASTs.visitor
• Syntax-driven: traversing the AST to emit code in pre-, in-
or post-order or any of their combinations
• Classes:
Emitter.java: the visitor class for generating code
JVM.java: The class defining the simple JVM used
Instruction.java: The class defining Jasmin instructions
Frame.java: The class for info about labels, local
variable indices, etc. for a function
Code Template
• [[X]]: the code generated for construct X
• Code template: a specification of [[X]] in terms of the
codes for its syntactic components
• A code template specifies the translation of a construct
independently of the context in which it is used
− Compiled code always executes in some context
− Optimisation is the art of capitalising on context!
− Lack of context ⇒ fully general (i.e., slow) code
• Thus, inefficient code may be generated; it can be
optimised later by the compiler backend
Expressions
1. Literals
2. Variables (lvalues and rvalues)
3. Arithmetic expressions
4. Boolean expressions
5. Relational expressions
6. Assignment expressions
7. Call expressions (assignment spec)
Integer Literals
• CodeTemplate: [[IntLiteral]]: emitICONST(IntLiteral.value)
private void emitICONST(int value) {
if (value == -1)
emit(JVM.ICONST_M1);
else if (value >= 0 && value <= 5)
emit(JVM.ICONST + "_" + value);
else if (value >= -128 && value <= 127)
emit(JVM.BIPUSH, value);
else if (value >= -32768 && value <= 32767)
emit(JVM.SIPUSH, value);
else
emit(JVM.LDC, value);
}
• Visitor method:
public Object visitIntLiteral(IntLiteral ast, Object o) {
Frame frame = (Frame) o;
emitICONST(Integer.parseInt(ast.spelling));
...
return null;
}
Floating-Point Literals
• CodeTemplate: [[FloatLiteral]]: emitFCONST(FloatLiteral.value)
private void emitFCONST(float value) {
if(value == 0.0)
emit(JVM.FCONST_0);
else if(value == 1.0)
emit(JVM.FCONST_1);
else if(value == 2.0)
emit(JVM.FCONST_2);
else
emit(JVM.LDC, value);
}
• Visitor method:
public Object visitFloatLiteral(FloatLiteral ast, Object o) {
Frame frame = (Frame) o;
emitFCONST(Float.parseFloat(ast.spelling));
...
return null;
}
Boolean Literals
• CodeTemplate: [[BooleanLiteral]]: emitBCONST(BooleanLiteral.value)
private void emitBCONST(boolean value) {
    if (value)
        emit(JVM.ICONST_1);
    else
        emit(JVM.ICONST_0);
}
• Visitor method:
public Object visitBooleanLiteral(BooleanLiteral ast, Object o) {
Frame frame = (Frame) o;
emitBCONST(ast.spelling.equals("true"));
...
return null;
}
Arithmetic Expression E1 i+ E2
• Code template:
[[E1 i+ E2]]:
[[E1]]
[[E2]]
emit("iadd")
• Visitor Method:
public Object visitBinaryExpr(BinaryExpr ast, Object o) {
Frame frame = (Frame) o;
String op = ast.O.spelling;
ast.E1.visit(this, o);
ast.E2.visit(this, o);
...
else if (op.equals("i+")) {
emit(JVM.IADD);
...
}
...
• Other arithmetic operators (integral or real) handled similarly
Example 1: 1 + 100 + (200 + 40000)
• AST: the root + has the subtrees 1 + 100 and 200 + 40000
• The nodes visited in post-order per code template
• Code:
iconst_1
bipush 100
iadd
sipush 200
ldc 40000
iadd
iadd
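The post-order traversal above can be sketched with a tiny stand-alone expression tree. This is only an illustration: the `Expr` interface and the `lit`, `add` and `compile` helpers are hypothetical names, not the VC.ASTs classes or the Emitter API, and instructions are collected as strings rather than emitted to a file.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical mini-AST used for illustration only (not the VC.ASTs classes).
final class PostOrderDemo {
    interface Expr { void gen(List<String> out); }

    // Leaf node: picks the shortest constant-loading instruction,
    // mirroring the case analysis of emitICONST.
    static Expr lit(int v) {
        return out -> {
            if (v == -1) out.add("iconst_m1");
            else if (v >= 0 && v <= 5) out.add("iconst_" + v);
            else if (v >= -128 && v <= 127) out.add("bipush " + v);
            else if (v >= -32768 && v <= 32767) out.add("sipush " + v);
            else out.add("ldc " + v);
        };
    }

    // [[E1 i+ E2]] = [[E1]] [[E2]] iadd (children first, operator last).
    static Expr add(Expr e1, Expr e2) {
        return out -> { e1.gen(out); e2.gen(out); out.add("iadd"); };
    }

    static List<String> compile(Expr e) {
        List<String> out = new ArrayList<>();
        e.gen(out);
        return out;
    }

    public static void main(String[] args) {
        // 1 + 100 + (200 + 40000), with + associating to the left
        Expr e = add(add(lit(1), lit(100)), add(lit(200), lit(40000)));
        compile(e).forEach(System.out::println);
    }
}
```

Running `main` prints the same seven instructions as the listing above, one per line.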
visitFuncDecl: Frame Objects
• A new frame object created each time visitFuncDecl is called
• public Object visitFuncDecl(FuncDecl ast, Object o) {
...
frame = new Frame(true) for main or new Frame(false) otherwise
...
• The frame object passed as the 2nd arg and available at all child nodes
• The constructor of the class Frame:
public Frame(boolean _main) {
this._main = _main;
label = 0;
localVarIndex = 0;
currentStackSize = 0;
maximumStackSize = 0;
conStack = new Stack();
brkStack = new Stack();
scopeStart = new Stack();
scopeEnd = new Stack();
}
• Code will be provided
Boolean (or Logical) Expressions: E1&&E2
• Code template:
[[E1]]
ifeq L1
[[E2]]
ifeq L1
iconst_1
goto L2
L1:
iconst_0
L2:
• Visitor method:
public Object visitBinaryExpr(BinaryExpr ast, Object o) {
    Frame frame = (Frame) o;
    ...
    L1 = frame.getNewLabel();
    L2 = frame.getNewLabel();
    ast.E1.visit(this, o);
    emit(JVM.IFEQ, L1);
    ast.E2.visit(this, o);
    emit(JVM.IFEQ, L1);
    emit(JVM.ICONST_1);
    emit(JVM.GOTO, L2);
    emit(L1 + ":");
    emit(JVM.ICONST_0);
    emit(L2 + ":");
    ...
• Code must respect the short-circuit evaluation rule
• || and ! are dealt with similarly
• Better code can be generated (Week 11 Tutorial)
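A minimal stand-alone sketch of the && template follows, together with one plausible dual for ||. The class `ShortCircuitDemo`, its string-based `code` list and the use of `Runnable` for the operand generators are assumptions made for illustration; the || variant (branching with `ifne`) is an assumed analogue and not taken from the slides.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: instructions are collected as strings, not emitted by the real Emitter.
final class ShortCircuitDemo {
    private int label;                              // next free label number
    final List<String> code = new ArrayList<>();

    ShortCircuitDemo(int firstLabel) { label = firstLabel; }

    String newLabel() { return "L" + label++; }

    // [[E1 && E2]]: E2 is skipped as soon as E1 evaluates to false.
    void and(Runnable e1, Runnable e2) {
        String l1 = newLabel(), l2 = newLabel();
        e1.run();
        code.add("ifeq " + l1);
        e2.run();
        code.add("ifeq " + l1);
        code.add("iconst_1");
        code.add("goto " + l2);
        code.add(l1 + ":");
        code.add("iconst_0");
        code.add(l2 + ":");
    }

    // Assumed dual for [[E1 || E2]]: E2 is skipped as soon as E1 evaluates to true.
    void or(Runnable e1, Runnable e2) {
        String l1 = newLabel(), l2 = newLabel();
        e1.run();
        code.add("ifne " + l1);
        e2.run();
        code.add("ifne " + l1);
        code.add("iconst_0");
        code.add("goto " + l2);
        code.add(l1 + ":");
        code.add("iconst_1");
        code.add(l2 + ":");
    }

    public static void main(String[] args) {
        ShortCircuitDemo g = new ShortCircuitDemo(2);   // L0 and L1 assumed already taken
        g.and(() -> g.code.add("iconst_1"), () -> g.code.add("iconst_0"));   // true && false
        g.code.forEach(System.out::println);
    }
}
```

Because both labels are requested before either operand is visited, nested logical expressions receive fresh labels of their own.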
Example 2: Boolean Expressions: true && false
label=0
· · ·
iconst_1
ifeq L2
iconst_0
ifeq L2
iconst_1
goto L3
L2:
iconst_0
L3: label=2
· · ·
frame.getNewLabel
called twice
• The Frame object created for main
• Passed to all the children of the main’s FuncDecl node
Testing and Marking Short-Circuit Evaluation
• Example:
boolean f() {
    putBool(false);
    return false;
}
void main() {
    false && f();
}
• Wrong if "false" is printed!
Relational Expressions: E1 i> E2
• Code Template:
[[E1]]
[[E2]]
if_icmpgt L1
iconst_0
goto L2
L1:
iconst_1
L2:
• Other relational operations on integer operands handled similarly
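One way to handle the remaining integer relational operators is a table from operator to branch instruction, as in the sketch below. The operator spellings used as map keys (`i>`, `i>=`, and so on) and the `RelationalDemo` class are assumptions for illustration; only `i>` appears in the slides. The six `if_icmpXX` mnemonics are standard JVM instructions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative sketch: operand code is passed in as already-generated instruction lists.
final class RelationalDemo {
    // Assumed operator spellings mapped to the JVM compare-and-branch instructions.
    static final Map<String, String> BRANCH = Map.of(
        "i>", "if_icmpgt", "i>=", "if_icmpge",
        "i<", "if_icmplt", "i<=", "if_icmple",
        "i==", "if_icmpeq", "i!=", "if_icmpne");

    // [[E1 op E2]]: leaves 1 on the operand stack if the comparison holds, 0 otherwise.
    static List<String> compare(String op, List<String> e1, List<String> e2,
                                String l1, String l2) {
        List<String> out = new ArrayList<>(e1);
        out.addAll(e2);
        out.add(BRANCH.get(op) + " " + l1);
        out.add("iconst_0");
        out.add("goto " + l2);
        out.add(l1 + ":");
        out.add("iconst_1");
        out.add(l2 + ":");
        return out;
    }

    public static void main(String[] args) {
        // 1 + 2 > 3, using labels L2 and L3
        compare("i>", List.of("iconst_1", "iconst_2", "iadd"), List.of("iconst_3"), "L2", "L3")
            .forEach(System.out::println);
    }
}
```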
Example 3: Relational Expressions
• AST: 1 + 2 > 3, with > at the root and 1 + 2 as its left subtree
• Code – L0 and L1 generated in visitCompStmt
iconst_1
iconst_2
iadd
iconst_3
if_icmpgt L2
iconst_0
goto L3
L2:
iconst_1
L3:
Relational Expressions: E1 f> E2
• Code Template:
[[E1]]
[[E2]]
fcmpg
ifgt L1
iconst_0
goto L2
L1:
iconst_1
L2:
• There is no if_fcmpgt instruction, so it is simulated by fcmpg followed by ifgt
• Other floating-point relational operators handled similarly
Assignment Expression: a = E
• Assumptions:
(1) a is int
(2) Its local variable index is 1
• Code Template:
[[E]]
istore_1
• The above code template breaks down for a = b = 1, which needs a dup:
iconst_1
dup
istore_2 // the local var index for b is 2
istore_1
• Need to know the context in which b = 1 is used when the node for
b=1 is visited
• How? a parent link is added to every AST node
• ast.parent is not · · · ⇒ dup
Assignment Expression: LHS = RHS
• Code Template:
[[LHS]]
[[RHS]]
appropriate store instruction
• Example:
VC:
int[] a = new int[10]; // index 1
int i = 1; // index 2
int j = 2; // index 3
a[i + 1] = j + 10;
Bytecode for a[i + 1] = j + 10:
aload_1
iload_2
iconst_1
iadd
iload_3
bipush 10
iadd
iastore
Statements
1. if
2. while (“for” is left for you to work out)
3. break and continue
4. return
5. expression statement
6. compound statement
if (E) S1 else S2
• Code Template:
[[E]]
ifeq L1
[[S1]]
goto L2
L1:
[[S2]]
L2:
• Works even when either S1 or S2 or both are empty
• In the AST, if (E) S1 without the else is represented as
IfStmt
/ | \
E S1 EmptyStmt
Those instructions in blue need not be generated.
while (E) S
• Code Template:
Push the continue label L1 to conStack
Push the break label L2 to brkStack
L1:
[[E]]
ifeq L2
[[S]]
goto L1
L2:
Pop the continue label L1 from conStack
Pop the break label L2 from brkStack
• Also works when S is empty
break and continue
• Code template for break:
goto the label marking the instruction following the while
• Code template for continue:
goto the label marking the first instruction of the while
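The interplay between the while template and the two label stacks can be sketched as follows. The class `LoopDemo` is hypothetical, and `ArrayDeque` is used here in place of the `Stack` objects held by Frame; the names `conStack` and `brkStack` follow the slides.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Illustrative sketch of while/break/continue; instructions are collected as strings.
final class LoopDemo {
    private int label = 0;
    final List<String> code = new ArrayList<>();
    final Deque<String> conStack = new ArrayDeque<>();   // continue targets
    final Deque<String> brkStack = new ArrayDeque<>();   // break targets

    String newLabel() { return "L" + label++; }

    void whileStmt(Runnable cond, Runnable body) {
        String l1 = newLabel(), l2 = newLabel();
        conStack.push(l1);
        brkStack.push(l2);
        code.add(l1 + ":");
        cond.run();
        code.add("ifeq " + l2);
        body.run();
        code.add("goto " + l1);
        code.add(l2 + ":");
        conStack.pop();
        brkStack.pop();
    }

    // break jumps past the innermost enclosing loop.
    void breakStmt() { code.add("goto " + brkStack.peek()); }

    // continue jumps back to the condition of the innermost enclosing loop.
    void continueStmt() { code.add("goto " + conStack.peek()); }

    public static void main(String[] args) {
        LoopDemo g = new LoopDemo();
        // while (true) { while (true) break; continue; }
        g.whileStmt(() -> g.code.add("iconst_1"), () -> {
            g.whileStmt(() -> g.code.add("iconst_1"), g::breakStmt);
            g.continueStmt();
        });
        g.code.forEach(System.out::println);
    }
}
```

Because the labels are pushed on entry and popped on exit, a break or continue in a nested loop always resolves to the innermost enclosing while.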
return E
• Assumption: type coercion has been done.
• Code Template: return E:int and return E:boolean
[[E]]
ireturn
• Code Template: return E:float
[[E]]
freturn
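Choosing the return instruction can be sketched as a small function. The `Type` enum and `returnOpcode` are hypothetical names; the rule that main always uses a plain `return` comes from the earlier slide on how int main() is treated, and the void case is an assumption based on the standard JVM `return` instruction.

```java
// Illustrative sketch: selects the return instruction for a return statement.
final class ReturnDemo {
    enum Type { INT, BOOLEAN, FLOAT, VOID }

    static String returnOpcode(Type returnType, boolean inMain) {
        // main is compiled as a void method, so it never uses ireturn or freturn.
        if (inMain || returnType == Type.VOID) return "return";
        // int and boolean values are both returned with ireturn.
        return returnType == Type.FLOAT ? "freturn" : "ireturn";
    }

    public static void main(String[] args) {
        System.out.println(returnOpcode(Type.INT, false));
        System.out.println(returnOpcode(Type.FLOAT, false));
        System.out.println(returnOpcode(Type.INT, true));
    }
}
```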
Expression Statement: E;
• Code Template:
[[E]]
pop if it has a value left on the stack
• Examples:
1; ---> pop
1 + 2; ---> pop
f(1,2); ---> pop if the return type is not void
a = 1; ---> no pop
; ---> no pop
Compound Statements
• Code template:
Push the label marking the beginning of scope to scopeStart
Push the label marking the end of scope to scopeEnd
...
[[DL]] // no code;
[[SL]]
Pop the scopeStart label
Pop the scopeEnd label
• Code will be provided
Global Variable Declarations
• Provided for you (but only for scalar variables)
– Generate .field declarations
– Generate the class initialiser
• You need to add the initialisations for arrays
• All initialisers for global variables are assumed to be
constant expressions as in C, although this was not
checked in Assignment 4.
Local Variable Declarations
• Instance field index available in VC.ASTs.Decl.java
• Call frame.getNewIndex() to allocate indices
consecutively for formal parameters and local variables:
• For a function (treated as an instance method), 0 is
allocated to this
• For main (a static method), 0 is allocated to argv and 1 to
the implicitly declared variable vc$
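The allocation order can be sketched with a counter that starts at 0, as in the Frame constructor. The class `LocalIndexDemo` and its `allocate` helper are hypothetical. The sketch also assumes every VC variable occupies exactly one slot, which holds for int, float, boolean and array references.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch of consecutive index allocation for one function.
final class LocalIndexDemo {
    private int localVarIndex = 0;               // starts at 0, as in Frame

    int getNewIndex() { return localVarIndex++; }

    // Returns name -> index, implicit variables first, then declared ones in order.
    static Map<String, Integer> allocate(boolean isMain, String... declared) {
        LocalIndexDemo frame = new LocalIndexDemo();
        Map<String, Integer> indices = new LinkedHashMap<>();
        if (isMain) {
            indices.put("argv", frame.getNewIndex());   // static method: no this
            indices.put("vc$", frame.getNewIndex());    // implicitly declared receiver
        } else {
            indices.put("this", frame.getNewIndex());   // instance method
        }
        for (String name : declared) {
            indices.put(name, frame.getNewIndex());
        }
        return indices;
    }

    public static void main(String[] args) {
        System.out.println(allocate(false, "a", "b"));  // gcd(int a, int b)
        System.out.println(allocate(true));             // main
    }
}
```

The sizes of the two maps, 3 and 2, agree with the `.limit locals` values in gcd.j for gcd and main.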
lvalues (store) vs. rvalues (load)
• Let visitSimpleVar do nothing (because we cannot tell by
looking at this node alone whether the variable is an lvalue
or an rvalue)
• Generate an appropriate load or store in visitAssignExpr
• Consider l = r (store for l and load for r):
Generating Jasmin Directives
• .limit locals
• .limit stack
• .var
• .line
.limit locals XXX
• Generated at the end of processing a function
• XXX is the current value of frame.getNewIndex()
.var
• Syntax:
.var var-index is name type-desc from scopeStart-label to scopeEnd-label
• Generated when a variable or formal parameter declaration is processed
• var-index, name and type are extracted from the Decl node
• The scopeStart and scopeEnd labels come from the scopeStart and
scopeEnd stacks (see the Compound Statements slide)
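Formatting a .var directive can be sketched as below, following the shape of the .var lines in gcd.j. `VarDirectiveDemo`, `descriptor` and `varDirective` are hypothetical names. The type descriptors I, F and Z are the standard JVM descriptors for int, float and boolean; array types are left out of this sketch.

```java
// Illustrative sketch: builds one .var line from the pieces named on the slide.
final class VarDirectiveDemo {
    // Maps a VC scalar type name to its JVM type descriptor.
    static String descriptor(String vcType) {
        switch (vcType) {
            case "int":     return "I";
            case "float":   return "F";
            case "boolean": return "Z";
            default: throw new IllegalArgumentException("unsupported type: " + vcType);
        }
    }

    static String varDirective(int index, String name, String vcType,
                               String scopeStart, String scopeEnd) {
        return ".var " + index + " is " + name + " " + descriptor(vcType)
             + " from " + scopeStart + " to " + scopeEnd;
    }

    public static void main(String[] args) {
        System.out.println(varDirective(1, "a", "int", "L0", "L1"));
        System.out.println(varDirective(2, "b", "int", "L0", "L1"));
    }
}
```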
.line XXX
• Source line where the instructions between this .line and
the next are translated from
• Optional (leave it until the very end)
• Maintain a current line
• Generate a .line if the next construct is from a different
line
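Tracking the current line can be sketched as a thin wrapper around instruction emission. `LineDirectiveDemo` and its two-argument `emit` are hypothetical; the real Emitter may obtain the source line from the AST node's position instead.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: a .line directive is emitted only when the source line changes.
final class LineDirectiveDemo {
    private int currentLine = -1;                // no line seen yet
    final List<String> code = new ArrayList<>();

    void emit(int sourceLine, String instruction) {
        if (sourceLine != currentLine) {
            code.add(".line " + sourceLine);
            currentLine = sourceLine;
        }
        code.add(instruction);
    }

    public static void main(String[] args) {
        LineDirectiveDemo g = new LineDirectiveDemo();
        g.emit(10, "iload_2");
        g.emit(10, "ifne Label0");
        g.emit(11, "iload_1");
        g.emit(11, "ireturn");
        g.code.forEach(System.out::println);
    }
}
```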
.limit stack XXX
• XXX is the maximum depth of the operand stack
• Calculated by simulating the execution of the generated
byte code incrementally
• Example:
iconst_1 frame.push()
iconst_2 frame.push()
iadd frame.pop()
iconst_1 frame.push()
iconst_2 frame.push()
iconst_3 frame.push()
iadd frame.pop()
iadd frame.pop()
astore_1 frame.pop()
...
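The bookkeeping behind frame.push() and frame.pop() can be sketched as two counters, matching the currentStackSize and maximumStackSize fields initialised in the Frame constructor. The class `StackDepthDemo` and the two getter names are assumptions; the provided Frame class may differ in detail.

```java
// Illustrative sketch: tracks the simulated operand stack depth and its maximum.
final class StackDepthDemo {
    private int currentStackSize = 0;
    private int maximumStackSize = 0;

    void push() {
        currentStackSize++;
        if (currentStackSize > maximumStackSize) {
            maximumStackSize = currentStackSize;
        }
    }

    void pop() {
        if (currentStackSize == 0) {
            throw new IllegalStateException("operand stack underflow");
        }
        currentStackSize--;
    }

    int getCurrentStackSize() { return currentStackSize; }
    int getMaximumStackSize() { return maximumStackSize; }

    public static void main(String[] args) {
        StackDepthDemo frame = new StackDepthDemo();
        // Replays the nine instructions of the example above.
        frame.push(); frame.push(); frame.pop();      // iconst_1 iconst_2 iadd
        frame.push(); frame.push(); frame.push();     // iconst_1 iconst_2 iconst_3
        frame.pop(); frame.pop(); frame.pop();        // iadd iadd astore_1
        System.out.println(".limit stack " + frame.getMaximumStackSize());
    }
}
```

For this sequence the depth peaks at 4, reached after the third consecutive push.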
Some Language Issues
• Java byte code requires that
  – all variables be initialised
  – all methods be terminated by a return
• Neither is enforced in the VC language
• All test cases used for marking Assignment 5 will satisfy
the 1st restriction.
• You can satisfy the 2nd restriction by either having
appropriate returns in your test programs or optionally
forcing your VC compiler to always generate an
appropriate return for each function
Reading
• Chapter 7 of the on-line JVM Spec (compiling Java)
• §8.4 (Red Dragon) or §6.6.2 of Purple Dragon (for
short-circuit evaluations)
• Assignment 5 spec
Next/Last class (Last Lecture): Code Generation + Revision