COMP3131/9102: Programming Languages and Compilers

Jingling Xue
School of Computer Science and Engineering
The University of New South Wales
Sydney, NSW 2052, Australia
http://www.cse.unsw.edu.au/~cs3131
http://www.cse.unsw.edu.au/~cs9102

Copyright © 2018 Jingling Xue

COMP3131/9102 Page 410 April 21, 2018

Timetable
• Lectures:
    Week 23/4   Static Semantics                             Assn 4 released
    Week 30/4   Teaching-Free (Working on Assignment 4)
    Week 7/5    Jasmin
    Week 14/5   Code Gen                                     Assn 4 (12/5 Due), Assn 5 (Released)
    Week 21/5   Code Gen + Revision
    Week 28/5   No lecture                                   Assn 5 due on 1/6
• Tutorial: No tutorial in the Week of 30/4 – 4/5

Lecture 8: Static Semantics
The semantic analyser enforces a language's semantic constraints
1. Two types of semantic constraints:
   • Scope rules
   • Type rules
2. Two subphases in semantic analysis:
   • Identification (symbol table)
   • Type checking
3. Standard environment
4. Assignment 4:
   • The visitor design pattern
   • The two subphases combined in one pass only
This lecture + Assn 4 spec ⇒ Type Checker

Static Semantics
• Is x a variable, method, array, class or package?
• Is x declared before it is used?
• Which declaration of x does this reference?
• Is an expression type-consistent?
• Does the dimension of an array match its declaration?
• Is an array reference in bounds?
• Is a method called with the right number and types of args?
• Is break or continue enclosed in a loop construct?
• etc.
These cannot be specified using a CFG

Blocks
• Block: a language construct that can contain declarations:
  – the compilation units (i.e., the code files)
  – procedures/functions (or methods)
  – compound statements
• The two kinds of blocks in VC:
  – The entire program as one block (i.e., the outermost block)
  – compound statements { . . . }
• Block-structured language: permits the nesting of blocks
  – Examples: Ada, Pascal and Modula-2
  – C exhibits nested block structure (because { . . . } can be nested) but is not a strictly block-structured language (because functions cannot be nested inside other functions in standard C)
• Basic and COBOL: the only block is the entire program
• Fortran: the entire program plus blocks contained in the program

Scope
• The scope of a declaration is the portion of the program over which the declaration takes effect
• A declaration is in scope at a particular point in a program if and only if the declaration's scope includes that point
• Scope rules: the rules governing declarations (called defined occurrences of identifiers) and references to declarations (called applied occurrences of identifiers)
The scope rules provide the answer to: what is the declaration for this variable referenced in the program?

Scope Rules in VC
1. The scope of a function: from the point at which it is declared to the end of the file
2. The scope of a variable in a block: from the point at which it is declared to the end of the block
3. The scope of a formal parameter: the same as the local variables in the function body

     void f(int i, int j) {            void f( ) {
       int k;                 ===>       int i; int j;
       ...                               int k;

   (True in almost all languages such as C, C++ and Java.)
4. The scope of a built-in function: the entire program

Scope Rules in VC (Cont'd)
5. No identifier can be declared more than once in a block
6. Most closely nested rule: For every applied occurrence (i.e., use) of an identifier I in a block B, there must be a corresponding declaration, which is in the smallest enclosing block that contains any declaration of I.
7. Due to Rule 6, the scope of a declaration defined in each of the first four rules excludes the scope of another declaration using the same name (inside an inner block).
   • Such a gap is known as a scope hole.
   • The inner declaration is said to hide the outer declaration
   • The outer declaration is not visible within the scope of the inner declaration

Implication of Rule 1 in VC
• An illegal VC program (g is used before its declaration is in scope):

     int f() {
       g();    // not in scope
     }
     int g() {
       f();
     }
     main() { }

• Rule 1 allows identification and type checking to be done in one pass
• Pascal solves the problem by allowing forward declarations
• ANSI C and C++ solve the problem by allowing function prototypes

Example 1: Scope Rules

     int k;
     void foo() {
       int i; int j;
       i = 1; j = 7;
       putIntLn(i);     // 1
       putIntLn(j);     // 7
       {
         int i;
         i = 2;
         putIntLn(i);   // 2
         putIntLn(j);   // 7
       }
       putIntLn(i);     // 1
       putIntLn(j);     // 7
     }

[Figure annotations: "Scope Hole" marks the inner block, where the outer i is hidden; the scope of putIntLn is also marked.]

Example 2: Scope Rules

     int foo(int foo) {
       putIntLn(foo);            // 1
       {
         int putIntLn;
         putIntLn = 2;
         putFloatLn(putIntLn);   // 2.0
       }
       putIntLn(foo);            // 1
     }
     void main() {
       foo(1);
     }

[Figure annotations: two regions are marked "Scope Hole", one of them labelled for the built-in putIntLn, which is hidden inside the inner block.]
Bad programming style but compiles!

Scope Levels in Block-Structured Languages
• Scope levels in general:
  1. The declarations in the outermost block are in level 1
  2. Increment the scope level by 1 each time we move from an enclosing block to an enclosed block
  3. Typically, the pre-defined functions, variables and constants are in level 0 or 1 or the innermost level (the last is uncommon)
• Scope levels in VC:
  1. All function and global variable declarations are in level 1
  2. Rule 2 as above
  3. All built-in functions are in level 1 ⇒ They cannot be redeclared as user functions or global variables (Rule 5)
Example 1: Scope Levels

     int i;
     void foo() {
       int i; int j;
       i = 1; j = 7;
       putIntLn(i);
       putIntLn(j);
       {
         int i;
         i = 2;
         putIntLn(i);
         putIntLn(j);
       }
       putIntLn(i);
       putIntLn(j);
     }

     Identifier           Level
     Built-in putIntLn    1
     foo                  1
     i                    1
     i                    2
     j                    2
     i                    3

Example 2: Scope Levels

     int foo(int foo) {
       putIntLn(foo);
       {
         int putIntLn;
         putIntLn = 2;
         putFloatLn(putIntLn);
       }
       putIntLn(foo);
     }
     void main() {
       foo(1);
     }

     Identifier               Level
     Built-in putIntLn        1
     Built-in putFloatLn      1
     foo                      1
     foo                      2
     User-defined putIntLn    3

Lecture 8: Static Semantics
The semantic analyser enforces a language's semantic constraints
1. Two types of semantic constraints:
   • Scope rules √
   • Type rules
2. Two subphases in semantic analysis:
   • Identification (symbol table)
   • Type checking
3. Standard environment
4. Assignment 4:
   • The visitor design pattern
   • The two subphases combined in one pass only
This lecture + Tutorial 9 + Assn 4 spec ⇒ Type Checker

Static Semantics: Identification
• Identification: Relate each applied occurrence of an identifier to its declaration and report an error if no such declaration exists
• Symbol Table: Associates identifiers with their attributes
• The attributes of an identifier: whether it is a variable or a function; for a variable, its type; for a function, its result type and the types of its formal parameters
• Two approaches to representing the attributes in the table:
  – The information distilled from the declaration and stored in the symbol table itself (the typical approach), or
  – A pointer to the declaration itself (used in Assignment 4)
The Inherited Attribute decl Used in VC.ASTs.Ident.java for Decorating ASTs in Identification

     package VC.ASTs;

     import VC.Scanner.SourcePosition;

     public class Ident extends Terminal {
       // Set during identification; null until a declaration is found.
       public AST decl;

       public Ident(String value, SourcePosition position) {
         super(value, position);
         decl = null;
       }

       public Object visit(Visitor v, Object o) {
         return v.visitIdent(this, o);
       }
     }

Symbol Table Implementation in VC (Two Classes)
• IdEntry: each IdEntry object has three instance fields:
    id:     the lexeme
    level:  the scope level of id
    attr:   ptr to the corresponding declaration in the AST
• SymbolTable – one table for all scopes!
    constructor:       creates a new table and sets the scope level to 1
                       called at the start of semantic analysis
    insert:            inserts a new id entry into the table
                       called at each declaration
    retrieve:          retrieves the entry for an id
                       called at each applied occurrence of an id
    retrieveOneLevel:  retrieves the entry for an id from the current scope
    openScope:         increments the scope level by 1
                       called at the start of a block
    closeScope:        deletes all entries in the current level
                       called at the end of a block
• See VC.Checker.IdEntry.java and VC.Checker.SymbolTable.java

Two Tasks in Identification
• Processing declarations:
  • Call openScope at the start of a block
  • Call closeScope at the end of a block
  • Call insert to enter an id, along with its scope level and a pointer to the corresponding declaration, into the symbol table
• Processing applied occurrences – decorating Ident nodes
  • Call retrieve to link the field decl in an Ident node (an inherited attribute) to its corresponding declaration
  • decl = null if no corresponding declaration is found
    – Use this fact to report errors
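The SymbolTable operations listed above can be sketched as a single stack shared by all scopes. This is a minimal illustration only, not the code in VC.Checker.SymbolTable.java: the class names SketchIdEntry and SketchSymbolTable are hypothetical, attr is a plain Object standing in for the pointer to a declaration AST, and the method signatures are assumptions.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class SymbolTableDemo {
    public static void main(String[] args) {
        SketchSymbolTable table = new SketchSymbolTable();
        table.insert("i", "global i");            // level 1
        table.openScope();                        // enter a block: level 2
        table.insert("i", "local i");             // hides the level-1 i
        System.out.println(table.retrieve("i"));  // prints "local i"
        table.closeScope();                       // level-2 entries are deleted
        System.out.println(table.retrieve("i"));  // prints "global i"
    }
}

// Hypothetical stand-in for an IdEntry with the three fields from the slide.
final class SketchIdEntry {
    final String id;    // the lexeme
    final int level;    // scope level at which id was declared
    final Object attr;  // assumption: stands in for the declaration pointer

    SketchIdEntry(String id, int level, Object attr) {
        this.id = id;
        this.level = level;
        this.attr = attr;
    }
}

// One table for all scopes, kept as a stack with the newest entry on top.
final class SketchSymbolTable {
    private final Deque<SketchIdEntry> entries = new ArrayDeque<>();
    private int level = 1;  // the outermost block is level 1

    void openScope() {
        level++;
    }

    // Delete every entry declared in the current level, then leave that level.
    void closeScope() {
        while (!entries.isEmpty() && entries.peek().level == level) {
            entries.pop();
        }
        level--;
    }

    void insert(String id, Object attr) {
        entries.push(new SketchIdEntry(id, level, attr));
    }

    // Searching from the top of the stack finds the declaration in the
    // smallest enclosing block first (the most closely nested rule).
    Object retrieve(String id) {
        for (SketchIdEntry e : entries) {
            if (e.id.equals(id)) {
                return e.attr;
            }
        }
        return null;
    }

    // Search the current level only, e.g. to detect a redeclaration.
    Object retrieveOneLevel(String id) {
        for (SketchIdEntry e : entries) {
            if (e.level != level) {
                break;
            }
            if (e.id.equals(id)) {
                return e.attr;
            }
        }
        return null;
    }
}
```

A checker built on this sketch would call retrieveOneLevel before insert to enforce Rule 5, and retrieve at each applied occurrence; a null result means no declaration is in scope.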
Standard Environment
• Most languages contain a collection of predefined variables, types, functions and constants
  • Java: java.lang
  • Haskell: the standard prelude
  • VC: 11 built-in functions and a few primitive types
• At the start of identification, the symbol table contains 11 small ASTs for the 11 built-in functions

     Identifier    Level    Attr
     getInt        1        ptr to the getInt AST
     putInt        1        ptr to the putInt AST
     putIntLn      1        ptr to the putIntLn AST
     (the entries for the other 7 built-in functions)
     putLn         1        ptr to the putLn AST

Standard Environment (ASTs for Built-in Functions)
• The AST for putIntLn
  [Figure: the AST for putIntLn]
• The name of the formal parameter is set to ""
• Initialised by establishStdEnvironment() in Checker

Example 3

     void foo() {
       int i;
       putIntLn(i);
       {
         int i;
         putIntLn(i);
       }
       putIntLn(i);
     }

• The ASTs for Examples 1 and 2 are too big to be shown.
• Exercises: Print and decorate the ASTs for Examples 1 and 2

Symbol Table and Decorated AST

     Identifier           Level    Attr
     Built-in putIntLn    1
     foo                  1
     i                    2
     i                    3

[Figure: the decorated AST for Example 3; each Attr entry points to the corresponding declaration.]
The table when the block at level 3 has just been analysed

Symbol Table Implementations
• In VC: one symbol table as a stack for all scopes
• In industrial-strength compilers:
  – A new symbol table for each scope
  – Link the tables from inner to outer scopes
  – More efficient data structures for tables are used:
    • Hash tables
    • Binary search trees
• Need to handle languages that import and export scopes

Lecture 8: Static Semantics
The semantic analyser enforces a language's semantic constraints
1. Two types of semantic constraints:
   • Scope rules √
   • Type rules
2. Two subphases in semantic analysis:
   • Identification (symbol table) √
   • Type checking
3. Standard environment √
4. Assignment 4:
   • The visitor design pattern
   • The two subphases combined in one pass only
This lecture + Tutorial 9 + Assn 4 spec ⇒ Type Checker

Type Checking
• Data type: a set of values plus a set of operations on the values
• Typical checks:
  • Type checks: expressions are well typed according to the type rules
  • Flow-of-control checks: break & continue are contained in a loop, etc.
  • Uniqueness checks: the labels in a switch are distinct, etc.
• Type rules: the rules to infer the type of each language construct and decide whether the construct has a valid type
• Type checking: applying the language's type rules to infer the type of each construct and comparing that type with the expected type

Type Checking in VC
• VC is statically type checked
• Lack of structured types ⇒ simple checks:
  – Expressions: an operator is applied to compatible operands
  – Statements:
    ∗ break and continue must be contained in a loop
    ∗ The type of the expression in a return statement in a function must be assignment-compatible with the result type of the function
    ∗ unreachable statements
    ∗ Every function whose result type is int or float must have a return statement (optional)

Type Checking of Expressions in VC
• The type rules are defined informally in the document: VC Language Definition
• Essentially, one checks if an operator is applied to compatible operands
• The type of a unary operator: T1 → T2
• The type of a binary operator (including "="): T1 × T2 → T3
• The type of the function int f(int i, float f) is: int × float → int
• The compiler infers that expression E has some type T or that E is ill-typed. If E occurs in a context where T′ is expected, then the compiler checks if T is equivalent to T′
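To make the signature notation T1 × T2 → T3 concrete, the following sketch checks a list of inferred argument types against a signature and yields the result type or an error type. It is a hedged illustration only: the names Ty and Signature are hypothetical (they are not classes from the VC framework), and the check uses strict type equivalence, so it deliberately ignores the int-to-float coercion that VC also permits.

```java
import java.util.List;

final class SignatureDemo {
    public static void main(String[] args) {
        // int f(int i, float f) has type int × float → int
        Signature f = new Signature(Ty.INT, Ty.INT, Ty.FLOAT);
        System.out.println(f.apply(List.of(Ty.INT, Ty.FLOAT)));  // prints "INT"
        System.out.println(f.apply(List.of(Ty.INT, Ty.INT)));    // prints "ERROR"
    }
}

// Hypothetical type tags; the VC framework uses Type objects instead.
enum Ty { INT, FLOAT, BOOLEAN, ERROR }

// A signature T1 × ... × Tn → T, usable for operators and functions alike.
final class Signature {
    final List<Ty> params;
    final Ty result;

    Signature(Ty result, Ty... params) {
        this.result = result;
        this.params = List.of(params);
    }

    // Infer the type of an application: the result type if every actual
    // type is equivalent to the expected type (and the counts agree),
    // otherwise ERROR.
    Ty apply(List<Ty> actuals) {
        return params.equals(actuals) ? result : Ty.ERROR;
    }
}
```

The same shape covers a binary operator such as integer addition, whose signature would be int × int → int.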
The Synthesised Attribute type in Expr.java
• The abstract class Expr.java:

     package VC.ASTs;

     import VC.Scanner.SourcePosition;

     public abstract class Expr extends AST {
       // Filled in by the type checker; null until then.
       public Type type;

       public Expr(SourcePosition Position) {
         super(Position);
         type = null;
       }
     }

• All concrete expr classes inherit the instance variable type

The Synthesised Attribute type
• The abstract class Var.java:

     package VC.ASTs;

     import VC.Scanner.SourcePosition;

     public abstract class Var extends AST {
       // Filled in by the type checker; null until then.
       public Type type;

       public Var(SourcePosition Position) {
         super(Position);
         type = null;
       }
     }

• Both attributes are inherited in the concrete class SimpleVar.java

Bottom-Up Computation of type in an Expression AST
• Literal: its type is immediately known
• Identifier: its type is obtained from the inherited attribute decl associated with every Ident node
• Binary operator application: Consider E1 O E2, where O is a binary operator of type T1 × T2 → T3. The type checker ensures that E1's type is equivalent to T1 and E2's type is equivalent to T2, and thus infers that the type of E1 O E2 is T3. Otherwise, there is a type error.
• Other operator applications are dealt with similarly

Standard Environment
• StdEnvironment is a class with six static variables:

     StdEnvironment.intType     = new IntType(dummyPos);
     StdEnvironment.floatType   = new FloatType(dummyPos);
     StdEnvironment.booleanType = new BooleanType(dummyPos);
     StdEnvironment.stringType  = new StringType(dummyPos);
     StdEnvironment.voidType    = new VoidType(dummyPos);
     StdEnvironment.errorType   = new ErrorType(dummyPos);

• errorType can be assigned to an ill-typed expression whose type cannot be deduced since some of its subexpressions are ill-typed
Types of Identifiers and Literals

     public Object visitIdent(Ident I, Object o) {
       // Link the applied occurrence to its declaration, if there is one.
       Decl binding = idTable.retrieve(I.spelling);
       if (binding != null)
         I.decl = binding;
       return binding;
     }

     public Object visitIntLiteral(IntLiteral IL, Object o) {
       return StdEnvironment.intType;
     }

     public Object visitFloatLiteral(FloatLiteral IL, Object o) {
       return StdEnvironment.floatType;
     }

     public Object visitBooleanLiteral(BooleanLiteral IL, Object o) {
       return StdEnvironment.booleanType;
     }

     public Object visitStringLiteral(StringLiteral SL, Object o) {
       return StdEnvironment.stringType;
     }

Type Coercions
• At the hardware level there are two kinds of operations for, say, +:
  • integer addition when both operands are integers
  • floating-point addition when both operands are reals
• Type coercion: the compiler implicitly converts int to float, whenever necessary, when an expression is evaluated
• Each overloaded operator is associated with two non-overloaded operators: one for the integer operation and the other for the floating-point operation
• Your VC compiler needs to perform two tasks:
  – Add the i2f conversion operator, whenever necessary
  – Indicate if an operator is integral or real (e.g., i+ or f+)
  – See Assignment 4 spec for details

Example 4: Type Checking Expressions

     void main() {
       int i;
       float f;
       i = i * 1 + f;
     }

The AST for Example 4
[Figure: the AST for Example 4]

Example 4: Type Checking Expressions
[Figure: the AST for Example 4 annotated with types: i * 1 has type int, i * 1 + f has type float, and the assignment is marked error.]
Error: assignment incompatible!

Example 4: Type Coercions (Decorated AST)
[Figure: the decorated AST for Example 4 after type coercions]
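The two coercion tasks (adding i2f and choosing i+ or f+) can be sketched for a single arithmetic operator as follows. This is an illustration under stated assumptions rather than the Assignment 4 solution: BaseType, Resolved and Coercion are hypothetical names, only int and float operands are handled, and how the result is actually recorded in the AST is left to the Assignment 4 spec.

```java
final class CoercionDemo {
    public static void main(String[] args) {
        // In Example 4, i * 1 has type int and f has type float.
        Resolved r = Coercion.resolve("+", BaseType.INT, BaseType.FLOAT);
        System.out.println(r.operator + " " + r.convertLeft + " " + r.convertRight);
        // prints "f+ true false"
    }
}

// Hypothetical type tags; only the two arithmetic types are modelled.
enum BaseType { INT, FLOAT }

// The outcome of resolving one overloaded arithmetic operator.
final class Resolved {
    final String operator;       // non-overloaded operator, e.g. "i+" or "f+"
    final boolean convertLeft;   // true if the left operand needs i2f
    final boolean convertRight;  // true if the right operand needs i2f
    final BaseType type;         // type of the whole expression

    Resolved(String operator, boolean convertLeft, boolean convertRight, BaseType type) {
        this.operator = operator;
        this.convertLeft = convertLeft;
        this.convertRight = convertRight;
        this.type = type;
    }
}

final class Coercion {
    // If either operand is float, use the floating-point operator and
    // convert whichever operand is still int; otherwise stay integral.
    static Resolved resolve(String op, BaseType left, BaseType right) {
        boolean anyFloat = left == BaseType.FLOAT || right == BaseType.FLOAT;
        if (!anyFloat) {
            return new Resolved("i" + op, false, false, BaseType.INT);
        }
        return new Resolved("f" + op,
                left == BaseType.INT, right == BaseType.INT, BaseType.FLOAT);
    }
}
```

For i * 1 + f the inner multiplication resolves to i* with no conversion, and the addition resolves to f+ with i2f applied to its left operand.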
Error Detection, Reporting and Recovery: Tutorial 9
• Detection: based on type rules
• Reporting: prints meaningful error messages
• Recovery: continue checking types in the presence of errors:
  • Must avoid a cascade of spurious errors
  • An ill-typed expression is given StdEnvironment.errorType if its type cannot be determined in the presence of errors
  • Do not report an error for an expression if any of its subexpressions has type StdEnvironment.errorType

              errorType       ---> no error reported since the type
               /     \             of the left operand is errorType
         errorType    \       ---> an error is reported
          /     \      \
       true  +   1   +  2
         ^       ^      ^
      boolean   int    int

Attribute Grammar for Type Checking (Using VC's Type Rules)

     Production             Semantic Rules
     〈S〉 → 〈E〉              〈S.type〉 = 〈E.type〉
     〈E1〉 → 〈E2〉 / 〈E3〉     〈E1.type〉 =
                              〈E2.type〉 = int   and 〈E3.type〉 = int   → int
                              〈E2.type〉 = int   and 〈E3.type〉 = float → float
                              〈E2.type〉 = float and 〈E3.type〉 = int   → float
                              〈E2.type〉 = float and 〈E3.type〉 = float → float
                              else                                    → errortype
     〈E〉 → num              〈E.type〉 = int
     〈E〉 → num . num        〈E.type〉 = float

Assignment 4
• Implement a one-pass semantic analyser using the visitor design pattern
• Identification is implemented for you
• Type checking is implemented by you
  – checking
  – add i2f
  – choose a non-overloaded operation (e.g., + ⇒ i+ or f+)
• Decorated ASTs:
    The synthesised attribute type in Expr nodes
    The synthesised attribute type in SimpleVar nodes
• The symbol table is discarded once the AST is decorated

Reading
• Chapter 6 (Red Dragon) or Section 6.5 (Purple Dragon)
• TreeDrawer, TreePrinter and Unparser to understand the visitor design pattern
• On-line resources on typing if you are interested
• Assignment 4 spec
• The on-line VC language definition
Next class: JVM + Jasmin Assembly Code
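Returning to the recovery rule from the Error Detection, Reporting and Recovery slide, the sketch below shows how an error type suppresses cascading messages for true + 1 + 2. It is a minimal illustration with hypothetical names (Kind, PlusChecker, checkPlus); the real checker works on AST nodes with StdEnvironment.errorType, and the message text here is invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

final class RecoveryDemo {
    public static void main(String[] args) {
        PlusChecker checker = new PlusChecker();
        // (true + 1) + 2, checked bottom-up
        Kind inner = checker.checkPlus(Kind.BOOLEAN, Kind.INT);  // error reported here
        Kind outer = checker.checkPlus(inner, Kind.INT);         // silent: operand is ERROR
        System.out.println(outer + ", errors reported: " + checker.errors.size());
        // prints "ERROR, errors reported: 1"
    }
}

// Hypothetical type tags; ERROR plays the role of errorType.
enum Kind { INT, FLOAT, BOOLEAN, ERROR }

final class PlusChecker {
    final List<String> errors = new ArrayList<>();

    // Infer the type of left + right, reporting at most one error per
    // ill-typed subtree.
    Kind checkPlus(Kind left, Kind right) {
        // An operand that is already ERROR was reported further down the
        // tree, so propagate the error type without a second message.
        if (left == Kind.ERROR || right == Kind.ERROR) {
            return Kind.ERROR;
        }
        boolean leftOk = left == Kind.INT || left == Kind.FLOAT;
        boolean rightOk = right == Kind.INT || right == Kind.FLOAT;
        if (!leftOk || !rightOk) {
            errors.add("incompatible operands for +: " + left + " and " + right);
            return Kind.ERROR;
        }
        return (left == Kind.FLOAT || right == Kind.FLOAT) ? Kind.FLOAT : Kind.INT;
    }
}
```

Only the inner addition produces a message; the outer one inherits the error type silently, which matches the annotated tree on that slide.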