CMPE 152-04 and -05: Compiler Design Ron Mak 1 San José State University Department of Computer Engineering CMPE 152 Compiler Design Section 4 (Seminar) Section 5 (Lab) Spring 2017 GREEN SHEET CMPE 152-04 (seminar) TuTh 10:30 – 11:20 AM ENG 303 CMPE 152-05 (lab) Tu 1:30 – 4:20 PM ENG 405 Instructor: Ron Mak Office hours: Th: 2:30 – 4:30 PM Office location: ENG 250 E-mail: ron.mak@sjsu.edu Instructor web page: http://www.cs.sjsu.edu/~mak/ Class web page: http://www.cs.sjsu.edu/~mak/CMPE152/ Course description “Principles of lexical analysis, finite state automata and parsing; issues of variable declarations, variable types, control statements, function calls, nested scopes and efficient assembler target code. Prerequisite: CMPE 126, CMPE 102 (both with grade of ‘C-‘ or better). Misc/Lab: Lecture 2 hours/lab 3 hours.” 3 units Please submit into Canvas a copy of your transcript with the prerequisite courses highlighted. “Students who do not provide documentation of having satisfied the class prerequisite and co-requisite requirements (if any) by the second class meeting will be dropped from the class.” Goals of the course 1. Compiler construction. Design and build a working compiler for a programming language that you invented. Write sample programs in your language, compile your programs into byte code for the Java Virtual Machine to produce .class files, and then successfully run your programs on the JVM. 2. Software engineering. Employ the best practices of object-oriented design and team-based software engineering. A compiler is a large, complex program! Managing the development of such a program requires learning skills that are highly desired by employers. CMPE 152-04 and -05: Compiler Design Ron Mak 2 You will have the unique opportunity to invent a programming language, define its grammar, create a compiler for it, and then successfully compile and run programs written in your new language. Course learning outcomes Upon successful completion of the course, students will be able to • Understand some of the underlying theory of compiler technology. • Develop a scanner and a parser for a procedure-oriented programming language. • Produce a symbol table and intermediate code (parse trees). • Perform semantic analysis such as type checking. • Develop an interpreter that creates a suitable runtime environment and executes the source program from the intermediate code and the symbol table. • Invent a new programming language and define a grammar for it. • Use the JavaCC compiler-compiler to generate a compiler for the team’s language. • Compile programs into the assembly language of the Java Virtual Machine (JVM). • Assemble the assembly code into byte code for the JVM (create .class files). • Successfully run programs written in the team’s programming language on the JVM. • Learn critical job skills that employers look for in new college hires! o Work together in a small programming team. Students form teams of four students each. Each student must be part of a team. o Understand and modify a Big Hairy Legacy Application (a Pascal interpreter written in Java). o Practice good object-oriented design and apply modern software engineering tools and practices to successfully develop a large complex application (your team's compiler project). This is a challenging course that will demand much of your time and effort throughout the semester. Course websites Course materials such as syllabus, handouts, notes, assignment instructions, etc. can be found on the class web page at http://www.cs.sjsu.edu/~mak/CMPE152/ and on the Canvas Learning Management System course website at http://sjsu.instructure.com. Piazza at https://piazza.com provides broadcast messages and a class forum. You are responsible for registering with these sites and regularly checking them for updates. CMPE 152-04 and -05: Compiler Design Ron Mak 3 Required texts Title: Author: Publisher: ISBN: Source files: Writing Compilers and Interpreters, 3rd edition Ronald Mak Wiley Publishers, Inc. 978-0-470-17707-5 http://www.cs.sjsu.edu/~mak/CMPE152/sources.zip Title: Author: Publisher: ISBN: PDF or book: Generating Parsers with JavaCC, 2nd edition Tom Copeland Centennial Books 0-9762214-3-8 http://generatingparserswithjavacc.com/ Recommended texts Title: Author: Publisher: ISBN: Programming for the Java Virtual Machine Joshua Engel Addison-Wesley Professional 978-0201309720 Title: Author: PDF: Understanding and Writing Compilers (free download) Richard Bornat http://www.cs.sjsu.edu/~mak/CMPE152/UnderstandingAndWritingCompilers.pdf Online Pascal tutorials Pascal Tutorial looks very good. It has an online Pascal compiler. Learn Pascal also looks good, although it doesn't appear to cover set types. There may be additional reading assignments and the use of tools from the Internet.This class will progress rapidly, and you must work hard to keep up. Do not fall behind in the reading. Compiler project teams You will form project teams of four students each. Team membership is mandatory for this class. Choose your teams wisely! Once teams are formed, students may not move from one team to another. All the members of a project team will each receive the same score for each team assignment. At the end of the semester, all members of a project team will receive the same score for its compiler project. The project grade will be determined by the overall quality of the final version of the team’s artifacts and by how well the team achieved its goals to create a successful compiler. CMPE 152-04 and -05: Compiler Design Ron Mak 4 Labs, assignments, and compiler projects Lab assignments during the first half of the semester will be to • Learn the Pascal language. • Experiment with the underlying theory of compiler technology. • Modify, debug, and add features to the source code of a Pascal interpreter written in Java. Lab assignments during the second half of the semester will be to • Define a grammar for your team’s programming language. • Implement the grammar in JavaCC. • Generate a compiler for your language. • Compile programs into the assembly language of the JVM. There will be individual and team assignments. Each assignment will be worth a specified maximum number of points, depending on difficulty. You will have at least one week to complete each assignment. Late assignments will lose 20% of the maximum points and an additional 20% for each 24 hours after the due date. Most of the lab assignments will help you to design and develop your compiler. Each team should be working on its compiler project “in the background” outside of the labs, although the labs can be ideal times to discuss and find solutions to problems. The following applies to individual assignments: Postmortem report At the end of the semester, each student must also turn in a short (1 or 2 pages) individual postmortem report that includes: • A brief description of what you learned in the course. • An assessment of your accomplishments for your project team on the assignments and the web application project. • An assessment of each of your other project team members. Only the instructor will see these reports. You may study together and discuss the assignments, but what you turn in must be your individual work. Assignment submissions will be checked for plagiarism using Moss (http://theory.stanford.edu/~aiken/moss/). Copying another student’s program or sharing your program is a violation of academic integrity. Moss is not fooled by renaming variables, reformatting source code, or re-ordering functions. Violators of academic integrity will suffer severe sanctions, including academic probation. Students who are on academic probation are not eligible for work as instructional assistants in the university or for internships at local companies. CMPE 152-04 and -05: Compiler Design Ron Mak 5 Exams The midterm and final examinations will be closed book. Instant messaging, e-mails, texting, tweeting, file sharing, or any other forms of communication with anyone else during the exams will be strictly forbidden. There can be no make-up midterm examination unless there is a documented medical emergency. Make-up final examinations are available only under conditions dictated by University regulations. Class grade Your individual final class grade will be weighted as follows: 30% Assignments 35% Compiler project 15% Midterm exam 20% Final exam During the semester, you can keep track of your progress in Canvas. Each assignment and exam will be scored (given points) but not assigned a letter grade. The average score can be seen in Canvas after each assignment and the midterm exam. At the end of the semester, all the students will be ranked in the order of their weighted class scores. Students with the median score will be assigned the B grade. Higher and lower grades will then be assigned based on how the scores cluster above and below the median. Your final class grade can be adjusted up or down depending on your level and quality of participation on your project team as determined by project tracking tools and your team members' assessments of your performance. Note that “All students have the right, within a reasonable time, to know their academic scores, to review their grade-dependent work, and to be provided with explanations for the determination of their course grades.” See University Policy F13-1 at http://www.sjsu.edu/senate/docs/F13-1.pdf for more details. Classroom protocol It is very important for each student to attend classes and labs and to participate. Cell phones in silent mode, please. University policies Per University Policy S16-9, university-wide policy information relevant to all courses, such as academic integrity, accommodations, etc. will be available on Office of Graduate and Undergraduate Programs’ Syllabus Information web page at http://www.sjsu.edu/gup/syllabusinfo/” CMPE 152-04 and -05: Compiler Design Ron Mak 6 Schedule Subject to change with fair notice. Chapter readings are from the required texts. Week Dates Topics and activities WCI JavaCC 1 Jan 26 Overview of the course What are compilers and interpreters? A software framework for compilers and interpreters Form programming teams 1, 2 2 Jan 31 Feb 2 Syntax diagrams Scanning (lexical analysis) Symbol table management 3, 4 Jan 31 lab Install software Write, compile, and run a Pascal program 3 Feb 7, 9 Top-down recursive-descent parsing Parsing assignment statements and expressions Intermediate code (parse trees) 5 Feb 7 lab JFLAP exercises on finite automata and grammars Create a scanner and a symbol table 4 Feb 14, 16 Interpreting assignment statements and expressions Parsing control statements Parser error handling 6, 7 Feb 14 lab More exercises on finite automata and grammars Modify the Pascal parser for set expressions 5 Feb 21, 23 Interpreting control statements Runtime error handling Parsing declarations 8, 9 Feb 21 lab Execute Pascal set expressions 6 Feb 28 Mar 2 Parsing declarations, cont'd Semantic actions and type checking 9, 10 Feb 28 lab Parse set and complex type definitions 7 Mar 7, 9 Scope and the symbol table stack Parsing programs, procedures, and functions Parsing procedure and function calls Runtime memory management The runtime stack and activation frames 11, 12 Mar 7 lab Type check set and complex expressions Create a runtime library for complex arithmetic 8 Mar 14, 16 Passing parameters by value and by reference Interpreting Pascal programs Midcourse review Midterm exam Thursday, March 16 12 Mar 14 lab Compile and interpret Pascal programs Debug the interpreter CMPE 152-04 and -05: Compiler Design Ron Mak 7 9 Mar 21, 23 A simple DFA scanner BNF grammars for programming languages The JavaCC compiler-compiler 1 Mar 21 lab Create a grammar for your programming language SPRING BREAK 10 Apr 4, 6 Generating a scanner with JavaCC Generating a parser with JavaCC JJTrees 2, 3, 4 Apr 4 lab Use JavaCC to generate a scanner for your programming language 11 Apr 11, 13 JavaCC error handling The Java Virtual Machine (JVM) architecture Jasmin assembly language Code templates and code generation 15 7 Apr 11 lab Use JavaCC to generate a parser for your programming language 12 Apr 18, 20 Code for expressions Code for assignment statements 16 Apr 18 lab The Java Virtual Machine (JVM) architecture Learn the Jasmin assembly language 13 Apr 25, 27 Code for procedure and function calls Code to pass parameters by value and by reference Code for string operations 17 Apr 25 lab Generate Jasmin assembly code for expressions, assignment statements, and function calls 14 May 2, 4 Code for control statements Code for arrays Code for records 18 May 2 lab Generate Jasmin assembly code for control statements 15 May 9, 11 Executing compiled Pascal programs Bottom-up parsing Yacc and Lex Code optimization 19 May 9 lab Generate Jasmin assembly code for arrays and records 16 May 16 Compiling object-oriented languages An interactive source-level debugger A multi-threaded GUI-based debugger Heap, stack, and garbage collection Course review 19 May 16 lab Complete your compiler Exercises on Yacc and Lex May 17 Final projects due Wednesday, May 17 Final May 22 Final exam Monday, May 22 9:45 AM – noon, ENG 303