CS 4100: Introduction to Formal Languages and Compilers

Spring 2016

ARCHIVED PAGE: 2016 OFFERING OF CS4100

Formal Languages and Compilers

An upper-level course for CS majors on formal language theory and compilers.

Topics (subject to revision): regular expressions; finite automata; context-free grammars; predictive parsing; LR parsing; abstract syntax; type systems and type-checking; stack layout and activation records; intermediate representations; control-flow graphs; static single assignment (SSA) form; dataflow/liveness analysis; register allocation; garbage collection/runtimes; the LLVM compiler infrastructure. Over the course of the semester, students will implement a fully functioning compiler for a small imperative programming language, targeting LLVM. The course involves a significant amount of programming.

Lecture: Tuesday, Thursday 1:30–2:50 p.m., ARC 221
Professor: Gordon Stewart (gstewart@ohio.edu)
TA: Sam Merten (sm137907@ohio.edu)
Office Hours: Monday, Thursday 11:00 a.m.–12:00 p.m. (Stocker 355), or by appointment
Lab Hours: the Mondays before assignments are due, 3:15–4:30 p.m., Stocker 307A
Piazza: Course Page, Signup

Course Objectives

After completing the course, students will have

Textbooks and Software

The primary texts are

Hard copies of these books are certainly worthwhile, but before you buy, I urge you to check out the electronic reserves first. If you don't mind reading on your laptop screen, the electronic versions may save you some money!

Periodically I may assign additional supplementary (optional but recommended) readings from resources such as

all of which are freely available online.

Prerequisites

CS 3200 and 3610, but also: some mathematical maturity (at the level of "I've seen and done a few proofs before"), facility with a couple of different programming languages, and a desire to learn.

Course Structure

The course consists of twice-weekly lectures (Tuesdays and Thursdays), attendance at which is required. To help get you up to speed with OCaml and the course programming assignments, we'll also hold biweekly lab hours (Stocker 307A, Mondays 3:15–4:30 p.m.). Although attendance at the lab hours is optional, I highly recommend that you attend, at least for the first few weeks of the course. The programming assignments for this course are extensive and time-consuming, so be prepared!

In addition to biweekly homework assignments, there will be a midterm exam (Week 7, approximately 15% of your grade) and a final exam (Week 15, approximately 30%). The biweekly homeworks are worth approximately 40%. Each Tuesday there will be a quiz with probability 1/3; quizzes are worth 5% in total. Participation and attendance at lecture are worth 10%.

Blackboard will be used only to report grades and to post lecture notes. Up-to-date information on all other aspects of the course (assignment due dates, etc.) will be posted on this website, on the Piazza page, or both.

Schedule (Tentative)

Intro. to Compilers, OCaml
W1: 1/11-15
Introduction to compilers and functional programming in OCaml.
Reading: Appel 1; OCaml Manual: Core Language
Homework: A0: Intro. to OCaml. Due Tuesday, 1/19
Lecture Thu. 1/21 Sam Merten leads "OCaml Bootcamp"
W2: 1/18-22
More functional programming: polymorphism, higher-order functions, algebraic datatypes and pattern-matching.
Supplemental Reading: RWO I.1, OCaml Pervasives Library (reference)
Homework: A1: Functional Programming in OCaml. Due Tuesday, 1/26
Lexing and Parsing
W3: 1/25-29
Regular expressions, DFAs and NFAs
Reading: Appel 2; Mozgovoy 1 (available in Blackboard)
Homework: A2: Regular Expressions Re-Examined. Due Tuesday, 2/9
W4: 2/1-5
Lexer generators, ocamllex, recursive descent, LL/LR parsing
Reading: Appel 3 (through Section 3.2)
Lecture Thu. 2/11 In-Class Lab (Sam Merten)
W5: 2/8-12
Parsing cont'd, Parser generators, Menhir
Reading: Appel Sections 3.3-3.5
Homework: A3: Lexing and Parsing with ocamllex and Menhir. Due Tuesday, 2/23
Types and Type-Checking
W6: 2/15-19
Abstract syntax trees, type systems
Reading: Appel 4, TAPL 8 (OU Library eBook)
W7: 2/22-26
Type systems continued.
Reading: Appel 5
Midterm Exam: Thursday 2/25
W8: 2/28-3/5 Spring Break, No Class
W9: 3/7-11
Symbol Tables, Type-checking
Reading:
Homework: A4: Type-checking. Due Tuesday, 3/22
Intermediate Representations
W10: 3/14-18
Stack layout and activation records, control-flow graphs, dominator computation, loop optimizations
Reading: Appel 6.1, Appel 7.1, Appel 18.1
W11: 3/21-25
Use-def, dataflow/liveness analysis, Static Single Assignment (SSA) form, interference graphs
Reading: Appel 10.1, Appel 19 (up to but not including 19.1)
Homework: A5: SSA. Due Tuesday, 4/5
Intro. to LLVM
W12: 3/28-4/1
Introduction to LLVM assembly and the LLVM compiler toolkit
Reading: AOSA: LLVM, by Chris Lattner
Runtimes and Garbage Collection
W13: 4/4-8
Intro. to runtimes, garbage collection; mark-and-sweep collection, copying collection, reference counting, generational collection
Reading: Appel 13, through 13.4
Homework: A6: LLVM. Due Tuesday, 4/19
W14: 4/11-15
Garbage collection cont'd, runtime representations of objects, of functions (closures), of polymorphic variables
Reading: Appel 14.1-14.3 (objects), Appel 16.3 (polymorphism)
Register Allocation + Review
W15: 4/18-22
Register allocation
Reading: Appel 11 through 11.3
April 25-29: Final Exams

Homework and Collaboration Policies

Homework will usually be due Tuesdays, by the start of class (1:30 p.m.). Late homework assignments will be penalized according to the following formula:

You may discuss the homework with other students in the class, but only after you've attempted the problems on your own first. If you do discuss the homework problems with others, write the names of the students you spoke with, along with a brief summary of what you discussed, in a README comment at the top of each submission. Example:

(* README Gordon Stewart, Assn #1
I worked with X and Y. We swapped tips regarding the use of pattern-matching in OCaml. *)

However, under no circumstances are you permitted to share or directly copy code or other written homework material, except with course instructors. The code and proofs you turn in must be your own. Remember: homework is there to give *you* practice in the new ideas and techniques covered by the course; it does you no good if you don't engage!

In general, students in EECS courses such as this one should adhere to the Russ College of Engineering and Technology Honor Code, and to the OU Student Code of Conduct.

Students with Disabilities

If you suspect you may need an accommodation based on the impact of a disability, please contact me privately to discuss your specific needs. If you're not yet registered as a student with a disability, contact the Office of Student Accessibility Services first.