A4: Type-Checking

Due: Tuesday, 3/22

Type-Checking Grumpy

Your job in this assignment is to implement a type-checker for Grumpy, in file tycheck.ml, following the type-checking rules given in the Grumpy language specification. Before doing any actual programming, read through all the instructions below. And as always, ask early on Piazza if something's unclear!

Pair Programming

On this assignment (in contrast with previous assignments), you may — if you like — pair-program with up to one other person. If you do so, write that person's name in a comment at the top of your tycheck.ml. For example:
  (* NAME: Your name
     OUID: Ohio University ID

     I worked with ... on this assignment. *)
Each student should individually turn in tycheck.ml on Blackboard, regardless of whether you worked with someone else. Pair programming does not mean each student does half of the assignment. Instead, it means the two of you construct tycheck.ml collaboratively, both sitting at the same computer screen.

1. Download the assignment files

First, download the assignment files and unzip the resulting gzipped tarfile into a new directory.

UPDATE 3/10/16: Small tweaks to a4.tgz. Be careful not to overwrite your work when you re-download and unpack.
$ tar xzvf a4.tgz
In the resulting directory src you'll find the following file structure:
  src/               -- compiler source files
    Makefile         -- the project Makefile
    _tags            -- the tags file for ocamlbuild
    AST.mli          -- language-independent abstract syntax stuff
    AST.ml           -- associated helper functions
    exp.mli          -- the definition of Grumpy's abstract syntax
    exp.ml           -- associated functions
    lexer.mll        -- ocamllex source file (stub)
    parser.mly       -- Menhir source file (stub)
    tycheck.mli      -- The type-checker interface
    tycheck.ml       -- The type-checker (Part 2)
    grumpy.ml        -- the toplevel compiler program    
    tests/           -- test cases
To build the project, type
$ make
As in a3, you'll see a bunch of warnings at this point:
  File "parser.mly", line 13, characters 15-20:
  Warning: the token WHILE is unused.
  Finished, 22 targets (0 cached) in 00:00:00.
That's OK. The lexer and parser files are the same stubs that were given to you in the last assignment. Before you get started on this assignment, copy your own lexer.mll and parser.mly in their place.

If your lexer and parser don't work quite right, you can request working versions from Sam or from me. Just shoot one of us an email. We only ask that you don't share these files with others or post them on the internet.

Tests

As in the previous assignment, you can run the tests by doing

  $ make test
or by typing ./run.sh from within the tests directory.

For this assignment, the *.expected files in the tests directory contain type-annotated pretty-printed versions of each *.gpy program. The run.sh script checks — for each .gpy file — that your type-checker produced the same annotations as ours did. For example, the Grumpy program tests/test11-integer-equality.gpy:

  3 = 3
produces the following reference output (in tests/test11-integer-equality.gpy.expected) when type-checked:
  (((3 : int) == (3 : int)) : bool)

We also include a limited number of bad Grumpy programs, named fail*.gpy, on which your type-checker should fail with a type error. Initially, the test script may tell you you're passing these test cases, but only because the stub type-checker fails indiscriminately on every test case.

2. tycheck.ml

TL;DR Complete the definition of function tycheck_prog, which takes as input the abstract syntax of a Grumpy program (without type annotations) and, if the program type-checks, returns a copy of that abstract syntax in which each sub-expression is annotated with its type.

Your job in this part is to implement the type-checking functions sketched out as stubs in src/tycheck.ml.

Start by opening tycheck.ml. You'll see a bunch of functions, such as

  let rec tycheck (gamma : ty_env) (e : 'a exp) : ty exp = 
    ...
  | _ -> raise_ty_err "Unimplemented"
Many of these functions raise the type error "Unimplemented". Practically speaking, your job is to fill in the definition of each of these functions, at the spots at which they're currently raising "Unimplemented".

At a higher level, though, you'll need to read and understand the typing rules given in the Grumpy language specification and then to implement them in code.

For example, consider the rule for typing integers:

  ------------------------------------------------------- T_Num
    G;D |- num_int : int
which states that, in typing context G (mapping program variables to their types) and function typing context D (mapping function names to their return types and the types of their arguments), any integer literal num_int has type int.

The implementation of this rule in code is:

  let rec tycheck (gamma : ty_env) (e : 'a exp) : ty exp = 
    match e.exp_of with
      | EInt i -> { e with exp_of = EInt i; ety_of = TyInt }
(with the other cases elided). Notice that tycheck does two things:
  1. It returns a new expression (of type ty exp):
      { e with exp_of = EInt i; ety_of = TyInt } 
    in which EInt i has been annotated with its type TyInt.
  2. By not raising a type error (e.g., by not calling the function raise_ty_err), it indicates to the calling context (in src/grumpy.ml) that e is well-typed.
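To make these two behaviors concrete, here is a minimal, self-contained sketch. It does not use the assignment's real definitions: the types ty and exp, the constructor EBool, the exception Ty_err, and the helper well_typed are all toy stand-ins invented for this illustration, and the typing context gamma is left out for brevity. The definitions in exp.mli and tycheck.ml will differ.

```ocaml
(* Toy stand-ins for illustration only; NOT the definitions from exp.mli. *)
type ty = TyInt | TyBool

type raw_exp =
  | EInt of int
  | EBool of bool              (* hypothetical case, left unimplemented below *)

(* An expression carries an annotation of type 'a:
   unit before type-checking, ty afterwards. *)
type 'a exp = { exp_of : raw_exp; ety_of : 'a }

exception Ty_err of string
let raise_ty_err (msg : string) = raise (Ty_err msg)

(* Behavior 1: return the same expression, now annotated with its type. *)
let tycheck (e : 'a exp) : ty exp =
  match e.exp_of with
  | EInt _ -> { e with ety_of = TyInt }
  | _ -> raise_ty_err "Unimplemented"

(* Behavior 2: a caller learns that e is well-typed simply because
   no Ty_err was raised. *)
let well_typed (e : 'a exp) : bool =
  match tycheck e with
  | _ -> true
  | exception Ty_err _ -> false
```

In this sketch, well_typed { exp_of = EInt 3; ety_of = () } evaluates to true, while the unimplemented EBool case is rejected. The same thing explains why a stub type-checker initially appears to "pass" the fail*.gpy tests.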

As a second example, consider the rule for type-checking program variables:

  (id,ty) \in G 
  ------------------------------------------------------- T_Id
  G;D |- id : ty
which states that id has type ty as long as the ordered pair (id,ty) is in the type context G. Our implementation of the T_Id rule is:
  | EId x -> 
     (match Symtab.get x gamma with
      | None ->
        raise_ty_err
          (pp_to_string 
            (fun ppf -> 
               fprintf ppf "unbound identifier '%a'@ at position %a" 
                 pp_id x pp_pos e))
      | Some t -> 
        { e with 
             exp_of = EId x; 
             ety_of = t
        })
Symtab.get x gamma looks up identifier x in gamma (which corresponds to G in the typing rule), returning Some t if x is bound to type t and None otherwise. If x is bound, we return the new expression EId x annotated with type t. Otherwise, we raise a type error reporting that x is unbound, together with its source-code position.
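The same pattern carries over to rules with premises about sub-expressions: type-check each sub-expression recursively, inspect the resulting annotations, and either build the annotated node or raise an error. The sketch below is a toy, not the assignment's code. It uses an association list in place of Symtab, a locally defined Ty_err exception in place of raise_ty_err, and a hypothetical EEq constructor whose rule (two int operands yield bool) is only a guess consistent with the reference output shown earlier for test11-integer-equality.gpy. Consult the language specification for the actual rules.

```ocaml
(* Toy stand-ins for illustration only; NOT the definitions from exp.mli. *)
type ty = TyInt | TyBool

type 'a exp = { exp_of : 'a raw_exp; ety_of : 'a }
and 'a raw_exp =
  | EInt of int
  | EId of string
  | EEq of 'a exp * 'a exp     (* hypothetical equality node *)

exception Ty_err of string

(* Assumption: gamma is an association list here; the real code uses Symtab. *)
type ty_env = (string * ty) list

let rec tycheck (gamma : ty_env) (e : 'a exp) : ty exp =
  match e.exp_of with
  | EInt i -> { exp_of = EInt i; ety_of = TyInt }
  | EId x ->
    (match List.assoc_opt x gamma with
     | None -> raise (Ty_err ("unbound identifier '" ^ x ^ "'"))
     | Some t -> { exp_of = EId x; ety_of = t })
  | EEq (e1, e2) ->
    (* Check both operands first, then inspect their annotations. *)
    let e1' = tycheck gamma e1 in
    let e2' = tycheck gamma e2 in
    (match e1'.ety_of, e2'.ety_of with
     | TyInt, TyInt -> { exp_of = EEq (e1', e2'); ety_of = TyBool }
     | _ -> raise (Ty_err "'==' expects two int operands"))

(* Usage: build an unannotated tree for x == 3 and check it with x : int. *)
let unannotated (r : unit raw_exp) : unit exp = { exp_of = r; ety_of = () }

let checked : ty exp =
  tycheck [ ("x", TyInt) ]
    (unannotated (EEq (unannotated (EId "x"), unannotated (EInt 3))))
```

Note that the annotated node is rebuilt from the already-checked sub-expressions e1' and e2', so every level of the returned tree carries a type.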

Hints

  • The function Symtab.get is defined as part of Grumpy's symbol table module, in files symtab.mli and symtab.ml.
  • For more information on pretty-printing using fprintf and formatters, take a look at the Batteries documentation on module BatFormat.
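  • In OCaml, inner match expressions (as in the implementation of T_Id above) should be surrounded by parentheses; otherwise, the cases that follow the inner match are parsed as part of it rather than as cases of the outer match.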
  • Recall that the syntax
            { e with 
                 exp_of = EId x; 
                 ety_of = t
            }
    
    performs a functional record update: it returns a new record equal to e except that field exp_of is EId x and field ety_of is t. The original e is left unchanged.
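The last two hints can be tried out in isolation. The following sketch uses made-up names (node, name, line, ann, describe) that have nothing to do with the assignment's AST. It shows that a record update leaves the original record untouched and may change the record's type parameter, and that a parenthesized inner match keeps the outer cases attached to the outer match.

```ocaml
(* Hypothetical record type for illustration; not part of the assignment. *)
type 'a node = { name : string; line : int; ann : 'a }

let n : unit node = { name = "x"; line = 7; ann = () }

(* Record update: copies name and line from n and replaces ann.
   Because ann is the only field mentioning 'a, the result may have a
   different type parameter (string node instead of unit node). *)
let n' : string node = { n with ann = "int" }

(* Nested match: the parentheses end the inner match, so the final
   "| None" case belongs to the outer match on a. *)
let describe (a : int option) (b : int option) : string =
  match a with
  | Some _ ->
    (match b with
     | Some _ -> "both"
     | None -> "only a")
  | None -> "no a"
```

If the parentheses in describe were dropped, the final None case would be read as a third case of the inner match. The compiler would then warn about an unused case and a non-exhaustive outer match.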

3. Submit

Submit your tycheck.ml on or before the due date, via Blackboard.

4. Piazza

Finally: if any of these instructions are unclear, ask for clarification early and often on Piazza! I want everyone to succeed (and have fun!) on this assignment.