Final Project Ideas
Done in 2015 by Charlie Murphy and Patrick Gray:
Verified Machine Learning in Coq: Perceptron
The Perceptron is a simple linear classifier that learns, from training data,
how to classify real-valued feature vectors V as "positive" or "negative".
Potential tasks:
- Implement Perceptron in Coq. You might assume, first, that feature
  vectors are integer-valued. Then work up to operating with real
  numbers in Coq, using a standard library like Coq.Reals.
- Prove some property of your Perceptron implementation. For example,
  prove that Perceptron training converges assuming the input training
  set is linearly separable (i.e., the positive examples can be
  distinguished from the negative ones by some hyperplane).
- As a bonus, learn about extraction in Coq in order to generate, from
  your Coq Perceptron implementation, executable code in the functional
  programming language OCaml. Benchmark your implementation to see how
  fast it is!
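To give a feel for the integer-valued starting point, here is a minimal sketch of the core Perceptron operations over lists of Z. The names (dot, predict, update) are illustrative, not from any standard library:

```coq
Require Import ZArith List.
Import ListNotations.
Open Scope Z_scope.

(* Dot product of two integer feature vectors. *)
Fixpoint dot (v w : list Z) : Z :=
  match v, w with
  | x :: v', y :: w' => x * y + dot v' w'
  | _, _ => 0
  end.

(* Predict +1 or -1 from weights w and bias b on features v. *)
Definition predict (w : list Z) (b : Z) (v : list Z) : Z :=
  if Z.leb (dot w v + b) 0 then -1 else 1.

(* One training step on example (v, label), where label is +1 or -1:
   if the example is misclassified, nudge the weights toward it. *)
Definition update (w : list Z) (b : Z) (v : list Z) (label : Z)
  : list Z * Z :=
  if Z.eqb (predict w b v) label
  then (w, b)
  else (map (fun p => fst p + label * snd p) (combine w v), b + label).
```

Training would then iterate update over the training set until no example is misclassified; the convergence theorem mentioned above bounds how many such updates can occur when the data is linearly separable.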
Verified Algorithms and Data Structures: 2-3 Trees
2-3 Trees, invented by John Hopcroft in 1970, are a balanced
search-tree data structure, similar in purpose to AVL or Red-Black
trees, in which each internal node has two or three children.
Potential tasks:
- Implement 2-3 trees in Coq (lookup, insert, delete).
- Prove your 2-3 tree implementation correct with respect to a set
  abstraction of the data structure, as in the midterm on splay trees.
- Interesting but difficult: prove a theorem stating that lookup,
  insert, and delete each run in O(log n) time.
- As a bonus, learn about extraction in Coq in order to generate, from
  your 2-3 tree implementation, executable code in the functional
  programming language OCaml. Benchmark your implementation to see how
  fast it is!
One possible reference: Sozeau, Finger Trees in Coq
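As a starting point, here is one possible shape for the datatype and lookup, sketched over Z keys. The names (tree23, lookup) are illustrative, and this simple inductive type does not by itself enforce the invariant that all leaves sit at the same depth; you would state that separately (or bake it into a more refined type):

```coq
Require Import ZArith.
Open Scope Z_scope.

(* A 2-3 tree: internal nodes hold one key and two children,
   or two keys and three children. *)
Inductive tree23 : Type :=
| Leaf  : tree23
| Node2 : tree23 -> Z -> tree23 -> tree23
| Node3 : tree23 -> Z -> tree23 -> Z -> tree23 -> tree23.

(* Membership test, guided by the keys at each node. *)
Fixpoint lookup (x : Z) (t : tree23) : bool :=
  match t with
  | Leaf => false
  | Node2 l k r =>
      if Z.ltb x k then lookup x l
      else if Z.ltb k x then lookup x r
      else true
  | Node3 l k1 m k2 r =>
      if Z.ltb x k1 then lookup x l
      else if Z.eqb x k1 then true
      else if Z.ltb x k2 then lookup x m
      else if Z.eqb x k2 then true
      else lookup x r
  end.
```

Insert and delete are where the balancing work happens: insertion may split an overfull node and propagate a key upward, which is what keeps every leaf at the same depth.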
Inverse Transform Sampling in Coq
A discrete distribution D over outcomes of type A can be represented in
Coq as a function that maps each value of type A to a (rational, Q, or
real) probability in the range [0,1], subject to \sum_{a:A} D a = 1.
The CDF (cumulative distribution function) of D is the function F(a)
that gives the probability that an outcome is "less than or equal to"
a, for some ordering of the possible outcomes in A. The inverse of the
CDF, a function that maps values in [0,1] to associated outcomes of
type A, can be used to sample from the distribution, assuming access to
a source of uniform random values in the range [0,1].
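As a tiny concrete instance of the idea, consider a hypothetical biased coin over bool (the names D and invF are illustrative):

```coq
Require Import QArith.
Open Scope Q_scope.

(* Hypothetical biased coin: P(true) = 3/10, P(false) = 7/10. *)
Definition D (b : bool) : Q := if b then 3#10 else 7#10.

(* With the ordering false < true, the CDF is
   F(false) = 7/10 and F(true) = 1, so the inverse CDF maps a
   uniform draw u in [0,1] to false when u <= 7/10, else true. *)
Definition invF (u : Q) : bool :=
  if Qle_bool u (7#10) then false else true.
```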
Potential tasks:
- Define a notion of discrete probability distribution over values of
  type A in Coq, as a function that maps each a : A to a real (or
  perhaps rational) number.
- Define cumulative distribution functions (CDFs).
- Define inverse CDFs.
- Axiomatize a function that produces uniform random values in the
  range [0,1], in order to define a sampling function over inverse
  CDFs.
- Prove that your sampling procedure produces values distributed
  according to the original distribution D. The difficulty of this last
  part is probably high!
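For a finite outcome type given by an enumeration, the sampling function from the tasks above might be sketched like this (the names Dist and sample are illustrative; in the project the draw u would come from the axiomatized uniform source):

```coq
Require Import QArith List.
Import ListNotations.
Open Scope Q_scope.

(* A discrete distribution over A, as a probability-mass function. *)
Definition Dist (A : Type) := A -> Q.

(* Inverse-transform sampling relative to a fixed enumeration of the
   outcomes: walk the enumeration, subtracting each outcome's mass
   from the draw u, and return the first outcome whose mass covers
   what remains of u. Returns None if u exceeds the total mass. *)
Fixpoint sample {A : Type} (D : Dist A) (enum : list A) (u : Q)
  : option A :=
  match enum with
  | [] => None
  | a :: rest =>
      if Qle_bool u (D a) then Some a
      else sample D rest (u - D a)
  end.
```

The correctness statement would then say: for u uniform on [0,1], the probability that sample D enum u returns Some a equals D a.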
Monte Carlo in Coq: Estimating Pi
As this page demonstrates, it's possible to estimate pi=3.14159...
using a Monte Carlo-style random simulation: choose points uniformly at
random within the bounding box (-1,-1), (1,1); then pi can be estimated
from the ratio of the number of points within the circle of radius R=1
centered at (0,0) [proportional to the circle's area, pi*R^2] to the
total number of points in the bounding box [proportional to the box's
area, 4R^2], giving pi ~ 4 * (points inside) / (total points).
Potential tasks:
- Define a notion of discrete probability distribution over values of
  type A in Coq, as a function that maps each a : A to a real (or
  perhaps rational) number.
- Define a notion of Monte Carlo simulation over probability
  distributions, as a function that operates over a sequence of random
  draws.
- Using your formulation of Monte Carlo, define the approximation of pi
  described above.
- Prove that your approximation is equal to the real value of pi up to
  some approximation factor. The difficulty of this last part is
  probably high!
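Once the sequence of random draws is available, the estimator itself is simple. Here is a minimal sketch over a given list of rational-coordinate points; the names inside and estimate_pi are illustrative:

```coq
Require Import ZArith QArith List.
Import ListNotations.
Open Scope Q_scope.

(* A point (x, y) lies inside the unit circle when x^2 + y^2 <= 1. *)
Definition inside (p : Q * Q) : bool :=
  let (x, y) := p in Qle_bool (x*x + y*y) 1.

(* pi ~ 4 * (#points inside the circle) / (#points total). *)
Definition estimate_pi (pts : list (Q * Q)) : Q :=
  let hits := Z.of_nat (length (filter inside pts)) in
  (4 * hits)%Z # Pos.of_nat (length pts).
```

The correctness task above would then relate estimate_pi, applied to draws from the uniform distribution on the box, to the real number pi, with an error bound that shrinks as the number of points grows.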