ckit
ckit is a C front end written in SML that translates C source code (after preprocessing) into abstract syntax represented as a set of SML datatypes. It also provides facilities for extending the C language with additional syntactic constructs, which can be useful for implementing "C-like" domain-specific languages as well as C dialects. Ckit is currently used as the front end for a variety of tools for static analysis of C code and at least one domain-specific language.
Documentation is still rudimentary at this point. There is an overview (file doc/overview.html) explaining how to get started, and the signatures of the major components are commented.
The ckit software is available at: ftp://ftp.research.bell-labs.com/dist/smlnj/packages/ckit-1.0.tar.gz
Status
We have just released version 1.0 (31 March 2000).
Ckit has been used extensively to parse large code bases. While the parser is very stable, the type checker has been less thoroughly tested. Known open bugs can be found in the file BUGS. A record of previously fixed bugs is at the end of the file HISTORY.
Report bugs to email@example.com.
Credits and History
David Ladd of Bell Labs in Naperville wrote a prototype C parser using ML-Lex and ML-Yacc in the early 1990s, which Daniel Jackson and Gene Rollins at CMU used in a program slicing tool (ChopShop, 1994).
Starting in late 1997, Satish Chandra (Bell Labs, Naperville) and Michael Siff (University of Wisconsin-Madison) extensively rewrote the parser and fixed numerous scoping problems, making it possible to parse the SPEC benchmarks and large industrial code bases. They also developed ML types for representing abstract syntax trees and C types and a translator from parse trees to abstract syntax. They used this infrastructure in various program analysis tools.
Starting in April 1998, Nevin Heintze and Dino Oliva (Bell Labs, Murray Hill), later joined by Dave MacQueen and with continuing assistance from Satish Chandra, turned this tool into a full-fledged extensible front end, adding a type checker, pretty-printer, and syntax extension mechanisms as well as rewriting the parser to support error recovery. This effort led to the current ckit, which has been used to process multi-million-line industrial code bases.