		INTERCAL IMPLEMENTOR'S NOTES
			by ESR

The C-INTERCAL compiler has a very conventional implementation using YACC and
LEX.  Each line of INTERCAL is translated into a C if()-then; the guard part
is used to implement abstentions and RESUMES, and the arm part translates the
`body' of the corresponding INTERCAL statement.
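
The shape of that translation can be sketched roughly as follows.  Only the
abstention half of the guard is shown; apart from abstain[], every name here
(NSTATEMENTS, executed[], set_abstain(), run_statements()) is made up for
illustration and is not lifted from the real generated code.

```c
#include <assert.h>

#define NSTATEMENTS 3

static int abstain[NSTATEMENTS];   /* nonzero: statement i is abstained from */
static int executed[NSTATEMENTS];  /* demonstration only: which bodies ran */

/* Stand-in for ABSTAIN FROM / REINSTATE aimed at a single statement. */
static void set_abstain(int i, int flag)
{
    assert(i >= 0 && i < NSTATEMENTS);
    abstain[i] = flag;
}

/* Stand-in for the generated code: one guarded arm per INTERCAL statement. */
static void run_statements(void)
{
    if (!abstain[0]) { executed[0] = 1; }   /* body of statement 0 */
    if (!abstain[1]) { executed[1] = 1; }   /* body of statement 1 */
    if (!abstain[2]) { executed[2] = 1; }   /* body of statement 2 */
}
```

Calling set_abstain(1, 1) before run_statements() leaves executed[1] at zero
while the other two arms run.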

The generated C code is plugged into the template file ick-wrap.c
inside main().  It needs to be linked with cesspool.o, fiddle.o and
lose.o (these are in libick.a, with the new routine, arrgghh.o).
Cesspool.o is the code that implements the storage manager; fiddle.o
implements the INTERCAL operators; and lose.o is the code that
generates INTERCAL's error messages.  The routine arrgghh.o parses the
new command line arguments.

The abstain[] array in the generated C is used to track line and label
abstentions; if member i is on, the statement on line i is being
abstained from.  If gerund abstentions/reinstatements are present in
the code, a second array recording the type of each statement is
generated into the runtime, and used to ensure that these operations
are translated into abstention-guard changes on all appropriate line numbers.
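
A minimal sketch of how such a type array might drive gerund abstention,
assuming one type entry per statement.  The names linetype[], set_gerund()
and the enum members are hypothetical; only abstain[] comes from the notes.

```c
#include <assert.h>

/* Hypothetical statement types; the real runtime may encode these differently. */
enum stmt_type { T_CALCULATING, T_NEXTING, T_READING_OUT };

#define NLINES 4

/* Second array: the type of each statement, fixed at compile time. */
static const enum stmt_type linetype[NLINES] = {
    T_CALCULATING, T_NEXTING, T_CALCULATING, T_READING_OUT
};

static int abstain[NLINES];

/* ABSTAIN FROM <gerund> (flag = 1) or REINSTATE <gerund> (flag = 0):
   flip the guard on every statement of the matching type. */
static void set_gerund(enum stmt_type t, int flag)
{
    int i;

    assert(flag == 0 || flag == 1);
    for (i = 0; i < NLINES; i++)
        if (linetype[i] == t)
            abstain[i] = flag;
}
```

With the table above, set_gerund(T_CALCULATING, 1) turns on the guards for
statements 0 and 2 and leaves 1 and 3 alone.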

Labels are mapped to line numbers in the code checker, just before
optimization.

RESUMES are implemented with a branch to a generated switch statement
that executes a goto to the appropriate label.
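
Here is one way the switch-and-goto arrangement could look for a
three-statement program.  This is a sketch under my own assumptions: the
stack (next_stack[], MAXNEXT), the label names and the order[] trace are all
invented for the example, and the real generated code need not resemble it
in detail.

```c
#include <assert.h>

#define MAXNEXT 8                /* hypothetical stack depth */
static int next_stack[MAXNEXT];  /* statements to come back to */
static int next_depth;

static int order[3];             /* demonstration only: execution order */
static int norder;

static void run(void)
{
    int target;

    /* statement 1: DO (10) NEXT */
    order[norder++] = 1;
    assert(next_depth < MAXNEXT);
    next_stack[next_depth++] = 2;   /* remember the statement after the NEXT */
    goto L10;

S2: /* statement 2: PLEASE GIVE UP */
    order[norder++] = 2;
    return;

L10: /* statement 3, labelled (10): DO RESUME #1 */
    order[norder++] = 3;
    assert(next_depth > 0);
    target = next_stack[--next_depth];
    goto resume;

resume:
    /* The generated switch: one case per possible return point. */
    switch (target) {
    case 2: goto S2;
    default: return;
    }
}
```

Running run() visits the statements in the order 1, 3, 2 and leaves the
stack empty.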

The parser builds an array of tuples, one for each INTERCAL statement.  Most
tuples have node trees attached.  Once all tuples have been generated,
the compile-time checker and optimizer phases can do consistency checks
and expression-tree rewrites.  Finally, the tuples are ground out as C code
by the emit() function.
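
To make the data flow concrete, here is a toy version of the tuple array and
an emitter that walks it.  The struct layouts, the statement types and
emit_all() are guesses made for illustration; they are not the compiler's
actual definitions, and the real emit() does a great deal more.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical expression-tree node. */
struct node {
    int opcode;
    unsigned long constant;
    struct node *left, *right;
};

enum stmt_type { ST_CALCULATE, ST_GIVE_UP };

/* Hypothetical tuple: one per INTERCAL statement. */
struct tuple {
    unsigned label;        /* 0 when the statement has no label */
    enum stmt_type type;
    struct node *expr;     /* NULL when no tree is attached */
};

static struct tuple program[] = {
    { 0,  ST_CALCULATE, NULL },
    { 10, ST_GIVE_UP,   NULL }
};
#define NTUPLES (sizeof program / sizeof program[0])

static char output[256];

/* Grind each tuple out as one guarded C statement appended to output[]. */
static void emit_all(void)
{
    size_t i, used = 0;

    output[0] = '\0';
    for (i = 0; i < NTUPLES; i++) {
        const char *body =
            program[i].type == ST_GIVE_UP ? "return 0;" : "/* body */";
        used += (size_t)snprintf(output + used, sizeof output - used,
                                 "if (!abstain[%u]) { %s }\n",
                                 (unsigned)i, body);
        assert(used < sizeof output);
    }
}
```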

The optimizer does constant folding for all five operators.  It also checks
for the idioms for `test for equality' and `test for nonzeroness'.
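
As an illustration of the folding step, the sketch below handles just the two
binary operators, mingle and select, on a made-up node type.  The unary
operators and the idiom checks are left out, and none of these names (struct
node, fold(), mingle(), select_bits()) are claimed to match the optimizer's
own.

```c
#include <assert.h>
#include <stddef.h>

enum op { OP_CONST, OP_VAR, OP_MINGLE, OP_SELECT };

struct node {
    enum op op;
    unsigned long value;          /* meaningful for OP_CONST only */
    struct node *left, *right;    /* NULL for leaves */
};

/* Mingle: interleave two 16-bit values; a supplies the higher bit of each pair. */
static unsigned long mingle(unsigned long a, unsigned long b)
{
    unsigned long r = 0;
    int i;

    for (i = 0; i < 16; i++) {
        r |= ((a >> i) & 1UL) << (2 * i + 1);
        r |= ((b >> i) & 1UL) << (2 * i);
    }
    return r;
}

/* Select: gather the bits of a where b has a 1, packed toward the low end. */
static unsigned long select_bits(unsigned long a, unsigned long b)
{
    unsigned long r = 0;
    int i, out = 0;

    for (i = 0; i < 32; i++) {
        if ((b >> i) & 1UL) {
            r |= ((a >> i) & 1UL) << out;
            out++;
        }
    }
    return r;
}

/* Fold constant subtrees in place; returns 1 if n is now a constant. */
static int fold(struct node *n)
{
    int l, r;

    assert(n != NULL);
    if (n->op == OP_CONST) return 1;
    if (n->op == OP_VAR) return 0;
    l = fold(n->left);            /* fold both sides even if one is not constant */
    r = fold(n->right);
    if (!(l && r)) return 0;
    n->value = (n->op == OP_MINGLE)
        ? mingle(n->left->value, n->right->value)
        : select_bits(n->left->value, n->right->value);
    n->op = OP_CONST;
    n->left = n->right = NULL;
    return 1;
}

/* Demonstration: fold the tree for #179 ~ #201. */
static unsigned long fold_example(void)
{
    static struct node a = { OP_CONST, 179UL, NULL, NULL };
    static struct node b = { OP_CONST, 201UL, NULL, NULL };
    struct node sel = { OP_SELECT, 0UL, &a, &b };

    fold(&sel);
    return sel.value;
}
```

fold_example() collapses the select node to the constant 9, and
mingle(65535, 0) yields 2863311530.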

Calculations are fully type-checked at compile time; they have to be because
(as I read the manual) the 16- and 32-bit versions of the unary ops do
different things.  The only potential problem here is that the typechecker
has to assume that :m ~ :n has the type of :n (32-bit) even though the
result might fit in 16 bits.  At run-time everything is calculated in 32
bits.  When INTERCAL-72 was designed 32 bits was expensive; now it's cheap.
Really, the only reason for retaining a 16-bit type at all is for the
irritation value of it (yes, C-INTERCAL *does* enforce the 16-bit limit
on constants).
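
The two rules in that paragraph are small enough to sketch directly.  The
names below (enum width, select_width(), constant_ok()) are hypothetical and
stand in for whatever the typechecker really uses.

```c
#include <assert.h>

enum width { W16 = 16, W32 = 32 };

/* Static type of a select: taken from the right operand, regardless of
   how many bits the result actually needs at run time. */
static enum width select_width(enum width left, enum width right)
{
    (void)left;                   /* the left operand does not affect the type */
    assert(right == W16 || right == W32);
    return right;
}

/* Constants must fit in 16 bits even though run-time arithmetic is 32-bit. */
static int constant_ok(unsigned long value)
{
    return value <= 65535UL;
}
```

So select_width(W32, W32) is W32 even when the selected bits would fit in 16,
and constant_ok(65536) is false.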

Note that the spectacular ugliness of INTERCAL syntax requires that
the lexical analyzer have two levels.  One, embedded in the input()
function, handles the backquote and bang constructs, and stashes the
input line away in a buffer for the splat construct's benefit.  The
upper level is generated by lex(1) and does normal tokenizing for YACC.

Note that numeral tokens for input are defined in a symbol table (numerals.c)
that is directly included in the run-time library module (cesspool.c).  This
avoids having to put the size of the numerals array in an extern.  To add
new numeral tokens, simply put them in the numerals initializer.
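
The reason the include trick works is that sizeof can see the whole
initializer when the table sits in the same translation unit.  A sketch,
with the table inlined so the example stands alone; the struct layout, the
entries shown and numeral_value() are assumptions for illustration, not the
contents of the real file.

```c
#include <assert.h>
#include <string.h>

struct numeral { const char *name; int value; };

/* In the arrangement described in these notes, this initializer would live
   in its own file and be pulled in with #include; it is inlined here. */
static const struct numeral numerals[] = {
    { "OH", 0 },    { "ZERO", 0 },  { "ONE", 1 },   { "TWO", 2 },
    { "THREE", 3 }, { "FOUR", 4 },  { "FIVE", 5 },  { "SIX", 6 },
    { "SEVEN", 7 }, { "EIGHT", 8 }, { "NINE", 9 }
};

/* The size comes from the initializer itself, so no extern count is needed. */
#define NNUMERALS (sizeof numerals / sizeof numerals[0])

/* Return the digit for a spelled-out numeral, or -1 if it is unknown. */
static int numeral_value(const char *word)
{
    size_t i;

    assert(word != NULL);
    for (i = 0; i < NNUMERALS; i++)
        if (strcmp(numerals[i].name, word) == 0)
            return numerals[i].value;
    return -1;
}
```

Adding a token is then a one-line change to the initializer; NNUMERALS
follows along automatically.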
