Identifier-Free C

| tags: code eso

Recently, there was some idle discussion on #esoteric on the topic of reserved identifiers in C and POSIX. The discussion progressed to a thought experiment: would it be possible to write useful programs in C if the standard (or POSIX) reserved all identifiers? In other words, can a useful program be written using only identifiers defined by the standard, and using them in a legal manner?

The following program is a rudimentary brainfuck (sans input) interpreter that is at least mostly valid C, yet uses no user-defined identifiers (after preprocessing).

Comments and Limitations

The primary tricks used by the program are the use of errno (guaranteed to be a modifiable lvalue) for temporary storage, a file freopened over stderr for more permanent storage, and the use of the file pointer position (of another file freopened over stdin) for storing accessible values slightly more safely than errno.

The brainfuck tape is composed of unsigned char bytes (with wraparound), and extends almost indefinitely to the right. Going past the left edge is probably not a good idea. The tape cannot grow larger than approximately INT_MAX bytes, though for a more strictly Turing-complete take, a trivial construction is to switch to brainfuck sans input or output, and use stdout as a separate file dedicated to the tape. (You might be hard-pressed to find a C implementation that allows for entirely unbounded files, however.)
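The wraparound falls out of fputc, which converts its int argument to unsigned char before writing. Here is a minimal sketch of the read-modify-write used for + and -, wrapped in a hypothetical helper called bump_cell (a name of my own; the real program cannot define functions) and run against any seekable stream rather than stderr.

```c
#include <stdio.h>

/* Hypothetical helper, not part of idbf.c: add delta to the byte stored
   at offset off in f and return the new cell value. */
int bump_cell(FILE *f, long off, int delta)
{
	int c;

	fseek(f, off, SEEK_SET);
	c = fgetc(f);              /* current value, 0..255 */
	fseek(f, -1, SEEK_CUR);    /* step back; a seek is also required between a read and a write */
	fputc(c + delta, f);       /* stored as (unsigned char)(c + delta), hence the wraparound */
	fseek(f, off, SEEK_SET);
	return fgetc(f);           /* read the cell back */
}
```

Incrementing a cell holding 255 should therefore leave 0, and decrementing 0 should leave 255, with no explicit modulo anywhere.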

The interpreter has been tested only with a "Hello, world" brainfuck program, so it may contain horrible bugs.

Exact standards-conformance of the program is not quite guaranteed. errno "may be set to nonzero by a library function call whether or not there is an error" (C11 7.5p3), so the use of expressions such as fwrite(&errno, 1, 1, stderr) is somewhat dubious; it would seem legal (if unlikely) for fwrite to overwrite the value of errno before reading the value we wanted to write.
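One possible way to sidestep the fwrite half of this concern, still without needing a variable, would be to hand fwrite a private copy of errno in a compound literal, much as the W macro does for its value. The sketch below is only an illustration: the function name copy_bytes is hypothetical and exists just so the loop can be called, and the read side still assumes fread does not touch errno after storing the byte.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical helper, not part of idbf.c: copy in to out one byte at a
   time, using errno as the only buffer. Returns the number of bytes copied. */
long copy_bytes(FILE *in, FILE *out)
{
	long n = 0;

	while (fread(&errno, 1, 1, in) > 0)
	{
		/* (int){errno} is evaluated before fwrite is entered, so fwrite
		   reads from a copy even if it modifies errno internally. The
		   first byte of the copy is the byte fread just stored. */
		fwrite(&(int){errno}, 1, 1, out);
		n++;
	}
	return n;
}
```

Whether this is actually more conforming than the original is debatable, but it at least removes the window in which fwrite could clobber its own input.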

Source Code

Here is the "human-readable" version of the source code. You can find a partially preprocessed (without expanding errno or substituting include files) version as idbf-pp.c.

Of the macros, R(x) and W(x,v) read and write the integer-sized variable at index x (a small integer); R(x) temporarily reads the value into errno so that it can be used in an expression. Variable x is stored at offset x * sizeof (int) in the idbf.mem file. The program is stored at offset PROGBASE, and the tape starts at TAPEBASE. With the values used here (1024 and 2048), a program longer than 1024 bytes would overlap the tape.
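To make the slot layout concrete, here is a rough equivalent of R and W written as ordinary functions. The names slot_read and slot_write are hypothetical, and the sketch takes the stream as a parameter and uses a local variable instead of errno, both of which the real program cannot do.

```c
#include <stdio.h>

/* Hypothetical helpers mirroring W(x,v) and R(x); not part of idbf.c.
   Slot x lives at byte offset x * sizeof (int) in the stream. */
void slot_write(FILE *mem, long x, int v)
{
	fseek(mem, x * (long)sizeof (int), SEEK_SET);
	fwrite(&v, sizeof (int), 1, mem);
}

int slot_read(FILE *mem, long x)
{
	int v = 0;

	/* The seek also satisfies the rule that a read may not directly
	   follow a write on an update stream. */
	fseek(mem, x * (long)sizeof (int), SEEK_SET);
	fread(&v, sizeof (int), 1, mem);
	return v;
}
```

Because every access starts with an fseek, reads and writes can be interleaved freely, which is what lets the interpreter treat the file as random-access memory.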

The V, VSET(v) and VADD(v) macros use the stdin file position as a variable: V reads it with ftell, VSET(v) sets it with an absolute fseek, and VADD(v) adjusts it with a relative fseek. The idbf.var file is never written to, so it is expected to always have a size of 0.
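The same idea, sketched with hypothetical function names (var_get, var_set and var_add are mine) on an arbitrary stream. It assumes, as the program itself does, an implementation that lets you seek past the end of an empty file, which POSIX systems allow.

```c
#include <stdio.h>

/* Hypothetical helpers mirroring V, VSET(v) and VADD(v); not part of
   idbf.c. The "variable" is the stream's file position, and the file's
   contents are never read or written. */
long var_get(FILE *var)
{
	return ftell(var);
}

int var_set(FILE *var, long v)
{
	return fseek(var, v, SEEK_SET);   /* absolute assignment */
}

int var_add(FILE *var, long d)
{
	return fseek(var, d, SEEK_CUR);   /* relative adjustment */
}
```

The value survives calls to other library functions on other streams, which is why it is a slightly safer scratch register than errno.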

idbf.c

#include <errno.h>
#include <stdio.h>

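#define SI (sizeof (int))

/* R(x): read int slot x of idbf.mem into errno and yield its value.
   W(x,v): write v to int slot x, via a compound literal. */
#define R(x) (fseek(stderr, (x)*SI, SEEK_SET), fread(&errno, SI, 1, stderr), errno)
#define W(x,v) (fseek(stderr, (x)*SI, SEEK_SET), fwrite(&(int){(v)}, SI, 1, stderr))

/* The file position of stdin (idbf.var) serves as a scratch variable. */
#define V (ftell(stdin))
#define VSET(v) (fseek(stdin, (v), SEEK_SET))
#define VADD(v) (fseek(stdin, (v), SEEK_CUR))

/* Byte offsets of the program text and the tape within idbf.mem. */
#define PROGBASE 1024
#define TAPEBASE 2048

/* Slot indices for R and W. */
#define PROGLEN 0	/* program length in bytes */
#define PROG 1		/* index of the current instruction */
#define TAPE 2		/* index of the current tape cell */
#define TAPELEN 3	/* number of tape cells initialized so far */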

int main(void)
{
	freopen("idbf.mem", "w+", stderr);

	fseek(stderr, PROGBASE, SEEK_SET);
	while (fread(&errno, 1, 1, stdin) > 0)
		fwrite(&errno, 1, 1, stderr);

	freopen("idbf.var", "w+", stdin);

	VSET(ftell(stderr) - PROGBASE);
	W(PROGLEN, V);
	W(PROG, 0);
	W(TAPE, 0);
	W(TAPELEN, 1);

	fseek(stderr, TAPEBASE, SEEK_SET);
	fputc(0, stderr);

	while (1)
	{
		VSET(R(PROG));
		if (V >= R(PROGLEN))
			break;

		fseek(stderr, PROGBASE + V, SEEK_SET);
		VSET(fgetc(stderr));

		switch (V)
		{
		case '>':
			VSET(R(TAPE)); VADD(1); W(TAPE, V);
			if (V == R(TAPELEN))
			{
				fseek(stderr, TAPEBASE + V, SEEK_SET);
				fputc(0, stderr);
				W(TAPELEN, V+1);
			}
			break;
		case '<':
			VSET(R(TAPE)); VADD(-1); W(TAPE, V);
			break;

		case '+':
			VSET(R(TAPE));
			fseek(stderr, TAPEBASE + V, SEEK_SET);
			VSET(fgetc(stderr));
			fseek(stderr, -1, SEEK_CUR);
			fputc(V + 1, stderr);
			break;
		case '-':
			VSET(R(TAPE));
			fseek(stderr, TAPEBASE + V, SEEK_SET);
			VSET(fgetc(stderr));
			fseek(stderr, -1, SEEK_CUR);
			fputc(V - 1, stderr);
			break;

		case '.':
			VSET(R(TAPE));
			fseek(stderr, TAPEBASE + V, SEEK_SET);
			fputc(fgetc(stderr), stdout);
			break;

		case '[':
			VSET(R(TAPE));
			fseek(stderr, TAPEBASE + V, SEEK_SET);
			if (fgetc(stderr))
				break;
			VSET(R(PROG));
			fseek(stderr, PROGBASE + V + 1, SEEK_SET);
			VSET(1);
			while (V > 0)
			{
				if (fgetc(stderr) == '[') VADD(1);
				fseek(stderr, -1, SEEK_CUR);
				if (fgetc(stderr) == ']') VADD(-1);
			}
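			/* The stream is now just past the matching ']'. Point PROG at
			   the ']' itself; the increment after the switch steps past it. */
			VSET(ftell(stderr) - 1 - PROGBASE);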
			W(PROG, V);
			break;

		case ']':
			VSET(R(TAPE));
			fseek(stderr, TAPEBASE + V, SEEK_SET);
			if (!fgetc(stderr))
				break;
			VSET(R(PROG));
			fseek(stderr, PROGBASE + V - 1, SEEK_SET);
			VSET(1);
			while (V > 0)
			{
				if (fgetc(stderr) == ']') VADD(1);
				fseek(stderr, -1, SEEK_CUR);
				if (fgetc(stderr) == '[') VADD(-1);
				fseek(stderr, -2, SEEK_CUR);
			}
			VSET(ftell(stderr) + 1 - PROGBASE);
			W(PROG, V);
			break;

		default:
			break;
		}

		VSET(R(PROG));
		VADD(1);
		W(PROG, V);
	}

	fclose(stdin);
	fclose(stderr);
	fflush(stdout);
	return 0;
}

Sample Session

Just as a bit of evidence that it works.

$ cat hello.bf
++++++++[>++++[>++>+++>+++>+<<<<-]>+>+>->>+[<]<-]>>.>---.+++++++..+++.>>.<-.<.+++.------.--------.>>+.>++.
$ ./idbf <hello.bf
Hello World!