Hash-based ELF Symbol Imports | 2010-08-13

Backdated content; see this post for details.

Introduction

You know how, when you link an ELF executable dynamically, all the names of the dynamically linked symbols end up being included in the executable? That's, like, several bytes — maybe even dozens — of bloat. (And there's no Windows DLL style import-by-ordinal functionality.) Turns out, however, that it's possible to refer to the symbols by their hash values instead. That's a fixed cost of four bytes per symbol, compared to 8 (symbol table entry) + strlen(name) + 1 for regular dynamic linking. (Of course there's also some amount of bytes wasted in the stub that does the hash resolving.)

The following might be more or less x86-32 Linux-specific.

Acknowledgements

I got the concept from Universaal Universum, by LHB. Based on the comments, they got it from Metatunnel, by FRequency.

How Much Does It Save?

A simple OpenGL/GLU/GLX test is 1160 bytes uncompressed, while a directly compiled and stripped version of the same code is 5684. Some sort of a more aggressive stripping tool might whittle it down a lot. However, even the .text + .data + .rodata in the directly compiled version takes 1080 bytes; compare with the 757-byte combined .text section in the hashlinked version.

How Does It Work?

There's a 110-byte stub injected before the actual program that takes the list of hash values and replaces them by the runtime addresses of those symbols. Details below.

C Side

In the C code, we have a structure of function pointers to the interesting dynamically linked functions:

libs.h

#include <X11/Xlib.h>
#include <GL/glx.h>

struct dynamic_funcs {
  Display *(*XOpenDisplay)(const char *name);
      XVisualInfo *(*glXChooseVisual)(Display *dpy, int screen, int *attrs);
      /* ... */
};

extern struct dynamic_funcs f;

After including that, you can call the functions like Display *dpy = f.XOpenDisplay(NULL);. Then we implement this header with a thing like:

libs.c

/* SYM: XOpenDisplay */
/* SYM: glXChooseVisual */
/* ... */

extern int main(void);
void entry(void) __attribute__ ((section (".entry"), aligned (1)));
void entry (void) { main(); }

Finally, we pair that with a linker script that locates the .entry section and the f symbol at a suitable fixed address, stuffs all initialized things in the .text section, leaves 40 empty bytes at the start of .bss for an unexpected symbol table; then link the code, and objcopy .text into a binary blob. Then a Perl script goes through the libs.c file, picks up all the /* SYM: ... */ comments, and produces another binary file that has the symbol hashes.

Assembler Side

The dynamic-loading stub contains a hand-crafted ELF header inspired by the well-known Teensy ELF Executables tutorial, except with slightly less overlapping. It's commented, but the workings are described in prose, below.

The main idea is to have the following parts in the ELF DYNAMIC section:

One DT_NEEDED entry per each library you have symbols from. Note that if some libraries depend on others, it's enough to list the one that pulls in the rest.
One DT_STRTAB entry pointing at a string table containing only the library file names for the DT_NEEDEDs.
One DT_SYMTAB entry with a value of 0; this seems to be required. (Quite a few other entries are also listed as mandatory for executables, but don't seem to be.)
One DT_DEBUG entry. The value part of this (non-standard) entry will be filled by ld.so with a pointer into data structures that describe the dynamic linking environment. It's used by things like GDB, but more importantly the data structures contain a chain of loaded libraries, and the symbol tables of each of them are accessible via it.

The format of the last one is slightly poorly documented, but for the most part you can look e.g. at the uClibc link.h. It is the address of a struct r_debug that will be placed into the DT_DEBUG entry. In particular, we want the r_map member.

To resolve a hash value into a symbol address, we walk that chain of struct link_map objects. For each linked library, we find the values of the DT_HASH, DT_STRTAB and DT_SYMTAB entries from the dynamic section pointed by the l_ld member. The hash table, sadly, doesn't contain the actual hash values of symbols, just the buckets in which the symbols fall; otherwise this would be trivial. Anyway, from DT_HASH we get the number of symbols, after which we walk that many entries of DT_SYMTAB, hash their names, and check for match. If one is found, we just have to add the load address of the library (from link_map) and the symbol address to get the runtime address.

Code

The code is probably not quite usable out of the box, but the most important bits should be here.

hlstub.asm: the linking stub that will assemble into the ELF file.
hlstub.ld: the linker script; addresses here will need adjusting.
hashsym.pl: Perl script to hash the SYM: comments.

To use, try something like this:

compilation example (approx.)

gcc -m32 -Os -Wall -fomit-frame-pointer -c -o main.o main.c
gcc -m32 -Os -Wall -fomit-frame-pointer -c -o libs.o libs.c # see libs.h/libs.c above
ld -m elf_i386 -T hlstub.ld -o prog-hlstub main.o libs.o
objcopy -j .text -O binary prog-hlstub prog-hlstub.text
./hashsym.pl libs.c prog-hlstub.hash
nasm -Ox -o prog -f bin -l prog.lst hlstub.asm > prog.map
chmod u+x prog