Backdated content; see this post for details.
Introduction
You know how, when you link an ELF executable dynamically, all the names of the dynamically linked symbols end up being included in the executable? That's, like, several bytes — maybe even dozens — of bloat. (And there's no Windows DLL style import-by-ordinal functionality.) Turns out, however, that it's possible to refer to the symbols by their hash values instead. That's a fixed cost of four bytes per symbol, compared to 8 (symbol table entry) + strlen(name) + 1 for regular dynamic linking. (Of course there's also some amount of bytes wasted in the stub that does the hash resolving.)
The following might be more or less x86-32 Linux
-specific.
Acknowledgements
I got the concept from Universaal Universum, by LHB. Based on the comments, they got it from Metatunnel, by FRequency.
How Much Does It Save?
A simple OpenGL/GLU/GLX test is 1160 bytes uncompressed, while a directly compiled and stripped version of the same code is 5684.
Some sort of a more aggressive stripping tool might whittle it down a lot.
However, even the .text
+ .data
+ .rodata
in the directly compiled version takes 1080 bytes; compare with the 757-byte combined .text
section in the hashlinked version.
How Does It Work?
There's a 110-byte stub injected before the actual program that takes the list of hash values and replaces them by the runtime addresses of those symbols. Details below.
C Side
In the C code, we have a structure of function pointers to the interesting dynamically linked functions:
libs.h
#include <X11/Xlib.h> #include <GL/glx.h> struct dynamic_funcs { Display *(*XOpenDisplay)(const char *name); XVisualInfo *(*glXChooseVisual)(Display *dpy, int screen, int *attrs); /* ... */ }; extern struct dynamic_funcs f;
After including that, you can call the functions like Display *dpy = f.XOpenDisplay(NULL);
.
Then we implement
this header with a thing like:
libs.c
/* SYM: XOpenDisplay */ /* SYM: glXChooseVisual */ /* ... */ extern int main(void); void entry(void) __attribute__ ((section (".entry"), aligned (1))); void entry (void) { main(); }
Finally, we pair that with a linker script that locates the .entry
section and the f
symbol at a suitable fixed address, stuffs all initialized things in the .text
section, leaves 40 empty bytes at the start of .bss
for an unexpected symbol table; then link the code, and objcopy .text
into a binary blob.
Then a Perl script goes through the libs.c
file, picks up all the /* SYM: ... */
comments, and produces another binary file that has the symbol hashes.
Assembler Side
The dynamic-loading stub contains a hand-crafted ELF header inspired by the well-known Teensy ELF Executables tutorial, except with slightly less overlapping. It's commented, but the workings are described in prose, below.
The main idea is to have the following parts in the ELF DYNAMIC
section:
-
One
DT_NEEDED
entry per each library you have symbols from. Note that if some libraries depend on others, it's enough to list the one that pulls in the rest. -
One
DT_STRTAB
entry pointing at a string table containing only the library file names for theDT_NEEDED
s. -
One
DT_SYMTAB
entry with a value of 0; this seems to be required. (Quite a few other entries are also listed asmandatory
for executables, but don't seem to be.) -
One
DT_DEBUG
entry. The value part of this (non-standard) entry will be filled by ld.so with a pointer into data structures that describe the dynamic linking environment. It's used by things like GDB, but more importantly the data structures contain a chain of loaded libraries, and the symbol tables of each of them are accessible via it.
The format of the last one is slightly poorly documented, but for the most part you can look e.g. at the uClibc link.h.
It is the address of a struct r_debug
that will be placed into the DT_DEBUG
entry.
In particular, we want the r_map
member.
To resolve a hash value into a symbol address, we walk that chain of struct link_map
objects.
For each linked library, we find the values of the DT_HASH
, DT_STRTAB
and DT_SYMTAB
entries from the dynamic section pointed by the l_ld
member.
The hash table, sadly, doesn't contain the actual hash values of symbols, just the buckets in which the symbols fall; otherwise this would be trivial.
Anyway, from DT_HASH
we get the number of symbols, after which we walk that many entries of DT_SYMTAB
, hash their names, and check for match.
If one is found, we just have to add the load address of the library (from link_map
) and the symbol address to get the runtime address.
Code
The code is probably not quite usable out of the box, but the most important bits should be here.
- hlstub.asm: the linking stub that will assemble into the ELF file.
- hlstub.ld: the linker script; addresses here will need adjusting.
- hashsym.pl: Perl script to hash the
SYM:
comments.
To use, try something like this:
compilation example (approx.)
gcc -m32 -Os -Wall -fomit-frame-pointer -c -o main.o main.c gcc -m32 -Os -Wall -fomit-frame-pointer -c -o libs.o libs.c # see libs.h/libs.c above ld -m elf_i386 -T hlstub.ld -o prog-hlstub main.o libs.o objcopy -j .text -O binary prog-hlstub prog-hlstub.text ./hashsym.pl libs.c prog-hlstub.hash nasm -Ox -o prog -f bin -l prog.lst hlstub.asm > prog.map chmod u+x prog