Showing content from http://www.linuxjournal.com/content/embedding-file-executable-aka-hello-world-version-5967 below:

Embedding a File in an Executable, aka Hello World, Version 5967

on June 12, 2008

I recently had the need to embed a file in an executable. Since I'm working at the command line with gcc et al., and not with a fancy RAD tool that makes it all happen magically, it wasn't immediately obvious to me how to do this. A bit of searching on the net turned up a hack: essentially cat the file onto the end of the executable, then work out where it is from a bunch of information I didn't want to know about. It seemed like there ought to be a better way...

And there is: objcopy to the rescue. objcopy converts object files or executables from one format to another. One of the formats it understands is "binary", which is basically any file that's not in one of the other formats it understands. So you've probably envisioned the idea: convert the file that we want to embed into an object file, and then it can simply be linked in with the rest of our code.

Let's say we have a file named data.txt that we want to embed in our executable:

  # cat data.txt
  Hello world

To convert this into an object file that we can link with our program, we just use objcopy to produce a ".o" file:

  # objcopy --input binary \
            --output elf32-i386 \
            --binary-architecture i386 data.txt data.o

This tells objcopy that our input file is in the "binary" format and that our output file should be in the "elf32-i386" format (object files on the x86). The --binary-architecture option tells objcopy that the output file is meant to "run" on an x86. This is needed so that ld will accept the file for linking with other files for the x86. One would think that specifying the output format as "elf32-i386" would imply this, but it does not.

Now that we have an object file we only need to include it when we run the linker:

  # gcc main.c data.o

When we run the result we get the prayed-for output:

  # ./a.out
  Hello world

Of course, I haven't told the whole story yet, nor shown you main.c. When objcopy does the above conversion, it adds some "linker" symbols to the converted object file:

   _binary_data_txt_start
   _binary_data_txt_end

After linking, these symbols specify the start and end of the embedded file. The symbol names are formed by prepending _binary_ and appending _start or _end to the file name. If the file name contains any characters that would be invalid in a symbol name, they are converted to underscores (e.g., data.txt becomes data_txt). If you get unresolved names when linking using these symbols, do a hexdump -C on the object file and look at the end of the dump for the names that objcopy chose.
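The naming rule described above can be sketched as a small C function, which is handy for predicting the symbol names for a given file. This is only an illustration: embedded_symbol_name is a hypothetical helper, not part of binutils, and it assumes that objcopy replaces every character that is not a letter or digit with an underscore and builds the name from the file name exactly as it was given on the command line, directory part included. It also accepts "size" as a suffix, since objcopy is understood to emit a third symbol ending in _size alongside the two shown above.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Build the symbol name objcopy is expected to generate for `path`.
 * `suffix` is "start", "end" or "size". Returns 0 on success, or -1
 * if `out` is too small to hold the result.
 * Hypothetical helper for illustration; it is not part of binutils.
 */
int embedded_symbol_name(const char *path, const char *suffix,
                         char *out, size_t out_len)
{
    assert(path != NULL && suffix != NULL && out != NULL);

    int n = snprintf(out, out_len, "_binary_%s_%s", path, suffix);
    if (n < 0 || (size_t)n >= out_len)
        return -1;

    /* Assumed rule: anything that is not a letter or digit becomes '_'. */
    for (char *p = out; *p != '\0'; p++) {
        if (!isalnum((unsigned char)*p))
            *p = '_';
    }
    return 0;
}
```

Called with "data.txt" and "start", the function fills the buffer with _binary_data_txt_start. If the linker still complains, the names found in the object file itself remain the final authority.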

The code to actually use the embedded file should now be reasonably obvious:

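#include <stdio.h>

/* Linker symbols added by objcopy; only their addresses are meaningful. */
extern char _binary_data_txt_start;
extern char _binary_data_txt_end;

int main(void)
{
    char *p = &_binary_data_txt_start;

    /* Print every byte between the start and end markers. */
    while ( p != &_binary_data_txt_end ) putchar(*p++);

    return 0;
}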
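The loop above prints one byte at a time. Since the two symbols simply mark the ends of a byte range, the length is the difference between their addresses, and the whole range can be written in one call. The following is a minimal sketch of that idea; embedded_size and embedded_write are hypothetical helper names, and the call site in the trailing comment assumes data.o from the objcopy step is linked in. Note that the embedded bytes are not NUL-terminated, so string functions such as puts or strlen should not be used on them directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Number of bytes between the two marker addresses. */
size_t embedded_size(const char *start, const char *end)
{
    assert(start != NULL && end != NULL && start <= end);
    return (size_t)(end - start);
}

/* Write the whole range to `out` in one call; returns bytes written. */
size_t embedded_write(const char *start, const char *end, FILE *out)
{
    return fwrite(start, 1, embedded_size(start, end), out);
}

/*
 * Possible call site, assuming data.o is linked into the program:
 *
 *     extern const char _binary_data_txt_start[];
 *     extern const char _binary_data_txt_end[];
 *
 *     embedded_write(_binary_data_txt_start, _binary_data_txt_end, stdout);
 */
```

Declaring the symbols as arrays of unknown size, as in the comment, is an alternative to taking the address of a char: the array name already evaluates to the address, which makes it harder to read the symbol's "value" by accident.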

One important and subtle thing to note is that the symbols added to the object file aren't "variables". They don't contain any data; rather, their address is their value. I declare them as type char because it's convenient for this example: the embedded data is character data. However, you could declare them as anything: as int if the data is an array of integers, or as struct foo_bar_t if the data is an array of foo bars. If the embedded data is not uniform, then char is probably the most convenient: take its address and cast the pointer to the proper type as you traverse the data.
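As an illustration of walking non-uniform data, here is a sketch that reads a made-up layout: a 32-bit count followed by that many 16-bit values, all in the host's byte order. The layout and the function name sum_embedded_values are assumptions for this example only. The sketch copies each field out with memcpy instead of casting the pointer, a swapped-in technique that avoids alignment problems when a field does not start on a suitable boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical layout: a uint32_t count, then `count` uint16_t values,
 * all in host byte order. Stores the sum of the values in *sum_out.
 * Returns 0 on success, or -1 if the data is shorter than it claims.
 */
int sum_embedded_values(const char *start, const char *end, uint32_t *sum_out)
{
    assert(start != NULL && end != NULL && start <= end && sum_out != NULL);

    const char *p = start;
    uint32_t count;

    /* Read the header field, checking that it fits in the range. */
    if ((size_t)(end - p) < sizeof count)
        return -1;
    memcpy(&count, p, sizeof count);
    p += sizeof count;

    /* Make sure all `count` values are actually present. */
    if ((size_t)(end - p) / sizeof(uint16_t) < count)
        return -1;

    uint32_t sum = 0;
    for (uint32_t i = 0; i < count; i++) {
        uint16_t value;
        memcpy(&value, p, sizeof value);  /* safe for unaligned data */
        p += sizeof value;
        sum += value;
    }

    *sum_out = sum;
    return 0;
}
```

With an embedded file, the two pointer arguments would be the addresses of the start and end symbols. The bounds checks matter here because nothing in the executable guarantees that the file matches the layout the code expects.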

