
Simple C amalgamation



In this context, "amalgamation" means combining a number of C header- and source-files into one single big file.

The idea is to bundle a set of files into something like a "source-code library", which can be included (using normal #include-directives) into your own code.

No magic involved. Everything below stays lightweight and simple.

BTW, this idea is not mine - for example, the SQLite project and the Whefs embeddable virtual filesystem project use the same technique, and the implementation described on this page is inspired by both.

Challenge

I recently had the desire/need to reuse a lightweight "utility-library", consisting of a bunch of source-files, in a few other projects.

One way to do this would be to keep the original source in its own location, and make it available (copy it) to other projects in the form of a compiled lib. This has the disadvantage that changes between lib-versions are no longer obvious - a binary doesn't diff(1) very well.

Another way would be to simply copy the source-files from their original location into any project that wants to use them. The disadvantage is that the concepts of "version" and "change" may become fuzzy - which files belong to the lib, and which are specific to the project using the lib? Furthermore, which incarnation of the lib's source was the original again..?

Chosen solution

As in the SQLite and Whefs projects, code reuse is implemented by semi-intelligently concatenating a bunch of source-files into one big file, to be copied around and recompiled at will, as part of any project that wishes to use the lib.

Changes remain clearly visible (since it's still all plaintext after all).

Also, it's still obvious which files belong to the lib and which to the project using the lib (since the lib now spans just 1 single file).
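
One practical wrinkle when diffing: the generated file carries its creation timestamp (on its first line and in the ..._AMALGAMATION_VERSION macro, both shown further below), so two amalgamations of identical sources still differ in those two lines. A minimal sketch of a comparison that ignores them could look like this. Both function names are hypothetical helpers, not part of the script described on this page, and the sketch assumes the file layout produced by that script:

```shell
# Hypothetical helper: print an amalgamation without its two
# timestamp-carrying lines (the first comment line and the
# "#define ..._AMALGAMATION_VERSION" line).
strip_amalgamation_stamp() {
    sed -e '1d' -e '/^#define  *[A-Z0-9_]*_AMALGAMATION_VERSION /d' "$1"
}

# Hypothetical helper: exit status 0 when two amalgamations have the same
# content apart from their timestamps, non-zero otherwise.
same_amalgamation_content() {
    old_tmp=$(mktemp) || return 2
    new_tmp=$(mktemp) || { rm -f "$old_tmp"; return 2; }
    strip_amalgamation_stamp "$1" > "$old_tmp"
    strip_amalgamation_stamp "$2" > "$new_tmp"
    status=0
    cmp -s "$old_tmp" "$new_tmp" || status=$?
    rm -f "$old_tmp" "$new_tmp"
    return "$status"
}
```

Something like "same_amalgamation_content old/amal.c new/amal.c || diff -u old/amal.c new/amal.c" would then only show a diff when the library itself has changed.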

Example of use

Below, a simple library is given as a bunch of source-files, which will be used as part of a new, simple project.

To do this, an amalgamation of the library is created, resulting in a single file. This file is then included into, and built as part of the new project.

Definition of example-lib to be included in a new project

Let's assume we have a superb lib called "megalib", consisting of files a.[hc], b.[hc] and c.[hc], like this:

a.h includes some system-headers of its own:

    #ifndef A_H_INCLUDED
    #define A_H_INCLUDED
        
    #include <stdio.h>
    #include <stdlib.h>
        
    typedef char A;
        
    A a( void );
        
    #endif // ndef A_H_INCLUDED

a.c includes only its own header a.h:

    #include "a.h"
        
    A a( void ) { return 'a'; }

b.h includes some system-headers:

    #ifndef B_H_INCLUDED
    #define B_H_INCLUDED
        
    #include <stdio.h>
    #include <stdlib.h>
        
    typedef char B;
        
    B b( void );
        
    #endif // ndef B_H_INCLUDED

b.c includes only its own header b.h:

    #include "b.h"
        
    B b( void ) { return 'b'; }

c.h includes some system-headers as well as both local headers a.h and b.h:

    #ifndef C_H_INCLUDED
    #define C_H_INCLUDED
        
    #include <stdio.h>
    #include <stdlib.h>
        
    #include "a.h"
    #include "b.h"
        
    void c( A *pa, B *pb );
        
    #endif // ndef C_H_INCLUDED

c.c includes only its own header c.h:

    #include "c.h"
        
    void c( A *pa, B *pb )
    {
        *pa = a();
        *pb = b();
    }

Creating an amalgamation of our library

Let's assume the amalgamation-script is called mkamal.sh, with the following usage:

    ./mkamal.sh  <macro_prefix>  <list_of_headers>  <list_of_sources>

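...where

<macro_prefix> is a prefix for the macro-names and flags used in the generated file (e.g. specifying "megalib" results in macros and flags being prefixed with "MEGALIB_AMALGAMATION_"). This would typically be the project-name, and it must only contain alphanumeric chars and/or underscores.

<list_of_headers> is a whitespace-delimited, ordered list of header-files, passed as one argument.

<list_of_sources> is a whitespace-delimited list of source-files, also passed as one argument.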

Running the command

    ./mkamal.sh  "megalib"  "a.h b.h c.h"  "a.c b.c c.c"  >  amal.c

creates an amalgamation in file amal.c - contents of which are given later.
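
The order of the header list matters: the script (given in full further below) drops local #include "..." lines, so c.h only finds the types A and B if a.h and b.h were listed before it. As a rough sketch, a pre-flight check for this could look as follows. The function check_header_order is a hypothetical helper, not part of mkamal.sh, and it assumes that the names inside the #include directives match the names in the list exactly:

```shell
# Hypothetical pre-flight check, not part of mkamal.sh.
# Usage: check_header_order "a.h b.h c.h"
# Prints a message for every locally included header that is not listed
# earlier, and returns non-zero if any such header was found.
check_header_order() {
    seen=" "
    bad=0
    for hdr in $1; do
        # Pull the quoted names out of lines like:  #include "a.h"
        for dep in $(sed -n 's/^ *# *include "\([^"]*\)".*/\1/p' "$hdr"); do
            case "$seen" in
                *" $dep "*) ;;    # already listed earlier: fine
                *) echo "$hdr: needs \"$dep\" earlier in the header list" >&2
                   bad=1 ;;
            esac
        done
        seen="$seen$hdr "
    done
    return "$bad"
}
```

With the example files, check_header_order "a.h b.h c.h" succeeds silently, while check_header_order "c.h a.h b.h" complains about both a.h and b.h.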

Using the amalgamation as either source- or header-file

The original library consisted of a bunch of headers and a bunch of source-files. After combining them, we end up with 1 single file.

How do we extract the headers or the source?

Use as source: simply compile it

    $ ls amal.*
    amal.c

    $ cc -c amal.c

    $ ls amal.*
    amal.c amal.o

(Object-file amal.o will be created as expected, corresponding to the source-code from a.c, b.c and c.c.)

Use as header: #define a special flag before #include'ing

Recall the first argument <macro_prefix> to the amalgamation-script being given as "megalib".

When defining MEGALIB_AMALGAMATION_GIVE_ME_HEADERS and then simply #include'ing the amalgamation into your project-source, its source-section will be skipped, and only the concatenated headers will effectively be included:

    #define MEGALIB_AMALGAMATION_GIVE_ME_HEADERS
    #include "amal.c"

(The above define will automatically be cleared when #include'ing the amalgamation.)
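
In case the naming isn't obvious: the macro-names are simply the uppercased prefix, followed by "_AMALGAMATION_" and a fixed suffix. The sketch below mirrors that derivation. The function amalgamation_macro is hypothetical, and its validation step is an addition: the script documents the "alphanumeric chars and/or underscores" rule but does not itself enforce it.

```shell
# Hypothetical helper mirroring how the macro-names are derived.
# $1: macro prefix (e.g. "megalib"), $2: suffix (e.g. "GIVE_ME_HEADERS")
amalgamation_macro() {
    # Added check (not in mkamal.sh): reject empty prefixes and prefixes
    # containing anything other than letters, digits or underscores.
    case "$1" in
        ''|*[!A-Za-z0-9_]*)
            echo "invalid macro prefix: '$1'" >&2
            return 1 ;;
    esac
    printf '%s_AMALGAMATION_%s\n' \
        "$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')" "$2"
}

amalgamation_macro megalib GIVE_ME_HEADERS   # MEGALIB_AMALGAMATION_GIVE_ME_HEADERS
amalgamation_macro megalib VERSION           # MEGALIB_AMALGAMATION_VERSION
```

Note that a prefix starting with a digit would still pass this check, even though the resulting name would not be a valid C identifier.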

An imaginary project-file main.c using headers from the amalgamation is given below:

    #include <stdio.h>
        
        
    #define MEGALIB_AMALGAMATION_GIVE_ME_HEADERS
    #include "amal.c"
        
        
    int main( void )
    {
        A a;
        B b;
        
        c( &a, &b );
        
        puts( "this is " MEGALIB_AMALGAMATION_VERSION "!" );
        
        printf( "%c%c\n", a, b );
        
        return 0;
    }

(Function c() was implemented in file c.c, and exposed through header c.h in the original bunch-of-files comprising the lib - if interested, see contents of file c.c given earlier.)

Building your amalgamation-using project

To build your project:

    cc  main.c  amal.c

That's all. Running the resulting binary:

    this is 2017-10-26T08:56:02+02:00!
    ab

(A stringified timestamp from the time of creating the amalgamation can be retrieved through constant MEGALIB_AMALGAMATION_VERSION, and was printed from the main project-file main.c.)
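
The same timestamp is also written on the very first line of the generated file, so a build-script can read it back without compiling anything. The sketch below wraps the sed command that the generated file's own comment-header suggests. The function name amalgamation_version is made up for this example:

```shell
# Print the version stamp found on the first line of an amalgamation,
# using the sed command suggested in the generated file's comment-header:
# strip the leading "//" and spaces, strip trailing spaces, print line 1.
amalgamation_version() {
    sed -n -e 's@/* *@@' -e 's/ *$//' -e 1p "$1"
}
```

For the example amalgamation shown on this page, "amalgamation_version amal.c" would print the string 2017-10-26T08:56:02+02:00.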

Implementation

Amalgamation-script: mkamal.sh

(Note that about 90% is fluff. :-)

    #!/usr/bin/env bash
        
        
        
    #   WHAT IS THIS?
    #
    #       This is a simplistic script that concatenates/processes some C-headers and -sourcefiles
    #       to form a single "amalgamation" file, to be easily integrated into other projects.
    #
    #       Idea was taken from "SQLite" project (in-process database-lib / Hwaci); implementation
    #       is a simplified version of the "createAmalgamation.sh" script from the "whefs" project 
    #       (embeddable virtual filesystem / S. Beal).
    #
    #   HOW TO USE THIS SCRIPT?
    #
    #       To create an amalgamation, use:
    #
    #           myscript  <macro_prefix>  <list_of_headers>  <list_of_sources>
    #
    #       ...where 
    #
    #           <macro_prefix>      a prefix for macro-names and flags used in the generated file.
    #                               (e.g. specifying "megalib" results in macros and flags being prefixed
    #                               with "MEGALIB_AMALGAMATION_"). This would typically be the project-name.
    #
    #                               Must only contain alphanumeric chars and/or underscores.
    #
    #           <list_of_headers>   whitespace-delimited ordered list of header-files (all as one argument)
    #
    #           <list_of_sources>   whitespace-delimited list of source-files (all as one argument)
    #
    #
    #       A single file containing the filtered contents of all headers and sources will be printed
    #       on standard output. 
    #
    #   AND HOW TO USE THE RESULTING FILE..?
    #
    #       (see comment-header in generated file, or hardcoded help-text below.)
    #   
        
                
    function die
    {
        echo  "FATAL: $1"  >&2
        exit 1
    }
        
        
        
    [ $# -eq 3 ]  ||  die "use: \"$( basename "$0" ) <macro_prefix> <headers> <sources>\""
        
    MACRO_PFX="$1"
        
    HEADERS="$2"
        
    SOURCES="$3"
        
        
        
    VAR_PREFIX=$( echo "$MACRO_PFX" | tr 'a-z' 'A-Z' )_AMALGAMATION_
        
    VERSION_VARNAME=${VAR_PREFIX}VERSION
        
    HDRREQ_VARNAME=${VAR_PREFIX}GIVE_ME_HEADERS
        
    INCGUARD_VARNAME=${VAR_PREFIX}INCLUDED
        
    VERSION=$(date -Iseconds)
    [ -n "$VERSION" ]  ||  die "cannot get timestamp"
        
        
        
    function print_converted_contents_of
    {
        local fname=$1
        
        [ -f "$fname" ]  ||  die "file '$fname' doesn't exist"
        
        echo
        echo
        echo
        echo  /////////////////////////////////////////////////////
        echo  //
        echo "//   $fname:"
        echo  //
        echo  /////////////////////////////////////////////////////
        echo
        
        sed  -e  '/^ *# *include ".*/d'  "$fname"
    }
        
        
        
    cat << ---EOF---
    //  $VERSION
    //
    //    ^^^   The above string is the amalgamation-version, available as compile-time macro 
    //          "$VERSION_VARNAME", which is a quoted string.
    //
    //          You can use  "sed -n -e 's@/* *@@' -e 's/ *$//' -e 1p"  to extract it from this file, 
    //          e.g. in a build-script.
    // 
    // 
    //  !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    //  !!   THIS FILE WAS GENERATED - YOU PROBABLY DON'T WANT TO EDIT IT!   !!
    //  !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    // 
    // 
    //  WHAT IS THIS?
    //  -------------
    // 
    //      This file is a so-called "amalgamation" (combination/unification/concatenation) of 
    //      a number of C headers/sources, and this single file can be integrated into your own 
    //      project.
    //
    //      The advantages over using the original pile-of-files or distributed library, are:
    //
    //        - it's instantly obvious that an amalgamation is a snapshot, not the original
    //        - unambiguous version-control (there is only 1 resulting and diff'able file)
    //        - less margin for errors/omission/mixup (there is only 1 resulting file)
    //     
    //      This single file can be dropped into your project and compiled as any other file.
    //
    //
    //  HOW TO USE IT?
    //  --------------
    //
    //      To extract/use only headers (assuming the file you're reading now is called "my_amalgamation.c"):
    //
    //          in "your_file.c":
    //
    //                ...
    //                #define $HDRREQ_VARNAME
    //                #include "my_amalgamation.c"
    //                ...
    //                // use functions from file "my_amalgamation.c" here
    //                ...
    //
    //          (the flag "$HDRREQ_VARNAME" will be cleared automatically within "my_amalgamation.c")
    //
    //      To compile:
    //
    //          (nothing special - just compile "my_amalgamation.c" as any other)
    //
    //      Note that the suggested extension for this file is ".c". This is done to make the compiler
    //      (or at least GCC) happy. However, it can also be included as if it were a header - see above.
    //
    //  WHICH FILES WERE INCLUDED?
    //  --------------------------
    // 
    //      following C-headers and -sources:
    // 
    //          headers, in this order:
    // 
    $( for a in $HEADERS; do echo "//            $a"; done )
    //
    //          sources, in this order:
    //
    $( for a in $SOURCES; do echo "//            $a"; done )
    //
        
    #ifndef  $INCGUARD_VARNAME
    #define  $INCGUARD_VARNAME
        
    #define  $VERSION_VARNAME   "$VERSION"
        
    /////////////////////////////////////////////////////////////////////////////////////////////////
    //
    //                                        HEADERS FOLLOW:
    //
    /////////////////////////////////////////////////////////////////////////////////////////////////
        
    ---EOF---
        
        
        
    for f in $HEADERS; do
        print_converted_contents_of  $f
    done
        
        
        
    cat << ---EOF---
        
    /////////////////////////////////////////////////////////////////////////////////////////////////
    //
    //                                        SOURCES FOLLOW:
    //
    /////////////////////////////////////////////////////////////////////////////////////////////////
        
    #ifndef  $HDRREQ_VARNAME
        
    ---EOF---
        
        
        
    for f in $SOURCES; do
        print_converted_contents_of  $f
    done
        
        
        
    cat << ---EOF---
        
    #endif  // ndef  $HDRREQ_VARNAME
        
    #undef  $HDRREQ_VARNAME
        
    #endif  // ndef  $INCGUARD_VARNAME
        
    ---EOF---
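
The only real "processing" in the script is the sed filter inside print_converted_contents_of, which deletes local #include "..." directives and leaves system includes alone. To see it in isolation, here is the same filter reading from standard input. The wrapper name strip_local_includes is made up for this example:

```shell
# The filter used by print_converted_contents_of, reading standard input:
# delete lines holding an #include "..." directive, keep everything else.
strip_local_includes() {
    sed -e '/^ *# *include ".*/d'
}

printf '%s\n' \
    '#include <stdio.h>' \
    '  #  include "a.h"' \
    '#include "b.h"' \
    'void c( A *pa, B *pb );' \
| strip_local_includes
```

This prints only the <stdio.h> line and the declaration of c(). Since the pattern allows spaces but not tabs, a directive indented with a tab would slip through unfiltered.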

For completeness' sake: generated amalgamation amal.c

    //  2017-10-26T08:56:02+02:00
    //
    //    ^^^   The above string is the amalgamation-version, available as compile-time macro 
    //          "MEGALIB_AMALGAMATION_VERSION", which is a quoted string.
    //
    //          You can use  "sed -n -e 's@/* *@@' -e 's/ *$//' -e 1p"  to extract it from this file, 
    //          e.g. in a build-script.
    // 
    // 
    //  !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    //  !!   THIS FILE WAS GENERATED - YOU PROBABLY DON'T WANT TO EDIT IT!   !!
    //  !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    // 
    // 
    //  WHAT IS THIS?
    //  -------------
    // 
    //      This file is a so-called "amalgamation" (combination/unification/concatenation) of 
    //      a number of C headers/sources, and this single file can be integrated into your own 
    //      project.
    //
    //      The advantages over using the original pile-of-files or distributed library, are:
    //
    //        - it's instantly obvious that an amalgamation is a snapshot, not the original
    //        - unambiguous version-control (there is only 1 resulting and diff'able file)
    //        - less margin for errors/omission/mixup (there is only 1 resulting file)
    //     
    //      This single file can be dropped into your project and compiled as any other file.
    //
    //
    //  HOW TO USE IT?
    //  --------------
    //
    //      To extract/use only headers (assuming the file you're reading now is called "my_amalgamation.c"):
    //
    //          in "your_file.c":
    //
    //                ...
    //                #define MEGALIB_AMALGAMATION_GIVE_ME_HEADERS
    //                #include "my_amalgamation.c"
    //                ...
    //                // use functions from file "my_amalgamation.c" here
    //                ...
    //
    //          (the flag "MEGALIB_AMALGAMATION_GIVE_ME_HEADERS" will be cleared automatically within "my_amalgamation.c")
    //
    //      To compile:
    //
    //          (nothing special - just compile "my_amalgamation.c" as any other)
    //
    //      Note that the suggested extension for this file is ".c". This is done to make the compiler
    //      (or at least GCC) happy. However, it can also be included as if it were a header - see above.
    //
    //  WHICH FILES WERE INCLUDED?
    //  --------------------------
    // 
    //      following C-headers and -sources:
    // 
    //          headers, in this order:
    // 
    //            a.h
    //            b.h
    //            c.h
    //
    //          sources, in this order:
    //
    //            a.c
    //            b.c
    //            c.c
    //
        
    #ifndef  MEGALIB_AMALGAMATION_INCLUDED
    #define  MEGALIB_AMALGAMATION_INCLUDED
        
    #define  MEGALIB_AMALGAMATION_VERSION   "2017-10-26T08:56:02+02:00"
        
    /////////////////////////////////////////////////////////////////////////////////////////////////
    //
    //                                        HEADERS FOLLOW:
    //
    /////////////////////////////////////////////////////////////////////////////////////////////////
        
        
        
        
    /////////////////////////////////////////////////////
    //
    //   a.h:
    //
    /////////////////////////////////////////////////////
        
        
    #ifndef A_H_INCLUDED
    #define A_H_INCLUDED
        
    #include <stdio.h>
    #include <stdlib.h>
        
    typedef char A;
        
    A a( void );
        
    #endif // ndef A_H_INCLUDED
        
        
        
        
    /////////////////////////////////////////////////////
    //
    //   b.h:
    //
    /////////////////////////////////////////////////////
        
        
    #ifndef B_H_INCLUDED
    #define B_H_INCLUDED
        
    #include <stdio.h>
    #include <stdlib.h>
        
    typedef char B;
        
    B b( void );
        
    #endif // ndef B_H_INCLUDED
        
        
        
        
    /////////////////////////////////////////////////////
    //
    //   c.h:
    //
    /////////////////////////////////////////////////////
        
        
    #ifndef C_H_INCLUDED
    #define C_H_INCLUDED
        
    #include <stdio.h>
    #include <stdlib.h>
        
        
    void c( A *pa, B *pb );
        
    #endif // ndef C_H_INCLUDED
        
        
    /////////////////////////////////////////////////////////////////////////////////////////////////
    //
    //                                        SOURCES FOLLOW:
    //
    /////////////////////////////////////////////////////////////////////////////////////////////////
        
    #ifndef  MEGALIB_AMALGAMATION_GIVE_ME_HEADERS
        
        
        
        
    /////////////////////////////////////////////////////
    //
    //   a.c:
    //
    /////////////////////////////////////////////////////
        
        
        
    A a( void ) { return 'a'; }
        
        
        
        
    /////////////////////////////////////////////////////
    //
    //   b.c:
    //
    /////////////////////////////////////////////////////
        
        
        
    B b( void ) { return 'b'; }
        
        
        
    /////////////////////////////////////////////////////
    //
    //   c.c:
    //
    /////////////////////////////////////////////////////
        
        
        
    void c( A *pa, B *pb )
    {
        *pa = a();
        *pb = b();
    }
        
        
    #endif  // ndef  MEGALIB_AMALGAMATION_GIVE_ME_HEADERS
        
    #undef  MEGALIB_AMALGAMATION_GIVE_ME_HEADERS
        
    #endif  // ndef  MEGALIB_AMALGAMATION_INCLUDED

Bonus: a lame Makefile

Note that this Makefile also creates the actual amalgamation.

In practice, that would typically be a target of the library's own Makefile, not of the project's Makefile.

    AMAL_SCRIPT = ./mkamal.sh
        
    AMAL = amal.c
        
    PROJ_NAME = megalib
        
    HEADERS =   \
    	a.h     \
    	b.h     \
    	c.h
        
    AMAL_SOURCES =   \
    	a.c          \
    	b.c          \
    	c.c
        
    SOURCES =             \
    	$(AMAL_SOURCES)   \
    	main.c
        
    TARGET = x
        
        
        
    $(TARGET): $(AMAL) main.c
    	cc main.c $(AMAL) -o $(TARGET)
        
        
        
    $(AMAL): $(HEADERS) $(AMAL_SOURCES) $(AMAL_SCRIPT)
    	$(AMAL_SCRIPT)  "$(PROJ_NAME)"  "$(HEADERS)"  "$(AMAL_SOURCES)"  >  $(AMAL)

Have fun! :-)