In this context, "amalgamation" means the union of a number of C header- and source-files into a big single file.
The idea is to bundle a set of files into something like a "source-code library", which can be included (using normal #include directives) into your own code.
No magic involved. Everything below stays lightweight and simple.
BTW, this idea is not mine - for example, the SQLite and Whefs embeddable virtual filesystem projects make use of the same thing, and the implementation described on this page here is inspired by both.
I recently had the desire/need to reuse a lightweight "utility-library", consisting of a bunch of source-files, in a few other projects.
One way to do this would be to keep the original source in its own location, and make it available (copy it) to other projects in the form of a compiled lib. This has the disadvantage that changes between lib-versions are no longer obvious - a binary doesn't diff(1) very well.
Another way would be to simply copy the source-files from their original location into any project that wants to use them. The disadvantage is that the concepts of "version" and "change" may become fuzzy - which files belong to the lib, and which are specific to the project using the lib? Furthermore, which incarnation of the lib's source was the original again..?
As in the SQLite and Whefs projects, code reuse is implemented here by semi-intelligently concatenating a bunch of source-files into one big file, to be copied around and recompiled at will, as part of any project that wishes to use the lib.
Changes remain clearly visible (since it's still all plaintext after all).
Also, it's still obvious which files belong to the lib and which to the project using the lib (since the lib now spans just 1 single file).
Hereafter, a simple library is given as a bunch of source-files, which will be used as part of a new simple project.
To do this, an amalgamation of the library is created, resulting in a single file. This file is then included into, and built as part of, the new project.
Let's assume we have a superb lib called "megalib", consisting of files a.[hc], b.[hc] and c.[hc], like this:
a.h includes some system-headers of its own:
#ifndef A_H_INCLUDED
#define A_H_INCLUDED
#include <stdio.h>
#include <stdlib.h>
typedef char A;
A a( void );
#endif // ndef A_H_INCLUDED
a.c includes only its own header a.h:
#include "a.h"
A a( void ) { return 'a'; }
b.h includes some system-headers:
#ifndef B_H_INCLUDED
#define B_H_INCLUDED
#include <stdio.h>
#include <stdlib.h>
typedef char B;
B b( void );
#endif // ndef B_H_INCLUDED
b.c includes only its own header b.h:
#include "b.h"
B b( void ) { return 'b'; }
c.h includes some system-headers as well as both local headers a.h and b.h:
#ifndef C_H_INCLUDED
#define C_H_INCLUDED
#include <stdio.h>
#include <stdlib.h>
#include "a.h"
#include "b.h"
void c( A *pa, B *pb );
#endif // ndef C_H_INCLUDED
c.c includes only its own header c.h:
#include "c.h"
void c( A *pa, B *pb )
{
    *pa = a();
    *pb = b();
}
Let's assume the amalgamation-script is called mkamal.sh, with the following usage:
./mkamal.sh <macro_prefix> <list_of_headers> <list_of_sources>
...where
<macro_prefix>: a prefix for macro-names and flags used in the generated file. (e.g. specifying "megalib" results in macros and flags being prefixed with "MEGALIB_AMALGAMATION_"). This would typically be the project-name. Must only contain alphanumeric chars and/or underscores.
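<list_of_headers>: whitespace-delimited ordered list of header-files (all as one argument). Since local #include "..." lines are stripped from the generated file, each header must be listed after the headers it depends on.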
<list_of_sources>: whitespace-delimited list of source-files (all as one argument)
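The script as listed further down does not itself enforce the restriction on <macro_prefix>. A minimal sketch of such a check, assuming bash: the function names is_valid_macro_prefix and macro_prefix_of are hypothetical and not part of mkamal.sh, and rejecting a leading digit is an extra assumption (a C macro name cannot start with one).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of mkamal.sh): succeeds only if the prefix is
# made of letters, digits and underscores, and does not start with a digit.
is_valid_macro_prefix() {
    [[ "$1" =~ ^[A-Za-z_][A-Za-z0-9_]*$ ]]
}

# Same upper-casing as mkamal.sh uses to derive the macro names.
macro_prefix_of() {
    printf '%s_AMALGAMATION_\n' "$(printf '%s' "$1" | tr a-z A-Z)"
}

for pfx in "megalib" "mega-lib" "9lib"; do
    if is_valid_macro_prefix "$pfx"; then
        echo "'$pfx' -> $(macro_prefix_of "$pfx")"
    else
        echo "'$pfx' -> rejected"
    fi
done
```

Only "megalib" passes here and maps to MEGALIB_AMALGAMATION_; the other two sample prefixes are rejected before they could end up in a macro name.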
Running the command
./mkamal.sh "megalib" "a.h b.h c.h" "a.c b.c c.c" > amal.c
creates an amalgamation in file amal.c - contents of which are given later.
The original library consisted of a bunch of headers and a bunch of source-files. After combining them, we end up with 1 single file.
How do we extract the headers or the source?
To get the source, just compile the file:
$ ls amal.*
amal.c
$ cc -c amal.c
$ ls amal.*
amal.c amal.o
(Object-file amal.o will be created as expected, corresponding to the source-code from a.c, b.c and c.c.)
To get the headers, #define a special flag before #include'ing the amalgamation.
Recall that the first argument <macro_prefix> to the amalgamation-script was given as "megalib".
When defining MEGALIB_AMALGAMATION_GIVE_ME_HEADERS and then simply #include'ing the amalgamation into your project-source, its source-section will be skipped, and only the concatenated headers will effectively be included:
#define MEGALIB_AMALGAMATION_GIVE_ME_HEADERS
#include "amal.c"
(The above define will automatically be cleared when #include'ing the amalgamation.)
An imaginary project-file main.c using headers from the amalgamation is given below:
#include <stdio.h>
#define MEGALIB_AMALGAMATION_GIVE_ME_HEADERS
#include "amal.c"
int main( void )
{
    A a;
    B b;
    c( &a, &b );
    puts( "this is " MEGALIB_AMALGAMATION_VERSION "!" );
    return printf( "%c%c\n", a, b );
}
(Function c() was implemented in file c.c, and exposed through header c.h in the original bunch-of-files comprising the lib - if interested, see contents of file c.c given earlier.)
To build your project:
cc main.c amal.c
That's all. Running the resulting binary:
this is 2017-10-26T08:56:02+02:00!
ab
(A stringified timestamp from the time of creating the amalgamation can be retrieved through constant MEGALIB_AMALGAMATION_VERSION, and was printed from the main project-file main.c.)
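The same timestamp also sits on the very first line of the generated file, so a build-script can read it without compiling anything. A small sketch, assuming bash and using the sed expression suggested in the generated comment-header; the temporary file only stands in for a real amal.c, and the variable names are made up for illustration.

```shell
#!/usr/bin/env bash
# Stand-in for a generated amalgamation: only its first line matters here.
tmp=$(mktemp)
printf '%s\n' '// 2017-10-26T08:56:02+02:00' '//' '#define X 1' > "$tmp"

# Strip leading slashes and blanks, strip trailing blanks, print line 1 only.
version=$(sed -n -e 's@/* *@@' -e 's/ *$//' -e 1p "$tmp")
echo "$version"

rm -f "$tmp"
```

For the sample line above this prints 2017-10-26T08:56:02+02:00, i.e. the same string that MEGALIB_AMALGAMATION_VERSION expands to in the example.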
mkamal.sh
(Note that about 90% is fluff. :-)
#!/usr/bin/env bash
# WHAT IS THIS?
#
# This is a simplistic script that concatenates/processes some C-headers and -sourcefiles
# to form a single "amalgamation" file, to be easily integrated into other projects.
#
# Idea was taken from "SQLite" project (in-process database-lib / Hwaci); implementation
# is a simplified version of the "createAmalgamation.sh" script from the "whefs" project
# (embeddable virtual filesystem / S. Beal).
#
# HOW TO USE THIS SCRIPT?
#
# To create an amalgamation, use:
#
# myscript <macro_prefix> <list_of_headers> <list_of_sources>
#
# ...where
#
# <macro_prefix> a prefix for macro-names and flags used in the generated file.
# (e.g. specifying "megalib" results in macros and flags being prefixed
# with "MEGALIB_AMALGAMATION_"). This would typically be the project-name.
#
# Must only contain alphanumeric chars and/or underscores.
#
# <list_of_headers> whitespace-delimited ordered list of header-files (all as one argument)
#
# <list_of_sources> whitespace-delimited list of source-files (all as one argument)
#
#
# A single file containing the filtered contents of all headers and sources will be printed
# on standard output.
#
# AND HOW TO USE THE RESULTING FILE..?
#
# (see comment-header in generated file, or hardcoded help-text below.)
#
function die
{
echo "FATAL: $1" >&2
exit 1
}
[ $# -eq 3 ] || die "use: \"$( basename "$0" ) <macro_prefix> <headers> <sources>\""
MACRO_PFX="$1"
HEADERS="$2"
SOURCES="$3"
# upper-case the prefix to build the macro-names used in the generated file
VAR_PREFIX=$( echo "$MACRO_PFX" | tr a-z A-Z )_AMALGAMATION_
VERSION_VARNAME=${VAR_PREFIX}VERSION
HDRREQ_VARNAME=${VAR_PREFIX}GIVE_ME_HEADERS
INCGUARD_VARNAME=${VAR_PREFIX}INCLUDED
# ISO-8601 timestamp; note that the -I option is not available in every date(1) implementation
VERSION=$(date -Iseconds)
[ -n "$VERSION" ] || die "cannot get timestamp"
function print_converted_contents_of
{
local fname=$1
[ -f "$fname" ] || die "file '$fname' doesn't exist"
echo
echo
echo
echo /////////////////////////////////////////////////////
echo //
echo "// $fname:"
echo //
echo /////////////////////////////////////////////////////
echo
# drop local includes (#include "..."); system includes (#include <...>) are kept
sed -e '/^ *# *include ".*/d' "$fname"
}
cat << ---EOF---
// $VERSION
//
// ^^^ The above string is the amalgamation-version, available as compile-time macro
// "$VERSION_VARNAME", which is a quoted string.
//
// You can use "sed -n -e 's@/* *@@' -e 's/ *$//' -e 1p" to extract it from this file,
// e.g. in a build-script.
//
//
// !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
// !! THIS FILE WAS GENERATED - YOU PROBABLY DON'T WANT TO EDIT IT! !!
// !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
//
//
// WHAT IS THIS?
// -------------
//
// This file is a so-called "amalgamation" (combination/unification/concatenation) of
// a number of C headers/sources, and this single file can be integrated into your own
// project.
//
// The advantages over using the original pile-of-files or distributed library, are:
//
// - it's instantly obvious that an amalgamation is a snapshot, not the original
// - unambiguous version-control (there is only 1 resulting and diff'able file)
// - less margin for errors/omission/mixup (there is only 1 resulting file)
//
// This single file can be dropped into your project and compiled as any other file.
//
//
// HOW TO USE IT?
// --------------
//
// To extract/use only headers (assuming the file you're reading now is called "my_amalgamation.c"):
//
// in "your_file.c":
//
// ...
// #define $HDRREQ_VARNAME
// #include "my_amalgamation.c"
// ...
// // use functions from file "my_amalgamation.c" here
// ...
//
// (the flag "$HDRREQ_VARNAME" will be cleared automatically within "my_amalgamation.c")
//
// To compile:
//
// (nothing special - just compile "my_amalgamation.c" as any other)
//
// Note that the suggested extension for this file is ".c". This is done to make the compiler
// (or at least GCC) happy. However, it can also be included as if it were a header - see above.
//
// WHICH FILES WERE INCLUDED?
// --------------------------
//
// following C-headers and -sources:
//
// headers, in this order:
//
$( for a in $HEADERS; do echo "// $a"; done )
//
// sources, in this order:
//
$( for a in $SOURCES; do echo "// $a"; done )
//
#ifndef $INCGUARD_VARNAME
#define $INCGUARD_VARNAME
#define $VERSION_VARNAME "$VERSION"
/////////////////////////////////////////////////////////////////////////////////////////////////
//
// HEADERS FOLLOW:
//
/////////////////////////////////////////////////////////////////////////////////////////////////
---EOF---
for f in $HEADERS; do
print_converted_contents_of $f
done
cat << ---EOF---
/////////////////////////////////////////////////////////////////////////////////////////////////
//
// SOURCES FOLLOW:
//
/////////////////////////////////////////////////////////////////////////////////////////////////
#ifndef $HDRREQ_VARNAME
---EOF---
for f in $SOURCES; do
print_converted_contents_of $f
done
cat << ---EOF---
#endif // ndef $HDRREQ_VARNAME
#undef $HDRREQ_VARNAME
#endif // ndef $INCGUARD_VARNAME
---EOF---
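The only transformation the script applies to each file is the sed filter in print_converted_contents_of, which drops local #include "..." lines and leaves system #include <...> lines alone. A small sketch to try that filter in isolation, assuming bash; the wrapper name strip_local_includes and the sample input are made up for illustration.

```shell
#!/usr/bin/env bash
# Same sed expression as in mkamal.sh: delete lines of the form  #include "..."
strip_local_includes() {
    sed -e '/^ *# *include ".*/d'
}

filtered=$(printf '%s\n' \
    '#include <stdio.h>' \
    '#include "a.h"' \
    '  #  include "b.h"' \
    'void c( A *pa, B *pb );' | strip_local_includes)
echo "$filtered"
```

Both local includes disappear while the <stdio.h> line and the prototype survive. Note that the pattern only allows spaces before and after the "#", so a line indented with a tab would pass through unfiltered.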
amal.c
// 2017-10-26T08:56:02+02:00
//
// ^^^ The above string is the amalgamation-version, available as compile-time macro
// "MEGALIB_AMALGAMATION_VERSION", which is a quoted string.
//
// You can use "sed -n -e 's@/* *@@' -e 's/ *$//' -e 1p" to extract it from this file,
// e.g. in a build-script.
//
//
// !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
// !! THIS FILE WAS GENERATED - YOU PROBABLY DON'T WANT TO EDIT IT! !!
// !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
//
//
// WHAT IS THIS?
// -------------
//
// This file is a so-called "amalgamation" (combination/unification/concatenation) of
// a number of C headers/sources, and this single file can be integrated into your own
// project.
//
// The advantages over using the original pile-of-files or distributed library, are:
//
// - it's instantly obvious that an amalgamation is a snapshot, not the original
// - unambiguous version-control (there is only 1 resulting and diff'able file)
// - less margin for errors/omission/mixup (there is only 1 resulting file)
//
// This single file can be dropped into your project and compiled as any other file.
//
//
// HOW TO USE IT?
// --------------
//
// To extract/use only headers (assuming the file you're reading now is called "my_amalgamation.c"):
//
// in "your_file.c":
//
// ...
// #define MEGALIB_AMALGAMATION_GIVE_ME_HEADERS
// #include "my_amalgamation.c"
// ...
// // use functions from file "my_amalgamation.c" here
// ...
//
// (the flag "MEGALIB_AMALGAMATION_GIVE_ME_HEADERS" will be cleared automatically within "my_amalgamation.c")
//
// To compile:
//
// (nothing special - just compile "my_amalgamation.c" as any other)
//
// Note that the suggested extension for this file is ".c". This is done to make the compiler
// (or at least GCC) happy. However, it can also be included as if it were a header - see above.
//
// WHICH FILES WERE INCLUDED?
// --------------------------
//
// following C-headers and -sources:
//
// headers, in this order:
//
// a.h
// b.h
// c.h
//
// sources, in this order:
//
// a.c
// b.c
// c.c
//
#ifndef MEGALIB_AMALGAMATION_INCLUDED
#define MEGALIB_AMALGAMATION_INCLUDED
#define MEGALIB_AMALGAMATION_VERSION "2017-10-26T08:56:02+02:00"
/////////////////////////////////////////////////////////////////////////////////////////////////
//
// HEADERS FOLLOW:
//
/////////////////////////////////////////////////////////////////////////////////////////////////
/////////////////////////////////////////////////////
//
// a.h:
//
/////////////////////////////////////////////////////
#ifndef A_H_INCLUDED
#define A_H_INCLUDED
#include <stdio.h>
#include <stdlib.h>
typedef char A;
A a( void );
#endif // ndef A_H_INCLUDED
/////////////////////////////////////////////////////
//
// b.h:
//
/////////////////////////////////////////////////////
#ifndef B_H_INCLUDED
#define B_H_INCLUDED
#include <stdio.h>
#include <stdlib.h>
typedef char B;
B b( void );
#endif // ndef B_H_INCLUDED
/////////////////////////////////////////////////////
//
// c.h:
//
/////////////////////////////////////////////////////
#ifndef C_H_INCLUDED
#define C_H_INCLUDED
#include <stdio.h>
#include <stdlib.h>
void c( A *pa, B *pb );
#endif // ndef C_H_INCLUDED
/////////////////////////////////////////////////////////////////////////////////////////////////
//
// SOURCES FOLLOW:
//
/////////////////////////////////////////////////////////////////////////////////////////////////
#ifndef MEGALIB_AMALGAMATION_GIVE_ME_HEADERS
/////////////////////////////////////////////////////
//
// a.c:
//
/////////////////////////////////////////////////////
A a( void ) { return 'a'; }
/////////////////////////////////////////////////////
//
// b.c:
//
/////////////////////////////////////////////////////
B b( void ) { return 'b'; }
/////////////////////////////////////////////////////
//
// c.c:
//
/////////////////////////////////////////////////////
void c( A *pa, B *pb )
{
    *pa = a();
    *pb = b();
}
#endif // ndef MEGALIB_AMALGAMATION_GIVE_ME_HEADERS
#undef MEGALIB_AMALGAMATION_GIVE_ME_HEADERS
#endif // ndef MEGALIB_AMALGAMATION_INCLUDED
Note that this Makefile also creates the actual amalgamation.
In practice, that would typically be a target of the library's own Makefile, not of the project's Makefile.
AMAL_SCRIPT = ./mkamal.sh
AMAL = amal.c
PROJ_NAME = megalib
HEADERS = \
	a.h \
	b.h \
	c.h
AMAL_SOURCES = \
	a.c \
	b.c \
	c.c
SOURCES = \
	$(AMAL_SOURCES) \
	main.c
TARGET = x
# recipe lines below must start with a tab character
$(TARGET): $(AMAL) main.c
	cc main.c $(AMAL) -o $(TARGET)
$(AMAL): $(HEADERS) $(AMAL_SOURCES) $(AMAL_SCRIPT)
	$(AMAL_SCRIPT) "$(PROJ_NAME)" "$(HEADERS)" "$(AMAL_SOURCES)" > $(AMAL)
Have fun! :-)