[ home ]

Single-instruction CPU: simulator/validator for assembly-source

This page describes a simulator/validator for a macro-assembler I made earlier for a single-instruction CPU.

Like the assembler, this tool is made in Tcl (wiki).

(For details on assembly-instructions such as "IBC", and concepts like "macro" and "macro-argument", refer to the macro-assembler page.)

Introduction and rationale

Given only a single native instruction, it's pretty hard to debug native code.

Since, in this context, all (compound) instructions are the result of macro-expansion, there is no way to "disassemble" a program-image. (Well, it is possible, but you would only end up with a fairly bland list of opcode-fields to implied native IBC-instructions.)

Therefore, I would like to avoid actual debugging. An alternative is to automatically check semantics on a per-macro basis - which is what this page is about.

Furthermore, this tool can be part of a regression-test, e.g. when re-implementing an instruction-set as part of optimisation.

General concept

A macro has zero or more formal arguments.

Assuming that the behaviour of each instance of that macro depends only on the value of the corresponding actual arguments, we can:

  1. describe the macro-behaviour in terms of these arguments
  2. verify this behaviour for all possible initial values of these arguments.

(We'll only talk about data-arguments here; branch-arguments fall outside the scope of this tool, as explained later on.)

The basic functionality for validating a macro is this (in pseudocode):

    assemble generated macro-wrapper code
    load resulting program-image
    for each possible value-set "initial" of macro-arguments:    \
        preset RAM-locations corresponding to "initial"           |
        run program                                               | (*)
        read final values of macro-arguments "final" from RAM     |
        run semantic-tests on sets "initial" and "final"          |

The section indicated by the asterisk is described in more detail below.


In addition to a name, a formal argument can have access- and/or width-specifiers.


An access-specifier consists of a combination of letters:

The assembler ignores these specifiers, but the validator does not. If no access-specifier is given, "rw" is assumed.

As an example, consider macro "not1" (invert/complement a single bit):

    # Invert a bit.
    macro  not1  reg:rw1 {
                ibc1    $reg    done
        : done

As can be seen, this macro has 1 formal argument ("reg"), which has read-write access.


In the above macro-definition, the trailing "1" indicates the bit-width of this argument.

Since "not1" is a single-bit instruction, its operand is 1 bit wide. (It would make sense to use "4" as width-specifier for compound 4-bit instructions.)

If no width-specifier is given, "1" is assumed.

Description of a macro's behaviour

We can create namespaces "initial" and "final", each containing variables with the same name as the macro-arguments.

If the test-harness is constructed so, that the macro takes variables in namespace "initial", manipulates them, and writes resulting values into namespace "final", then its behaviour can be described in terms of these 2 sets of variables.

As an example, consider macro "add1" (a 1-bit full adder), with the following prototype:

    macro  add1  ci:r1  a:r1  b:r1  sum:w1  co:w1

(Arguments are "carry-in", "input A", "input B", "sum" and "carry-out", in this order.)

A description of its behaviour (in Tcl) can look like this:

    set  tot  [ +  [ + $final::a $final::b ]  $final::ci ]
    unless  { $final::sum  ==  ( $tot & 1 ) }  { error "sum has incorrect value" }
    unless  { $final::co   ==    $tot > 1   }  { error "co has incorrect value"  }

More examples are given later.

Mapping macro-arguments to RAM-locations

Macro-arguments each take up a number of bits. The total bit-width of all macro-arguments is simply the sum of these individual bit-widths, say N.

If each test-case arranges to map its actual macro-arguments at the start of RAM, then the first N bits of RAM hold the full set of macro-arguments.

This is nice.

Creating entropy

Before running the macro, we can pre-set the first N bits of RAM (thus, the macro-argument values) to a distinct pattern.

Then, after running the macro, those same bits can be compared to their initial values, effectively comparing initial and final argument-values for one specific iteration.

This comparison can be done for each possible bit-pattern corresponding to the macro-argument bits. This is a brute-force approach, where the macro has to be ran 2^N times, each time with a different pattern for the first N bits of RAM.

Checking for R/O-argument clobbering

Strictly speaking, only write-only and read-write arguments would have to be checked after executing the macro in order to verify its behaviour.

However, in case a macro erroneously clobbers its read-only arguments, subsequent code would likely fail, if it expected the macro to leave its read-only arguments unaltered.

Therefore, the test-harness itself verifies that no read-only argument is changed by the macro, by comparing values of read-only arguments in aforementioned namespaces "initial" and "final".


This tool works fairly well, however...

Many or large arguments

For a simple 1-bit compound instruction, the total set of arguments spans only a couple of bits. For a 4-bit instruction set, the number of bits would be bigger, obviously.

For a full 8-bit adder for instance, the total number of bits corresponding to all macro-arguments may become so high, that testing for all patterns becomes impractical.

Since there likely will never be any 8- or higher-bit instructions, this is probably not a real problem.


The previous text only dealt with data-arguments to macros. However, macros can also take branch-arguments.

Branch-arguments are ignored to avoid complexity. To vary branch-arguments as well would involve re-assembly for every possible value of each branch-argument, and would likely involve a more complex behaviour-description.

I'm fairly lazy.

Therefore, to test branch-like macros, at this point, the user has to make a wrapper around each such macro, using additional arguments to check whether branches were taken or not.

As an example, such a wrapper around the "cb1"-instruction ("Clear and unconditionally Branch") is given later on.

Dealing with endless loops

In short, the validator doesn't. The choice was made to pre-set RAM, then "run" the macro, then verify behaviour using resulting RAM-contents.

What does it mean to "run" a macro?

The assembler expands the macro to a number of IBC-instructions. To "run" the macro from a test-harness point of view, is to execute instructions until the program-counter reaches the ROM-address after the last program-instruction.

The implementation of "program::run" illustrates this:

    proc run {}  { 
        CPU reset
        while { [ CPU PC ] != [ size ] }  { CPU tick } 

In other words, "perform clock-ticks, as long as the program-counter ("PC") is not equal to the program-size" - that is, while executing program-code.

Dependencies on memory-mapped I/O

A system with only a CPU, ROM and RAM, is fairly useless.

When using some kind of memory-mapped I/O living at fixed addresses, these addresses would have to be used instead of RAM-addresses in a program using this I/O. These correspond to address-constants in source-code.

The current validator only concerns itself with the RAM-region corresponding to the arguments of the macro under test, and nothing else. Therefore, behaviour depending on mapped I/O is not tested.

However, considering that the purpose of the validator is to check for bugs in assembly-source and the assembler itself, it probably makes more sense to do this sort of testing in an actual system, and on a higher level.


Some correctly implemented macros as well as some buggy ones are given below, to illustrate the use of the validator.

Testing correct macro "not1" (negate single bit)

Macro "not1" is defined as follows (in file "1bit.asm"):

    # Invert a bit.
    macro  not1  reg:rw1 {
                ibc1    $reg    done
        : done

As can be seen, it takes a 1-bit read-write argument. If implemented correctly, it flips the bit at its argument's address.

This behaviour can be described as follows:

    unless  { $final::reg != $initial::reg }  { error "register has not been complemented" }

(Assume a file "not1.test" exists with the previous line as contents.)

To run the validator on macro "not1":

    $ ./trace.tcl 1bit.asm not1 not1.test

No output follows, meaning the behaviour is as expected. Yay!

For reference, the generated wrapper around "not1" looks like this:

    include "1bit.asm"
    main {
    . reg 1
    not1 $reg

(This wrapper-code is written to a temporary file, and then assembled.)

Things to note:

"Running" this snippet is said to be completed as soon as the program-counter reaches beyond the last program-location, as explained earlier.

Testing correct macro "cb1" (Clear, and unconditionally Branch)

Branch-instruction "cb1" can be defined as follows:

    # Clear and unconditionally branch.
    macro  cb1  reg:w1  branch {
        ibc1    $reg    $branch
        ibc1    $reg    $branch

Here we have a problem - the macro takes a branch-argument, which is not supported by the validator (see earlier for an explanation).

As a work-around, a wrapper can be created (let's say in file "cb1_wrapper.asm"):

    include "1bit.asm"
    macro cb1_wrapper reg:w1 zero:w1 {
                clr1    $zero
                cb1     $reg    done
                set1    $zero
        : done

As can be seen, the branch-argument is gone - the "cb1_wrapper" macro only takes data-arguments.

The additional "zero"-argument is used to check whether the branch was taken or not: it's cleared at the start, and only set in case the branch was not taken. Therefore, if this argument was set after running the wrapper-macro, the implementation was buggy.

A behaviour-description could look like this:

    if  { $final::reg  }  { error "register was not cleared" }
    if  { $final::zero }  { error "branch was not taken"     }

When running this description against the wrapper-macro, there is not output, indicating success.

Testing buggy macro "mcxor1" (Mostly Correct exclusive-OR)

Let's say what should have been "exclusive-OR" is implemented as simple "OR":

    macro  mcxor1  reg:rw1  mask:r1 {                                                            
                ibc1    $mask   x
                ibc1    $mask   done
        : x
                not1    $mask
                set1    $reg
        : done

When validated using the following XOR-behaviour...

    unless  { $final::reg  ==  ( $initial::mask ^ $initial::reg ) }  { error "reg has not been XOR'd with mask" }

...the error is exposed:

    Fail (id=3): reg has not been XOR'd with mask.
    Arguments (before): reg:rw1=1  mask:r1=1
    Arguments (after) : reg:rw1=1  mask:r1=1

The "id" displayed is actually the value of the first N bits of RAM, corresponding to the macro-argument variables. (To the user, "id" may make more sense than "bitmask".)

For each failed macro-run, the macro-arguments' values are displayed before and after running the macro, to give a clue as to where the bug may be. Access- and width-specifiers for each argument are given as well for convenience.

Testing buggy macro "awmov1" (Almost Weekend MOVe)

Let's say it's Friday afternoon - the weekend-magnet is ON, and people get sloppy.

What should have been a 1-bit "move"-instruction is perhaps implemented as follows:

macro  awmov1  from:r1  to:w1 {                                                           
            ibc1    $from   was_1
            cb1     $to     done
    : was_1
            set1    $to
    : done

May look OK at first sight - the output is set whenever the input was set, and cleared if the input was cleared.

However, the "from"-argument was specified as read-only. That means that following code would assume the "awmov1" macro does not change it - and it does, by accident.

This error would go by undetected if it were only for the behaviour-description:

    unless  { $final::to == $initial::from }  { error "value has not been copied" }

(Nowhere in here is "$initial::from" compared to "$final::from".)

However, this is where an aforementioned generic check, built into the validator, comes in - a mismatch between the initial and final value of any read-only argument triggers an error:

    Fail (id=0): r/o variable 'from' has been changed.
    Arguments (before): from:r1=0  to:w1=0
    Arguments (after) : from:r1=1  to:w1=0
    Fail (id=1): r/o variable 'from' has been changed.
    Arguments (before): from:r1=1  to:w1=0
    Arguments (after) : from:r1=0  to:w1=1
    Fail (id=2): r/o variable 'from' has been changed.
    Arguments (before): from:r1=0  to:w1=1
    Arguments (after) : from:r1=1  to:w1=0
    Fail (id=3): r/o variable 'from' has been changed.
    Arguments (before): from:r1=1  to:w1=1
    Arguments (after) : from:r1=0  to:w1=1

(As can be seen, the "from"-argument was erroneously altered in every test.)


For reference, the validator's source is given below.

    #!/usr/bin/env tclsh
    proc die msg { puts "FATAL: $msg"; exit 1 }
    proc unless { cond code }  { tailcall  if  !($cond)  $code }
    foreach op { + - * ** / & | << >> < <= > >= && || } { proc $op { a b } [ list expr \$a $op \$b ] }
    proc I x { set x }
    proc bitval { "at" pos "in" mask }  { expr { !!( $mask & [ << 1 $pos ] ) } }
    # Generate sequence of integers, exclusive ("...") and inclusive ("..").
    proc ... { start stop } {
        set li {}
        for { set x $start } { $x < $stop } { incr x } {
            lappend li $x
        set li
    proc .. { start stop }  { ...  start  [ + stop 1 ] }
    # Wrapper around namespace-ensemble.
    proc miniclass { name methods } {
        set a [ string map [ list __METHODS__ $methods ] {
            namespace export *
            namespace ensemble create
        } ]
        namespace eval $name $a
    # Get entry from sparse array (unoccupied slots read as '0').
    proc sparse-at { li idx }  { expr { ( $idx >= 0  &&  $idx < [ llength $li ] )  ?  [ lindex $li $idx ]  :  0 } }
    # Program under test.
    # This is an abstraction of "ROM", using data-/branch-fields instead of bare
    # program-words.
    # There is a concept of "current" address- and data-values, following from the
    # CPU's program-counter.
    # In addition, some I/O-functions are added to load/set program-contents.
    miniclass program {
        proc from-img { img } {
            binary scan $img i* pwords
            ROM clear
            foreach u32 $pwords { ROM append $u32 }
        proc from-file { fname } {
            set if [ open $fname ]
            fconfigure $if -translation binary
            set img [ read $if ]
            close $if
            from-img $img
        proc run {}  { 
            CPU reset
            while { [ CPU PC ] != [ size ] }  { CPU tick } 
        proc size  {}  {          ROM occupancy              }
        proc addr  {}  {          ROM addr                   }
        proc daddr {}  { >> [ & [ ROM data ] 0xffff0000 ] 16 }
        proc baddr {}  {      & [ ROM data ] 0x0000ffff      }
    # RAM-contents.
    # "Current" address- and data-values follow from the program's current data-address field,
    # indexing the RAM.
    miniclass RAM {
        # Simulate infinite RAM: reads '0' in areas not yet accessed.
        namespace eval our  { variable contents {} }
        proc clear  {}    { set our::contents {}                     }
        proc addr   {}    { program daddr                            }
        proc data   {}    { at [ addr ]                              }
        proc invert {}    { write  [ addr ]  [ expr { ! [ data ] } ] }
        proc at     addr  { sparse-at  $our::contents  $addr         }
        proc write { addr val } { 
            while { [ llength $our::contents ] <= $addr }  { lappend our::contents 0 }   ;# zzz...
            lset our::contents $addr $val
        # Return integer representation of RAM-region (lowest address corresponds to LSB).
        proc region-as-int { ofs N } {
            set val   0
            set shift 0
            foreach  addr  [ ...  $ofs  [ + $ofs $N ] ] {
                incr val [ << [ at $addr ] $shift ]
                incr shift
            set val
        # Returns list of current values for sorted variable-names in $proto (assumed to live at RAM-addr 0 onwards).
        proc values { "from" proto } {
            set values {}
            set startbit 0
            foreach argname [ dict keys $proto ] {
                set  width  [ dict get $proto $argname width ]
                lappend  values  [ region-as-int $startbit $width ]
                incr startbit $width
            set values
    # ROM, holding program under test.
    # "Current" address- and data-values follow from CPU's address-output.
    miniclass ROM {
        # Simulate infinite ROM: reads '0' in areas outside program-region.
        namespace eval our { variable contents {} }
        proc clear     {}        { set our::contents {}                }
        proc append    { data }  { lappend our::contents $data         }
        proc occupancy {}        { llength $our::contents              }
        proc addr      {}        { CPU addr_out                        }
        proc data      {}        { sparse-at  $our::contents  [ addr ] }
    # CPU, able to execute IBC-instructions (and that's all).
    # RAM-contents are updated with each tick, and CPU-buses are updated,
    # indirectly affecting ROM-, RAM- and program-indices.
    miniclass CPU {
        namespace eval our { variable  aout 0  num_tick 0 }
        proc addr_out  {}  { set our::aout     }
        proc branch_in {}  { program baddr     }
        proc num_tick  {}  { set our::num_tick }
        proc PC        {}  { addr_out          }
        proc reset {} { 
            set our::aout     0
            set our::num_tick 0
        proc debug {} {
            puts -nonewline "addr:[ program addr ] "
            puts -nonewline "daddr:[ program daddr ] "
            puts -nonewline "baddr:[ program baddr ] "
            puts            "data:[ RAM data ]->[ expr { ! [ RAM data ] } ]"
        proc tick {} {
    #        debug
            RAM invert
              if   [ RAM data ]  { incr our::aout 
            } else               { set  our::aout  [ branch_in ]
            incr our::num_tick
    # Returns dict with var's name as key, and sub-dicts 'width' and 'access' as value. 
    # Throws an appropriate error in case something goes wrong.
    proc extract-proto { asm_fname macroname } {
        set arglist {}   ;# output from filter: raw proto
        # Define dummy-functions to filter assembly-source.
        proc include args {}
        proc macro { candname args } {
            upvar  arglist arglist  macroname macroname
            if [ string equal $candname $macroname ] {
                set arglist [ lrange  $args  0 end-1 ]
        # Filter assembly-source, collecting arg-list.
        source $asm_fname
        unless [ llength $arglist ]  { error "cannot find macro '$macroname' in source-file '$asm_fname'" }
        # Convert arg-list to dict, and pass to user.
        set proto [ dict create ]
        foreach arg $arglist {
            set width 1
            set access rw
            regexp {^[^:]+:([rw]+)\d*$} $arg --> access
            regexp {^[^:]+:[rw]*(\d+)$} $arg --> width
            regexp {^([^:]+).*}         $arg --> argname
            dict set proto $argname width  $width
            dict set proto $argname access $access
        set proto
    # Returns assembly-source of a wrapper around an instance of macro under test.
    # For each of the macro-arguments, variables with the correct size are instantiated at
    # start of ROM. 
    # Test-logic can pre-set these variables to each possible combination in turn (by setting corresponding RAM-locations), 
    # then execute the program, and then use the final variables' values to verify the macro's semantics.
    proc macro-wrapper { asm_fname macroname proto } {
        set wrapper "include \"$asm_fname\"\n"
        append wrapper "main \{\n";
        # Declare a variable of correct width for each formal macro-argument, and build argname-list.
        set argnames [ dict keys $proto ]
        append wrapper [ join [ lmap a $argnames  { I ". $a [ dict get $proto $a width ]" }  ] \n ] \n
        # Add macro-instance itself.
        set argstring [ join [ lmap a $argnames  { I "\$$a" }  ] ]
        append wrapper "$macroname $argstring\n"
        append wrapper "\}\n"
    # Run the assembler on a code-snippet, and return the raw image.
    proc assemble { snippet } {
        set ASSEMBLER mas.tcl
        # Write code-snippet to temp source-file.
        set f [ file tempfile asm_fname ]
        puts $f $snippet 
        close $f
        # Assemble into temp binary.
        set f [ file tempfile bin_fname ]
        close $f
        exec $ASSEMBLER $asm_fname $bin_fname
        file delete $asm_fname
        # Import generated binary, and pass to user.
        set f [ open $bin_fname ]
        set bin_contents [ read $f ]
        close $f
        file delete $bin_fname
        set bin_contents
    # Create/set macro-arg variables in a namespace, with values from corresponding RAM-region.
    proc get-arg-vars { "from" proto "into" "namespace" ns } {
        set values [ RAM values from $proto ]
        set names  [ dict keys $proto ]
        foreach   val $values  name $names  { namespace eval  $ns  [ list variable $name $val ] }
    # Debug-representation of all variables corresponding to a macro-proto.
    # Variable-values are read from a given namespace.
    proc argstring { proto ns } {
        set li {}
        foreach name [ dict keys $proto ] { 
            set width  [ dict get $proto $name width  ]
            set access [ dict get $proto $name access ]
            set val    [ set ${ns}::$name             ]
            lappend li "$name:$access$width=$val" 
        return "[ join $li "  " ]\n"
    # Run test-logic, given a "initial"- and "final"-set of variables corresponding to a macro-proto.
    # In case of failure, debugging-info will be printed.
    proc run-test { testcode "with" "id1" id "using" "vars" ns_initial "-->" ns_final "from" proto } {
        if [ catch $testcode msg ] {
            puts "Fail (id=$id): $msg."
            puts -nonewline "Arguments (before): [ argstring $proto $ns_initial ]"
            puts -nonewline "Arguments (after) : [ argstring $proto $ns_final   ]"
    # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # 
    # Get command-line arguments.
    unless  [ >= [ llength $argv ] 3 ]  { die "use arguments: <asmfile> <macroname> <testfile>" }
    lassign $argv asm_fname macroname test_fname
    # Get proto for macro under test from assembly-source.
    set proto  [ extract-proto $asm_fname $macroname ];
    unless  [ llength $proto ]  { die "cannot get proto for macro '$macroname' in source-file '$asm_fname'" }
    # Create program by assembling wrapper around macro under test.
    set wrapper  [ macro-wrapper $asm_fname $macroname $proto ]
    set img  [ assemble $wrapper ]
    program from-img $img
    # Get total number of bits used by macro-argument variables.
    set num_argbit 0
    foreach a [ dict keys $proto ]  { incr num_argbit [ dict get $proto $a width ] }
    # Read macro-specific test from file, to be ran from within 'run-test'.
    set  f  [ open $test_fname ]
    set  macro_test  [ read $f ]
    close $f
    # Create generic test, to be ran from within 'run-test'.
    set generic_test {
        foreach argname [ dict keys $proto ] {
            set access [ dict get $proto $argname access ]
            if [ string equal $access r ] {
                set initial [ set ${ns_initial}::$argname ]
                set final   [ set ${ns_final}::$argname   ]
                unless { $initial == $final }  { error "r/o variable '$argname' has been changed" }
    # For each combination of macro-argument bits: run program, then run generic- and macro-specific test.
    foreach  mask  [ ...  0  [ ** 2 $num_argbit ] ] {
        RAM clear
        foreach addr [ ... 0 $num_argbit ]   { RAM write  $addr  [ bitval at $addr in $mask ] }
        get-arg-vars from $proto into namespace "initial"
        program run
        get-arg-vars from $proto into namespace "final"
        run-test $generic_test with id $mask using vars "initial" --> "final" from $proto 
        run-test $macro_test   with id $mask using vars "initial" --> "final" from $proto