From Bits to Arguments: sub-range-binders

Written by Dominik Pantůček on 2026-01-15

racket

Sometimes it is convenient to use the characters of a string to illustrate the underlying low-level meaning of a given identifier. One such example is the definition of an ISA opcode, where various bit groups of the binary representation can have different meanings.


When designing a new CPU, it is a good idea to design its ISA first and verify that certain assumptions hold when you try to implement it. It is not necessary to start implementing the design directly on a silicon wafer right away; it is much easier to formally verify its internal consistency and, perhaps, write an emulator.

But what would the emulator run? For such an emulator to be useful, an assembly language compiler is needed as well. It would be very useful to be able to specify the binary representation of an opcode, with different meanings for different bit groups, and use it directly for implementing both the assembly compiler and a procedural emulator.

(define regs (make-vector 16 0))
(define-opcode (ldl :0000-rrrr-iiii-iiii)
  (vector-set! regs r i))

This looks like a good design. For the emulator, a vector of registers is created, and for the compiler, a bit specification of the opcode and its arguments is written down. Introducing the specification with a colon character to avoid ambiguity with numbers, and allowing "-" or "_" in any position to group the bits logically as the user desires, seem to be good design choices too.
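To make the meaning of such a bit specification concrete, here is a minimal sketch of how an emulator might read it. It is not the author's implementation: the names spec-bits, opcode-match? and field-value are hypothetical, the specification is passed as a string rather than an identifier, and the sketch assumes that the leftmost character is the most significant bit and that all bits marked with the same letter form one unsigned integer.

```racket
#lang racket/base

;; Hypothetical helper: drop the leading colon and the "-" / "_" separators,
;; leaving one character per bit, most significant bit first (an assumption).
(define (spec-bits spec)
  (for/list ([c (in-string spec)]
             #:unless (memv c '(#\: #\- #\_)))
    c))

;; Does `word` carry the fixed 0/1 bits required by `spec`?
;; Letter positions are argument bits and accept any value.
(define (opcode-match? spec word)
  (define bits (spec-bits spec))
  (for/and ([c (in-list bits)]
            [pos (in-range (sub1 (length bits)) -1 -1)])
    (case c
      [(#\0) (not (bitwise-bit-set? word pos))]
      [(#\1) (bitwise-bit-set? word pos)]
      [else #t])))

;; Collect all bits marked with `letter` into one unsigned integer.
(define (field-value spec letter word)
  (define bits (spec-bits spec))
  (for/fold ([acc 0])
            ([c (in-list bits)]
             [pos (in-range (sub1 (length bits)) -1 -1)]
             #:when (char=? c letter))
    (+ (* 2 acc) (if (bitwise-bit-set? word pos) 1 0))))

(define ldl-spec ":0000-rrrr-iiii-iiii")

(opcode-match? ldl-spec #x03AB)    ; #t
(field-value ldl-spec #\r #x03AB)  ; 3
(field-value ldl-spec #\i #x03AB)  ; 171
```

With these three pieces, the word #x03AB would be recognised as ldl with register 3 and immediate 171, which is the pair of values the body of the opcode definition is meant to receive.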

We want the aforementioned code to behave like an ordinary procedure with respect to its arguments. So in the body of the procedure there should be no difference to the following:

(define (ldl r i)
  (vector-set! regs r i))

That means that the "rrrr" string should be collapsed into a single "r" character, which becomes the first argument binding, and the two groups of "i"s should together form the second argument binding, "i". For simplicity, we assume that the bits are always interpreted as little-endian unsigned integers starting from the lowest (the rightmost) bit.
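The collapsing step on its own can be sketched as a small macro. This is only an illustration under stated assumptions, not the toolkit's actual define-opcode: it ignores the bit encoding entirely, attaches no syntax properties, and simply turns every distinct letter of the specification into one argument, in order of first appearance.

```racket
#lang racket/base
(require (for-syntax racket/base racket/list syntax/parse))

;; Sketch only: one procedure argument per distinct letter of the
;; bit specification, in order of first appearance.
(define-syntax (define-opcode stx)
  (syntax-parse stx
    [(_ (name:id spec:id) body ...+)
     (define spec-str (symbol->string (syntax-e #'spec)))
     (define letters
       (remove-duplicates
        (for/list ([c (in-string spec-str)]
                   #:when (char-alphabetic? c))
          c)))
     ;; Give each argument the lexical context of the specification so
     ;; that the body, written at the same place, can refer to it.
     (define args
       (for/list ([c (in-list letters)])
         (datum->syntax #'spec (string->symbol (string c)) #'spec)))
     #`(define (name #,@args) body ...)]))

(define regs (make-vector 16 0))

(define-opcode (ldl :0000-rrrr-iiii-iiii)
  (vector-set! regs r i))

(ldl 3 42)
(vector-ref regs 3) ; 42
```

The arguments are created with datum->syntax on purpose: identifiers introduced hygienically by the macro would not be visible to the body written by the user.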

Using the sub-range-binders property on any syntax which remains in the procedure definition, it is possible to achieve exactly this behaviour. A brief look at the result in DrRacket shows that this is definitely a viable approach to solving the problem in question:

[Image: DrRacket: sub-range-binders]

The problem is that this syntax property is not applied to the whole scope in question. In order to introduce the sub-range binders, an "anchor" syntax object has to be inserted in the right scope. In this case it belongs at the beginning of the implicit lambda which comprises the whole procedure. So how do we introduce it?

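(require (for-syntax racket/base syntax/parse))

(define-syntax (define-op-helper stx)
  (syntax-parse stx
    ((_ part whole spans)
     ;; Attach the property to an expression that survives expansion.
     (syntax-property
      #'(void)
      'sub-range-binders
      ;; One vector per (start . span) pair in spans.
      (for/list ((range (in-list (syntax->datum (attribute spans)))))
        (vector (syntax-local-introduce #'part)   ; the short argument name
                0 1
                (syntax-local-introduce #'whole)  ; the full bit specification
                (car range) (cdr range)))))))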

This helper syntax does that for us. It has to expand to a syntax object containing something that survives expansion, i.e. a simple begin is not an option because it gets immediately spliced into the surrounding definition context. It expects the collapsed argument identifier under the part pattern variable, the original long identifier under the whole pattern variable, and a list of cons pairs with start/span numbers under spans.
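The list of start/span pairs has to come from somewhere. One way to compute it, sketched here with a hypothetical letter-spans procedure that is not part of the original code, is to scan the textual form of the identifier and record every contiguous run of a given letter. The sketch assumes that positions are counted in characters from the start of the identifier, including the leading colon.

```racket
#lang racket/base

;; Hypothetical helper: list the (start . span) pairs of every
;; contiguous run of `letter` in `spec`.
(define (letter-spans spec letter)
  (define len (string-length spec))
  (let loop ([pos 0] [run-start #f] [acc '()])
    (define in-run?
      (and (< pos len) (char=? (string-ref spec pos) letter)))
    (cond
      ;; Entering a new run: remember where it starts.
      [(and in-run? (not run-start)) (loop (add1 pos) pos acc)]
      ;; Still inside the current run.
      [in-run? (loop (add1 pos) run-start acc)]
      ;; Just left a run: record it, then continue or finish.
      [run-start
       (define acc* (cons (cons run-start (- pos run-start)) acc))
       (if (< pos len)
           (loop (add1 pos) #f acc*)
           (reverse acc*))]
      ;; Outside any run.
      [(< pos len) (loop (add1 pos) #f acc)]
      [else (reverse acc)])))

(letter-spans ":0000-rrrr-iiii-iiii" #\r) ; '((6 . 4))
(letter-spans ":0000-rrrr-iiii-iiii" #\i) ; '((11 . 4) (16 . 4))
```

For the "i" argument this yields two pairs, so the helper would produce two vectors, one for each group of "i" characters in the long identifier.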

If you are wondering where you can see this in action, the answer is: not yet. However, in due time the whole ISA design toolkit will be released under a permissive license, and you will definitely read about it right here.

Hope you liked it and see you next time!