LibAssuan Binary S-Expressions
Written by Dominik Pantůček on 2024-11-21
Tags: cryptography, yubikey, python
When you need to compute a symmetric key from the result of an Elliptic-Curve Diffie-Hellman key exchange according to a specification, the primary ingredient for that calculation is the result itself. Read on to see what interesting things you may find if you need to use YubiKey's OpenPGP Card application as a foundational component of a solution.
With recent versions of YubiKey firmware, it is possible to use the tokens directly with a typical GnuPG installation, such as the one that comes with Ubuntu 24.04 LTS. That is definitely a good thing. You just plug it in and gpg --edit-card immediately gives you a reasonable interface to start using it, even for X25519 and Ed25519.
However, if you cannot use OpenPGP formats for the real work and you just need to use the underlying cryptography, it should not be much harder. Or is it? The whole cryptographic stack consists of:
- gpg binary
- gpg-agent daemon
- scdaemon
And since gpg offloads all its work through gpg-agent onto scdaemon and ultimately to the connected device, it is rather straightforward to contact gpg-agent directly and request whatever operation you need to perform.
The protocol used for this communication is Assuan, and it is implemented by the libassuan library. We will not delve into the details of the protocol; instead we will focus on one interesting data structure used by it.
Binary S-Expressions are typically used for transferring arbitrarily structured data.
A typical S-Expression is a nested list structure with various atomic data. For example:
(1 (2 3) 4 ("Hello" (6 7)))
Of course, depending on the implementation, various atoms in addition to numbers and strings may be present.
Binary S-Expressions have only one such data type and that is raw bytes. A not-so-nice example is as follows:
(10:public-key(3:ecc(5:curve7:Ed25519)(5:flags5:eddsa)(1:q33:@�/���K�n}�@f6@e�!Ȇ{j� V�p��2vn�)))
As we can easily see, a Binary S-Expression is either a list of Binary S-Expressions or raw binary data prefixed by its length in bytes, written in ASCII decimal digits, with a colon (:) serving as a separator between the length and the actual contents.
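To make the length-prefix rule concrete, here is a minimal sketch that reads a single atom from the start of a buffer. The helper name read_atom is hypothetical and is not part of libassuan or any other library; it only illustrates the rule described above.

```python
from typing import Tuple


def read_atom(data: bytes, idx: int = 0) -> Tuple[bytes, int]:
    """Read one length-prefixed atom starting at idx.

    Returns the atom's raw bytes and the index just past the atom.
    (Hypothetical helper for illustration only.)
    """
    sep_idx = data.index(b":", idx)  # raises ValueError when no colon follows
    digits = data[idx:sep_idx]
    if not digits.isdigit():
        raise ValueError("length prefix must consist of ASCII decimal digits")
    length = int(digits.decode("ascii"))
    start = sep_idx + 1
    end = start + length
    if end > len(data):
        raise ValueError("atom is truncated")
    return data[start:end], end


atom, next_idx = read_atom(b"10:public-key(3:ecc)")
print(atom, next_idx)  # b'public-key' 13
```

The returned index points at the byte right after the atom, which in this example is the opening parenthesis of the next list.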
We can write a (sort of) grammar for this encoding:
<BSE>    ::= <atom>
           | <list>
<atom>   ::= <number>:<bytes>
<number> ::= non-negative decimal number in ASCII digits
<bytes>  ::= raw binary data of the length given by <number>
<list>   ::= (<BSE>*)
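Read in the other direction, the grammar also tells us how to produce such data. The following is a small sketch of an encoder, assuming the in-memory form is nested Python lists with bytes as atoms. The name encode_binary_sexp is our own choice for illustration and not an existing API.

```python
from typing import Union

Node = Union[bytes, list]


def encode_binary_sexp(node: Node) -> bytes:
    """Serialize bytes and nested lists following the grammar above.

    (Hypothetical helper for illustration only.)
    """
    if isinstance(node, bytes):
        # <atom> ::= <number>:<bytes>
        return str(len(node)).encode("ascii") + b":" + node
    # <list> ::= (<BSE>*)
    return b"(" + b"".join(encode_binary_sexp(child) for child in node) + b")"


print(encode_binary_sexp([b"curve", b"Ed25519"]))  # b'(5:curve7:Ed25519)'
```

Because every atom carries its own length, no escaping of parentheses or colons inside the raw bytes is needed.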
As the project we are working on is written in Python, it is only natural to parse this data into a recursive list of bytes and lists. If we do not care about performance (and in this case we do not), a very simple yet correct implementation might be as follows:
from typing import Optional, Union


def parse_binary_sexp(data: bytes) -> Optional[Union[bytes, list]]:
    """Reads libassuan binary S-Expression data into a nested list
    structure.

    Parameters:
        data: binary encoding of S-Expressions

    Returns:
        Nested list of bytes and lists, or None if the data is empty
        or malformed.
    """
    root = []
    stack = [root]  # stack[-1] is the list currently being filled
    idx = 0
    while idx < len(data):
        if data[idx : idx + 1] == b"(":
            # Open a new list inside the current one and descend into it.
            lst = []
            stack[-1].append(lst)
            stack.append(lst)
            idx += 1
        elif data[idx : idx + 1] == b")":
            # Close the current list; an unmatched ")" is malformed input.
            if len(stack) < 2:
                return None
            stack.pop()
            idx += 1
        else:
            # Atom: ASCII decimal length, a colon, then that many raw bytes.
            sep_idx = data.find(b":", idx)
            if sep_idx < 0:
                return None
            token_len = int(data[idx:sep_idx].decode("ascii"))
            stack[-1].append(data[sep_idx + 1 : sep_idx + 1 + token_len])
            idx = sep_idx + token_len + 1
    if len(root) == 0:
        return None
    return root[0]
It is not fast and it is not very nice, but it does the job. And as the YubiKey in question is just a placeholder for a real HSM that will come later on, it allows us to test the whole software stack as if a real HSM were involved, for little to no extra work!
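To show what working with the parsed result might look like, here is a sketch of a lookup helper that walks the nested lists and returns the value stored under a given name. The name find_value is hypothetical, and the q value below is a zero-filled placeholder, not a real public key. The sample tree mirrors the shape of the public-key example shown earlier.

```python
from typing import Optional, Union

Node = Union[bytes, list]


def find_value(tree: Node, name: bytes) -> Optional[bytes]:
    """Depth-first search for a two-element list [name, value].

    Returns the value bytes, or None when no such pair exists.
    (Hypothetical helper for illustration only.)
    """
    if not isinstance(tree, list):
        return None
    if len(tree) == 2 and tree[0] == name and isinstance(tree[1], bytes):
        return tree[1]
    for child in tree:
        found = find_value(child, name)
        if found is not None:
            return found
    return None


# Shape of the parsed public-key example; q is placeholder data.
parsed = [
    b"public-key",
    [
        b"ecc",
        [b"curve", b"Ed25519"],
        [b"flags", b"eddsa"],
        [b"q", b"\x40" + bytes(32)],
    ],
]
print(find_value(parsed, b"curve"))  # b'Ed25519'
```

A search by name keeps the calling code independent of the exact nesting depth, at the cost of returning only the first match.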
Hope you liked this roller-coaster ride to the depths of mysterious protocols used by your favourite cryptographic stack. See ya next time!