4.13 Structured Integers

When we structure data using Poke structs, arrays and the like, we often use the same structure that a C programmer would use. For example, to model ELF RELA structures, which are defined in C like this:

typedef struct
{
  Elf64_Addr   r_offset;  /* Address */
  Elf64_Xword  r_info;    /* Relocation type and symbol index */
  Elf64_Sxword r_addend;  /* Addend */
} Elf64_Rela;

we could use something like this in Poke:

type Elf64_Rela =
  struct
  {
    Elf64_Addr r_offset;
    Elf64_Xword r_info;
    Elf64_Sxword r_addend;
  };

Here the Poke struct type is practically equivalent to its C counterpart. In both cases the fields are always stored in the given order, regardless of endianness or any other consideration.

However, there are situations where a stored integral value is to be interpreted as composite data. This is the case for the r_info field above, a 64-bit unsigned integer (Elf64_Xword) that is itself composed of several fields, depicted here:

 63                                          0
+----------------------+----------------------+
|       r_sym          |      r_type          |
+----------------------+----------------------+
MSB                                         LSB

To support this kind of composition of integers, C programmers usually resort either to bit masking (most often) or to C bit fields, which are often obscure and whose layout is largely implementation-defined. In the case of ELF, the GNU implementations define a few macros to access these “sub-fields”:

#define ELF64_R_SYM(i)         ((i) >> 32)
#define ELF64_R_TYPE(i)        ((i) & 0xffffffff)
#define ELF64_R_INFO(sym,type) ((((Elf64_Xword) (sym)) << 32) + (type))

Here ELF64_R_SYM and ELF64_R_TYPE extract the fields from an r_info, and ELF64_R_INFO composes one. This is typical of C data structures.
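As a minimal sketch of how these macros are typically used, the following C fragment composes an r_info, extracts its parts, and replaces only the type. The typedef of Elf64_Xword as uint64_t is an assumption made to keep the example self-contained, and set_r_type and check_r_info are hypothetical helpers that are not part of any ELF header.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: Elf64_Xword is a 64-bit unsigned integer.  */
typedef uint64_t Elf64_Xword;

#define ELF64_R_SYM(i)         ((i) >> 32)
#define ELF64_R_TYPE(i)        ((i) & 0xffffffff)
#define ELF64_R_INFO(sym,type) ((((Elf64_Xword) (sym)) << 32) + (type))

/* Hypothetical helper: replace only the type sub-field.  The whole
   value has to be recomposed, since the sub-field is not a real
   struct member that could be assigned directly.  */
Elf64_Xword
set_r_type (Elf64_Xword info, uint32_t type)
{
  return ELF64_R_INFO (ELF64_R_SYM (info), type);
}

/* Hypothetical self-check exercising the three macros.  */
void
check_r_info (void)
{
  Elf64_Xword info = ELF64_R_INFO (3, 7);

  assert (info == 0x0000000300000007ULL);
  assert (ELF64_R_SYM (info) == 3);
  assert (ELF64_R_TYPE (info) == 7);

  info = set_r_type (info, 8);
  assert (ELF64_R_SYM (info) == 3);
  assert (ELF64_R_TYPE (info) == 8);
}
```

Note that updating a single sub-field means extracting the other one and recomposing the whole value.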

We could of course mimic the C implementation in Poke:

fun Elf64_R_Sym = (Elf64_Xword i) uint<32>:
   { return i .>> 32; }
fun Elf64_R_Type = (Elf64_Xword i) uint<32>:
   { return i & 0xffff_ffff; }
fun Elf64_R_Info = (uint<32> sym, uint<32> rtype) Elf64_Xword:
   { return ((sym as Elf64_Xword) <<. 32) + rtype; }

However, this approach has a huge disadvantage: since we are not able to encode the logic of these “sub-fields” in proper Poke fields, they become second-class citizens, with all that implies: they cannot have constraints of their own, cannot be auto-completed, cannot be assigned individually, and so on.

But we can use the so-called integral structs! These are structs that are defined exactly like your garden variety Poke structs, with a small addition:

type Elf64_RelInfo =
  struct uint<64>
  {
    uint<32> r_sym;
    uint<32> r_type;
  };

Note the uint<64> added after struct. This can be any integer type (signed or unsigned). The fields of an integral struct must themselves be integral (this includes both integers and offsets), and the total size occupied by the fields must equal the size of the struct’s integer type. This is checked and enforced by the compiler.

The Elf64 RELA in Poke can then be encoded like this:

type Elf64_Rela =
  struct
  {
    Elf64_Addr r_offset;
    struct Elf64_Xword
    {
      uint<32> r_sym;
      uint<32> r_type;
    } r_info;
    Elf64_Sxword r_addend;
  };

When an integral struct is mapped from some IO space, the total number of bytes occupied by the struct is read as a single integer value, and then the values of the fields are extracted from it. A similar process is used when writing. This is what distinguishes it from a normal Poke struct.
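The difference can be illustrated with a rough C sketch. This is not poke’s actual implementation: all the names below are hypothetical, and little-endian byte order is assumed. The integral variant reads the eight bytes as one integer and then splits it, with the first declared field taking the most significant bits. The plain variant reads each field on its own, in declaration order.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the two extracted fields.  */
struct rel_info
{
  uint32_t r_sym;
  uint32_t r_type;
};

/* Read 4 bytes as a little-endian unsigned integer.  */
uint32_t
read_u32_le (const unsigned char *buf)
{
  uint32_t value = 0;
  for (int i = 3; i >= 0; i--)
    value = (value << 8) | buf[i];
  return value;
}

/* Read 8 bytes as a little-endian unsigned integer.  */
uint64_t
read_u64_le (const unsigned char *buf)
{
  uint64_t value = 0;
  for (int i = 7; i >= 0; i--)
    value = (value << 8) | buf[i];
  return value;
}

/* Integral struct: one 64-bit read, then field extraction.  */
struct rel_info
map_integral (const unsigned char *buf)
{
  uint64_t whole = read_u64_le (buf);
  struct rel_info info = { (uint32_t) (whole >> 32),
                           (uint32_t) (whole & 0xffffffff) };
  return info;
}

/* Normal struct: each field is read separately, in order.  */
struct rel_info
map_plain (const unsigned char *buf)
{
  struct rel_info info = { read_u32_le (buf), read_u32_le (buf + 4) };
  return info;
}

/* Hypothetical self-check: same bytes, different field values.  */
void
check_mapping (void)
{
  const unsigned char bytes[8] = { 0x07, 0, 0, 0, 0x03, 0, 0, 0 };

  assert (map_integral (bytes).r_sym == 3);
  assert (map_integral (bytes).r_type == 7);
  assert (map_plain (bytes).r_sym == 7);
  assert (map_plain (bytes).r_type == 3);
}
```

With big-endian data both approaches would yield the same field values. The distinction only shows when the byte order of the whole integer differs from the order in which the fields are declared.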

It is possible to obtain the integral value corresponding to an integral struct using a cast to an integral type:

(poke) type Foo = struct int<32> { int<16> hi; uint<16> lo; };
(poke) Foo { hi = 1 } as int<32>
0x10000

A useful idiom, which doesn’t require specifying an explicit integral type, is this:

(poke) type Foo = struct int<32> { int<16> hi; uint<16> lo; };
(poke) var x = Foo @ 0#B;
(poke) +x
0xfe00aa

These casts allow you to “integrate” struct values explicitly, but the compiler also implicitly promotes integral struct values to integers in all contexts where an integer is expected:

Note that the above list doesn’t include relational operators, since these operators work with struct values.