3.3 Dumping File Contents

Data stored in modern computers, in both volatile memory and persistent files, is fundamentally a sequence of entities called bytes. Each byte can be addressed by its position in the sequence, starting with zero:

+--------+--------+--------+ ... +--------+
| byte 0 | byte 1 | byte 2 |     | byte N |
+--------+--------+--------+ ... +--------+

Each byte has the capacity to store a small unsigned integer in the range 0..255. Therefore, the IO spaces that we edit with poke (like the file foo.o) can be seen as a sequence of small numbers, as depicted in the figure above.

GNU poke provides a command whose purpose is to display the values of these bytes: dump (1). It is called like that because it dumps ranges of bytes to the terminal, allowing the user to inspect them.

So let’s use our first poke command! Fire up poke, open the file foo.o as explained above, and execute the dump command:

(poke) dump
76543210  0011 2233 4455 6677 8899 aabb ccdd eeff  0123456789ABCDEF
00000000: 7f45 4c46 0201 0100 0000 0000 0000 0000  .ELF............
00000010: 0100 f700 0102 0000 0000 0000 0000 0000  ................
00000020: 0102 0000 0000 0000 9801 0000 0000 0000  ................
00000030: 0000 0000 4000 0000 0000 4000 0800 0700  ....@.....@.....
00000040: 2564 0a00 0000 0000 0000 0000 0000 0000  %d..............
00000050: b702 0000 0100 0000 1801 0000 0000 0000  ................
00000060: 0000 0000 0000 0000 8510 0000 ffff ffff  ................
00000070: b700 0000 0000 0000 9500 0000 0000 0000  ................
(poke)

What are we looking at?

The first line of the output, starting with 76543210, is a ruler. It is there to help us to visually determine the location (or offset) of the data.

The rest of the lines show the values of the bytes that are stored in the file, 16 bytes per line. The first column in these data lines shows the offset, in hexadecimal and measured in bytes, at which the row of data starts. For example, the first byte shown in the third data line has offset 0x20 in the file, the second byte has offset 0x21, and so on. Note how the data rows show the values of the individual bytes in hexadecimal. Generally speaking, when dealing with bytes (and binary data in general) it is useful to manipulate magnitudes in hexadecimal or octal. This is because each digit in these bases corresponds to a small group of bits (four and three respectively) in the equivalent binary representation. In this case, each pair of hexadecimal digits denotes the value of a single byte (2). For example, the value of the first byte in the third data row is 0x01, the value of the second byte is 0x02, and so on.
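The correspondence between hexadecimal digits and groups of bits can be checked outside of poke. The following is a minimal Python sketch, not part of poke itself, that takes the value of the first byte in the dump (0x7f) and shows how its eight bits split into two groups of four, one per hexadecimal digit. The variable names are illustrative only.

```python
# Each hexadecimal digit covers exactly four bits; each octal digit covers three.
value = 0x7f  # value of the byte at offset 0 in the dump above

binary = format(value, "08b")   # the byte as eight binary digits
high_nibble = binary[:4]        # the four most significant bits
low_nibble = binary[4:]         # the four least significant bits

# Convert each group of four bits back into one hexadecimal digit.
hex_digits = format(int(high_nibble, 2), "x") + format(int(low_nibble, 2), "x")
print(binary, high_nibble, low_nibble, hex_digits)  # prints: 01111111 0111 1111 7f

# Octal groups the same bits in threes, counted from the right: 01 111 111.
octal_digits = format(value, "o")
print(octal_digits)  # prints: 177
```

Two hexadecimal digits always describe exactly one byte, which is why dump output uses that base.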

Using the ruler and the column of offsets, locating bytes in the data is very easy. Let’s say for example we are interested in the byte at offset 0x68: we use the first column to quickly find the row starting at 0x60, and the ruler to find the column marked with 88. Cross column and row and… voila! The byte in question has the value 0x85. The reverse process is just as easy. What is the offset of the first 0x40 in the file? Try it!
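The row and column lookup described above is simple arithmetic on the offset. As a rough illustration in Python (the helper names locate and offset_of are hypothetical and do not exist in poke), assuming 16 bytes per row as in the dump shown:

```python
def locate(offset, bytes_per_row=16):
    """Split a byte offset into (offset of its row, column within the row)."""
    column = offset % bytes_per_row
    row_start = offset - column
    return row_start, column


def offset_of(row_start, column):
    """Reverse lookup: combine a row offset and a column into a byte offset."""
    return row_start + column


row_start, column = locate(0x68)
print(f"row {row_start:08x}, column {column:x}")  # prints: row 00000060, column 8
print(hex(offset_of(0x60, 8)))                    # prints: 0x68
```

With 16 bytes per row, the column is simply the last hexadecimal digit of the offset, which is what the ruler spells out.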

The section at the right of the output is the ASCII output. It shows the row of bytes at the left interpreted as ASCII characters. Non-printable characters are shown as . to avoid scrambling the terminal, and yes, there is actually a way to customize what character to use, so they are not confused with real ASCII dot characters (0x2e) :P In this particular dump we can see that near the beginning of the file there are three bytes whose values, if interpreted as ASCII characters, form the string “ELF”. As we shall see, this is part of the ELF magic number. Again, the ruler is very useful to locate the byte corresponding to some character in the ASCII section, or the other way around. What is the value of the byte corresponding to the F in ELF? Try it!
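To make the ASCII column concrete, here is a small Python sketch that renders the first row of the dump in the same way. It is not poke's implementation: the function name ascii_column is hypothetical, and the sketch assumes that "printable" means the range 0x20 to 0x7e, which may not match poke's exact rule.

```python
def ascii_column(row, placeholder="."):
    """Render bytes as ASCII text, replacing non-printable bytes.

    Assumption: bytes in the range 0x20..0x7e count as printable.
    """
    return "".join(chr(b) if 0x20 <= b <= 0x7e else placeholder for b in row)


# The 16 bytes of the first data row in the dump above.
first_row = bytes.fromhex("7f454c46020101000000000000000000")

print(ascii_column(first_row))                   # prints: .ELF............
print(ascii_column(first_row, placeholder="?"))  # prints: ?ELF????????????
```

The second call shows why a configurable placeholder is handy: with a character other than a dot, a genuine 0x2e byte can no longer be mistaken for a non-printable one.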

Something to notice in the dump output above is that these are not, by any means, the complete contents of the file foo.o. The .info ios dot-command informed us in the last section that foo.o contains 920 bytes, of which the dump command only showed us… 0x80 bytes, or 128 bytes in decimal.

dump is certainly capable of showing more (and less) than 128 bytes. We can ask dump to display some given amount of data by specifying its size using a command argument. For example:

(poke) dump :size 64#B
76543210  0011 2233 4455 6677 8899 aabb ccdd eeff  0123456789ABCDEF
00000000: 7f45 4c46 0201 0100 0000 0000 0000 0000  .ELF............
00000010: 0100 f700 0102 0000 0000 0000 0000 0000  ................
00000020: 0102 0000 0000 0000 9801 0000 0000 0000  ................
00000030: 0000 0000 4000 0000 0000 4000 0800 0700  ....@.....@.....

The command above asks poke to “dump 64 bytes”. In this example :size is the name of the argument, and 64#B is the argument’s value. Again, the suffix #B tells poke we want to dump 64 bytes, not 64 kilobits nor 64 potatoes.

Another interesting aspect of our first dump (ahem) is that the dumped bytes start from the beginning of the file, i.e. the offset of the first byte is 0x0. Certainly there should be other areas of the file with interesting contents for us to inspect. To that purpose, we can use yet another option, :from:

(poke) dump :size 64#B :from 128#B
76543210  0011 2233 4455 6677 8899 aabb ccdd eeff  0123456789ABCDEF
00000080: 1400 0000 0000 0000 0000 0000 0000 0000  ................
00000090: 0000 0000 0000 0000 0000 0000 0000 0000  ................
000000a0: 0000 0000 0300 0100 0000 0000 0000 0000  ................
000000b0: 0000 0000 0000 0000 0000 0000 0300 0300  ................

The command above asks poke to “dump 64 bytes starting at 128 bytes from the beginning of the file”. Note how the first row of bytes starts at offset 0x80, i.e. 128 in decimal.
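The effect of :from and :size can be imitated with a few lines of Python. This is only a sketch of the output layout under stated assumptions, not how poke implements dump: the function dump_rows and its parameters start and size are hypothetical stand-ins for the command and its options, the printable range 0x20 to 0x7e is assumed, and the data is a synthetic sequence of 256 bytes rather than the contents of foo.o.

```python
RULER = "76543210  0011 2233 4455 6677 8899 aabb ccdd eeff  0123456789ABCDEF"


def dump_rows(data, start=0, size=128):
    """Return dump-like lines for the bytes data[start:start + size]."""
    lines = [RULER]
    chunk = data[start:start + size]
    for i in range(0, len(chunk), 16):
        row = chunk[i:i + 16]
        digits = row.hex()
        # Group the hexadecimal digits in fours, i.e. two bytes per group.
        groups = " ".join(digits[j:j + 4] for j in range(0, len(digits), 4))
        # Assumption: bytes in 0x20..0x7e are printable, the rest become dots.
        text = "".join(chr(b) if 0x20 <= b <= 0x7e else "." for b in row)
        # Pad the hex area to 39 columns so a short last row stays aligned.
        lines.append(f"{start + i:08x}: {groups:<39}  {text}")
    return lines


# Synthetic stand-in data: byte N simply holds the value N.
data = bytes(range(256))

for line in dump_rows(data, start=64, size=32):
    print(line)
```

Here start plays the role of :from and size the role of :size. The offset column begins at the requested start, just as the poke output above begins at 0x80.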

Passing options to commands is easy and natural, but we may find ourselves passing the same values again and again to certain command options. For example, if dump's default size of 128 bytes is not what you prefer, because you have a particularly tall monitor, or you are one of those people using sub-atomic sized fonts, it can be tiresome and error-prone to pass :size to dump every time you use it. Fortunately, the default size can be customized by setting a global variable:

(poke) pk_dump_size = 160#B

This tells poke to set 160 bytes as the new value for the pk_dump_size variable. This is a global variable that the dump command uses to determine how much data to show if the user doesn’t specify an explicit value with the :size option. Many other commands use the same strategy in order to alter their default behavior, not just dump.

And now that we are talking about that, it is also cumbersome to have to set the default size used by dump every time we run poke. But no problem, just set the variable in a file called .pokerc in your home directory, like this:

pk_dump_size = 160#B

Every time poke starts, it reads ~/.pokerc and executes the commands contained in it. See .pokerc.

The dump command is very flexible, and accepts a lot of options and customization variables that we won’t be covering in this chapter. For a complete description of the command, see dump.


Footnotes

(1)

Note that this is not a dot-command like .file, .ios or .close: dump does not start with a dot! We will see later how dot-commands differ from “normal commands” like dump, but for now, let’s ignore the distinction.

(2)

Do not be fooled by the fact that dump shows the hexadecimal digits in groups of four: this is just a visual aid and, as we shall see, it is possible to change the grouping by passing arguments to dump.