Posts Tagged ‘software’

Temporal Hex Dump

October 16th, 2009

After building some hardware to trace and inject data on the Nintendo DSi’s RAM bus, it became obvious pretty fast that there’s a lot of data there, and (as far as I know) no good tools for analyzing these sorts of logs.

The RAM tracer has already given us a lot of insight into how the DSi works by virtue of letting us inspect the boot process, the inter-processor communication, and most of the code that runs on the system. But all of that knowledge comes in an indirect way, from using the RAM tracer as a platform to run other experiments. I’ve been interested in figuring out whether there’s a way to use the RAM trace itself to help understand a system’s dynamic behaviour.

The RAM is on a packet-oriented bus, so it would make sense to have a tool that looks kind of like a packet-based protocol analyzer. Think Wireshark, but for memory.

But there are also a lot of complex patterns that show up over time. As the DS loads a file, or initializes itself, or renders frame after frame of a UI, there are obvious patterns that emerge. So it also might make sense to have a visual tool, like vusb-analyzer.

Unfortunately, both of these approaches ignore the spatial organization of memory. The bus is a stream of packets that say ‘read’ or ‘write’, but the contents of RAM as a whole are more like a file that’s changing over time, as in a version control system.

So the tool I’ve been imagining is kind of a hybrid of these. It would have a graphical timeline that helps you visually navigate through large datasets and identify timing patterns. It would have a packet-by-packet listing of the reads and writes. And most importantly, it would be a hex dump tool. But instead of showing a hex dump of a static file, it would be a two-dimensional hex dump. The hex dump shows space, but you can also scrub forward or backward in time, and watch the hex dump change. The hex dump could be annotated with colors, to show which data is about to change, or which data recently changed. You could right click on a byte, and see hyperlinks to the memory transactions that are responsible for that byte’s previous and next values.
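To make the "hex dump that changes over time" idea concrete, here is a minimal sketch of the two core queries: rebuilding the contents of memory as of a given time, and finding the write responsible for a byte's current value. The WriteRecord layout and the Trace_* names are invented for illustration and are not taken from the actual tool. The linear replay is the naive approach; a gigabyte-sized log would need an index instead of a scan from the start.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical log record: one write transaction seen on the bus. */
typedef struct {
    uint64_t time;      /* Timestamp in arbitrary ticks; log is sorted by time */
    uint32_t addr;      /* Address of the first byte written */
    uint8_t  len;       /* Number of valid bytes in data[] */
    uint8_t  data[8];
} WriteRecord;

/*
 * Replay every write with time <= t into 'mem', giving the contents
 * of memory as of time t. Writes that fall outside 'mem' are skipped.
 * Returns the number of log records consumed.
 */
size_t
Trace_ReplayUntil(const WriteRecord *log, size_t count, uint64_t t,
                  uint8_t *mem, size_t memSize)
{
    size_t i;

    for (i = 0; i < count && log[i].time <= t; i++) {
        const WriteRecord *w = &log[i];

        assert(w->len <= sizeof w->data);
        if (w->addr < memSize && w->len <= memSize - w->addr) {
            memcpy(mem + w->addr, w->data, w->len);
        }
    }
    return i;
}

/*
 * Find the most recent write at or before time t that touched 'addr'.
 * Returns its index in the log, or -1 if the byte was never written.
 */
long
Trace_PrevWriter(const WriteRecord *log, size_t count, uint64_t t,
                 uint32_t addr)
{
    long found = -1;
    size_t i;

    for (i = 0; i < count && log[i].time <= t; i++) {
        if (addr >= log[i].addr && addr - log[i].addr < log[i].len) {
            found = (long)i;
        }
    }
    return found;
}
```

Scrubbing the timeline then amounts to calling Trace_ReplayUntil() with a different t and redrawing the hex dump, and the "previous value" hyperlink on a byte is whatever Trace_PrevWriter() returns for it.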

As far as I know, nobody’s written a tool like this. So I have no idea how useful it will actually be for reverse engineering or performance optimization, but it seems like a promising experiment at least. So far I’ve been working on an indexing and caching infrastructure to make it possible to interactively browse these huge memory dumps, and I’ve been working on the visual timeline widget. Here’s a quick screencast:

The top section shows read/write/zero activity binned by address, with each vertical pixel representing about 64 kB. The horizontal axis is time, with continuous zooming. The bottom section of the graph shows bandwidth, color-coded according to read/write/zero. Blue pixels are reads, red pixels are writes, and orange pixels are writes of a zero byte.
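The binning itself is simple. As a rough sketch, assuming one counter per 64 kB bin and per activity type for each column of the timeline, it could look like the code below. The names, the ActivityColumn structure, and the choice of 256 bins are all assumptions made for this example, not details of the real widget.

```c
#include <assert.h>
#include <stdint.h>

#define BIN_SHIFT  16    /* 2^16 bytes = 64 kB per vertical pixel */
#define NUM_BINS   256   /* Arbitrary address range chosen for this sketch */

typedef enum {
    ACT_READ,            /* Drawn in blue */
    ACT_WRITE,           /* Drawn in red */
    ACT_ZERO,            /* Write of a zero byte, drawn in orange */
    ACT_COUNT
} Activity;

/* One column of the timeline: per-bin byte counts for each activity. */
typedef struct {
    uint32_t counts[NUM_BINS][ACT_COUNT];
} ActivityColumn;

/* Account for a single byte that was read or written at 'addr'. */
void
Activity_AddByte(ActivityColumn *col, uint32_t addr, int isWrite,
                 uint8_t value)
{
    uint32_t bin = addr >> BIN_SHIFT;
    Activity act;

    assert(bin < NUM_BINS);

    if (!isWrite) {
        act = ACT_READ;
    } else if (value == 0) {
        act = ACT_ZERO;
    } else {
        act = ACT_WRITE;
    }
    col->counts[bin][act]++;
}
```

Each pixel's color can then be picked from whichever counter dominates its bin, and summing a column's counters per activity gives the bandwidth figure for that time slice.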

This log file is about a gigabyte of raw data, or about 2 minutes of wallclock time. It shows the Opera browser on the Nintendo DSi loading a very large web page, then crashing. You can see its heap growing, and you can watch the memory access patterns of code, data, and inter-processor communication.

There’s a lot of room for improvement, but I’m optimistic that this will be at least a useful tool for understanding the DSi, and maybe even a more generally applicable tool for reverse engineering and optimization.

As usual, the source is in svn if anyone’s interested. It’s implemented with C++, wxWidgets, sqlite3, and Boost. I’ve only tested it on Linux, but it “should” be portable.

Introducing Metalkit

March 13th, 2008

Metalkit is another of my random side-projects. It’s a very simple library for writing programs that run on IA32 (x86) machines on the bare metal. It isn’t an operating system, but it does contain some of the low-level pieces you might use to create one.

I created it partly for fun and for the challenge, and partly to use as a framework for low-level hardware testing at work. It is open source, released under an MIT-style license.

Features currently include:

  • A 512-byte bootloader that works either as a floppy disk MBR or a GNU Multiboot image. When you build a program with Metalkit, the same binary image can be used either as a raw floppy disk image or as a “kernel” image in GRUB. This makes it easy to use your programs on virtual machines (VMware, QEMU), emulators (Bochs), or real machines.
  • Basic PCI bus support. You can scan for PCI devices, find out what resources (I/O ports, memory, IRQs) they have, and poke at their configuration registers.
  • VGA text mode.
  • A very tiny zlib-compatible decompressor, the “puff” reference implementation of DEFLATE.
  • Low-level support for the PIT timer.
  • A small, efficient, and powerful interrupt subsystem. ISR trampolines are assembled at runtime, saving space in the binary. Any ISR can execute the equivalent of a longjmp(3) on return, making simple thread context-switching very easy. Includes basic PIC interrupt routing. Includes default fault handlers which dump CPU registers and the stack any time an unhandled fault occurs.
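
As background for the PCI item above: on a PC, configuration registers are conventionally reached through the legacy "configuration mechanism #1", where you write an address word to I/O port 0xCF8 and then read or write the data at port 0xCFC. The sketch below only shows how that address word is built. It is not Metalkit's code, and the PCIExample_ name is made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

#define PCI_CONFIG_ADDR_PORT  0xCF8   /* Write the address word here... */
#define PCI_CONFIG_DATA_PORT  0xCFC   /* ...then access the data here */

/*
 * Build the 32-bit address word for PCI configuration mechanism #1.
 * Bit 31 enables the access, followed by bus, device, and function
 * fields. The register offset is rounded down to a 32-bit boundary.
 */
uint32_t
PCIExample_ConfigAddress(uint8_t bus, uint8_t device, uint8_t function,
                         uint8_t offset)
{
    assert(device < 32);
    assert(function < 8);

    return 0x80000000u |
           ((uint32_t)bus << 16) |
           ((uint32_t)device << 11) |
           ((uint32_t)function << 8) |
           (offset & 0xFCu);
}
```

The port accesses themselves need privileged I/O instructions, which is exactly the kind of thing a bare-metal program is free to do.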

Metalkit could be useful for educational purposes, because programs written with Metalkit are extremely small and self-contained. This example is a complete Metalkit program which lists all devices on the PCI bus:

#include "types.h"
#include "vgatext.h"
#include "pci.h"
#include "intr.h"

int
main(void)
{
    PCIScanState busScan = {};

    Intr_Init();
    Intr_SetFaultHandlers(VGAText_DefaultFaultHandler);

    VGAText_Init();
    VGAText_WriteString("Scanning for PCI devices:\n\n");

    while (PCI_ScanBus(&busScan)) {
        VGAText_Format(" %2x:%2x.%1x  %4x:%4x\n",
                       busScan.addr.bus, busScan.addr.device,
                       busScan.addr.function, busScan.vendorId,
                       busScan.deviceId);
    }

    VGAText_WriteString("\nDone.\n");

    return 0;
}

This example compiles to a 2962-byte image, and uses only about 1500 lines of library code. This is great for educational purposes, because it is practical to understand the purpose of every byte in that compiled image, and when this example is running, that’s the only code running on your computer.

Another example included with the source is a simple pre-emptive thread scheduler implemented in 152 lines of C. Metalkit itself doesn’t know anything about threads or multitasking, but it’s possible to use Metalkit’s interrupt trampoline as a thread context switch. This example creates two busy-looping threads. Each thread prints its name, and the “Task 2” thread also increments a counter. The example switches threads round-robin style on every timer interrupt. Here’s the tiny example running in Bochs:

If you want to play with Metalkit, all you need is an x86-compatible PC and a copy of the GNU toolchain (GCC and Binutils). You also probably want Subversion so you can check out the Metalkit source from http://svn.navi.cx/misc/trunk/metalkit/.
Also, if you’re interested in OS development or just hacking on the bare metal, the OSDev.org Wiki is an invaluable resource.

Enjoy.


Borg’ed up to the Applesphere

October 5th, 2006

As of this weekend, I now own a shiny new 8GB iPod nano. Actually, it isn’t very shiny at all, but it should resist scratches nicely compared to the first-generation Nano.

But wait? Could this really be true? After all, I still have a perfectly good 20GB Rio Karma, which only needed slight hardware modifications to work properly! Well, the truth is that I haven’t used the Rio in a while now. The UI is clunky, it’s hard to upload songs to it despite all the Python software I wrote to do that job, and it’s gigantic.

I realize that these words are coming from the owner of a Nokia 6620, a phone which could swallow some PDAs whole. Thus it’s perhaps a bit unfair of me to give my cell phone such a generous pocket space allotment. But honestly, the Nokia is worth it. The Rio was not. I needed a music player I would actually carry around with me. Something tiny, with good battery life and at least a few gigabytes of space.

So, I finally got an iPod nano. It certainly has tiny covered, and 8GB is much more than my car’s MP3 CD changer holds. The 2nd gen Nano is supposed to have pretty good battery life, especially compared to a giant hard disk player like the Karma.

But wait? I hear you asking, “Is this really the same Micah who had failure after failure with his old iBook?” “Is this really the same guy who flails in agony as he writes Mac OS code by day?” “Is this really the same Micah who does cruel and unnatural things to Apple hardware?” Yes. Yes it is.

It didn’t take long for me to become completely frustrated by gnupod and gtkpod. Gnupod is kind of a nice idea, but why does it need a separate XML database, parallel to the iTunesDB? Its wrappers to automatically transcode flac and ogg are a nice idea, but it repeatedly trashed its own database or left orphaned MP3s on my iPod. Gtkpod is nice for browsing and for uploading one or two songs, but it’s quite tedious for large uploads. I should never have to wait on the UI while it scans ID3 tags or transfers files. It gets especially sluggish when adding songs from smbfs mounts, and it still requires these explicit database read/sync steps. Why can’t it Just Work?

Then I decided to eat some of my own USB dogfood, and run iTunes in a VM. Don’t try this at home just yet: ugly workarounds are necessary to use an iPod with versions of VMware which don’t support USB 2.0. This actually worked fairly well, and probably would have been very smooth if I weren’t using a debug build. I even pointed Windows, within the VM, directly at my Samba server. The VM handled this well, but iTunes gets pretty sluggish when adding songs to your library from a network disk. In spite of this, I used this setup to happily copy several gigs of music to my iPod.

But I was still grumpy. As nice as it is to have the full functionality of iTunes available when I need it, iTunes is a big hammer. It’s a heavyweight monolithic solution that I can’t automate in any way. I want to effortlessly queue up albums to copy from the command line, without sacrificing my entire music collection to the iTunes Library. I want to seamlessly use other tools like sox or the oh-so-sexy mp3fs to transcode audio from FLAC or Ogg into MP3 on-the-fly.

Enter FUSEPod, which actually hasn’t managed to annoy me at all yet. It uses FUSE to create a filesystem that represents the songs on your iPod in an artist/album/genre/etc. tree, exactly as they appear in the iPod’s menu system. These are real MP3s, and you can play them or delete them using ordinary command line tools. To upload songs, just dump them into a special directory or into a list of absolute paths, and use an included shell script to trigger a sync operation. Flawless. No database hoops, no corruption (yet). I can effortlessly copy entire albums from my local disk, a network mount, or anything else. I can even transcode FLAC to MP3 in real time by giving fusepod paths into an mp3fs mount. Insane.

This is what good software should be: simple, intuitive, and true to the UNIX philosophy: do one job, do it well, and play nicely with others. FUSE seems to provide a solid foundation for such tools. Fusepod and mp3fs are working great together, enough so that I might be inspired now to find my own abuses for FUSE.