An introduction to firmware analysis
This talk by Stefan Widmann gives an introduction to firmware analysis: It starts with how to retrieve the binary, e.g. get a plain file from manufacturer, extract it from an executable or memory device, or even sniff it out of an update process or internal CPU memory, which can be really tricky.
After that it introduces the necessary tools, gives tips on how to detect the processor architecture, and explains some more advanced analysis techniques, including how to figure out the offsets where the firmware is loaded to, and how to start the investigation.
The talk from the 30th Chaos Communication Congress focuses on the different steps to be taken to acquire and analyze the firmware of an embedded device, especially without knowing anything about the processor architecture in use. Frequently datasheets are not available or do not name any details about the used processor or System on Chip (SoC).
First the prerequisites, like knowledge about the device under investigation, the ability to read assembly language, and the tools of the trade for acquisition and analysis, are shown. The question “How do I get the firmware out of device X?” makes the next big chapter: From easy to hard we pass through the different kinds of storage systems and locations a firmware can be stored to, the different ways the firmware gets transferred onto the device, and which tools we can use to retrieve the firmware from where it resides. The next step is to analyze the gathered data.
Is it compressed in any way? For which of the various different processor architectures out there was it compiled for? Once we successfully figured out the CPU type and we’ve found a matching disassembler, where do we start to analyze the code? Often we have to find out the offset where the firmware is loaded to, to get an easy-to-analyze disassembler output. A technique to identify these offsets will be shown. The last chapter covers the modifications we can apply to the firmware, and what types of checksum mechanisms are known to be used by the device or the firmware itself to check the integrity of the code.