Firmware Triage and Reverse Engineering Workflow

Author: Marcos Azevedo

Date: 2026-01-20

Last Modified: 2026-01-20

Reading Time: 5 mins

Section: Series

Tags: c-lang programming risc-v

TL;DR

You’ll learn a practical workflow to go from an unknown firmware file (.bin, .img, .fw, sometimes .elf) to a structured understanding:
- identify file type and architecture,
- locate code/data boundaries,
- recover load addresses and entry points,
- and choose the right next tools (disassembly, decompilation, emulation, or QEMU tracing).
You’ll practice repeatable triage steps that work well for embedded targets (including RISC-V).

Important

Reverse engineering firmware is most successful when you treat it like an investigation with checkpoints, not a “click around in a disassembler” activity.

1. Firmware file types: what you might get

Common formats

ELF: best case (symbols, sections, entry point may exist)
Raw binary (.bin): flat bytes, no addresses
Container images: may embed file systems or multiple partitions (e.g., update bundles)
Compressed blobs: LZMA, gzip, etc.

A core reality

A raw binary does not tell you:

where it loads in memory
where execution starts
what architecture it is

You must infer these from context.

2. The triage checklist (do this every time)

Step 1: Identify the file type

file firmware.bin

If it’s ELF, you’re in a much easier situation.

Step 2: Quick entropy/structure sense

hexdump -C firmware.bin | head
strings -a firmware.bin | head

Look for:

ASCII strings (boot messages, paths, version)
magic bytes (e.g., 7f 45 4c 46 for ELF)
long runs of 00 or ff (often padding/erased flash)

Tip

If strings returns lots of readable paths like /etc/ or /bin/, you may be looking at an embedded Linux filesystem image.
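The step title mentions entropy, which hexdump and strings only hint at. As a rough sketch (the function names and the 4096-byte window are assumptions chosen for illustration), Shannon entropy per window can separate plain code and data (mid-range values) from compressed or encrypted regions (close to 8 bits per byte) and padding (close to 0):

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    # Sum -p * log2(p) over every byte value that occurs at least once.
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def entropy_by_window(data: bytes, window: int = 4096):
    """Yield (offset, entropy) for consecutive fixed-size windows of data."""
    for offset in range(0, len(data), window):
        yield offset, shannon_entropy(data[offset:offset + window])
```

To use it, read the firmware with open("firmware.bin", "rb").read() and print each (offset, entropy) pair. A sharp jump in entropy between neighbouring windows is often a boundary worth inspecting by hand.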

Step 3: Search for signatures

Even without specialized tools you can search for patterns:

grep -aobU $'\x7fELF' firmware.bin | head

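Each hit is printed as offset:match. A hit at a nonzero offset suggests an ELF embedded inside a larger blob. Treat hits as candidates only, since four magic bytes can also occur by chance in data.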
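The same search extends to several signatures at once. The sketch below is a minimal illustration, not a replacement for a dedicated carving tool. The SIGNATURES table is a small hand-picked subset (ELF, gzip, xz, little-endian SquashFS), and the function name is hypothetical:

```python
# Illustrative subset of magic values; extend it for your own targets.
SIGNATURES = {
    "ELF": b"\x7fELF",
    "gzip": b"\x1f\x8b\x08",
    "xz": b"\xfd7zXZ\x00",
    "SquashFS (little-endian)": b"hsqs",
}


def find_signatures(data: bytes, signatures=SIGNATURES):
    """Return a sorted list of (offset, name) for every signature occurrence."""
    hits = []
    for name, magic in signatures.items():
        start = 0
        while True:
            pos = data.find(magic, start)
            if pos == -1:
                break
            hits.append((pos, name))
            start = pos + 1  # keep searching after this hit
    return sorted(hits)
```

Short magic values produce false positives, so validate every hit, for example by carving the candidate and running file on it.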

Step 4: If ELF: extract structure immediately

readelf -h firmware.elf
readelf -S firmware.elf
readelf -l firmware.elf
readelf -s firmware.elf | head

Key questions:

Is it ELF32 or ELF64?
Machine = RISC-V?
What is the entry point?
Which segments are loadable (PT_LOAD)?
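When readelf is not available, the first three questions can be answered from the fixed part of the ELF header. This is a minimal sketch under the standard ELF layout (16-byte e_ident, then e_type and e_machine, with e_entry at offset 24). The function name and the returned dictionary keys are assumptions made for illustration:

```python
import struct

EM_RISCV = 243  # e_machine value assigned to RISC-V


def parse_elf_header(data: bytes) -> dict:
    """Extract class, endianness, machine and entry point from an ELF header."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] not in (1, 2) or data[5] not in (1, 2):
        raise ValueError("unexpected EI_CLASS or EI_DATA value")
    elf_class = "ELF32" if data[4] == 1 else "ELF64"
    endian = "<" if data[5] == 1 else ">"
    # e_type and e_machine are two 16-bit fields right after e_ident.
    _e_type, e_machine = struct.unpack_from(endian + "HH", data, 16)
    # e_entry is 4 bytes wide in ELF32 and 8 bytes wide in ELF64.
    entry_fmt = "I" if elf_class == "ELF32" else "Q"
    (e_entry,) = struct.unpack_from(endian + entry_fmt, data, 24)
    return {
        "class": elf_class,
        "little_endian": endian == "<",
        "machine": e_machine,
        "is_riscv": e_machine == EM_RISCV,
        "entry": e_entry,
    }
```

Loadable segments (PT_LOAD) live in the program header table, which this sketch does not parse; readelf -l remains the quickest way to list them.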

3. If you have a .bin: how to recover likely load address

Strategy A: From the platform memory map

If you know the target memory map (for example, QEMU virt), you often know typical RAM/flash addresses.

Many RV32 bare-metal examples start around 0x80000000 for RAM on QEMU virt.
Real SoCs vary wildly; use datasheets or boot logs.

Strategy B: From vector tables / reset patterns

On some architectures, the reset vector has a recognizable structure. On RISC-V, boot code often begins with a small prologue and jumps; patterns are less standardized than ARM vector tables, but you can still hunt for:

plausible prologue sequences
references to known MMIO regions
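One concrete prologue pattern to hunt for is the stack adjustment addi sp, sp, -N that opens many non-leaf functions in uncompressed RV32/RV64 code. In its 32-bit encoding the low 20 bits are fixed at 0x10113 and the top 12 bits hold the immediate. The sketch below assumes little-endian, 4-byte-aligned instructions. It will miss code built with the compressed (C) extension, which uses different encodings. The function name is hypothetical:

```python
import struct


def find_sp_adjust_prologues(data: bytes):
    """Return (offset, frame_size) for each `addi sp, sp, -N` on a 4-byte boundary."""
    hits = []
    for offset in range(0, len(data) - 3, 4):
        (word,) = struct.unpack_from("<I", data, offset)
        # Low 20 bits fix opcode=OP-IMM (0x13), rd=sp (x2), funct3=ADDI, rs1=sp (x2).
        if word & 0x000FFFFF != 0x00010113:
            continue
        imm = word >> 20  # raw 12-bit immediate
        if imm & 0x800:   # sign bit set: negative adjustment, i.e. a prologue
            hits.append((offset, 0x1000 - imm))
    return hits
```

A cluster of hits with small, 16-byte-aligned frame sizes is a reasonable sign that the region is RISC-V code; scattered single hits are more likely coincidence.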

Strategy C: From absolute addresses in code

If the firmware includes absolute addresses (MMIO registers, RAM ranges), those addresses can reveal the platform.

Scan for aligned 32-bit values that look like addresses (e.g., high bits consistent)
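A rough way to run such a scan is to bucket every aligned 32-bit word by its high bits and see which buckets dominate. The sketch below assumes little-endian data and uses the top 12 bits as the bucket. Both choices, and the function name, are illustrative assumptions:

```python
import struct
from collections import Counter


def address_region_histogram(data: bytes, top_n: int = 5):
    """Count aligned little-endian 32-bit words by their top 12 bits."""
    regions = Counter()
    for offset in range(0, len(data) - 3, 4):
        (word,) = struct.unpack_from("<I", data, offset)
        if word in (0, 0xFFFFFFFF):
            continue  # skip zero padding and erased flash
        regions[word & 0xFFF00000] += 1
    return regions.most_common(top_n)
```

The result is noisy on code regions, because instruction words are counted too and RISC-V often builds addresses from lui/addi pairs rather than storing them whole. It tends to be more informative on data regions such as pointer tables.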

Note

This is where ELF knowledge helps: once you guess a base address, you can test whether disassembly “starts making sense”.

4. A practical “first disassembly” approach (without committing too early)

Even without a GUI tool, you can do a sanity disassembly pass if you know the architecture.

Example: disassemble a raw binary as RV32

If you have GNU binutils that support RISC-V:

riscv64-unknown-elf-objdump -D -b binary -m riscv:rv32 firmware.bin | less

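If the output is mostly illegal or garbage instructions, one of your assumptions is probably wrong:

wrong arch (rv64 vs rv32)
wrong endianness (rare for RISC-V)
wrong starting point (you are decoding a header, data, or a compressed region rather than code)
wrong base address (this does not make instructions illegal, but absolute addresses and displayed branch targets will point to the wrong places)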

Warning

objdump -b binary does not know the correct load address, so disassembly is “addressed” from 0 by default. If you have a candidate base, pass --adjust-vma=<base> to shift the displayed addresses, or compensate in your analysis tooling.

5. Carving: extracting sub-images from a blob

If you find an embedded ELF at offset O, extract it:

# O is the byte offset reported by grep (the number before the colon)
dd if=firmware.bin of=extracted.elf bs=1 skip="$O"
file extracted.elf

If it’s a real ELF, you can now use all Chapter 2 methods.
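Carving from offset O to the end of the blob keeps whatever follows the ELF. If you want a tighter cut, one common heuristic is to take the end of the section header table (e_shoff + e_shnum * e_shentsize) as the end of the file. This sketch assumes the section headers are the last thing in the ELF, which is typical for linker output but not guaranteed, and it fails on purpose for ELFs without section headers. The function names are hypothetical:

```python
import struct


def elf_size_from_section_headers(data: bytes, offset: int = 0) -> int:
    """Estimate the ELF length as e_shoff + e_shnum * e_shentsize."""
    ident = data[offset:offset + 16]
    if ident[:4] != b"\x7fELF":
        raise ValueError("no ELF magic at this offset")
    endian = "<" if ident[5] == 1 else ">"
    if ident[4] == 1:  # ELF32: e_shoff at 32, e_shentsize/e_shnum at 46
        (e_shoff,) = struct.unpack_from(endian + "I", data, offset + 32)
        e_shentsize, e_shnum = struct.unpack_from(endian + "HH", data, offset + 46)
    else:              # ELF64: e_shoff at 40, e_shentsize/e_shnum at 58
        (e_shoff,) = struct.unpack_from(endian + "Q", data, offset + 40)
        e_shentsize, e_shnum = struct.unpack_from(endian + "HH", data, offset + 58)
    if e_shoff == 0 or e_shnum == 0:
        raise ValueError("no section header table; size cannot be estimated this way")
    return e_shoff + e_shentsize * e_shnum


def carve_elf(data: bytes, offset: int) -> bytes:
    """Return the bytes of the ELF that starts at offset, trimmed by the estimate."""
    size = elf_size_from_section_headers(data, offset)
    return data[offset:offset + size]
```

Validate the result the same way as the dd output: run file and readelf -h on it, and compare against the untrimmed carve if anything looks truncated.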

If you find a filesystem or compression signature, you may need specialized tools (common in firmware work), but the workflow stays:

identify
extract
validate

6. Turning findings into a map (the most underrated skill)

Create a simple analysis note like:

Firmware X
- size: ...
- type: ...
- possible arch: rv32im
- strings: ...
- suspected load address: ...
- suspected entry point: ...
- notable constants: ...
- next action: (emulate / disassemble / find UART logs / look for update format)

This makes your work reproducible and easier to share.
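If you triage many images, the note skeleton can be generated so that the mechanical fields are never forgotten. This is a small sketch: the field list mirrors the template above, the SHA-256 line is an addition of this sketch (not part of the original template), and the function name is hypothetical:

```python
import hashlib

# Fields from the note template that the analyst fills in by hand.
FIELDS = [
    "type",
    "possible arch",
    "strings",
    "suspected load address",
    "suspected entry point",
    "notable constants",
    "next action",
]


def analysis_note(name: str, data: bytes, **findings) -> str:
    """Render an analysis note; unknown fields are left as '...'."""
    lines = [
        f"Firmware {name}",
        f"- size: {len(data)} bytes",
        f"- sha256: {hashlib.sha256(data).hexdigest()}",  # added for traceability
    ]
    for field in FIELDS:
        # Keyword arguments use underscores, e.g. possible_arch="rv32im".
        value = findings.get(field.replace(" ", "_"), "...")
        lines.append(f"- {field}: {value}")
    return "\n".join(lines)
```

Calling analysis_note("X", blob, possible_arch="rv32im") fills in the size, hash, and architecture guess and leaves every other field as a placeholder to complete during analysis.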

7. Minimal “firmware-style” practice lab (using your own sample)

Take build/ld_demo.elf (from Chapter 7) and convert it to .bin (for example, with your toolchain’s objcopy -O binary).
Pretend you don’t know what it is.
Use only file, hexdump, strings, and objdump -b binary to identify it.
Write down your best guess about:
- architecture,
- load address,
- what the code does.

Then compare with the truth using readelf on the original ELF.

Exercises

Embed an ELF into a larger blob (e.g., by concatenating with padding) and practice carving it out using grep -aob and dd.
Create a raw binary with a known base address assumption (e.g., your linker origin) and see how your disassembly changes if you assume the wrong base.
Pick 5 strings from a firmware image and write hypotheses about what subsystems they relate to.

How to test your answers

You can produce a short “analysis map” that someone else could follow.
Your extracted sub-images validate with file and readelf (when applicable).

Summary

You learned a repeatable firmware triage workflow: identify → extract → validate → map → choose next analysis step.

Next: dynamic analysis with Frida (Dynamic Instrumentation Toolkit): when it applies to IoT/firmware, what constraints exist, and how to do safe, reproducible hooking experiments.