Control Flow and Data Access in RV32 Assembly
TL;DR
- You’ll learn how C control flow (if/else, loops, switch) becomes RV32 branch and jump patterns.
- You’ll understand how RV32 loads and stores address memory (base + offset), and how the compiler represents arrays, pointers, and structs.
- You’ll practice a repeatable approach to “reading” disassembly: identify inputs, identify memory references, reconstruct high-level structure.
Important
A huge part of reverse engineering is simply recognizing patterns: loop shapes, bounds checks, switch jump tables, and common library idioms.
1. RV32 data access: the only addressing mode you see most of the time
RV32 load/store instructions typically use a base register plus a signed 12-bit immediate offset, written as offset(base).
Examples:
- lw a0, 12(sp): load a 32-bit word from sp+12
- sw a1, 0(s0): store a 32-bit word to the address in s0
Why this matters
- Stack locals often look like lw/sw with sp or s0/fp.
- Struct fields often look like lw/sw with a constant offset.
- Arrays often combine an index calculation with a base register.
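As a rough illustration of the base + offset rule, the following C sketch models how the effective address of a load or store is formed. The helper names sext12 and effective_address are hypothetical and exist only for this example; the sketch assumes the 12-bit signed immediate used by the RV32I load/store encodings.

```c
#include <stdint.h>

/* Sign-extend a 12-bit immediate field to 32 bits. */
static int32_t sext12(uint32_t imm12) {
    imm12 &= 0xFFFu;                           /* keep only the low 12 bits */
    return (int32_t)(imm12 ^ 0x800u) - 0x800;  /* flip the sign bit, then subtract its weight */
}

/* Effective address of a load/store: base register value plus the
 * sign-extended offset, wrapping modulo 2^32 like the hardware adder. */
static uint32_t effective_address(uint32_t base, uint32_t imm12) {
    return base + (uint32_t)sext12(imm12);
}
```

With base = 0x1000, an immediate of 12 models lw a0, 12(sp) and yields 0x100C; an immediate field of 0xFFC encodes -4 and yields 0xFFC.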
2. Load/store size and signedness
Common integer loads:
- lb: load byte (sign-extend)
- lbu: load byte (zero-extend)
- lh: load halfword (16-bit, sign-extend)
- lhu: load halfword (zero-extend)
- lw: load word (32-bit)
Stores:
- sb, sh, sw
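To see these instructions come out of a compiler, a small set of widening and narrowing functions is usually enough. This is a minimal sketch with hypothetical function names; the instruction named in each comment is what an RV32 compiler would typically be expected to pick, not a guarantee.

```c
#include <stdint.h>

/* Signed 8-bit source widened to 32 bits: sign-extension (expect lb). */
int32_t widen_s8(const int8_t *p) { return *p; }

/* Unsigned 8-bit source widened to 32 bits: zero-extension (expect lbu). */
uint32_t widen_u8(const uint8_t *p) { return *p; }

/* Signed 16-bit source: sign-extension (expect lh). */
int32_t widen_s16(const int16_t *p) { return *p; }

/* Unsigned 16-bit source: zero-extension (expect lhu). */
uint32_t widen_u16(const uint16_t *p) { return *p; }

/* Narrowing stores keep only the low bits (expect sb and sh). */
void store_u8(uint8_t *p, uint32_t v) { *p = (uint8_t)v; }
void store_u16(uint16_t *p, uint32_t v) { *p = (uint16_t)v; }
```

The byte 0xFE reads back as -2 through widen_s8 and as 254 through widen_u8, which is exactly the difference between lb and lbu.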
Tip
Signedness often reveals intent. If you see lbu, it strongly suggests an unsigned 8-bit value (uint8_t) or a character treated as unsigned.
3. Hands-on: arrays and pointer arithmetic
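Create a C file with a function, sum_u32, that adds up the elements of a uint32_t array through a pointer p and an index i.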
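A minimal sketch of such a function follows. The name sum_u32 and the expression p[i] come from the exercises; the exact signature, including the element count n, is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed signature: sum n 32-bit elements starting at p. */
uint32_t sum_u32(const uint32_t *p, size_t n) {
    uint32_t total = 0;
    for (size_t i = 0; i < n; i++) {
        total += p[i];   /* element address = p + i * 4 */
    }
    return total;
}
```

Keeping the function free of I/O and globals makes the disassembly short enough to read line by line.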
Compile the file for RV32 and disassemble the resulting object file with objdump -d.
What you should see
A loop pattern that looks like:
- initialize counter
- compare counter vs bound
- compute element address = base + index*4
- load element
- add
- increment counter
- branch back
Typical address computation
On RV32, multiplying by 4 is often done with a left shift by 2 (slli), followed by an add of the base pointer and a load from the result.
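The shift-then-add computation can be spelled out in C. The helper names and the register names in the comment are illustrative assumptions; real compiler output will choose its own registers and may fold the steps differently.

```c
#include <stdint.h>

/* Byte offset of element i in a uint32_t array: i * 4 == i << 2. */
static uint32_t byte_offset_u32(uint32_t i) {
    return i << 2;
}

/* Load base[i] the way the machine does it: form base + (i << 2),
 * then perform one 32-bit load from that address. */
static uint32_t load_elem_u32(const uint32_t *base, uint32_t i) {
    const unsigned char *addr = (const unsigned char *)base + byte_offset_u32(i);
    /* Roughly corresponding RV32 sequence (registers are illustrative):
     *   slli t0, a1, 2      # t0 = i << 2
     *   add  t0, a0, t0     # t0 = base + (i << 2)
     *   lw   a0, 0(t0)      # a0 = 32-bit word at t0
     */
    return *(const uint32_t *)(const void *)addr;
}
```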
Note
When you later reverse firmware, recognizing “index << 2” is a fast way to spot 32-bit array indexing.
4. If/else becomes compare + branch
Create a C file with a function, clamp_u32, that clamps a uint32_t value using if/else comparisons.
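One plausible shape for this function is sketched below. Only the name clamp_u32 comes from the exercises; the lo and hi parameters are assumptions.

```c
#include <stdint.h>

/* Assumed signature: clamp x into the inclusive range [lo, hi]. */
uint32_t clamp_u32(uint32_t x, uint32_t lo, uint32_t hi) {
    if (x < lo) {          /* unsigned compare: expect bltu/bgeu */
        return lo;
    } else if (x > hi) {   /* unsigned compare with swapped operands */
        return hi;
    }
    return x;
}
```

At higher optimization levels the compiler may reorder the tests or replace a branch with a set-less-than instruction such as sltu, so the two builds are worth reading side by side.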
Build it at both -O0 and -O2, then compare the disassembly of the two builds.
RV32 branch instructions you’ll meet often
- beq, bne
- blt, bge (signed)
- bltu, bgeu (unsigned)
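The signed/unsigned split matters because the same bit pattern can compare differently. A small sketch with hypothetical function names shows the two cases; the instruction families in the comments are what is typically expected, not guaranteed output.

```c
#include <stdint.h>

/* Signed less-than: expect blt/bge when branching, or slt when materialized. */
int less_signed(int32_t a, int32_t b) { return a < b; }

/* Unsigned less-than: expect bltu/bgeu when branching, or sltu when materialized. */
int less_unsigned(uint32_t a, uint32_t b) { return a < b; }
```

The pattern 0xFFFFFFFF is -1 when signed, so it is less than 1, but it is the largest value when unsigned, so it is not.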
Important
Unsigned comparisons (< on uint32_t) tend to use bltu/bgeu. Signed comparisons tend to use blt/bge.
5. Loops: for and while patterns
The common shapes
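“Top-tested” loop: the exit condition is checked before each iteration, so the body may run zero times. The loop ends with an unconditional jump back to the test.
“Bottom-tested” loop: the body comes first and the conditional branch back to the top sits at the end. A guard check before the loop covers the zero-iteration case.
The optimizer can transform one into the other; turning a top-tested loop into a guarded bottom-tested loop is common because it leaves a single branch per iteration.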
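Both shapes can be written in C with goto so that each statement maps onto roughly one branch or jump. The function names are hypothetical and the instructions in the comments are illustrative.

```c
#include <stdint.h>

/* Top-tested: the exit test runs before every iteration. */
uint32_t count_top(uint32_t n) {
    uint32_t i = 0, iterations = 0;
top:
    if (i >= n) goto done;   /* bgeu i, n, done */
    iterations++;
    i++;
    goto top;                /* j top */
done:
    return iterations;
}

/* Bottom-tested with a guard: one check up front, then the
 * backward branch doubles as the loop test. */
uint32_t count_bottom(uint32_t n) {
    uint32_t i = 0, iterations = 0;
    if (n == 0) goto done;   /* guard: beqz n, done */
body:
    iterations++;
    i++;
    if (i < n) goto body;    /* bltu i, n, body */
done:
    return iterations;
}
```

Both functions return the same count for every n; only the placement of the test differs, which is why either form can show up for the same source loop.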
6. Switch statements and jump tables
Create a C file with a function that uses a switch statement on an integer x.
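A minimal sketch of a switch worth disassembling, assuming a hypothetical function name dispatch and arbitrary return values. Dense, consecutive case values make a table-based lowering more likely.

```c
#include <stdint.h>

/* Hypothetical example: case values 0..4 are dense and consecutive. */
uint32_t dispatch(uint32_t x) {
    switch (x) {
    case 0:  return 10;
    case 1:  return 22;
    case 2:  return 35;
    case 3:  return 47;
    case 4:  return 58;
    default: return 0;
    }
}
```

With case bodies this simple the compiler may emit a table of result values or a chain of compares instead of a jump table, so treat the output as something to classify rather than predict.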
Compile it with optimization enabled (for example -O2).
What you might see
- A bounds check on x
- A computed jump using an address table
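Conceptually, a jump table is an array of code addresses indexed by the case value: entry 0 holds the address of the case 0 block, entry 1 the address of the case 1 block, and so on.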
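The bounds check plus table lookup can be modeled in C with an array of function pointers. This is a sketch of the idea, not of any particular compiler's output; all names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t (*case_handler)(void);

static uint32_t case0(void) { return 10; }
static uint32_t case1(void) { return 22; }
static uint32_t case2(void) { return 35; }
static uint32_t on_default(void) { return 0; }

/* Stand-in for the table in .rodata: one code address per case value. */
static const case_handler jump_table[] = { case0, case1, case2 };

uint32_t dispatch_via_table(uint32_t x) {
    size_t count = sizeof jump_table / sizeof jump_table[0];
    if (x >= count) {          /* bounds check: a single unsigned compare */
        return on_default();
    }
    /* On RV32 each entry is 4 bytes: load from table + (x << 2), then jump. */
    return jump_table[x]();
}
```

The single unsigned compare rejects both too-large and "negative" inputs at once, which is why the bounds check in front of a jump table usually uses bgeu or bltu.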
Tip
When reversing, jump tables often live in .rodata. If you spot a “load address from table then jump”, you’re likely in a switch.
7. Struct field access looks like constant offsets
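Create a C file that defines a struct with an addr field and a 16-bit len field, plus code that reads addr and writes len through a pointer to the struct.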
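A minimal sketch of such a file follows. The field names addr and len come from the text; the struct name region, the field types, the field order, and the accessor names are assumptions, and the offsets in the comments hold only for this assumed layout.

```c
#include <stdint.h>

/* Assumed layout: addr at offset 0, len at offset 4. */
struct region {
    uint32_t addr;
    uint16_t len;
};

/* Field read through the base pointer: expect something like lw a0, 0(a0). */
uint32_t region_addr(const struct region *r) {
    return r->addr;
}

/* 16-bit field write: expect a halfword store such as sh a1, 4(a0). */
void region_set_len(struct region *r, uint16_t n) {
    r->len = n;
}
```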
Build/disassemble and confirm that:
- addr is loaded from a constant offset from the struct base pointer
- len uses a halfword store (sh) at its offset
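Expected offsets can be computed on the C side and compared against the immediates in the disassembly. The sketch below wraps the standard offsetof from stddef.h in an OFFSETOF macro; the struct layout and the function names are assumptions for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* The classic hand-rolled form casts a null pointer to the struct type and
 * takes the member's address; the standard offsetof is the portable spelling. */
#define OFFSETOF(type, member) offsetof(type, member)

/* Assumed layout, for illustration only. */
struct region {
    uint32_t addr;
    uint16_t len;
};

size_t region_addr_offset(void) { return OFFSETOF(struct region, addr); }
size_t region_len_offset(void)  { return OFFSETOF(struct region, len); }
```

For this layout the values are 0 and 4, so the disassembly should show offsets like 0(base) for addr and 4(base) for len.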
Exercises
- For sum_u32, identify the exact instruction that loads p[i]. What is the effective address formula?
- For clamp_u32, identify which branches are unsigned vs signed.
- For the switch, locate the jump table (if one exists) in .rodata and dump its bytes with objdump -s -j .rodata.
- For the struct example, compute expected offsets using the OFFSETOF macro pattern and verify they match the disassembly.
How to test your answers
- Use readelf -S to locate section addresses and objdump -s to dump raw bytes.
- Use objdump -d -M numeric,no-aliases so you see the true instruction forms.
Summary
You learned the common RV32 patterns for loads/stores and how high-level control flow becomes branches and (sometimes) jump tables.
Next: functions and the stack. We’ll go deep into prologues/epilogues, stack frames, saved registers, and how to debug stack-related bugs.