Control Flow and Data Access in RV32 Assembly

TL;DR


1. RV32 data access: the only addressing mode you see most of the time

RV32 load and store instructions compute their address from a base register plus a sign-extended 12-bit immediate:

address = base_register + signed_immediate

Examples:
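To make the formula concrete, here is a minimal C model of the address calculation. It is a sketch, not taken from any toolchain: the helper names sign_extend_12 and effective_address are hypothetical, and the immediate is passed as the raw 12-bit field from the instruction encoding.

```c
#include <stdint.h>

/* Sign-extend the low 12 bits of an immediate field to 32 bits.
   (Hypothetical helper name, for illustration only.) */
static int32_t sign_extend_12(uint32_t imm12) {
  imm12 &= 0xFFFu;                              /* keep only the 12-bit field */
  return (int32_t)((imm12 ^ 0x800u) - 0x800u);  /* bit 11 acts as the sign bit */
}

/* Effective address of a load or store: base register plus the
   sign-extended offset, wrapping modulo 2^32 like a 32-bit adder. */
static uint32_t effective_address(uint32_t base, uint32_t imm12) {
  return base + (uint32_t)sign_extend_12(imm12);
}
```

With x10 holding 0x80001000, an access written as lw x5, 4(x10) would use address 0x80001004, and lw x5, -4(x10) (immediate field 0xFFC) would use 0x80000FFC. The reachable window is base - 2048 through base + 2047.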

Why this matters


2. Load/store size and signedness

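Common integer loads:

  lb   load a byte and sign-extend it to 32 bits
  lbu  load a byte and zero-extend it to 32 bits
  lh   load a halfword (16 bits) and sign-extend it
  lhu  load a halfword and zero-extend it
  lw   load a word (32 bits)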
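The difference between the sign-extending loads (lb, lh) and the zero-extending loads (lbu, lhu) is easiest to see in a small model. The sketch below is an illustration only: the model_* function names and the flat byte-array memory are assumptions, and it models little-endian byte order.

```c
#include <stdint.h>

/* lb: load one byte, sign-extend to 32 bits. */
static uint32_t model_lb(const uint8_t *mem, uint32_t addr) {
  return (uint32_t)(int32_t)(int8_t)mem[addr];
}

/* lbu: load one byte, zero-extend to 32 bits. */
static uint32_t model_lbu(const uint8_t *mem, uint32_t addr) {
  return (uint32_t)mem[addr];
}

/* lh: load a little-endian halfword, sign-extend to 32 bits. */
static uint32_t model_lh(const uint8_t *mem, uint32_t addr) {
  uint16_t h = (uint16_t)(mem[addr] | (mem[addr + 1u] << 8));
  return (uint32_t)(int32_t)(int16_t)h;
}

/* lhu: load a little-endian halfword, zero-extend to 32 bits. */
static uint32_t model_lhu(const uint8_t *mem, uint32_t addr) {
  return (uint32_t)(mem[addr] | (mem[addr + 1u] << 8));
}

/* lw: load a little-endian 32-bit word; no extension is needed. */
static uint32_t model_lw(const uint8_t *mem, uint32_t addr) {
  return (uint32_t)mem[addr]
       | ((uint32_t)mem[addr + 1u] << 8)
       | ((uint32_t)mem[addr + 2u] << 16)
       | ((uint32_t)mem[addr + 3u] << 24);
}
```

For the bytes 80 FF 34 12 at address 0, model_lb yields 0xFFFFFF80 while model_lbu yields 0x00000080. Picking the wrong variant for a type is a common source of stray high bits.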

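Stores:

  sb  store the low 8 bits of a register
  sh  store the low 16 bits of a register
  sw  store all 32 bits of a register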
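Stores (sb, sh, sw) have no signed or unsigned variants because nothing is extended: the upper bits of the source register are simply not written. A minimal sketch, using hypothetical model_* names and a flat little-endian byte array as memory:

```c
#include <stdint.h>

/* sb: write only the low 8 bits of the register. */
static void model_sb(uint8_t *mem, uint32_t addr, uint32_t reg) {
  mem[addr] = (uint8_t)reg;
}

/* sh: write the low 16 bits, least significant byte first. */
static void model_sh(uint8_t *mem, uint32_t addr, uint32_t reg) {
  mem[addr]      = (uint8_t)reg;
  mem[addr + 1u] = (uint8_t)(reg >> 8);
}

/* sw: write all 32 bits, least significant byte first. */
static void model_sw(uint8_t *mem, uint32_t addr, uint32_t reg) {
  mem[addr]      = (uint8_t)reg;
  mem[addr + 1u] = (uint8_t)(reg >> 8);
  mem[addr + 2u] = (uint8_t)(reg >> 16);
  mem[addr + 3u] = (uint8_t)(reg >> 24);
}
```

Storing 0xAABBCCDD with model_sh writes only DD CC; the neighbouring bytes are left untouched.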


3. Hands-on: arrays and pointer arithmetic

Create:

// src/arrays.c
#include "types.h"
#include "uart.h"

u32 sum_u32(const u32 *p, u32 n) {
  u32 s = 0u;
  for (u32 i = 0u; i < n; i++) {
    s += p[i];
  }
  return s;
}

int main(void) {
  u32 a[4] = {1u, 2u, 3u, 4u};
  u32 s = sum_u32(a, 4u);
  uart_puts("sum=");
  uart_puthex32(s);
  uart_putc('\n');
  return 0;
}

Compile and disassemble:

riscv64-unknown-elf-gcc -O0 -g -ffreestanding -nostdlib -march=rv32im -mabi=ilp32 \
  -T src/link.ld src/start.s src/uart.c src/arrays.c -o build/arrays_O0.elf

riscv64-unknown-elf-objdump -d -M numeric,no-aliases build/arrays_O0.elf | less

qemu-system-riscv32 -M virt -nographic -bios none -kernel build/arrays_O0.elf

What you should see

A loop pattern that looks like:

  1. initialize counter
  2. compare counter vs bound
  3. compute element address = base + index*4
  4. load element
  5. add
  6. increment counter
  7. branch back

Typical address computation

On RV32, multiplying the index by 4 (the size of a u32) is usually done with a left shift by 2 (slli) rather than a multiply:

index_bytes = i << 2
addr = base + index_bytes
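The same computation can be written out in C to check that it lands on the address of p[i]. This is a sketch for experimenting on a host machine; the function names index_bytes and element_address are made up for illustration.

```c
#include <stdint.h>

/* Byte offset of element i in an array of 4-byte elements: i * 4 == i << 2. */
static uint32_t index_bytes(uint32_t i) {
  return i << 2;
}

/* Address of p[i], computed byte-wise the way the disassembly does it:
   base address plus the shifted index. */
static const uint32_t *element_address(const uint32_t *p, uint32_t i) {
  const unsigned char *base = (const unsigned char *)p;
  return (const uint32_t *)(const void *)(base + index_bytes(i));
}
```

In ordinary C you would just write &p[i]; the compiler performs the scaling for you, which is why the shift only shows up in the disassembly.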

4. If/else becomes compare + branch

Create:

// src/ifelse.c
#include "types.h"
#include "uart.h"

u32 clamp_u32(u32 x, u32 lo, u32 hi) {
  if (x < lo) return lo;
  if (x > hi) return hi;
  return x;
}

int main(void) {
  u32 v = clamp_u32(42u, 10u, 30u);
  uart_puts("clamp=");
  uart_puthex32(v);
  uart_putc('\n');
  return 0;
}

Build at both -O0 and -O2:

riscv64-unknown-elf-gcc -O0 -g -ffreestanding -nostdlib -march=rv32im -mabi=ilp32 \
  -T src/link.ld src/start.s src/uart.c src/ifelse.c -o build/if_O0.elf

riscv64-unknown-elf-gcc -O2 -g -ffreestanding -nostdlib -march=rv32im -mabi=ilp32 \
  -T src/link.ld src/start.s src/uart.c src/ifelse.c -o build/if_O2.elf

qemu-system-riscv32 -M virt -nographic -bios none -kernel build/if_O0.elf

Disassemble both ELF files with the objdump command from section 3 and compare the output.

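RV32 branch instructions you’ll meet often

  beq, bne    branch if equal / not equal
  blt, bge    branch if less than / greater than or equal, signed comparison
  bltu, bgeu  the same comparisons, unsigned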
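The signed branches (blt, bge) and unsigned branches (bltu, bgeu) compare the same 32 register bits under different interpretations. The sketch below models only the taken/not-taken decision; the taken_* names are hypothetical.

```c
#include <stdint.h>

/* blt: taken when rs1 < rs2, both read as signed 32-bit values. */
static int taken_blt(uint32_t rs1, uint32_t rs2) {
  return (int32_t)rs1 < (int32_t)rs2;
}

/* bge: taken when rs1 >= rs2, signed. */
static int taken_bge(uint32_t rs1, uint32_t rs2) {
  return (int32_t)rs1 >= (int32_t)rs2;
}

/* bltu: taken when rs1 < rs2, both read as unsigned 32-bit values. */
static int taken_bltu(uint32_t rs1, uint32_t rs2) {
  return rs1 < rs2;
}

/* bgeu: taken when rs1 >= rs2, unsigned. */
static int taken_bgeu(uint32_t rs1, uint32_t rs2) {
  return rs1 >= rs2;
}
```

With rs1 = 0xFFFFFFFF and rs2 = 1, blt is taken (-1 < 1) but bltu is not (4294967295 > 1). When reading a disassembly, the trailing u tells you which interpretation the compiler derived from the C types.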


5. Loops: for and while patterns

The common shapes

“Top-tested” loop

init
check
body
increment
jump to check

“Bottom-tested” loop

init
body
increment
check
branch to body

The optimizer can transform one into the other.
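Both shapes can be written in C with goto, which maps almost one-to-one onto the branch structure. The following is a hand-written sketch of the summing loop in each form, not compiler output; the function names are illustrative. The bottom-tested version needs a guard before the loop so that an empty array skips the body.

```c
#include <stdint.h>

/* Top-tested: the condition is checked before every iteration. */
uint32_t sum_top_tested(const uint32_t *p, uint32_t n) {
  uint32_t s = 0u, i = 0u;      /* init */
check:
  if (i >= n) goto done;        /* check */
  s += p[i];                    /* body */
  i++;                          /* increment */
  goto check;                   /* jump to check */
done:
  return s;
}

/* Bottom-tested: one guard up front, then a single conditional
   branch per iteration at the bottom of the loop. */
uint32_t sum_bottom_tested(const uint32_t *p, uint32_t n) {
  uint32_t s = 0u, i = 0u;      /* init */
  if (n == 0u) goto done;       /* guard: skip the body for empty input */
body:
  s += p[i];                    /* body */
  i++;                          /* increment */
  if (i < n) goto body;         /* check + branch to body */
done:
  return s;
}
```

The bottom-tested form executes one branch per iteration instead of two, which is one reason an optimizing build may prefer it.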


6. Switch statements and jump tables

Create:

// src/switch.c
#include "types.h"
#include "uart.h"

u32 dispatch(u32 x) {
  switch (x) {
    case 0: return 0x1111u;
    case 1: return 0x2222u;
    case 2: return 0x3333u;
    case 3: return 0x4444u;
    default: return 0xdeadu;
  }
}

int main(void) {
  u32 r = dispatch(2u);
  uart_puts("dispatch=");
  uart_puthex32(r);
  uart_putc('\n');
  return 0;
}

Compile optimized:

riscv64-unknown-elf-gcc -O2 -g -ffreestanding -nostdlib -march=rv32im -mabi=ilp32 \
  -T src/link.ld src/start.s src/uart.c src/switch.c -o build/switch_O2.elf

riscv64-unknown-elf-objdump -d -M numeric,no-aliases build/switch_O2.elf | less

qemu-system-riscv32 -M virt -nographic -bios none -kernel build/switch_O2.elf

What you might see

A conceptual jump-table layout:

.rodata:
  table[0] = &case0
  table[1] = &case1
  ...
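The jump-table idea can be imitated in C with an array of function pointers: one bounds check for the default case, then an indexed load and an indirect jump. This is a sketch of the concept only. The names case0 through case3, case_table, and dispatch_via_table are hypothetical, and for a switch this small the compiler may just as well emit a compare-and-branch chain or a table of values instead.

```c
#include <stdint.h>

typedef uint32_t (*case_fn)(void);

/* One tiny function per case label. */
static uint32_t case0(void) { return 0x1111u; }
static uint32_t case1(void) { return 0x2222u; }
static uint32_t case2(void) { return 0x3333u; }
static uint32_t case3(void) { return 0x4444u; }

/* Read-only table of code addresses, analogous to table[i] = &case_i. */
static const case_fn case_table[4] = {case0, case1, case2, case3};

uint32_t dispatch_via_table(uint32_t x) {
  if (x >= 4u) {
    return 0xdeadu;             /* default: index is outside the table */
  }
  return case_table[x]();       /* load the entry, then jump indirectly */
}
```

Because x is unsigned, a single comparison is enough to reject every out-of-range index.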

7. Struct field access looks like constant offsets

Create:

// src/structs.c
#include "types.h"
#include "uart.h"

typedef struct {
  u8  flags;
  u8  mode;
  u16 len;
  u32 addr;
} header_t;

u32 read_addr(const header_t *h) {
  return h->addr;
}

void set_len(header_t *h, u16 v) {
  h->len = v;
}

int main(void) {
  header_t h = {1u, 2u, 3u, 0x80001234u};
  set_len(&h, 0x55aau);
  uart_puts("addr=");
  uart_puthex32(read_addr(&h));
  uart_putc('\n');
  return 0;
}

Build and disassemble, then confirm that each field access is a load or store at a constant offset from the struct pointer:

riscv64-unknown-elf-gcc -O2 -g -ffreestanding -nostdlib -march=rv32im -mabi=ilp32 \
  -T src/link.ld src/start.s src/uart.c src/structs.c -o build/structs_O2.elf
riscv64-unknown-elf-objdump -d -M numeric,no-aliases build/structs_O2.elf | less
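To see what "field access at a constant offset" means, the two accessors can be rewritten with explicit byte offsets. The sketch below assumes the natural-alignment layout (len at offset 2, addr at offset 4), uses <stdint.h> types in place of the tutorial's types.h, and the *_by_offset names are made up for illustration.

```c
#include <stdint.h>
#include <string.h>

typedef struct {
  uint8_t  flags;
  uint8_t  mode;
  uint16_t len;
  uint32_t addr;
} header_t;

/* Equivalent of h->addr: a 32-bit load at base + 4 (assumed offset). */
static uint32_t read_addr_by_offset(const header_t *h) {
  uint32_t v;
  memcpy(&v, (const unsigned char *)h + 4, sizeof v);
  return v;
}

/* Equivalent of h->len = v: a 16-bit store at base + 2 (assumed offset). */
static void set_len_by_offset(header_t *h, uint16_t v) {
  memcpy((unsigned char *)h + 2, &v, sizeof v);
}
```

In the disassembly, the offset appears as the immediate of the load or store, with the struct pointer as the base register.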

Exercises

  1. For sum_u32, identify the exact instruction that loads p[i]. What is the effective address formula?
  2. For clamp_u32, identify which branches are unsigned vs signed.
  3. For the switch, locate the jump table (if one exists) in .rodata and dump its bytes with objdump -s -j .rodata.
  4. For the struct example, compute expected offsets using the OFFSETOF macro pattern and verify they match the disassembly.
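For exercise 4, the expected offsets can be computed on any machine with the standard offsetof macro. The sketch below defines OFFSETOF as a thin wrapper around it, which is an assumption about what the macro looks like in this course, and uses <stdint.h> types in place of types.h.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
  uint8_t  flags;
  uint8_t  mode;
  uint16_t len;
  uint32_t addr;
} header_t;

/* Assumed definition: a wrapper around the standard offsetof macro. */
#define OFFSETOF(type, field) offsetof(type, field)

/* Offsets of each field, in declaration order. */
static const size_t header_offsets[4] = {
  OFFSETOF(header_t, flags),
  OFFSETOF(header_t, mode),
  OFFSETOF(header_t, len),
  OFFSETOF(header_t, addr),
};
```

With natural alignment the four entries are 0, 1, 2, and 4, and sizeof(header_t) is 8. Those are the immediates to look for in the loads and stores of read_addr and set_len.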

How to test your answers


Summary

You learned the common RV32 patterns for loads/stores and how high-level control flow becomes branches and (sometimes) jump tables.

Next: functions and the stack. We’ll go deep into prologues and epilogues, stack frames, saved registers, and how to debug stack-related bugs.