RV32 ABI and C Data Types: Sizes, Alignment, and Layout

TL;DR

The ABI is the contract. If you violate it (even accidentally), you get “weird bugs” that look like stack corruption, bad pointers, or random crashes.


1. RV32 in one sentence

Typical compiler flags

2. Register roles (the part you must memorize)

RISC-V has 32 integer registers: x0..x31.

2.1. Who preserves what? (The “Responsibility” Rule)

3. Stack rules (the second part you must memorize)

3.1. Stack grows downward

high addresses
   ...
   saved registers
   local variables
sp → current top of stack
low addresses

3.2. Alignment

The ABI requires the stack pointer (sp) to be aligned to 16 bytes whenever you call a function.

Why 16 bytes? Why not just 4? Think of the stack like a delivery truck.

“At call boundaries” means: Before you jump to a new function, you must ensure sp is a multiple of 16. If you push 1 word (4 bytes), you must add 12 bytes of padding so the next function starts on a clean 16-byte line.
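The padding arithmetic above can be sketched in C. This is a minimal illustration, not code from the tutorial's sources: the names STACK_ALIGN, frame_size, and frame_padding are hypothetical, and the rounding trick assumes the alignment is a power of two.

```c
#include <stdint.h>

/* Stack alignment taken from the text above: 16 bytes at call boundaries. */
#define STACK_ALIGN 16u

/* Round the bytes a function needs up to the next multiple of STACK_ALIGN.
   The mask trick is valid only because STACK_ALIGN is a power of two. */
uint32_t frame_size(uint32_t bytes_needed) {
    return (bytes_needed + (STACK_ALIGN - 1u)) & ~(STACK_ALIGN - 1u);
}

/* Padding added on top of the bytes that are actually used. */
uint32_t frame_padding(uint32_t bytes_needed) {
    return frame_size(bytes_needed) - bytes_needed;
}
```

With one saved word, frame_size(4) is 16 and frame_padding(4) is 12, which matches the "push 1 word, add 12 bytes" example.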

4. C type sizes on RV32 (ILP32)

These are the typical sizes (verify in your toolchain):

C type      Typical bytes (RV32 ILP32)
char        1
short       2
int         4
long        4
long long   8
void*       4
size_t      4
float       4
double      8
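One way to "verify in your toolchain" without running anything is a set of compile-time checks. The sketch below is an assumption about how you might wire this up, not part of the tutorial's sources. It relies on the __riscv and __riscv_xlen macros that RISC-V GCC and Clang predefine, so the checks are skipped on other targets. The helper sizes_match_ilp32 is a hypothetical name.

```c
#include <stddef.h>

/* Enforce the table only when compiling for a 32-bit RISC-V target. */
#if defined(__riscv) && (__riscv_xlen == 32)
_Static_assert(sizeof(char) == 1, "char should be 1 byte");
_Static_assert(sizeof(short) == 2, "short should be 2 bytes");
_Static_assert(sizeof(int) == 4, "int should be 4 bytes");
_Static_assert(sizeof(long) == 4, "long should be 4 bytes under ILP32");
_Static_assert(sizeof(long long) == 8, "long long should be 8 bytes");
_Static_assert(sizeof(void *) == 4, "pointers should be 4 bytes under ILP32");
_Static_assert(sizeof(size_t) == 4, "size_t should be 4 bytes under ILP32");
_Static_assert(sizeof(float) == 4, "float should be 4 bytes");
_Static_assert(sizeof(double) == 8, "double should be 8 bytes");
#endif

/* Runtime variant: returns 1 when the current target matches the table. */
int sizes_match_ilp32(void) {
    return sizeof(char) == 1 && sizeof(short) == 2 && sizeof(int) == 4
        && sizeof(long) == 4 && sizeof(long long) == 8
        && sizeof(void *) == 4 && sizeof(size_t) == 4
        && sizeof(float) == 4 && sizeof(double) == 8;
}
```

If a size differs from the table, the build fails with the message naming the offending type, which is usually easier to act on than a crash at run time.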

5. Hands-on: measure sizes and alignment

Create:

// src/types.c
#include "types.h"
#include "uart.h"

// Compute the byte offset of a member inside a struct type.
// offsetof comes from <stddef.h>, which is available even in freestanding builds.
// It avoids the undefined behavior of taking a member's address from a null base.
#include <stddef.h>
#define OFFSETOF(type, member) ((u32)offsetof(type, member))

static void show_type(const char *name, u32 size, u32 align) {
  // Print a "name size=... align=..." line for one type.
  uart_puts(name);
  uart_puts(" size=");
  uart_putdec(size);
  uart_puts(" align=");
  uart_putdec(align);
  uart_puts("\n");
}

// Convenience macro: stringize the type name and show its size and alignment.
#define SHOW(T) show_type(#T, (u32)sizeof(T), (u32)_Alignof(T))

struct A {
  // Likely introduces padding between fields due to alignment.
  u8  a;
  u32 b;
  u16 c;
};

struct B {
  // Same fields as A but reordered to reduce padding.
  u32 b;
  u16 c;
  u8  a;
};

int main(void) {
  // Show basic scalar sizes/alignments for this target/compiler.
  SHOW(char);
  SHOW(short);
  SHOW(int);
  SHOW(long);
  SHOW(long long);
  SHOW(void *);
  SHOW(float);
  SHOW(double);

  // Compare layout of two structs with the same fields in different orders.
  uart_puts("\nstruct A size=");
  uart_putdec((u32)sizeof(struct A));
  uart_puts(" off(a)=");
  uart_putdec(OFFSETOF(struct A, a));
  uart_puts(" off(b)=");
  uart_putdec(OFFSETOF(struct A, b));
  uart_puts(" off(c)=");
  uart_putdec(OFFSETOF(struct A, c));
  uart_puts("\n");

  uart_puts("\nstruct B size=");
  uart_putdec((u32)sizeof(struct B));
  uart_puts(" off(b)=");
  uart_putdec(OFFSETOF(struct B, b));
  uart_puts(" off(c)=");
  uart_putdec(OFFSETOF(struct B, c));
  uart_puts(" off(a)=");
  uart_putdec(OFFSETOF(struct B, a));
  uart_puts("\n");

  return 0;
}

Build and run in QEMU:

riscv64-unknown-elf-gcc -O0 -g -ffreestanding -nostdlib \
  -march=rv32im -mabi=ilp32 -T src/link.ld \
  src/start.s src/uart.c src/types.c -o build/types_rv32.elf

qemu-system-riscv32 -M virt -nographic -bios none -kernel build/types_rv32.elf

5.1. What you should observe

5.2. Deep Dive: Type Alignment vs. Stack Alignment

You observed double align=8, but the ABI requires sp to be 16-byte aligned. Confusion is common here. Let’s distinguish the Content from the Container.

5.2.1. The Content Rule (Type Alignment)

Every variable has a “natural alignment”.

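If you violate this, the outcome depends on the implementation: the access may raise a misaligned-address exception, or it may complete more slowly (for example, split by hardware into several smaller accesses, or emulated by a trap handler).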
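When data may sit at an arbitrary address (a byte buffer from a UART, for instance), one common way to stay within the rule is to read it one byte at a time. The function below is a hedged sketch with a hypothetical name, load_le32; it assumes the bytes are stored in little-endian order.

```c
#include <stdint.h>

/* Read a 32-bit little-endian value from any address, aligned or not.
   Only byte loads are used, and a byte has alignment 1, so no single
   access can be misaligned. */
uint32_t load_le32(const void *addr) {
    const uint8_t *p = (const uint8_t *)addr;
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

This trades one word load for four byte loads, so it is best reserved for data whose alignment you cannot guarantee.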

5.2.2. The Container Rule (Stack Alignment)

The Stack Frame is the container for all these local variables. To be a “universal container”, the stack must be aligned to the strictest requirement of any variable it might hold.

The Solution: The RISC-V ABI forces the stack to be 16-byte aligned (divisible by 16). Since 16 is a multiple of 1, 2, 4, 8, and 16, a fresh stack frame is guaranteed to be a safe starting point for any standard data type, including 16-byte-aligned ones such as the 128-bit long double of the standard RISC-V ABI, without needing complex adjustments.
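The divisibility argument can be checked mechanically. The helpers below are illustrative only (is_aligned and satisfies_all_scalar_alignments are hypothetical names) and assume every alignment is a power of two.

```c
#include <stdint.h>

/* True when addr is a multiple of align; align must be a power of two. */
int is_aligned(uintptr_t addr, uintptr_t align) {
    return (addr & (align - 1u)) == 0;
}

/* A 16-byte-aligned address also satisfies every smaller power-of-two
   alignment, which is why one stack rule covers all the scalar types. */
int satisfies_all_scalar_alignments(uintptr_t addr) {
    return is_aligned(addr, 1) && is_aligned(addr, 2) && is_aligned(addr, 4)
        && is_aligned(addr, 8) && is_aligned(addr, 16);
}
```

An address such as 0x80001008 is fine for a double (8-byte aligned) but fails the 16-byte check, so it would not be a valid sp at a call boundary.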

6. Struct padding explained (with a diagram)

6.1. Example: struct A

struct A {
  uint8_t  a; // 1 byte
  uint32_t b; // needs 4-byte alignment
  uint16_t c; // 2 bytes
};

One common RV32 layout:

Think of memory as a grid of 4-byte (32-bit) words.

Offset  Byte 0  Byte 1  Byte 2  Byte 3  Content
+0      a       pad     pad     pad     a takes 1 byte. We skip 3 bytes so the next row starts fresh.
+4      b       b       b       b       b (4 bytes) fits perfectly in a new word.
+8      c       c       pad     pad     c (2 bytes) sits here. We pad the end to align the whole struct size.
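The grid can be turned into compile-time checks. The sketch below uses the standard fixed-width types from <stdint.h> rather than the tutorial's u8/u32 aliases (an assumption, since types.h is not shown), and it also checks the reordered struct B from the measurement program. The expected numbers hold on targets where uint32_t is 4-byte aligned, which includes RV32.

```c
#include <stddef.h>
#include <stdint.h>

struct A { uint8_t a; uint32_t b; uint16_t c; };
struct B { uint32_t b; uint16_t c; uint8_t a; };

/* struct A: 3 bytes of padding after a, 2 bytes of tail padding after c. */
_Static_assert(offsetof(struct A, a) == 0, "A.a at +0");
_Static_assert(offsetof(struct A, b) == 4, "A.b at +4");
_Static_assert(offsetof(struct A, c) == 8, "A.c at +8");
_Static_assert(sizeof(struct A) == 12, "A occupies three 4-byte rows");

/* struct B: largest member first, so only 1 byte of tail padding remains. */
_Static_assert(offsetof(struct B, b) == 0, "B.b at +0");
_Static_assert(offsetof(struct B, c) == 4, "B.c at +4");
_Static_assert(offsetof(struct B, a) == 6, "B.a at +6");
_Static_assert(sizeof(struct B) == 8, "B occupies two 4-byte rows");
```

If any assumption about the layout is wrong for your compiler, the build stops at the failing line instead of silently producing a different struct.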

6.2. Complex Example: Mixing char, int, long, long long, double

Let’s look at a struct that mixes several scalar types, specifically distinguishing long (32-bit on RV32) from long long (64-bit).

struct Mixed {
  char c;       // 1 byte
  int i;        // 4 bytes
  long l;       // 4 bytes (on RV32)
  long long ll; // 8 bytes (needs 8-byte alignment)
  double d;     // 8 bytes (needs 8-byte alignment)
};

Layout Analysis:

  1. c sits at +0.
  2. i needs 4-byte alignment. Next available slot is +1, so we skip 3 bytes. i starts at +4.
  3. l needs 4-byte alignment. It fits perfectly at +8. Ends at +12.
  4. ll needs 8-byte alignment. +12 is not divisible by 8 (12 % 8 = 4). We need 4 bytes of padding. ll starts at +16.
  5. d needs 8-byte alignment. ll ended at +24. 24 is divisible by 8. d starts at +24 immediately.

Memory Grid:

Offset  Byte 0   Byte 1  Byte 2  Byte 3  Content
+0      c        pad     pad     pad     Aligning for i
+4      i        i       i       i
+8      l        l       l       l       long is 4 bytes on RV32
+12     pad      pad     pad     pad     Aligning for ll (must be % 8)
+16     ll (lo)  ll      ll      ll      long long (first half)
+20     ll (hi)  ll      ll      ll      long long (second half)
+24     d (lo)   d       d       d       double (first half)
+28     d (hi)   d       d       d       double (second half)

Total Size: 32 bytes.
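The step-by-step analysis follows a simple procedure: align the running offset for each member, place the member, then round the final size up to the largest member alignment. A minimal sketch of that procedure is shown here. The struct member type and the functions align_up and layout_struct are hypothetical names, and the code models only plain scalar members with power-of-two alignments.

```c
#include <stddef.h>

/* Size and alignment of one struct member, in bytes. */
struct member { size_t size; size_t align; };

/* Round offset up to the next multiple of align (a power of two). */
static size_t align_up(size_t offset, size_t align) {
    return (offset + align - 1) & ~(align - 1);
}

/* Fill offsets[i] for each member and return the total struct size. */
size_t layout_struct(const struct member *m, size_t n, size_t *offsets) {
    size_t offset = 0;
    size_t max_align = 1;
    for (size_t i = 0; i < n; i++) {
        offset = align_up(offset, m[i].align);  /* insert padding if needed */
        offsets[i] = offset;
        offset += m[i].size;
        if (m[i].align > max_align) {
            max_align = m[i].align;
        }
    }
    /* Tail padding: the size must be a multiple of the strictest alignment. */
    return align_up(offset, max_align);
}
```

Feeding it the (size, alignment) pairs for struct Mixed on RV32, namely (1,1), (4,4), (4,4), (8,8), (8,8), yields offsets 0, 4, 8, 16, 24 and a total of 32, matching the analysis above.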

Why padding exists:

7. Endianness and what it means for C

Most RV32 targets are little-endian.

If you store 0x11223344 in memory, bytes appear as:

address  +0  +1  +2  +3
bytes    44  33  22  11
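The same byte order can be produced explicitly with shifts, which is useful when code must emit little-endian data no matter which machine it runs on. This is an illustrative sketch; store_le32 is a hypothetical helper, not part of the tutorial's sources.

```c
#include <stdint.h>

/* Write value into out[0..3] least-significant byte first,
   independent of the byte order of the machine running the code. */
void store_le32(uint8_t out[4], uint32_t value) {
    out[0] = (uint8_t)(value & 0xffu);
    out[1] = (uint8_t)((value >> 8) & 0xffu);
    out[2] = (uint8_t)((value >> 16) & 0xffu);
    out[3] = (uint8_t)((value >> 24) & 0xffu);
}
```

Calling store_le32 with 0x11223344 fills the buffer with 44 33 22 11, the same sequence as the table.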

7.1. Hands-on: confirm endianness

// src/endian.c
#include "types.h"
#include "uart.h"

static void puthex8(u8 v) {
  // Print one byte as two lowercase hex digits.
  const char *digits = "0123456789abcdef";
  uart_putc(digits[(v >> 4) & 0x0f]);
  uart_putc(digits[v & 0x0f]);
}

int main(void) {
  // Store a known 32-bit pattern and examine its byte order in memory.
  u32 x = 0x11223344u;
  u8 *p = (u8 *)&x;
  // Emit the four bytes to reveal endianness (LSB first on little-endian).
  puthex8(p[0]); uart_putc(' ');
  puthex8(p[1]); uart_putc(' ');
  puthex8(p[2]); uart_putc(' ');
  puthex8(p[3]); uart_putc('\n');
  return 0;
}

Compile/run:

riscv64-unknown-elf-gcc -O0 -g -ffreestanding -nostdlib \
  -march=rv32im -mabi=ilp32 -T src/link.ld \
  src/start.s src/uart.c src/endian.c -o build/endian_rv32.elf

qemu-system-riscv32 -M virt -nographic -bios none -kernel build/endian_rv32.elf

8. ABI meets assembly: parameters and return values

Consider:

uint32_t add_u32(uint32_t a, uint32_t b) { return a + b; }

At the ABI level:

In disassembly you’ll often see:

9. Exercises

  1. Change struct A by adding a uint8_t d; at the end. Predict the new size before compiling.
  2. Create a packed version:
    
    struct __attribute__((packed)) P { u8 a; u32 b; };
    
  3. Compare sizeof(struct P) with the unpacked version.
  4. Write a function that returns a u64. Observe which registers carry the return value.
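A possible starting scaffold for exercises 2 to 4 is sketched below. It uses the fixed-width types from <stdint.h> in place of the tutorial's u8/u32/u64 aliases (an assumption, since types.h is not shown). The names struct U, size_packed, size_unpacked, and make_u64 are hypothetical. The packed attribute is GCC/Clang syntax.

```c
#include <stddef.h>
#include <stdint.h>

/* Packed layout from exercise 2: no padding between members. */
struct __attribute__((packed)) P { uint8_t a; uint32_t b; };

/* Same fields with default alignment, for the comparison in exercise 3. */
struct U { uint8_t a; uint32_t b; };

/* Print or inspect these two values to compare the layouts. */
size_t size_packed(void)   { return sizeof(struct P); }
size_t size_unpacked(void) { return sizeof(struct U); }

/* Exercise 4: a function returning a 64-bit value. Disassemble the
   compiled function to see which registers hold the result. */
uint64_t make_u64(uint32_t hi, uint32_t lo) {
    return ((uint64_t)hi << 32) | lo;
}
```

Write down your predictions first, then print the two sizes over the UART and inspect the disassembly of make_u64 to check them.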

9.1. How to test your answers

10. Summary

You learned the RV32 ILP32 ABI “contract”: register roles, stack rules, and how C types map to bytes.

Next: C → assembly + optimizations. You’ll learn how -O changes what you see in disassembly, and how volatile really affects generated code.