C → Assembly: Optimizations, Volatile, and What the Compiler Is Allowed to Do
TL;DR
- You’ll learn how the compiler transforms C into assembly and why the same C code can look wildly different under -O0 vs -O2.
- You’ll build a practical mental model for:
  - dead-code elimination,
  - common subexpression elimination,
  - inlining,
  - register allocation,
  - and how volatile constrains these optimizations.
- You’ll run experiments and verify results with objdump and GDB.
1. The compiler pipeline (why there are multiple “translations”)
flowchart TD
A["C source (.c)"] --> B["Frontend to IR (Intermediate Representation)"]
B --> C["Optimizer (depends on -O level)"]
C --> D["Backend to assembly (.s)"]
D --> E["Assembler to object (.o)"]
E --> F["Linker to ELF (.elf)"]
Two consequences:
- “The compiler” isn’t one step; it’s many stages.
- The -O level changes the optimizer stage, which changes everything downstream.
2. Hands-on lab: one program, many optimization levels
Create:
| |
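A minimal sketch of what opt.c could contain, not the original listing. It assumes a global volatile int named sink (the exercises refer to it) and two locals a and b (the notes on what to look for mention them). The function name compute and the arithmetic are hypothetical. To get something you can run, add a small main that calls compute.

```c
/* opt.c: hypothetical lab file for comparing -O0 and -O2 output. */

/* Stores to a volatile object are observable, so they must be emitted. */
volatile int sink;

int compute(int x, int y)
{
    int a = x * y + 1;   /* computed here ...                          */
    int b = x * y + 1;   /* ... and again: a common subexpression      */
    int dead = a ^ b;    /* result is never needed: dead code          */
    (void)dead;          /* only silences the unused-variable warning  */

    if (a == b) {        /* always true, so the branch can fold away   */
        sink = a;
    } else {
        sink = -1;
    }
    return a + b;
}
```

With a file like this, the multiply, the comparison, and the store to sink are the three things worth tracking in the disassembly.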
Build two variants:
| |
Run both:
| |
Disassemble both:
| |
What to look for
- Under -O0:
  - more stack usage,
  - more loads/stores,
  - variables “live” as you expect.
- Under -O2:
  - a and b are likely computed once,
  - branches may be simplified,
  - code may be rearranged.
Start with -O0 and then learn to recognize the optimized forms.
3. Why variables disappear in optimized builds
Register allocation
At -O2, the compiler tries to keep values in registers and may never materialize them in memory.
Lifetime shrinking
If a variable’s value is used only briefly, it may never exist as a named location.
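A small function makes both effects easy to spot. This is a sketch with hypothetical names (sum_array, total, scaled). At -O0, GCC and Clang typically give total, i, and scaled their own stack slots and reload them on every iteration. At -O2 they usually live in registers, and scaled may be folded into the surrounding arithmetic so that it has no location at all.

```c
#include <stddef.h>

/* Sums twice each element. All three locals are ordinary variables in the
 * source; whether they ever touch memory is up to the optimizer. */
long sum_array(const int *values, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        int scaled = values[i] * 2;  /* short-lived temporary */
        total += scaled;
    }
    return total;
}
```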
Inlining
Calls to small functions are often replaced by the function’s body, so the call instruction disappears from the caller.
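As an illustration, here is a sketch with two hypothetical functions, square and sum_of_squares. At -O2 a compiler will typically inline square into its caller. Because square is static, the compiler is also free to drop the standalone copy, in which case no symbol for it remains in the object file.

```c
/* Small helper: a typical inlining candidate at -O2. */
static int square(int x)
{
    return x * x;
}

int sum_of_squares(int p, int q)
{
    /* After inlining this is effectively p * p + q * q, with no call. */
    return square(p) + square(q);
}
```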
This is why a debugger such as GDB often reports <optimized out> for variables.
4. volatile means “must perform the access”
A volatile object tells the compiler:
- every read is a real load,
- every write is a real store,
- the compiler cannot remove or merge those accesses,
- the compiler cannot assume the value stays the same between accesses.
This is critical for:
- MMIO (Memory-Mapped I/O) registers,
- ISR (Interrupt Service Routine) shared state,
- externally-modified memory.
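The MMIO case can be sketched as a polling loop. The register layout here is an assumption: STATUS_READY and the function names are hypothetical, and the status pointer is passed in rather than hard-coded so that the code can also be exercised on a host with an ordinary variable. Without the volatile qualifier, the compiler would be allowed to load the status once and reuse that value on every iteration.

```c
#include <stdint.h>

#define STATUS_READY 0x1u  /* hypothetical "ready" bit */

/* Reads the status register through a volatile pointer, so every call
 * performs a real load. */
static inline int device_ready(volatile const uint32_t *status)
{
    return (*status & STATUS_READY) != 0;
}

/* Polls until the device reports ready or max_polls attempts are used up.
 * Returns 1 if ready, 0 on timeout. */
int wait_ready(volatile const uint32_t *status, unsigned max_polls)
{
    while (max_polls-- > 0) {
        if (device_ready(status)) {
            return 1;
        }
    }
    return 0;
}
```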
Hands-on: volatile vs non-volatile
Create:
| |
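A minimal sketch of such a file, not the original listing. The names plain_cell, device_cell, store_twice_plain, and store_twice_volatile are hypothetical. Each function writes the same object twice in a row, which is the pattern to look for in the disassembly.

```c
#include <stdint.h>

uint32_t plain_cell;            /* ordinary global */
volatile uint32_t device_cell;  /* stands in for a device register */

void store_twice_plain(void)
{
    plain_cell = 1;  /* may be dropped: overwritten before anything reads it */
    plain_cell = 2;
}

void store_twice_volatile(void)
{
    device_cell = 1;  /* must be emitted */
    device_cell = 2;  /* must be emitted, after the first store */
}
```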
Build optimized and disassemble:
| |
What you should observe:
- The non-volatile double-store may collapse to one store.
- The volatile double-store should remain two stores.
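Emitting both stores is all that volatile promises here. When the reader is another thread rather than a device, ordering has to come from C11 atomics instead. A minimal sketch, assuming a hypothetical one-shot handoff (payload, ready, publish, and try_consume are illustrative names):

```c
#include <stdatomic.h>
#include <stdbool.h>

static int payload;        /* plain data, published through `ready` */
static atomic_bool ready;  /* static storage: starts out false */

/* Producer: write the data, then set the flag with release ordering so the
 * payload write cannot be reordered after the flag becomes visible. */
void publish(int value)
{
    payload = value;
    atomic_store_explicit(&ready, true, memory_order_release);
}

/* Consumer: an acquire load that observes `true` also makes the payload
 * write visible. Returns false if nothing has been published yet. */
bool try_consume(int *out)
{
    if (!atomic_load_explicit(&ready, memory_order_acquire)) {
        return false;
    }
    *out = payload;
    return true;
}
```

The release/acquire pair is what orders payload against ready; marking either variable volatile would not provide that.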
volatile is not a synchronization primitive. It does not create atomicity, ordering guarantees across cores, or memory barriers. For concurrency you want C11 atomics or explicit fences.
5. Mapping C back to assembly (a practical method)
When you see assembly, ask:
- Where are inputs? (usually a0..a7)
- Where does the return value go? (usually a0)
- Which registers must survive calls? (callee-saved s*)
- Which memory stores are observable? (volatile, globals, function calls)
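The first three questions can be practiced on a function that calls through a pointer, since the compiler cannot see into that call from here. This is a sketch with hypothetical names (call_and_scale, twice), and the register comments assume the standard RISC-V integer calling convention. The exact instructions depend on the compiler and flags.

```c
/* f arrives in a0 and x in a1. Before the call, x is moved into a0 as the
 * callee's first argument. */
long call_and_scale(long (*f)(long), long x)
{
    long r = f(x);  /* result comes back in a0 */

    /* x is still needed after the call, so it has to live somewhere that
     * survives it: typically a callee-saved s register or a stack slot. */
    return r * x;
}

/* A simple callee to pass in. */
long twice(long v)
{
    return v + v;
}
```

In the disassembly of call_and_scale, look for the prologue saving ra and an s register, the indirect jump, and the multiply that uses the preserved copy of x.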
Use compiler-generated assembly as a “bridge”
Generate .s output:
| |
Compare build/opt_O0.s and build/opt_O2.s.
The .s file is often easier to read than objdump output because it preserves labels and structure.
Exercises
- Modify opt.c so sink is not volatile. Predict what changes in -O2.
- Add a uart_putc (or any external call) and observe how it “pins” values (a call the compiler cannot see into acts as an optimization barrier for any memory the callee might touch).
- Write two functions: one tiny, one large. Observe when the tiny one is inlined.
How to test your answers
- Use objdump -d -M numeric,no-aliases to compare instruction sequences.
- Use readelf -s to see if functions still exist as symbols (inlining a static function can remove its symbol).
Summary
You learned what optimizations do, why debugging optimized code can be confusing, and what volatile truly guarantees.
Next: control flow and data access. You’ll learn how if/loops/switch become branches and jump tables, and how loads/stores encode addressing.