#include <stdint.h>
/* Case 1: Hardware register — value changes without CPU writing it */
volatile uint32_t * const TIMER_CNT = (volatile uint32_t *)0xF0300010u;
uint32_t get_timestamp(void) { return *TIMER_CNT; } /* fresh read each call */
/* Case 2: Variable shared between ISR and task (shared memory) */
/* WITHOUT volatile: compiler may not reload g_rxFlag inside the while loop */
volatile uint8_t g_rxFlag = 0u;
void wait_for_rx(void) {
while (g_rxFlag == 0u) { /* spin */ } /* volatile: reloads each iteration */
}
void Can_RxISR(void) { g_rxFlag = 1u; } /* ISR sets the flag */
/* Case 3: Local variable modified between setjmp() and longjmp() */
/* Without volatile its value is indeterminate after the jump; rarely used in embedded code */
/* Case 4: Variable inspected by a debugger or memory-mapped test tool */
/* Without volatile, the optimiser may keep the value in a register or remove the variable */
volatile uint32_t g_debugWatch = 0u; /* debugger watchpoint target */
/* Case NOT needing volatile: local variable only accessed by one thread/ISR */
void process_can(void) {
uint32_t local_crc = 0u; /* NOT volatile: not shared, not hardware */
/* ... */
}
The Four Cases Where volatile is Required
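Case 3 has no code in the listing, so here is a minimal host-runnable sketch of it. The names g_recover, failing_step and run_with_retry are hypothetical and exist only for this illustration. The point it shows is that a local variable changed after setjmp() and read after longjmp() needs volatile to keep a defined value.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdint.h>

static jmp_buf g_recover;            /* hypothetical recovery point */

/* Hypothetical worker that abandons its job by jumping back. */
static void failing_step(void)
{
    longjmp(g_recover, 1);
}

/* Returns the number of attempts started before giving up. */
uint32_t run_with_retry(void)
{
    /* volatile: written after setjmp() and read after longjmp().
     * A non-volatile local would have an indeterminate value here. */
    volatile uint32_t attempts = 0u;

    if (setjmp(g_recover) != 0) {
        /* Reached via longjmp: at least one attempt has been counted. */
        assert(attempts > 0u);
        if (attempts >= 3u) {
            return attempts;
        }
    }
    attempts = attempts + 1u;
    failing_step();                  /* does not return normally */
    return 0u;                       /* not reached */
}
```

Each pass increments attempts before jumping back, so the function gives up on the third attempt.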
Memory Barriers and Compiler Barriers
#include <stdint.h>
/* Compiler barrier: prevent compiler reordering across this point */
/* Does NOT affect CPU out-of-order execution; only compiler instruction order */
#define COMPILER_BARRIER() __asm__ volatile ("" ::: "memory")
/* Example: flag-based producer/consumer */
volatile uint32_t g_data;
volatile uint8_t g_ready;
/* Producer (task) */
void produce(uint32_t val)
{
g_data = val; /* write data BEFORE setting flag */
COMPILER_BARRIER(); /* ensure data write not reordered after flag write */
g_ready = 1u; /* set flag: consumer can now read g_data safely */
}
/* ARM DMB: Data Memory Barrier — full hardware memory ordering guarantee */
/* Required on Cortex-M when using LDREX/STREX or when cores share memory */
#define DMB() __asm__ volatile ("dmb" ::: "memory")
#define DSB() __asm__ volatile ("dsb" ::: "memory") /* Data Synchronisation Barrier */
#define ISB() __asm__ volatile ("isb" ::: "memory") /* Instruction Synchronisation */
/* DSB required before: enabling interrupts after configuring interrupt source */
void enable_irq_safe(uint32_t irqn)
{
configure_irq_source(irqn); /* configure peripheral to generate IRQ */
DSB(); /* wait for write to peripheral to complete */
ISB(); /* flush instruction pipeline */
NVIC_EnableIRQ(irqn); /* enable NVIC; now IRQ can fire safely */
}
volatile and Compiler Optimisation Interaction
| Pattern | Without volatile | With volatile |
|---|---|---|
| Read in a loop: while(!flag){} | Hoisted out: infinite loop or one read | Reloads each iteration: correct |
| Dead store elimination: reg = 0x12; reg = 0x34 | First write removed (overwritten immediately) | Both writes emitted: hardware sees both |
| Multiple reads of same address | May read once, reuse value | New load instruction per access |
| Const-folding of hardware read | May substitute constant value | Always generates memory read instruction |
Summary
volatile is a promise to the compiler: do not optimise away or merge accesses to this object, because its value can change outside the compiler's view. The four mandatory uses are hardware registers, flags shared between an ISR and a task, local variables modified between setjmp and longjmp, and variables watched by a debugger. Compiler barriers (asm volatile("" ::: "memory")) prevent the compiler from reordering memory accesses; ARM DMB/DSB instructions enforce ordering in hardware for multi-core or DMA scenarios. Correct concurrent code needs both: a compiler barrier alone is not sufficient when the CPU or memory system can reorder accesses.
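The producer in the barrier listing has a matching consumer side that the listing does not show. The following is a minimal sketch of both halves, assuming a GCC or Clang toolchain and a single core; try_consume is a hypothetical name introduced here, not part of the original code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* GCC/Clang syntax; other toolchains need their own equivalent (assumption). */
#define COMPILER_BARRIER() __asm__ volatile ("" ::: "memory")

static volatile uint32_t g_data;
static volatile uint8_t  g_ready;

/* Producer: write the data first, then publish the flag. */
void produce(uint32_t val)
{
    g_data = val;
    COMPILER_BARRIER();      /* data write stays before the flag write */
    g_ready = 1u;
}

/* Hypothetical consumer: returns true and stores the value if one was ready. */
bool try_consume(uint32_t *out)
{
    assert(out != NULL);
    if (g_ready == 0u) {
        return false;        /* nothing published yet */
    }
    COMPILER_BARRIER();      /* data read stays after the flag read */
    *out = g_data;
    g_ready = 0u;            /* hand the slot back to the producer */
    return true;
}
```

On a multi-core target, both COMPILER_BARRIER() calls would be replaced by the DMB() macro so that the hardware also keeps the two accesses in order.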
🔬 volatile, const, and Memory Barriers — Exact Semantics
These C/C++ concepts are among the most frequently misunderstood in embedded systems. Each has a precise meaning, although some details (such as what counts as a volatile access) are implementation-defined:
- volatile: Tells the compiler that every read/write of this variable must generate an actual memory access. The compiler cannot cache it in a register and cannot reorder it relative to other volatile accesses, although non-volatile accesses may still be moved around it. This is necessary for: (a) memory-mapped I/O registers, (b) variables modified by ISRs, (c) flags written by one task and polled by another on the same core. Critical: volatile does NOT provide atomicity or memory ordering on multi-core processors. A 32-bit volatile write on one Cortex-A core is not guaranteed to be observed by a core in a different cluster in order with the writer's other stores unless a data memory barrier (DMB) is used.
- const: Tells the compiler that the variable's value will not be changed through this particular pointer/reference. The compiler may place const objects in ROM (flash). AUTOSAR P2CONST, CONSTP2CONST, and P2VAR macros encode the const-ness of both the pointer and the pointed-to data. Getting this wrong causes MISRA violations and can place mutable calibration data in ROM.
- Compiler barriers (asm volatile("" ::: "memory")): These prevent the compiler from reordering memory accesses across the barrier point, but have no effect on CPU out-of-order execution or write buffering. On single-core Cortex-M they are sufficient for ordering between a task and an ISR. On Cortex-A/R, on multi-core devices, and wherever DMA or caches are involved, hardware memory barrier instructions (DMB, DSB, ISB) are needed as well; an inline-assembly barrier with a "memory" clobber acts as both.
- AUTOSAR MemMap sections: AUTOSAR defines .h file-based linker section selection (MemMap.h) that controls whether variables land in RAM, ROM, fast-RAM, or DMA-accessible regions. A volatile variable in the wrong linker section can silently cause cache incoherency.
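The statement that volatile does not provide atomicity can be illustrated with a counter. This is a sketch using C11 <stdatomic.h> rather than anything from the original text; the names g_eventsVolatile, g_eventsAtomic and the three functions are hypothetical, and it assumes a toolchain where atomic_uint is lock-free.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* An atomic that takes a lock internally must not be used from an ISR. */
static_assert(ATOMIC_INT_LOCK_FREE == 2, "atomic_uint must be lock-free");

static volatile uint32_t g_eventsVolatile = 0u;  /* load, add, store: 3 steps */
static atomic_uint       g_eventsAtomic   = 0u;  /* one indivisible update    */

/* NOT safe if an ISR or another core can run between the load and the store:
 * volatile forces the accesses to happen, it does not fuse them. */
void count_event_volatile(void)
{
    g_eventsVolatile = g_eventsVolatile + 1u;
}

/* Safe from any context: the read-modify-write cannot be interleaved. */
void count_event_atomic(void)
{
    (void)atomic_fetch_add_explicit(&g_eventsAtomic, 1u, memory_order_relaxed);
}

uint32_t events_seen(void)
{
    return (uint32_t)atomic_load_explicit(&g_eventsAtomic, memory_order_relaxed);
}
```

Relaxed ordering is enough here because only the count itself matters; publishing other data alongside it would call for release/acquire ordering or an explicit barrier.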
🏭 Real Bugs Caused by Missing volatile / Barriers
- Bosch ESC ECU (reconstructed example): An ISR sets a uint8_t flag. The main loop polls it with an if(flag) check. With -O2 optimisation, the compiler hoists the flag read out of the loop and caches it in a register — the main loop never sees the ISR's write. Result: the ECU appears to freeze. Fix: declare flag as volatile uint8_t.
- Multi-core spinlock without DMB: A spinlock implementation on TriCore TC399 using a volatile uint32_t lock variable worked correctly in single-core mode. After enabling the second core, occasional deadlocks occurred. Root cause: the lock write was still held in the writing core's store buffer, so Core 1 read a stale value of the lock. Fix: add a DSYNC (data synchronisation) instruction after the lock write, which is the TriCore equivalent of ARM DMB.
- DMA buffer without cache flush: A CAN DMA receive buffer declared as a global uint8_t array was being filled by the DMA controller, but the CPU was reading stale data from the D-cache. Fix: declare the buffer in an uncached memory section (via MemMap.h) or call dcache_invalidate_range() before reading DMA-filled data.
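For the spinlock case, a portable alternative to a volatile lock word plus a hand-placed barrier is C11 atomic_flag with acquire/release ordering, which lets the compiler emit the barrier instructions the target needs. This is a sketch of that swapped-in technique, not the fix used in the example above; all names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

static atomic_flag g_lock = ATOMIC_FLAG_INIT;
static uint32_t g_sharedCounter = 0u;   /* protected by g_lock, so not volatile */

/* Try once; returns true if the lock was free and is now held. */
bool spin_trylock(void)
{
    /* acquire: later accesses cannot move above taking the lock */
    return !atomic_flag_test_and_set_explicit(&g_lock, memory_order_acquire);
}

void spin_lock(void)
{
    while (!spin_trylock()) {
        /* spin until the holder releases the lock */
    }
}

void spin_unlock(void)
{
    /* release: earlier writes are visible before the lock appears free */
    atomic_flag_clear_explicit(&g_lock, memory_order_release);
}

/* Example critical section: returns the counter value after the increment. */
uint32_t increment_shared(void)
{
    spin_lock();
    assert(g_sharedCounter < UINT32_MAX);   /* overflow guard for the sketch */
    g_sharedCounter++;
    uint32_t now = g_sharedCounter;
    spin_unlock();
    return now;
}
```

Whether a target compiler supports <stdatomic.h> for a given core is something to confirm in its documentation; where it does not, the vendor's barrier intrinsic remains the fallback.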
⚠️ volatile / const / Barrier Pitfalls
- Using volatile for multi-core shared data: volatile prevents compiler optimisation but does not prevent CPU reordering on out-of-order multi-core processors. Use proper OS synchronisation primitives (spinlocks, mutexes) or hardware barrier instructions for inter-core data sharing.
- Casting away const: Writing *(uint8_t *)myConstPtr = newValue is undefined behaviour in C when the pointed-to object was defined const. If the linker placed the object in flash ROM, the write will generate a hardware fault or be silently ignored. If it is in RAM, the write may work, but it breaks the immutability guarantee relied upon by other code.
- volatile pointer vs pointer to volatile: volatile uint8_t *ptr means the pointed-to data is volatile (correct for MMIO). uint8_t * volatile ptr means the pointer itself is volatile, not the data. These are completely different. Using the wrong form on MMIO registers lets the compiler optimise away register accesses.
- Assuming volatile ensures visibility across CPU caches: On ARMv7 and ARMv8 multi-core systems, a volatile write by CPU0 is not guaranteed to become visible to CPU1 in the expected order until a DMB (data memory barrier) instruction is executed. MISRA C:2012 Rule 1.3 prohibits undefined and critical unspecified behaviour, and an unsynchronised inter-core data race is undefined behaviour under the C11 memory model.
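The two pointer forms from the pitfall on qualifier placement look like this in practice. The sketch runs on a host, so a volatile variable stands in for the device register; g_fakeStatus, STATUS_READY, g_scratch and the function names are assumptions made for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Host-side stand-in for a device status register (assumption). */
static volatile uint8_t g_fakeStatus = 0u;
#define STATUS_READY 0x01u   /* hypothetical ready bit */

/* Pointer to volatile data: each *p_status is a real load. Right for MMIO. */
static volatile uint8_t *p_status = &g_fakeStatus;

/* Volatile pointer to plain data: only the pointer variable is reloaded;
 * accesses to *p_scratch may still be cached or merged. Wrong for MMIO. */
static uint8_t g_scratch = 0u;
static uint8_t * volatile p_scratch = &g_scratch;

static_assert(sizeof(*p_status) == 1u, "status register is read as one byte");

/* Poll the ready bit at most max_polls times; true if it was seen set. */
bool wait_ready(uint32_t max_polls)
{
    for (uint32_t i = 0u; i < max_polls; i++) {
        if ((*p_status & STATUS_READY) != 0u) {
            return true;
        }
    }
    return false;
}

/* Reads plain data through the volatile pointer. */
uint8_t read_scratch(void)
{
    return *p_scratch;
}
```

Reading the declaration right to left helps: the qualifier applies to whatever stands immediately to its left, or to the base type when it comes first.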
📊 Industry Note
MISRA C:2012 Rule 8.13 (advisory) states that a pointer should point to a const-qualified type whenever possible, that is, whenever the pointed-to object is not modified through it. Static analysis tools (Polyspace, PC-lint, Klocwork) flag violations. In AUTOSAR BSW code, all const usage must follow AUTOSAR Compiler Abstraction macros exactly to pass module-level MISRA compliance checks.
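As a small illustration of the Rule 8.13 intent, a function that only reads its buffer takes a pointer to const. The function and table names below are hypothetical, and the sketch uses plain C declarations rather than the AUTOSAR Compiler Abstraction macros.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* data is only read, so it points to a const-qualified type. */
uint8_t checksum8(const uint8_t *data, size_t len)
{
    assert((data != NULL) || (len == 0u));
    uint8_t sum = 0u;
    for (size_t i = 0u; i < len; i++) {
        sum = (uint8_t)(sum + data[i]);   /* wraps modulo 256 */
    }
    return sum;
}

/* The object itself is const, so the linker may place it in ROM. */
static const uint8_t k_frame[4] = { 0x10u, 0x20u, 0x30u, 0x40u };

uint8_t frame_checksum(void)
{
    return checksum8(k_frame, sizeof k_frame);
}
```

Because the parameter is const-qualified, both ROM tables and ordinary RAM buffers can be passed without a cast.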
🧠 Knowledge Check
❓ Why is volatile insufficient for sharing data between two cores on a Cortex-A53 processor?
❓ What is the difference between a compiler barrier and a hardware memory barrier on ARM Cortex-R?
❓ A SWC declares: volatile const uint8_t * const pSensor;. Describe exactly what is volatile, what is const, and what is mutable.
pSensor is a const pointer: the pointer itself cannot be reassigned and takes its value only from its initialiser. It points to volatile const uint8_t data. 'volatile' means every read through pSensor generates a real memory access (the compiler may not keep the value in a register). 'const' means the value at that address cannot be modified through this pointer. Nothing is mutable from the software side; only the hardware changes the register value. This is the correct declaration for a read-only hardware status register: the address is fixed, the register value is hardware-driven (volatile), and software should not write to it (const).
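The declaration from the last question can be tried on a host with a simulated register. In this sketch g_simSensorReg and sim_hardware_update are hypothetical stand-ins for the hardware; on a real target the initialiser would be a fixed register address instead.

```c
#include <assert.h>
#include <stdint.h>

/* Host-side stand-in for the device: the "hardware" writes this object. */
static volatile uint8_t g_simSensorReg = 0u;

/* Read-only view, as in the question: const pointer to volatile const data. */
static volatile const uint8_t * const pSensor = &g_simSensorReg;

static_assert(sizeof(*pSensor) == 1u, "sensor register is 8 bits wide");

/* Every call performs a fresh load from the register. */
uint8_t read_sensor(void)
{
    return *pSensor;
}

/* Hypothetical helper standing in for the hardware updating the register. */
void sim_hardware_update(uint8_t value)
{
    g_simSensorReg = value;
}

/* *pSensor = 1u;          rejected: data is const through this pointer */
/* pSensor  = &other_reg;  rejected: the pointer itself is const        */
```

The two commented-out lines are the accesses the qualifiers are there to reject at compile time.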