DMA Fundamentals - Embedded C/C++ for Automotive

DMA Architecture: Freeing the CPU

DMA Transfer: CPU vs DMA

  Without DMA (CPU polling/interrupt per byte):
  CPU:  [Configure SPI] [wait] [Write byte 0] [wait] [Write byte 1] ... [Write byte N-1]
        → CPU busy for entire N-byte transfer

  With DMA:
  CPU:  [Configure DMA] [Trigger transfer] ──────────────────── [DMA Complete IRQ] [Process]
  DMA:  (no CPU involvement)  [byte 0][byte 1][byte 2]...[byte N-1] → SPI peripheral
        → CPU free for other work during entire transfer

  DMA components:
  ├── Source: RAM buffer address, peripheral data register, or another peripheral
  ├── Destination: RAM buffer, peripheral data register, or another peripheral
  ├── Transfer count: N bytes/halfwords/words
  ├── Transfer mode: normal (stop at end), circular (wrap to start)
  └── Interrupt: half-complete, complete, error
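
To make the components above concrete, here is a minimal host-side sketch that models a single channel in software. It is not a driver: the `DmaModel` type and its functions are hypothetical names invented for illustration, and the model assumes a fixed source address (like a peripheral data register) with an incrementing destination.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software model of one DMA channel (illustration only). */
typedef enum { DMA_MODE_NORMAL, DMA_MODE_CIRCULAR } DmaModelMode;

typedef struct {
    const uint8_t *src;   /* source: fixed address, like a peripheral data register */
    uint8_t *dst;         /* destination: start of the RAM buffer */
    size_t count;         /* programmed transfer count */
    size_t remaining;     /* transfers left in the current pass */
    DmaModelMode mode;    /* normal (stop at end) or circular (wrap to start) */
    bool half_flag;       /* models the half-complete event */
    bool complete_flag;   /* models the complete event */
    bool enabled;
} DmaModel;

void DmaModel_Start(DmaModel *ch, const uint8_t *src, uint8_t *dst,
                    size_t count, DmaModelMode mode)
{
    ch->src = src;
    ch->dst = dst;
    ch->count = count;
    ch->remaining = count;
    ch->mode = mode;
    ch->half_flag = false;
    ch->complete_flag = false;
    ch->enabled = (count > 0u);
}

/* Model one peripheral request: move one byte, then update count and flags. */
void DmaModel_OnRequest(DmaModel *ch)
{
    if (!ch->enabled) {
        return;
    }
    size_t index = ch->count - ch->remaining;
    ch->dst[index] = *ch->src;          /* source fixed, destination increments */
    ch->remaining--;

    if (ch->remaining == ch->count / 2u) {
        ch->half_flag = true;
    }
    if (ch->remaining == 0u) {
        ch->complete_flag = true;
        if (ch->mode == DMA_MODE_CIRCULAR) {
            ch->remaining = ch->count;  /* wrap to the start of the buffer */
        } else {
            ch->enabled = false;        /* normal mode: stop at the end */
        }
    }
}
```

In normal mode the model ignores further requests once the count reaches zero; in circular mode the count reloads and the next request overwrites the first element, which is the behaviour the transfer-mode entry describes.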

DMA Channel Configuration (STM32/Cortex-M pattern)

dma_config.c

#include <stdint.h>
#include "Std_Types.h"

/* STM32-style DMA: transfer 256 bytes from ADC to RAM buffer */
typedef struct {
    volatile uint32_t CCR;    /* Configuration */
    volatile uint32_t CNDTR;  /* Number of data */
    volatile uint32_t CPAR;   /* Peripheral address */
    volatile uint32_t CMAR;   /* Memory address */
} DmaChannel_t;

#define DMA1_CH1  ((DmaChannel_t *)0x40020008u)

/* DMA CCR bits */
#define DMA_CCR_EN      (1u << 0u)  /* enable */
#define DMA_CCR_TCIE    (1u << 1u)  /* transfer complete interrupt */
#define DMA_CCR_HTIE    (1u << 2u)  /* half-transfer interrupt */
#define DMA_CCR_TEIE    (1u << 3u)  /* transfer error interrupt */
#define DMA_CCR_DIR     (1u << 4u)  /* direction: 0=periph→mem, 1=mem→periph */
#define DMA_CCR_CIRC    (1u << 5u)  /* circular mode */
#define DMA_CCR_MINC    (1u << 7u)  /* memory increment */
#define DMA_CCR_PSIZE8  (0u << 8u)  /* peripheral size: 8-bit */
#define DMA_CCR_MSIZE8  (0u << 10u) /* memory size: 8-bit */

static volatile uint8_t g_adc_dma_buf[256];  /* DMA destination buffer; volatile because hardware writes it */

Std_ReturnType DMA_StartAdcTransfer(void)
{
    /* Disable channel before configuring */
    DMA1_CH1->CCR  &= ~DMA_CCR_EN;

    /* Configure: ADC data register → g_adc_dma_buf, 256 bytes, circular.
     * ADC1_DR is expected to come from the device header (not shown here). */
    DMA1_CH1->CPAR  = (uint32_t)&ADC1_DR;          /* peripheral: ADC data reg */
    DMA1_CH1->CMAR  = (uint32_t)&g_adc_dma_buf[0]; /* memory: our buffer */
    DMA1_CH1->CNDTR = 256u;                         /* 256 transfers */
    DMA1_CH1->CCR   = DMA_CCR_CIRC     /* circular: refills buffer continuously */
                    | DMA_CCR_MINC     /* increment memory address */
                    | DMA_CCR_HTIE     /* interrupt at half-full */
                    | DMA_CCR_TCIE     /* interrupt when full */
                    | DMA_CCR_PSIZE8   /* 8-bit peripheral data */
                    | DMA_CCR_MSIZE8;  /* 8-bit memory data */

    DMA1_CH1->CCR |= DMA_CCR_EN;  /* start */
    return E_OK;
}
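
One way to check a register sequence like the one above without hardware is to pass the channel in as a pointer and run the same writes against a RAM-backed fake on the host. The sketch below assumes that approach; `Dma_StartCircularRx8` is a hypothetical helper, not part of the listing above, and addresses are passed as plain 32-bit values so the code also compiles on a 64-bit host.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    volatile uint32_t CCR;    /* Configuration */
    volatile uint32_t CNDTR;  /* Number of data */
    volatile uint32_t CPAR;   /* Peripheral address */
    volatile uint32_t CMAR;   /* Memory address */
} DmaChannel_t;

/* Catch accidental padding or reordering at compile time (C11). */
_Static_assert(offsetof(DmaChannel_t, CNDTR) == 4u,  "CNDTR offset");
_Static_assert(offsetof(DmaChannel_t, CPAR)  == 8u,  "CPAR offset");
_Static_assert(offsetof(DmaChannel_t, CMAR)  == 12u, "CMAR offset");

#define DMA_CCR_EN      (1u << 0u)
#define DMA_CCR_TCIE    (1u << 1u)
#define DMA_CCR_HTIE    (1u << 2u)
#define DMA_CCR_CIRC    (1u << 5u)
#define DMA_CCR_MINC    (1u << 7u)
#define DMA_CCR_PSIZE8  (0u << 8u)
#define DMA_CCR_MSIZE8  (0u << 10u)

/* Hypothetical helper: same write sequence as the listing, channel injected. */
void Dma_StartCircularRx8(DmaChannel_t *ch, uint32_t periph_addr,
                          uint32_t mem_addr, uint32_t count)
{
    ch->CCR  &= ~DMA_CCR_EN;          /* disable before reconfiguring */
    ch->CPAR  = periph_addr;
    ch->CMAR  = mem_addr;
    ch->CNDTR = count;
    ch->CCR   = DMA_CCR_CIRC | DMA_CCR_MINC | DMA_CCR_HTIE | DMA_CCR_TCIE
              | DMA_CCR_PSIZE8 | DMA_CCR_MSIZE8;
    ch->CCR  |= DMA_CCR_EN;           /* start */
}
```

On a host, a test can declare a local `DmaChannel_t`, call the helper with placeholder addresses, and compare the resulting `CCR` against the expected bit pattern (CIRC, MINC, HTIE, TCIE and EN give 0xA7). On target, the same helper would be called with `DMA1_CH1`.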

Cache Coherency with DMA (Cortex-M7)

⚠️ D-Cache + DMA = Silent Data Corruption

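On a Cortex-M7 with the D-cache enabled, the CPU may not see data that DMA wrote to RAM if the cache still holds stale lines for that buffer, and DMA may not see data the CPU wrote if the dirty cache lines have not yet been written back to RAM. Required operations:
(1) Before DMA reads a buffer the CPU has written: SCB_CleanDCache_by_Addr() writes the dirty cache lines back to RAM.
(2) After DMA writes a buffer the CPU will read: SCB_InvalidateDCache_by_Addr() discards the stale cache lines.
Both operations work on whole 32-byte cache lines, so align DMA buffers to 32 bytes and pad them to a multiple of 32 bytes; otherwise an invalidate can discard unrelated data that shares a line with the buffer. Cortex-M0/M0+/M3/M4 cores have no D-cache; the issue applies to the Cortex-M7 and to other cores with a data cache, such as the Cortex-M55 and Cortex-M85 when configured with one. The safest approach is to place DMA buffers in a non-cacheable MPU region: annotate the buffer with __attribute__((section(".noinit.dma_buf"))), place that section at a fixed address range in the linker script, and configure the matching MPU region as non-cacheable in startup code.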
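
Cache maintenance by address works on whole 32-byte lines on the Cortex-M7, so the range passed to the clean and invalidate calls is usually rounded out to line boundaries. The sketch below shows only that rounding plus an aligned buffer declaration, so it runs on a host; `CacheRange`, `DmaCache_AlignRange` and `g_dma_rx_buf` are hypothetical names, and the CMSIS calls appear only in a comment because their exact parameter types differ between CMSIS versions.

```c
#include <stddef.h>
#include <stdint.h>

#define DCACHE_LINE_SIZE 32u   /* Cortex-M7 D-cache line size in bytes */
#define DMA_RX_LEN       256u  /* multiple of the line size */

typedef struct {
    uintptr_t start;  /* first byte of the first cache line touched */
    size_t size;      /* length in bytes, a multiple of the line size */
} CacheRange;

/* Expand [addr, addr + len) outward to whole cache lines. */
CacheRange DmaCache_AlignRange(uintptr_t addr, size_t len)
{
    const uintptr_t mask = (uintptr_t)DCACHE_LINE_SIZE - 1u;
    const uintptr_t start = addr & ~mask;               /* round down */
    const uintptr_t end = (addr + len + mask) & ~mask;  /* round up */
    CacheRange range = { start, (size_t)(end - start) };
    return range;
}

/* Line-aligned, line-sized buffer: no other variable can share its lines,
 * so an invalidate cannot discard unrelated data. */
_Alignas(32) uint8_t g_dma_rx_buf[DMA_RX_LEN];

/* On target (assumption: CMSIS-Core for Cortex-M7 is available):
 *   CacheRange r = DmaCache_AlignRange((uintptr_t)g_dma_rx_buf, DMA_RX_LEN);
 *   SCB_CleanDCache_by_Addr((uint32_t *)r.start, (int32_t)r.size);       before DMA reads
 *   SCB_InvalidateDCache_by_Addr((uint32_t *)r.start, (int32_t)r.size);  after DMA writes
 */
```

For example, a 40-byte region starting 16 bytes into a line spans two lines, so the rounded range is 64 bytes starting at the line boundary. With a buffer declared as above, the rounding is a no-op, which is the case to aim for.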

Summary

DMA is essential for high-throughput peripheral communication (ADC streaming, SPI flash programming, Ethernet) without consuming CPU cycles on data movement. Circular DMA mode with half-transfer and transfer-complete interrupts enables double-buffering: the CPU processes the first half of the buffer while DMA fills the second half, and vice versa. On Cortex-M7, cache coherency is the most dangerous DMA pitfall: a single missing clean or invalidate call silently corrupts data, and the fault may only show up intermittently under specific timing conditions. Placing DMA buffers in non-cacheable MPU regions eliminates this class of bug.
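
The double-buffering pattern described above can be sketched as follows. This is a host-runnable illustration under stated assumptions: the two `Dma_On...` functions stand in for whatever the real half-transfer and transfer-complete interrupt handlers call, all names are hypothetical, and "processing" is reduced to computing a mean. No cache maintenance is shown, so on a Cortex-M7 the buffer is assumed to sit in a non-cacheable region.

```c
#include <stddef.h>
#include <stdint.h>

#define ADC_BUF_LEN   256u
#define ADC_HALF_LEN  (ADC_BUF_LEN / 2u)

/* Filled by DMA in circular mode on target; volatile because it changes
 * outside normal program flow. */
volatile uint8_t g_adc_buf[ADC_BUF_LEN];

/* 0: nothing pending, 1: first half ready, 2: second half ready */
static volatile uint8_t g_ready_half = 0u;

/* Called from the (hypothetical) DMA interrupt handler. */
void Dma_OnHalfTransfer(void)     { g_ready_half = 1u; }
void Dma_OnTransferComplete(void) { g_ready_half = 2u; }

/* Main-loop side: process whichever half DMA has just finished.
 * Returns 1 and stores the mean if a half was pending, otherwise 0.
 * Simplification: a real system must finish before DMA wraps back into
 * the same half, and should detect overruns. */
int Adc_ProcessPendingHalf(uint8_t *mean_out)
{
    const uint8_t half = g_ready_half;
    if (half == 0u) {
        return 0;
    }
    g_ready_half = 0u;

    const volatile uint8_t *samples =
        (half == 1u) ? &g_adc_buf[0] : &g_adc_buf[ADC_HALF_LEN];

    uint32_t sum = 0u;
    for (size_t i = 0u; i < ADC_HALF_LEN; i++) {
        sum += samples[i];
    }
    *mean_out = (uint8_t)(sum / ADC_HALF_LEN);
    return 1;
}
```

While the main loop works on one half, DMA keeps writing the other, so the CPU never reads a region that is being overwritten as long as processing one half takes less time than filling one.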

🔬 Deep Dive — Core Concepts Expanded

This section builds on the foundational concepts covered above with additional technical depth, edge cases, and configuration nuances that separate competent engineers from experts. When working on production ECU projects, the details covered here are the ones most commonly responsible for integration delays and late-phase defects.

Key principles to reinforce:

Configuration over coding: In AUTOSAR and automotive middleware environments, correctness is largely determined by ARXML configuration, not application code. A correctly implemented algorithm can produce wrong results due to a single misconfigured parameter.
Traceability as a first-class concern: Every configuration decision should be traceable to a requirement, safety goal, or architecture decision. Undocumented configuration choices are a common source of regression defects when ECUs are updated.
Cross-module dependencies: In tightly integrated automotive software stacks, changing one module's configuration often requires corresponding updates in dependent modules. Always perform a dependency impact analysis before submitting configuration changes.

🏭 How This Topic Appears in Production Projects

Project integration phase: The concepts covered in this lesson are most commonly encountered during ECU integration testing — when multiple software components from different teams are combined for the first time. Issues that were invisible in unit tests frequently surface at this stage.
Supplier/OEM interface: This is a topic that frequently appears in technical discussions between Tier-1 ECU suppliers and OEM system integrators. Engineers who can speak fluently about these details earn credibility and are often brought into critical design review meetings.
Automotive tool ecosystem: Vector CANoe/CANalyzer, dSPACE tools, and ETAS INCA are the standard tools used to validate and measure the correct behaviour of the systems described in this lesson. Familiarity with these tools alongside the conceptual knowledge dramatically accelerates debugging in real projects.

⚠️ Common Mistakes and How to Avoid Them

Assuming default configuration is correct: Automotive software tools ship with default configurations that are designed to compile and link, not to meet project-specific requirements. Every configuration parameter needs to be consciously set. 'It compiled' is not the same as 'it is correctly configured'.
Skipping documentation of configuration rationale: In a 3-year ECU project with team turnover, undocumented configuration choices become tribal knowledge that disappears when engineers leave. Document why a parameter is set to a specific value, not just what it is set to.
Testing only the happy path: Automotive ECUs must behave correctly under fault conditions, voltage variations, and communication errors. Always test the error handling paths as rigorously as the nominal operation. Many production escapes originate in untested error branches.
Version mismatches between teams: In a multi-team project, the BSW team, SWC team, and system integration team may use different versions of the same ARXML file. Version management of all ARXML files in a shared repository is mandatory, not optional.
