
ETM: Embedded Trace Macrocell (ARM)

ETM Trace Data Path
  CPU instruction fetch pipeline
       │  ETM monitors: PC, branch decisions, exception entries
       ▼
  ETM Core (e.g., ETMv4 on Cortex-M7)
  ├── Compression: only branch/exception packets emitted (not every instruction)
  ├── Triggering: trace start/stop on address hit or comparator match
  └── FIFO output → TPIU (Trace Port Interface Unit)
       │
       ├── Serial trace (SWO/SWV): 1 pin, up to ~4 Mbit/s — only ITM/DWT data
       └── Parallel trace port (TRACEDATA[3:0]): 4 pins × 400 MHz, 1 bit per pin per cycle = 1.6 Gbit/s
            │
            ▼
  Lauterbach LA-7780 trace capture hardware
  └── Stores up to 2 GB compressed trace; decompresses to full instruction trace
       │
       ▼
  TRACE32: ETM.List — shows every executed instruction with timestamps
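
The bandwidth figures in the diagram follow from simple arithmetic. The sketch below reproduces them, assuming one bit per pin per clock cycle and ignoring protocol framing overhead; the function name `port_bandwidth_bits_per_s` is illustrative and not part of any tool.

```python
# Raw trace-port bandwidth, assuming one bit per pin per clock cycle.
# Real ports lose part of this to framing and may use different clocking.

def port_bandwidth_bits_per_s(pins: int, clock_hz: float) -> float:
    """Raw bandwidth of a trace port: pins x clock."""
    return pins * clock_hz

swo = port_bandwidth_bits_per_s(pins=1, clock_hz=4e6)         # ~4 Mbit/s serial trace
parallel = port_bandwidth_bits_per_s(pins=4, clock_hz=400e6)  # 4-pin parallel port

print(f"SWO:      {swo / 1e6:.1f} Mbit/s")
print(f"Parallel: {parallel / 1e9:.1f} Gbit/s")
print(f"Ratio:    {parallel / swo:.0f}x")
```

The gap of roughly two orders of magnitude is why the serial path carries only ITM/DWT data while instruction trace goes out over the parallel port.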

MCDS: Multi-Core Debug Solution (Aurix TC3xx)

| Feature | ETM (ARM) | MCDS (Aurix) |
| --- | --- | --- |
| Protocol | ARM CoreSight packet protocol | Nexus IEEE-ISTO 5001 Class 3+4 |
| Bandwidth | TPIU parallel, 4 pins × 400 MHz | MCDS up to 16 pins × 800 MHz = 12.8 Gbit/s |
| Cores covered | One ETM per CPU core | All Aurix cores (TC0, TC1, TC2) + DMA + GTM shared |
| Data trace | DWT (4 comparators) + ETM data trace | MCDS hardware buffer: full memory access trace |
| Timestamp | Global timestamp packet in ETM stream | Aurix STM-synced timestamps; cross-core correlation possible |
| TRACE32 command | ETM.ON / ETM.List | MCDS.ON / MCDS.List |
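
Cross-core correlation amounts to placing events from several cores on one time axis. The sketch below shows the idea on hypothetical data: the `TraceEvent` type, the event names, and the tick values are invented for illustration, and it assumes each core's stream is already sorted and stamped from a shared time base.

```python
import heapq
from typing import NamedTuple

class TraceEvent(NamedTuple):
    timestamp: int   # ticks of a shared time base (hypothetical unit)
    core: str
    event: str

# Hypothetical per-core streams, each already sorted by timestamp.
tc0 = [TraceEvent(100, "TC0", "write shared_flag"),
       TraceEvent(340, "TC0", "enter Task_10ms")]
tc1 = [TraceEvent(120, "TC1", "read shared_flag"),
       TraceEvent(300, "TC1", "enter CanRx_HandleMsg")]

def merge_streams(*streams):
    """Merge sorted per-core streams into one globally ordered timeline."""
    return list(heapq.merge(*streams, key=lambda e: e.timestamp))

timeline = merge_streams(tc0, tc1)
for ev in timeline:
    print(f"{ev.timestamp:>6}  {ev.core}  {ev.event}")
```

With a common time base, the merged timeline shows directly that TC1 read `shared_flag` after TC0 wrote it. Without synchronized timestamps that ordering cannot be established.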

ETM/MCDS Trace Capture in TRACE32

CMM: etm_capture.cmm
// Enable ETM instruction trace on Cortex-M7 (LA-7780 probe required)

// Step 1: Configure ETM
ETM.ON                          // enable ETM hardware
ETM.DataTrace NONE              // instruction trace only (data trace = bandwidth intensive)
ETM.CLOCK 200MHz                // inform TRACE32 of core clock (for timestamp decode)

// Step 2: Configure trace port (TPIU)
// 4-bit parallel trace port at 200 MHz = 800 Mbit/s effective bandwidth
TPIU.PortSize 4                 // 4 trace pins
TPIU.PortClock 200MHz

// Step 3: Allocate trace buffer on LA-7780
Trace.METHOD Analyzer           // use onboard LA-7780 trace memory
Trace.Size 256MB                // 256 MB of trace storage

// Step 4: Configure capture trigger (optional — trace entire run if not set)
// Trigger: start trace when breakpoint at CanRx_HandleMsg is hit
// Trigger.Set CanRx_HandleMsg /Program

// Step 5: Run and capture
Go
WAIT 2s                         // capture 2 seconds of execution
Break

// Step 6: Review trace
ETM.List /Track Register(PC)    // disassembly view of captured trace
Trace.Chart.Func                // timing chart of function nesting, built from call/return trace

// For Aurix MCDS (replace ETM.* with MCDS.*):
// MCDS.ON ; MCDS.List ; Trace.Chart.Func
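
A quick sanity check for the script above is whether the 2-second capture fits in the 256 MB buffer. The sketch below assumes the worst case, where the 4-bit port at 200 MHz is saturated the whole time, and takes 1 MB as 2^20 bytes. Both are assumptions, and the helper names are illustrative.

```python
def seconds_until_full(buffer_bytes: int, port_bits_per_s: float) -> float:
    """Time until the trace buffer is full at a constant port rate."""
    return buffer_bytes * 8 / port_bits_per_s

def bytes_captured(port_bits_per_s: float, seconds: float) -> float:
    """Bytes produced by the port over a capture window."""
    return port_bits_per_s * seconds / 8

PORT_BITS_PER_S = 4 * 200e6      # 4 pins x 200 MHz = 800 Mbit/s
BUFFER_BYTES = 256 * 1024**2     # 256 MB, assuming 1 MB = 2**20 bytes

fill_time = seconds_until_full(BUFFER_BYTES, PORT_BITS_PER_S)
worst_case = bytes_captured(PORT_BITS_PER_S, 2.0)

print(f"Buffer full after {fill_time:.2f} s at full port rate")
print(f"2 s capture needs at most {worst_case / 1e6:.0f} MB (decimal)")
print("Fits in buffer:", worst_case <= BUFFER_BYTES)
```

Under these assumptions the 2-second window fits with some margin. A longer wait, or a wider or faster port, would need a larger buffer or a trigger that narrows the capture window.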

Trace Analysis: Fault Root Cause

CMM: trace_analysis.cmm
// Analyse trace after a fault to find root cause without reproducing crash

// Scenario: ECU crashed; we have a post-mortem trace buffer dump

// Step 1: Load saved trace
Trace.LOAD crash_trace_20241015.t32t

// Step 2: Find the exception entry (HardFault)
Trace.Find /Exception 3.        // find entry to exception 3 (HardFault)

// Step 3: Step backwards from the fault in the trace
// ETM.List shows every instruction; 'Back' button steps backward through trace
ETM.List /Track 0x80001234      // view instructions around faulting PC

// Step 4: Identify the exact write that caused the fault
// Use trace statistics to find all accesses to the faulted address
// (needs a capture with data trace enabled; ETM.DataTrace NONE records no data accesses)
Trace.Statistics.DATA 0x20001234  // show all accesses to faulted address in trace

// Step 5: Correlate with source
// Click instruction in ETM.List to jump to source
// Source shows: ptr->field = value;  // ptr was stale (freed earlier)

// Step 6: Find where ptr became stale
Trace.Find /Address ptr         // find all assignments to 'ptr' variable in trace

// Total time in trace from earliest to latest event:
PRINT "Trace duration: " Trace.INFO(DURATION) " ms"
PRINT "Instructions captured: " Trace.INFO(COUNT)
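
The backward search in steps 2 to 4 can also be sketched outside the debugger. The example below works on a hypothetical in-memory record list, not on a real TRACE32 export format: the `Record` type, its fields, and the PC values are assumptions for illustration. It finds the HardFault entry and then walks back to the last write to the faulted address.

```python
from typing import NamedTuple, Optional

class Record(NamedTuple):
    kind: str                      # "insn", "write" or "exception"
    pc: int
    address: Optional[int] = None  # data address, only for "write"
    exc: Optional[int] = None      # exception number, only for "exception"

def find_exception(trace, number):
    """Return the position of the first entry to the given exception, or None."""
    for pos, rec in enumerate(trace):
        if rec.kind == "exception" and rec.exc == number:
            return pos
    return None

def last_write_before(trace, pos, address):
    """Walk backwards from pos and return the most recent write to address."""
    for rec in reversed(trace[:pos]):
        if rec.kind == "write" and rec.address == address:
            return rec
    return None

# Hypothetical trace: two writes before the fault, one after it.
trace = [
    Record("write", 0x08000100, address=0x20001234),
    Record("insn", 0x08000104),
    Record("write", 0x08000200, address=0x20001234),
    Record("insn", 0x08000204),
    Record("exception", 0x08000208, exc=3),          # HardFault entry
    Record("write", 0x08000300, address=0x20001234),
]

fault_pos = find_exception(trace, 3)
culprit = last_write_before(trace, fault_pos, 0x20001234)
print(f"HardFault at record {fault_pos}, last write from PC {culprit.pc:#010x}")
```

Searching backwards from the fault rather than forwards from the start matters here: the most recent write before the exception is the first suspect, and later writes are irrelevant.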

Summary

ETM provides compressed instruction trace at up to 1.6 Gbit/s over a 4-pin parallel port, and the full instruction flow can be reconstructed from it as long as the trace FIFO does not overflow. That bandwidth is sufficient for most automotive MCUs running at ≤400 MHz. MCDS on Aurix adds cross-core timestamp correlation and covers the entire SoC (CPU cores + DMA + GTM), making it the right tool for multi-core timing issues. The most powerful use case is post-mortem trace analysis: a crash that is impossible to reproduce under a debugger can be reconstructed from the trace buffer, showing every instruction executed in the seconds before the fault.
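
To see why emitting only branch outcomes is enough to recover every executed instruction, consider the toy model below. It is a simplified sketch, not the real ETM packet format: the program image, the fixed 4-byte instruction size, and the single taken/not-taken flag per branch are assumptions for illustration. The decoder walks the known program image and consumes one recorded outcome at each branch.

```python
INSN_SIZE = 4  # assumed fixed instruction size for this toy ISA

# address -> (mnemonic, branch target, or None for non-branch instructions)
PROGRAM = {
    0x100: ("mov", None),
    0x104: ("cmp", None),
    0x108: ("beq", 0x114),
    0x10C: ("add", None),
    0x110: ("sub", None),
    0x114: ("str", None),
    0x118: ("bne", 0x100),
    0x11C: ("ret", None),
}

def reconstruct(start, outcomes):
    """Rebuild the executed PC sequence from a start address and branch outcomes."""
    pcs = []
    pc = start
    remaining = iter(outcomes)
    while pc in PROGRAM:
        pcs.append(pc)
        _, target = PROGRAM[pc]
        if target is None:
            pc += INSN_SIZE            # straight-line code needs no trace data
            continue
        taken = next(remaining, None)
        if taken is None:              # trace ended: outcome of this branch unknown
            break
        pc = target if taken else pc + INSN_SIZE
    return pcs

flow = reconstruct(0x100, [True, False])
print([hex(pc) for pc in flow])
```

Two recorded outcomes are enough to recover six executed instructions in this example, which illustrates how a trace port much slower than the instruction rate can still yield a complete instruction listing.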

🔬 Deep Dive — Core Concepts Expanded

This section builds on the foundational concepts covered above with additional technical depth, edge cases, and configuration nuances that separate competent engineers from experts. When working on production ECU projects, the details covered here are the ones most commonly responsible for integration delays and late-phase defects.

Key principles to reinforce:

  • Configuration over coding: In AUTOSAR and automotive middleware environments, correctness is largely determined by ARXML configuration, not application code. A correctly implemented algorithm can produce wrong results due to a single misconfigured parameter.
  • Traceability as a first-class concern: Every configuration decision should be traceable to a requirement, safety goal, or architecture decision. Undocumented configuration choices are a common source of regression defects when ECUs are updated.
  • Cross-module dependencies: In tightly integrated automotive software stacks, changing one module's configuration often requires corresponding updates in dependent modules. Always perform a dependency impact analysis before submitting configuration changes.

🏭 How This Topic Appears in Production Projects

  • Project integration phase: The concepts covered in this lesson are most commonly encountered during ECU integration testing — when multiple software components from different teams are combined for the first time. Issues that were invisible in unit tests frequently surface at this stage.
  • Supplier/OEM interface: This is a topic that frequently appears in technical discussions between Tier-1 ECU suppliers and OEM system integrators. Engineers who can speak fluently about these details earn credibility and are often brought into critical design review meetings.
  • Automotive tool ecosystem: Vector CANoe/CANalyzer, dSPACE tools, and ETAS INCA are the standard tools used to validate and measure the correct behaviour of the systems described in this lesson. Familiarity with these tools alongside the conceptual knowledge dramatically accelerates debugging in real projects.

⚠️ Common Mistakes and How to Avoid Them

  1. Assuming default configuration is correct: Automotive software tools ship with default configurations that are designed to compile and link, not to meet project-specific requirements. Every configuration parameter needs to be consciously set. 'It compiled' is not the same as 'it is correctly configured'.
  2. Skipping documentation of configuration rationale: In a 3-year ECU project with team turnover, undocumented configuration choices become tribal knowledge that disappears when engineers leave. Document why a parameter is set to a specific value, not just what it is set to.
  3. Testing only the happy path: Automotive ECUs must behave correctly under fault conditions, voltage variations, and communication errors. Always test the error handling paths as rigorously as the nominal operation. Many production escapes originate in untested error branches.
  4. Version mismatches between teams: In a multi-team project, the BSW team, SWC team, and system integration team may use different versions of the same ARXML file. Version management of all ARXML files in a shared repository is mandatory, not optional.

📊 Industry Note

Engineers who master both the theoretical concepts and the practical toolchain skills covered in this course are among the most sought-after professionals in the automotive software industry. The combination of AUTOSAR standards knowledge, safety engineering understanding, and hands-on configuration experience commands premium salaries at OEMs and Tier-1 suppliers globally.
