| Mechanism | Type | Fault Detected | Typical DC |
|---|---|---|---|
| Range/plausibility check | SW monitor | Sensor out-of-range; stuck-at | 90–99% |
| Cross-channel comparison | SW monitor | Channel discrepancy (drift, stuck-at) | 90–95% |
| Window watchdog | HW | SW hang; task deadline miss | ≥ 99% |
| CRC on memory/data | HW+SW | Memory corruption; data integrity | ≥ 99% (multi-bit) |
| ECC on flash/RAM | HW | Single/double bit flip | 100% for 1- and 2-bit errors (SEC corrects 1-bit, DED detects 2-bit) |
| ADC reference check | HW+SW | ADC offset; gain error | 95% |
| Voltage supply monitor | HW | Under/over-voltage | 97–99% |
| CPU lockstep (dual core) | HW | CPU instruction execution error | ≥ 99% |
| Memory March test (periodic) | SW | Latent RAM cell fault | 60–90% |
| CRC at startup (ROM test) | SW | ROM corruption | ≥ 99% |
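The first two rows of the table are pure software monitors, so they are easy to illustrate. The following is a minimal sketch, assuming a 12-bit ADC reading and two redundant channels; the limit macros and function names are hypothetical and the thresholds are illustrative, not taken from any datasheet.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limits for a 12-bit ADC channel; real values come from the sensor datasheet. */
#define SENSOR_RAW_MIN   200u
#define SENSOR_RAW_MAX   3900u
#define CHANNEL_MAX_DIFF 50u   /* allowed discrepancy between redundant channels */

/* Range check: flags faults that push the reading towards a supply rail
 * (for example an open or shorted sensor line). */
bool sensor_in_range(uint16_t raw)
{
    return (raw >= SENSOR_RAW_MIN) && (raw <= SENSOR_RAW_MAX);
}

/* Cross-channel comparison: flags drift or a stuck value on one of two redundant channels. */
bool channels_agree(uint16_t a, uint16_t b)
{
    uint16_t diff = (a > b) ? (uint16_t)(a - b) : (uint16_t)(b - a);
    return diff <= CHANNEL_MAX_DIFF;
}
```

A stuck-at fault inside the valid range passes the range check but is caught by the cross-channel comparison once the healthy channel moves away from it, which is why the two monitors are usually combined.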
Hardware Safety Mechanism Types
Window Watchdog Implementation
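The window rule itself can be modelled in a few lines of portable C, independent of any MCU. This is a sketch only, assuming a 1 ms cycle with an open window from 0.5 ms to 0.9 ms after task start; the type and function names are hypothetical, and a real window watchdog performs this check in hardware.

```c
#include <stdint.h>

typedef enum { WDG_OK, WDG_TOO_EARLY, WDG_TOO_LATE } wdg_result_t;

/* Assumed open window, in microseconds after the start of the 1 ms cycle. */
#define WINDOW_OPEN_US  500u
#define WINDOW_CLOSE_US 900u

/* Classify a service request by the time elapsed since the cycle started.
 * Anything other than WDG_OK would cause a reset on real hardware. */
wdg_result_t wdg_classify_service(uint32_t elapsed_us)
{
    if (elapsed_us < WINDOW_OPEN_US) {
        return WDG_TOO_EARLY;   /* e.g. code was skipped */
    }
    if (elapsed_us > WINDOW_CLOSE_US) {
        return WDG_TOO_LATE;    /* e.g. task overran or hung */
    }
    return WDG_OK;
}
```

With these assumed limits, a service at 300 µs is classified as too early, one at 700 µs is accepted, and one at 1100 µs is too late.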
/* Window watchdog: detects both too-slow AND too-fast servicing */
/* Open window: the watchdog may only be serviced during this interval */
/* Closed window: servicing outside the open window triggers a reset */
/* Aurix TC3xx: SCU_WDTCPU0CON0/CON1 registers */
#include "Wdg.h"
#include "WdgM.h"      /* declares WdgM_CheckpointReached() */
#include "SchM_Wdg.h"
/* Watchdog configuration: */
/* - Cycle period: 1ms task */
/* - Window: 0.5ms–0.9ms after task start (service must occur in this window) */
/* - If serviced at 0.3ms (too early): watchdog reset (abnormal execution path, e.g. code was skipped) */
/* - If serviced at 1.1ms (too late): watchdog reset (task took too long) */
void WdgTask_1ms(void)
{
    /* AUTOSAR WdgM: Alive Supervision */
    /* Each ASIL-D supervised entity reports its checkpoint */
    WdgM_CheckpointReached(WDGM_ENTITY_SAFETY_MONITOR, WDGM_CP_ALIVE);
}
/* WdgM Deadline Supervision: */
/* Min time between checkpoints: DEADLINE_MIN */
/* Max time between checkpoints: DEADLINE_MAX */
/* If violated: WdgM transitions to expired state; triggers watchdog reset */
/* Aurix CPU watchdog: hardware window watchdog */
/* STM32: IWDG (independent watchdog) or WWDG (window watchdog) */
/* STM32 WWDG timeout: (T[5:0] + 1) × 4096 × 2^WDGTB / PCLK */
/* A refresh is accepted only while the down-counter is below W[6:0] and above 0x3F */
/* Anti-pattern: unconditional watchdog kick in interrupt */
/* BAD: void SysTick_Handler(void) { HAL_IWDG_Refresh(&hiwdg); } */
/* GOOD: Kick watchdog only from ASIL-D task after confirming work done */
void SafetyMonitor_1ms(void)
{
    boolean all_checks_ok = TRUE;
    all_checks_ok &= SensorRangeCheck_Run();
    all_checks_ok &= CrossChannelCheck_Run();
    all_checks_ok &= CommunicationTimeout_Check();
    if (all_checks_ok) {
        Wdg_SetTriggerCondition(WDG_TRIGGER_NORMAL); /* kick watchdog */
    }
    /* If any check fails: do NOT kick watchdog → system resets → safe state */
}
CPU Lockstep: Dual-Core Safety Architecture
Core 0 (main) Core 1 (checker)
┌──────────────┐ ┌──────────────┐
│ Executes │ │ Executes │
│ same code │ │ same code │
│ same inputs │ │ same inputs │
└──────┬───────┘ └──────┬───────┘
│ │
▼ ▼
┌──────────────────────────────────────────┐
│ Comparator circuit (hardware) │
│ Compares outputs every clock cycle │
│ Any discrepancy → SMU (Safety Mgmt Unit)│
│ → immediate reset / safe state │
└──────────────────────────────────────────┘
Diagnostic coverage: ≥ 99% for transient and permanent CPU faults
Detects: radiation-induced SEU; stuck-at fault; logic error; data corruption
Aurix TC3xx: lockstep-capable TriCore CPUs each have a dedicated checker core (how many depends on the device variant)
NXP S32K3: Cortex-M7 dual-core lockstep
ARM Cortex-R52: hardware lockstep option
Latency: checker core runs N cycles behind main core
(implementation-specific; a delay of about 2 cycles is common; 0 = same cycle comparison)
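The comparator principle in the diagram can be imitated in software for illustration. The sketch below is a simplified model, not how any real lockstep hardware is implemented: `compute` is a hypothetical stand-in for the code both cores run, and `fault_mask` is an assumed way to inject a random hardware fault as bit flips in the checker's result.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical computation that both cores execute on the same input. */
static uint32_t compute(uint32_t x)
{
    return x * 3u + 1u;
}

/* Returns true when the comparator would flag a mismatch.
 * fault_mask models a random hardware fault as bit flips in the
 * checker core's result (0 = no fault injected). */
bool lockstep_mismatch(uint32_t input, uint32_t fault_mask)
{
    uint32_t main_out    = compute(input);
    uint32_t checker_out = compute(input) ^ fault_mask;
    return main_out != checker_out;
}
```

Calling `lockstep_mismatch(5u, 0u)` reports no mismatch, while any non-zero `fault_mask` does. If `compute` itself contained a design error, both results would still be identical and the comparator would stay silent.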
Important: lockstep does NOT catch software design errors
(both cores execute the same wrong code identically)
→ Lockstep: random hardware fault detection only
→ Software correctness: requires process (reviews, testing, verification)
Summary
Hardware safety mechanisms are the technical implementation of the diagnostic coverage claims made in the FMEA and in the SPFM/LFM calculations. The window watchdog detects two failure classes: software hangs (the task does not complete, so no kick arrives) and timing violations (the task completes too fast and kicks too early, which indicates an abnormal execution path). CPU lockstep provides very high DC (≥ 99%) for random hardware faults in the processor but zero coverage for systematic software errors, because both cores execute the same incorrect code identically. This is why ISO 26262 requires both hardware metrics (SPFM, LFM, PMHF) and a rigorous software development process.
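To make the link between a DC claim and the SPFM concrete, the sketch below computes the residual failure rate of each element as λ × (1 − DC) and then SPFM as 1 − Σλ_residual / Σλ. It is a simplified illustration that treats every listed element as safety-related and covered by exactly one mechanism; the struct, the function names, and any FIT values passed in are hypothetical.

```c
#include <stddef.h>

typedef struct {
    double fit;  /* safety-related failure rate in FIT (failures per 1e9 hours) */
    double dc;   /* diagnostic coverage of the safety mechanism, 0.0 to 1.0 */
} element_t;

/* Failure rate that escapes the safety mechanism. */
double residual_fit(double fit, double dc)
{
    return fit * (1.0 - dc);
}

/* Single-point fault metric over n elements (simplified). */
double spfm(const element_t *e, size_t n)
{
    double total = 0.0;
    double residual = 0.0;
    for (size_t i = 0; i < n; i++) {
        total += e[i].fit;
        residual += residual_fit(e[i].fit, e[i].dc);
    }
    return (total > 0.0) ? (1.0 - residual / total) : 1.0;
}
```

For example, an element with 100 FIT at 99% DC and one with 50 FIT at 90% DC leave about 6 FIT residual out of 150 FIT, giving an SPFM of roughly 96%. The element with the lower DC dominates the result even though its raw failure rate is smaller.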
🔬 Deep Dive — Core Concepts Expanded
This section builds on the foundational concepts covered above with additional technical depth, edge cases, and configuration nuances that separate competent engineers from experts. When working on production ECU projects, the details covered here are the ones most commonly responsible for integration delays and late-phase defects.
Key principles to reinforce:
- Configuration over coding: In AUTOSAR and automotive middleware environments, correctness is largely determined by ARXML configuration, not application code. A correctly implemented algorithm can produce wrong results due to a single misconfigured parameter.
- Traceability as a first-class concern: Every configuration decision should be traceable to a requirement, safety goal, or architecture decision. Undocumented configuration choices are a common source of regression defects when ECUs are updated.
- Cross-module dependencies: In tightly integrated automotive software stacks, changing one module's configuration often requires corresponding updates in dependent modules. Always perform a dependency impact analysis before submitting configuration changes.
🏭 How This Topic Appears in Production Projects
- Project integration phase: The concepts covered in this lesson are most commonly encountered during ECU integration testing — when multiple software components from different teams are combined for the first time. Issues that were invisible in unit tests frequently surface at this stage.
- Supplier/OEM interface: This is a topic that frequently appears in technical discussions between Tier-1 ECU suppliers and OEM system integrators. Engineers who can speak fluently about these details earn credibility and are often brought into critical design review meetings.
- Automotive tool ecosystem: Vector CANoe/CANalyzer, dSPACE tools, and ETAS INCA are the standard tools used to validate and measure the correct behaviour of the systems described in this lesson. Familiarity with these tools alongside the conceptual knowledge dramatically accelerates debugging in real projects.
⚠️ Common Mistakes and How to Avoid Them
- Assuming default configuration is correct: Automotive software tools ship with default configurations that are designed to compile and link, not to meet project-specific requirements. Every configuration parameter needs to be consciously set. 'It compiled' is not the same as 'it is correctly configured'.
- Skipping documentation of configuration rationale: In a 3-year ECU project with team turnover, undocumented configuration choices become tribal knowledge that disappears when engineers leave. Document why a parameter is set to a specific value, not just what it is set to.
- Testing only the happy path: Automotive ECUs must behave correctly under fault conditions, voltage variations, and communication errors. Always test the error handling paths as rigorously as the nominal operation. Many production escapes originate in untested error branches.
- Version mismatches between teams: In a multi-team project, the BSW team, SWC team, and system integration team may use different versions of the same ARXML file. Version management of all ARXML files in a shared repository is mandatory, not optional.