Artemis II fault tolerance
NASA's Artemis II spacecraft features a highly fault-tolerant computer system designed to withstand multiple failures during flight. The system uses layered redundancy: eight processors running the flight software in parallel, self-correcting memory, and backup flight software on dissimilar hardware. These safeguards are designed to keep the mission going even after severe faults, such as power loss or radiation-induced errors.
Communications of the ACM published a fascinating post about how NASA built Artemis II’s fault-tolerant computer. Three excerpts stand out. (1) Eight CPUs with several backup scenarios: “Orion utilizes two Vehicle Management Computers, each containing two Flight Control Modules, for a total of four FCMs. But the redundancy goes even deeper: each FCM consists of a self-checking pair of processors. Effectively, eight CPUs run the flight software in parallel. The engineering philosophy hinges on a ‘fail-silent’ design. The self-checking pairs ensure that if a CPU performs an erroneous calculation due to a radiation event, the error is detected immediately and the system responds.
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at Hacker News: Front Page.
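To make the "self-checking pair" and "fail-silent" ideas from the excerpt concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not Orion's flight software: the class name SelfCheckingPair, the helper functions, the toy control law, the latching of the silenced state, and the "take the first module that still produces output" selection rule are all hypothetical. The only things taken from the excerpt are the counts (four FCMs, two CPUs each) and the principle that a pair whose two results disagree should not emit a value.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

Compute = Callable[[float], float]


@dataclass
class SelfCheckingPair:
    """Hypothetical model of one FCM: two lanes run the same computation."""
    lane_a: Compute
    lane_b: Compute
    silenced: bool = False  # assumption: a detected mismatch latches the module off

    def step(self, x: float) -> Optional[float]:
        """Return the agreed result, or None if the module is (or becomes) silent."""
        if self.silenced:
            return None
        a = self.lane_a(x)
        b = self.lane_b(x)
        if a != b:
            # Fail-silent: emit nothing rather than risk emitting a wrong value.
            self.silenced = True
            return None
        return a


def control_law(x: float) -> float:
    """Toy stand-in for a flight computation."""
    return 2.0 * x + 1.0


def upset_control_law(x: float) -> float:
    """Same computation with a simulated radiation-induced error."""
    return control_law(x) + 0.5


def build_modules(faulty: List[int], count: int = 4) -> List[SelfCheckingPair]:
    """Build `count` modules; indices listed in `faulty` get one corrupted lane."""
    return [
        SelfCheckingPair(control_law,
                         upset_control_law if i in faulty else control_law)
        for i in range(count)
    ]


def first_valid_output(modules: List[SelfCheckingPair], x: float) -> Optional[float]:
    """Step every module, then take the first non-silent output (assumed policy)."""
    outputs = [m.step(x) for m in modules]
    for out in outputs:
        if out is not None:
            return out
    return None


if __name__ == "__main__":
    modules = build_modules(faulty=[0])
    print(first_valid_output(modules, 3.0))   # module 0 goes silent, module 1 answers
    print([m.silenced for m in modules])
```

The design point the sketch tries to show is that a pair does not need to know which of its two lanes is wrong. Disagreement alone is enough to withhold the output, and the remaining modules carry on.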