Jean-Claude Laprie Award in Dependable Computing

Dr. Jean-Claude Laprie was Directeur de Recherche at LAAS-CNRS, Toulouse, France. He devoted his entire career to research on the dependability of computing systems. His unique capability of abstraction and formalization, and his contributions to the formulation of the concepts and methodologies of dependability, rapidly led to national and international recognition.

He received the IFIP Silver Core in 1992, the Silver Medal of French Scientific Research in 1993, and the Grand Prize in Informatics of the French Academy of Science in 2009. He was made Chevalier de l’Ordre National du Mérite in 2002.

The IFIP 10.4 Working Group on Dependable Computing created the award in his honor in 2011. It recognizes outstanding papers that have significantly influenced the theory and/or practice of Dependable Computing. For 2013, the award committee decided to recognize three seminal papers, one in each of the award’s impact categories:

  • L. Lamport, R. Shostak, and M. Pease, “The Byzantine Generals Problem”, ACM Transactions on Programming Languages and Systems, vol. 4, no. 3, July 1982, pp. 382-401.
  • J. Gray, “Why Do Computers Stop and What Can Be Done About It?”, Symposium on Reliability in Distributed Software and Database Systems, IEEE, 1986, pp. 3-12.
  • W.G. Bouricius, W.C. Carter, and P.R. Schneider, “Reliability Modeling Techniques for Self-Repairing Computer Systems”, Proceedings of the 24th ACM National Conference, ACM, 1969, pp. 295-309.

Lamport, Shostak, and Pease’s “The Byzantine Generals Problem” is the seminal paper on tolerating Byzantine faults in the replication and distribution of information to the redundant components of a fault-tolerant system; indeed, the term Byzantine fault originated with this paper. Before its publication, redundant system designs almost universally rested on the implicit assumption that all redundant components receive identical information, an assumption that is sadly still all too common in contemporary fault-tolerant systems. The paper exposed how a system not proofed against Byzantine behavior can fail because of a single Byzantine fault. Since other failure modes typically require sequences of multiple faults, failure due to a single Byzantine fault is frequently the dominant, though unmodeled, failure mode of unproofed systems.
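
To make the exposure concrete, here is a minimal Python sketch, an illustration assumed for this summary rather than an algorithm from the paper (the source and replica functions are hypothetical), of how a single Byzantine source defeats replication that implicitly assumes identical inputs:

```python
# Hypothetical illustration: a single Byzantine source hands different values
# to different replicas of a naively designed redundant system, so the
# replicas diverge even though each replica itself behaves correctly.

def byzantine_source(replica_id: int) -> int:
    """A faulty source: it tells different replicas different things."""
    return 0 if replica_id % 2 == 0 else 1

def naive_replica(replica_id: int) -> int:
    # Naive design: implicitly assumes every replica saw the same input,
    # so the replica acts directly on whatever value it received.
    return byzantine_source(replica_id)

outputs = [naive_replica(r) for r in range(3)]
print("replica outputs:", outputs)   # e.g. [0, 1, 0]: the redundancy is defeated
assert len(set(outputs)) > 1         # the replicas have diverged
```

Masking such a fault requires an explicit agreement protocol; with unsigned (oral) messages, the paper shows that at least 3f + 1 participants are needed to tolerate f Byzantine faults.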

Gray’s “Why Do Computers Stop and What Can Be Done About It?” had an enormous impact on both industry and academic research in fault-tolerant computing. It was the seminal paper to compile and analyze empirical data on the sources and causes of computer system failures and on how they could be addressed. It turned much of the fault-tolerance community away from its primary focus on hardware failures and toward software failures and operational missteps, which remain the primary and most difficult to address causes of system failures. The paper was also the first to bring to the community’s attention the concepts of Heisenbugs (soft software faults) and Bohrbugs (hard ones). The observation, or hypothesis, that most software failures are Heisenbugs, occurring only infrequently under rarely arising and difficult-to-reproduce circumstances, is now key to most strategies for recovering from software failures.
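
As a rough illustration of why the Heisenbug hypothesis matters for recovery (a hedged sketch, not code from the paper; flaky_operation and run_with_retry are invented names), retrying a failed operation from a clean state, the idea behind Gray’s process pairs and transaction restarts, usually succeeds when the underlying fault is transient:

```python
import random

# Hypothetical sketch of retry-based recovery: if most software failures are
# Heisenbugs (transient, timing- or state-dependent), simply retrying the
# operation from a clean starting state usually succeeds.

random.seed(1)

def flaky_operation() -> str:
    """Stands in for code with a Heisenbug: it fails only under rare conditions."""
    if random.random() < 0.3:        # a rare, hard-to-reproduce circumstance
        raise RuntimeError("transient failure (Heisenbug)")
    return "ok"

def run_with_retry(op, attempts: int = 3) -> str:
    for attempt in range(1, attempts + 1):
        try:
            return op()              # each retry starts from a clean state
        except RuntimeError as err:
            print(f"attempt {attempt} failed: {err}")
    raise RuntimeError("persistent failure; retrying will not help a Bohrbug")

print(run_with_retry(flaky_operation))
```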

Bouricius, Carter, and Schneider’s “Reliability Modeling Techniques for Self-Repairing Computer Systems” set the foundation for the reliability modeling of fault-tolerant computing systems that we now find so fundamental. It was the first paper to rigorously analyze not only the impact of redundant components on a fault-tolerant system (as suggested by von Neumann’s assertion that systems of arbitrary reliability could be built from redundant components) but also the impact on reliability of the coverage and effectiveness of the recovery mechanism (the paper studied failures of the switch-over mechanism to effect the switch). This rigor and refinement made reliability modeling an essential part of fault-tolerant system design and architecture. Primitive though some of the work may seem by current standards of sophistication, it is on the shoulders of this work that much of today’s highly refined modeling rests.
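
The coverage effect can be illustrated with a simplified model in the spirit of their analysis (a sketch under assumed conditions, not the paper’s own model): one active unit with constant failure rate λ, a cold spare that cannot fail while idle, and an instantaneous switch-over that succeeds with probability c, giving R(t) = e^(−λt)(1 + cλt).

```python
import math

# Simplified standby-sparing model: one active unit with constant failure
# rate lam, one cold spare that cannot fail while idle, and a switch-over
# that succeeds with probability c (the coverage).  Under these assumptions,
#     R(t) = exp(-lam * t) * (1 + c * lam * t)

def standby_reliability(lam: float, t: float, c: float) -> float:
    return math.exp(-lam * t) * (1.0 + c * lam * t)

lam, t = 1e-4, 1000.0          # failure rate per hour, mission time in hours
for c in (1.0, 0.99, 0.9, 0.0):
    print(f"coverage {c:4.2f}: R = {standby_reliability(lam, t, c):.6f}")
```

With these illustrative numbers, dropping coverage from 1.0 to 0.9 roughly triples the probability of system failure, and at c = 0 the spare contributes nothing; this sensitivity to coverage is exactly what the paper forced designers to confront.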