Fault-tolerance Techniques in Computer System

Last Updated : 17 Feb, 2023

Fault-tolerance is the ability of a system to keep working correctly in spite of failures occurring within it. Even after extensive testing, a system can still fail; in practice, no system can be made entirely error-free. Systems are therefore designed so that, when an error or failure occurs, they continue to work properly and still give correct results.

Any system has two major components: hardware and software. A fault may occur in either of them, so there are separate fault-tolerance techniques for hardware and for software.

Hardware Fault-tolerance Techniques:

Making hardware fault-tolerant is generally simpler than making software fault-tolerant. These techniques keep the hardware working properly and giving correct results even when a fault occurs in some hardware part of the system. Two techniques are basically used for hardware fault-tolerance:

BIST - BIST stands for Built-In Self-Test. The system tests itself periodically, again and again. When it detects a fault, it switches out the faulty component and switches in a redundant copy of it. In effect, the system reconfigures itself when a fault occurs.

TMR - TMR stands for Triple Modular Redundancy. Three redundant copies of a critical component are created and all three run concurrently. Their results are put to a vote and the majority result is selected. TMR can tolerate a single fault at a time, because the two fault-free copies still outvote the faulty one.

Software Fault-tolerance Techniques:

Software fault-tolerance techniques are used to keep software reliable when faults and failures occur. There are three techniques used in software fault-tolerance.
The first two techniques are common and are essentially adaptations of the hardware fault-tolerance techniques.

N-version Programming - In N-version programming, N versions of the same software are developed independently by N individuals or groups of developers. It is the software counterpart of TMR in hardware fault-tolerance. All the redundant versions are run concurrently; because a faulty version may produce a result that differs from the others, the results are compared and the majority result is accepted. The idea is to mask faults introduced during development, since independently developed versions are unlikely to contain the same fault.

Recovery Blocks - The recovery block technique is similar to N-version programming, but here the redundant copies are produced using different algorithms. The copies are not run concurrently; they are run one by one, and the next copy is tried only if the result of the previous one is not acceptable. Because the copies run in sequence, the recovery block technique can only be used where the task deadline is longer than the task computation time.

Check-pointing and Rollback Recovery - This technique differs from the two software fault-tolerance techniques above. The system is tested each time some computation is performed, and a state that passes is saved as a checkpoint. When a fault is detected later, the system rolls back to the last checkpoint and resumes from there. This technique is mainly useful in the case of processor failure or data corruption.

Author: pp_pankaj. Tags: Software Engineering.
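The majority voting used by TMR and by N-version programming can be sketched in a few lines of Python. This is only a minimal illustration, not a production voter: the names `majority_vote`, `run_redundant`, and the three `square_*` versions are hypothetical, the versions are called one after another rather than concurrently for simplicity, and results are assumed to be hashable so they can be counted.

```python
from collections import Counter
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def majority_vote(results: Sequence[T]) -> T:
    """Return the value reported by a strict majority of the replicas."""
    if not results:
        raise ValueError("no results to vote on")
    value, count = Counter(results).most_common(1)[0]
    # A strict majority is required; otherwise the fault cannot be masked.
    if count * 2 <= len(results):
        raise RuntimeError("no majority among replica results")
    return value


def run_redundant(versions: Sequence[Callable[[int], T]], x: int) -> T:
    """Run every version on the same input and vote on the outputs."""
    return majority_vote([version(x) for version in versions])


# Three hypothetical, independently written versions of the same function.
def square_a(x: int) -> int:
    return x * x


def square_b(x: int) -> int:
    return x ** 2


def square_faulty(x: int) -> int:
    return x * x + 1  # injected fault for demonstration


print(run_redundant([square_a, square_b, square_faulty], 7))  # prints 49
```

With three versions, one faulty result is outvoted by the other two. If two versions fail in different ways, no value reaches a majority and the sketch raises an error instead of returning a wrong answer.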
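The recovery block idea, combined with a simple checkpoint and rollback, might look like the following Python sketch. It rests on assumptions that go beyond the article: each alternate's result is checked by an acceptance test (the usual way to decide whether the next alternate must run), and the checkpoint is just a deep copy of the input state. The names `recovery_block`, `primary_sort`, `backup_sort`, and `accept` are hypothetical.

```python
import copy
from typing import Callable, List, Sequence


def recovery_block(
    state: List[int],
    alternates: Sequence[Callable[[List[int]], List[int]]],
    acceptance_test: Callable[[List[int], List[int]], bool],
) -> List[int]:
    """Try each alternate in turn until one passes the acceptance test."""
    checkpoint = copy.deepcopy(state)  # state saved before any alternate runs
    for alternate in alternates:
        candidate = copy.deepcopy(checkpoint)  # roll back to the checkpoint
        try:
            result = alternate(candidate)
        except Exception:
            continue  # a crash counts as a failed alternate
        if acceptance_test(checkpoint, result):
            return result
    raise RuntimeError("all alternates failed the acceptance test")


def primary_sort(data: List[int]) -> List[int]:
    data.pop()  # injected fault: silently loses one element
    data.sort()
    return data


def backup_sort(data: List[int]) -> List[int]:
    return sorted(data)  # alternate built on a different routine


def accept(original: List[int], result: List[int]) -> bool:
    """Accept only results that keep every element and are in order."""
    in_order = all(a <= b for a, b in zip(result, result[1:]))
    return len(result) == len(original) and in_order


print(recovery_block([3, 1, 2], [primary_sort, backup_sort], accept))  # prints [1, 2, 3]
```

Because the alternates run one after another, the worst-case running time is the sum of all of them, which is why the technique fits only tasks whose deadline leaves room for more than one attempt.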