System Diagnosis and Fault Tolerance for Distributed Computing System: A Review

ISSN: 2249-5789. P Saikia et al, International Journal of Computer Science & Communication Networks, Vol 3(4), 284-295.

1. Nilotpal Baruah, PhD Research Scholar, Dept. of Comp. Sc., Assam University, Silchar, India, nilotpaldu@
2. Dr. Lakshmi P. Saikia, Professor, Dept. of Computer Sc. & Engg., Assam down town University, Guwahati, India, lp_saikia@
3. Dr. K. Hemachandran, Professor, Dept. of Comp. Sc., Assam University, Silchar, India, khchandran@rediffmail.com

Abstract

This paper reviews adaptive system diagnosis and fault tolerance methods for distributed systems. The system under consideration is a network of N nodes, where N is an integer greater than or equal to 3, and each node can execute an algorithm to communicate with the network. A computer network, often simply called a network, is a collection of hardware components and computers interconnected by communication channels that allow resources and information to be shared. Because a network is built from many hardware components, faults often arise in either the hardware or the software of the network. To deal with such faults, fault diagnosis and fault tolerance mechanisms must be implemented so that the system functions properly. This paper discusses such fault detection and fault tolerance mechanisms: what kinds of faults occur, how they occur, and what solutions are suitable for the problem. Various fault detection mechanisms and fault tolerance methodologies are studied, and the main goal of the study is to identify automatic fault detection and fault tolerance techniques.
Keywords: Distributed System, Network, Fault Tolerance, System Diagnosis, LAN, Middleware, Topology, Client-Server, DCS, Simulation.

1. Introduction

Distributed computing is a method of computer processing