Database Replication Techniques: a Three Parameter Classification

Matthias Wiesmann, Fernando Pedone, André Schiper, Bettina Kemme, Gustavo Alonso

Département de Systèmes de Communication, Swiss Federal Institute of Technology in Lausanne, CH-1015 Lausanne, Switzerland
Software Technology Laboratory, Hewlett-Packard Laboratories, Palo Alto, CA 94304, USA
Institute of Information Systems, Swiss Federal Institute of Technology in Zurich, CH-8092 Zurich, Switzerland
E-mail: dragon@

Abstract

Data replication is an increasingly important topic as databases are more and more deployed over clusters of workstations. One of the challenges in database replication is to introduce replication without severely affecting performance. Because of this difficulty, current database products use lazy replication, which is very efficient but can compromise consistency. As an alternative, eager replication guarantees consistency, but most existing protocols have a prohibitive cost. In order to clarify the current state of the art and open up new avenues for research, this paper analyses existing eager techniques using three key parameters.
In our analysis, we distinguish eight classes of eager replication protocols and, for each category, discuss its requirements, capabilities, and cost. The contribution lies in showing when eager replication is feasible and in spelling out the different aspects a database replication protocol must account for.

1. Introduction

In the distributed systems community, software-based replication is seen as a cost-effective way to increase availability. In the database community, however, replication is used for both performance and fault-tolerance purposes, thereby introducing a constant trade-off between consistency and efficiency. In fact, many commercial [16, 26] and research databases [35] are based on the asynchronous replication model, also called the lazy update model, where changes introduced by a transaction are propagated to other sites only after the transaction has .