The Case for Determinism in Database Systems
Alexander Thomson
Yale University
thomson@

Daniel J. Abadi
Yale University
dna@

ABSTRACT

Replication is a widely used method for achieving high availability in database systems. Due to the nondeterminism inherent in traditional concurrency control schemes, however, special care must be taken to ensure that replicas don't diverge. Log shipping, eager commit protocols, and lazy synchronization protocols are well-understood methods for safely replicating databases, but each comes with its own cost in availability, performance, or consistency. In this paper, we propose a distributed database system which combines a simple deadlock avoidance technique with concurrency control schemes that guarantee equivalence to a predetermined serial ordering of transactions. This effectively removes all nondeterminism from typical OLTP workloads, allowing active replication with no synchronization overhead whatsoever. Further, our system eliminates the requirement for two-phase commit for any kind of distributed transaction, even across multiple nodes within the same replica. By eschewing deadlock detection and two-phase commit, our system under many workloads outperforms traditional systems that allow nondeterministic transaction reordering.

1. INTRODUCTION

Concurrency control protocols in database systems have a long history of giving rise to nondeterministic behavior. They traditionally allow multiple transactions to execute in parallel, interleaving their database reads and writes, while guaranteeing equivalence between the final database state and the state which would have resulted had transactions been executed in some serial order. The key modifier here is "some." The agnosticism of serialization guarantees to which serial order is emulated generally means that this order is never determined in advance; rather, it is dependent on a vast array of factors entirely orthogonal to the order in which transactions may have entered the
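To make the distinction between "some serial order" and a predetermined serial order concrete, the following is a minimal sketch and not the system proposed in the paper. The transactions `deposit` and `double`, the helper `run_serial`, and the single-key database are all hypothetical names chosen for illustration. The sketch shows that two transactions can produce different final states under different serial orders, each of which would satisfy an ordinary serializability guarantee, whereas replicas that apply one agreed order end in the same state.

```python
from itertools import permutations

# Hypothetical transactions for illustration; each mutates the database in place.
def deposit(db):
    db["x"] += 10

def double(db):
    db["x"] *= 2

def run_serial(order, initial):
    """Apply the transactions one at a time, in the given order, to a copy of the initial state."""
    db = dict(initial)
    for txn in order:
        txn(db)
    return db

initial = {"x": 5}
txns = [deposit, double]

# Plain serializability accepts equivalence to ANY serial order,
# so either of these final values of x would count as correct.
outcomes = {run_serial(order, initial)["x"] for order in permutations(txns)}

# With one predetermined order, independently executing replicas agree
# without exchanging any information about how execution went.
replica_a = run_serial(txns, initial)
replica_b = run_serial(txns, initial)

print(sorted(outcomes))        # the two serializable outcomes
print(replica_a == replica_b)  # replicas given the same order match
```

Under these assumptions, the set `outcomes` contains two distinct values, which is exactly the freedom that lets replicas diverge when each chooses its own order. Fixing the order up front removes that freedom.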