Đang chuẩn bị liên kết để tải về tài liệu:
A Systolic Array for Implementing LRU Replacement

Đang chuẩn bị nút TẢI XUỐNG, xin hãy chờ

 Increasing the associativity of a cache reduces both the miss rate and the power consumption. It also makes LRU replacement more difficult to implement. We present a simple systolic array that can be used to implement LRU replacement in arbitrarily associative caches. | T-18 March 13 2002 A Systolic Array for Implementing LRU Replacement J.P. Grossman Abstract Increasing the associativity of a cache reduces both the miss rate and the power consumption. It also makes LRU replacement more difficult to implement. We present a simple systolic array that can be used to implement LRU replacement in arbitrarily associative caches. 1 Introduction One of the important design parameters of a hardware cache is its degree of associativity. Increasing a cache s associativity improves performance by reducing the miss rate Hennessy96 and leads to a lower power implementation Zhang00 . However it also becomes more difficult to implement a least recently used LRU replacement policy. As a result hardware designers opt for simpler replacement strategies such as round-robin Clark01 even though the LRU policy is known to provide better performance Smith82 . In this paper we present a simple systolic array that can keep track of LRU information for a set of cache lines. Since the length of the critical path is constant the approach can be used for N-way associative caches with N arbitrarily large. We begin by constructing a systolic array that can handle one cache access on every other cycle. In section 3 we modify the design to allow a cache access on every cycle. Finally in section 4 we show how to accommodate multiple accesses per cycle. 2 Implementation The central idea is to maintain a list of cache line indices sorted from LRU to MRU most recently used . When a cache line is accessed its line index L is presented to the list and that index is rotated to the MRU position at the end Figure 1a . We can implement this list as a systolic array by advancing L one node per clock cycle along with a single-bit matched signal M indicating whether or not the index has found a match within the array. Until a match is found L is advanced without any changes being made. Once a match is found nodes begin copying values from their neighbours to the right. .