7.5 Emerging Memory Architectures

Date: Wednesday 16 March 2016
Time: 14:30 - 16:00
Location / Room: Konferenz 3

Chair:
Costin Anghel, ISEP, FR

Co-Chair:
Fabian Oboril, Karlsruhe Institute of Technology, DE

The first paper presents a method to utilize the variations in RRAM access latency due to IR drop in a given array. The second paper exploits the spatial and temporal locality of cache access, and proposes an ECC scheme wherein write operations with potentially different error rates are mapped to regions with different ECC strengths. The third paper proposes a write scheme for phase change memory to minimize the number of write units.

Time  Label  Presentation Title
Authors
14:30  7.5.1  LEADER: ACCELERATING RERAM-BASED MAIN MEMORY BY LEVERAGING ACCESS LATENCY DISCREPANCY IN CROSSBAR ARRAYS
Speaker:
Hang Zhang, National University of Defense Technology, CN
Authors:
Hang Zhang, Nong Xiao, Fang Liu and Zhiguang Chen, National University of Defense Technology, CN
Abstract
Emerging Resistive Memory (ReRAM) technology is a promising candidate to replace DRAM due to its low leakage power consumption, good scalability, and high density. By employing crossbar structures, the density of ReRAM can be further improved for capacity benefits. However, such a structure also causes an IR drop issue due to wire resistance and sneak currents, which leads to an access latency discrepancy in ReRAM memory banks. Existing designs conservatively use the worst-case latency of ReRAM arrays, and thus fail to exploit the potential of the fast access speed of ReRAM, resulting in sub-optimal performance. In this work, we present an asymmetric ReRAM memory design, which separates a crossbar array into multiple logical regions according to their access latency, and further groups logical regions across different crossbars into virtual regions. Based on the observation of access hotspots inside memory banks, we design a table structure to remap memory requests to different virtual regions with non-uniform access latency, so as to match these access hotspots with the underlying asymmetric bank design. We then introduce both static mapping and dynamic mapping schemes to prioritize memory requests from critical applications to the fast regions for better performance. Experimental results show that our design can improve 4-core system performance by 13.3% and reduce memory latency by 21.6% on average for a ReRAM-based memory system across memory-intensive applications.
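The remapping idea in this abstract can be illustrated with a toy model. The sketch below is not the LEADER design: it assumes a hypothetical set of regions with made-up latencies, counts accesses per region in a trace, and builds a table that sends the most frequently accessed regions to the fastest ones. The function names, the latency values and the one-to-one mapping policy are all assumptions made for illustration.

```python
from collections import Counter


def build_remap_table(access_trace, region_latency_ns):
    """Map the hottest logical regions onto the fastest physical regions.

    access_trace: list of logical region ids, one entry per memory request.
    region_latency_ns: dict of region id -> access latency (hypothetical values).
    Returns a dict of logical region id -> physical region id.
    """
    counts = Counter(access_trace)
    # Hottest logical regions first; regions absent from the trace count as 0.
    logical = sorted(region_latency_ns, key=lambda r: (-counts[r], r))
    # Fastest physical regions first.
    physical = sorted(region_latency_ns, key=lambda r: (region_latency_ns[r], r))
    return dict(zip(logical, physical))


def average_latency(access_trace, region_latency_ns, remap=None):
    """Average access latency of a trace, with or without a remap table."""
    total = 0
    for region in access_trace:
        target = remap[region] if remap is not None else region
        total += region_latency_ns[target]
    return total / len(access_trace)


# Hypothetical latencies: region 0 is closest to the drivers and therefore fastest.
latencies = {0: 100, 1: 150, 2: 200, 3: 250}
trace = [3, 3, 3, 2, 3, 0, 3, 2]  # the hotspot sits in the slowest region

remap = build_remap_table(trace, latencies)
print(average_latency(trace, latencies))         # 218.75
print(average_latency(trace, latencies, remap))  # 125.0
```

In this toy trace the hotspot falls in the slowest region, so remapping it to the fastest one lowers the average latency. A real controller would also have to account for the cost of the table lookup and of migrating data, which this sketch ignores.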

15:00  7.5.2  SLIDING BASKET: AN ADAPTIVE ECC SCHEME FOR RUNTIME WRITE FAILURE SUPPRESSION OF STT-RAM CACHE
Speaker:
Yiran Chen, University of Pittsburgh, US
Authors:
Xue Wang1, Mengjie Mao1, Wujie Wen2, Enes Eken1, Hai Li1 and Yiran Chen1
1University of Pittsburgh, US; 2Florida International University, US
Abstract
Write reliability is one of the major challenges in the design of spin-transfer torque random access memory (STT-RAM) caches. To ensure design quality, an error correction code (ECC) scheme is usually adopted in STT-RAM caches; however, it incurs significant hardware overhead. Observing that error correction requirements are dynamic, we propose Sliding Basket, an adaptive ECC scheme that suppresses the runtime write failures of STT-RAM caches with minimized hardware cost. Our simulation results show that, compared to STT-RAM caches with a conventional ECC scheme, Sliding Basket can achieve up to 80.2% savings in ECC bit overhead with comparable write reliability and even better system performance.

15:30  7.5.3  EXPLOITING MORE PARALLELISM FROM WRITE OPERATIONS ON PCM
Speaker:
Zheng Li, Huazhong University of Science and Technology, CN
Authors:
Zheng Li, Fang Wang, Yu Hua, Wei Tong, Jingning Liu, Yu Chen and Dan Feng, Huazhong University of Science and Technology, CN
Abstract
The number of bits that can be written concurrently to PCM, called the write unit, is restricted by the high write energy consumption, so many serially executed write units are needed to service a cache line, which results in long write times and poor PCM write performance. To address this problem, we propose a novel PCM write scheme called IZV. The key idea behind IZV is to reduce the number of write unit executions in a cache line service. The IZV design includes sFPC (simplified FPC data coding), RW (Reordering Write operations) and WP (Write Parallelism circuits). By means of sFPC, RW and WP, the zero parts of write units can be indicated with predefined prefix bits, and the residues can be reordered and written concurrently under power constraints. IZV is highly effective and efficient in improving performance and reducing energy consumption. Experimental results on 4-core PARSEC 2.0 workloads show that IZV improves performance by 32.5% and reduces energy by 48% and latency by 44% compared with the conventional write scheme. When combined with partly data flip, the IZV variant (IZV-PF) yields a 12% performance improvement, 23% energy saving and 22% latency reduction compared with the state-of-the-art FNW.
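To make the "fewer write units per cache line" idea concrete, the following sketch counts write units for a single cache line under two simplified policies. It is an illustration under stated assumptions, not the IZV scheme itself: the 64-byte line, the 8-byte write unit, the 4-byte word granularity and both function names are hypothetical, and the prefix-bit overhead of sFPC and the power model behind WP are ignored.

```python
import math


def write_units_conventional(line: bytes, unit_bytes: int = 8) -> int:
    """Every byte of the line is written, one write unit after another."""
    return math.ceil(len(line) / unit_bytes)


def write_units_zero_skipping(line: bytes, unit_bytes: int = 8,
                              word_bytes: int = 4) -> int:
    """All-zero words are skipped; the remaining words are packed together.

    Assumes zero words can be flagged elsewhere (for example by prefix bits)
    at no cost, which is a simplification.
    """
    words = [line[i:i + word_bytes] for i in range(0, len(line), word_bytes)]
    nonzero_bytes = sum(len(w) for w in words if any(w))
    return math.ceil(nonzero_bytes / unit_bytes)


# A mostly-zero 64-byte cache line with three non-zero 4-byte words.
line = bytearray(64)
line[0], line[20], line[40] = 1, 7, 9

print(write_units_conventional(bytes(line)))   # 8
print(write_units_zero_skipping(bytes(line)))  # 2
```

The gain in this example comes entirely from the line being mostly zero. A line with no zero words needs the same number of write units under both policies.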

16:00  End of session
Coffee Break in Exhibition Area