11.6 System-Level Thermal Estimation and Management

Printer-friendly version PDF version

Date: Thursday 27 March 2014
Time: 14:00 - 15:30
Location / Room: Konferenz 4

Chair:
Petru Eles, Linköping University, SE

Co-Chair:
Oliver Bringmann, University of Tübingen, DE

This session deals with thermal management issues in multicore and battery-operated systems. In particular the papers cover three orthogonal aspects, specifically, thermal estimation and tracking from sparse sensor readings, proactive dynamic thermal management via task migration, and thermal management of hybrid energy storage devices based on idle period insertion.

TimeLabelPresentation Title
Authors
14:0011.6.1MINIMAL SPARSE OBSERVABILITY OF COMPLEX NETWORKS: APPLICATION TO MPSOC SENSOR PLACEMENT AND RUN-TIME THERMAL ESTIMATION & TRACKING
Speakers:
Santanu Sarma and Nikil Dutt, University of California Irvine, US
Abstract
This paper addresses the fundamental and practically useful question of identifying a minimum set of sensors and their locations through which a large complex dynamical network system and its time-dependent states can be observed. The paper defines the minimal sparse observability problem (MSOP) and provides analytical tools with necessary and sufficient conditions to make an arbitrary complex dynamic network system completely observable. The mathematical tools are then used to develop effective algorithms to find the sparsest measurement vector that provides the ability to estimate the internal states of a complex dynamic network system from experimentally accessible outputs. The developed algorithms are further used in the design of a sparse Kalman filter (SKF) to estimate the time-dependent internal states of a linear time-invariant (LTI) dynamical network system. The approach is applied to illustrate the minimum sensor in-situ run-time thermal estimation and robust hotspot tracking for dynamic thermal management (DTM) of high performance processors and MPSoCs.
14:3011.6.2MDTM: MULTI-OBJECTIVE DYNAMIC THERMAL MANAGEMENT FOR ON-CHIP SYSTEMS
Speakers:
Heba Khdr, Thomas Ebi, Muhammad Shafique, Hussam Amrouch and Jörg Henkel, Karlsruhe Institute of Technology (KIT), DE
Abstract
Thermal hot spots and unbalanced temperatures between cores on chip can cause either degradation in performance or may have a severe impact on reliability, or both. In this paper, we propose mDTM, a proactive dynamic thermal management technique for on-chip systems. It employs multi-objective management for migrating tasks in order to both prevent the system from hitting an undesirable thermal threshold and to balance the temperatures between the cores. Our evaluation on the Intel SCC platform shows that mDTM can successfully avoid a given thermal threshold and reduce spatial thermal variation by 22%. Compared to state-of-the-art, our mDTM achieves up to 58% performance gain. Additionally, we deploy an FPGA and IR camera based setup to analyze the effectiveness of our technique.
15:0011.6.3THERMAL MANAGEMENT OF BATTERIES USING A HYBRID SUPERCAPACITOR ARCHITECTURE
Speakers:
Donghwa Shin1, Massimo Poncino2 and Enrico Macii3
1Department of Computer Engineering, Yeungnam University, KR; 2Politecnico di Torino, IT; 3Dipartimento di Automatica e Informatica, Politecnico di Torino, IT
Abstract
Thermal analysis and management of batteries have been an important research issue for battery-operated systems such as electric vehicles and mobile devices. Nowadays, battery packs are designed considering heat dissipation, and external cooling devices such as a cooling fan are also widely used to enforce the reliability and extend the lifetime of a battery. This type of approaches that target the enhancement of the cooling efficiency via the reduction of the thermal resistance cannot achieve an immediate temperature drop to avoid a thermal emergency situation. Approaches based on removing the heat from the heat sources via idle period insertion (similar to what is done for silicon devices) would allow faster thermal response; however it is not obvious how to implement these schemes in the context of batteries. In this paper, we propose the use of a simple parallel battery-supercapacitor hybrid architecture with a dual-mode discharging strategy that can provide immediate temperature management, in which the supercapacitor is used as an energy buffer during the idle periods of the battery. Simulation results shows that the proposed method can keep the battery temperature within the safe range without external cooling devices while exploiting the advantage of the battery-supercapacitor parallel connection.
15:31IP5-17, 873(Best Paper Award Candidate)
THERMAL ANALYSIS AND MODEL IDENTIFICATION TECHNIQUES FOR A LOGIC + WIDEIO STACKED DRAM TEST CHIP
Speakers:
Francesco Beneventi1, Andrea Bartolini1, Pascal Vivet2, Denis Dutoit2 and Luca Benini1
1DEI - University of Bologna, IT; 2CEA-Leti, Grenoble, FR
Abstract
High temperature is one of the limiting factors and major concerns in 3D-chip integration. In this paper we use a 3D test chip (WIDEIO DRAM on top of a logic die) equipped with temperature sensors and heaters to explore thermal effects. We correlated real temperature measurements with the power dissipated by the heaters using model learning techniques. The resulting compact thermal model is able to predict temperatures at chip locations far from the temperature sensors and to infer the power dissipation at any location of the chip. Results are verified by mean of an off-sample validation technique and show a high accuracy of the compact thermal model when compared with silicon measurements.
15:32IP5-18, 349ADAPTIVE POWER ALLOCATION FOR MANY-CORE SYSTEMS INSPIRED FROM MULTIAGENT AUCTION MODEL
Speakers:
Xiaohang Wang1, Baoxin Zhao1, Terrence Mak2, Mei Yang3, Yingtao Jiang3, Masoud Daneshtalab4 and Maurizio Palesi5
1Guangzhou Institute of Advanced Technology, CN; 2The Chinese University of Hong Kong, CN; 3University of Nevada, Las Vegas, US; 4University of Turku, FI; 5University of Enna, Kore, IT
Abstract
Scaling of future many-core chips is hindered by the challenge imposed by ever-escalating power consumption. At its worst, an increasing fraction of the chips will have to be shut down, as power supply is inadequate to simultaneously switch all the transistors. This so-called dark silicon problem brings up a critical issue regarding how to achieve the maximum performance with a given limited power budget. This issue is further complicated by two facts. First, high variation in power budget calls for wide range power control capability, whereas most current frequency/voltage scaling techniques cannot effectively adjust power over such a wide range. Second, as the applications' behavior becomes more complicated, there is a pressing need for scalability and global coordination, rendering heuristic-based centralized or fully distributed control schemes inefficient. To address the aforementioned problems, in this paper, a power allocation method employing multiagent auction models is proposed, referred as Hierarchal MultiAgent based Power allocation (HiMAP). Tiles act the role of consumers to bid for power budget and the whole process is modeled by a combinatorial auction, whereas HiMAP finds the Walrasian equilibria. Experimental results have confirmed that HiMAP can reduce the execution time by as much as 45% compared to three competing methods. The runtime overhead and cost of HiMAP are also small, which makes it suitable for adaptive power allocation in many-core systems.
15:33IP5-19, 815UNIFIED, ULTRA COMPACT, QUADRATIC POWER PROXIES FOR MULTI-CORE PROCESSORS
Speakers:
Muhammad Yasin1, Ibrahim (Abe) Elfadel2 and Anas Shahrour2
1New York University - Abu Dhabi, AE; 2Masdar Institute of Science and Technology, AE
Abstract
Per-core power proxies for multi-core processors are known to use several dozens of hardware activity monitors to achieve a 2% accuracy on core power estimation. These activity monitors are typically not accessible to the user, and even if they were accessible, there would be a significant overhead in using them at the kernel or OS level for power monitoring or control. Furthermore, when scaled up to hundreds of cores per chip, such power proxies become a computational bottleneck for power management operations such as chip power capping. In this paper, we show that a 4% accuracy or better for per-core power estimation can be achieved using an ultra compact power proxy based on a hybrid set of only four user-accessible parameters, namely core frequency, core temperature, instruction-per-cycle and active-state residency. Our proxy is nonlinear, valid across all P and C states, and is based on a randomized power data collection strategy that aims at exercising all the P and C levels of each core. We illustrate the accuracy of the model using the full suite of the SPEC CPU 2006 benchmarks on a 12-core processor.
15:30End of session
Coffee Break in Exhibition Area
On Tuesday-Thursday the coffee and lunch breaks will be located in the Exhibition Area (Terrace Level).