9.2 Low-Cost, High-Performance NoCs


Date: Thursday 27 March 2014
Time: 08:30 - 10:00
Location / Room: Konferenz 6

Chair:
Kees Goossens, Eindhoven University, NL

Co-Chair:
Luca Ramini, University of Ferrara, IT

This session pushes the boundaries of NoC performance optimization while accounting for implementation constraints. The first paper adds express channels to the topology and then performs smart application mapping. The second paper instead chooses the TDM NoC route to provide guaranteed performance, and significantly optimizes the TDM scheduling process. Finally, the last paper reduces the size of a router's virtual channel buffers while also providing elasticity.

Time  Label  Presentation Title
Authors
08:30  9.2.1  APPLICATION MAPPING FOR EXPRESS CHANNEL-BASED NETWORKS-ON-CHIP
Speakers:
Di Zhu1, Lizhong Chen1, Siyu Yue2 and Massoud Pedram1
1Univ. of Southern California, US; 2University of Southern California, US
Abstract
With the emergence of many-core multiprocessor systems-on-chip (MPSoCs), on-chip networks are facing serious challenges in providing fast communication for various tasks and cores. One promising solution shown in recent studies is to add express channels to the network as shortcuts to bypass intermediate routers, thereby reducing packet latency. However, this approach also greatly changes the packet delay estimation and traffic behavior of the network, neither of which has yet been exploited in existing mapping algorithms. In this paper, we explore the opportunities in optimizing application mapping for express channel-based on-chip networks. Specifically, we derive a new delay model for this type of network, identify its unique characteristics, and propose an efficient heuristic mapping algorithm that increases the bypassing opportunities by reducing unnecessary turns that would otherwise impose the entire router pipeline delay on packets. Simulation results show that the proposed algorithm can achieve a 2-4x reduction in the number of turns and a 10-26% reduction in the average packet delay.
09:00  9.2.2  PARALLEL PROBE BASED DYNAMIC CONNECTION SETUP IN TDM NOCS
Speakers:
Shaoteng Liu, Axel Jantsch and Zhonghai Lu, KTH, SE
Abstract
We propose a Time-Division Multiplexing (TDM) based connection-oriented NoC with a novel double-time wheel router architecture combined with a run-time parallel probing setup method. In comparison with traditional TDM connection setup methods, our design has the following advantages: (1) it allocates paths and time slots at run-time; (2) it is fast, with predictable and bounded setup latency; (3) it avoids additional resources (no auxiliary network or central processor to find and manage connections); (4) it is fully distributed and therefore scales well with network size. Compared to a packet-based setup method, our probe-based design can reduce path setup delay by up to 81% and increase network load by 110% in an 8x8 mesh, while avoiding the auxiliary network. Compared to a centralized method, our solution can double the success rate, while eliminating the central resource for path setup and reducing the wire overhead. Synthesis results suggest that our design is faster and smaller than all comparable solutions.
09:30  9.2.3  ELASTISTORE: AN ELASTIC BUFFER ARCHITECTURE FOR NETWORK-ON-CHIP ROUTERS
Speakers:
Giorgos Dimitrakopoulos1, Ioannis Seitanidis1, Anastasios Psarras1 and Chrysostomos Nicopoulos2
1Democritus University of Thrace, GR; 2University of Cyprus, CY
Abstract
The design of scalable Network-on-Chip (NoC) architectures calls for new implementations that achieve high-throughput and low-latency operation, without exceeding the stringent area-energy constraints of modern Systems-on-Chip (SoC). The router's buffer architecture is a critical design aspect that affects both network-wide performance and implementation characteristics. In this paper, we extend Elastic Buffer (EB) architectures to support multiple Virtual Channels (VC) and we derive ElastiStore, a novel lightweight elastic buffer architecture that minimizes buffering requirements, without sacrificing performance. The integration of the proposed elastic buffering scheme in the NoC router enables the design of new router architectures, both single-cycle and two-stage pipelined, which offer the same performance as baseline VC-based routers, albeit at a significantly lower area/power cost.
10:00  IP4-12, 581  DYNAMIC CONSTRUCTION OF CIRCUITS FOR REACTIVE TRAFFIC IN HOMOGENEOUS CMPS
Speakers:
Marta Ortín-Obón1, Darío Suárez-Gracia1, María Villaroya-Gaudó1, Cruz Izu2 and Víctor Viñals-Yúfera1
1University of Zaragoza, ES; 2University of Adelaide, AU
Abstract
Networks on Chip (NoCs) have a large impact on system performance, area and energy. Considering the characteristics of the memory subsystem while designing the NoC helps identify improvement opportunities and build more efficient designs. Leveraging the frequent request-reply pattern, our proposal dynamically builds the reply path in advance, is able to share circuits between messages, and even removes some implicit replies, significantly reducing NoC latency. A careful implementation of this circuit reservation mechanism achieves an average 17% reduction in router energy consumption, 8% smaller router area and a 2% system performance increase, compared with its baseline counterpart.
10:01  IP4-13, 646  IMPROVING HAMILTONIAN-BASED ROUTING METHODS FOR ON-CHIP NETWORKS: A TURN MODEL APPROACH
Speakers:
Poona Bahrebar and Dirk Stroobandt, Ghent University, BE
Abstract
The overall performance of Multi-Processor System-on-Chip (MPSoC) platforms depends highly on the efficient communication among their cores in the Network-on-Chip (NoC). Routing algorithms are responsible for the on-chip communication and traffic distribution through the network. Hence, designing efficient and high-performance routing algorithms is of significant importance. In this paper, a deadlock-free and highly adaptive path-based routing method is proposed without using virtual channels. This method strives to exploit the maximum number of minimal paths between any source and destination pair. The simulation results in terms of performance and power consumption demonstrate that the proposed method significantly outperforms the other adaptive and non-adaptive schemes. This efficiency is achieved by reducing the number of hotspots and smoothly distributing the traffic across the network.
10:00  End of session
Coffee Break in Exhibition Area
On Tuesday-Thursday the coffee and lunch breaks will be located in the Exhibition Area (Terrace Level).