UB03 Session 3

Printer-friendly version PDF version

Date: Tuesday 25 March 2014
Time: 15:00 - 17:30
Location / Room: University Booth, Booth 3, Exhibition Area

LabelPresentation Title
Authors
UB03.01LARA: THE LARA COMPILER SUITE
Authors:
Joao Bispo, Pedro Pinto, Ricardo Nobre, Tiago Carvalho and Joao Cardoso, Universidade do Porto, PT
Abstract
LARA is an aspect-oriented programming (AOP) language which allows the description of sophisticated code instrumentation schemes, advanced mapping strategies including conditional decisions, based on hardware/software resources, and of sophisticated sequences of compiler transformations. Furthermore, LARA provides mechanisms for controlling all elements of a toolchain in a consistent and systematic way, using a unified programming interface. We present three compiler tools developed around the LARA technology, MATISSE, MANET and ReflectC. MATISSE is a compiler which 1) allows analyses and transformations on MATLAB code and 2) generates C code from the MATLAB code. MATISSE can be fully controlled through LARA aspects, which can define the type and shape of MATLAB variables, specify code insertion/removal actions, and define specialization directives and other additional information. MATISSE can output transformed MATLAB code and specialized C code. The knowledge provided by the LARA aspects allows MATISSE to generate C tailored to specific targets (e.g., use statically declared arrays to be compliant with the high-level synthesis tools such as Catapult C). MANET is a source-to-source compiler for ANSI C based on Cetus, and is controlled using LARA aspects. MANET manages to leverage the expressiveness and modularity of LARA to query and manipulate the Cetus AST, providing an easy compilation flow with main goal of code instrumentation and code transformations. LARA aspects allow for a simple selection of program elements in the code which can be analyzed or transformed, by either consulting their attributes or applying actions. Thus, MANET can be used to provide information reports based on compiler analyses, to implement sophisticated code instrumentation strategies, or to perform code optimizations and transformations. ReflectC is a C compiler based on CoSy's compiler framework. CoSy's configurability and retargetability make ReflectC particularly effective for exploration of compiler transformations and optimizations on possible architecture variations, and it is being used for hardware/software co-design and design space exploration (DSE). We will present demos of the tools and the use of LARA aspects and strategies to guide our suite of compilation tools providing: 1) C code generation from MATLAB code, according to information provided by LARA aspects; 2) Instrumentation of C code to be used for collecting specific compile and runtime information (e.g., execution time, range of values for specific variables, custom profiling); 3) User-controlled compiler optimizations targeting several architectures and DSE of sequences of compiler optimizations bearing in mind performance improvements. In addition to presenting examples for each of the tools of the LARA compilation suite, we show an execution of the complete toolchain, controlled by LARA aspects.

More information ...
UB03.02AN AUTOMATED DESIGN FLOW FOR FAST PROTOTYPING OF SIMULINK MODELS ONTO MPSOC
Authors:
Francesco Robino and Johnny Öberg, Royal Institute of Technology, SE
Abstract
Simulink is a modelling environment suitable to model embedded systems at system-level. However there is no standard to rapidly prototype Simulink models onto modern multiprocessor system-on-chip (MPSoC). In this demonstration we show how our NoC System Generator tool can be used as part of an automated platform-based design flow to synthesize a Simulink model to a network-on-chip based MPSoC implementation on FPGA. The performance of the generated prototype scales with the number of processors.

More information ...
UB03.03PATN: A PERFORMANCE ANALYSIS TOOL FOR NOC
Authors:
Yang Chen and Zhonghai Lu, KTH Royal Institute of Technology, SE
Abstract
With processors increased onto a single chip, and more and more time sensitive applications added to on-chip systems, performance bound analysis becomes essential for QoS Network-on-Chip (NoC) designs and evaluations. For the purpose of providing the reliable and automated analysis for QoS NoC, we propose PATN (Performance Analysis Tool for NoC), which automatically computes the end-to-end delay bounds of data flows, and backlog bounds of buffers for NoC with arbitrary topology. PATN is designed based on network calculus, which lies on solid mathematical foundations and provides well-guaranteed accuracy of the results. Network Calculus based analysis has been successfully employed for various communications networks, such as SpaceWire, AFDX, etc.. For example, Airbus adopted and approved the network calculus based analysis for certification on its aircraft A380. In this demonstration, we give a whole view of PATN through two segments. First, we explain the architecture and main functions; show the working flow and printing log by analysing end-to-end delay bound of a data flow in a simple network. The log shows that the analysis follows the theoretical methodology exactly, hence to obtain the correct and tight results, which as good as that the theory can achieve. Second, we use PATN to analyse the delay bounds and backlog bounds for 3 NoCs with different topologies -- binary tree, mesh, and hierarchical topology of binary tree and mesh. The analyses demonstrate computation speed and scalability of PATN. Moreover, comparisons of the delay bound, computed with different configuration parameters of the flows and routers, are conducted. It shows how the delay bound is effected by the parameters.

More information ...
UB03.04COMPILER FOR MAPPING STREAM PROCESSING APPLICATIONS ONTO REAL-TIME HETEROGENEOUS MULTIPROCESSOR SYSTEMS
Authors:
Stefan Geuns, Berend Dekens, Philip Wilmanns, Joost Hausmans, Guus Kuiper and Marco Bekooij, University of Twente, NL
Abstract
Heterogeneous multiprocessors system are employed for power-efficiency reasons in wearable software defined radios. These systems are hardware cost-effective and deliver a superior performance compared to their homogeneous counterparts. However these systems are notoriously hard to program without tool support, which makes it is desirable that programming is simplified with the help of an optimizing multiprocessor compiler for stream processing applications. This demonstration shows our multiprocessor compiler for mapping real-time stream processing applications onto our real-time heterogeneous multi-core system. The applications are described as sequential programs and are compiled into parallel task graphs. Buffer capacities are computed using dataflow analysis techniques given the real-time constraints of the application. Our multi-core system contains 16 MicroBlaze processor cores as well as two hardware accelerators and is prototyped on a Xilinx Virtex-6 FPGA. A connection-less communication ring is used for inter-processor communication. Our system is equipped with an analog RF front-end, which enables us to demonstrate PAL-video reception and decoding.

More information ...
UB03.05HWDEBLUR: DESIGN OF A HIGH PERFORMANCE CORE FOR REMOVING BLUR EFFECT ON IMAGES
Authors:
Giuseppe Airo' Farulla, Giulio Gambardella, Marco Indaco, Paolo Prinetto, Daniele Rolfo and Pascal Trotta, Politecnico di Torino, IT
Abstract
This work aims at developing a high performance FPGA-based IP-core able to perform a deblurring algorithm in real-time. Modern approaches to deblurring usually either only handle simple types of blur, or need heavy user inter-action. Moreover, they usually require several minutes (or even whole hours) to process a single image. Our purpose is to study the current state-of-the-art and identify the best deblurring algorithms that are suitable for a hardware implementation. The selected algorithm is optimized and implemented in hardware in order to perform the deblurring task with highest possible performances.

More information ...
UB03.06PHARAON: PARALLEL AND HETEROGENEOUS ARCHITECTURES FOR REAL-TIME APPLICATIONS
Authors:
Luciano Lavagno1, Mihai Lazarescu1, Hector Posadas2 and Eugenio Villar2
1Politecnico di Torino, IT; 2Universidad de Cantabria, ES
Abstract
In this demo, we will present the work-in-progress of the EU FP7 PHARAON project, started in September 2011. The first objective of the project is the development of new techniques and tools capable to assist the designer in the development of parallel embedded systems, from executable specifications to target-specific implementation and debugging on a multicore platform. This tool chain offers and implements several parallelization strategies, reflecting the functional and non-functional constraints of the system, and driving the designer into incremental parallelization and adaptation steps. The second objective of the project is to develop monitoring and control techniques in the middleware of the system capable to automatically adapt platform services to application requirements and therefore reduce power consumption transparently. The demo will cover specifically: - the software parallelization tool suite, - the parallel software modeling and code generation suite.

More information ...
UB03.07COMPSOC: VIRTUAL EXECUTION PLATFORMS FOR MIXED TIME-CRITICALITY APPLICATIONS
Author:
Kees Goossens, TU Eindhoven, NL
Abstract
System-on-Chip (SOC) design gets increasingly complex, as a growing number of applications are inte- grated in such systems. These applications have mixed time-criticality, i.e., some have firm-, some soft-, and others non-real-time requirements. Executing such a mix of applications on a SOC poses several challenges. First, to reduce cost, platform resources, e.g., processors, interconnect, memories, are shared between applications. However, sharing causes interference between applications, making their behaviors inter- dependent. This results in two problems for SOC design and verification: 1) accurate system-level simulation and several approaches to formal verification are infeasible, because of the explosion in the number of possible combinations of applications, inputs, and resource states and 2) verification becomes a circular process that must be repeated if an application is added, removed, or modified, making integration and verification dominant parts of SOC development, in terms of time and money. The CompSOC platform addresses these problems by executing each application on an independent virtual execution platform (VEP). The VEPs are composable, i.e., cannot affect each other's behaviors. In the temporal domain an applications actual execution never varies by even a single clock cycle. Similarly, the energy and power behaviors of applications are also composable. As a result, applications can be designed, developed, verified, and executed in isolation. The VEPs are also predictable, meaning that all interference is bounded. This makes them virtualized also in terms of performance bounds, which enables firm real-time applications to be verified using formal performance analysis frameworks. The CompSOC platform uses the CoMiK microkernel to implement virtual processors on each processor time through temporal partitioning. Each application can use its own operating system (e.g. Compose, μcOS-III) and model of computation (e.g. CSDF, KPN, TT) in its VEP, to suit its level of time criticality. As more applications are integrated on a single SOC, the need arises for more dynamic behaviour. The system should be able to start, modify and stop applications at run time without affecting running appli- cations. For this purpose the CompSOC platform has been extended with a predictable and composable resource management framework. It manages application bundles that contain 1) an application in the form of executables (ELFs on multiple processors), and also 2) the specifications of the (one or more) particular VEPs that the application executes in, consisting of virtual processors, NOC connections, virtualised mem- ories, etc. At run time, the resource management framework can dynamically load and start application bundles by creating a VEP and then loading, booting, and executing an application within it. VEPs can also be modified, stopped, and deleted at run time. Our University Booth will present virtual-execution-platform and application-bundle concepts using an interactive demonstrator. It will show that the CompSOC has been extended with dynamic functionality, without sacrificing its key strengths: composability and predictability. We will demonstrate this through the use of the resource management framework and application bundles, showing that we can create, modify and delete virtual execution platforms running a mixed time-criticality application dynamically at run-time.

More information ...
UB03.08A HOLISTIC APPROACH TO POWER MANAGEMENT FOR ENERGY HARVESTING EMBEDDED SYSTEMS
Authors:
Kyungsoo Lee, Hideki Takase and Tohru Ishihara, Kyoto University, JP
Abstract
We present a holistic approach to maximizing the energy efficiency of energy harvesting embedded systems which consist of a processor system and an energy harvesting system. A power management program integrated on a real-time OS optimally switches operation mode of the processor and configuration of the energy harvesting system according to the workload of the processor and harvesting situation. The demonstration will show that our prototype system consisting of our processor chip and harvesting system board stably runs using harvested energy only. The processor has multiple cores having a different performance in each to improve the energy efficiency of computation. The energy harvesting board has high transferring efficiency to reduce the power loss. The entire system is controlled efficiently by our power management program implemented on Toppers OS.

More information ...
UB03.09FAULTIFY: PROBABILISTIC CIRCUIT FAULT EMULATION
Authors:
David May and Walter Stechele, TUM, DE
Abstract
We want to demonstrate an FPGA-based probability-aware fault emulator and its corresponding algorithms in the context of a real-time H.264 decoder. The demo will show that reliability constraints can be relaxed inside the circuit without noticeable degradation of the image quality when carefully investigating where the constraints can be relaxed. We will show how this investigation can to be done using our emulator and we will show the effect of a relaxed robustness of the circuit in real-time.

More information ...
UB03.10RTL+: DESIGN ENVIRONMENT: WALK BEFORE YOU RUN.
Authors:
Somayeh Sadeghi-Kohan, Behnaz Pourmohseni, Amir Reza Nekooei, Hanieh Hashemi, Hamed Najafi Haghi and Zainalabedin Navabi, University of Tehran, IR
Abstract
To enable development of high level designs with hardware correspondence, synthesizability must be satisfied in a top-down manner. Thus in this work, instead of using TLM-2.0 which is not established for synthesis, we will start with a level above RT level, "RTL+". RTL+ is basically using TLM-1.0 channels and includes abstract communications and handshakings that are mainly hidden from the designer. We develop a package of SystemC channels with hardware correspondence (synthesizable HDL) for the communication between various cores (with simple interfaces) and standard buses.

More information ...
17:30End of session
18:30Exhibition Reception in Several serving points inside the Exhibition Area (Terrace Level)
The Exhibition Reception will take place in the exhibition area (Terrace Level). All exhibitors are welcome to provide drinks and snacks for delegates and visitors.