Accelerator

Accelerator Stream

The Accelerator stream will develop and demonstrate fully European processor IPs based on the RISC-V Instruction Set Architecture, providing power-efficient, high-throughput accelerator tiles within the GPP chip. Using RISC-V makes it possible to leverage open source resources at both the hardware architecture and software levels, and ensures independence from non-European patented computing technologies.

The EPAC basic building block is a tile containing up to 8 vector processors and specialised units. The processors are coherent, sharing L2 cache banks through a Network-on-Chip, each bank with its associated Home Node agent. The processors will support RISC-V vector instructions, and will also control the specialised units dedicated to Stencil and Deep Learning acceleration. The vector and stencil capabilities will address HPC workloads, while the Deep Learning units will target AI applications.

The vector processor architecture will be based on these guiding principles:

  • Holistic throughput-oriented vision based on long vectors and task-based models
  • Hierarchical concurrency and locality exploitation
  • Communication between programming levels
  • Make it all look very close to classical sequential programming to ensure productivity
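As a rough illustration of how a long-vector loop can still read like sequential code, here is a minimal Python sketch of strip-mining: the loop asks for a vector length on each pass and processes that many elements at once. The function name and the `max_vl` parameter are hypothetical, and the `min(...)` rule is a simplification of how RISC-V vector hardware grants a vector length; it is not taken from the EPAC design.

```python
def vector_add_stripmined(a, b, max_vl=8):
    """Add two equal-length lists in chunks of at most max_vl elements.

    max_vl is a hypothetical stand-in for the maximum hardware vector length.
    """
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    n = len(a)
    out = [0] * n
    i = 0
    while i < n:
        # Simplified "set vector length" step: take as many elements as
        # remain, capped by the maximum vector length.
        vl = min(max_vl, n - i)
        # One "vector instruction": operate on vl elements at once.
        out[i:i + vl] = [x + y for x, y in zip(a[i:i + vl], b[i:i + vl])]
        i += vl
    return out


if __name__ == "__main__":
    print(vector_add_stripmined(list(range(19)), [1] * 19))
```

The same loop works unchanged for any `max_vl`, which is the property that lets code written once run on implementations with different vector lengths.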

The dedicated unit architecture, on the other hand, will be geared towards a few specific applications. This specificity will be leveraged to explicitly manage data placement and transfers into and out of local scratchpad memories, targeting high energy efficiency.

The EPAC tile will be integrated both as a node in the GPP mesh, and as a stand-alone Test Chip for demonstration and software debugging purposes.

Compiler Explorer – Overview

The compiler team at the Barcelona Supercomputing Center working on EPI has set up a Compiler Explorer for an LLVM-based compiler that targets RISC-V with the V-extension (still in draft).

Compiler Explorer is an open-source web application for interactively inspecting the code a compiler generates.

We want to use this tool to ease the analysis and study of the compiler code generation when targeting the RISC-V V-extension. This gives us valuable information in co-design as it can quickly expose pain points in the code generation. These pain points may suggest changes in the V-extension architecture or require new code generation strategies or optimizations in the compiler.

Compiler Explorer is intended for small programs or snippets, not for large applications.

We have also integrated our own user-space functional emulator, `vehave`. This emulator traps the vector extension instructions emitted by the compiler and emulates them with scalar instructions. This way we can execute vector applications and check their correctness under a RISC-V Linux environment. Either real hardware, such as the HiFive Unleashed, or `qemu-user` can be used. Our Compiler Explorer uses `qemu-user`.

Compiler Explorer website is https://repo.hca.bsc.es/epic/

Compiler user guide is available here: user-guide-compiler-explorer

VaRiable Precision Processor VRP

The VaRiable Precision Unit enables efficient computation in scientific domains that make extensive use of iterative linear algebra kernels, such as physics and chemistry. Increasing accuracy inside the kernel reduces rounding errors and therefore improves the stability of the computation. The usual solutions to this problem have a very high impact on memory and computation time (e.g. using double precision in the intermediate calculations).

Hardware support for a variable-precision, byte-aligned data format for intermediate data optimizes both memory usage and computing efficiency. When the standard precision unit cannot reach the expected accuracy in standard (double) precision, the variable precision unit takes over and continues with gradually increasing precision until the tolerance error constraint is met. Offloading from the host processor (GPP) to the VRP unit is done with zero-copy handover thanks to IO-coherency between EPAC and GPP.
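The idea of continuing with gradually increasing precision until a tolerance is met can be sketched in software. The following Python example uses the standard `decimal` module as a stand-in for variable-precision hardware and a Newton iteration for a square root as a stand-in for an iterative kernel. The function names, the 16-digit starting precision, the 16-digit step and the 77-digit cap (roughly 256 bits) are all illustrative assumptions, not VRP parameters.

```python
from decimal import Decimal, localcontext


def residual(x, a, digits):
    """Return |x*x - a|, evaluated with enough guard digits to be exact."""
    with localcontext() as ctx:
        ctx.prec = 2 * digits + 5
        return abs(x * x - a)


def sqrt_growing_precision(a, tol, start_digits=16, step_digits=16, max_digits=77):
    """Newton iteration for sqrt(a) that raises precision until residual <= tol.

    Returns (x, digits), where digits is the working precision that was needed.
    All precision parameters are hypothetical, expressed in decimal digits.
    """
    a = Decimal(a)
    tol = Decimal(tol)
    if a <= 0:
        raise ValueError("a must be positive")
    x = a  # initial guess, reused (warm start) at each precision level
    digits = start_digits
    while digits <= max_digits:
        with localcontext() as ctx:
            ctx.prec = digits
            best = residual(x, a, digits)
            while True:
                candidate = (x + a / x) / 2
                r = residual(candidate, a, digits)
                if r >= best:
                    break  # this precision level cannot improve further
                x, best = candidate, r
        if best <= tol:
            return x, digits
        digits += step_digits  # tolerance not met: continue at higher precision
    raise ArithmeticError("tolerance not reached within max_digits")


if __name__ == "__main__":
    root, used = sqrt_growing_precision(2, "1e-40")
    print(used, root)
```

The iterate from the lower precision level is reused as the starting point at the next level, so work done in standard precision is not thrown away.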

The VRP is embedded as a functional unit in a 64-bit RISC-V processor pipeline. The unit extends the standard RISC-V instruction set with hardwired basic arithmetic operations in variable precision for scalars: add, subtract, multiply and type conversions. It also implements additional specific instructions for comparisons, type conversion and transfers to cache. The unit features a dedicated register file for storing up to 32 scalars with up to 256 bits of mantissa precision. Its architecture is pipelined for performance, and it has an internal parallelism of 64 bits. Thus, internal operations at higher precisions that are multiples of 64 bits are executed by iterating on the existing hardware.
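One way to picture executing a wide operation by iterating on a 64-bit datapath is multi-limb addition with carry propagation. The Python sketch below adds two 256-bit mantissas as four 64-bit limbs, least significant limb first. It is an illustration of the general technique only; the helper names are hypothetical and the actual VRP datapath is not described here.

```python
LIMB_BITS = 64
LIMB_MASK = (1 << LIMB_BITS) - 1


def to_limbs(value, n_limbs):
    """Split a non-negative integer into n_limbs 64-bit limbs, least significant first."""
    return [(value >> (LIMB_BITS * i)) & LIMB_MASK for i in range(n_limbs)]


def from_limbs(limbs):
    """Reassemble an integer from 64-bit limbs, least significant first."""
    return sum(limb << (LIMB_BITS * i) for i, limb in enumerate(limbs))


def add_limbs(a, b):
    """Add two equal-length limb lists, one 64-bit step per iteration.

    Returns (result_limbs, carry_out).
    """
    if len(a) != len(b):
        raise ValueError("operands must have the same number of limbs")
    carry = 0
    out = []
    for x, y in zip(a, b):
        s = x + y + carry          # one pass through the 64-bit adder
        out.append(s & LIMB_MASK)  # keep the low 64 bits
        carry = s >> LIMB_BITS     # carry into the next limb
    return out, carry


if __name__ == "__main__":
    a = to_limbs((1 << 200) + 12345, 4)
    b = to_limbs((1 << 199) + 67890, 4)
    total, carry = add_limbs(a, b)
    print(hex(from_limbs(total)), carry)
```

A 256-bit addition takes four iterations in this model, a 128-bit one takes two, which matches the idea that cost grows with the requested precision.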

The VRP programming model is meant for smooth integration with legacy scientific libraries such as BLAS, MAGMA and linear solver libraries. The integration in the host memory hierarchy is transparent, avoiding the need for data copies, and the accelerator offers standard support for C programs. The libraries are organized so as to expose the variable precision kernels as compatible replacements for their usual counterparts in the BLAS and solver libraries. The complexity of arithmetic operations is confined as much as possible within the lower level library routines (BLAS). Consistently, the explicit control of precision is handled exclusively at solver level.

Stencil/tensor accelerator STX

From the beginning, EPI explicitly considered “specialised blocks for stencil and deep learning (DL) acceleration. The vector and stencil capabilities will address workloads in HPC centres, while the DL block will target learning acceleration” as part of the acceleration stream motivated by “optimised performance and energy efficiency” for “specialised computations”. In the initial DoA, two different domain-specific accelerators (NTX for machine learning, and a stencil accelerator) were suggested. During the first few months of the project, researchers from Fraunhofer Institute, ETH Zürich and University of Bologna were able to merge the functionality of both units into a very efficient computation engine that has been named STX (stencil/tensor accelerator).

Such “domain-specific accelerators” are now a major trend in industry. At the 2019 Hot-Chips symposium and AI Summit, industry heavyweights as well as a multitude of startups presented acceleration engines based on specialised datapaths rather than general purpose vector units, confirming the significant differentiation in architecture needed to achieve top efficiency and performance in the machine learning domain.

The main goal of STX is to achieve significantly higher (at least 5x-10x) energy efficiency than general purpose/vector units. Energy efficiency measures how many computations the unit can perform for a given power budget, and the early target for the STX unit was to achieve at least 5x more energy efficiency (TFLOPS/W) than the vector unit on deep learning applications. In the first few months of the project, it became clear that these estimates are rather conservative, and the effective efficiency within EPI chips will be significantly higher. For applications that require only inference using quantized networks, this efficiency will be another 10x higher.

STX has been designed as a modular building block with several parametrization options. Each STX accelerator consists of several clusters of computing units; a typical instance would have four such clusters. Each cluster in turn consists of specialised computing engines as well as up to two RISC-V cores that are used to control the computing engines and perform additional operations. All these units will access a local scratchpad memory, which will be filled using a centralized DMA unit. This configuration allows for 64 GFLOPS (single precision FP), and multiple instances of STX can be instantiated in an EPAC tile.
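The pattern of filling a local scratchpad by DMA and computing out of it can be sketched for a simple 1D three-point stencil. In the Python example below, each tile is copied into a small local buffer together with one halo element on each side, processed locally, and written back. The function names, the tile size and the averaging stencil are hypothetical choices for illustration and do not describe the STX hardware or its programming interface.

```python
def stencil_direct(data):
    """Reference: 3-point average on interior points, boundaries left unchanged."""
    out = list(data)
    for i in range(1, len(data) - 1):
        out[i] = (data[i - 1] + data[i] + data[i + 1]) / 3.0
    return out


def stencil_tiled(data, tile=4):
    """Same stencil, computed tile by tile out of a small local buffer.

    tile is a hypothetical number of interior points that fit in the scratchpad.
    """
    n = len(data)
    out = list(data)
    for start in range(1, n - 1, tile):
        stop = min(start + tile, n - 1)
        # "DMA in": copy the tile plus one halo element on each side.
        scratch = data[start - 1:stop + 1]
        # Compute using only the local buffer.
        local = [
            (scratch[j - 1] + scratch[j] + scratch[j + 1]) / 3.0
            for j in range(1, len(scratch) - 1)
        ]
        # "DMA out": write the finished tile back to main memory.
        out[start:stop] = local
    return out


if __name__ == "__main__":
    values = [float(v * v) for v in range(10)]
    print(stencil_tiled(values) == stencil_direct(values))
```

Making the copies explicit is what allows the data movement to be scheduled and sized by the programmer or runtime rather than left to a cache.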

STX is programmed using OpenMP. There are solutions that allow regular operations to be offloaded to the STX unit from an Arm system (in the GPP) or the 64-bit RISC-V core (in the EPAC tile), using both GCC and LLVM based flows that will be further refined as part of the project.


3 new R&I projects to boost the digital sovereignty of Europe

The European High Performance Computing Joint Undertaking (EuroHPC JU) has launched 3 new research and innovation projects. The projects aim to bring the EU and its partners in the EuroHPC JU closer to developing independent microprocessor and HPC technology and advance a sovereign European HPC ecosystem. The European Processor Initiative (EPI SGA2), The European PILOT […]
2022-02-04

Successful conclusion of European Processor Initiative Phase One

The European Processor Initiative (EPI) has successfully completed its first three-year phase, delivering cutting-edge technologies for European sovereignty on time and within a limited budget, despite the constraints of the COVID-19 pandemic Highlights include the Rhea general-purpose processor, EPI accelerator proof of concept and embedded high-performance microcontroller for automotive applications The successful completion of this […]
2021-12-22

EPI EPAC1.0 RISC-V Test Chip Samples Delivered

Another step closer to demonstrate the capabilities of a RISC-V based European microprocessor The European Processor Initiative (EPI) https://www.european-processor-initiative.eu/, a project with 28 partners from 10 European countries, with the goal of making EU achieve independence in HPC chip technologies and HPC infrastructure, is proud to announce that EPAC1.0 RISC-V Test Chip samples were delivered […]
2021-09-22

Eric Monchalin is the new Chairman of the EPI Board

General Assembly of European Processor Initiative has selected a new Chairman of the Board in July. Eric Monchalin from Atos, the company that coordinates the EPI project, is going to lead 28 partners from 10 countries in their efforts to design and implement a roadmap for a new family of low-power European processors. Eric is […]
2021-07-21

EuroHPC JU regulation published in the Official Journal of the European Union

Regulation on EuroHPC JU establishment adopted
2021-07-21

EPI to take centre stage at the ACM Europe Summer School on HPC Computer Architectures for AI and Dedicated Applications

Taking place on 30 August – 3 September 2021, the second ACM Europe Summer School on HPC Computer Architectures for AI and Dedicated Applications will be co-hosted by Barcelona Supercomputing Center (BSC), in conjunction with the Universitat Politècnica de Catalunya – Barcelona Tech (UPC). The programme of this year’s summer school, which will be fully […]
2021-07-06

EPI EPAC1.0 RISC-V Test Chip Taped-out

European Processor Initiative has successfully released EPAC1.0 Test Chip for fabrication
2021-06-01

Infineon’s Knut Hufeld Discusses Automotive Developments in EPI

Knut Hufeld, Senior Director R&D with Infineon and an Automotive Stream representative in EPI, talked about the developments in the stream with Ralf Hartmann. 
2021-03-12

EPI EPAC1.0 RISC-V core boots Linux on FPGA

EPI team successfully boots Linux on our EPAC 1.0 core subset implemented on FPGA.
2021-03-09

EPI team at HiPEAC 2021

EPI team participated in several activities at HIPEAC2021.
2021-02-17