On July 19th, the Council Regulation establishing the European High-Performance Computing Joint Undertaking was published in the Official Journal of the European Union. It was formally adopted on July 13th, by the Economic and Financial Affairs Council.
This is an important piece of news for the HPC community, opening up EuroHPC JU’s ability to draw funds from the Horizon Europe, Digital Europe and the Connecting Europe Facility programmes. The news item and full regulation text can be found on the EuroHPC JU’s website, here:
General Assembly of European Processor Initiative has selected a new Chairman of the Board in July. Eric Monchalin from Atos, the company that coordinates the EPI project, is going to lead 28 partners from 10 countries in their efforts to design and implement a roadmap for a new family of low-power European processors.
Eric is experienced in leading 100+ people organizations and managing multi tens millions of Euros projects in international environments. He is a technology-minded person who values wide range of skills and technological knowledge focused on customer expectations to turn them into reality. Furthermore, Eric’s career has been mainly built on numerous Hardware and Software R&D positions in several companies and various domains like signal processing, embedded systems, communication, storage, High Performance Computing and Artificial Intelligence.
Taking place on 30 August – 3 September 2021, the second ACM Europe Summer School on HPC Computer Architectures for AI and Dedicated Applications will be co-hosted by Barcelona Supercomputing Center (BSC), in conjunction with the Universitat Politècnica de Catalunya – Barcelona Tech (UPC).
The programme of this year’s summer school, which will be fully remote, centres around topics within the European Processor Initiative. Open hardware expert Luca Benini (ETH Zürich / University of Bologna) will be discussing machine learning from a RISC-V platform perspective, while the EPI Accelerator stream leader Jesús Labarta (BSC) will train attendees in performance analysis and hybrid programming, helping them get the very best out of their code. Meanwhile, Mauro Olivieri (Sapienza – University of Rome / BSC) will present vector acceleration in high-performance computing (HPC) and edge devices.
Keynote talks will be given by celebrated computer scientists and engineers including Luca Cardelli (Oxford University), Bill Dally (Stanford University / NVIDIA), Mihaela van der Schaar (Cambridge University) and Mateo Valero (BSC). There will also be invited talks by EPI researchers Roger Espasa (Semidynamics) on the RISC-V Avispado core, John Davis (BSC) on building an open-source ecosystem for HPC, and Marc Casas (BSC) on accelerating deep neural network training.
The school chairpersons are Mateo Valero and Josep Fernandez (UPC), while the local organizing committee is led by Eduard Ayguadé, (BSC and UPC), and Fabrizio Gagliardi (BSC and ACM).
“EPI is revolutionising computer architecture to pave the way for Europe’s technological sovereignty. As part of the EPI training programme, this summer school provides the ideal introduction to some of the initiative’s main themes,” commented Jean-Marc Denis (Atos / SiPearl), chairman of the EPI board.
Upon completion of the school, all attendees will receive a certificate and a complimentary ACM student membership. Based on the scores obtained in the practical exercises, the best performing students will receive a certificate of honour and will also invited to interviews with industry sponsors, with a view to a possible internship.
The school includes a poster session, with a prize for the best poster.
Registration will be open until 15 July, and accepted candidates will be informed by 1 August.
To register, complete the registration form on the ACM Europe website.
The European Processor Initiative (EPI) https://www.european-processor-initiative.eu/, a project with 28 partners from 10 European countries, with the goal of helping the EU achieve independence in HPC technologies is proud to announce that we have successfully booted Linux on our EPAC 1.0 core subset implemented on FPGA.
One key segment of EPI activities is to develop and demonstrate fully European processor IPs based on the RISC-V Instruction Set Architecture, providing power efficient and high throughput accelerator core named EPAC (European Processor Accelerator). Using RISC-V will allow leveraging open-source resources at hardware architecture level and software level, as well as ensuring independence from non-European patented computing technologies.
First silicon implementation of EPAC 1.0 test chip is expected in the second half of 2021 and as an important technical milestone towards that goal, we have successfully booted Linux on a subset of EPAC 1.0 synthesized on FPGA. The FPGA design includes the Avispado RISC-V core, the Vector Processing Unit (VPU), the Network on Chip (NoC), the Shared L2 Cache with Coherence Home Node (L2HN), interrupt controllers, IO peripherals and several other components. This implementation will enormously speed-up software development on the EPI HPC architecture as well as testing and improving the architecture for next generations EPAC chips.
For the development of the EPI Accelerator Test Chip, we make extensive use of FPGA technologies to verify the RTL design of the Test Chip. The subset of EPAC 1.0 emulated on FPGA we use as Software Development Vehicle (SDV) to enable early software development before the actual Test Chip silicon comes back from the foundry.
The use of FPGAs enables the testing of RTL blocks with real and very complex software in timescales that are not tractable using pure RTL simulation. Also, it allowed us to stress memory coherence and several related corner cases. We have managed to boot Linux using the EPAC 1.0 system on FPGA and the system boots within a few dozens of seconds compared to weeks using pure simulation. The system is fully usable and interactive for system software and application development and it also includes Ethernet connectivity to enable running large and complex software packages, e.g., OpenMP, MPI.
This is a serious proof of concept that gives us the confidence of a functional and viable future product.
More details on the results can also be found on EPI YouTube Channel, where you can find the second episode of EPI Talks podcast, themed with booting Linux:
Channel: https://www.youtube.com/c/EuropeanProcessorInitiative/featured
Video: https://www.youtube.com/watch?v=X4klPnBjstI
This year’s HiPEAC conference was supposed to be held in beautiful Budapest, from January 18th to 20th 2021. However, due to the persistent pandemic conditions, it had to be moved to a digital edition.
European Processor Initiative’s consortium gladly accepted the opportunity to participate, both as a sponsor and with a tutorial given to the attendees. Our tutorial was held on Monday, January 18th, and it involved the following topics and presenters:
The tutorial was very well attended, and the attendees had a lot of interesting questions.
In addition to the tutorial, the EPI team has also hosted a virtual Brazen booth, where students, young researchers, and other prospective employees popped in to ask questions about the project. From the industrial side of EPI, Fabrizio Magugliani presented EPI in the industrial session, while Jurgen Becker attended the WRC workshop and gave an invited talk that involved EPI as well.
All of the presentations from HiPEAC given by EPI members can be found in our repository:
https://www.european-processor-initiative.eu/dissemination-material/epi-at-hipeac-2021/, while the tutorial video is also available on our YouTube Channel: https://www.youtube.com/watch?v=4AYDc6THtcM&t=9476s.
Finally, a shout-out to organizers: even though the team is more than eager to participate in the next one, hopefully live, HiPEAC conference, we are thankful to organizers for such a wonderful experience and the opportunity to share our views and ideas.
The European Processor Initiative, a project with 27 partners from 10 European countries, with the goal of helping the EU achieve independence in HPC technologies, is finishing its second year of activities.
Despite 2020 bringing upon our young efforts the circumstances that were previously unimaginable, causing the subsequent cancellation of the first-ever planned European Processor Initiative Forum, the partners in the Initiative managed to stay the course and maintain all activities in all designated streams.
The year began with our latest partner, SiPearl, launching to start their activities dedicated to developing commercialized implementations of our technology. Very soon after announcing their presence on the global stage, SiPearl signed a licensing agreement with Arm and opened a branch in Germany.
At the same time, European Processor Initiative partners have finalized the first version of our RISC-V accelerator architecture, named EPAC, and we look forward to the delivery of the first European Processor Initiative silicon featuring EPAC Test Chip in the exciting year that follows. The EPAC Test Chip silicon will be complemented with PCIe EPAC Test Platform enabling the test and enhancements of the architecture for future revisions.
At the software level, we already have a compiler supporting RISC-V vector intrinsics and automatic parallelization of C/C++ codes. We are evaluating the generated code on emulation platforms that provide detailed insight for the holistic co-design of applications, compiler, and architecture. We also have other software development vehicles (SDV) where we are adapting the Operating System for the Heterogeneous ARM+RISC-V architecture of the European Processor Initiative project.
Our automotive activities in the previous year have been focused on the design of state-of-the-art automotive high-performance computing proof-of-concept with the ambition to demonstrate how European Processor Initiative IP will enable future ADAS functionality, paving the way to exploit the GPP, the RISC-V platform, the Kalray MPPA, and the Menta eFPGA IP.
Based on this project progress, the Consortium is ready to announce the updated project roadmap shown below:
European Processor Initiative will, together with colleagues from other European exascale projects, attend the virtual Supercomputing20, where we will showcase our latest developments and update to the roadmap. We invite you to attend our virtual booth and join in the lively discussions about the future of HPC!
At this year’s first virtual Supercomputing 20, the European Processor Initiative has invited several European exascale projects to share our virtual booth and showcase their progress, ideas, and results. Representatives from all of the projects and the European Processor Initiative will be available at our online booth for video chats.
The projects are:
We look forward to your visiting our booth and booking appointments for video chats with any of our representatives!
The schedule for all projects, if you want to catch someone in particular, is here:
November 17th, 2020
November 18th, 2020
November 19th, 2020
This week, on September 23-24, Korea Supercomputing Conference 2020 was held virtually and it hosted many distinguished guests who discussed the Exascale computing era. Jean-Marc Denis, EPI’s Chairman of the Board, was invited as a keynote speaker with a talk titled “The European approach for exascale ages. The road toward sovereignty”.
He explained EPI’s short-term objectives – which is to supply the EU-designed microprocessor to empower the EU exascale machines, and long-term objective, which is that Europe needs independent and sovereign access to high-performance, low-power microprocessors, from IP to products.
EPI’s mission was highlighted and fortified once again, seeing that Mr Denis’ talk came only days after the President of the European Commission, Ursula von der Leyen gave her first State of the Union speech, in which she said Europe will invest 8 billion euros in the next generation of supercomputers – cutting-edge technology made in Europe.
The full presentation is available in our repository: https://www.european-processor-initiative.eu/dissemination-material/epi-ksc20/, while the talk can also be viewed at EPI YouTube channel: https://www.youtube.com/watch?v=giicOMSG0uQ.
President of the European Commission, Ursula von der Leyen in her State of the Union speech on September 16, 2020, announced new steps in the Digital Decade agenda.
The European Commission has proposed a new regulation to help Europe achieve the leading role in supercomputing and quantum computing, and it will enable an investment of €8 billion in the next generation of supercomputers.
Von der Leyen said that the “[N]extGenerationEU is also a unique opportunity to develop a more coherent European approach to connectivity and digital infrastructure deployment. None of this is an end in itself – it is about Europe’s digital sovereignty, on a small and large scale. In this spirit, I am pleased to announce an investment of 8 billion euros in the next generation of supercomputers – cutting-edge technology made in Europe. And we want the European industry to develop our own next-generation microprocessor that will allow us to use the increasing data volumes energy-efficient and securely. This is what Europe’s Digital Decade is all about!”
SOTEU speech can be find here: https://ec.europa.eu/commission/presscorner/detail/en/SPEECH_20_1655
Full press release is here: https://ec.europa.eu/commission/presscorner/detail/en/ip_20_1592
The European Processor Initiative (EPI) is building a new central processing unit (CPU) with European technology. This CPU will bundle an accelerator, based on the open source RISC-V architecture. This accelerator will include support for the upcoming V-extension of RISC-V. At the Barcelona Supercomputing Center (BSC), we have been busy at work building software tools and infrastructure to explore and learn about the benefits and challenges that this extension brings to the table.
RISC-V is a relative newcomer in the Instruction Set Architecture (ISA) space, along with well-established ones like x86-64 and AArch64. Its main distinctive feature is that it is an open source specification that anyone can take to implement and extend, be it for commercial purposes or research studies. The specification is maintained by the RISC-V Foundation and its members.
Another important aspect of the RISC-V ISA is the fact that it has been designed to be modular. Conscious that not all the features present in modern ISAs may be of interest to everyone, the ISA is structured in a base specification and a set of standard extensions. Also, given its open source nature, anyone can define their own non-standard extensions, and the ISA caters to that possibility by providing customization features. Standard extensions have the advantage that they are eventually ratified by the RISC-V Foundation and its members after a collaborative development of the specification and gathering of experience in early implementations.
Standard extensions are commonly identified with a single letter. One of them is the V-extension. V stands for vector. This extension aims to provide vector computation capabilities to the RISC-V ecosystem. It is currently under development so no hardware or software exists that supports it.
Programs running on a computer are made up by a number of instructions. Each instruction is executed by the CPU, and most of them just do a simple operation on a single data value. For instance, an instruction may add two numbers. While this is perfectly reasonable for most applications, there are several domains, spanning from High Performance Computing (HPC) to Digital Signal Processing (DSP), where applications need to repeatedly perform the same computation over a regular set of data. In our running example, rather than just adding two numbers, applications in these domains are better served by an instruction that is able to add, pair-wise, two sets of numbers. These sets of numbers are called vectors (conversely, a single number is called a scalar). In this scenario, using vector instructions can result in applications that run more efficiently both in time and in energy consumption.
These kind of instructions traditionally have been called Single Instruction Multiple Data (SIMD), as a single instruction is able to operate over a set of data instead of individual elements. Many of the well-established vendors like Intel and Arm already provide SIMD instructions in the ISAs they maintain. Examples are Intel’s SSE, AVX-2 and AVX-512 and Arm’s Advanced SIMD and SVE.
There are a number of distinctive features in the V-extension that make it very interesting and also pose some challenges to implementers. An important one is that the extension does not prescribe the length of the vectors. Traditional SIMD ISAs prescribe the size of a vector as 128, 256 or 512 bit. A drawback with this approach is that a new ISA is required every time there is an interest to enlarge the vectors. The alternative is to allow the vendor to choose the size depending on its market needs. For instance, this is what Arm’s SVE does, where vectors can range from 128 bit to 2048 bit, in multiples of 128 bit. The V-extension currently only requires the width to be a power-of-2, so it is possible to cover markets that are well served with shorter vectors, like DSP, and also markets that benefit from longer vectors, like HPC.
Another interesting feature of the V-extension is that it restores the concept of vector length, a feature reminiscent of ancient vector architectures of the 1970s. The vector length tells the CPU how many elements in the vector have to be processed.
SIMD ISAs often process the whole vector so this concept does not exist there. An application has to consider the case when there is not enough data to fill a full vector. One option is to resort in this case to regular, scalar, instructions. Another option is to keep using vector instructions but discard some of the results computed, using a common feature called masking that may come with an extra penalty. The vector length can be used to reduce the number of elements being processed without requiring extra instructions, like in the first option, or having to compute a mask. For some implementations like in EPI’s VPU, shortening the vector length also allows to shorten the latency of instructions – no computation cycle is needed for the unused “tail” of the vector.
To explore the software side of the V-extension, we took the LLVM open source compiler which, except for the V-extension, already has good support for a number of the standard extensions of RISC-V. LLVM is an umbrella project for open source compiler and other toolchain-related projects like linkers or static analyzers.
In order to enable the exploration of the V-extension, we designed and implemented an initial set of C/C++ builtins. These builtins allow the C/C++ developers to be able to target the V-extension instructions from their applications. Along with our partners at EPI, we have ported several computational kernels, core parts of applications that are used very frequently, using the developed builtins. Some of those kernels are classic in HPC, such as matrix multiply, sparse matrix vector and the FFTW implementation of the Fast Fourier Transform (FFT). Algorithms from other domains, such as cryptography, have also been evaluated under the V-extension.
Finally, we also implemented in the compiler initial autovectorization support. A compiler can determine that vector instructions can be used without having to use builtins. Because of the two distinctive features of V-ext described earlier, compilers do not have good support yet in this area. Most of our work here is very infrastructural in making sure the compiler can vectorize in the way we believe is the best for the V-extension. We hope to be able to provide better support in this area, which is under intense work-in-progress status.
The V-extension is still being built, so no hardware exists that can execute V-extension instructions. This limits users and developers of the compiler as they would not be able to tell if their program and the compiler work correctly. To unblock this issue, we developed an emulator, called Vehave, that runs on top of existing RISC-V Linux platforms. This way the correctness of the applications and the compiler can be validated using the emulator.
We also implemented in the emulator a mechanism to generate traces of the executed vector instructions. These traces can be loaded in BSC’s trace visualization tool Paraver. This provides information valuable to the users and developer of the compiler.
For instance, application developers can determine that their application requires the compiler instructions that are known to be slow in a specific implementation of the V-extension. Compiler developers can identify redundant instructions or complicated instruction sequences. At BSC, we identified some of those complicated instruction sequences and we reported to the V-extension work group. The specification was extended with new individual instructions that achieve the same functionality of the original sequences.
In order to allow quicker experimentation, since all compilers are large pieces of software, we installed a version of Compiler Explorer that can work with the compiler we developed. This way it is possible to share small snippets of code to evaluate the quality of the code emitted by the compiler. This tool is publicly available at https://repo.hca.bsc.es/epic