PRACE Fourth Implementation Phase
PRACE, the Partnership for Advanced Computing in Europe, was established in May 2010 as a permanent pan-European High Performance Computing service providing world-class systems for world-class science. Six systems at the highest performance level (Tier-0) are deployed by Germany, France, Italy and Spain, providing researchers with over 9 billion core hours of compute time. HPC experts from twenty-five member states, funded in part by three implementation projects, have enabled users from academia and industry to assert leadership and remain competitive in the Global Race. PRACE is currently preparing for PRACE 2.0, the successor to the initial five-year period.
Scheduled finish date: 2021-03-03
The objectives of PRACE-4IP are to build on and seamlessly continue the successes of PRACE and start new innovative and collaborative activities proposed by the consortium.
These include: assisting the transition to PRACE 2.0; strengthening the internationally recognised PRACE brand; continuing advanced training, which has so far provided more than 15,000 person-training days to over 4,700 people; preparing strategies and best practices towards exascale computing; coordinating and enhancing the operation of the multi-tier HPC systems and services; and supporting users in exploiting massively parallel systems and novel architectures.
The proven project structure will be used to achieve each of the objectives in six dedicated work packages. The activities are designed to increase Europe's research and innovation potential especially through: seamless and efficient Tier-0 services and a pan-European HPC ecosystem including national capabilities; promoting take-up by industry and special offers to SMEs; analysing new flexible business models for PRACE 2.0; proposing strategies for deployment of leadership systems; collaborating with the ETP4HPC, the coming CoEs and other European and international organisations on future architectures, training, application support and policies.
- Ensure long-term sustainability of the infrastructure. The project will assist the PRACE Research Infrastructure (PRACE RI) in managing the transition from the business model used in the Initial Period (2010-2015), where it demonstrated the case for a European HPC research infrastructure relying on the strong engagement of four hosting partners (BSC representing Spain, CINECA representing Italy, GCS representing Germany and GENCI representing France) who funded and deployed the petaflop/s systems used by PRACE RI.
- Promote Europe’s leadership in HPC applications. Scientific and engineering modelling and simulation require the capabilities of supercomputers. The project will enable application codes for PRACE leadership platforms and prepare for future systems, notably those with architectural innovation embodied in accelerators or co-processors, by investigating new programming tools and developing suitable benchmarks.
- Increase European human resources skilled in HPC and HPC applications. The project will contribute by organising highly visible events and by enhancing the state-of-the-art training provided by the PRACE Advanced Training Centres (PATCs), targeting both the academic and industrial domains. On-line training will be improved and a pilot will assess a Massive Open Online Course (MOOC).
- Support a balanced eco-system of HPC resources for Europe’s researchers. The project will contribute to this objective through tasks addressing: a) the improvement of PRACE operations; b) the prototyping of new services including “urgent computing”, the visualization of extreme size computational data, and the provision of repositories for open source scientific libraries. Links will be established with other e-infrastructures and the Centres of Excellence which will be created in Horizon 2020. The existing international collaborations will be extended.
- Evaluate new technologies and support Europe’s path towards using exaflop/s resources. The project will extend its market watch and evaluation based on user requirements, study best practices for energy efficiency and lower environmental impact throughout the life cycle of large HPC infrastructures, and define best practices for prototype planning and evaluation. This will contribute to solving a wide range of technological, architectural and programming challenges for the exaflop/s era.
- Effectively disseminate the PRACE results. This objective targets engaging European scientists and engineers in the wider utilisation of high-end HPC. The project will continue to organise well-known events like PRACEdays, Summer of HPC and the International HPC Summer School in order to promote and support innovative scientific approaches in modelling, simulation and data analysis. With an extended presence at conferences (e.g. SC, ISC or ICT), the project is seeking wider support from the general public for HPC, in particular by illustrating success stories.
The work of PRACE-4IP is structured into seven Work Packages to effectively achieve its objectives and address the requirements of PRACE and the European HPC ecosystem:
WP1: Management of the Contract
WP2: Organisational Concept of the RI
WP3: Communication and Dissemination
WP5: HPC Commissioning and Prototyping
WP6: Operational Services for the HPC Eco-System
WP7: Application Enabling and Support
The official partner in the PRACE-4IP project is the Institute of Bioorganic Chemistry PAS – Poznan Supercomputing and Networking Center. Additionally, the following institutions from Poland are involved as Third Parties:
• WCNS – Wroclaw Center for Networking and Supercomputing
• Cyfronet – Academic Computer Centre Cyfronet AGH
• TASK – Academic Computer Centre in Gdansk
Organisational Concept of the RI
Organisational support for PRACE 2.0 development
An expert team in business analysis and organisational support for the PRACE RI, formed by Council advisors and legal experts, will provide support to the PRACE Council and BoD to implement different elements of PRACE 2.0 (the model that will succeed the Initial Period) and assist in the analysis of its evolution. This support will include, but will not be limited to:
• The analysis of the financial implications of the funding models, including analysis of billing options or potential state aid issues;
• Research into avoiding taxation problems within the selected options;
• The design of process proposals to audit cost models for the operation of HPC Centres providing European services with different financing sources;
• The proposal of solutions to implement in-kind or in-cash funding options;
• Support for providing ready-to-sign agreements or the definition of precise working rules for PRACE 2.0;
• Legal support for issues of an organisational nature arising in the different Work Packages.
This task will also report on the general advances made towards the new PRACE model.
Enhanced PRACE management processes and supporting tools
The task 'PRACE Management Processes and Tools' in PRACE-3IP developed recommendations and guidelines for enhanced PRACE management processes and supporting tools. The objective was to identify concepts in the excellence framework as a basis for methodological support for the PRACE management in evolving PRACE into a more process-oriented organisation. The present task builds on these recommendations to assist the association in their implementation. A small team with specialised knowledge will provide support in processes like:
• Developing the PRACE 2.0 Vision statement and Values description together with the Council.
• Deploying the methodology for process description defined in Task 2.3 of PRACE-3IP (through seminars and one-to-one sessions with the PRACE aisbl staff).
• Setting up a process and the necessary tools to capture institutional PRACE knowledge, together with a process improvement mechanism (through the specific design of processes and the implementation of a knowledge database and a corrective actions database).
• Setting up a process for managing priorities (through the implementation of a strategic review and operational review process and other necessary mechanisms described in PRACE-3IP).
• Supporting the development of the KPI analysis processes, widening their scope beyond impact assessment and assisting in the creation of KPIs for other deployed processes.
Services for Industry and SMEs
Best practices for energy-efficient HPC Centre Infrastructures design and operations
This task will focus on the major issues of power efficiency and low environmental impact of large scale HPC centre infrastructures.
The installation requirements for 100-petaflop/s and future exascale systems will be studied, because it is particularly important to anticipate these requirements when planning or upgrading the facility of an HPC centre. The requirements will be complemented by those of large scientific data storage repositories. The necessary contacts with vendors and HPC sites will be continued through the ‘European Workshops on HPC Centre Infrastructures’. After five such successful events, annual workshops will be organised. This will allow the study of major HPC centre infrastructures in Europe and worldwide to continue, in order to take advantage of the experience of the most advanced sites. Solutions developed for large commercial data centres and their applicability to HPC centres will also be considered.
This task will also strengthen links with European initiatives in the domain of data infrastructures and with the US DOE Energy Efficient HPC Working Group.
A list of best practices for HPC centre infrastructure construction and operation will be produced and maintained, and white papers on topics such as cooling, power supply or infrastructure monitoring will be updated or created.
Infrastructure operations, Applications and Training
PRACE has established an on-line, centralised service for sharing training resources in the form of the PRACE Training Portal. The Training Portal supports the user communities by offering training materials, video tutorials, information about PRACE training events and links to other useful training resources. The Training Portal and related on-line resources will be maintained and further developed by adding new material from PRACE events. A PRACE CodeVault aimed at sharing code examples, model solutions and other tips in HPC programming within the community will also be launched. In order to improve the on-line training service even further, a Massive Open Online Course (MOOC) will be piloted as a new training method. The MOOC pilot will consist of an analysis and selection of a suitable platform, followed by developing and running an initial set of 1-2 on-line courses.
Operation and coordination of the comprehensive common PRACE operational services
This Task will continue the coordinated operation of the common PRACE operational services.
These services include:
• network services (dedicated network provided by GEANT connecting the Tier-0 and the major Tier-1 centres);
• data services (e.g. GridFTP);
• concerted procedures for resource management (e.g. UNICORE and Globus GRAM);
• harmonised procedures for Authorization, Authentication and Accounting services (e.g. setting up the infrastructure for the use of Public Key Infrastructure (PKI) facilities, user administration and accounting, gsi-ssh);
• user services (e.g. common production environment, user documentation);
• monitoring services for operations group;
• generic services (e.g. wikis, trouble ticket system, source code repositories).
This task will also manage the Service Catalogue describing all the supported services, updating them and monitoring the related KPIs, and will support the activities of the PRACE Security Forum. The information related to the PRACE Infrastructure will be documented on the PRACE web site.
To build a pan-European HPC ecosystem where Tier-1 systems are essential as a stepping stone towards Tier-0 systems, or as a way to utilise specific architectures which are only available in other countries, this task will also provide operational support on national Tier-1 systems to users from academia and industry, or to prototype and assess new operational services.
Analysis and development of prototypal new services
An efficient and state-of-the-art HPC infrastructure at European level should be ready to operate innovative services to address scientific, technological and societal challenges. Examples of these services are:
• the provision of urgent computing services, where the emerging computation results can inform critical decision-making and help address a critical, national-scale emergency;
• the link with large-scale scientific instruments (e.g. satellites, laser facilities, sequencers, synchrotrons) providing large amounts of data and information, which more generally requires improved support for data-intensive applications;
• smart post processing tools including in situ visualisation to check and visualise dynamically the evolution of large volumes of data produced by simulations on extreme scale systems, where the data size represents a barrier for standard processing and visualisation methodologies;
• provision of repositories for European open source scientific libraries and applications, to promote wide adoption, uniformity and consolidation of European products.
This task will analyse these new services and investigate prototypal implementations at the pre-production level (involving first Tier-1 systems and then Tier-0 systems) to assess their functionality. It will also investigate their possible adoption as production services in a subsequent phase, analysing the aspects of service certification.
Furthermore, this task will address the technical evolution of the existing PRACE operational services.
Link with other e-infrastructures and CoEs
The PRACE infrastructure will establish links with complementary e-infrastructures (network, grid/cloud and data infrastructures) and the future Centres of Excellence to identify commonalities and foster the technical interoperability across their services for the benefit of the users. Successful collaboration with other infrastructures is a key factor in guaranteeing the evolution of PRACE toward a more integrated vision where large research infrastructures, individual users and communities can seamlessly access and use available resources. The work is organised so as to ensure that community requirements are taken into consideration, by piloting well-defined use cases and creating bilateral collaborations in order to achieve results more effectively. More specifically, this task will establish connections with relevant counterparts in the following two areas.
• Security. The joint participation with the complementary e-infrastructures in the SCI (Security for Collaborating Infrastructures) group will continue, with the ambition of making it a formal organisation so as to guarantee the constant exchange of information on security incidents, the development of security policies and procedures, and the definition of best practices in this area.
• Data. The collaboration with other European Data Infrastructures will leverage the experience of community pilots carried out in past PRACE projects with EUDAT and EGI. The aim of this activity is to guarantee the long-term preservation and sharing of data produced on the PRACE systems by outsourcing data management functions to other e-Infrastructures. Through the implementation of joint community pilots, the interoperability of participating infrastructures will be investigated in order to experiment with seamless access to available services, the integration of the respective identity managers, the easy transfer of data, the adoption of common tools and protocols, and the execution of distributed scientific workflows.
The successful collaboration with XSEDE in PRACE-3IP could be extended to other infrastructures world-wide, with a further call to address interoperability between the infrastructures and common services. Finally, it is important to underline that many research infrastructures might not access computing and data services directly, but rather through workflow engines that, by coupling different services together, can perform complex tasks easily. This is one of the reasons why ensuring interoperability between PRACE and other e-Infrastructures is fundamental to improving the user experience and raising awareness of PRACE services.
Enabling Applications Codes for PRACE Systems
This task will provide application enabling support for selected codes that are important for the European academic research and industrial communities. This will include projects that are successful in applying for Preparatory Access (PA) or through SHAPE. In the context of the PRACE 2.0 “Tier-1 services for Tier-0”, PA calls will cover both Tier-0 and Tier-1, where the Tier-1 focus will be on applications that could scale towards Tier-0. Task 7.1 will collaborate with the PRACE aisbl office on the Calls for Support for PA and SHAPE. These Calls will continue on a regular schedule, and the task will also provide technical input into the peer review processes. Work with the selected application codes may focus on improving the performance of a specific application code, on support for solving a particular computational problem, or on overcoming the barriers that prevent effective use of HPC. The results will generally include quantitative measures of the improvement in performance and scaling, and these will be reported using the standard PRACE applications reporting template. Task 7.1 will ensure that there is an increasing range of important applications that can run efficiently on state-of-the-art supercomputers.
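As an illustration of the kind of quantitative scaling measures mentioned above, the following minimal Python sketch computes strong-scaling speedup and parallel efficiency from wall-clock timings. The function name, the input layout and the timing values are hypothetical and are not taken from the PRACE reporting template.

```python
def scaling_report(timings, baseline_cores=None):
    """Return (cores, speedup, efficiency) tuples for a strong-scaling run.

    timings: dict mapping core count -> wall-clock time in seconds
             (hypothetical measurements for a fixed problem size).
    baseline_cores: core count used as the reference; defaults to the
             smallest core count present in `timings`.
    """
    if baseline_cores is None:
        baseline_cores = min(timings)
    t_base = timings[baseline_cores]

    report = []
    for cores in sorted(timings):
        # Speedup relative to the baseline run.
        speedup = t_base / timings[cores]
        # Efficiency: achieved speedup divided by the ideal speedup.
        efficiency = speedup * baseline_cores / cores
        report.append((cores, speedup, efficiency))
    return report


if __name__ == "__main__":
    # Invented example timings, in seconds.
    example = {1024: 100.0, 2048: 55.0, 4096: 32.0}
    for cores, speedup, efficiency in scaling_report(example):
        print(f"{cores:6d} cores  speedup {speedup:5.2f}  efficiency {efficiency:6.1%}")
```

Comparing such a table before and after an enabling effort gives one simple view of the improvement in scaling; a real report would state the problem size, system and measurement method alongside the numbers.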
Supporting European HPC Researchers
This task will disseminate information that will help the European HPC community benefit from current and future HPC systems. Much of this work will support and enhance the PRACE training activities. The task will maintain the successful series of best practice guides and extend it to new architectures and systems; during the timeframe of this project, likely architectures include Haswell/Broadwell x86 processors and many-core processors and accelerators (NVIDIA and Intel Knights Landing). This task will also update and maintain the Unified European Applications Benchmark Suite (UEABS) produced by PRACE-2IP and continued by PRACE-3IP, so that it can be used as one of the benchmarks in future procurements and to help European researchers choose systems that are appropriate for their computational requirements. Benchmark performance is also useful for monitoring the performance/power of systems. The task will also contribute appropriate code samples in a variety of key programming paradigms.
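The performance/power monitoring mentioned above can be sketched as a simple ranking of benchmark results by performance per watt. This is a minimal illustration in Python: the function name, the tuple layout and the system names are hypothetical and do not come from UEABS.

```python
def rank_by_efficiency(results):
    """Rank benchmark results by performance per watt, best first.

    results: list of (system_name, sustained_gflops, average_power_watts)
             tuples; all values here are hypothetical.
    Returns a list of (system_name, gflops_per_watt) tuples.
    """
    ranked = []
    for name, gflops, watts in results:
        if watts <= 0:
            raise ValueError(f"non-positive power reading for {name}")
        ranked.append((name, gflops / watts))
    # Highest GFLOP/s per watt first.
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


if __name__ == "__main__":
    # Invented example measurements.
    measurements = [
        ("system_a", 1000.0, 500.0),
        ("system_b", 900.0, 300.0),
    ]
    for name, value in rank_by_efficiency(measurements):
        print(f"{name}: {value:.2f} GFLOP/s per watt")
```

In this invented example the system with lower raw performance ranks first, which is why the ratio is worth tracking alongside raw benchmark results.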