
# Supercomputers

19.01.2022
12:06 GROMACS in the cloud: A global supercomputer to speed up alchemical drug design. (arXiv:2201.06372v1 [cs.DC])

We assess the costs and efficiency of state-of-the-art high-performance cloud computing compared to a traditional on-premises compute cluster. Our use case is atomistic simulations carried out with the GROMACS molecular dynamics (MD) toolkit, with a focus on alchemical protein-ligand binding free energy calculations. We set up a compute cluster in the Amazon Web Services (AWS) cloud that incorporates a variety of instances with Intel, AMD, and ARM CPUs, some with GPU acceleration. Using representative biomolecular simulation systems, we benchmark how GROMACS performs on individual instances and across multiple instances. We thereby assess which instances deliver the highest performance and which are the most cost-efficient for our use case. We find that, in terms of total costs including hardware, personnel, room, energy, and cooling, producing MD trajectories in the cloud can be as cost-efficient as an on-premises cluster, provided that optimal cloud instances are chosen.
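
The cost comparison the abstract describes can be sketched as a price per nanosecond of trajectory. All rates and throughputs below are hypothetical placeholders for illustration, not figures from the paper.

```python
# Minimal sketch: price per nanosecond of MD trajectory for a cloud instance
# vs. an on-premises node. Numbers are invented, not taken from the paper.

def cost_per_ns(hourly_cost, ns_per_day):
    """Cost to produce one nanosecond of MD trajectory."""
    return hourly_cost * 24.0 / ns_per_day

# Hypothetical GPU cloud instance vs. amortized on-premises node (hardware,
# personnel, room, energy, and cooling folded into the hourly rate).
cloud = cost_per_ns(hourly_cost=1.20, ns_per_day=80.0)
onprem = cost_per_ns(hourly_cost=1.00, ns_per_day=70.0)

print(f"cloud:   {cloud:.3f} per ns")
print(f"on-prem: {onprem:.3f} per ns")
```

With these made-up rates the two come out close, which is the paper's qualitative finding: the ranking hinges on choosing a cost-optimal instance type.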

18.01.2022
01:40 Can Supercomputers Really Keep Up With the Human Brain?

Data scientists face new challenges as their efforts to examine every single neuron in the human brain place extreme processing demands on supercomputers.

12.01.2022
19:24 UK’s most powerful supercomputer has booted up and is doing science

ARCHER2, a £79 million machine funded by the UK government, is still in a testing period but is already working on real science, such as modelling volcanic plumes.

11.01.2022
17:46 Scientists use Summit supercomputer, deep learning to predict protein functions at genome scale

A team of scientists led by the Department of Energy's Oak Ridge National Laboratory and the Georgia Institute of Technology is using supercomputing and revolutionary deep learning tools to predict the structures and roles of thousands of proteins with unknown functions.

10:08 Citizen Science, Supercomputers and AI

Citizen scientists have helped researchers discover new types of galaxies, design drugs to fight COVID-19, and map the

07.01.2022
19:22 Light–matter interactions simulated on the world’s fastest supercomputer

Researchers have developed a computational approach for simulating interactions between matter and light at the atomic scale. The team tested their method by modeling light–matter interactions in a thin film of amorphous silicon dioxide, composed of more than 10,000 atoms, using the world's fastest supercomputer, Fugaku. The proposed approach is highly efficient and could be used to study a wide range of phenomena in nanoscale optics and photonics.

15:39 Updated exascale system for Earth simulations

The Earth—with its myriad interactions of atmosphere, oceans, land and ice components—presents an extraordinarily complex system for investigation. For researchers, simulating the dynamics of these systems has presented a process that is just as complex. But today, Earth system models capable of weather-scale resolution take advantage of powerful new computers to simulate variations in Earth systems and anticipate decade-scale changes that will critically impact the U.S. energy sector in coming years.

10:31 Light-matter interactions at the atomic scale simulated on the world's fastest supercomputer

Light-matter interactions form the basis of many important technologies, including lasers, light-emitting diodes (LEDs), and atomic clocks. However, usual computational approaches for modeling such interactions have limited usefulness and capability. Now, researchers have developed a technique that overcomes these limitations.

09:21 Updated exascale system for Earth simulations

A new version of the Energy Exascale Earth System Model (E3SM) is two times faster than the previous version, released in 2018.

06.01.2022
14:25 Now Europe wants its own super-powerful supercomputer

Europe is getting serious about supercomputing, with plans to build a high-end exascale device.

16.12.2021
16:41 Toward fusion energy, team models plasma turbulence on the nation's fastest supercomputer

A team modeled plasma turbulence on the nation's fastest supercomputer to better understand plasma behavior

10.12.2021
21:04 Exotic six-quark particle predicted by supercomputers

RIKEN researchers' prediction of an exotic particle made up of six elementary particles known as quarks could deepen our understanding of how quarks combine to form the nuclei of atoms.

07:03 Establishing a non-hydrostatic global atmospheric modeling system (iAMAS) at 3-km horizontal resolution with online integrated aerosol feedbacks on the Sunway supercomputer of China. (arXiv:2112.04668v1 [physics.ao-ph])

During an era of global warming and rapid urbanization, extreme and high-impact weather as well as air pollution incidents influence everyday life and can even cause incalculable loss of life and property. Despite great advances in the numerical simulation of the atmosphere, substantial forecast biases remain. To predict extreme weather, severe air pollution, and abrupt climate change accurately, a numerical atmospheric model must not only simulate meteorology and atmospheric compositions and their impacts simultaneously, involving many sophisticated physical and chemical processes, but must also do so at high spatiotemporal resolution. Global atmospheric simulation of meteorology and atmospheric compositions simultaneously at spatial resolutions of a few kilometers remains challenging due to the intensive computational and input/output (I/O) requirements. Through multi-dimensional parallelism structuring and aggressive, finer-grained optimization,

07.12.2021
10:56 Artificial intelligence supercomputer to ‘accelerate research’ at Case Western Reserve University

More than 250 researchers across nearly two dozen research groups, from computer science to materials science to robotics, will

05.12.2021
10:51 Identifying proteins using nanopores and supercomputers

The amount and type of proteins human cells produce provide important details about a person’s health and how

26.11.2021
18:59 Can a Russian supercomputer help a national chess hero to win the world title?

Russian chess world championship hopeful Ian Nepomniachtchi is up against it when facing the dominant Magnus Carlsen in Dubai. But could a supercomputer help him to realize his potential?

25.11.2021
18:08 More than 300 exoplanets have been discovered in deep space thanks to a newly created algorithm using data from NASA's spacecraft and supercomputer

An additional 301 exoplanets have been confirmed, thanks to a new deep learning algorithm, NASA said. The significant addition to the ledger was made possible by the ExoMiner deep neural network, which was created using data from NASA's Kepler spacecraft and its follow-on, K2. It uses the space agency's supercomputer, Pleiades, and is capable of distinguishing real exoplanets from 'false positives.'

22.11.2021
07:29 Optimisation of job scheduling for supercomputers with burst buffers. (arXiv:2111.10200v1 [cs.PF])

The ever-increasing gap between compute and I/O performance in HPC platforms, together with the development of novel NVMe storage devices (NVRAM), led to the emergence of the burst buffer concept - an intermediate persistent storage layer logically positioned between random-access main memory and a parallel file system. Since the appearance of this technology, numerous supercomputers have been equipped with burst buffers employing various architectures. Despite the development of real-world architectures as well as research concepts, Resource and Job Management Systems, such as Slurm, provide only marginal support for scheduling jobs with burst buffer requirements. This research is primarily motivated by the alarming observation that burst buffers are omitted from reservations in the procedure of backfilling in existing job schedulers. In this dissertation, we forge a detailed supercomputer simulator based on Batsim and SimGrid, which is capable of simulating I/O contention and I/O
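
The omission the abstract highlights can be made concrete with a toy backfill check that reserves burst-buffer capacity alongside nodes. The single-reservation-window model and all names here are simplifying assumptions for illustration, not Slurm's actual logic.

```python
# Toy EASY-backfilling admission test extended with a burst-buffer dimension:
# a job is backfilled only if it fits in free nodes AND free burst buffer,
# and finishes before the head job's reserved start time (shadow_time).

def can_backfill(job, free_nodes, free_bb_gb, shadow_time, now):
    """Return True if the job can be backfilled without delaying the head job."""
    fits = job["nodes"] <= free_nodes and job["bb_gb"] <= free_bb_gb
    done_in_time = now + job["walltime_h"] <= shadow_time
    return fits and done_in_time

job = {"nodes": 4, "bb_gb": 200, "walltime_h": 2.0}
print(can_backfill(job, free_nodes=8, free_bb_gb=512, shadow_time=10.0, now=7.0))  # True
# A scheduler that ignored the burst buffer would wrongly admit this job:
print(can_backfill(job, free_nodes=8, free_bb_gb=100, shadow_time=10.0, now=7.0))  # False
```

The second call is exactly the failure mode the dissertation studies: node-only backfilling admits a job whose burst-buffer demand cannot actually be satisfied.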

18.11.2021
20:51 Microsoft now has one of the world's fastest supercomputers (and no, it doesn't run on Windows)

A Microsoft Azure supercomputer dubbed 'Voyager-EUS2' has made it into the rankings of the world's 10 fastest machines. Microsoft's supercomputer, with a benchmark speed of 30 petaflops (Pflop/s), is still well behind China's Tianhe-2A and the US Department of Energy's IBM-based Summit supercomputer, but Microsoft is the only major cloud provider with a supercomputer ranked in the top 10 of the high performance computing (HPC) Top500 list.

15:37 Microsoft now has one of the world's fastest supercomputers (and no, it doesn't run on Windows)

Microsoft makes it into the top 10 fastest supercomputers in the world.

11:23 An energy-efficient scheduling algorithm for shared facility supercomputer centers. (arXiv:2111.08978v1 [cs.DC])

The evolution of high-performance computing is associated with growth in energy consumption. Performance of compute clusters is increased by raising the performance and the number of processors, GPUs, and coprocessors used. An increase in the number of computing elements results in significant growth of energy consumption. Power grid limits for supercomputer centers (SCC) are driving the transition to more energy-efficient solutions. Computing resources are often upgraded step by step, i.e. parts of older supercomputers are removed from service and replaced with newer ones. A single SCC at any time can operate several computing systems with different performance and power consumption. That is why the problem of scheduling parallel program execution on SCC resources to optimize energy consumption and minimize the increase in execution time (energy-efficient scheduling) is important. The goal of the presented work was the development of a new energy-efficient algorithm for
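
The trade-off described above, minimizing energy without letting runtime grow too much across heterogeneous systems, can be sketched in a few lines. Systems, rates, and the idealized runtime model are hypothetical, not the algorithm from the paper.

```python
# Toy energy-aware placement: among an SCC's systems that can run a job
# within a time limit, pick the one minimizing energy (power x runtime).

def pick_energy_efficient(systems, job_flop, max_runtime_s):
    best_name, best_energy = None, float("inf")
    for s in systems:
        runtime = job_flop / s["flops"]     # idealized runtime on this system
        if runtime > max_runtime_s:
            continue                        # too slow: violates the time limit
        energy = s["power_w"] * runtime     # joules consumed by the job
        if energy < best_energy:
            best_name, best_energy = s["name"], energy
    return best_name

systems = [
    {"name": "old-cluster", "flops": 1e14, "power_w": 150_000},
    {"name": "new-cluster", "flops": 5e14, "power_w": 900_000},
]
# A 1e18-flop job: the old system needs 10,000 s (1.5 GJ), the new 2,000 s (1.8 GJ).
print(pick_energy_efficient(systems, job_flop=1e18, max_runtime_s=20000))  # old wins on energy
print(pick_energy_efficient(systems, job_flop=1e18, max_runtime_s=9000))   # deadline forces new
```

Note how the answer flips with the runtime cap, which is precisely why energy and execution time must be optimized jointly rather than separately.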

17.11.2021
21:35 MLCommons unveils a new way to evaluate the world's fastest supercomputers

The new MLPerf machine learning metric for supercomputing is designed to capture the aggregate ML capabilities of a whole supercomputer

18:03 Nvidia’s New Supercomputer Will Create a ‘Digital Twin’ of Earth to Fight Climate Change

It’s crunch time on climate change, and companies, governments, philanthropists, and NGOs around the world are starting to take action, be it through donating huge sums of money to the cause, building a database for precise tracking of carbon emissions, creating a plan for a clean hydrogen economy, or advocating for solar geoengineering—among many other […]

16.11.2021
12:00 Updated exascale system for earth simulations

A new version of the Energy Exascale Earth System Model (E3SM) is two times faster than its earlier version, released in 2018.

10.11.2021
21:14 Identifying individual proteins using nanopores and supercomputers

The amount and types of proteins our cells produce tell us important details about our health. Researchers have shown that it is possible to identify individual proteins with single-amino acid resolution and nearly 100% accuracy. Their method uses nanopores -- engineered openings that generate an electrical signal when molecules are pulled through by a specific enzyme.

03.11.2021
08:54 Scientists tackle antibiotic resistance by using supercomputers

Scientists may have made a giant leap in fighting the biggest threat to human health by using supercomputing to keep pace with the impressive ability of diseases to evolve.

29.10.2021
08:05 Towards Large-Scale Rendering of Simulated Crops for Synthetic Ground Truth Generation on Modular Supercomputers. (arXiv:2110.14946v1 [cs.CV])

Computer Vision problems deal with the semantic extraction of information from camera images. Especially for field crop images, the underlying problems are hard to label and even harder to learn, and the availability of high-quality training data is low. Deep neural networks do a good job of extracting the necessary models from training examples. However, they rely on an abundance of training data that is not feasible to generate or label by expert annotation. To address this challenge, we make use of the Unreal Engine to render large and complex virtual scenes. We rely on the performance of individual nodes by distributing plant simulations across nodes and both generate scenes as well as train neural networks on GPUs, restricting node communication to parallel learning.

28.10.2021
10:48 Closing the "Quantum Supremacy" Gap: Achieving Real-Time Simulation of a Random Quantum Circuit Using a New Sunway Supercomputer. (arXiv:2110.14502v1 [quant-ph])

We develop a high-performance tensor-based simulator for random quantum circuits (RQCs) on the new Sunway supercomputer. Our major innovations include: (1) a near-optimal slicing scheme, and a path-optimization strategy that considers both complexity and compute density; (2) a three-level parallelization scheme that scales to about 42 million cores; (3) a fused permutation and multiplication design that improves the compute efficiency for a wide range of tensor contraction scenarios; and (4) a mixed-precision scheme to further improve the performance. Our simulator effectively expands the scope of simulatable RQCs to include the 10×10 (qubits) × (1+40+1) (depth) circuit, with a sustained performance of 1.2 Eflops (single-precision), or 4.4 Eflops (mixed-precision), as a new milestone for classical simulation of quantum circuits; and reduces the simulation sampling time of Google Sycamore to 304 seconds, from the previously claimed 10,000 years.
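
The "slicing" in innovation (1) means fixing a shared index of a tensor contraction and summing independent sub-contractions, trading memory for parallelism. A tiny NumPy analogue (shapes arbitrary, nothing here resembles the actual Sunway code):

```python
import numpy as np

# Slicing a tensor contraction: fix the shared index k and sum the partial
# results. Each slice is an independent, smaller contraction that could run
# on a different node; the sum reproduces the full contraction exactly.

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 8))   # shared index k has size 8
B = rng.standard_normal((8, 5))

full = np.einsum("ik,kj->ij", A, B)            # unsliced contraction

# Slice over k: 8 independent rank-1 contractions, summed at the end.
sliced = sum(np.einsum("i,j->ij", A[:, k], B[k, :]) for k in range(8))

print(np.allclose(full, sliced))  # True
```

Real RQC simulators apply the same identity to huge tensor networks, choosing which indices to slice so that each piece fits in device memory.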

09:41 Semi-Lagrangian 4d, 5d, and 6d kinetic plasma simulation on large scale GPU equipped supercomputer. (arXiv:2110.14557v1 [physics.comp-ph])

Running kinetic plasma physics simulations using grid-based solvers is very demanding both in terms of memory as well as computational cost. This is primarily due to the up to six-dimensional phase space and the associated unfavorable scaling of the computational cost as a function of grid spacing (often termed the curse of dimensionality). In this paper, we present 4d, 5d, and 6d simulations of the Vlasov--Poisson equation with a split-step semi-Lagrangian discontinuous Galerkin scheme on graphic processing units (GPUs). The local communication pattern of this method allows an efficient implementation on large-scale GPU-based systems and emphasizes the importance of considering algorithmic and high-performance computing aspects in unison. We demonstrate a single node performance above 2 TB/s effective memory bandwidth (on a node with 4 A100 GPUs) and show excellent scaling (parallel efficiency between 30% and 67%) for up to 1536 A100 GPUs on JUWELS Booster.
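
A semi-Lagrangian sub-step like those in the split-step scheme above can be shown in one spatial dimension: each grid point takes the field value at the foot of its characteristic, found by interpolation. Grid size and parameters here are arbitrary toy choices, not the paper's discontinuous Galerkin discretization.

```python
import numpy as np

# Toy 1D semi-Lagrangian advection sub-step on a periodic grid: trace each
# characteristic back by c*dt and interpolate the field at the departure point.

def semi_lagrangian_step(f, x, c, dt, L):
    feet = (x - c * dt) % L                  # departure points, periodic wrap
    return np.interp(feet, x, f, period=L)   # sample the field at the feet

L, n = 1.0, 64
x = np.linspace(0.0, L, n, endpoint=False)
f = np.exp(-100.0 * (x - 0.5) ** 2)          # Gaussian pulse

# Advect by exactly a quarter period: the pulse shifts by n/4 grid points.
f1 = semi_lagrangian_step(f, x, c=1.0, dt=0.25, L=L)
print(np.allclose(f1, np.roll(f, n // 4)))   # True
```

Because each point only needs field values near its departure point, communication stays local, which is the property that lets the full 6D scheme scale across many GPUs.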

25.10.2021
04:42 EXSCALATE: An extreme-scale in-silico virtual screening platform to evaluate 1 trillion compounds in 60 hours on 81 PFLOPS supercomputers. (arXiv:2110.11644v1 [cs.DC])

The social and economic impact of the COVID-19 pandemic demands the reduction of the time required to find a therapeutic cure. In the context of urgent computing, we re-designed the Exscalate molecular docking platform to benefit from heterogeneous computation nodes and to avoid scaling issues. We deployed the Exscalate platform on two top European supercomputers (CINECA-Marconi100 and ENI-HPC5), with a combined computational power of 81 PFLOPS, to evaluate the interaction between 70 billion small molecules and 15 binding sites of 12 viral proteins of SARS-CoV-2. The experiment lasted 60 hours and overall it performed a trillion evaluations.
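
The headline numbers are easy to sanity-check: 70 billion molecules against 15 binding sites gives roughly the quoted trillion evaluations, and 60 hours implies the sustained throughput.

```python
# Back-of-the-envelope check of the figures in the abstract.

molecules = 70e9
binding_sites = 15
evaluations = molecules * binding_sites        # 1.05e12, "a trillion"

seconds = 60 * 3600                            # 60 hours
rate = evaluations / seconds                   # evaluations per second

print(f"{evaluations:.2e} evaluations")        # 1.05e+12
print(f"{rate:.2e} evaluations per second")
```

That works out to roughly five million docking evaluations per second sustained across the two machines.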

21.10.2021
18:29 Supercomputer simulations reveal how protein crowding in cells impacts interactions

Supercomputer simulations by RIKEN researchers have revealed how drug binding to a protein target changes as the surrounding environment becomes more cluttered with other proteins. These simulations could help improve drug development since they shed light on why some drugs work in theory but flop in practice.

16:14 Nvidia adds GeForce Now RTX 3080 subscription: 'Gamers deserve a supercomputer too'

Nvidia's upgrade to GeForce Now is powered by what the company calls a SuperPOD, which has more than 1,000 GPUs delivering more than 39 petaflops of graphics horsepower.

20.10.2021
04:02 Energy-based Accounting Model for Heterogeneous Supercomputers. (arXiv:2110.09987v1 [cs.DC])

In this paper we present a new accounting model for heterogeneous supercomputers. An increasing number of supercomputing centres adopt heterogeneous architectures consisting of CPUs and hardware accelerators for their systems. Accounting models using the core hour as unit of measure are redefined to provide an appropriate charging rate based on the computing performance of different processing elements, as well as their energy efficiency and purchase price. In this paper we provide an overview of existing models and define a new model that, while retaining the core hour as a fundamental concept, takes into account the interplay among resources such as CPUs and RAM, and that bases the GPU charging rate on energy consumption. We believe that this model, designed for Pawsey Supercomputing Research Centre's next supercomputer Setonix, has many advantages over other models, introducing carbon footprint as a primary driver in determining the allocation of computational workflow on
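
The structure of such a model can be sketched as a charge with core-hour-style CPU/RAM terms plus an energy-proportional GPU term. All rates below are hypothetical and are not the actual Setonix tariffs.

```python
# Hedged sketch of an energy-aware accounting model: CPU and RAM keep
# core-hour-style rates, while the GPU charge is proportional to energy drawn.

def job_charge(core_hours, ram_gb_hours, gpu_kwh,
               core_rate=1.0, ram_rate=0.05, gpu_rate_per_kwh=2.0):
    cpu_part = core_hours * core_rate
    ram_part = ram_gb_hours * ram_rate      # captures the CPU/RAM interplay
    gpu_part = gpu_kwh * gpu_rate_per_kwh   # energy-based GPU charging
    return cpu_part + ram_part + gpu_part

# 128 core hours, 512 GB-hours of RAM, 3.5 kWh of GPU energy:
print(round(job_charge(core_hours=128, ram_gb_hours=512, gpu_kwh=3.5), 2))  # 160.6
```

Charging GPUs by energy rather than by device-hour is what lets carbon footprint enter the allocation decision directly.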

15.10.2021
21:21 Supercomputers Mimic Brain Activity, Hunt for COVID Treatments

Data scientists are using a technique known as deep learning, computer algorithms patterned on the brain's signaling networks, to identify combinations of medicines to treat infectious disease.

14.10.2021
23:04 Updated Exascale system for Earth simulations is faster than its predecessor

A new version of the Energy Exascale Earth System Model (E3SM) is two times faster than its earlier version released in 2018.

04.10.2021
22:04 Supercomputers reveal how X chromosomes fold, deactivate

Using supercomputer-driven dynamic modeling based on experimental data, researchers can now probe the process that turns off one X chromosome in female mammal embryos. This new capability is helping biologists understand the role of RNA and the chromosome's structure in the X inactivation process, leading to a deeper understanding of gene expression and opening new pathways to drug treatments for gene-based disorders and diseases.

03:54 PICSAR-QED: a Monte Carlo module to simulate Strong-Field Quantum Electrodynamics in Particle-In-Cell codes for exascale architectures. (arXiv:2110.00256v1 [physics.plasm-ph])

Physical scenarios where the electromagnetic fields are so strong that Quantum ElectroDynamics (QED) plays a substantial role are one of the frontiers of contemporary plasma physics research. Investigating those scenarios requires state-of-the-art Particle-In-Cell (PIC) codes able to run on top high-performance computing machines and, at the same time, able to simulate strong-field QED processes. This work presents the PICSAR-QED library, an open-source, portable implementation of a Monte Carlo module designed to provide modern PIC codes with the capability to simulate such processes, and optimized for high-performance computing. Detailed tests and benchmarks are carried out to validate the physical models in PICSAR-QED, to study how numerical parameters affect such models, and to demonstrate its capability to run on different architectures (CPUs and GPUs). Its integration with WarpX, a state-of-the-art PIC code designed to deliver scalable performance on upcoming exascale

01.10.2021
05:10 Toward Performance-Portable PETSc for GPU-based Exascale Systems. (arXiv:2011.00715v2 [cs.MS] UPDATED)

The Portable Extensible Toolkit for Scientific computation (PETSc) library delivers scalable solvers for nonlinear time-dependent differential and algebraic equations and for numerical optimization. The PETSc design for performance portability addresses fundamental GPU accelerator challenges and stresses flexibility and extensibility by separating the programming model used by the application from that used by the library, and it enables application developers to use their preferred programming model, such as Kokkos, RAJA, SYCL, HIP, CUDA, or OpenCL, on upcoming exascale systems. A blueprint for using GPUs from PETSc-based codes is provided, and case studies emphasize the flexibility and high performance achieved on current GPU-based systems.

24.09.2021
08:24 Preparing for exascale: Argonne’s Aurora supercomputer to drive brain map construction

21.09.2021
18:16 Cell's energy secrets revealed with supercomputers

It takes two to tango, as the saying goes.

11:05 Enabling particle applications for exascale computing platforms. (arXiv:2109.09056v1 [cs.DC])

The Exascale Computing Project (ECP) is invested in co-design to assure that key applications are ready for exascale computing. Within ECP, the Co-design Center for Particle Applications (CoPA) is addressing challenges faced by particle-based applications across four sub-motifs: short-range particle-particle interactions (e.g., those which often dominate molecular dynamics (MD) and smoothed particle hydrodynamics (SPH) methods), long-range particle-particle interactions (e.g., electrostatic MD and gravitational N-body), particle-in-cell (PIC) methods, and linear-scaling electronic structure and quantum molecular dynamics (QMD) algorithms. Our crosscutting co-designed technologies fall into two categories: proxy applications (or apps) and libraries. Proxy apps are vehicles used to evaluate the viability of incorporating various types of algorithms, data structures, and architecture-specific optimizations and the associated trade-offs; examples include ExaMiniMD, CabanaMD, CabanaPIC, and

04:01 Pawsey unwraps first stage of AU$48m Setonix HPE research supercomputer Stage 1 of the unveiling of the HPE Cray EX supercomputer will be used by researchers to 'fine tune' the supercomputer before it's fully operational next year. 14.09.2021 08:39 GPU Algorithms for Efficient Exascale Discretizations. (arXiv:2109.05072v1 [cs.DC]) In this paper we describe the research and development activities in the Center for Efficient Exascale Discretization within the US Exascale Computing Project, targeting state-of-the-art high-order finite-element algorithms for high-order applications on GPU-accelerated platforms. We discuss the GPU developments in several components of the CEED software stack, including the libCEED, MAGMA, MFEM, libParanumal, and Nek projects. We report performance and capability improvements in several CEED-enabled applications on both NVIDIA and AMD GPU systems. 07:35 GPU Algorithms for Efficient Exascale Discretizations. (arXiv:2109.05072v1 [cs.DC]) In this paper we describe the research and development activities in the Center for Efficient Exascale Discretization within the US Exascale Computing Project, targeting state-of-the-art high-order finite-element algorithms for high-order applications on GPU-accelerated platforms. We discuss the GPU developments in several components of the CEED software stack, including the libCEED, MAGMA, MFEM, libParanumal, and Nek projects. We report performance and capability improvements in several CEED-enabled applications on both NVIDIA and AMD GPU systems. 13.09.2021 10:56 Efficient Exascale Discretizations: High-Order Finite Element Methods. (arXiv:2109.04996v1 [cs.DC]) Efficient exploitation of exascale architectures requires rethinking of the numerical algorithms used in many large-scale applications. These architectures favor algorithms that expose ultra fine-grain parallelism and maximize the ratio of floating point operations to energy intensive data movement. 
One of the few viable approaches to achieve high efficiency in the area of PDE discretizations on unstructured grids is to use matrix-free/partially-assembled high-order finite element methods, since these methods can increase the accuracy and/or lower the computational time due to reduced data motion. In this paper we provide an overview of the research and development activities in the Center for Efficient Exascale Discretizations (CEED), a co-design center in the Exascale Computing Project that is focused on the development of next-generation discretization software and algorithms to enable a wide range of finite element applications to run efficiently on future hardware. CEED is a 10:56 3D Real-Time Supercomputer Monitoring. (arXiv:2109.04532v1 [cs.DC]) Supercomputers are complex systems producing vast quantities of performance data from multiple sources and of varying types. Performance data from each of the thousands of nodes in a supercomputer tracks multiple forms of storage, memory, networks, processors, and accelerators. Optimization of application performance is critical for cost effective usage of a supercomputer and requires efficient methods for effectively viewing performance data. The combination of supercomputing analytics and 3D gaming visualization enables real-time processing and visual data display of massive amounts of information that humans can process quickly with little training. Our system fully utilizes the capabilities of modern 3D gaming environments to create novel representations of computing hardware which intuitively represent the physical attributes of the supercomputer while displaying real-time alerts and component utilization. This system allows operators to quickly assess how the supercomputer is 10:02 Efficient Exascale Discretizations: High-Order Finite Element Methods. 
(arXiv:2109.04996v1 [cs.DC]) Efficient exploitation of exascale architectures requires rethinking of the numerical algorithms used in many large-scale applications. These architectures favor algorithms that expose ultra fine-grain parallelism and maximize the ratio of floating point operations to energy intensive data movement. One of the few viable approaches to achieve high efficiency in the area of PDE discretizations on unstructured grids is to use matrix-free/partially-assembled high-order finite element methods, since these methods can increase the accuracy and/or lower the computational time due to reduced data motion. In this paper we provide an overview of the research and development activities in the Center for Efficient Exascale Discretizations (CEED), a co-design center in the Exascale Computing Project that is focused on the development of next-generation discretization software and algorithms to enable a wide range of finite element applications to run efficiently on future hardware. CEED is a 02.09.2021 05:54 Plan-based Job Scheduling for Supercomputers with Shared Burst Buffers. (arXiv:2109.00082v1 [cs.DC]) The ever-increasing gap between compute and I/O performance in HPC platforms, together with the development of novel NVMe storage devices (NVRAM), led to the emergence of the burst buffer concept - an intermediate persistent storage layer logically positioned between random-access main memory and a parallel file system. Despite the development of real-world architectures as well as research concepts, resource and job management systems, such as Slurm, provide only marginal support for scheduling jobs with burst buffer requirements, in particular ignoring burst buffers when backfilling. We investigate the impact of burst buffer reservations on the overall efficiency of online job scheduling for common algorithms: First-Come-First-Served (FCFS) and Shortest-Job-First (SJF) EASY-backfilling. 
We evaluate the algorithms in a detailed simulation with I/O side effects. Our results indicate that the lack of burst buffer reservations in backfilling may significantly deteriorate scheduling. We 01.09.2021 08:35 ExaWorks: Workflows for Exascale. (arXiv:2108.13521v1 [cs.DC]) Exascale computers will offer transformative capabilities to combine data-driven and learning-based approaches with traditional simulation applications to accelerate scientific discovery and insight. These software combinations and integrations, however, are difficult to achieve due to challenges of coordination and deployment of heterogeneous software components on diverse and massive platforms. We present the ExaWorks project, which can address many of these challenges: ExaWorks is leading a co-design process to create a workflow software development Toolkit (SDK) consisting of a wide range of workflow management tools that can be composed and interoperate through common interfaces. We describe the initial set of tools and interfaces supported by the SDK, efforts to make them easier to apply to complex science challenges, and examples of their application to exemplar cases. Furthermore, we discuss how our project is working with the workflows community, large computing facilities as 07:30 Bridging the Gap between Deep Learning and Frustrated Quantum Spin System for Extreme-scale Simulations on New Generation of Sunway Supercomputer. (arXiv:2108.13830v1 [cond-mat.str-el]) Efficient numerical methods are promising tools for delivering unique insights into the fascinating properties of physics, such as the highly frustrated quantum many-body systems. However, the computational complexity of obtaining the wave functions for accurately describing the quantum states increases exponentially with respect to particle number. 
Here we present a novel convolutional neural network (CNN) for simulating the two-dimensional highly frustrated spin-$1/2J_1-J_2$Heisenberg model, meanwhile the simulation is performed at an extreme scale system with low cost and high scalability. By ingenious employment of transfer learning and CNN's translational invariance, we successfully investigate the quantum system with the lattice size up to$24\times24$, within 30 million cores of the new generation of Sunway supercomputer. The final achievement demonstrates the effectiveness of CNN-based representation of quantum-state and brings the state-of-the-art record up to a brand-new 31.08.2021 12:49 IBM's fastest supercomputer will be used to find better ways to produce green electricity GE has revealed more information about two research projects that were awarded compute time on the Summit supercomputer. 30.08.2021 06:58 JUWELS Booster -- A Supercomputer for Large-Scale AI Research. (arXiv:2108.11976v1 [cs.DC]) In this article, we present JUWELS Booster, a recently commissioned high-performance computing system at the J\"ulich Supercomputing Center. With its system architecture, most importantly its large number of powerful Graphics Processing Units (GPUs) and its fast interconnect via InfiniBand, it is an ideal machine for large-scale Artificial Intelligence (AI) research and applications. We detail its system architecture, parallel, distributed model training, and benchmarks indicating its outstanding performance. We exemplify its potential for research application by presenting large-scale AI research highlights from various scientific fields that require such a facility. 25.08.2021 14:09 DOE's Argonne Lab to deploy new GPU-based supercomputer Polaris Accelerated by 2,240 Nvidia A100 Tensor Core GPUs, the Polaris system will be able to achieve almost 1.4 exaflops of theoretical AI performance. 
17.08.2021
20:54 Supercomputer calculates pi to a record-breaking 68.2 trillion digits

Researchers are set to break the world record for the most precise value of pi, after using an advanced computer to calculate the famous constant to 68.2 trillion decimal places.

20:17 Cracking a mystery of massive black holes and quasars with supercomputer simulations

Researchers address some of the questions surrounding these massive and enigmatic features of the universe by using new, high-powered simulations.

15:53 Cracking a mystery of massive black holes and quasars with supercomputer simulations

At the center of galaxies, like our own Milky Way, lie massive black holes surrounded by spinning gas. Some shine brightly, with a continuous supply of fuel, while others go dormant for millions of years, only to reawaken with a serendipitous influx of gas. It remains largely a mystery how gas flows across the universe to feed these massive black holes.

16.08.2021
09:34 Scalable3-BO: Big Data meets HPC - A scalable asynchronous parallel high-dimensional Bayesian optimization framework on supercomputers. (arXiv:2108.05969v1 [cs.DC])

Bayesian optimization (BO) is a flexible and powerful framework that is suitable for computationally expensive simulation-based applications and guarantees statistical convergence to the global optimum. While remaining one of the most popular optimization methods, its capability is hindered by the size of data, the dimensionality of the considered problem, and the nature of sequential optimization. These scalability issues are intertwined with each other and must be tackled simultaneously. In this work, we propose the Scalable$^3$-BO framework, which employs sparse GP as the underlying surrogate model to cope with Big Data and is equipped with a random embedding to efficiently optimize high-dimensional problems with low effective dimensionality.
The Scalable$^3$-BO framework is further leveraged with an asynchronous parallelization feature, which fully exploits the computational resources on HPC within a computational budget. As a result, the proposed Scalable$^3$-BO framework is …

10.08.2021
04:37 Preparing for Performance Analysis at Exascale. (arXiv:2108.04002v1 [cs.DC])

Performance tools for forthcoming heterogeneous exascale platforms must address two principal challenges when analyzing execution measurements. First, measurement of extreme-scale executions generates large volumes of performance data. Second, performance metrics for heterogeneous applications are significantly sparse across code regions. To address these challenges, we developed a novel "streaming aggregation" approach to post-mortem analysis that employs both shared and distributed memory parallelism to aggregate sparse performance measurements from every rank, thread and GPU stream of a large-scale application execution. Analysis results are stored in a pair of sparse formats designed for efficient access to related data elements, supporting responsive interactive presentation and scalable data analytics. Empirical analysis shows that our implementation of this approach in HPCToolkit effectively processes measurement data from thousands of threads using a fraction of the compute …

02.08.2021
13:22 Supercomputers are becoming another cloud service. Here's what it means

Designing for the usual cloud workloads isn't the same as designing for high performance computing: Azure is trying to achieve both.
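The random-embedding ingredient of Scalable$^3$-BO above can be sketched in isolation. The snippet below illustrates only the general REMBO-style idea, not the authors' framework: plain random search stands in for the sparse-GP surrogate and acquisition function, and the objective, dimensions, and budget are invented for the example:

```python
import numpy as np

def random_embedding_optimize(f, D, d, n_iter=500, seed=0):
    """Minimize f over [-1, 1]^D by searching a random d-dimensional
    embedding x = clip(A @ z): effective when f only varies along a
    low-dimensional subspace (the REMBO-style idea). Plain random
    search stands in here for the GP surrogate + acquisition step."""
    rng = np.random.default_rng(seed)
    A = rng.normal(size=(D, d))          # random embedding matrix
    best_x, best_y = None, np.inf
    for _ in range(n_iter):
        z = rng.uniform(-np.sqrt(d), np.sqrt(d), size=d)
        x = np.clip(A @ z, -1.0, 1.0)    # map back into the full box
        y = f(x)
        if y < best_y:
            best_x, best_y = x, y
    return best_x, best_y

# Toy objective: 100-dimensional, but only coordinates 0 and 1 matter.
def f(x):
    return (x[0] - 0.5) ** 2 + (x[1] + 0.3) ** 2

x_best, y_best = random_embedding_optimize(f, D=100, d=4)
```

The search only ever proposes points in a 4-dimensional space, yet improves on the 100-dimensional box center because the two effective coordinates are reachable through the embedding.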
30.07.2021
11:48 Supercomputer-Generated Models Provide Better Understanding of Esophageal Disorders

Gastroesophageal reflux disease, more commonly known as GERD, impacts around 20 percent of U.S. citizens, according to the …

22.07.2021
07:58 Preparing for exascale: Argonne’s Aurora to accelerate discoveries in particle physics at CERN

Argonne researchers will use the lab’s upcoming exascale supercomputer to aid in the search for new physics discoveries.

16.07.2021
08:55 Improving I/O Performance for Exascale Applications through Online Data Layout Reorganization. (arXiv:2107.07108v1 [cs.DC])

The applications being developed within the U.S. Exascale Computing Project (ECP) to run on imminent Exascale computers will generate scientific results with unprecedented fidelity and record turn-around time. Many of these codes are based on particle-mesh methods and use advanced algorithms, especially dynamic load-balancing and mesh-refinement, to achieve high performance on Exascale machines. Yet, as such algorithms improve parallel application efficiency, they raise new challenges for I/O logic due to their irregular and dynamic data distributions. Thus, while the enormous data rates of Exascale simulations already challenge existing file system write strategies, the need for efficient read and processing of generated data introduces additional constraints on the data layout strategies that can be used when writing data to secondary storage. We review these I/O challenges and introduce two online data layout reorganization approaches for achieving good tradeoffs between read and …

14.07.2021
02:25 Supercomputer predicts cell-membrane permeability of cyclic peptides

Scientists have developed a computational method based on large-scale molecular dynamics simulations to predict the cell-membrane permeability of cyclic peptides using a supercomputer.
Their protocol has exhibited promising accuracy and may become a useful tool for the design and discovery of cyclic peptide drugs, which could help us reach new therapeutic targets inside cells beyond the capabilities of conventional small-molecule drugs or antibody-based drugs.

13.07.2021
15:59 TSUBAME supercomputer predicts cell-membrane permeability of cyclic peptides

Scientists at Tokyo Institute of Technology have developed a computational method based on large-scale molecular dynamics simulations to predict the cell-membrane permeability of cyclic peptides using a supercomputer. Their protocol has exhibited promising accuracy and may become a useful tool for the design and discovery of cyclic peptide drugs, which could help us reach new therapeutic targets inside cells beyond the capabilities of conventional small-molecule drugs or antibody-based drugs.

07.07.2021
12:14 This powerful new supercomputer is taking on some of healthcare's hardest problems

Cambridge-1 is the UK's fastest supercomputer, and it is dedicated to advancing healthcare research.

28.06.2021
19:20 A new supercomputer has joined the top five most powerful devices around the world

The latest iteration of the Top500 puts the Perlmutter supercomputer in the spotlight.

10:20 New University of Edinburgh supercomputer powered by Nvidia

Nvidia announces powering a new system for a Scottish university and a handful of updates to its HPC portfolio as part of MWC 2021.

24.06.2021
05:13 BFTrainer: Low-Cost Training of Neural Networks on Unfillable Supercomputer Nodes. (arXiv:2106.12091v1 [cs.DC])

Supercomputer FCFS-based scheduling policies result in many transient idle nodes, a phenomenon that is only partially alleviated by backfill scheduling methods that promote small jobs to run before large jobs. Here we describe how to realize a novel use for these otherwise wasted resources, namely, deep neural network (DNN) training.
This important workload is easily organized as many small fragments that can be configured dynamically to fit essentially any node*time hole in a supercomputer's schedule. We describe how the task of rescaling suitable DNN training tasks to fit dynamically changing holes can be formulated as a deterministic mixed integer linear programming (MILP)-based resource allocation algorithm, and show that this MILP problem can be solved efficiently at run time. We show further how this MILP problem can be adapted to optimize for administrator- or user-defined metrics. We validate our method with supercomputer scheduler logs and different DNN training scenarios, and …

23.06.2021
17:37 Machine learning for solar energy is a supercomputer killer

Supercomputers could find themselves out of a job thanks to a suite of new machine learning models that produce rapid, accurate results using a normal laptop.

14:42 Tesla Built a Supercomputer to Develop Camera-Only Self-Driving Tech

Tesla is talking about what it sees as the next leap in autonomous driving that could do away with lidar and radar, leaving self-driving cars to get around with regular optical cameras only. The post Tesla Built a Supercomputer to Develop Camera-Only Self-Driving Tech appeared first on ExtremeTech.

07:52 High Performance Optimization at the Door of the Exascale. (arXiv:2106.11819v1 [cs.DC])

… quest for processing speed potential. In fact, we always get a fraction of the technically available computing power (the so-called {\em theoretical peak}), and the gap is likely to grow hand-in-hand with the hardware complexity of the target system. Among the key aspects of this complexity are: the {\em heterogeneity} of the computing units, the {\em memory hierarchy and partitioning} including the non-uniform memory access (NUMA) configuration, and the {\em interconnect} for data exchanges among the computing nodes.
Scientific investigations and cutting-edge technical activities should ideally scale up with respect to sustained performance. The special case of quantitative approaches for solving (large-scale) problems deserves a special focus. Indeed, most common real-life problems, even when considering the artificial intelligence paradigm, rely on optimization techniques for the main kernels of algorithmic solutions. Mathematical programming and pure combinatorial methods are not …

07:52 Real-Time XFEL Data Analysis at SLAC and NERSC: a Trial Run of Nascent Exascale Experimental Data Analysis. (arXiv:2106.11469v1 [cs.DC])

X-ray scattering experiments using Free Electron Lasers (XFELs) are a powerful tool to determine the molecular structure and function of unknown samples (such as COVID-19 viral proteins). XFEL experiments are a challenge to computing in two ways: i) due to the high cost of running XFELs, a fast turnaround time from data acquisition to data analysis is essential to make informed decisions on experimental protocols; ii) data collection rates are growing exponentially, requiring new scalable algorithms. Here we report our experiences analyzing data from two experiments at the Linac Coherent Light Source (LCLS) during September 2020. Raw data were analyzed on NERSC's Cori XC40 system, using the Superfacility paradigm: our workflow automatically moves raw data between LCLS and NERSC, where it is analyzed using the software package CCTBX. We achieved real-time data analysis with a turnaround time from data acquisition to full molecular reconstruction in as little as 10 min -- sufficient time …

22.06.2021
10:31 Three-body problem -- from Newton to supercomputer plus machine learning. (arXiv:2106.11010v1 [cs.OH])

The famous three-body problem can be traced back to Newton in 1687, but only a few families of periodic orbits were found in the 300 years thereafter. As proved by Poincaré, the first integral does not exist for three-body systems, which implies that a numerical approach has to be used in general. In this paper, we propose an effective approach and roadmap to numerically obtain planar periodic orbits of three-body systems with arbitrary masses by means of machine learning based on an artificial neural network (ANN) model. Given any known periodic orbit as a starting point, this approach can provide more and more periodic orbits (of the same family name) with variable masses, while the mass domain having periodic orbits becomes larger and larger, and the ANN model becomes wiser and wiser.
Finally we have an ANN model trained by means of all obtained periodic orbits of the same family, which provides a convenient way to give accurate enough predictions of periodic orbits with arbitrary …

21.06.2021
20:40 Oil giant Petrobras launches Latin America's largest supercomputer

The new equipment is intended to support the Brazilian company in its data processing requirements and to reduce geological and operational risks.

27.05.2021
20:59 US Energy Department launches the Perlmutter AI supercomputer

The next-generation supercomputer will deliver nearly four exaflops of AI performance.

17.05.2021
09:00 Toward Real-time Analysis of Experimental Science Workloads on Geographically Distributed Supercomputers. (arXiv:2105.06571v1 [cs.DC])

Massive upgrades to science infrastructure are driving data velocities upwards while stimulating adoption of increasingly data-intensive analytics. While next-generation exascale supercomputers promise strong support for I/O-intensive workflows, HPC remains largely untapped by live experiments, because data transfers and disparate batch-queueing policies are prohibitive when faced with scarce instrument time. To bridge this divide, we introduce Balsam: a distributed orchestration platform enabling workflows at the edge to securely and efficiently trigger analytics tasks across a user-managed federation of HPC execution sites. We describe the architecture of the Balsam service, which provides a workflow management API, and distributed sites that provision resources and schedule scalable, fault-tolerant execution. We demonstrate Balsam efficiently scaling real-time analytics from two DOE light sources simultaneously onto three supercomputers (Theta, Summit, and Cori), while …

14.05.2021
15:22 Supercomputer simulations unlock an old space weather puzzle

Scientists have long questioned why the bursts of hot gas from the sun do not cool down as fast as expected, and have now used a supercomputer to find out.
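The three-body item above rests on the fact that, with no first integral, orbits must be computed numerically. A self-contained sketch of that baseline step (not the paper's ANN pipeline): a fixed-step RK4 integrator for the planar Newtonian equations in dimensionless units, started from the well-known figure-eight choreography initial conditions, with conserved energy as a sanity check:

```python
import numpy as np

G = 1.0  # gravitational constant in dimensionless units

def accelerations(pos, masses):
    """Pairwise Newtonian gravity for n planar bodies (pos: n x 2)."""
    n = len(masses)
    acc = np.zeros_like(pos)
    for i in range(n):
        for j in range(n):
            if i != j:
                r = pos[j] - pos[i]
                acc[i] += G * masses[j] * r / np.linalg.norm(r) ** 3
    return acc

def rk4_step(pos, vel, masses, dt):
    """One classical Runge-Kutta step for the coupled position/velocity ODEs."""
    k1v = accelerations(pos, masses);              k1x = vel
    k2v = accelerations(pos + 0.5*dt*k1x, masses); k2x = vel + 0.5*dt*k1v
    k3v = accelerations(pos + 0.5*dt*k2x, masses); k3x = vel + 0.5*dt*k2v
    k4v = accelerations(pos + dt*k3x, masses);     k4x = vel + dt*k3v
    pos = pos + dt/6 * (k1x + 2*k2x + 2*k3x + k4x)
    vel = vel + dt/6 * (k1v + 2*k2v + 2*k3v + k4v)
    return pos, vel

def energy(pos, vel, masses):
    """Total mechanical energy: kinetic plus pairwise potential."""
    kin = 0.5 * np.sum(masses * np.sum(vel**2, axis=1))
    pot = 0.0
    n = len(masses)
    for i in range(n):
        for j in range(i + 1, n):
            pot -= G * masses[i] * masses[j] / np.linalg.norm(pos[i] - pos[j])
    return kin + pot

# Equal masses on the figure-eight choreography (Chenciner-Montgomery) orbit.
masses = np.ones(3)
pos = np.array([[-0.97000436,  0.24308753],
                [ 0.97000436, -0.24308753],
                [ 0.0,         0.0       ]])
vel = np.array([[ 0.46620369,  0.43236573],
                [ 0.46620369,  0.43236573],
                [-0.93240737, -0.86473146]])

e0 = energy(pos, vel, masses)
for _ in range(2000):
    pos, vel = rk4_step(pos, vel, masses, dt=0.005)
assert abs(energy(pos, vel, masses) - e0) < 1e-5  # energy is well conserved
```

Searching for a *periodic* orbit then amounts to adjusting initial conditions until the trajectory returns to its starting state, which is the expensive step the paper's ANN model short-circuits.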
28.04.2021
09:20 Singapore to build second national supercomputer with more on roadmap

Built to support the local research community, the new national supercomputer will run on a warm-water-cooled system designed for tropical climates and provide up to 10 petaflops of raw compute capacity when it is operational in early 2022.

26.04.2021
17:09 Advancing AI With a Supercomputer: A Blueprint for an Optoelectronic “Brain”

Building a computer that can support artificial intelligence at the scale and complexity of the human brain will be a colossal engineering effort. Now researchers at the National Institute of Standards and Technology have outlined how they think we’ll get there. How, when, and whether we’ll ever create machines that can match our cognitive capabilities […]

23.04.2021
17:08 Met Office and Microsoft to build weather-forecasting supercomputer

The Met Office and Microsoft are to build a world-leading supercomputer capable of providing more accurate warnings of severe weather as part of a multimillion-pound agreement.

22.04.2021
16:07 This billion-dollar supercomputer will be used to create super-accurate weather forecasts

The Met Office is investing in a new supercomputer to improve the precision of weather and climate models.

06:43 Quantum ESPRESSO towards the exascale. (arXiv:2104.10502v1 [physics.comp-ph])

Quantum ESPRESSO is an open-source distribution of computer codes for quantum-mechanical materials modeling, based on density-functional theory, pseudopotentials, and plane waves, and renowned for its performance on a wide range of hardware architectures, from laptops to massively parallel computers, as well as for the breadth of its applications. In this paper we present a motivation and brief review of the ongoing effort to port Quantum ESPRESSO onto heterogeneous architectures based on hardware accelerators, which will overcome the energy constraints that are currently hindering the way towards exascale computing.
21.04.2021
09:19 Exascale Landau collision operator in the Cuda programming model applied to thermal quench plasmas. (arXiv:2104.10000v1 [physics.plasm-ph])

The Landau form of the Fokker-Planck equation is the gold standard for plasmas dominated by small-angle collisions, however its $\mathcal{O}(N^2)$ work complexity has limited its practicality. This paper extends previous work on a fully conservative finite element method for this Landau collision operator with adaptive mesh refinement, optimized for vector machines, by porting the algorithm to the Cuda programming model with implementations in Cuda and Kokkos, and by reporting results within a Vlasov-Maxwell-Landau model of a plasma thermal quench. With new optimizations of the Landau kernel and ports of this kernel, the sparse matrix assembly and algebraic solver to Cuda, the cost of a well-resolved Landau collision time advance is shown to be practical for kinetic plasma applications. This fully implicit Landau time integrator and the plasma quench model are available in the PETSc (Portable, Extensible Toolkit for Scientific computing) numerical library.

18.04.2021
23:25 Supercomputer simulations illuminate the behavior of dominant G form of SARS-CoV-2

Large-scale supercomputer simulations at the atomic level show that the dominant G form variant of the COVID-19-causing virus is more infectious partly because of its greater ability to readily bind to its target host receptor in the body, compared to other variants.

15.04.2021
01:11 Dell, Nvidia power new "cloud-native supercomputer" in the UK

The expanded system at the University of Cambridge will deliver multi-tenant high performance computing for research spanning astrophysics, nuclear fusion power generation and clinical medicine applications.

09.04.2021
12:41 US adds Chinese supercomputer centers to export blacklist

The United States on Thursday restricted trade with top Chinese supercomputing centers, saying that Beijing's growing efforts in the field could have military uses that pose dangers. The seven centers or entities were put on the US government's entity list, which means they require special permission for exports and imports from the United States. "Supercomputing capabilities are vital for the development of many -- perhaps almost all -- modern weapons and national security systems, such as nuclear weapons and hypersonic weapons," Commerce Secretary Gina Raimondo said in a statement. She said the commerce department would "use the full extent of its authorities to prevent China from leveraging US technologies to support these destabilizing military modernization efforts." The centers hit with the restrictions include the National Supercomputing Center in the eastern city of Wuxi, home to the Sunway TaihuLight, which was considered the world's fastest when it was launched in 2016 --

08:42 US blacklists seven Chinese supercomputer groups

President Biden's actions continue US moves to make it harder for China to obtain its technology.

30.03.2021
11:05 MT-lib: A Topology-aware Message Transfer Library for Graph500 on Supercomputers. (arXiv:2103.15024v1 [cs.DC])

We present MT-lib, an efficient message transfer library for message gather and scatter in benchmarks like Graph500 for supercomputers. Our library includes an MST version as well as a new-MST version. MT-lib is deliberately kept light-weight and efficient, with friendly interfaces for massive graph traversal. MST provides (1) a novel non-blocking communication scheme that sends and receives messages asynchronously to overlap calculation and communication; (2) merging of messages according to the target process to reduce communication overhead; (3) a new communication mode of gathering intra-group messages before forwarding between groups to reduce communication traffic. In MT-lib, there are (1) one-sided messages; (2) two-sided messages; and (3) two-sided messages with buffer, in which dynamic buffer expansion is built for message delivery. We experimented with MST and then tested Graph500 with MST on Tianhe supercomputers. Experimental results show high communication efficiency and …
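MST's second ingredient, merging messages by target process before sending, can be sketched without MPI. The class and threshold below are hypothetical, invented for illustration only; a real implementation would flush each merged batch with an MPI send:

```python
from collections import defaultdict

class MessageBatcher:
    """Toy sketch of target-aware message merging: instead of sending each
    small message individually, buffer per destination rank and flush merged
    batches, cutting per-message overhead (the MT-lib idea, minus MPI)."""

    def __init__(self, flush_threshold=4):
        self.buffers = defaultdict(list)   # destination rank -> pending messages
        self.flush_threshold = flush_threshold
        self.sends = 0                     # number of (merged) sends issued

    def post(self, dest, payload):
        """Queue a message; flush automatically once the batch is full."""
        self.buffers[dest].append(payload)
        if len(self.buffers[dest]) >= self.flush_threshold:
            self.flush(dest)

    def flush(self, dest):
        """Emit one merged send for everything queued toward `dest`."""
        if self.buffers[dest]:
            batch = self.buffers.pop(dest)
            self.sends += 1                # one network send for the whole batch
            return batch
        return []

    def flush_all(self):
        """Drain all remaining partial batches (e.g. at end of a traversal step)."""
        for dest in list(self.buffers):
            self.flush(dest)

b = MessageBatcher(flush_threshold=4)
for v in range(10):
    b.post(dest=v % 2, payload=v)   # 10 tiny messages toward 2 destinations
b.flush_all()
assert b.sends == 4  # merged into 4 batched sends instead of 10
```

The same batching logic composes with the abstract's intra-group gathering: batches destined for ranks in a remote group would first be merged again at a group leader before crossing the slower inter-group links.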

29.03.2021
15:12 Europe Plans 20,000 GPU Supercomputer to Create ‘Digital Twin’ of Earth

GPU prices are not moving in the right direction. The plan to create a digital twin of Earth might end up delayed due to the relative lack of available GPUs, but this isn't going to be an overnight project.  The post Europe Plans 20,000 GPU Supercomputer to Create ‘Digital Twin’ of Earth appeared first on ExtremeTech.

27.03.2021
11:23 Preparing for exascale: Aurora supercomputer to help scientists visualize the spread of cancer

Scientists are preparing a cancer modeling study to run on Argonne’s upcoming Aurora supercomputer before it goes online

20.03.2021
00:52 Chemists use supercomputers to understand solvents

To understand the fundamental properties of an industrial solvent, chemists with the University of Cincinnati turned to a supercomputer.

18.03.2021
15:24 This powerful supercomputer was built in just 20 weeks, with a bit of help from a tiny robot

Nvidia announced that it would build Cambridge-1 only 20 weeks ago. Now the supercomputer is almost up and running, despite a global health crisis.

17.03.2021
22:40 Supercomputers advance longer-lasting, faster-charging batteries

In an effort to curb the rise in overall carbon vehicle emissions, the state of California recently announced …

22:17 Supercomputers Help Accelerate Alzheimer’s Research

Since 2009, Daniel Tward and his collaborators have analyzed more than 47,000 images of human brains via MRI Cloud—a …

09.03.2021
16:29 The world's most powerful supercomputer is now up and running

Japan's Fugaku supercomputer is likely to become researchers' new favorite toy.
