May 22

VSC User Day 2018

Today, I am attending the 4th VSC User Day at the “Paleis der Academiën” in Brussels. Flemish researchers, for whom the lifeblood of their research flows through the chips of a supercomputer, are gathered here to discuss their experiences and present their research.

Some History

About 10 years ago, at the end of 2007 and the beginning of 2008, the 5 Flemish universities founded the Flemish Supercomputer Center (VSC), a virtual organisation with one central goal: combine their strengths and know-how with regard to High Performance Computing (HPC) centers, to make sure they stay competitive with comparable HPC centers elsewhere.

By installing a super-fast network between the various university compute centers, every Flemish researcher nowadays has access to state-of-the-art computer infrastructure, independent of his or her physical location. A researcher at the University of Hasselt, like myself, can easily run calculations on the supercomputers installed at the universities of Ghent or Leuven. In October 2012, the existing university supercomputers, so-called Tier-2 supercomputers, were joined by the first Flemish Tier-1 supercomputer, housed at the brand-new data centre of Ghent University. This machine was significantly larger than the existing Tier-2 machines, and allowed Belgium to become the 25th member of the PRACE network, a European network which provides computational researchers access to the best and largest computer facilities in Europe. The fast development of computational research in Flanders and the explosive growth in the number of computational researchers, combined with the first shared Flemish supercomputer (in contrast to the university Tier-2 supercomputers, which some still consider private property rather than part of the VSC), show the impact of the virtual organisation that is the VSC.

As a result, on January 16th 2014, the first VSC User Day was organised, bringing together HPC users from all 5 universities and industry to share their experiences and discuss possible improvements and changes. Since then, the first Tier-1 supercomputer has been decommissioned and replaced by a brand-new Tier-1 machine, this time located at KU Leuven. Furthermore, the Flemish government has set aside 30M€ for supercomputing in Flanders, making sure that Flemish computational research also stays competitive in the future. The future of computational research in Flanders looks bright.

Today is User Day 2018

During the 4th VSC User Day, researchers of all 5 Flemish universities will present the work they are performing on the supercomputers of the VSC network. The range of topics is very broad: from first-principles materials modelling to chip design, climate modelling and space weather. In addition, there will be several workshops, introducing new users to the VSC and teaching advanced users the finer details of GPU code, code optimization and parallelization. This latter aspect is hugely important for the use of supercomputers in an academic context. Much of the software used is developed or modified by the researchers themselves. And even though this software can show impressive behavior, it does not speed up automatically if you give it access to more CPUs. That is a very non-trivial task the researcher has to take care of, by carefully optimizing and parallelizing his or her code.

To support the researchers in their work, the VSC came up with an ingenious set of poster prizes. The three best posters will share 2018 node-days of calculation time (about 155 years of calculations on a normal, simple computer).
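As a sanity check on that conversion (assuming, as I do here, a 28-core node like those of the current Tier-1 machine; the post does not state the core count):

```python
# Convert the 2018 node-day poster prize into single-core compute years.
# The 28 cores per node is my assumption, not stated in the post.
CORES_PER_NODE = 28

def node_days_to_cpu_years(node_days, cores_per_node=CORES_PER_NODE):
    """Equivalent years of calculation on a single core."""
    return node_days * cores_per_node / 365.25

print(round(node_days_to_cpu_years(2018)))  # -> 155
```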

Wish me luck!

 

Single-slide presentation of my poster @VSC User Day 2018.


Mar 27

Fairy tale science or a science fairy tale?

Once upon a time…

Once upon a time, a long time ago—21 days ago to be precise—there was a conference in the tranquil town of Hasselt. Every year, for 23 years in a row, researchers gathered there for three full days, to present and adore their most colorful and largest diamonds. For three full days, there was just that little bit more of a sparkle in their eyes. They divulged where new diamonds could be found, and how they could be used. Three days during which they could speak without any restriction, without hesitation, about the sixth element which bonds them all. Because all knew the language. They honored the magic of the NV-center and the arcane incantations leading to the highest doping. All, masters of their common mystic craft.

At the end of the third day, with sadness in their hearts, they said their good-byes and went back, in small groups, to their own ivory towers, far far away. With them, however, they took a small sparkle of hope and expectation, because in twelve full moons they would reconvene, bringing with them new and grander tales and even more sparkling diamonds than had ever been seen before.

For most outsiders, the average conference presentation is as clear as an arcane conjuration of a mythological beast. As scientists, we are often trapped by the assumption that our unique expertise is common knowledge for our audience, a side-effect of our enthusiasm for our own work.

Clear vs. accurate

In a world where science faces constant pressure due to the financing model employed—in addition to the rise of “fake news” and “alternative facts”—it is important for young researchers to be able to bring their story clearly and accurately.

However, clear and accurate often have the bad habit of counteracting one another, and as such, maintaining a good balance between the two takes a lot more effort than one might expect. Focusing on only one aspect (accuracy or clarity) tends to be disastrous. Conference presentations and scientific publications tend to focus on accuracy, making them not clear at all for the non-initiate. Public presentations and newspaper articles, on the other hand, focus mainly on clarity, with fake-news accidents waiting to happen. For example, one could recently read that 7% of the DNA of the astronaut Scott Kelly had changed during a space flight, instead of a change in gene expression. Although the two may look similar, they are very different. The latter is a rather natural response of the (human) body to any stress situation. The former, however, would remove Scott from the human race entirely. Even the average gorilla would be more closely related to you and me than to Scott Kelly, as gorillas differ less than 5% in their DNA from our own. So keeping a good balance between clarity and accuracy is important, albeit not that easy. Time pressure plays an important role here.

Two extremes?

Wetenschapsbattle Trophy: Each of the contestants of the wetenschapsbattle received a specially designed and created hat from the children of the school judging the contest. Mine has diamonds and computers. 🙂

In the week following the diamond conference in Hasselt, I also participated in a science battle: a contest in which researchers have to explain their research to a public of 6- to 12-year-olds in a time-span of 15 minutes. These kids are judge, jury and executioner of the contest, so to speak. It is a natural reflex to place these two events at the opposite ends of a scale. And for some aspects that is certainly true; the entire room spontaneously volunteering when asked for help is something which happens somewhat less often at a scientific conference. However, clarity and accuracy should be equally central aspects of both.

So, how do you explain your complex research story to a crowd of 6- to 12-year-olds? I discovered the answer during a masterclass by The Floor is Yours. Actually, more or less the same way you should tell it to an audience of adults, or even your own colleagues. As a researcher, you are a specialist in a very narrow field, which means that no one will lose out when the focus is shifted a bit more toward clarity. The main problem you encounter here, however, is time: both the time required to tell your story (forget “elevator pitches”; those are good if you are a used-car salesman, they are not for science) and the time required to prepare your story (it took me a few weeks to build and then polish my story for the children).

Most of this time is spent answering the questions: “What am I actually doing?” and “Why am I doing this specifically?“. The quest for metaphors which are both clear and accurate takes quite some time. During this task you tend to suffer, as a scientist, from the combination of your need for accuracy and your deep background knowledge. These are the same inhibitors a scientist encounters when involved in a public discussion on his/her own field of expertise.

Of course you also do not want to be pedantic:

Q: What do you do?

A: I am a Computational Materials Researcher.

Q: Compu-what??

A: 1) Computational = using a computer

2) Materials = everything you see around you, the stuff everything is made of

3) Researcher = Me

However, as a scientist, you may want to use such imaginary discussions during your preparation. Starting from these pedantic dialogues, you trace a path along the answers which interest you most: the topics which touch your scientific personality. This way, you take a step back from your direct research and get a broader picture. Also, by talking about themes, you present your research from a broader perspective, which is more easily accessible to your audience: “What are atoms?”, “How do you make diamond?”, “What is a computer simulation?”

At the end—after much blood, sweat and tears—your story tells something about your world as a whole. Depending on your audience, you can include more or less detailed aspects of your actual day-to-day research, but at its heart, it remains a story.

Because, whether we like it or not, in essence we are all “Pan narrans”, storytelling apes.

Jan 19

Newsflash: Book-chapter on MOFs and Zeolites en route to bookstores near you.

It is almost a year since I wrote a book chapter, together with Bartek Szyja, on MOFs and Zeolites. Coming March 2018, the book will be available through University press. It is interesting to note that in a 13-chapter book, ours was the only chapter dealing with the computational study and simulation of these materials… so there is a lot more that can be done by those who are interested and have the patience to perform these delicate and often difficult, but extremely rewarding, studies. From my time as a MOF researcher I have learned two important things:

  1. Any kind of interesting/extreme/silly physics you can imagine will be present in some MOF. In this regard, the MOF/COF field is still in its infancy, as most experimental work focuses on simple applications such as catalysis and gas storage, for which other materials may be better suited. These porous materials may in theory be interesting for direct industrial application, but the synthesis cost will generally be a bottleneck. Instead, look toward the fundamental physics applications: low-dimensional magnetism, low-dimensional conduction, spin-filters, multiferroics, electron-phonon interactions, interactions between spin and mechanical properties… MOFs are a true playground for the theoretician.
  2. MOFs are very hard to simulate correctly, so be wary of all (published) results that come computationally cheap and easy. Although the unit cell of any MOF is huge compared to standard solid-state materials, the electron interactions are also quite long-ranged, so the first Brillouin zone needs very accurate sampling (something often neglected). Spin configurations can also have a huge influence, especially in systems with a rather flat potential energy surface.
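For readers starting such calculations, the two warnings above translate directly into input-file settings. A minimal VASP-style sketch (the grid size and MAGMOM values are illustrative placeholders, to be converged and adapted for the actual material, not recommendations for any specific MOF):

```
# KPOINTS: sample the first Brillouin zone densely enough,
# even though the unit cell is large (convergence must be tested).
Automatic mesh
0
Gamma
4 4 4            ! illustrative grid; converge this for your MOF

# INCAR (fragment): spin-polarized run with an explicit starting
# spin configuration, since different magnetic orderings can lie
# very close in energy on a flat potential energy surface.
ISPIN  = 2
MAGMOM = 4*5.0 60*0.0   ! placeholder: 4 magnetic metal ions + 60 other atoms
```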

In the book-chapter, we discuss some basic techniques used in the computational study of MOFs, COFs, and Zeolites, which will be of interest to researchers starting in the field. We discuss molecular dynamics and Monte Carlo, as well as Density Functional Theory and all its benefits and limitations.

Jan 09

A Spectre and Meltdown victim: VASP

Over the last weekend, two serious cyber security issues were hot news: Meltdown and Spectre (not to be mistaken for the title of a Bond movie). As a result, academic HPC centers also went into overdrive, installing patches as fast as possible. The news of the two security issues went hand in hand with quite a few belittling comments toward the chip designers, ignoring the fact that no one (including those complaining now) discovered the problem for over a decade. Of course there was also the usual scare-mongering (cyber-criminals will hack our devices by next Monday, because hacks using these bugs will immediately become their default tools, etc.), typical since the beginning of the 21st century… but now it is time to return to reality.

One of the big users of scientific HPC installations is the VASP program, aimed at the quantum mechanical simulation of materials, and a program central to my own work. Due to the serendipitous coincidence of an annoyingly hard-to-converge job, I had the opportunity to see the impact of the Meltdown and Spectre patches on the performance of VASP: a 16% performance loss (within the range of the expected 10-50% performance loss for high-performance applications [1][2][3]).

The case:

  • large HSE06 calculation of a 71 atom defective ZnO supercell.
  • 14 irreducible k-points (no reduction of the Hartree-Fock k-points)
  • 14 nodes of 24 cores, with KPAR=14, and NPAR=1 (I know NPAR=24 is the recommended option)

The calculation took several runs of about 10 electronic steps each (at about 5-6 h wall-time per step, roughly 2.54 years of CPU-time per run). The relative average time is shown below (error bars are the standard deviation of the times within a single run). As the final step takes about 50% longer, it is treated separately. As you can see, the variation in time between different electronic steps is rather small (even running on a different cluster only changes the time by a few %). The Meltdown/Spectre patch, in contrast, has a significant impact.
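The 16% figure follows from comparing average electronic-step times before and after the patch. A toy illustration of that comparison (the step times below are hypothetical placeholders, not the measured values):

```python
def slowdown_percent(t_before, t_after):
    """Relative performance loss when a step that took t_before
    hours now takes t_after hours."""
    return 100.0 * (t_after - t_before) / t_before

# Hypothetical pre/post-patch average step times (hours):
print(f"{slowdown_percent(5.0, 5.8):.0f}% slower")  # -> 16% slower
```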


Impact of Meltdown/Spectre patch on VASP performance for a 336 core MPI job.

 

The HPC team is currently looking into possible workarounds that could (partially) alleviate the problem. VASP itself is not very I/O intensive, and a first check by the HPC team points toward MPI (the parallelisation framework required for multi-node jobs) being ‘a’ if not ‘the’ culprit. This means that an impact on other multi-node programs is to be expected as well. On the bright side, finding a workaround for MPI would benefit all of them too.

So far, the tests I performed with the HPC team have not shown any improvements (recompiling VASP didn’t help, nor did an MPI-related fix). Let’s keep our fingers crossed, and hope the future brings insight and a solution.

 

Jan 01

Review of 2017

Happy New Year

2017 has come and gone. 2018 eagerly awaits getting acquainted. But first we look back one last time, trying to turn this into an old tradition. What have I done of some academic merit during the last year?

Publications: +4

Completed refereeing tasks: +8

  • The Journal of Physical Chemistry (2x)
  • Journal of Physics: Condensed Matter (3x)
  • Diamond and Related Materials (3x)

Conferences & workshops: +5 (Attended) 

  • Int. Conference on Diamond and Carbon Materials (DCM) 2017, Gothenburg, Sweden, September 3rd-7th, 2017 [oral presentation]
  • Summerschool: “Upscaling techniques for mathematical models involving multiple scales”, Hasselt, Belgium, June 26th-29th, 2017 [poster presentation]
  • VSC-user day, Brussels, Belgium, June 2nd, 2017 [poster presentation]
  • E-MRS 2017 Spring Meeting, Strasbourg, France, May 22nd-26th, 2017 [1 oral + 2 poster presentations]
  • SBDD XXII, Hasselt University, Belgium, March 8th-10th, 2017 [poster presentation]

PhD-students: +1

  • Mohammadreza Hosseini (Oct.-…, PhD student physical chemistry, Tarbiat Modares University, Tehran, Iran)

Bachelor-students: +2

Current size of HIVE:

  • 48.5K lines of program (code: 70 %)
  • ~70 files
  • 45 (command line) options

Hive-STM program:

And now, upward and onward, a new year, a fresh start.

Nov 12

Slow science: the case of Pt induced nanowires on Ge(001)

Free-standing Pt-induced nanowire on Ge(001).

Simulated STM image of the Pt-induced nanowires on the Ge(001) surface. Green discs indicate the atomic positions of the bulk-Ge atoms; red: Pt atoms embedded in the top surface layers; yellow: Ge atoms forming the nanowire observed by STM.

Ten years ago, I was happily modeling Pt nanowires on Ge(001) during my first Ph.D. at the University of Twente. As a member of the Computational Materials Science group, I was also lucky to have good and open contact with the experimental research group of Prof. Zandvliet, who was growing these nanowires. In this environment, I learned there is a big difference between what is easy in experiment and what is easy in computational research. It also taught me to find a common ground which is “easy” for both (scanning tunneling microscopy (STM) images in this specific case).

During this 4-year project, I quickly came to the conclusion that the nanowires could not be formed by Pt atoms, but had to consist of Ge atoms instead. Although the simulated STM images were very convincing, it was really hard to overcome the experimental intuition… and experiments which seemed to contradict this picture (doi: 10.1016/j.susc.2006.07.055). As a result, I spent a lot of time learning about the practical aspects of the experiments (an STM tip is a complicated thing) and trying to extract every possible piece of information, published and unpublished. Especially the latter provided important support. The “ugly” (=not good for publishing) experimental pictures tended to be real treasures from my computational point of view. Of course, much time was also spent tweaking the computational model to get a perfect match with experiment (e.g. the 4×1 periodicity), and trying to reproduce experiments seemingly supporting the “Ge-nanowire” model (e.g. the simulation of CO adsorption and the identification of the path the molecule follows along the wire).

In contrast to my optimism at the end of my first year (I believed all modeling could be finished before my second year ended), the modeling work ended up being a very complex exercise, taking 4 years of research. Now I am happy that I was wrong, as the final result ended up being very robust and became “The model for Pt induced nanowires on Ge(001)“.

Upon writing a review article on this field five years after my Ph.D., I was amazed (and happy) to see my model still stood. Even more, there had been complex experimental studies (doi: 10.1103/PhysRevB.85.245438) which seemed to support the model I proposed. However, these experiments were still making an indirect comparison. A direct comparison supporting the Ge nature of the nanowires was still missing… until recently.

In a recent paper in Phys. Rev. B (doi: 10.1103/PhysRevB.96.155415), a Japanese-Turkish collaboration succeeded in identifying the nanowire atoms as Ge atoms. They did this using an Atomic Force Microscope (AFM) and a sample of Pt-induced nanowires in which some of the nanowire atoms were replaced by Sn atoms. The experiment is rather simple in idea (the execution, however, requires rather advanced skills): compare the forces experienced by the AFM when measuring the Sn atoms, the chain atoms and the surface atoms. The Sn atoms are easily recognized, while the surface is known to consist of Ge atoms. If the relative force of a chain atom is the same as that of the surface atoms, then the chain consists of Ge atoms, while if the force is different, the chain consists of Pt atoms.

*small drum-roll*

And they found the result to be the same.

Yes, nearly 10 years after my first publication on the subject, there finally is experimental proof that the Pt nanowires on Ge(001) consist of Ge atoms. Seeing this paper made me one happy computational scientist. For me it shows the power of computational research, and provides an argument why one should not be shy to push calculations to their limit. The computational cost may be high, but at least one is performing relevant work. And of course, never forget: the most seemingly easy experiments are usually not easy at all, so as a computational materials scientist you should not take them for granted, but let those experimentalists know how much you appreciate their work and effort.

Oct 26

Audioslides tryout.

One of the new features provided by Elsevier upon publication is the creation of audioslides: a kind of short presentation of the publication by one of the authors. I have been itching to try this since our publication on the neutral C-vacancy was published. The interface is quite intuitive, although the Adobe Flash plugin tends to have a hard time finding the microphone. However, once it succeeds, things go quite smoothly. The resolution of the slides is a bit low, which is unfortunate (but this is only for the small-scale version; the large-scale version is quite nice, as you can see in the link below). Maybe I’ll make a high-resolution video version and put it on YouTube later.

The result is available here (since the embedding doesn’t play nicely with WP).

And a video version can be found here.
 

Sep 23

Revisiting the Neutral C-Vacancy in Diamond: Localization of Electrons through DFT+U

Authors: Danny E. P. Vanpoucke and Ken Haenen
Journal: Diam. Relat. Mater. 79, 60-69 (2017)
doi: 10.1016/j.diamond.2017.08.009
IF(2016): 2.561
export: bibtex
pdf: <DiamRelatMater>

 

Graphical Abstract: Combining a scan over possible values for U and J with reference electronic structures obtained using the hybrid functional HSE06, DFT+U can be fit to provide hybrid functional quality electronic structures at the cost of DFT calculations.

Abstract

The neutral C-vacancy is investigated using density functional theory calculations. We show that local functionals, such as PBE, can predict the correct stability order of the different spin states, and that the success of this prediction is related to the accurate description of the local magnetic configuration. Despite the correct prediction of the stability order, the PBE functional still fails predicting the defect states correctly. Introduction of a fraction of exact exchange, as is done in hybrid functionals such as HSE06, remedies this failure, but at a steep computational cost. Since the defect states are strongly localized, the introduction of additional on-site Coulomb and exchange interactions, through the DFT+U method, is shown to resolve the failure as well, but at a much lower computational cost. In this work, we present optimized U and J parameters for DFT+U calculations, allowing for the accurate prediction of defect states in defective diamond. Using the PBE optimized atomic structure and the HSE06 optimized electronic structure as reference, a pair of on-site Coulomb and exchange parameters (U,J) are fitted for DFT+U studies of defects in diamond.

Related:

Poster-presentation: here

DFT+U series (varying J) for a specific spin state of the C-vacancy defect.


Sep 23

A combined experimental and theoretical investigation of the Al-Melamine reactive milling system: a mechanistic study towards AlN-based ceramics

Authors: Seyyed Amin Rounaghi, Danny E.P. Vanpoucke, Hossein Eshghi, Sergio Scudino, Elaheh Esmaeili, Steffen Oswald and Jürgen Eckert
Journal: J. Alloys Compd. 729, 240-248 (2017)
doi: 10.1016/j.jallcom.2017.09.168
IF(2016): 3.133
export: bibtex
pdf: <J.Alloys Compd.>

 

Graphical Abstract: Evolution of the end products as function of Al and N content during ball-milling synthesis of AlN.

Abstract

A versatile ball milling process was employed for the synthesis of hexagonal aluminum nitride (h-AlN) through the reaction of metallic aluminum with melamine. A combined experimental and theoretical study was carried out to evaluate the synthesized products. Milling intermediates and products were fully characterized via various techniques including XRD, FTIR, XPS, Raman and TEM. Moreover, a Boltzmann distribution model was proposed to investigate the effect of milling energy and reactant ratios on the thermodynamic stability and the proportion of different milling products. According to the results, the reaction mechanism and milling products were significantly influenced by the reactant ratio. The optimized condition for AlN synthesis was found to be at an Al/M molar ratio of 6, where the final products consisted of nanostructured AlN with an average crystallite size of 11 nm and non-crystalline heterogeneous carbon.

Aug 29

Exa-scale computing future in Europe?

As a computational materials scientist with a main research interest in the ab initio simulation of materials, computational resources are the lifeblood of my research. Over the last decade, I have seen my resource usage grow from less than 100.000 CPU hours per year to several million CPU hours per year. To satisfy this need for computational resources I make use of HPC facilities, like the Tier-2 machines available at the Flemish universities and the Flemish Tier-1 supercomputer, currently hosted at KU Leuven. At the international level, computational scientists have access to so-called Tier-0 machines, which I will no doubt make use of in the future. Before I continue, let me first explain a little what this Tier-X business actually means.

The TIER-X notation is used to give an indication of the size of the computer/supercomputer indicated. There are 4 sizes:

  • Tier-3: This is your personal computer (laptop/desktop) or even a small local cluster of a research group. It can contain from one (desktop) up to a few hundred CPUs (local cluster). Within materials research, this is sufficient for quite a few tasks: post-processing of data, simple force-field based calculations, or even small quantum chemical or solid state calculations. A large fraction of the work during my first Ph.D. was performed on the local cluster of the CMS group.
  • Tier-2: This is a supercomputer hosted by an institute or university. It generally contains over 1000 CPUs and has a peak performance of >10 TFLOPS (10^12 Floating Point Operations Per Second; compare this to the 1-25 GFLOPS of an average personal computer). The Tier-2 facilities of the VUB and UAntwerp both have a peak performance of about 75 TFLOPS, while the machines at Ghent University and the KU Leuven/UHasselt facilities both have a peak performance of about 230 TFLOPS. Using these machines I was able to perform the calculations necessary for my study of dopant elements in cerates (and obtain my second Ph.D.).
  • Tier-1: Moving up one more step, there are the national/regional supercomputers. These generally contain over 10.000 CPUs and have a peak performance of over 100 TFLOPS. In Flanders, the Flemish Supercomputer Center (VSC) manages the Tier-1 machine (funded by the 5 Flemish universities). The first Tier-1 machine was hosted at Ghent University, while the second and current one is hosted at KU Leuven; it has a peak performance of 623 TFLOPS (more than all Tier-2 machines combined) and cost about 5.5 million € (one of the reasons it is a regional machine). Over the last 5 years, I was granted over 10 million hours of CPU time, sufficient for my study of Metal-Organic Frameworks and defects in diamond.
  • Tier-0: These are international-level supercomputers. These machines contain over 100.000 CPUs, and have a peak performance in excess of 1 PFLOPS (1 PetaFLOPS = 1000 TFLOPS). In Europe, the Tier-0 facilities are available to researchers via the PRACE network (access to 7 Tier-0 machines, with a combined 43.49 PFLOPS).
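To put these peak numbers side by side, here is a rough sketch using the 25 GFLOPS upper estimate for a personal computer mentioned above (peak-to-peak comparison only, ignoring real-world efficiency):

```python
PC_GFLOPS = 25  # upper estimate for an average PC, from the text

peaks_gflops = {  # peak performance figures quoted in the text
    "Tier-2 (VUB / UAntwerp)": 75_000,
    "Tier-2 (UGent, KU Leuven/UHasselt)": 230_000,
    "Tier-1 (KU Leuven)": 623_000,
    "Tier-0 (PRACE, combined)": 43_490_000,
}

for name, gflops in peaks_gflops.items():
    # How many average PCs match each machine's peak performance?
    print(f"{name}: ~{gflops // PC_GFLOPS:,} PC equivalents")
```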

This is roughly the status of what is available today for Flemish scientists at the various levels. With the constantly growing demand for more processing power, the European Union, through the EuroHPC initiative, decided in March of this year that Europe will host two Exa-scale computers. These machines will have a peak performance of at least 1 EFLOPS, or 1000 PFLOPS, and are expected to be built by 2024-2025. In June, Belgium signed up to EuroHPC as the eighth participating country, in addition to the initial 7 (Germany, France, Spain, Portugal, Italy, Luxemburg and The Netherlands).

This is very good news for everyone involved in computational research in Flanders. There is a plan to build these machines, there is a deadline… there just isn’t an idea yet of what these machines should look like (except: they will be big, massively power-consuming, and have a target peak performance). To get an idea of what users expect of such a machine, Tier-1 and HPC users have been asked to put forward requests and suggestions.

From my personal experience as a user, and extrapolating from my own usage, I can easily see myself using 20 million hours of CPU time each year by the time these Exa-scale machines are built. Leading a computational group would multiply this value. And then we are only talking about simple production calculations for “standard” problems.

The claim that an Exa-scale machine runs 1000x faster than a Peta-scale machine is not entirely justified, at least not for the software I generally encounter. As software seldom scales linearly, the speed gain from Exa-scale machinery mainly comes from the ability to perform many more calculations in parallel. (There are some exceptions which will gain within the single-job area, but this type of job is limited.) Within my own field, the quantum mechanical calculation of the electronic structure of periodic atomic systems, all required resources tend to grow with the problem size. As such, a larger system (=more atoms) requires more CPU time, but also more memory. This means that compute nodes with many cores are welcome and desired, but these cores need the associated memory: doubling the cores would require the memory on a node to be doubled as well. Communication between the nodes should be fast as well, as this will be the main limiting factor on the scaling performance. If all this is implemented well, then the time to solution of a project (not a single calculation) will improve significantly with access to Exa-scale resources. The factor will not be 100x compared to a PFLOPS system, but could be much better than 10x. This factor of 10 also takes into account that projects will have access to much more demanding calculations as a default (hybrid-functional structure optimization instead of simple density functional theory structure optimization, which is ~1000x cheaper for plane-wave methods but less accurate).
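The non-linear scaling argument above can be made concrete with Amdahl's law: if only a fraction p of a run parallelizes, the speedup on n cores is bounded by 1/((1-p)+p/n). A quick sketch (the p values are illustrative, not measured for any particular code):

```python
def amdahl_speedup(p, n_cores):
    """Upper bound on speedup when a fraction p of the serial
    run time is perfectly parallelizable over n_cores."""
    return 1.0 / ((1.0 - p) + p / n_cores)

# Even a 99.9%-parallel code gains only ~500x on 1000 cores:
for p in (0.95, 0.99, 0.999):
    print(f"p={p}: {amdahl_speedup(p, 1000):.0f}x on 1000 cores")
```

This is why throwing 1000x more cores at a single job rarely yields a 1000x speedup, and why the real gain lies in running many more jobs side by side.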

At this scale, parallelism is very important, and implementing it into a program is far from a trivial task. While most physicists/mathematicians/chemists/engineers may have the skills to write scientifically sound software, we are not computer scientists, and our available time and skills are limited in this regard. For this reason, it will become more important for the HPC facility to provide parallelization of software as a service, i.e. have a group of highly skilled computer scientists available to assist with or even perform this task.

Next to having the best implementation of the software available, it should also be possible to get access to these machines. This should not be limited to a happy few through a peer-review process which just wastes human research potential. Instead, access should be a mix of guaranteed access and peer review:

  • Guaranteed access: For standard production projects (5-25 million CPU hours/year), university researchers should have a guaranteed access model. This would allow them to perform state of the art research without too much overhead. To prevent access by people without the proven necessary need/skills, a user database could be created and appended upon each application. Upon first application, a local HPC team (country/region/university Tier-1 infrastructure) would have to provide a recommendation with regard to the user, including a statement of the applicant’s resource usage at that facility. Getting resources in a guaranteed access project would also require a limited project proposal (max 2 pages, including user credentials, requested resources, and a small description of the project).
  • Peer review access: This would be for special projects, in which the researcher requires a huge chunk of resources to perform highly specialized calculations or large high-throughput exercises (order of 250-1000 million CPU hours, e.g. Nature Communications 8, 15959 (2017)). In this case, a full project proposal with serious peer review would be required (including a rebuttal stage, or the possibility to resubmit after addressing the indicated problems). The goal of this peer review system should not be to limit the number of accepted projects, but to make sure the accepted projects run successfully.
  • Pay per use: This should be the option for industrial/commercial users.

What could an HPC user such as myself do to contribute to the success of EuroHPC? This is rather simple: run the machine as a pilot user (I have experience on most of the Tier-2 clusters of Ghent University and both Flemish Tier-1 machines. I successfully crashed the programs I am using by pushing them beyond their limits during pilot testing, and ran into rather unfortunate issues. 🙂 That is the job of a pilot user: use the machine/software in unexpected ways, such that problems can be resolved/fixed by the time the bulk of the users get access.) and perform peer review of the larger specialized projects.

Now the only thing left to do is wait. Wait for the Exa-scale supercomputers to be built… 7 years to go… about 92 node-days on Breniac… a starting grant… one long weekend of calculations.

Appendix

For simplicity, I use the term CPU to indicate a single compute core, even though technically a single CPU nowadays contains multiple cores (desktop/laptop: 2-8 cores; HPC compute node: 2-20 cores per CPU, or more). This makes comparisons a bit easier.

Furthermore, modern computer systems start more and more to rely on GPU performance as well, which is also a possible road toward Exa-scale computing.

Orders of magnitude:

  • G = Giga = 10^9
  • T = Tera = 10^12
  • P = Peta = 10^15
  • E = Exa = 10^18
