Tag: academic life

VSC-users day 2019

It is becoming an interesting yearly occurrence: the VSC user day. During this 5th edition, HPC users of the various Flemish universities gather at the Belgian Royal Academy of Science (KVAB) to present their state-of-the-art work using the Flemish Tier-1 and Tier-2 supercomputers, in a poster-presentation session. This year, I presented my work on vibrational spectra in solids and periodic systems. In contrast to molecules, vibrational spectra of solids are rarely investigated at the quantum mechanical level due to the high computational cost. I show that imaginary modes are not necessarily the result of structural instabilities, and I present a method for identifying the vibrational spectrum of a defect.

Poster for the VSC user day 2019.

In addition, international speakers discussed recent (r)evolutions in High Performance Computing, and workshops introduced the participants to new topics such as GPU computing, parallelization, and the VSC Cloud and data platform. The possibilities of GPUs were presented by Ehsan of the VSC, who showed extreme speedups of 10x to 100x, strongly depending on the application and the graphics card. It is interesting to see that simple CUDA pragmas can be used to obtain such effects…maybe I should have a go at them for the Hirshfeld and phonon parts of my HIVE code…if they can deal with quadruple precision and very large arrays. During the presentation of Joost Vandevondele (ETH Zürich) we learned what the future holds for next-generation HPC machines. As increases in speed become harder and harder to obtain, people are again looking into dedicated hardware systems, a situation akin to the founding days of HPC. Whether this is a development we should applaud remains to be seen, as it means we are moving back to codes written for specific machines. This decrease in portability will probably be alleviated by high-level scripting languages (such as Python), which at the same time eat away a significant part of the initial gain. (Think of the framework approach to modern programming, which leads to trivial applications requiring HPC resources just to start.)

In addition, this year the HPC team of the Tier-1 machine was present for a panel discussion on the future of the infrastructure. The machine nearly doubled in size, which is great news. Let us hope that, in addition to financing hardware, a significant budget is also set aside for a serious extension of the dedicated HPC support team. Running a Tier-1 machine is not something one does as a side project; it requires the constant vigilance of a dedicated team to deal with software updates, the resulting compatibility issues, conflicting scripts, and hardware and software simply running haywire because they can.

With this hope, I look toward the future. A future where computational research is steadily, and ever more quickly, becoming commonplace in the fabric of academic endeavors.

Universiteit Van Vlaanderen

A bit over a month ago, I told you about my adventure at the film studio of “de Universiteit Van Vlaanderen”. Today is the day the movie is officially released. You can find it at the website of de Universiteit Van Vlaanderen: Video. The video is in Dutch, as this is a science-communication platform aimed at the local population, presenting the expertise available at our local universities.


In addition to this video, I was asked by Knack magazine to write a piece on the topic presented. As computational research is my central business, I wrote a piece introducing the general public to the subject. The piece can be read here (in Dutch).

And of course, before I forget: this weekend there was also the half-yearly daylight saving exercise with our clocks [and in Dutch].


SBDD XXIV: Diamond workshop

The participants of SBDD XXIV, 2019. (Courtesy of Jorne Raymakers, SBDD XXIV secretary)

Last week the 24th edition of the Hasselt diamond workshop took place (this year chaired by Christoph Becher). It is already the fourth time since 2016 that I have attended this conference, and each year it is a joy to meet up with the familiar faces of the diamond research field. The program was packed, as usual, and this year the NV center was again predominantly present as the all-purpose quantum defect in diamond. I keep being amazed at how much it is used (although it has a rather low efficiency), and also at how many open questions remain regarding its incorporation during growth. With a little luck, you may read more about this in the future, as it is one of a few dozen ideas and questions I want to investigate.

A very interesting talk was given by Yamaguchi Takahide, who combines hexagonal BN and H-terminated diamond for high-performance electronic devices. In such a device, the h-BN leads to the formation of a 2D hole gas at the interface (i.e., surface transfer doping), making it interesting for low-dimensional applications. (And it of course hints at the opportunities available with other 2D materials.) The most interesting, and to my mind most mind-boggling, fact was that there is no clear picture of the atomic structure of the interface. But that is probably just me. For experiments, nature tends to make sure everything is alright, while we lowly computational materials artificers need to know where each and every atom belongs. I’ll have to make some time to find out.

A second extremely interesting presentation was given by Anke Krueger (who will chair the 25th edition of SBDD next year), showing off her group’s skill at creating fluorine-terminated diamond…without getting themselves killed. The surface termination of diamond with fluorine comes with many different hazards, ranging from mere poisoning to fire and explosions. The take-home message: “kids, don’t try this at home”. Despite all this risky business, a surface coverage of up to 85% was achieved, providing a new surface termination for diamond with a much stronger trapping of negative charges near the surface, ideal for forming negatively charged NV centers.

On the last day, Rozita Rouzbahani presented our collaboration on the growth of B-doped diamond. She studied the impact of growth conditions on the B concentration and the growth speed of B-doped diamond surfaces. My computational results corroborate her observations and provide the atomic-scale mechanism behind the increased doping concentration at increased growth speed. I am looking forward to the submission of this nice piece of research.

And now, we wait another year for the next edition of SBDD, the celebratory 25th edition with a focus on diamond surfaces.

Universiteit Van Vlaanderen: Will we be able to design new materials using our smartphone in the future?

Yesterday, I had the pleasure of giving a lecture for the Universiteit van Vlaanderen, a science-communication platform where Flemish academics are asked to answer “a question related to their research”. This question is meant to be highly clickable and very much simplified; the lecture itself, on the other hand, is aimed at a general lay public.

I built my lecture around the topic of materials simulations at the atomic scale. This task ended up being rather challenging, as my computational research has very little direct overlap with the everyday life of the average person. I deal with supercomputers (which these days tend to be benchmarked in terms of smartphone power) and the quantum mechanical simulation of materials at the atomic scale: two topics which may ring a bell, but only as abstract notions people may have heard of.

Therefore, I crafted a story taking people on a fast ride down the rabbit hole of my work. Starting from the almost divine power of the computational materials scientist over his theoretical sample, over the reality of nanoscale materials in our day-to-day lives, past the relative size of atoms, and through the game-like nature of simulations and the salvation of computational research by grace of Moore’s Law…to the conclusion that in 25 years we may be designing the next generation of CPU materials on our smartphones instead of on a TIER-1 supercomputer. …did I say we went down the rabbit hole?

The television experience itself was very exhilarating. Although my actual lecture took only 15 minutes, the entire event took almost a full day, starting with preparations and a trial run in the afternoon (for me and my four colleagues), followed by make-up (to make me look pretty on television 🙂 … or just to reduce my reflectance). In the evening we had a group dinner, meeting the people in charge of the technical aspects and the entertainment of the public. And then it was 19h30. Tension started to grow. The public entered the studio, and the show was ready to start. Before each lecture, there was a short interview to test sound and light, and to introduce us to the public. As the middle presenter, I had the comfortable position of not being first, so I could get an idea of how things went for my colleagues, and of not being last, which can really be destructive on your nerves.

At 21h00, I was up…

and down the rabbit hole we went. 


Full periodic table, with all elements presented with their relative size (if known) created for the Universiteit van Vlaanderen lecture.


Newsflash: Materials of the Future

This summer, I had the pleasure of being interviewed by Kim Verhaeghe, a journalist at EOS magazine, on the topic of “materials of the future”: materials which are currently being investigated in the lab and which, in the near or distant future, may have an enormous impact on our lives. While brushing up on my materials (since materials whose important length scales lie beyond 1 nm are generally outside my world of accessibility), I discovered that covering this field would require at least an entire book just to list the “materials of the future”. Many materials deserve that title because of their potential, and depending on your background, different materials may draw your primary attention.

In the resulting article, Kim Verhaeghe succeeded in presenting a nice selection, and I am very happy I could contribute to the story: introducing “the computational materials scientist” making use of supercomputers such as BrENIAC, but also new materials such as Metal-Organic Frameworks (MOFs), and shedding some light on “old” materials such as diamond, graphene, and carbon nanotubes.

Review of 2017

Happy New Year

2017 has come and gone. 2018 eagerly awaits getting acquainted. But first, we look back one last time, trying to turn this into an old tradition. What did I do of some academic merit during the last year?

Publications: +4

Completed refereeing tasks: +8

  • The Journal of Physical Chemistry (2x)
  • Journal of Physics: Condensed Matter (3x)
  • Diamond and Related Materials (3x)

Conferences & workshops: +5 (Attended) 

  • Int. Conference on Diamond and Carbon Materials (DCM) 2017, Gothenburg, Sweden, September 3rd-7th, 2017 [oral presentation]
  • Summerschool: “Upscaling techniques for mathematical models involving multiple scales”, Hasselt, Belgium, June 26th-29th, 2017 [poster presentation]
  • VSC-user day, Brussels, Belgium, June 2nd, 2017 [poster presentation]
  • E-MRS 2017 Spring Meeting, Strasbourg, France, May 22nd-26th, 2017 [1 oral + 2 poster presentations]
  • SBDD XXII, Hasselt University, Belgium, March 8th-10th, 2017 [poster presentation]

PhD-students: +1

  • Mohammadreza Hosseini (Oct.-…, PhD student in physical chemistry, Tarbiat Modares University, Tehran, Iran)

Bachelor-students: +2

Current size of HIVE:

  • 48.5K lines of program (code: 70 %)
  • ~70 files
  • 45 (command line) options

Hive-STM program:

And now, upward and onward, a new year, a fresh start.

Functional Molecular Modelling: simulating particles in excel

This semester I had several teaching assignments. I was a TA for the biophysics course for the first-year bachelor students in biomedical sciences, supervised two third-year bachelor students in physics during their first steps in the realm of computational materials science, and, finally, I was responsible for half of the course Functional Molecular Modelling for the first-year master’s students in Biomedical sciences (Bioelectronics and Nanotechnology). In this course, I introduce the students to the basic concepts of classical molecular modelling (quantum modelling is covered by Prof. Wilfried Langenaeker). It starts with a reiteration of some basic concepts from statistics and moves on to cover the canonical ensemble. Things get more interesting with the introduction of Monte Carlo (MC) and Molecular Dynamics (MD), where I hope to teach the students the basics needed to perform their own MC and MD simulations. This also touches the heart of what this course should cover. When I hear a title like Functional Molecular Modelling, my thoughts move directly to practical applications, developing and implementing models, and performing simulations. That becomes a bit difficult, as none of the students have any programming experience or skills.

Luckily, there is Excel. As the basic algorithms for MC and MD are actually quite simple, this office package can be (ab)used to let the students perform very simple simulations, even without the use of macros or any advanced features. And because Excel can plot the data present in the cells, you immediately see how the properties of the simulated system vary during the simulation, with all graphs updated every time a simulation is run.

It seems I am not the only one using Excel for MD simulations. In 1995, Fraser and Woodcock even published a paper detailing the use of Excel for performing MD simulations on a system of 100 particles. Their MD is a bit more advanced than my setup, as it made heavy use of macros and needed some features to speed things up as much as possible. On the 486 66 MHz computers available at the time, the simulations took on the order of hours. This was impressive in its own right, as it showed how computational speed had improved over the years: one of the authors had needed months of supercomputer resources 25 years earlier to perform the same simulation for his PhD. Nowadays, the same Excel simulation should take only minutes, while an actual program in Fortran or C may execute the same thing in a matter of seconds or less.

For the classes and exercises, I made use of a simple three-atom toy model with Lennard-Jones interactions. The resulting simulations remain clear, which makes them well suited for educational purposes. In the case of MC simulations, a nice added bonus is the fact that Excel updates all its fields automatically when a cell is modified. As a result, all random numbers are regenerated and a new simulation is performed every time the Excel sheet is saved or an unused cell is modified.
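For reference, the Lennard-Jones pair potential used in these exercises has the standard form

$$V_{\mathrm{LJ}}(r) = 4\varepsilon\left[\left(\frac{\sigma}{r}\right)^{12} - \left(\frac{\sigma}{r}\right)^{6}\right],$$

with ε the depth of the potential well and σ the interatomic distance at which the potential crosses zero.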

Monte Carlo in Excel. A system of three particles on a line, with one particle fixed at 0. All particles interact through a Lennard-Jones potential. The Monte Carlo simulation shows how the particles move toward their equilibrium positions.
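For those who want to play with the algorithm outside of Excel, the same exercise fits in a few lines of Python. Below is a minimal sketch of the Metropolis Monte Carlo setup described above; the parameter values (well depth, step size, temperature) are illustrative assumptions, not the ones used in the actual spreadsheet.

```python
# A minimal Python sketch of the Metropolis Monte Carlo exercise above:
# three particles on a line, the first fixed at x = 0, all pairs interacting
# through a Lennard-Jones potential. Parameter values (EPS, SIG, KT, STEP)
# are illustrative assumptions, not those of the actual spreadsheet.
import math
import random

EPS, SIG = 1.0, 1.0    # Lennard-Jones well depth and zero-crossing distance
KT = 0.1               # thermal energy k_B*T
STEP = 0.1             # maximum trial displacement
N_STEPS = 5000

def lj(r):
    """Lennard-Jones pair energy: 4*eps*((sig/r)^12 - (sig/r)^6)."""
    sr6 = (SIG / r) ** 6
    return 4.0 * EPS * (sr6 * sr6 - sr6)

def total_energy(x):
    """Sum of the pair energies over all pairs of particles."""
    return sum(lj(abs(x[j] - x[i]))
               for i in range(len(x)) for j in range(i + 1, len(x)))

x = [0.0, 1.5, 3.2]    # initial positions; particle 0 never moves
energy = total_energy(x)

for step in range(N_STEPS):
    i = random.randint(1, len(x) - 1)       # pick a movable particle
    x_old = x[i]
    x[i] += random.uniform(-STEP, STEP)     # trial displacement
    e_new = total_energy(x)
    # Metropolis criterion: always accept downhill moves,
    # accept uphill moves with probability exp(-dE / kT).
    if e_new < energy or random.random() < math.exp(-(e_new - energy) / KT):
        energy = e_new
    else:
        x[i] = x_old                        # reject: restore the old position

print("final positions:", x, "final energy:", energy)
```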

The simplicity of Newton’s equations of motion makes it possible to perform simple MD simulations as well, and already for a three-particle system you can see how unstable a straightforward integration is. Implementing the leap-frog algorithm is not much more complex and shows the incredible stability of this algorithm. In the plot of the total energy you can even see how the algorithm fights back to retain stability (the spikes may seem large, but the same setup with a straightforward implementation of Newton’s equations of motion quickly moves to energies of the order of 100).
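For reference, the standard leap-frog scheme evaluates the velocities at half-steps, interleaved with the positions:

$$v_i^{\,n+1/2} = v_i^{\,n-1/2} + a_i^{\,n}\,\Delta t, \qquad x_i^{\,n+1} = x_i^{\,n} + v_i^{\,n+1/2}\,\Delta t.$$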

Molecular dynamics in Excel. A system of three particles on a line, with one particle fixed at 0. All particles interact through a Lennard-Jones potential. The molecular dynamics simulation shows how the particles move as time evolves. Their positions are updated using the leap-frog algorithm. The extremely hard nature of the Lennard-Jones potential gives rise to the sharp spikes in the total energy. It is this last aspect which causes the straightforward implementation of Newton’s equations of motion to fail.
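The same toy model with leap-frog dynamics fits in a comparable Python sketch; again, the time step, mass, and starting positions are illustrative assumptions rather than the spreadsheet’s values.

```python
# A minimal Python sketch of the leap-frog molecular dynamics exercise above:
# three particles on a line, the first held fixed at x = 0, with Lennard-Jones
# forces between all pairs. Time step, mass, and starting positions are
# illustrative assumptions, not the values used in the actual spreadsheet.
EPS, SIG, DT, MASS = 1.0, 1.0, 0.001, 1.0
N_STEPS = 10000

def force(x, i):
    """Total Lennard-Jones force on particle i along the line:
    F(r) = 24*eps*(2*(sig/r)^12 - (sig/r)^6)/r for each pair."""
    f = 0.0
    for j in range(len(x)):
        if j != i:
            r = x[i] - x[j]                 # signed separation
            sr6 = (SIG / abs(r)) ** 6
            f += 24.0 * EPS * (2.0 * sr6 * sr6 - sr6) / r
    return f

x = [0.0, 1.5, 3.2]    # positions at time t
v = [0.0, 0.0, 0.0]    # velocities at time t - dt/2 (the leap-frog offset)

for step in range(N_STEPS):
    acc = {i: force(x, i) / MASS for i in (1, 2)}   # accelerations at time t
    for i in (1, 2):                                # particle 0 stays fixed
        v[i] += acc[i] * DT    # v(t + dt/2) = v(t - dt/2) + a(t)*dt
        x[i] += v[i] * DT      # x(t + dt)   = x(t) + v(t + dt/2)*dt

print("final positions:", x)
```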


Scaling of VASP 5.4.1 on TIER-1b BrENIAC

When running programs on HPC infrastructure, one of the first questions to ask yourself is: “How well does this program scale?”

In applications for HPC resources, this question plays a central role, often with the additional remark: “But for your specific system!”. For some software packages this is an important remark; for others it has little relevance, as the package performs similarly for all input (or for given classes of input). The VASP package is one of the latter. For my current resource application at the Flemish TIER-1, I set out to do a more extensive scaling test of the VASP package, for two reasons: first, I will be using a newer version of VASP, 5.4.1 (I am currently using my own multiply-patched version 5.3.3); second, I will be using a brand-new TIER-1 machine (the second Flemish TIER-1, as our beloved muk retired at the end of 2016).

Why should I put in the effort to get access to resources on such a TIER-1 supercomputer? Because such machines are the lifeblood of the computational materials scientist: their sidekick in the quest for understanding materials. Over the past four years, I was granted (and used) 20900 node-days of calculation time (i.e., over 8 million hours of CPU time, or 916 years of calculation time) on the first TIER-1 machine.

Now back to the topic. How well does VASP 5.4.1 behave? That depends on the system at hand, and on how well you choose the parallelization settings.

1. Parallelization in VASP

VASP provides several parameters which allow for straightforward parallelization of the simulation:

  • NPAR : This parameter can be set to parallelize over the electronic bands. As a consequence, the number of bands included in a calculation by VASP will be a multiple of NPAR. (Note : Hybrid calculations are an exception as they require NPAR to be set to the number of cores used per k-point.)
  • NCORE : The NCORE parameter is related to NPAR via NCORE=#cores/NPAR, so only one of these can be set.
  • KPAR : This parameter can be set to parallelize over the set of irreducible k-points used to integrate over the first Brillouin zone. KPAR is therefore best chosen as a divisor of the number of irreducible k-points.
  • LPLANE : This boolean parameter allows one to switch on parallelization over plane waves. In general this will give rise to a small but consistent speedup (observed in previous scaling tests). As such we have chosen to set this parameter = .TRUE. for all calculations.
  • NSIM : Sets up a blocked mode for the RMM-DIIS algorithm. (cf. manual, and further tests by Peter Larsson). As our tests do not involve the RMM-DIIS algorithm, this parameter was not set.

In addition, one needs to keep the architecture of the HPC system in mind: NPAR, KPAR, and their product should be divisors of the number of nodes (and cores) used.
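As a small illustration of this divisor bookkeeping (my own sketch, not a VASP tool; the function name and interface are assumptions), a helper along these lines can suggest a KPAR/NPAR pair for a given job. It simply encodes the rule of thumb that emerges from the tests below: KPAR as large as the node count and the number of irreducible k-points allow, NPAR minimal.

```python
# A sketch of the divisor constraints described above (my own helper,
# not part of VASP; names and interface are illustrative assumptions).
def suggest_parallelization(n_nodes, n_kpoints):
    """Return a KPAR/NPAR suggestion for a VASP job on n_nodes nodes."""
    # Largest KPAR that divides the node count without exceeding
    # the number of irreducible k-points.
    kpar = max(k for k in range(1, n_nodes + 1)
               if n_nodes % k == 0 and k <= n_kpoints)
    npar = 1   # best performance was consistently KPAR high, NPAR = 1
    return {"KPAR": kpar, "NPAR": npar}

# Example: 12 irreducible k-points on 4 nodes -> KPAR = 4, NPAR = 1.
print(suggest_parallelization(4, 12))
```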

2. Results

The NPAR and KPAR parameters can be set simultaneously and have a serious influence on the speed of your calculations. However, not all possible combinations give the same speedup; even worse, not all combinations are beneficial with regard to speed. This is best seen for a small 2-atom diamond primitive cell.

Timing results for various combinations of the NPAR and KPAR parameters for a 2 atom primitive unit cell of diamond.

Two things are clear. First of all, switching on a parallelization parameter does not necessarily speed up the calculation; in some cases it may actually slow it down. Secondly, the best and worst performance is consistently obtained with the same settings. Best: KPAR maximal and NPAR = 1; worst: KPAR = 1 and NPAR maximal.

This small system shows what you can expect for systems with many k-points and very few electronic bands. (Actually, any real calculation on this system would require only 8 electronic bands, not the 56 used here to be able to assess the performance of the NPAR parameter.)

In a medium-sized system (20-100 atoms), the situation is different. There, the number of k-points will be small (5-50) while the natural number of electronic bands will be large (>100). As a test case, I looked at my favorite Metal-Organic Framework: MIL-47(V).

Timing results for several NPAR/KPAR combinations for the MIL-47(V) system.

This system has only 12 k-points to parallelize over, and 224 electronic bands. The spread per number of nodes is more limited than for the small system, but the general trend remains the same: KPAR high, NPAR low, with optimum performance when KPAR = #nodes. Going beyond standard DFT, using hybrid functionals, retains the same picture, although in some cases about 10% performance can be gained by using half a node per k-point. Unfortunately, as we have very few k-points to start from, this is only an advantage if the limiting factor is the number of available nodes.

An interesting behaviour is seen when one keeps the k-points/#nodes ratio constant:

Scaling behavior for either a constant number of k-points (for dense k-point grid in medium sized system) or constant k-point/#nodes ratio.

As you can see, VASP performs really well up to KPAR = #k-points (>80% efficiency). More interestingly, if the k-point/#nodes ratio is kept constant, the efficiency (now calculated as T1/(T2*NPAR), with T1 the timing on a single node and T2 on multiple nodes) is roughly constant. I.e., if you know the walltime for a 2-k-points/2-nodes job, you can expect the same for the same system with 20 k-points on 20 nodes (think density-of-states and band-structure calculations, or simply a change in the number of k-points due to symmetry reduction or a change of the k-point grid). 😆

3. Conclusions and Guidelines

If one thing is clear from the current set of tests, it is that good scaling is possible. How it is attained, however, depends greatly on the system at hand. More importantly, a poor choice of parallelization settings can be very detrimental to the obtained speedup and efficiency. Unfortunately, when performing calculations on an HPC system, one has externally imposed limitations to work with:

  • Memory available per core
  • Number of cores per node[1]
  • Size of your system: #atoms, #k-points, & #bands

Here are some guidelines (some open doors as well):

  • Wherever possible, k-point parallelization should be driven to the maximum (KPAR as large as possible). The limiting factors here are the actual number of k-points and the amount of available memory, the latter because a higher value of KPAR leads to a higher memory requirement.[2]
  • Use the Γ-version of VASP for Γ-point-only calculations. It reduces memory usage significantly (3.7 GB → 2.8 GB/core for a 512-atom diamond system) and increases computational efficiency, sometimes even by a factor of 2.
  • NPAR parallelization can be used to reduce the memory load for high KPAR calculations, but increasing KPAR will always outperform the same increase of NPAR.
  • In case too few k-points are available for KPAR parallelization, as happens for large systems, NPAR parallelization is your last resort; it performs reasonably well, up to a point.
  • Electronic steps show very consistent timing, so scaling tests can be performed with only 5-10 electronic steps, with standard deviations comparable in absolute size for PBE and HSE06.


In short:

K-point parallelism will save you, wherever possible !


[1] 28 is a lousy number in that regard, as its prime decomposition is 2×2×7, leaving little overlap with the prime decompositions of the numbers of k-points, which more often than you would wish end up being prime numbers themselves 😥
[2] The small system’s memory requirements varied from 0.15 to 1.09 GB/core for the different combinations.

BrENIAC: the new Flemish TIER-1 Supercomputer.

Yesterday was a good day for computational scientists in Flanders. The new TIER-1 machine, named BrENIAC and located at the University of Leuven, was inaugurated and is now officially open to all users of the Flemish university associations: UAntwerpen, VUB, UGhent, UHasselt, and KULeuven. The name refers to one of the first (super)computers ever built: ENIAC. The new machine will take over from the first TIER-1 machine (muk, located at the university of Ghent), which will be decommissioned at the end of this year. BrENIAC is ranked 196th in the current Top500 list of supercomputers and cost 5.5 M€, not counting the annual cost of power and of the technical personnel who maintain the machine and provide support for the scientists running calculations. With its 580 compute nodes of 28 cores each (two 14-core Broadwell E5-2680v4 CPUs), the number of available cores has roughly doubled. Memory access should also have improved, giving rise to a theoretical threefold increase of the peak performance.

However, this peak performance is measured with “benchmark” tests, which tend to behave much better than real-life programs. This is because the average scientific programmer doesn’t write the best-optimized code (OK, “commercial” programs these days may even behave worse :p ) for various reasons, time constraints being one of them. So my first task, before I start running my simulations on the new TIER-1 machine, will be to benchmark VASP and my own HIVE code.

Two videos of my new sidekick:


You can see me in my front-row position in this picture taken during the non-academic part of the inauguration.

tUL Life Sciences Research Day 2016

Yesterday was the tUL Life Sciences Research Day 2016, a conference event built around finding collaboration possibilities between the University of Hasselt in Belgium and the University of Maastricht in the Netherlands…after all, tUL is the “transnational University Limburg”, which brings together two universities separated by only some 26 km, albeit across a national border.

Although life sciences is not my personal niche, I went to look for opportunities, as nanoparticles used for drug delivery often consist of metals or oxides, and those materials are my niche. I used my current work on MOFs to show what is possible from the ab initio point of view, and presented this as a poster.

Poster presented at the tUL Life Sciences Research Day, depicting my work on the unfunctionalized and the functionalized MIL-47(V) MOF.