On the fifth and final day of the workshop we return to the lab. Our task as a group: optimize our raspberry pink lacquer with regard to hardness, glossiness and chemical resistance.
The four cans of base material made during day 1 of the workshop were mixed to make sure we were all using the same base material (there are already sufficient noise introducing variables present, so any that can be eliminated should be.). Next, each team got a set of recipes generated with the ML algorithm to create. The idea was to parallelise the human part of the process. This would actually also have made for a very interesting exercise to perform in a computer science program. It showed perfectly how bottlenecks are formed and what impact is of serial sections and access/distribution of resources (or is this just in my mind? 😎 ). After a first round of samples, we already tried to improve the performance of our unit by starting the preparation of the next batch (prefetching 😉 ) while the results of the previous samples were entered into the ML algorithm, and that was run.
At the end of two update rounds, we discussed the results, there were already some clear improvements visible, but a few more rounds would have been needed to get to the best situation. A very interesting aspect to notice during such an exercise, is the difference in the concept of accuracy for the experimental side and the computational side of the story. While the computer easily spits out values in grams with 10 significant digits, at the experimental side of the story it was already extremely hard to get the same amounts with an accuracy of 0.02 gram (the present air currents give larger changes on the scale).
This workshop was a very satisfying experience. I believe I learned most with regard to Machine Learning from the unintentional observation in the lab. Thank you Christian and Kevin!
Day 4 of the workshop is again a machine learning centered day. Today we were introduced into the world of Gaussian Processes, and ML approach which is rooted in statistics and models data by looking at the averages of a distribution of functions…it is a function of functions. In contrast to most other ML approaches it is also very well suited for small data sets, which is why I had my eye on them already for quite some time. However, Gaussian Processes are not perfect and interestingly enough, their drawbacks and benefits seem quite complementary with the benefits and drawbacks to neural networks. Deep Gaussian Covariance Networks (DGCN) find their origin in this observation, and were designed with the idea of compensating the drawbacks of both approaches by combining them. The resulting approach is rather powerful and in contrast to any other ML approach: it does not have any hyper-parameter!!
Tomorrow, during the last day of the workshop, we will be using this DGCN to optimize our raspberry pink lacquers.
Today the workshop shifted gears a bit. We left the experimental side of the story and moved fully into the world of machine learning. This change went hand-in-hand with a doubling of the number of participants, showing how a hot-topic machine learning really is. Kevin Cremanns, who is presenting this part of the workshop, started by putting things into perspective a bit, and warned everyone not to hope for magical solutions (ML and AI have their problems), while at the same time presenting some very powerful examples of what is possible. A fun example is the robotic arm learning to flip pancakes:
During the introduction, all the usual suspects of machine learning passed the stage. And although you can read about them in every ML-book, it is nice to hear them discussed by someone who uses them on a daily basis. This mainly because practical details (often omitted in text-books) are also mentioned, helping one to avoid the same mistakes many have made before you. Furthermore, the example codes provided are extremely well documented, making them an interesting source of teaching material (the online manuals for big libraries like sci-kit learn or pandas tend to be too abstract, too big, and too intertwined for new users).
All-in-all a very interesting day. I look forward to tomorrow, as then we will be introduced into the closed source machine learning library developed at the University Hochschule Niederrhein.
Today was the second day of the machine learning workshop on coatings. After having focused on the components of coatings, today our focus went to characterization and deposition. The set of available characterization techniques is as extensive as the possible components to use. There was, however, one thing which grabbed my attention: “The magical human observer”. Several characterization techniques were presented to heavily rely on the human observer’s opinion and Fingerspitzengefühl. Sometimes this even came with the suggestion that such a human observer outperforms the numerical results of characterization machinery. This makes me wonder if this isn’t an indication of a poor translation of the human concept to the experiment intended to perform the same characterization. Another important factor to keep in mind when building automation frameworks and machine learning models.
In the afternoon, we again put on our lab coats and goggles. The task of the day: put our raspberry pink lacquer on different substrates and characterize the glossiness (visually) and the pendulum hardness.
Tomorrow the machine learning will kick in.
Today was the first day of school…not only for my son, but for me as well. While he bravely headed for the second grade of primary school, I was en route to the first day of a week-long workshop on Machine Learning and Coatings technology at the Hochschule Niederrhein in Krefeld. A workshop combining both the practical art of creating coating formulations and the magic of simulation, more specifically machine learning.
During my career as a computational materials researcher, I have worked with almost every type of material imaginable (from solids to molecules, including the highly porous things in between called MOFs), and looked into every aspect available, be it configuration (defects , surfaces, mixtures,…) or materials properties (electronic structure, charge transfer, mechanical behavior and spin configurations). But each and every time, I did this from a purely theoretical perspective*. As a result, I have not set foot in a lab (except when looking for a colleague) since 2002 or 2003, so you can imagine my trepidation at the prospect of having to do “real” lab-work during this workshop.
Participating in such a practical session— even such a ridiculously simple and safe one— is a rather interesting experience. The safety-goggles, white-coat and gloves are cool to wear, true, but from my perspective as a computational researcher who wants to automate things, this gives me a better picture of what is going on. For example, we** carefully weigh 225.3 grams of a liquid compound and add 2.2 grams of another (each with an accuracy of about 0.01 gram). In another cup, we collect two dye compounds (powders), again trying our best to perfectly match the prescribed quantities. But when the two are combined in the mixer it is clear that a significant quantity (multiple grams) are lost, just sticking to the edge of the container and spatula. So much for carefully weighing (of course a pro has tricks and skills to deal with this better than we did, but still). Conclusion: (1)Error bars are important, but hard to define. (2) Mixtures made by hand or by a robot should be quite different in this regard.
For the theoretical part of my brain, mixing 10 compounds is just putting them in the same box and stir, mix or shake. Practice can be quite different, especially if you need 225 grams of compound A, and 2.2 grams of compound B. This means that for the experimentalist there is a “natural order” for doing things. This order does not exist at the theoretical side of the spectrum***, where I build my automation and machine learning. This, in addition to the implicit interdependence of combined compounds, gives the high-dimensional space of possible mixtures a rather contorted shape. This gives rise to several questions begging for answers, such as: how important is this order, and can we (ab)use all this to make our search space smaller (but still efficient to sample).
At the end of the day, I learned a lot of interesting things and our team of three ended up with a nice raspberry pink varnish.
Next, day two, where we will characterize our raspberry pink varnish.
* Yes, I do see how strange this may appear for someone whose main research focus is aimed at explaining and predicting experiments. 🙂
** We were divided in teams of 2-3 people, so there were people with actual lab skills nearby to keep me safe. However, if this makes you think I was just idly present in the background, I have to disappoint you. I am brave enough to weigh inanimate powders and slow flowing resins 😉 .
*** Computational research in its practice uses aspects of both the experimental and theoretical branches of research. We think as theoreticians when building models and frameworks, and coax our algorithms to a solution with a gut-feeling and Fingerspitzengefühl only experimentalists can appreciate.
Today and tomorrow, there is a 2-day summer school on science communication held at the University of Antwerp: Let’s Talk Science! During this summer school there are a large number of workshops to participate in, and lectures to attend, dealing with all aspects of science communication.
I was invited to represent Hasselt University (and science communication done by its members) during the plenary panel session starting the summer school. The goal of this plenary session was to share our experiences and thoughts on science communication. The contributions varied from hands-on examples to more abstract presentations of what to keep in mind, including useful tips. The central aim of my presentation was directed at identifying the boundary between science communication and scientific communication. Or more precisely, showing that this border may be more artificial than we are aware of. By showing that everyone’s unique in his/her expertise and discipline, I provided the link between conference presentations and presentations for the general public. I traveled through my history of science communication, starting in the middle: with the Science Battle. An event, I wrote about before, where you are asked to explain your work in 15 minutes to an audience of 6-to 12-year-olds. Then I worked my way back via my blog and contributions to “Ik heb been vraag” (such as: if you drop a penny from the Eiffel tower, will this kill someone on the ground?) to the early beginning of my research: simulating STM images. In the latter case, although I was talking to experts in their field (experimental growth and characterization), their total lack of experience in modelling and quantum mechanical simulations transformed my colleagues into “general public”. This is an important aspect to realize, not only for science communication, but also for scientific communication. As a consequence this also means that most of the tips and tricks applicable to science communication are also applicable to scientific communication.
For example: tell a coherent story. As noted by one of my favorite authors – Terry Pratchett – the human species might have better been called “Pan Narrans”, the storytelling ape. We tell stories and we remember by stories. This is also a means to make your scien(ce/tific) communication more powerful. I told the story of my passion during science explained and my lecture for de Universiteit van Vlaanderen.
A final point I touched is the question of “Why?”. Why should you do science communication? Some may note that is our duty as scientists, since we are payed with taxpayer money. But personally I believe this is not a good incentive. Science communication should originate from your own passion. It should be because you want to, instead of because you have to. If you want to, it is much easier to show you passion, show your interest, and also take the time to do it.
This brought me back to my central theme: Science communication can be simple and small. E.g. projecting simulated STM images on the wall’s of the medieval castle in Ghent (Gravensteen) during a previous edition of the Ghent Light Festival.
It is becoming an interesting yearly occurrence: the VSC user day. During this 5th edition, HPC users of the various Flemish universities gather together at the Belgian Royal Academy of Science (KVAB) to present their state-of-the-art work using the Flemish Tier-1 and Tier-2 supercomputers. This is done during a poster-presentation session. This year, I presented my work with regard to vibrational spectra in solids and periodic systems. In contrast to molecules, vibrational spectra in solids are rarely investigated at the quantum mechanical level due to their high cost. I show that imaginary modes are not necessarily a result of structural instabilities, and I present a method for identifying the vibrational spectrum of a defect.
In addition, international speakers discuss recent (r)evolutions in High Performance Computing, and during workshops, the participants are introduced in new topics such as GPU-computing, parallelization, and the VSC Cloud and data platform. The possibilities of GPU were presented by Ehsan, of the VSC, showing extreme speedups of 10x to 100x, strongly depending on the application, the graphics card. It is interesting to see that simple CUDA prama’s can be used to obtain such effects…maybe I should have a go at them for the Hirshfeld and phonon parts of my HIVE code…if they can deal with quadruple precision, and very large arrays. During the presentation of Joost Vandevondele (ETH Zürich) we learned what the future holds with regard to next generation HPC machines. As increasing speed becomes harder and harder to obtain, people are again looking into dedicated hardware systems, a situation akin to the founding days of HPC. Whether this is a situation we should applaud remains to be seen, as it means that we are moving back to codes written for specific machines. This decrease in portability will probably be alleviated by high level scripting languages (such as python), which at the same time result in a significant loss of the initial gain. (Think of the framework approach to modern programming which leads to trivial applications requiring HPC resources to start.)
In addition, this year the HPC-team of the TIER-1 machine is present for a panel discussion, presenting the future of the infrastructure. The machine nearly doubled in size which is great news. Let us hope that in addition for financing hardware, there is also a significant budget considered for a serious extension of a dedicated HPC support team. Running a Tier-1 machine is not something one does as a side-project, but which requires a constant vigilance of a dedicated team to deal with software updates, resulting compatibility issues, conflicting scripts and just hardware and software running haywire because they can.
With this hope, I look toward the future. A future where computational research is steadily are every quickly is becoming common place in the fabric om academic endeavors.
Yesterday, it was the annual meeting of the Belgian Physical Society, a nice event where Belgian physicists come together to present their latest work. It also provides a good opportunity for junior scientist to present their work (e.g. through the young speaker contest, an event I won in 2011).
The three students competing for the young speaker award presented three very interesting topics going from the creation of Aharonov-Bohm cages over the observation of high energy cosmic rays with detectors of several thousand square kilometers to the temperature determination of clusters. This years young speaker award went very deservedly to Daniela Mockler for her work on the measurement of cosmic rays.
Before lunch there was the usual conference picture (can you spot me? 😎 )
After our lunch there was the poster session. This year, I decided only to present a poster of my work on vibrational spectra. It combined work on Eu defects in diamond, the vibrational spectrum of Lactose and water and a method for fingerprinting defects in diamond.
During the parallel sessions, I attended the parallel session on physics education. Domien Van der Elst highlighted the daunting task of dealing with a serious shortage in Physics and Mathematics teachers. He suggest the creation of an online platform to (replace/)supplement physics teachers. Despite the possible benefits (and the fact that some big companies are looking into similar setups) I remain skeptical. My main worry being GDPR (privacy) wise and the growing trend that software users are more and more considered as sources providing data and private information to mine and use for a companies benefit (not the software user). The second talk by Bart Huyskens was very inspiring. From his practical experience as a high school teacher, he develops, together with a colleague, hardware, software and courseware for STEM projects. And when he says STEM projects he means it: projects containing ALL parts of a STEM education. Hearing him talk, it is not hard to start dreaming up possible projects, both short and long term. The third and final presentation of the session was by Phillipe Leonard on the concept of “Challenge Labs“. During these labs, teams of students get a “simple problem” to solve. However, while trying to solve the the problem, they discover nothing is what it seems, and they need to learn to think outside the box. This definitely is an interesting method of teaching (assuming good support by the teacher involved) which has the possibility to lift students to a higher level.
After the coffee break, I attended the remainder of the condensed matter session. During this session Michael Sluydts presented his Machine-Learning work on dopants in Si and Ge. An approach which should be very suitable for diamond as well.
All in all a very interesting day.
A bit over 1 month ago, I told you about my adventure at the film studio of “de Universiteit Van Vlaanderen“. Today is the day the movie is officially released. You can find it at the website of de Universiteit Van Vlaanderen: Video. The video is in Dutch as this is a science-communication platform aimed at the local population, presenting the expertise available at our local universities.
In addition to this video, I was asked by Knack magazine to write a piece on the topic presented. As computational research is my central business I wrote a piece on the subject introducing the general public to the topic. The piece can be read here (in Dutch).
And of course, before I forget, this weekend there was also the half-yearly daylight saving exercise with our clocks.[and in Dutch]