Building synthesis recipes for microstructures using NLP

    25422 / Solid-state physics
    Supervisor(s): S. Cottenier, M. Sluydts / Counsellor(s): M. Sluydts, N. N.

    Background and problem

    Computational materials simulations have rapidly accelerated the development of new materials by predicting their properties before they are ever made. One problem, however, is that once we have identified an interesting candidate material, we still have to actually produce it. When the desired properties depend on a specific large-scale order, such as the microstructure, we can only achieve it through highly specific processing, even if we know the exact composition of the material.

    Luckily, materials scientists have built up extensive experience and shared it in the scientific literature. It is not unreasonable to expect that somewhere within the literature there are important hints towards producing these new materials, but how do we find them? Reading all of the literature is impractical, and reading only what seems relevant to us may cause us to miss important connections. This is where Natural Language Processing (NLP) comes in.

    Using deep learning NLP, knowledge can be extracted from large bodies of literature fully automatically and used to build large-scale databases of materials knowledge. Moreover, from this knowledge such models can even learn to write synthesis recipes for us, suggesting processing times, methods and precursors with almost no input. This offers great promise for the future of materials science, where a painstaking literature study might one day be replaced by simply posing the question to our AI lab assistant.

    Goal

    Within this thesis you will explore existing datasets of annotated synthesis recipes and expand them towards the field of microstructures. You will then train deep learning models to extract and annotate synthesis recipes from the wider literature using Named Entity Recognition [1]. The extracted information will be used to build a database of synthesis knowledge and to train language models that write synthesis recipes. These insights can then be linked to answer open research questions from our experimental research partners focusing on steel microstructures.
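
    To make the Named Entity Recognition step more concrete, the sketch below tags a single synthesis sentence with entity labels. It is only an illustration of the kind of output an annotation model produces: the thesis targets trained deep learning models, whereas this stand-in uses hand-written regular expressions. The label set (OPERATION, TEMPERATURE, DURATION), the patterns and the example sentence are all assumptions made up for this sketch and do not come from any existing dataset.

```python
import re
from typing import List, NamedTuple


class Entity(NamedTuple):
    """One annotated span: the matched text, its label and character offsets."""
    text: str
    label: str
    start: int
    end: int


# Hypothetical label set and patterns, chosen only for illustration.
# A trained NER model would learn such categories from annotated recipes.
PATTERNS = {
    "TEMPERATURE": re.compile(r"\d+(?:\.\d+)?\s?(?:°C|K)\b"),
    "DURATION": re.compile(r"\d+(?:\.\d+)?\s?(?:h|min|s)\b"),
    "OPERATION": re.compile(
        r"\b(?:austenitized|quenched|tempered|annealed)\b", re.IGNORECASE
    ),
}


def extract_entities(sentence: str) -> List[Entity]:
    """Return all pattern matches in the sentence, ordered by position."""
    entities = [
        Entity(match.group(0), label, match.start(), match.end())
        for label, pattern in PATTERNS.items()
        for match in pattern.finditer(sentence)
    ]
    return sorted(entities, key=lambda entity: entity.start)


# Made-up example sentence in the style of a steel processing description.
SENTENCE = (
    "The steel was austenitized at 900 °C for 30 min, quenched in water, "
    "and tempered at 600 °C for 2 h."
)

for entity in extract_entities(SENTENCE):
    print(f"{entity.label:<12}{entity.text}")
```

    Each extracted entity keeps its character offsets, so the annotations can be stored alongside the source sentence and later collected into a database of processing steps. A learned model would replace the fixed patterns, since real literature describes operations and conditions in far more varied wording than a handful of expressions can cover.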

    Collaboration with company

    Epotentia (epotentia.com) for machine learning support

    Study programme
      Master of Science in Engineering Physics [EMPHYS], Master of Science in Sustainable Materials Engineering [EMMAEN], Master of Science in Physics and Astronomy [CMFYST]
      Keywords
      natural language processing, deep learning, process-structure-property relations, microstructure
      References

      List of figures:

      Figure X: workflow for an NLP algorithm that extracts processing recipes from the scientific literature

      References:

      [1] https://chemrxiv.org/articles/preprint/Named_Entity_Recognition_and_Norm...
      [2] https://www.nature.com/articles/s41597-019-0224-1

    Contact

    Stefaan Cottenier