Molecular chemistry is lagging behind in terms of open science.
Although quantum mechanical modeling has become almost mandatory in any major chemistry publication, the raw computational data is usually kept within the labs or destroyed.
As a result, the information cannot be reused or reproduced.
The first objective of the QuChemPedia project is to build a large, collaborative, open platform that stores and presents quantum molecular chemistry results. The original output files will be available for reuse in new chemical studies across different applications.
The second objective of this project is to develop artificial intelligence and optimization methods to efficiently explore the highly combinatorial molecular space.
A collaboration began in 2017 at the University of Angers between T. Cauchy, a theoretical chemist, and B. Da Mota, a computer scientist. They obtained a small grant from the University of Angers, which funded a small server and storage capacity, along with associated internships.
In mid-2018, postdoctoral funding for machine learning on this subject was obtained from the Pays-de-la-Loire region. Its main objectives centered on the role of datasets in the generalization of ML predictions.
In 2019, this collaboration showed that the chemical diversity of reference datasets in quantum chemistry for small molecules is not optimized (https://jcheminf.biomedcentral.com/articles/10.1186/s13321-019-0391-2). A joint thesis on molecular optimization also started in 2019.
For molecule generation, a combinatorial optimization approach based on an evolutionary algorithm was chosen. Because its actions operate close to the atomic level, the EvoMol generator can freely explore the chemical space and can, in principle, address a wide range of problems (https://jcheminf.biomedcentral.com/articles/10.1186/s13321-020-00458-z). To compute the molecules proposed by EvoMol, a collaborative computing infrastructure has been launched: the QuChemPedia BOINC project (quchempedia.univ-angers.fr/athome/), with several hundred registered volunteers worldwide.
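The general idea of an evolutionary search driven by atom-level actions can be illustrated with a minimal sketch. This is not EvoMol's implementation: the linear-chain molecule representation, the three actions (append, remove, substitute), the `toy_score` objective, and all names and parameters below are assumptions chosen only to keep the example self-contained.

```python
import random

# Hypothetical toy setup: a "molecule" is a tuple of atom symbols forming
# a linear chain. A real generator would work on molecular graphs.
ATOMS = ("C", "N", "O", "F", "S")
MAX_ATOMS = 9


def mutate(mol, rng):
    """Apply one random atom-level action: append, remove, or substitute."""
    mol = list(mol)
    actions = ["substitute"]
    if len(mol) < MAX_ATOMS:
        actions.append("append")
    if len(mol) > 1:
        actions.append("remove")
    action = rng.choice(actions)
    if action == "append":
        mol.append(rng.choice(ATOMS))
    elif action == "remove":
        mol.pop(rng.randrange(len(mol)))
    else:
        mol[rng.randrange(len(mol))] = rng.choice(ATOMS)
    return tuple(mol)


def toy_score(mol):
    """Placeholder objective (an assumption): reward heteroatoms, penalize size."""
    hetero = sum(1 for atom in mol if atom != "C")
    return hetero - 0.1 * len(mol)


def evolve(score, pop_size=20, n_steps=50, n_replaced=5, seed=0):
    """Evolve a population and return its best-scoring member."""
    rng = random.Random(seed)
    population = [("C",)] * pop_size
    for _ in range(n_steps):
        # Rank the population from best to worst.
        population = sorted(population, key=score, reverse=True)
        # Drop the worst individuals and replace them with mutated
        # copies of the best ones.
        survivors = population[: pop_size - n_replaced]
        children = [
            mutate(rng.choice(survivors[:n_replaced]), rng)
            for _ in range(n_replaced)
        ]
        population = survivors + children
    return max(population, key=score)


best = evolve(toy_score)
print(best, toy_score(best))
```

Swapping `toy_score` for a property computed by a quantum chemistry calculation would change the objective without changing the search loop, which is one reason such an approach can be applied to different problems.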
By the end of 2020, more than 2 million small compounds had already been computed with DFT (density functional theory), in a way that maximizes the diversity of chemical environments.
Benoit DA MOTA. Maître de conférences (tenured associate professor). LERIA. Université d'Angers.
Thomas CAUCHY. Maître de conférences (tenured associate professor). MOLTECH-Anjou. Université d'Angers - CNRS.
Previous Related Internships:
2018: Brice HARISMENDY, first platform
2019: Theo DEZE, BOINC project