INFORMATION ABOUT PROJECT,
SUPPORTED BY RUSSIAN SCIENCE FOUNDATION

This information was prepared on the basis of data from the RSF information-analytical system; the descriptive part is presented as written by the authors. All rights belong to the authors; the use or reprinting of materials is permitted only with the prior consent of the authors.

 

COMMON PART


Project Number: 14-43-00024

Project title: Chemoinformatics approaches to organic and metabolic reactions: from empirical to predictive chemistry

Project Lead: Varnek Alexandre

Affiliation: Kazan (Volga region) Federal University, Kazan University, KFU

Implementation period: 2014 - 2016, extension for 2017 - 2018

PROJECT EXTENSION CARD

Research area 03 - CHEMISTRY AND MATERIAL SCIENCES, 03-705 - Chemical informatics

Keywords: Chemoinformatics, molecular modeling, expert systems, organic and metabolic reactions, chemical databases, QSAR / QSPR


 

PROJECT CONTENT


Annotation
Synthetic organic chemistry is essentially an empirical science. Although chemists often use quantum-mechanical concepts to interpret the results of their studies, real chemical experiments are guided only by human experience acquired by trial and error. We suggest that modern information technologies may significantly reduce human and material costs by guiding chemists in selecting optimal reaction conditions and the most suitable reactants, and by predicting the parameters of the reactions to be performed.
Another important problem of modern chemistry is the huge amount of data generated by modern synthetic facilities. According to our estimates, a single microfluidic flow reactor can readily produce up to 100 reactions a day, i.e. 20-30 thousand reactions a year. Thus, in a few years, a single research group can collect information on a hundred thousand reactions, including their properties and reaction conditions, which should be processed and rationalized automatically. In this project we suggest a unique approach, the Condensed Graph of Reaction, which can be used efficiently both to develop predictive models for reaction parameters and to rationalize “big” reaction data automatically. We also envision that the developed techniques could be extended to metabolic transformations of xenobiotics, for which prediction of the regioselectivity of particular transformations is often required in drug design projects in order to avoid toxic or reactive metabolites.
A chemical reaction is a difficult object to model because it involves species of two types (reactants and products) and is explicitly affected by reaction conditions (catalyst, solvent, additives). Quantum chemistry methods are still too time-consuming and inefficient for large-scale prediction of the thermodynamic and kinetic parameters of reactions, the corresponding yields, and the variation of reaction selectivity as a function of reaction conditions.
This problem could in principle be solved using chemoinformatics approaches, in which predictive models are developed from the analysis of available experimental data. Chemoinformatics approaches are typically oriented toward individual molecules, which are encoded by molecular descriptors. In this context, chemical reactions are also a difficult case, because it is not always clear for which species among the reactants or products the molecular descriptors should be generated. In this project we will apply an original concept, the Condensed Graph of Reaction (CGR), which treats a chemical reaction as a single pseudo-molecule. This critical simplification allows one to treat chemical reactions with the powerful chemoinformatics tools developed for individual molecules. Special fragment descriptors can readily be generated for a CGR and then applied in Quantitative Structure-Reactivity Relationships (QSRR) or in reaction similarity searching. Because any chemical reaction is represented by an ensemble of fragment descriptors, one can build a multidimensional chemical reaction space, which can be analyzed using dimensionality reduction techniques. The latter allow one to visualize, analyze and compare large chemical databases.
In this project we intend to apply the CGR methodology to the analysis of organic and metabolic reactions. Namely, the following approaches will be developed:
• a Quantitative Structure-Reactivity Relationships (QSRR) approach based on special fragment descriptors derived from CGR. Predictive models for kinetic (reaction rate constants) and thermodynamic (equilibrium constant) properties and yield will be built for cycloaddition, SN1, SN2, E1, E2 and some other classes of reactions;
• a reaction similarity-based expert system suggesting optimal reaction conditions leading to the desired selective chemical transformations. This approach will be implemented for the particular case of hydrogenation reactions;
• a model predicting the regioselectivity of some metabolic reactions (e.g., aromatic hydroxylation);
• an approach to analyzing the content of large chemical or metabolic databases using specific CGR patterns, “reaction signatures”;
• visualization of chemical or metabolic databases using Generative Topographic Mapping (GTM).
The experimental data used for modeling will be assembled in two databases prepared in the framework of this project: (i) a Hydrogenation database collecting hydrogenation batch and flow reactions, and (ii) a QSRR database containing chemical reactions for which kinetic and thermodynamic data as well as reaction conditions are available. These will be unique collections of reactions combining literature data, unpublished data from numerous PhD theses defended at Kazan Federal University (KFU) in the period 1970-2000, and new experiments carried out within this project.
The developed models and software tools will be available to users via a web interface. This interface will be user-friendly: a chemist will only need to input the structural formula of the reaction he or she intends to carry out to obtain an immediate answer from the computer. The approaches to “big” reaction data analysis will be recommended to developers of chemical databases for implementation.
This project contains complementary theoretical (chemoinformatics) and experimental parts. The chemoinformatics part deals with the development of new approaches to the optimization of reaction conditions and to the prediction of reaction parameters as a function of the structure of the reactants and the reaction conditions. Experimental studies are planned to generate some training data and to validate the developed models. Both classical (batch) and modern (flow, microwave) reactions will be considered.
The project team is international: it brings together leading experts in chemoinformatics from France, the USA, Canada and Ukraine, and the Russian part includes scientists from Kazan Federal University and Moscow State University. Most members of the team have collaborated on several projects in the past, resulting in joint publications. All team members will participate in teaching within the new Master program in Chemoinformatics founded at KFU in 2012. The master students will also be involved in research carried out within the framework of our project. Finally, the participants from KFU and Strasbourg have worked together to organize international conferences and summer schools in chemoinformatics both in Russia and in France. We will continue this activity within this project.
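The Condensed Graph of Reaction idea described in this annotation can be illustrated with a minimal sketch. It is not the project's software: the data layout (bonds keyed by pairs of atom-mapped atom numbers, labeled with the bond order before and after the reaction) and all function names are assumptions made for illustration only. The toy reaction is a Diels-Alder cycloaddition of a methyl-substituted butadiene with ethylene, hydrogens omitted.

```python
from typing import Dict, FrozenSet, Optional, Tuple

Bond = FrozenSet[int]                        # unordered pair of mapped atom numbers
BondTable = Dict[Bond, int]                  # bond -> bond order
Label = Tuple[Optional[int], Optional[int]]  # (order in reactants, order in products)


def bond(a: int, b: int) -> Bond:
    """Unordered bond between two atom-mapped atoms (hypothetical helper)."""
    return frozenset((a, b))


def condensed_graph(reactants: BondTable, products: BondTable) -> Dict[Bond, Label]:
    """Superpose reactant and product bonds into one pseudo-molecule."""
    return {b: (reactants.get(b), products.get(b))
            for b in set(reactants) | set(products)}


def reaction_center(cgr: Dict[Bond, Label]) -> Dict[Bond, Label]:
    """Keep only the dynamic bonds, i.e. those whose order changes."""
    return {b: label for b, label in cgr.items() if label[0] != label[1]}


# Atoms 1-4: diene carbons, 5-6: ethylene carbons, 7: methyl carbon on atom 1.
reactants = {bond(1, 2): 2, bond(2, 3): 1, bond(3, 4): 2,
             bond(5, 6): 2, bond(1, 7): 1}
products = {bond(1, 2): 1, bond(2, 3): 2, bond(3, 4): 1,
            bond(5, 6): 1, bond(4, 5): 1, bond(1, 6): 1, bond(1, 7): 1}

cgr = condensed_graph(reactants, products)
center = reaction_center(cgr)

for b, (before, after) in sorted(center.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(b), "order", before, "->", after)
```

In this sketch a bond that exists only on one side of the reaction carries `None` on the other side, and the unchanged C-CH3 bond stays in the graph as a conventional bond. Fragment descriptors would then be counted on such a labeled graph in the same way as on an ordinary molecular graph.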

Expected results
The deliverables of this project concern novel methodology and software developments as well as some particular applications.
Methodology
• Representation of complex reactions (including incomplete and multi-step reactions) by CGR;
• Development of property-enriched fragment descriptors for reactions;
• Reaction similarity approach based on CGR;
• Inductive Learning Transfer methods for improving the predictive performance of QSRR models;
• Method of visualization and analysis of large chemical reaction spaces.
Software development
The following programs will be developed:
o Encoding reactions into CGR;
o Fragment descriptor generation from CGR;
o Stacking procedure combining different types of QSRR models;
o Iterative Generative Topographic Mapping for visualization of large datasets.
Databases development
• Hydrogenation reactions database;
• QSRR database.
Applications
• QSRR models for some kinetic (rate constant) and thermodynamic (reaction heat, yield) parameters of cycloaddition, SN1, SN2, E1, E2 and some other reactions proceeding in a large variety of solvents;
• Models for regioselectivity prediction in some metabolic reactions;
• Expert system for optimization of reaction conditions for hydrogenation reactions (batch and flow);
• Visualization of large chemical databases using the Generative Topographic Mapping approach (in collaboration with Reaxys (Elsevier)).
All developed programs and structure-reactivity models will be available to users via an Internet interface. We believe that this project is beneficial for the whole chemistry community. Any drug or material design project inevitably includes a synthetic stage, which is sometimes extremely long if the traditional trial-and-error concept is used. The project will result in new approaches and algorithms for chemical reaction mining which will significantly improve the efficiency of the chemist's work.
The novel approaches will be implemented in user-friendly software tools which will guide researchers in their everyday work and help them to select optimal reactants and conditions as well as to handle large volumes of experimental data. The particular application to metabolic reactions may be widely used in drug design projects. We believe that the developed tools will save a lot of human and material resources. Step by step, chemistry will move from an empirical to a predictive science!


 

REPORTS


Annotation of the results obtained in 2016
This project is devoted to the development of a universal methodology for the analysis, visualization and modeling of chemical and metabolic reactions, to the creation of a database of kinetic and thermodynamic parameters of reactions, and to the development of unique software tools which allow one to assess important reaction parameters, including optimal reaction conditions. The core element of the development is a special reaction representation approach, the Condensed Graph of Reaction (CGR). A CGR is a molecular graph representing the structures of all reactants and products as well as the reaction transformations. This representation contains full information about a reaction, can be converted back to a conventional reaction representation, and saves storage space. Reaction descriptors can easily be computed for a CGR, which makes it possible to apply to reactions the standard chemoinformatics approaches used for individual molecules, including similarity searching, data visualization and “structure-property” modeling.
In 2016, we continued the development of the methodology for chemical reaction mining. Apart from “classical” ISIDA fragments, two types of property-enriched descriptors were used in model building. The first type maps some physico-chemical properties (atomic charges, lipophilicity, etc.) onto the atoms. The second type (“Electronic Effect Descriptors”) takes into account the influence of substituents on the reaction center. It has been shown that property-enriched descriptors improve model performance only in combination with fragments containing marked atoms. A new modelability parameter based on Hilbert-Schmidt theory has been suggested in order to select the descriptors best suited to the modeled property. The previously suggested similarity principle has been used to assess optimal reaction conditions of deprotection reactions.
Using a set of some 72000 reactions extracted from the Reaxys database, this approach demonstrated rather high accuracy of predictions (some 90%). This approach has also been used for reaction selectivity assessment. The methodology has been implemented in a software tool able to predict optimal reaction conditions on the basis of automated treatment of reaction data. A prototype of a software tool for automated analysis of protective groups' reactivity under catalytic hydrogenation conditions has also been developed.
The similarity principle for chemical reactions was formulated: similar reactions belong to the same type, occur under similar conditions and have similar quantitative characteristics (rate, yield, selectivity). Recommendations for the quantitative estimation of similarity were suggested. This approach has been applied to the assessment of optimal reaction conditions for the cleavage of protecting groups using “raw” reaction data from the Reaxys database. The prediction accuracy of this reaction similarity-based approach is about 90%. It can easily be extended to the selectivity analysis of protecting group cleavage. This similarity-based approach to protective group reactivity has been implemented in a software tool.
Inductive learning transfer and ensemble modeling approaches have been used in the development of models for kinetic and thermodynamic properties of reactions. We demonstrated that the ensemble learning technique improves prediction accuracy in most cases. On the other hand, inductive learning transfer (Feature Net technology) only slightly improves the performance of models for tautomeric equilibrium constants.
A new visualization approach based on the alignment of objects on a template of various topologies using the Hilbert-Schmidt independence criterion has been developed. This technique places reactions in graph vertices in such a way that objects in adjacent vertices are similar.
The latter makes it possible both to visualize an ensemble of chemical reactions and to perform different types of searches in databases. A reaction signature, a CGR motif identifying reactions of a given type, has been represented by a hash-code. Several types of hierarchically linked signatures were proposed to relate different reaction classes. We demonstrated that CGR in combination with the Matched Molecular Pairs (MMP) approach can be used to analyze substituent effects on the kinetics and thermodynamics of chemical reactions. Using SN2 reactions as an example, we also demonstrated that this approach helps to identify errors in mechanism annotation, which is hardly possible to do manually for large amounts of data.
We continue to collect kinetic and thermodynamic reaction data for the QSRR DB database. Thus, rates of 8400 SN2, E2, SN1 and Diels-Alder reactions have been collected. We have also collected some 14500 data points on the acidity of organic compounds and 4000 data points on the strength of hydrogen-bonded complexes. We have adopted a database of 1.1 million reactions from patents. More than 142000 hydrogenation reactions have been extracted from the Reaxys database.
Predictive models for rate constants of SN2, E2 and Diels-Alder reactions, tautomeric equilibrium constants, acidity of organic compounds in the aqueous phase, metabolic transformations of organic compounds, optimal conditions of Michael reactions and optimal catalysts for protecting group cleavage by catalytic hydrogenation were developed using data from the created databases and the approaches developed in this project. A special server was developed to deploy the built models. It is accessible at the address cimm.kpfu.ru.
The developed approaches were implemented in several software products. The CGRTools library was created, which includes functions and classes to handle reaction data. A prototype of a service for structural search in reaction databases was developed.
A tool for automatic model development was created, as well as a service for publishing the obtained models.
Experimental data on rate constants of hydrogenation flow reactions were collected. It was shown that the speed of data collection is limited by the product analysis stage. Reproducibility of the output (yield, product content) of flow reactions is difficult to achieve. Thus, using these data in “structure-reactivity” modeling is quite problematic.
Two international events with elements of schools were organized: the School on Computer-Aided Molecular Design (www.kpfu.ru/camd2016.html) and the School at the satellite symposium “From empirical to predictive chemistry” at the XX Mendeleev Congress on General and Applied Chemistry (www.kpfu.ru/e2pc2016.html, https://mendeleev2016.uran.ru/).
In 2016, 7 articles were published in journals indexed by WoS and Scopus, and 8 publications appeared in journals indexed by RISC (4 articles and 4 conference papers). All obligations of the project were fulfilled. An agreement with the company RELX Group (Elsevier), Switzerland, concerning collaboration in the field of reaction modeling and the development of algorithms to handle reaction data in the Reaxys database was signed in December 2016.
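The similarity principle formulated in this annotation (similar reactions occur under similar conditions) can be sketched as a nearest-neighbor lookup. The sketch below is only an illustration under stated assumptions: reactions are encoded as sets of fragment labels, similarity is the Tanimoto coefficient, and the function names, fragment labels ("f1", "f2", ...) and condition labels are placeholders, not data or code from the project.

```python
from typing import FrozenSet, List, Tuple

Fragments = FrozenSet[str]
Record = Tuple[Fragments, str]  # (fragment set of a known reaction, its conditions)


def tanimoto(a: Fragments, b: Fragments) -> float:
    """Tanimoto coefficient between two fragment sets."""
    if not a and not b:
        return 1.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def suggest_conditions(query: Fragments, library: List[Record],
                       k: int = 1) -> List[Tuple[str, float]]:
    """Return conditions of the k most similar known reactions with their similarity."""
    ranked = sorted(library, key=lambda rec: tanimoto(query, rec[0]), reverse=True)
    return [(conditions, tanimoto(query, frags)) for frags, conditions in ranked[:k]]


# Placeholder library: fragment and condition labels are purely illustrative.
library = [
    (frozenset({"f1", "f2", "f3"}), "conditions A"),
    (frozenset({"f3", "f4"}), "conditions B"),
    (frozenset({"f5", "f6"}), "conditions C"),
]
query = frozenset({"f1", "f2", "f4"})

best = suggest_conditions(query, library, k=1)
print(best)  # [('conditions A', 0.5)]
```

In practice the fragment sets would come from CGR-based descriptors, and the choice of descriptor type and similarity metric would need the kind of tuning the annotation refers to as recommendations for the quantitative estimation of similarity.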

 

Publications

1. Baskin I.I., Winkler D., Tetko I.V. A renaissance of neural networks in drug discovery Expert Opinion on Drug Discovery, Vol. 11, Is. 8, P. 785-795 (year - 2016) https://doi.org/10.1080/17460441.2016.1201262

2. Glavatskikh M., Madzhidov T., Solov'ev V., Marcou G., Horvath D., Varnek A. Predictive Models for the Free Energy of Hydrogen Bonded Complexes with Single and Cooperative Hydrogen Bonds Molecular Informatics, Vol. 35, Is. 11-12, Pp. 629-638 (year - 2016) https://doi.org/10.1002/minf.201600070

3. Khayrullina A.I., Madzhidov T.I., Nugmanov R.I., Afonina V.A., Baskin I.I., Varnek A. An approach to atom-atom mapping using a naive Bayes classifier Ученые записки Казанского университета. Серия Естественные науки, - (year - 2017)

4. Lin A.I., Madzhidov T.I., Klimchuk O., Nugmanov R.I., Antipin I.S., Varnek A. Automatized assessment of protective group reactivity: a step toward big reaction data analysis Journal of chemical information and modeling, Vol. 56, Is. 11, P. 2140-2148 (year - 2016) https://doi.org/10.1021/acs.jcim.6b00319

5. Madzhidov T.I., Gimadiev T.R., Malakhova D.A., Nugmanov R.I., Antipin I.S., Varnek A. Structure-reactivity relationship in Diels-Alder reactions using the Condensed Graph of Reaction approach Журнал структурной химии, - (year - 2016)

6. Marcou G., Horvath D., Varnek A. Kernel target alignment parameter: a new modelability measure for regression tasks Journal of chemical information and modeling, Vol. 56, Is. 1, P. 6-11 (year - 2016) https://doi.org/10.1021/acs.jcim.5b00539

7. Nugmanov R.I., Madzhidov T.I., Antipin I.S., Varnek A.A. Automatic determination of missing reagents and products in chemical reaction equations Ученые записки Казанского университета. Серия Естественные науки, - (year - 2017)

8. Polishchuk P., Madzhidov T., Gimadiev T., Bodrov A., Nugmanov R., Varnek A. Structure–reactivity modeling using mixture-based representation of chemical reactions Journal of Computer-Aided Molecular Design, Vol. 31, Is. 9, P. 829-839 (year - 2017) https://doi.org/10.1007/s10822-017-0044-3

9. Polishchuk P., Tinkov O., Khristova T., Ognichenko L., Kosinskaya A., Varnek A., Kuz'min V. Structural and physico-chemical interpretation (SPCI) of QSAR models and its comparison with matched molecular pair analysis Journal of chemical information and modeling, Vol. 56, Is. 8, P. 1455-1469 (year - 2016) https://doi.org/10.1021/acs.jcim.6b00371

10. Tetko I.V., Maran U., Tropsha A. Public (Q)SAR services, integrated modeling environments, and model repositories on the web: state of the art and perspectives for future development Molecular informatics, - (year - 2016) https://doi.org/10.1002/minf.201600082

11. Zhokhova N.I., Baskin I.I. Energy-Based Neural Networks as a Tool for Harmony-Based Virtual Screening Molecular Informatics, Vol. 36, Is. 11, No article 1700054 (year - 2017) https://doi.org/10.1002/minf.201700054

12. Baskin I.I., Madzhidov T.I., Antipin I.S., Varnek A. Artificial intelligence in synthetic chemistry: achievements and prospects Russian Chemical Reviews, V. 86, Is. 11, P. 1127 - 1156 (year - 2017) https://doi.org/10.1070/RCR4746

13. Polishchuk P. Interpretation of Quantitative Structure−Activity Relationship Models: Past, Present, and Future Journal of Chemical Information and Modeling, Vol. 57, Is. 11, P. 2618-2639 (year - 2017) https://doi.org/10.1021/acs.jcim.7b00274

14. Latypov E.I., Neklyudov S.A., Klimchuk O., Antipin I.S., Varnek A. Creation of the database and expert system for heterogeneous hydrogenation in continuous-flow XX Mendeleev Congress on General and Applied Chemistry. In 5 vols. Vol. 5: Book of abstracts. Ekaterinburg: Ural Branch of the Russian Academy of Sciences, P. 143 (year - 2016)

15. Lin A.I., Madzhidov T.I., Nugmanov R.I., Antipin I., Klimchuk O., Varnek A. Assessment of protective groups reactivity from data analysis XX Mendeleev Congress on General and Applied Chemistry. In 5 vols. Vol. 5: Book of abstracts. Ekaterinburg: Ural Branch of the Russian Academy of Sciences, P. 144 (year - 2016)

16. Lin A.I., Madzhidov T.I., Nugmanov R.I., Antipin I., Klimchuk O., Varnek A. Similarity-based assessment of optimal reaction conditions XX Mendeleev Congress on General and Applied Chemistry. In 5 vols. Vol. 5: Book of abstracts. Ekaterinburg: Ural Branch of the Russian Academy of Sciences, P. 145 (year - 2016)

17. Madzhidov T.I., Lin A.I., Nugmanov R.I., Klimchuk O., Antipin I., Varnek A. Prediction of optimal reaction conditions XX Mendeleev Congress on General and Applied Chemistry. In 5 vols. Vol. 5: Book of abstracts. Ekaterinburg: Ural Branch of the Russian Academy of Sciences, P. 118 (year - 2016)


Annotation of the results obtained in 2014
A unified scheme for chemical reaction databases based on the InstantJChem (ChemAxon) software has been suggested. Several databases for substitution and elimination reactions and for tautomeric and metabolic transformations have been built within this scheme. Information about reaction transformations, conditions and rate constants has been manually collected and curated. In total, 104 SN1 reactions, 1669 SN2 reactions, 121 E1 reactions, 709 E2 reactions, 1076 tautomeric transformations and 136 metabolic hydroxylation reactions have been collected. The collected data have been used for structure-reactivity modeling and for the prediction of rate constants of chemical reactions and tautomeric transformations.
The GTM/ISIDA program for chemical data visualization using the incremental Generative Topographic Mapping (GTM) approach has been developed. The applicability of this approach has been demonstrated on databases consisting of more than two million compounds. An approach to property prediction based on GTM has also been proposed. The GTM and ADDAGRA methods have been adapted for the representation of the chemical space(s) of reactions.
Quantitative structure-reactivity models for SN2 and E2 reactions and for tautomeric transformations have been developed. The prediction performance of the models is comparable with the experimental error of rate constant determination. We demonstrated that different approaches can successfully be used for reaction representation: the Condensed Graph of Reaction in combination with fragment descriptors, or a mixture representation. The importance of taking into account reaction conditions (solvent, temperature, etc.) has been investigated.
Experimental measurements of the rate constants of the SN2 reaction forming azide-containing calixarenes have been performed and then used to validate the previously obtained structure-reactivity models. Good agreement between predicted and observed rate constants has been found for calixarenes in the cone conformation.
It was shown that steric descriptors are required for a reasonable assessment of calixarene reactivity.
Fifty-two catalytic hydrogenation reactions have been carried out under flow reactor conditions. In these reactions, the catalyst, pressure and temperature were systematically varied. It has been demonstrated that the chemistry of flow and batch reactions is similar; however, reactions in a flow reactor proceed much faster.
The seminar-school “From empirical to predictive chemistry”, with some 100 participants, has been held. Its program included lectures and oral presentations by experts in organic chemistry, quantum chemistry and chemoinformatics from both Russia and abroad. Our work resulted in three articles which were either accepted or published (two in foreign journals and one in a Russian journal).

 

Publications

1. - Chemical Data Visualization and Analysis with Incremental GTM: Big Data Challenge Journal of Chemical Information and Modeling, - (year - 2014) https://doi.org/10.1021/ci500575y

2. - Development of “structure-property” models for nucleophilic substitution reactions involving azides Журнал структурной химии, Vol. 55, No. 6, P. 1080-1087 (year - 2014)

3. - GTM-based QSAR models and their applicability domains Molecular Informatics, - (year - 2014)


Annotation of the results obtained in 2015
In this project, we focused on the development of new approaches to the analysis, curation, modeling and visualization of chemical data. The developed approaches (a) are well suited to the treatment of chemical reactions, and (b) help to establish relationships between the structure of reactants and products, on the one hand, and various reaction characteristics, on the other hand. Most of the developed approaches are rather universal, i.e., they can be applied both to chemical reactions and to individual molecules.
An important part of the project concerns the development of the concept of chemical similarity, which has been extended to chemical reactions. It is assumed that reactions involving reactants and products of similar chemical structures (a) belong to the same type and proceed by the same mechanism, (b) occur under similar experimental conditions, and (c) have similar quantitative characteristics (rate, yield, selectivity, etc.). An optimal workflow for encoding reactions as bitstrings and choosing a proper metric for the assessment of reaction similarity has been established using the previously developed Neighborhood Behavior approach.
The Inductive Transfer (IT) approach has been tested as a way to improve model performance. Specifically, we applied a popular IT strategy, Feature Nets, which uses properties predicted by other models as descriptors for building new QSAR models. It has been shown that this approach significantly reduces the prediction error for tautomeric equilibrium constants and represents an interesting alternative for modeling the optimal conditions of Michael reactions.
For the first time, an incremental version of the Generative Topographic Mapping approach, iGTM, has been used to visualize large volumes of reaction data. This methodology was applied to a set of 48 000 reactions extracted from the ChemSpider Reactions database and to 70 000 hydrogenation reactions retrieved from the Reaxys database.
It has been shown that different types of reactions form distinct clusters on two-dimensional GTM maps. This demonstrates that these maps can be used as a tool for classifying new chemical reactions. We also developed the novel Stargate GTM method, which links two different multidimensional spaces (e.g., descriptor and property spaces) through a common two-dimensional latent space. This opens a way to predict the reactivity profiles of chemical compounds, as well as to design compounds with a desired reactivity profile.
A workflow for building a “universal” GTM map describing several properties simultaneously has been suggested. It involves an analysis of the ensemble of regression models built on selected data subsets. This approach has been tested on a large data sample extracted from the ChEMBL database. It was shown that the different classification models obtained with the map selected according to this strategy performed well for the majority of the studied properties. In contrast to classical QSAR modeling methods, this method successfully combines the ability to model and to visualize the data. The latter facilitates the interpretation of modeling results.
The methodology of automatic processing of chemical reactions in databases has been significantly extended using the Condensed Graph of Reaction (CGR) approach. Our efforts were focused on (a) the restoration of lost information on reactants and products, (b) improvement of the performance of the atom-atom mapping (AAM) procedure, and (c) automatic extraction of reaction signatures and their use for performing a hierarchical classification of reactions. We demonstrated that encoding unbalanced reactions as Condensed Graphs allows one to restore their missing structural parts. A “consensus” procedure minimizing AAM errors has been proposed. It involves the sequential use of different mapping algorithms.
It has been demonstrated that “reaction signatures” uniquely identifying reaction types can efficiently be represented by CGR subgraphs. An algorithm for encoding a reaction center by a single number (a hash-code) has been developed. A multilevel description of reaction centers facilitating the classification of chemical reactions has been proposed. The efficiency of the developed approaches to automatic processing of chemical reactions has been illustrated on large datasets extracted from the ChemSpider Reactions and QSRR DB databases.
New data on chemical reactions have been incorporated into the QSRR DB database. The methods developed in this project have been used for automatic processing of information on 152,000 reactions extracted from the Reaxys and ChemSpider Reactions databases. In total, more than 100,000 reactions of different types have been prepared for modeling. 136 metabolic transformation reactions of xenobiotics catalyzed by cytochrome P450 have been extracted from the Metabolite Database. 1500 Diels-Alder reactions from the collection of PhD and Habilitation theses at KFU, as well as 1000 SN2-type reactions and some 1200 tautomeric equilibrium reactions from reference books in the KFU library, have been collected manually. All these data were stored in a format convenient for subsequent modeling studies.
A prototype of the database for hydrogenation flow reactions has been prepared. It includes a procedure for automatic standardization of reaction structure representation and atom-atom mapping. In total, more than 500 flow hydrogenation reactions under various conditions were collected in 2015.
In the framework of this project, predictive models were built for the following characteristics of molecules and reactions:
• the melting point of chemical compounds,
• regioselectivity of metabolic reactions,
• rate constants of cycloaddition reactions,
• rate constants of bimolecular elimination reactions,
• optimal conditions for Michael reactions,
• halogen bond strength,
• selectivity of nucleophilic substitution / elimination.
Some of these models are available to users at http://infochim.u-strasbg.fr/webserv/VSEngine.html.
The experimental part of the project consisted of systematic investigations of the parameters affecting the reproducibility of the yield of catalytic hydrogenation reactions in a flow reactor. In particular, this concerned catalyst aging and preparation. It has been shown that some popular catalysts in which activated carbon was used as a carrier were not stable enough during the experiment. In order to avoid this phenomenon, we suggested a new technique which significantly accelerates data collection, improves yield reproducibility and increases the catalyst lifetime. A workflow for detailed analysis of the composition of the reaction products by gas-liquid chromatography-mass spectrometry has been suggested on the basis of a study of the model hydrogenation reactions of nitrobenzene and 4-nitrophenol. This will allow us to generate a large amount of reaction data for the QSRR DB database in 2016.
It should be noted that the methodological and software developments achieved in this project were appreciated by some international organizations specialized in the collection and storage of reaction data.
Thus, within the next two months we expect to sign an agreement with Elsevier (the world's largest publisher of chemical literature) on cooperation in the field of reaction modeling and on the development of algorithms for manipulating reaction information in the Reaxys database. It should also be mentioned that a participant of the project, Dr T. Madzhidov, became a member of the Advisory Board of the ChemSpider Reactions database established by the Royal Society of Chemistry (UK) in 2015.
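The idea of encoding a reaction center by a single number (a hash-code), mentioned in this annotation, can be illustrated with a minimal sketch. This is not the algorithm developed in the project: it assumes the reaction center is given as a list of dynamic bonds (element symbols of the two atoms plus bond order before and after), and it hashes a sorted list of bond tokens with SHA-1. This simplification is independent of atom numbering and input order but ignores how the bonds are connected, so it is less discriminating than a true graph-based signature. All names are hypothetical.

```python
import hashlib
from typing import Iterable, Tuple

# (element of atom 1, element of atom 2, bond order before, bond order after)
DynamicBond = Tuple[str, str, int, int]


def bond_token(elem_a: str, elem_b: str, before: int, after: int) -> str:
    """Text token for one dynamic bond, independent of atom order."""
    lo, hi = sorted((elem_a, elem_b))
    return f"{lo}-{hi}:{before}>{after}"


def signature_hash(center: Iterable[DynamicBond]) -> int:
    """Map a reaction center to a single integer via a canonical string."""
    canonical = "|".join(sorted(bond_token(*b) for b in center))
    digest = hashlib.sha1(canonical.encode("utf-8")).hexdigest()
    return int(digest[:12], 16)  # keep 48 bits for a compact code


# The same SN2-like center written in two different orders (0 = no bond).
center_a = [("C", "Br", 1, 0), ("C", "O", 0, 1)]
center_b = [("O", "C", 0, 1), ("Br", "C", 1, 0)]
# A different center: C-Cl broken instead of C-Br.
center_c = [("C", "Cl", 1, 0), ("C", "O", 0, 1)]

print(signature_hash(center_a) == signature_hash(center_b))
print(signature_hash(center_a) == signature_hash(center_c))
```

Hierarchically linked signatures, as described in the report, could be approximated in such a scheme by hashing progressively larger neighborhoods of the reaction center, but that extension is not shown here.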

 

Publications

1. - Expert System for Predicting Reaction Conditions: The Michael Reaction Case Journal of Chemical Information and Modeling, V. 55, Is. 2, P. 239–250 (year - 2015) https://doi.org/10.1021/ci500698a

2. - A consensus approach to atom-atom mapping in chemical reactions Бутлеровские сообщения, Vol. 44, No. 12, P. 170-176 (year - 2015)

3. - Structure-reactivity relationship in bimolecular elimination reactions using the Condensed Graph of Reaction approach Журнал структурной химии, Vol. 56, No. 7, P. 1293-1300 (year - 2015) https://doi.org/10.15372/JSC20150701

4. - Stargate GTM: Bridging Descriptor and Activity Spaces Journal of Chemical Information and Modeling, V. 55, P. 2403-2410 (year - 2015) https://doi.org/10.1021/acs.jcim.5b00398

5. - How Accurately Can We Predict the Melting Points of Drug-like Compounds? Journal of Chemical Information and Modeling, V.54, Is. 12, P. 3320–3329 (year - 2014) https://doi.org/10.1021/ci5005288

6. - Predictive Models for Halogen-Bond Basicity of Binding Sites of Polyfunctional Molecules Molecular Informatics, - (year - 2015) https://doi.org/10.1002/minf.201500116

7. - Mappability of drug-like space: towards a polypharmacologically competent map of drug-relevant compounds Journal of Computer-Aided Molecular Design, - (year - 2015) https://doi.org/10.1007/s10822-015-9882-z