Cheminformatics has been an active field in various guises since the 1970s or earlier, both in academic departments and in the research and development departments of commercial pharmaceutical companies.[2] The term chemoinformatics was defined, in its application to drug discovery, by F.K. Brown in 1998:[3]
Chemoinformatics is the mixing of those information resources to transform data into information and information into knowledge for the intended purpose of making better decisions faster in the area of drug lead identification and optimization.
Since then, both terms, cheminformatics and chemoinformatics, have been in use. Cheminformatics appears to be the more frequently used of the two,[4][5] even though academics in Europe declared for the variant chemoinformatics in 2006.[6] In 2009, the Journal of Cheminformatics, a prominent Springer journal in the field, was founded by transatlantic executive editors.[7]
A primary application of cheminformatics is the storage, indexing, and search of information relating to chemical compounds.[13] Searching such stored information efficiently draws on topics from computer science, such as data mining, information retrieval, information extraction, and machine learning. Related research topics include:
Chemical data can pertain to real or virtual molecules. Virtual libraries of compounds may be generated in various ways to explore chemical space and hypothesize novel compounds with desired properties. Virtual libraries of several classes of compounds (drugs, natural products, diversity-oriented synthetic products) have been generated using the FOG (fragment optimized growth) algorithm.[16] Cheminformatic tools were used to train the transition probabilities of a Markov chain on authentic classes of compounds, and the Markov chain was then used to generate novel compounds similar to those in the training database.
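The general idea of training transition probabilities and then sampling from them can be illustrated with a minimal sketch. This is not the FOG algorithm itself, which grows molecular graphs under chemical constraints; it is a toy first-order Markov chain over made-up sequences of fragment labels. The function names `train_transitions` and `generate`, the start and end markers, and the training data are all assumptions introduced for illustration.

```python
import random
from collections import defaultdict

# Hypothetical markers for the beginning and end of a fragment sequence.
START, END = "<start>", "<end>"


def train_transitions(sequences):
    """Estimate first-order transition probabilities from fragment sequences."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        path = [START, *seq, END]
        # Count every observed (current fragment -> next fragment) step.
        for current, nxt in zip(path, path[1:]):
            counts[current][nxt] += 1
    # Normalize each row of counts into a probability distribution.
    return {
        state: {nxt: n / sum(followers.values()) for nxt, n in followers.items()}
        for state, followers in counts.items()
    }


def generate(transitions, rng, max_len=20):
    """Sample one fragment sequence by walking the chain from START."""
    state, out = START, []
    while len(out) < max_len:
        followers = transitions[state]
        state = rng.choices(list(followers), weights=list(followers.values()))[0]
        if state == END:
            break
        out.append(state)
    return out


# Made-up training set: each "molecule" is just a list of fragment labels.
training = [
    ["benzene", "amide", "piperidine"],
    ["benzene", "ether", "piperidine"],
    ["pyridine", "amide", "benzene"],
]

transitions = train_transitions(training)
rng = random.Random(0)  # fixed seed so repeated runs give the same samples
samples = [generate(transitions, rng) for _ in range(5)]
for sample in samples:
    print(" - ".join(sample))
```

Every step in a sampled sequence is a transition that occurred somewhere in the training set, which is why the generated sequences resemble the training data while still recombining fragments in orders that may not have been seen together.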
In contrast to high-throughput screening, virtual screening involves computationally screening in silico libraries of compounds, by means of various methods such as docking, to identify members likely to possess desired properties such as biological activity against a given target. In some cases, combinatorial chemistry is used in the development of the library to increase the efficiency in mining the chemical space. More commonly, a diverse library of small molecules or natural products is screened.
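The overall shape of a virtual screen, scoring every member of a library and keeping the most promising ones, can be sketched as follows. The scoring function here is a hypothetical stand-in for a real method such as docking, and the `Compound` class, the descriptor values, and the formula in `toy_score` are invented purely for illustration; they carry no chemical meaning.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    name: str
    mol_weight: float    # made-up descriptor values, for illustration only
    h_bond_donors: int


def toy_score(compound):
    """Hypothetical scoring function; higher means 'more promising'.

    A real screen would call a docking program or a trained model here.
    """
    weight_penalty = abs(compound.mol_weight - 350.0) / 100.0
    donor_penalty = max(0, compound.h_bond_donors - 5)
    return -(weight_penalty + donor_penalty)


def virtual_screen(library, score_fn, top_k):
    """Rank the whole library by score and return the top_k compounds."""
    ranked = sorted(library, key=score_fn, reverse=True)
    return ranked[:top_k]


# Invented in silico library.
library = [
    Compound("cmpd-1", 340.0, 2),
    Compound("cmpd-2", 600.0, 7),
    Compound("cmpd-3", 355.0, 1),
    Compound("cmpd-4", 250.0, 3),
]

hits = virtual_screen(library, toy_score, top_k=2)
for hit in hits:
    print(hit.name, round(toy_score(hit), 2))
```

Passing the scoring function as an argument keeps the ranking step independent of the method used to score compounds, so the placeholder could in principle be swapped for a docking-based score without changing the rest of the sketch.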
Weininger, David (1988). "SMILES, a Chemical Language and Information System: 1: Introduction to Methodology and Encoding Rules". Journal of Chemical Information and Computer Sciences. 28 (1): 31–36. doi:10.1021/ci00057a005. S2CID 5445756.
Murray-Rust, Peter; Rzepa, Henry S. (1999). "Chemical Markup, XML, and the Worldwide Web. 1. Basic Principles". Journal of Chemical Information and Computer Sciences. 39 (6): 928–942. doi:10.1021/ci990052b.
Kutchukian, Peter; Lou, David; Shakhnovich, Eugene (2009). "FOG: Fragment Optimized Growth Algorithm for the de Novo Generation of Molecules occupying Druglike Chemical Space". Journal of Chemical Information and Modeling. 49 (7): 1630–1642. doi:10.1021/ci9000458. PMID 19527020.