
Cheminformatics

Field

Cheminformatics, also known as chemical informatics or chemoinformatics, is the field that combines chemistry with computer science to process and analyze chemical data. It uses computational techniques to store, retrieve, analyze, and visualize chemical information. This includes managing chemical databases, designing chemical compounds, predicting molecular properties, and analyzing chemical reactions. Two major challenges addressed by cheminformatics are compound representation (encoding molecular structures in a machine-readable form) and quantitative structure-activity relationship (QSAR) modeling, which relates structural features of compounds to their measured activity.
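As a rough illustration of the QSAR idea, the sketch below fits a one-descriptor linear model with ordinary least squares. The descriptor and activity values are made-up toy numbers, and the function names (fit_line, predict) are hypothetical; real QSAR models typically use many descriptors and more robust validation.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("descriptor values must not all be identical")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) model to a new descriptor value."""
    slope, intercept = model
    return slope * x + intercept


# Toy training set (illustrative values only, not measured data):
# one numeric descriptor per compound and one activity value per compound.
descriptor = [1.0, 2.0, 3.0, 4.0]
activity = [5.1, 5.9, 7.1, 7.9]

model = fit_line(descriptor, activity)
print("slope, intercept:", model)
print("predicted activity at descriptor 2.5:", predict(model, 2.5))
```

The same structure carries over to richer models: the descriptor becomes a vector, and the line fit is replaced by a regression or machine learning method.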

Importance in Computational Drug Discovery:

  1. Data Management: Efficient handling and management of large volumes of chemical data, including structures, properties, and biological activities.
  2. Virtual Screening: Enables the rapid screening of large libraries of compounds to identify potential ligands.
  3. Molecular Modeling: Facilitates the modeling and simulation of molecular structures and interactions, aiding in the design and optimization of potential drug candidates.
  4. Predictive Modeling: Uses machine learning and statistical methods to predict the properties and activities of molecules, guiding the selection and optimization of leads.
  5. Integration: Integrates with other disciplines like bioinformatics and structural biology to provide a holistic approach to drug discovery.
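One common way to carry out the virtual screening step above is similarity searching: each compound is represented as a fingerprint (a set of "on" bits) and ranked by Tanimoto similarity to a known active. The sketch below assumes fingerprints are already available as Python sets; the compound names, bit values, and the 0.5 threshold are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of fingerprint bits."""
    a, b = set(a), set(b)
    union = len(a | b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / union


def screen(query, library, threshold=0.5):
    """Return (name, score) pairs at or above threshold, best match first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    hits = [pair for pair in scored if pair[1] >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Toy library: names and bit positions are illustrative only.
library = {
    "cmpd_A": {1, 4, 7, 9},
    "cmpd_B": {1, 4, 8},
    "cmpd_C": {2, 3, 5},
}
query = {1, 4, 7}

for name, score in screen(query, library):
    print(f"{name}: {score:.2f}")
```

In practice the fingerprints would come from a toolkit such as RDKit rather than being written by hand, and the threshold would be tuned to the fingerprint type.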

Key Tools

  1. RDKit:
    • An open-source cheminformatics toolkit that provides molecular fingerprints, descriptors, and substructure searching.
  2. Open Babel:
    • An open-source chemical toolbox designed to interconvert file formats, handle molecular data, and perform cheminformatics tasks.
  3. KNIME:
    • An open-source data analytics platform that integrates various cheminformatics and bioinformatics tools for data analysis and visualization.
  4. DeepChem:
    • An open-source library that uses deep learning models for various cheminformatics tasks, including molecule generation, property prediction, and activity prediction.
  5. ChEMBL:
    • A database of bioactive molecules with drug-like properties, providing a wealth of information for cheminformatics analysis.
  6. DeepOrigin Tools: Use Balto to query and manage data with any of the following tools in a simple conversation.
    • SMILESToWeight: For calculating the molecular weight of compounds.
    • FuncGroups: For determining functional groups in molecules.
    • QED: For evaluating the drug-likeness of molecules.
    • LogP, LogS, LogD: For predicting the partition coefficient (logP), aqueous solubility (logS), and distribution coefficient (logD) of molecules.
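To make the simplest of these calculations concrete, the sketch below computes a molecular weight by summing atomic masses. It is not the DeepOrigin implementation: it takes a plain molecular formula such as C2H6O rather than a SMILES string, supports only the few elements in its table, and uses rounded standard atomic weights. The names ATOMIC_MASS and molecular_weight are hypothetical.

```python
import re

# Rounded standard atomic weights in g/mol; extend as needed.
ATOMIC_MASS = {
    "H": 1.008,
    "C": 12.011,
    "N": 14.007,
    "O": 15.999,
    "S": 32.06,
    "Cl": 35.45,
}


def molecular_weight(formula):
    """Sum atomic masses for a flat formula like 'C2H6O' (no parentheses)."""
    tokens = re.findall(r"([A-Z][a-z]?)(\d*)", formula)
    # Reject input that the pattern did not consume completely.
    if not tokens or "".join(sym + cnt for sym, cnt in tokens) != formula:
        raise ValueError(f"unsupported formula: {formula!r}")
    total = 0.0
    for symbol, count in tokens:
        if symbol not in ATOMIC_MASS:
            raise ValueError(f"unknown element: {symbol}")
        total += ATOMIC_MASS[symbol] * (int(count) if count else 1)
    return total


print(f"ethanol C2H6O: {molecular_weight('C2H6O'):.3f} g/mol")
```

Working from SMILES instead would require parsing the structure and adding implicit hydrogens, which is what a cheminformatics toolkit handles.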

Literature

Cheminformatics in Drug Discovery, an Industrial Perspective

  • Publication Date: 2018-09-01
  • DOI: 10.1002/minf.201800041
  • Summary: This review highlights traditional areas like virtual screening, library design, and high‐throughput screening analysis, and discusses the application of machine learning in early drug discovery for tasks like de novo molecular design and prediction of chemical reactions.

Special Issue: Cheminformatics in Drug Discovery

  • Publication Date: 2018-03-20
  • DOI: 10.1002/cmdc.201800123
  • Summary: A special issue presenting 20 articles on cheminformatics in drug design, summarizing common themes within in silico drug discovery.

Using Cheminformatics in Drug Discovery

  • Publication Date: N/A
  • DOI: 10.1007/164_2015_23
  • Summary: Illustrates how cheminformatics can be applied to designing novel compounds that are active at the primary target and have good predicted ADMET properties.

Exploiting Vector Pattern Diversity of Molecular Scaffolds for Cheminformatics Tasks in Drug Discovery

  • Publication Date: 2024-03-04
  • DOI: 10.1021/acs.jcim.3c01674
  • Summary: This work demonstrates the usefulness of considering exploited vectors during different phases of the drug design process to provide a quantitative and objective description of chemical diversity.

Predictive cheminformatics in drug discovery: statistical modeling for analysis of micro-array and gene expression data

  • Publication Date: N/A
  • DOI: 10.1007/978-1-61779-965-5_9
  • Summary: Reviews considerations in statistical modeling and summarizes best practices in predictive cheminformatics.

Cheminformatics in Natural Product‐based Drug Discovery

  • Publication Date: 2020-07-28
  • DOI: 10.1002/minf.202000171
  • Summary: Provides a survey of the scope and limitations of cheminformatics methods in natural product‐based drug discovery.

Study of role of Cheminformatics in the Modern Drug Discovery Process

  • Publication Date: 2022-05-22
  • DOI: 10.46243/jst.2022.v7.i01.pp49-52
  • Summary: Discusses the need for cheminformatics due to the vast quantities of data generated by new approaches to drug discovery such as high-throughput screening and combinatorial chemistry.

Keynote Lecture “Cheminformatics in natural product-based drug discovery”

  • Publication Date: 2022-12-01
  • DOI: 10.1055/s-0042-1758912
  • Summary: Highlights the importance of cheminformatics in natural product-based drug discovery.

Deep learning in drug discovery: a futuristic modality to materialize the large datasets for cheminformatics

  • Publication Date: 2022-10-28
  • DOI: 10.1080/07391102.2022.2136244
  • Summary: Investigates the performance of various deep learning algorithms in generating high-quality, interpretable datasets for drug design and development.

Application of Machine Learning and Molecular Modeling in Drug Discovery and Cheminformatics

  • Publication Date: 2021-08-16
  • DOI: 10.1201/9781003126164-10
  • Summary: Discusses the integration of machine learning and molecular modeling in drug discovery and cheminformatics.