TIL: Useful Datamol functions

https://docs.datamol.io/stable/index.html

import datamol as dm

Cluster a set of molecules with the Butina clustering algorithm at a given distance cutoff

dm.cluster.cluster_mols(mols, cutoff=0.2, feature_fn=None, n_jobs=1)
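To make the role of `cutoff` concrete, here is a minimal pure-Python sketch of the Butina idea: count each item's neighbors within the cutoff, then repeatedly turn the item with the most neighbors into a cluster centroid. This is not datamol's or RDKit's implementation. The helper names (`tanimoto_distance`, `butina_clusters`), the use of Tanimoto distance, and the toy "fingerprints" (sets of on-bit indices) are assumptions made for illustration only.

```python
from itertools import combinations


def tanimoto_distance(a: frozenset, b: frozenset) -> float:
    """Return 1 - |a & b| / |a | b| for two sets of on-bit indices."""
    union = len(a | b)
    if union == 0:
        return 0.0
    return 1.0 - len(a & b) / union


def butina_clusters(fingerprints, cutoff=0.2):
    """Cluster items so that every member lies within `cutoff` of its centroid."""
    n = len(fingerprints)

    # Step 1: build the neighbor list of every item (distance <= cutoff).
    neighbors = [set() for _ in range(n)]
    for i, j in combinations(range(n), 2):
        if tanimoto_distance(fingerprints[i], fingerprints[j]) <= cutoff:
            neighbors[i].add(j)
            neighbors[j].add(i)

    # Step 2: visit items from most to fewest neighbors (ties broken by index).
    order = sorted(range(n), key=lambda i: (-len(neighbors[i]), i))

    # Step 3: each unassigned item becomes a centroid and claims its
    # still-unassigned neighbors; the centroid is listed first.
    assigned = set()
    clusters = []
    for centroid in order:
        if centroid in assigned:
            continue
        members = [centroid] + sorted(neighbors[centroid] - assigned)
        assigned.update(members)
        clusters.append(tuple(members))
    return clusters


# Hypothetical toy data, not real ECFP output.
toy_fps = [
    frozenset({1, 2, 3, 4, 5}),
    frozenset({1, 2, 3, 4, 5, 6}),
    frozenset({1, 2, 3, 4, 6}),
    frozenset({10, 11, 12}),
    frozenset({10, 11, 12, 13}),
]

loose = butina_clusters(toy_fps, cutoff=0.4)
tight = butina_clusters(toy_fps, cutoff=0.2)
```

On this toy data, a lower cutoff gives more and smaller clusters: the last two items are 0.25 apart, so they share a cluster at `cutoff=0.4` but become singletons at `cutoff=0.2`.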

Compute conformers of a molecule

dm.conformers.generate(mol, ...)

Convert a list of molecules to a dataframe, with one column per molecule property

dm.convert.to_df(mols)

Compute a list of opinionated molecular properties

dm.descriptors.compute_many_descriptors(mol)

Compute the fingerprint of a molecule, given either a mol object or a SMILES string

dm.fp.to_fp(mol, as_array=True, fp_type='ecfp')

Generate all possible fragmentations of a molecule

dm.fragment.frag(mol, remove_parent=False, sanitize=True, fix=True)

Read an SDF file

dm.io.read_sdf(urlpath)

Write molecules to an SDF file

dm.io.to_sdf(mols, urlpath)

Context manager to disable RDKit logs

with dm.log.without_rdkit_log():
    mol = dm.to_mol("CCCCO")  # potential RDKit logs won't show

Disable all RDKit logs

dm.log.disable_rdkit_log()
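The two logging helpers above differ in scope: the context manager silences logs only inside the `with` block, while `disable_rdkit_log()` turns them off from that point on. As an analogy only, the scoped pattern can be sketched with Python's standard `logging` module. RDKit logs through its own logger, so this is not how datamol implements it; `without_log` and the logger name `demo.chem` are hypothetical.

```python
import logging
from contextlib import contextmanager


@contextmanager
def without_log(logger_name):
    """Silence one logger inside the with-block, then restore its previous state."""
    logger = logging.getLogger(logger_name)
    was_disabled = logger.disabled
    logger.disabled = True
    try:
        yield
    finally:
        # Restore even if the block raised, so the silence never leaks out.
        logger.disabled = was_disabled


class ListHandler(logging.Handler):
    """Collect emitted messages in a list so the effect is easy to inspect."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


demo_logger = logging.getLogger("demo.chem")
demo_logger.setLevel(logging.WARNING)
demo_logger.propagate = False
handler = ListHandler()
demo_logger.addHandler(handler)

demo_logger.warning("before")
with without_log("demo.chem"):
    demo_logger.warning("inside")  # suppressed
demo_logger.warning("after")
```

Restoring the previous state in `finally` is what makes the scoped version safe to use around a single noisy call without affecting the rest of a script.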

Standardize and sanitize a molecule

mol = dm.mol.to_mol("O=C(C)Oc1ccccc1C(=O)O")
mol = dm.mol.fix_mol(mol)
mol = dm.mol.sanitize_mol(mol)
mol = dm.mol.standardize_mol(mol)

Generate an image out of a molecule or a list of molecules

dm.viz.to_image(mols, legends)


