LigGPT

In this work, we train a small custom GPT on the MOSES and GuacaMol datasets with a next-token prediction task. The model is then used for unconditional and conditional molecular generation. We compare our model with previous approaches on the MOSES and GuacaMol benchmarks. Saliency maps are obtained for interpretability using the Ecco library.
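
To make the training objective concrete, here is a minimal, self-contained sketch of next-token prediction on SMILES strings. This is a conceptual illustration in PyTorch, not the repository's actual model or tokenizer: the character-level vocabulary, padding scheme, and tiny transformer layer below are all stand-in assumptions.

import torch
import torch.nn as nn

# Toy SMILES corpus and a character-level vocabulary (id 0 is padding).
smiles = ["CCO", "c1ccccc1", "CC(=O)O"]
stoi = {ch: i + 1 for i, ch in enumerate(sorted({c for s in smiles for c in s}))}
vocab_size = len(stoi) + 1

def encode(s, max_len=12):
    ids = [stoi[c] for c in s][:max_len]
    return ids + [0] * (max_len - len(ids))  # right-pad to a fixed length

x = torch.tensor([encode(s) for s in smiles])
T = x.size(1)

# A tiny causal transformer layer standing in for the custom GPT.
embed = nn.Embedding(vocab_size, 32)
layer = nn.TransformerEncoderLayer(d_model=32, nhead=4, batch_first=True)
head = nn.Linear(32, vocab_size)

# Additive causal mask: positions may only attend to themselves and the past.
causal_mask = torch.triu(torch.full((T, T), float("-inf")), diagonal=1)
logits = head(layer(embed(x), src_mask=causal_mask))

# Next-token prediction: the output at position t is scored against token t+1.
loss = nn.functional.cross_entropy(
    logits[:, :-1].reshape(-1, vocab_size),
    x[:, 1:].reshape(-1),
    ignore_index=0,  # do not score padded positions
)
print(f"next-token loss: {loss.item():.3f}")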

  • The processed GuacaMol and MOSES datasets in CSV format can be downloaded from this link:

https://drive.google.com/drive/folders/1LrtGru7Srj_62WMR4Zcfs7xJ3GZr9N4E?usp=sharing

  • The original GuacaMol dataset can be found here:

https://github.com/BenevolentAI/guacamol

  • The original MOSES dataset can be found here:

https://github.com/molecularsets/moses

To train the model, make sure the dataset CSV files are in the same directory as the code files.

  • For unconditional training, run:
python train.py --run_name unconditional_moses --data_name moses --num_props 0
  • For property-based conditional training (the property column, logp here, must be present in the CSV; see the RDKit sketch after this list), run:
python train.py --run_name conditional_moses --data_name moses --num_props 1 --property logp
  • For scaffold-based conditional training, run:
python train.py --run_name scaffold_moses --data_name moses --scaffold --num_props 0
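
The --property logp and --scaffold options condition training on per-molecule annotations that are expected in the training CSV. The preprocessed CSVs linked above should already contain them; as a hedged sketch of how such columns could be computed from scratch with RDKit (the file name moses_train.csv and the column names smiles, logp, and scaffold_smiles are illustrative assumptions, not the repository's actual schema):

import pandas as pd
from rdkit import Chem
from rdkit.Chem import Crippen
from rdkit.Chem.Scaffolds import MurckoScaffold

df = pd.read_csv("moses_train.csv")  # hypothetical file with a 'smiles' column

def annotate(smiles):
    # Compute Crippen logP and the Bemis-Murcko scaffold for one molecule.
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None, None  # skip unparsable SMILES
    return Crippen.MolLogP(mol), MurckoScaffold.MurckoScaffoldSmiles(mol=mol)

df["logp"], df["scaffold_smiles"] = zip(*df["smiles"].map(annotate))
df.to_csv("moses_train.csv", index=False)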

If you find this work useful, please cite:

Bagal, Viraj; Aggarwal, Rishal; Vinod, P. K.; Priyakumar, U. Deva (2021): LigGPT: Molecular Generation using a Transformer-Decoder Model. ChemRxiv. Preprint. https://doi.org/10.26434/chemrxiv.14561901.v1
