GoFlow: Efficient Transition State Geometry Prediction with Flow Matching and E(3)-Equivariant Neural Networks

GoFlow is an open-source model for predicting transition state geometries of single-step organic reactions. This repository contains the official implementation, including all scripts to fully reproduce the results reported in the paper.

Installation

Install GoFlow dependencies with uv (recommended):

# Install uv
pip install uv
# Install dependencies
uv sync -n
uv add torch_scatter torch_sparse torch_cluster torch_spline_conv torch_geometric -f https://data.pyg.org/whl/torch-2.6.0+cu124.html --no-build-isolation -n

Configuration

We use Hydra for managing model configurations and experiments.

All hyperparameters are defined in the configs directory (./configs) and its subdirectories.

Dataset

GoFlow is trained and evaluated on the open-source RDB7 database by Spiekermann et al. The raw .csv and .xyz files are located in the data/RDB7/raw_data directory.

To set up the dataset when using the repository for the first time, follow these steps:

1. Generate the indices required for creating the dataset splits by running the preprocess_extract_rxn_core.sh script.
2. Create the split files by running the preprocess_create_splits.sh script, which produces .pkl files containing the split indices.
3. Preprocess the data by running the preprocessing.sh script, which generates the data.pkl file. Adjust the paths to the .csv and .xyz files inside the script as needed.
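The three steps above can also be chained from Python. The following is a minimal sketch, assuming the scripts are run with bash from the repository root; the function name run_preprocessing and its dry_run option are hypothetical and not part of the repository.

```python
import subprocess
from pathlib import Path

# Script names are taken from the steps above; running them with bash
# from the repository root is an assumption.
PREPROCESSING_SCRIPTS = [
    "preprocess_extract_rxn_core.sh",  # step 1: indices for the splits
    "preprocess_create_splits.sh",     # step 2: .pkl files with split indices
    "preprocessing.sh",                # step 3: writes data.pkl
]


def run_preprocessing(repo_root=".", dry_run=False):
    """Run the preprocessing scripts in order, stopping at the first failure."""
    root = Path(repo_root)
    commands = [["bash", name] for name in PREPROCESSING_SCRIPTS]
    for command in commands:
        if dry_run:
            # Only show what would be executed.
            print(" ".join(command))
        else:
            # check=True raises CalledProcessError on a non-zero exit code,
            # so a later step never runs on incomplete output.
            subprocess.run(command, cwd=root, check=True)
    return commands


if __name__ == "__main__":
    run_preprocessing(dry_run=True)
```

Calling run_preprocessing(dry_run=True) only prints the commands, which is a cheap way to confirm the order before starting the actual run.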

Each processed reaction is stored as a PyG object. The objects are collected in a Python list and saved as data.pkl in the data/RDB7/processed_data directory.
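To inspect the processed data, the file can be loaded back into Python. This is a minimal sketch, assuming data.pkl was written with the standard pickle module and that torch_geometric is importable in the current environment so the PyG objects can be restored; load_reactions is a hypothetical helper, and the fields of each object are not described here.

```python
import pickle
from pathlib import Path

# Location as described above, relative to the repository root.
DATA_PATH = Path("data/RDB7/processed_data/data.pkl")


def load_reactions(path=DATA_PATH):
    """Load the pickled list of reactions and check that it is a list."""
    with open(path, "rb") as handle:
        reactions = pickle.load(handle)
    if not isinstance(reactions, list):
        raise TypeError(f"expected a list, got {type(reactions).__name__}")
    return reactions


if __name__ == "__main__":
    if DATA_PATH.exists():
        reactions = load_reactions()
        print(f"{len(reactions)} reactions loaded")
        if reactions:
            print(reactions[0])  # inspect the first entry
```

Unpickling executes code stored in the file, so only load data.pkl files that you generated yourself or that come from a trusted source.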

Usage

Each experiment has its own shell script (.sh file).

For example, to train and test the model on all dataset splits, run bash train_test_all_splits.sh in a Unix shell.

Modify the shell scripts as required to set custom paths for your input and output directories. Also, edit the configuration files as needed.

To run quantum mechanical saddle point optimizations, use the tsopt.inp script for ORCA together with a predicted transition state geometry (xyz file).
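Before passing a predicted geometry to ORCA, it can help to check that the xyz file is well-formed. The sketch below assumes the standard XYZ layout (atom count on the first line, a comment line, then one "element x y z" line per atom) and reads only the first structure in the text. The helper read_xyz is hypothetical and not part of the repository, and how tsopt.inp references the geometry file is not covered here.

```python
def read_xyz(text):
    """Parse the first structure of an XYZ-format string.

    Returns a list of element symbols and a list of (x, y, z) tuples.
    """
    lines = text.strip().splitlines()
    n_atoms = int(lines[0].split()[0])
    # lines[1] is the free-form comment line and is skipped.
    atom_lines = lines[2:2 + n_atoms]
    if len(atom_lines) != n_atoms:
        raise ValueError(
            f"header announces {n_atoms} atoms, found {len(atom_lines)}"
        )
    symbols, coordinates = [], []
    for line in atom_lines:
        parts = line.split()
        if len(parts) < 4:
            raise ValueError(f"malformed atom line: {line!r}")
        symbols.append(parts[0])
        coordinates.append(tuple(float(value) for value in parts[1:4]))
    return symbols, coordinates


if __name__ == "__main__":
    # Illustrative water geometry, not taken from the dataset.
    example = """3
water
O 0.000 0.000 0.117
H 0.000 0.757 -0.469
H 0.000 -0.757 -0.469
"""
    symbols, coordinates = read_xyz(example)
    print(symbols, len(coordinates))
```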

Acknowledgement

GoFlow is built upon open-source code provided by TsDiff and GotenNet.

License

Our model and code are released under the MIT License.

Cite

If you use this code in your research, please cite the following paper:

@article{galustian2025goflow,
  author = {Galustian, Leonard and Mark, Konstantin and Karwounopoulos, Johannes and Kovar, Maximilian and Heid, Esther},
  title = {GoFlow: Efficient Transition State Geometry Prediction with Flow Matching and E(3)-Equivariant Neural Networks},
  year = {2025},
  doi = {10.26434/chemrxiv-2025-bk2rh},
  journal = {ChemRxiv}
}
