8000 GitHub - Roche/OligoGym: OligoGym is a python package that streamlines processes involving featurization, training and evaluation of predictive models of oligonucleotide properties. Oligonucleotides include antisense oligonucleotides (ASOs) and small interfering RNAs (siRNAs).
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content
/ OligoGym Public

OligoGym is a python package that streamlines processes involving featurization, training and evaluation of predictive models of oligonucleotide properties. Oligonucleotides include antisense oligonucleotides (ASOs) and small interfering RNAs (siRNAs).

License

Notifications You must be signed in to change notification settings

Roche/OligoGym

Repository files navigation

OligoGym 🏃

Description

OligoGym is a package that streamlines the training and evaluation of predictive models of oligonucleotide (ASOs, siRNAs) properties. The core components of OligoGym are its featurizers and models. The featurizers convert compounds represented using the HELM notation into a set of features that can be used by machine learning models. The models are implemented using PyTorch Lightning and scikit-learn, and they can be trained and evaluated on various datasets. They are implemented in a way that allows for easy integration with the featurizers, making it simple to switch between different featurizers and models.

OligoGym is designed to be easy to use and flexible, making it suitable for both researchers and practitioners in the field of oligonucleotide design and optimization.

Example code

from oligogym.features import KMersCounts
from oligogym.models import LinearModel
from oligogym.data import DatasetDownloader

downloader = DatasetDownloader()
data = downloader.download("siRNA1")
X_train, X_test, y_train, y_test = data.split(split_strategy="random")
feat = KMersCounts(k=[1, 2, 3], modification_abundance=True)
X_kmer_train = feat.fit_transform(X_train)
X_kmer_test = feat.transform(X_test)

model = LinearModel()
model.fit(X_kmer_train, y_train)
y_pred = model.predict(X_kmer_test)

Featurizers

The following featurizers are currently implemented:

  • KMersCounts
  • OneHotEncoder
  • Thermodynamics

Models

The following models are currently implemented:

  • SKLearnModel
    • NearestNeighborsModel
    • RandomForestModel
    • XGBoostModel
    • LinearModel
    • GaussianProcessModel
    • TabPFNModel
  • LightningModel
    • MLP
    • CNN
    • CausalCNN
    • GRU

Prerequisites

Installation

Clone the repository and navigate to the project directory:

git clone github.com/Roche/oligogym
cd oligogym
poetry install

Usage

Activate the virtual environment:

poetry shell

Development

Code Formatting

Format code using Black:

poetry run black oligogym/ tests/

Linting

Lint code using Flake8:

poetry run flake8 oligogym/ tests/

Testing

Run tests using Pytest:

poetry run pytest

About

OligoGym is a python package that streamlines processes involving featurization, training and evaluation of predictive models of oligonucleotide properties. Oligonucleotides include antisense oligonucleotides (ASOs) and small interfering RNAs (siRNAs).

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published
0