GitHub - MolGen/selene: a framework for training sequence-level deep learning networks

You have found Selene, a Python library and command line interface for training deep neural networks from biological sequence data such as genomes.

Installation

Selene is a Python 3+ package. We recommend using it with Python 3.6 or above. Installation typically takes 2-3 minutes (and less than 10 minutes) with any of the methods below (conda, pip, source).

Installing selene with Anaconda (for Linux):

conda install -c bioconda selene-sdk

Installing selene with pip:

pip install selene-sdk

Installing selene from source:

First, download the latest commits from the source repository (or download the latest tagged version of Selene for a stable release):

git clone https://github.com/FunctionLab/selene.git

The setup.py script requires NumPy. Please make sure it is already installed.

If you plan on working in the selene repository directly, we recommend setting up a conda environment using selene-cpu.yml or selene-gpu.yml (if CUDA is enabled on your machine) and activating it. These environment YAML files list specific versions of package dependencies that we have used in the past to test Selene.
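As a rough aid for choosing between the two environment files, the sketch below picks selene-gpu.yml when the `nvidia-smi` tool is found on the PATH and selene-cpu.yml otherwise. This is only a heuristic and an assumption on our part: the presence of `nvidia-smi` suggests an NVIDIA driver is installed, but it does not guarantee a working CUDA setup. The helper name `pick_env_file` is hypothetical and not part of Selene.

```python
import shutil


def pick_env_file(which=shutil.which) -> str:
    """Pick a conda environment file for Selene.

    Heuristic (an assumption, not a Selene feature): if `nvidia-smi`
    is on the PATH, an NVIDIA driver is probably installed, so prefer
    the GPU environment file; otherwise fall back to the CPU one.
    """
    return "selene-gpu.yml" if which("nvidia-smi") else "selene-cpu.yml"


if __name__ == "__main__":
    env_file = pick_env_file()
    # Print the conda command rather than running it, so nothing is
    # created without the user reviewing the choice first.
    print(f"conda env create -f {env_file}")
```

Run it from the root of the cloned repository, where both YAML files live, and review the printed command before executing it.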

Selene contains some Cython files. You can build these by running

python setup.py build_ext --inplace

Otherwise, if you would like to locally install Selene, you can run

python setup.py install

Please install docopt before running the command-line script selene_cli.py provided in the repository.
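The prerequisites mentioned above (NumPy for setup.py, docopt for selene_cli.py) can be checked before building or running anything. The following is a minimal sketch using only the Python standard library; the helper name `missing_modules` is hypothetical, and it assumes the import names `numpy` and `docopt` match the package names.

```python
import importlib.util


def missing_modules(names):
    """Return the names from `names` that cannot be found on the current Python path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # numpy is needed by setup.py; docopt is needed by selene_cli.py.
    # Both import names are assumed to match the package names.
    missing = missing_modules(["numpy", "docopt"])
    if missing:
        print("Install before continuing:", ", ".join(missing))
    else:
        print("All prerequisites found.")
```

The check only locates the modules and does not import them, so it stays fast even for large packages.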

About Selene

Selene is composed of a command-line interface and an API (the selene-sdk Python package). Users supply their data, model architecture, and configuration parameters, and Selene runs the user-specified operations (training, evaluation, prediction) for that sequence-based model.

For a more detailed overview of the components in the Selene software development kit (SDK), please consult the page here.

Documentation

The documentation for Selene is available here.

Examples

In general, we recommend that the manuscript case studies and the tutorials be run on a machine with a GPU. All examples take significantly longer when run on a CPU machine.

Tutorials

Tutorials for Selene are available here.

It is possible to run the tutorials (Jupyter notebook examples) on a standard CPU machine. The training examples can all be run to completion on CPU, but you should expect them to take 2-3 days or more. You can also change the training parameters (e.g. total number of steps) so that they complete much faster.
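To get a feel for how much a reduced step count shortens a CPU run, here is a back-of-the-envelope sketch. It assumes training time scales linearly with the number of steps, which ignores fixed costs such as data loading and validation. The function name and the step counts in the example are hypothetical, not values taken from the tutorials.

```python
def estimated_hours(full_run_hours: float, full_steps: int, reduced_steps: int) -> float:
    """Estimate the runtime for `reduced_steps`, assuming time is proportional to step count."""
    if full_steps <= 0 or reduced_steps < 0:
        raise ValueError("full_steps must be positive and reduced_steps non-negative")
    return full_run_hours * reduced_steps / full_steps


if __name__ == "__main__":
    # Hypothetical numbers: a full run of 500,000 steps taking 60 hours
    # (2.5 days) on CPU, cut down to 50,000 steps.
    print(estimated_hours(60.0, 500_000, 50_000))  # 6.0
```

A shortened run like this is useful for checking that a tutorial executes end to end, but the resulting model will not match the performance of a fully trained one.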

The non-training examples (variant effect prediction, in silico mutagenesis) run fairly quickly: variant effect prediction might take 20-30 minutes, and in silico mutagenesis 10-15 minutes.

Please see the README in the tutorials directory for links to and descriptions of the specific tutorials.

Manuscript case studies

The code to reproduce case studies in the manuscript is available here.

Each case has its own directory and README describing how to run these cases. We recommend consulting the step-by-step breakdown of each case study that we provide in the methods section of the manuscript as well.

The manuscript examples were only tested on GPU. Our GPU (NVIDIA Tesla V100) time estimates:

Case study 1 finishes in about 1 day on a GPU node.
Case study 2 takes 6-7 days to run training (with the work distributed across 4 V100s).
Case study 3 (variant effect prediction) takes about 1 day to run.

