About pflow

Pflow is a Python package that uses molecular dynamics (MD) to generate training data for generative models of proteins.

Quick start

Pflow is based on dflow. Running the workflow in a Kubernetes (k8s) environment is recommended unless that is not feasible. The dflow team provides a community k8s deployment (deepmodeling k8s), which makes pflow convenient to use. To use the community deployment, first register a Bohrium account and learn a few concepts (job, jobgroup, project id) from the documentation on the Bohrium website. After that, using pflow is straightforward.

Set the environment variables

Set the environment variables using your personal Bohrium account information:

export DFLOW_HOST=https://workflows.deepmodeling.com
export DFLOW_K8S_API_SERVER=https://workflows.deepmodeling.com
export DFLOW_S3_REPO_KEY=oss-bohrium
export DFLOW_S3_STORAGE_CLIENT=dflow.plugins.bohrium.TiefblueClient
export BOHRIUM_USERNAME="<bohrium-email>"
export BOHRIUM_PASSWORD="<bohrium-password>"
export BOHRIUM_PROJECT_ID="<bohrium-project-id>"
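
A missing or empty variable is easy to overlook before submitting a workflow. The following is a minimal sketch, not part of pflow, that checks whether the seven variables listed above are set in the current shell. The helper name `missing_vars` is hypothetical.

```python
import os

# The variable names are taken from the export lines above.
REQUIRED_VARS = [
    "DFLOW_HOST",
    "DFLOW_K8S_API_SERVER",
    "DFLOW_S3_REPO_KEY",
    "DFLOW_S3_STORAGE_CLIENT",
    "BOHRIUM_USERNAME",
    "BOHRIUM_PASSWORD",
    "BOHRIUM_PROJECT_ID",
]


def missing_vars(env=None):
    """Return the required variables that are unset or empty in env.

    Hypothetical helper for illustration; defaults to the process environment.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_vars()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("All required environment variables are set.")
```

Running the script in the same shell where the export commands were issued shows which variables, if any, still need to be set.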

Install pflow

Install the latest pflow from source:

git clone git@github.com:PKUfjh/protein_flow.git
cd protein_flow
pip install setuptools_scm
pip install .

Run an example

Change to the pflow repository directory:

cd protein_flow

Run the example to generate data for the chicken villin protein (2F4K):

pflow submit -i ./tests/data/ -c ./rid/template/rid_gmx_dih.json -m ./rid/template/machine_bohrium_k8s.json

Workflow visualization

The example uses 2 walkers for the parallel equilibrium MD and 7 nonequilibrium MD runs, so the workflow produces 7 trajectories in total. The workflow can be visualized in the Argo UI.

Main procedure of Pflow

Pflow runs the following steps in order:

Equilibrium MD;
Clustering and selection;
Nonequilibrium MD using Moving Restraints;
Transform the xtc trajectory into npz format.

Equilibrium MD

Pflow first runs several equilibrium MD simulations in parallel. The number of parallel simulations (walkers) is set by the user in the json file. The aim of this step is to generate configurations near the equilibrium ensemble of the initial structure.
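
The idea of independent parallel walkers can be illustrated with a toy model. The sketch below is not pflow's implementation: it replaces the real MD engine with a one-dimensional overdamped Langevin walker in a harmonic well and runs the walkers in a thread pool instead of workflow steps. The function names and parameters (`run_walker`, `run_walkers`, `dt`, `kT`) are assumptions made for illustration.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def run_walker(seed, n_steps=1000, dt=0.01, kT=1.0):
    """Toy 1D overdamped Langevin walker in the well U(x) = x**2 / 2."""
    rng = random.Random(seed)  # independent random stream per walker
    x = 0.0
    samples = []
    for _ in range(n_steps):
        force = -x  # -dU/dx for the harmonic well
        x += force * dt + (2.0 * kT * dt) ** 0.5 * rng.gauss(0.0, 1.0)
        samples.append(x)
    return samples


def run_walkers(n_walkers, **kwargs):
    """Run n_walkers independent walkers concurrently, one seed per walker."""
    with ThreadPoolExecutor(max_workers=n_walkers) as pool:
        return list(pool.map(lambda seed: run_walker(seed, **kwargs),
                             range(n_walkers)))


# Two walkers, matching the walker count used in the example above.
trajectories = run_walkers(2, n_steps=500)
print(len(trajectories), len(trajectories[0]))
```

Each walker starts from the same state but uses a different random seed, so together the walkers sample the neighbourhood of the starting point more broadly than a single run of the same length.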

Clustering and selection

This step clusters the configurations generated by equilibrium MD and selects a typical configuration from each cluster. The selected configurations are used as starting points for the subsequent nonequilibrium MD runs.
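
The clustering algorithm and the features pflow uses are not described here, so the following is only a sketch of the general idea under stated assumptions: frames are represented by short feature vectors, a simple distance-threshold ("leader") clustering groups them, and the frame closest to each cluster mean is taken as the typical configuration. The names `leader_cluster` and `representative` and the toy data are hypothetical.

```python
import math


def leader_cluster(points, threshold):
    """Group points by distance to the first member (leader) of each cluster.

    A point joins the first cluster whose leader is within threshold;
    otherwise it starts a new cluster. Returns lists of point indices.
    """
    clusters = []
    for i, point in enumerate(points):
        for members in clusters:
            if math.dist(points[members[0]], point) <= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def representative(points, members):
    """Return the index of the member closest to the cluster mean."""
    dim = len(points[members[0]])
    mean = [sum(points[i][d] for i in members) / len(members)
            for d in range(dim)]
    return min(members, key=lambda i: math.dist(points[i], mean))


# Toy two-component feature vectors, one per frame (not real MD data).
frames = [(0.0, 0.1), (0.1, 0.0), (0.05, 0.05),
          (2.0, 2.0), (2.2, 2.2), (2.1, 2.1)]
clusters = leader_cluster(frames, threshold=0.5)
selected = [representative(frames, members) for members in clusters]
print(clusters)
print(selected)
```

In this toy data the first three frames and the last three frames form two groups, and one frame index per group is returned as the starting configuration for the next step.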

Nonequilibrium MD using Moving Restraints

This step runs nonequilibrium MD to stretch the protein into straight configurations. The number of steps, the final length and the stretching strength are set by the user in the json file.
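
A moving restraint can be pictured as a harmonic bias whose center is shifted gradually from a starting value to a target value. The sketch below illustrates that idea only; it assumes a linear schedule and a harmonic form, and the parameter names (`start`, `final`, `n_steps`, `kappa`) are hypothetical stand-ins for the steps, final length and stretching strength mentioned above, not pflow's json keys.

```python
def restraint_schedule(start, final, n_steps):
    """Restraint centers moving linearly from start to final in n_steps."""
    if n_steps < 1:
        raise ValueError("n_steps must be at least 1")
    return [start + (final - start) * i / n_steps for i in range(n_steps + 1)]


def restraint_energy(value, center, kappa):
    """Harmonic restraint energy 0.5 * kappa * (value - center)**2."""
    return 0.5 * kappa * (value - center) ** 2


def restraint_force(value, center, kappa):
    """Force on the restrained quantity, -dE/dvalue."""
    return -kappa * (value - center)


# Move the target length from 1.0 to 5.0 (arbitrary units) in 4 steps.
centers = restraint_schedule(start=1.0, final=5.0, n_steps=4)
for center in centers:
    # Force felt by a chain whose length is still at the starting value.
    print(center, restraint_force(1.0, center, kappa=10.0))
```

A larger `kappa` pulls the restrained length toward the moving center more strongly, and more steps make the stretching slower and gentler.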

Transform the xtc trajectory into npz format

This step converts the xtc trajectory into npz format for training, which makes the data easier to read and process in Python.
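
To show what working with npz data looks like, here is a minimal NumPy sketch. It does not read a real xtc file (that would need a trajectory library); random coordinates stand in for the trajectory. The array keys `positions` and `time` and the file name are assumptions for illustration, not necessarily the keys pflow writes.

```python
import os
import tempfile

import numpy as np

# Stand-in trajectory: 5 frames of 10 atoms with xyz coordinates.
rng = np.random.default_rng(0)
n_frames, n_atoms = 5, 10
positions = rng.normal(size=(n_frames, n_atoms, 3)).astype(np.float32)
time = np.arange(n_frames, dtype=np.float32)

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "traj_example.npz")
    # Store several named arrays in one compressed archive.
    np.savez_compressed(path, positions=positions, time=time)

    # Arrays are loaded lazily by key when accessed.
    with np.load(path) as data:
        keys = sorted(data.files)
        loaded = data["positions"]

print(keys)
print(loaded.shape)
```

An npz file is a zip archive of named NumPy arrays, so a training script can load only the arrays it needs by key.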
