Dynamics Retrieval

Dynamics retrieval methods (SSA, LPSA, NLSA), with application to time-resolved serial crystallography data and to other data (synthetic, climate).

References:

Casadei, C. M. et al., Structural Dynamics (2022), Dynamics retrieval from stochastically weighted incomplete data by low-pass spectral analysis. https://doi.org/10.1063/4.0000156

Casadei, C. M. et al., Structural Dynamics (2023), Low-pass spectral analysis of time-resolved serial crystallography data. https://doi.org/10.1063/4.0000178

Installation

Start by cloning the code.

git clone https://github.com/CeciliaCasadei/dynamics-retrieval.git
cd dynamics-retrieval

Using conda is recommended to install dependencies. A new conda environment can be created with

conda env create -f environment.yml

After this, install the package:

pip install -e .[dev]

Many workflows currently require editing the source code, so installing in developer mode (-e) is recommended.

Testing

To test:

cd workflows
python test_package.py

Workflows

The LPSA and NLSA code is contained in the dynamics_retrieval package. However, significant pre-processing is required to prepare data for analysis. Preparation and analysis scripts are provided in the scripts* directories and can be customized to your application. Wrappers that call the library functions are in the workflows directory.

TR-SFX Workflow

This is the general workflow used for serial crystallography. Scripts for bovine rhodopsin (rho) and bacteriorhodopsin (bR) are provided. Bacteriorhodopsin TR-SFX data can be found on Zenodo. The general flow is as follows:

scripts_crystfel_*
- Use CrystFEL to process TR-SFX data to produce stream files with indexed intensities (indexamajig, ambigator) & a list of scale factors (partialator).
- Calculate merging statistics, to e.g. estimate the desired high-resolution cutoff.
scripts_data_reduction_*
- See details for bR, rhodopsin
- Start with streams, scaling factors, and space group (e.g. asuP6_3.m)
- Process
  - Extract reflection intensities for each frame from the stream
  - Apply scale factors for each frame
  - Apply symmetry transformations
  - Add timing info for each frame
  - Filter for desired timing distribution (e.g. uniform timepoints)
- Output data matrix (1 column per frame)
workflows
- run_TR-SFX_LPSA.py runs scripts for dynamics retrieval
- Produces reconstructed reflection intensities for each timestep
scripts_make_maps
- Converts output to MTZ for use in Phenix
scripts_map_analysis
- Integrate difference density around a feature of interest
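
The data-reduction steps listed above can be illustrated on synthetic arrays. This is only a minimal sketch, not the code in scripts_data_reduction_*: the array names, the toy sizes, the use of multiplication for the scale factors, and the per-bin subsampling used to flatten the timing distribution are all assumptions made for illustration. Symmetry transformations are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes; real TR-SFX data sets are far larger.
m, n_frames = 6, 40

# Hypothetical stand-ins for values parsed from the stream and scale-factor list.
intensities = rng.normal(100.0, 10.0, size=(m, n_frames))  # one column per frame
measured = rng.random((m, n_frames)) < 0.3                  # True where a reflection was observed
scale = rng.uniform(0.5, 2.0, size=n_frames)                # one scale factor per frame
delay = rng.uniform(0.0, 1.0, size=n_frames)                # timing of each frame (arbitrary units)

# Apply the per-frame scale factor (multiplication is an assumption)
# and set unmeasured reflections to zero.
x = np.where(measured, intensities * scale[np.newaxis, :], 0.0)

# Add timing info: sort the frames by time.
order = np.argsort(delay)
x, measured, delay = x[:, order], measured[:, order], delay[order]

# Filter towards a uniform timing distribution:
# keep at most max_per_bin randomly chosen frames in each time bin.
n_bins, max_per_bin = 4, 5
bin_index = np.minimum((delay * n_bins).astype(int), n_bins - 1)
keep = np.zeros(n_frames, dtype=bool)
for b in range(n_bins):
    in_bin = np.flatnonzero(bin_index == b)
    if in_bin.size == 0:
        continue
    n_keep = min(in_bin.size, max_per_bin)
    keep[rng.choice(in_bin, size=n_keep, replace=False)] = True

# Output data matrix (one column per frame) and the matching mask.
x, measured, delay = x[:, keep], measured[:, keep], delay[keep]
print(x.shape, measured.shape)
```

The mask is carried through every step so that the analysis can later distinguish unmeasured reflections from measured zeros.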

Conventions

The following naming conventions can be helpful in understanding the code. Variable names often reflect the mathematical notation used in the papers.

x, the main input with each frame as one column (m reflections ⨯ S timesteps)
- For SFX typical sizes would be m=10^4, S=10^5
q, the concatenation number: the number of frames in x that are concatenated to form supervectors in X. Should be odd.
- q needs to be optimized, but values on the order of 10^4 worked for SFX
X, the superframe matrix (qm reflections ⨯ S timesteps). Not stored explicitly.
F or Φ, the matrix of harmonic functions
jmax or f_max, the maximum harmonic frequency to keep
p, which sets the number of frames (2p+1) in the reconstructed X that are averaged to make the reconstructed x. p = 0 takes a single frame (for performance); if 2p+1 = q, all frames are averaged.
20, the number of modes to save (hard-coded in several functions)
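
To make the roles of q and p concrete, here is a small sketch of concatenation and of the averaging step. It assumes a plain sliding window, which yields S-q+1 supervectors; the package may treat the edges differently, and it does not store X explicitly. The function names build_supervectors and average_copies are hypothetical and are not part of dynamics_retrieval.

```python
import numpy as np

def build_supervectors(x, q):
    """Stack q consecutive frames of x (m x S) into the columns of X (q*m x (S-q+1))."""
    m, S = x.shape
    n_super = S - q + 1
    X = np.empty((q * m, n_super))
    for i in range(q):
        # Block i of supervector c holds frame c + i.
        X[i * m:(i + 1) * m, :] = x[:, i:i + n_super]
    return X

def average_copies(X, m, q, p):
    """Recover frames from X by averaging the 2p+1 copies nearest the central block."""
    if 2 * p + 1 > q:
        raise ValueError("2p+1 must not exceed q")
    n_super = X.shape[1]
    centre = (q - 1) // 2
    # Times for which all 2p+1 copies exist in X.
    times = np.arange(centre + p, n_super + centre - p)
    x_rec = np.zeros((m, times.size))
    for i in range(centre - p, centre + p + 1):
        x_rec += X[i * m:(i + 1) * m, times - i]
    return times, x_rec / (2 * p + 1)

rng = np.random.default_rng(1)
m, S, q = 3, 12, 5
x = rng.normal(size=(m, S))
X = build_supervectors(x, q)

# With an unmodified X every copy of a frame is identical, so any p returns x itself.
times_single, x_single = average_copies(X, m, q, p=0)
times_all, x_all = average_copies(X, m, q, p=(q - 1) // 2)
print(X.shape, x_single.shape, x_all.shape)
```

In an actual analysis X would be replaced by its low-rank reconstruction, in which the copies of a frame differ, so the choice of p trades averaging against cost.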

Temporary files

To allow workflows to be resumed, large objects are often saved to joblib (.jbl) files and then loaded during later steps. Files are expected to be saved to various fixed paths relative to a temporary directory, the results_path.

Examples:

input_data_sparsity_*.jbl: x matrix (s frames ⨯ m reflections) giving intensities for each (h,k,l)
dT.jbl
input_data_mask_sparsity*.jbl: masks measured values of x
F_on.jbl: Phi matrix with LPSA harmonic functions, orthonormalized
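
The save-then-reload pattern can be sketched with joblib as follows. This is an illustration under assumptions: the helper load_or_compute and the function harmonic_basis are hypothetical, a plain truncated Fourier basis stands in for the harmonic functions defined in the papers, and QR is used here as one way to orthonormalize. Only joblib.dump, joblib.load and the file name F_on.jbl correspond to what is described above.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np

def harmonic_basis(n_times, jmax):
    """Columns: a constant, then cos/sin pairs for j = 1..jmax (assumed form)."""
    t = np.arange(n_times) / n_times
    columns = [np.ones(n_times)]
    for j in range(1, jmax + 1):
        columns.append(np.cos(2 * np.pi * j * t))
        columns.append(np.sin(2 * np.pi * j * t))
    return np.column_stack(columns)

def compute_F_on():
    F = harmonic_basis(n_times=200, jmax=5)
    F_on, _ = np.linalg.qr(F)  # reduced QR gives orthonormal columns
    return F_on

def load_or_compute(path, compute):
    """Return the object stored at path, computing and saving it first if absent."""
    path = Path(path)
    if path.exists():
        return joblib.load(path)
    obj = compute()
    joblib.dump(obj, path)
    return obj

# A temporary directory stands in for results_path.
with tempfile.TemporaryDirectory() as results_path:
    target = Path(results_path) / "F_on.jbl"
    F_first = load_or_compute(target, compute_F_on)  # computed and saved
    F_again = load_or_compute(target, compute_F_on)  # loaded from disk

print(F_first.shape)
```

Checking for the file before recomputing is what lets a later step pick up where an earlier run stopped.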

Settings files

Settings are currently passed via a Python module consisting of global variables. Examples are in workflows/settings*.py. Typical variables include the results_path, S, q, etc.
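
As an illustration of this pattern, the sketch below writes a small settings module to a temporary file and imports it. The file name, the values, the variable m and the helper load_settings are hypothetical; the real examples in workflows/settings*.py may define different variables and are imported by the workflow scripts in their own way.

```python
import importlib.util
import tempfile
from pathlib import Path

# Hypothetical settings module: plain global variables, no functions or classes.
SETTINGS_SOURCE = '''
results_path = "/tmp/lpsa_results"  # temporary directory holding the .jbl files
S = 1000  # number of frames (timesteps)
m = 50    # number of reflections (illustrative)
q = 101   # concatenation number, odd
'''

def load_settings(path):
    """Import a settings file as a module whose globals are the settings."""
    spec = importlib.util.spec_from_file_location("settings_example", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module

with tempfile.TemporaryDirectory() as tmp:
    settings_file = Path(tmp) / "settings_example.py"
    settings_file.write_text(SETTINGS_SOURCE)
    settings = load_settings(settings_file)

# Functions receive the module and read attributes from it.
print(settings.results_path, settings.S, settings.q)
```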
