Short Tandem Repeats (STRs) are a type of genetic variation associated with many rare diseases. Information about pathogenic STRs is often out of date and scattered across different databases, making it difficult to find and interpret STR variants. STRchive ("ess tee archive") aims to solve this problem by providing a central community resource.
⭐️ View the data at strchive.org ⭐️
If you use STRchive in your research, please cite: Hiatt, L., Weisburd, B., Dolzhenko, E., Rubinetti, V., Avvaru, A.K., VanNoy, G.E., Kurtas, N.E., Rehm, H.L., Quinlan, A. and Dashnow, H.✉, 2025. STRchive: a dynamic resource detailing population-level and locus-specific insights at tandem repeat disease loci. Genome Medicine. doi: https://doi.org/10.1101/2024.05.21.24307682
STRchive by Harriet Dashnow is licensed under CC BY 4.0
- Harriet Dashnow
- Laurel Hiatt
- Akshay Avvaru
- Vincent Rubinetti
- Macayla Weiner
If you notice an error or omission, or know of an update, feel free to leave a comment or create a pull request.
To make a change to the STRchive data itself, please edit data/STRchive-loci.json
Then run the "linting" script and fix any errors:
python scripts/check-loci.py data/STRchive-loci.json
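Before running the linting script, it can help to confirm that the edited file still parses as JSON, since a stray comma or missing quote is an easy mistake when editing by hand. The following is a minimal sketch, not part of STRchive: the helper name `check_json_syntax` is hypothetical, and it only checks syntax. It assumes nothing about the fields `scripts/check-loci.py` validates and does not replace that script.

```python
import json
from pathlib import Path


def check_json_syntax(text):
    """Return (True, "") if text parses as JSON, else (False, reason).

    Hypothetical helper for illustration; it checks syntax only.
    """
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return False, f"line {err.lineno}, column {err.colno}: {err.msg}"
    return True, ""


if __name__ == "__main__":
    # Path taken from the instructions above; run from the repository root.
    loci_path = Path("data/STRchive-loci.json")
    if loci_path.exists():
        ok, reason = check_json_syntax(loci_path.read_text(encoding="utf-8"))
        print("valid JSON" if ok else f"invalid JSON at {reason}")
```

If this reports a syntax error, fix it at the reported line and column first, then run the linting script as described.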
From the root directory, run:
snakemake
Or, to skip the retrieve and manubot stages, which speeds things up substantially:
snakemake --config stages="skip-refs"
python scripts/make-catalog.py -g hg38 -f TRGT data/STRchive-loci.json data/STRchive-disease-loci.hg38.TRGT.bed
python scripts/make-catalog.py -g T2T -f TRGT data/STRchive-loci.json data/STRchive-disease-loci.T2T-chm13.TRGT.bed
python scripts/make-catalog.py -g hg19 -f TRGT data/STRchive-loci.json data/STRchive-disease-loci.hg19.TRGT.bed
python scripts/make-catalog.py -f bed -g hg38 data/STRchive-loci.json data/STRchive-disease-loci.hg38.bed
python scripts/make-catalog.py -f bed -g T2T data/STRchive-loci.json data/STRchive-disease-loci.T2T-chm13.bed
python scripts/make-catalog.py -f bed -g hg19 data/STRchive-loci.json data/STRchive-disease-loci.hg19.bed
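The six `make-catalog.py` commands above follow one pattern: each output format is paired with each genome build. As a rough sketch, they could be generated instead of typed out. The function name `catalog_commands` and the two lookup tables are hypothetical, and the sketch assumes `-g` and `-f` may be given in either order, as the commands above suggest.

```python
import shlex

# Genome argument -> label used in the output file name, as in the commands above.
GENOMES = {"hg38": "hg38", "T2T": "T2T-chm13", "hg19": "hg19"}
# Format argument -> output file suffix, as in the commands above.
FORMATS = {"TRGT": ".TRGT.bed", "bed": ".bed"}


def catalog_commands(loci="data/STRchive-loci.json"):
    """Build one make-catalog.py command (as an argv list) per format and genome."""
    commands = []
    for fmt, suffix in FORMATS.items():
        for genome, label in GENOMES.items():
            output = f"data/STRchive-disease-loci.{label}{suffix}"
            commands.append(
                ["python", "scripts/make-catalog.py",
                 "-g", genome, "-f", fmt, loci, output]
            )
    return commands


if __name__ == "__main__":
    # Print the commands for review rather than running them.
    for cmd in catalog_commands():
        print(shlex.join(cmd))
```

To execute the commands rather than print them, each list can be passed to `subprocess.run(cmd, check=True)` from the repository root.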
New install:
conda env create --file scripts/environment.yml
conda activate strchive
Update existing installation:
conda activate strchive
conda env update --file scripts/environment.yml --prune
conda activate strchive
Note: biomaRt does not install cleanly with conda, so it is installed from within the R script that uses it.