GPMsDB-tk

GPMsDB-tk v1.0.1 was released on March 7, 2023.

GPMsDB-tk and GPMsDB-dbtk are software toolkits for assigning taxonomic identifications to user-provided MALDI-TOF mass spectrometry profiles obtained from cultured bacterial and archaeal isolates. They take advantage of a newly developed database of protein mass profiles predicted from ~200,000 bacterial and archaeal genome sequences. The toolkits are also designed to work with customized databases, allowing microbial identification based on user-provided genome or metagenome-assembled genome (MAG) sequences. GPMsDB-tk is open source and released under the GNU General Public License (Version 3).

Please post questions and issues related to GPMsDB-tk in the Issues section of the GitHub repository.

Installing and using GPMsDB-tk

Prerequisites

Python (version 3.7 or higher)
Cython (version 0.29.1 or higher)
scipy (version 1.7.3 or higher)
matplotlib (version 3.5.0 or higher)
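
As a convenience, the minimum versions listed above can be checked before installing. The following is a minimal sketch, not part of GPMsDB-tk: the helper names (version_tuple, meets_minimum, check_prerequisites) are hypothetical, and it relies on importlib.metadata, which is in the standard library only from Python 3.8 onward.

```python
import re
from importlib import metadata  # standard library in Python 3.8+

# Minimum versions as listed in the prerequisites above.
MINIMUMS = {"Cython": "0.29.1", "scipy": "1.7.3", "matplotlib": "3.5.0"}


def version_tuple(version):
    """Turn '1.7.3' (or '3.5.0rc1') into a tuple of leading integers."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, required):
    """Return True if the installed version is at least the required one."""
    return version_tuple(installed) >= version_tuple(required)


def check_prerequisites(minimums=MINIMUMS):
    """Map each package name to 'ok', 'too old (<version>)' or 'missing'."""
    report = {}
    for name, required in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if meets_minimum(installed, required):
            report[name] = "ok"
        else:
            report[name] = f"too old ({installed})"
    return report


if __name__ == "__main__":
    for name, status in check_prerequisites().items():
        print(f"{name}: {status}")
```

The comparison only looks at the leading numeric part of each version component, which is enough for the plain release numbers listed here.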

The following commands clone the repository, then compile and install the software into your Python environment from the source directory.

git clone https://github.com/ysekig/GPMsDB-tk
cd GPMsDB-tk
python setup.py install

During the installation, you may see some deprecation warnings like “easy_install command is deprecated” but this will not cause any issues for GPMsDB-tk.

GPMsDB-tk requires ~7 GB of external data that needs to be downloaded from Zenodo (DOI: 10.5281/zenodo.8245428) and unarchived:

tar xvzf R01-RS95.tar.gz

GPMsDB-tk requires an environment variable named GPMsDB_PATH to be set to the directory containing the unarchived reference data.

export GPMsDB_PATH=/path/to/release/package/
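
Before running the toolkit, it can help to confirm that the variable is visible to Python and points to an existing directory. The sketch below is an illustration only: resolve_db_path is a hypothetical helper, not a GPMsDB-tk function, and it does not inspect the contents of the reference data.

```python
import os
from pathlib import Path


def resolve_db_path(environ=os.environ):
    """Return the directory named by GPMsDB_PATH, or raise RuntimeError."""
    value = environ.get("GPMsDB_PATH")
    if not value:
        raise RuntimeError(
            "GPMsDB_PATH is not set; export it to point at the "
            "unarchived reference data directory"
        )
    path = Path(value).expanduser()
    if not path.is_dir():
        raise RuntimeError(f"GPMsDB_PATH is set to {path}, which is not a directory")
    return path


if __name__ == "__main__":
    try:
        print(f"Reference data directory: {resolve_db_path()}")
    except RuntimeError as error:
        print(f"Problem with GPMsDB_PATH: {error}")
```

Note that a variable set with export applies only to the current shell session; adding the export line to a shell startup file makes it persistent.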

If you are interested in customizing the database with user-provided genomes/metagenome-assembled genomes (MAGs), GPMsDB-dbtk should also be installed.

Features

Peak-list characterization:
- inspect -> Inspect a peak-list and generate peak plots
- adjust -> Adjust m/z values for a given peak-list
Strain identification based on peak-list(s):
- identify -> Search for the best-matching genome(s) without m/z adjustment
- identify_wf -> Full identification workflow (option "-aa" should be set for m/z adjustment)
- identify_bwf -> Full identification workflow for a batch of files (option "-aa" should be set for m/z adjustment)
Peak annotation:
- peak -> Annotate peaks with protein names and Tigrfam/Pfam marker genes
- peak_wf -> Full peak-list characterization workflow (option "-aa" should be set for m/z adjustment)
- peak_bwf -> Full peak-list characterization workflow for a batch of files (option "-aa" should be set for m/z adjustment)

Output values for the option "identify"

protein_hit: number of hits against all proteins predicted for the reference genome
ribosomal_hit: number of hits against ribosomal proteins predicted for the reference genome
score: matching value calculated for the reference genome (higher is a better match)
probability value: the frequency of appearance of a given score, inferred from the scores of 100 randomly selected reference genomes
likelihood(%): likelihood of correct identification (empirical, based on ribosomal_hit and probability)
ncbi_name: NCBI organism name (genus/species) for the reference genome
ncbi_strain: NCBI strain name for the reference genome
taxonomy_gtdb: GTDB taxonomy string for the reference genome
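
To illustrate how these values might be read together, the sketch below ranks candidate hits by score and optionally drops those below a likelihood cutoff. It is an assumption-laden example: the on-disk output format of "identify" is not described here, so the hits are represented as plain dictionaries, the key "likelihood" stands in for the likelihood(%) column, rank_hits is a hypothetical helper, and all names and numbers in the example records are made up.

```python
def rank_hits(hits, min_likelihood=0.0):
    """Sort hits by descending score, keeping those at or above a likelihood cutoff.

    Each hit is a dict with at least 'score' and 'likelihood' (in percent).
    """
    kept = [hit for hit in hits if hit["likelihood"] >= min_likelihood]
    return sorted(kept, key=lambda hit: hit["score"], reverse=True)


if __name__ == "__main__":
    # Made-up records for illustration only; not real GPMsDB-tk output.
    example_hits = [
        {"ncbi_name": "Genus_a species_a", "ribosomal_hit": 12, "score": 41.0, "likelihood": 95.0},
        {"ncbi_name": "Genus_b species_b", "ribosomal_hit": 3, "score": 17.5, "likelihood": 20.0},
        {"ncbi_name": "Genus_c species_c", "ribosomal_hit": 9, "score": 33.2, "likelihood": 80.0},
    ]
    for hit in rank_hits(example_hits, min_likelihood=50.0):
        print(hit["ncbi_name"], hit["score"], hit["likelihood"])
```

Sorting on score follows the description above that a higher score indicates a better match; the likelihood cutoff is an arbitrary choice for the example, not a recommended threshold.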

Bug Reports

Please report bugs through the GitHub issues system, or contact Yuji Sekiguchi (y.sekiguchi@aist.go.jp).

Copyright

This package is under the conditions of the GNU General Public License (Version 3). See LICENSE for further details.
