mobiusklein/mzmeta-poc

This is a proof-of-concept for embedding sample or experimental metadata in mzML or a compatible data model.

Theorized usage

This program explores what kind of information is portable between an SDRF file and an mzML file. An SDRF file includes a comment[data file] column that names the raw file (or mzML file) each row describes, so we can build the sampleList for an mzML file from the SDRF rows that correspond to its source RAW file.

msconvert path/to/RAW | mzmeta path/to/sdrf > path/to/mzML
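The row-selection step described above can be sketched in Python. This is only an illustration, not the program's actual implementation: the function name `rows_for_data_file` is hypothetical, and matching on the file stem (so that `a.raw` and `a.mzML` refer to the same run) is an assumption. SDRF files are tab-delimited and may repeat column names, so each row is kept as a list of (column, value) pairs rather than a dictionary.

```python
import csv
import io
from pathlib import PurePath


def rows_for_data_file(sdrf_text, data_file):
    """Return the SDRF rows whose comment[data file] names `data_file`.

    Hypothetical helper: files are compared by stem, ignoring case,
    which is an assumption made for this sketch.
    """
    reader = csv.reader(io.StringIO(sdrf_text), delimiter="\t")
    header = [h.strip() for h in next(reader)]
    # Locate the data file column without relying on header case.
    file_col = [h.lower() for h in header].index("comment[data file]")
    target = PurePath(data_file).stem.lower()

    matches = []
    for row in reader:
        if len(row) <= file_col:
            continue  # skip blank or truncated lines
        if PurePath(row[file_col]).stem.lower() == target:
            # Pairs, not a dict, so repeated columns are preserved.
            matches.append(list(zip(header, row)))
    return matches
```

In a multiplexed experiment, several rows share one data file, so the returned list holds one entry per sample in that run.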

Example

PXD017710 has been annotated with an SDRF file describing a TMT multiplexing experiment with multiple samples per MS run. If mzmeta were run while 20200219_KKL_SARS_CoV2_pool1_F1.raw was being converted to mzML, it would pull in details describing each of the samples from the rows listed for that data file in the SDRF file.

    <sampleList count="9">
      <sample id="sample_1" name="Sample 1">
        <cvParam accession="BFO:0000040" cvRef="BFO" name="material type" value="cell"/>
        <userParam type="xsd:string" name="assay name" value="run 1"/>
        <cvParam accession="EFO:0005521" cvRef="EFO" name="technology type" value="proteomic profiling by mass spectrometry"/>
        <cvParam accession="OBI:0100026" cvRef="OBI" name="organism" value="Homo sapiens"/>
        <cvParam accession="EFO:0000635" cvRef="EFO" name="organism part" value="colon"/>
        <userParam type="xsd:string" name="characteristics[sex]" value="male"/>
        <cvParam accession="EFO:0000246" cvRef="EFO" name="age" value="72"/>
        <cvParam accession="EFO:0000399" cvRef="EFO" name="developmental stage" value="adult"/>
        <cvParam accession="HANCESTRO:0000004" cvRef="HANCESTRO" name="ancestry category" value="caucasian"/>
        <cvParam accession="EFO:0000324" cvRef="EFO" name="cell type" value="not available"/>
        <cvParam accession="EFO:0000408" cvRef="EFO" name="disease" value="colon cancer"/>
        <userParam type="xsd:string" name="characteristics[cell line]" value="CaCo-2"/>
        <userParam type="xsd:string" name="characteristics[infect]" value="bridge mixed pool"/>
        <cvParam accession="EFO:0000721" cvRef="EFO" name="time" value="none"/>
        <cvParam accession="EFO:0002091" cvRef="EFO" name="biological replicate" value="1"/>
        <cvParam accession="MS:1002621" cvRef="MS" name="TMT reagent 131"/>
        <cvParam accession="PRIDE:0000577" cvRef="PRIDE" name="file uri" value="https://ftp.pride.ebi.ac.uk/pride/data/archive/2020/03/PXD017710/20200219_KKL_SARS_CoV2_pool1_F1.raw"/>
        <cvParam accession="MS:1000858" cvRef="MS" name="fraction identifier" value="1"/>
        <cvParam accession="MS:1001808" cvRef="MS" name="technical replicate" value="1"/>
        <userParam type="xsd:string" name="comment[modification parameters]" value="NT=TMT6plex;PP=Any N-term;AC=UNIMOD:737;MT=fixed"/>
        <userParam type="xsd:string" name="comment[modification parameters]" value="NT=Carbamidomethyl;TA=C;AC=UNIMOD:4;MT=fixed"/>
        <userParam type="xsd:string" name="comment[modification parameters]" value="NT=Oxidation;TA=M;AC=UNIMOD:35;MT=variable"/>
        <userParam type="xsd:string" name="comment[modification parameters]" value="NT=13C6-15N4;TA=R;AC=UNIMOD:267;MT=variable"/>
        <userParam type="xsd:string" name="comment[cleavage agent details]" value="NT=Trypsin"/>
        <userParam type="xsd:string" name="comment[cleavage agent details]" value="NT=Lys-C"/>
        <userParam type="xsd:string" name="comment[fragment mass tolerance]" value="not available"/>
        <userParam type="xsd:string" name="comment[precursor mass tolerance]" value="not available"/>
        <userParam type="xsd:string" name="factor value[infect]" value="bridge mixed pool"/>
        <userParam type="xsd:string" name="factor value[time]" value="none"/>
      </sample>

In some cases, I could map the columns to controlled vocabulary terms, which appear as cvParam elements. In others, lacking a clear non-lossy mechanism for encoding the information, I chose to round-trip the SDRF field details as-is in userParam elements.
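The split between mapped and round-tripped columns can be sketched as follows. This is an illustration under stated assumptions, not the program's actual code: `build_sample` and `CV_TERMS` are hypothetical names, and the SDRF column names used as keys are guesses. The accessions and term names are copied from the example output above.

```python
import xml.etree.ElementTree as ET

# Assumed mapping from SDRF column to (accession, term name).
# Accessions come from the example output; the keys are guesses.
CV_TERMS = {
    "characteristics[organism]": ("OBI:0100026", "organism"),
    "characteristics[organism part]": ("EFO:0000635", "organism part"),
    "characteristics[disease]": ("EFO:0000408", "disease"),
}


def build_sample(sample_id, name, fields):
    """Build a <sample> element from (column, value) pairs."""
    sample = ET.Element("sample", id=sample_id, name=name)
    for column, value in fields:
        term = CV_TERMS.get(column.lower())
        if term is not None:
            accession, term_name = term
            # Mapped column: emit a controlled vocabulary parameter.
            ET.SubElement(
                sample, "cvParam",
                accession=accession,
                cvRef=accession.split(":")[0],
                name=term_name,
                value=value,
            )
        else:
            # Unmapped column: keep the SDRF header and value as-is.
            ET.SubElement(
                sample, "userParam",
                type="xsd:string", name=column, value=value,
            )
    return sample
```

Calling `ET.tostring(build_sample(...), encoding="unicode")` serializes the element. Taking pairs rather than a dictionary lets repeated columns such as comment[modification parameters] each produce their own userParam.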
