htsfile gives nonsensical results for .fq and .fq.gz files · Issue #719 · samtools/htslib · GitHub
Closed
tseemann opened this issue Jun 22, 2018 · 1 comment
@tseemann

I do realise that htsfile does not specifically claim to work with FASTQ data, but I think it should be able to parse these common formats and return a correct answer:

$ htsfile reads.fq
reads.fq:       SAM version 1 sequence text

$ htsfile reads.fq.gz
reads.fq.gz:    SAM version 1 BGZF-compressed sequence data
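
For illustration only, here is a minimal Python sketch of the kind of content sniffing that would tell these formats apart. It is not htslib's detection code, and the thread does not say why htsfile reports SAM here. The sketch relies only on two well-known facts: FASTQ records and SAM header lines both start with `@`, and a FASTQ record is four lines long with a `+` separator on the third line. The names `sniff_lines` and `sniff_path` are hypothetical.

```python
import gzip
from itertools import islice

# SAM header record types; a SAM header line is one of these tags followed by a tab.
SAM_HEADER_TAGS = {"@HD", "@SQ", "@RG", "@PG", "@CO"}


def sniff_lines(lines):
    """Guess the format from the first few text lines (hypothetical heuristic)."""
    head = [ln.rstrip("\r\n") for ln in lines[:4]]
    if not head:
        return "unknown"
    first = head[0]
    if first.startswith(">"):
        return "FASTA"
    if first.startswith("@"):
        # "@HD\t..." and friends are SAM headers; "@read1" is not.
        if first[:3] in SAM_HEADER_TAGS and first[3:4] in ("", "\t"):
            return "SAM"
        # FASTQ: name, sequence, "+" separator, qualities of equal length.
        if (len(head) == 4 and head[2].startswith("+")
                and len(head[1]) == len(head[3])):
            return "FASTQ"
        return "unknown"
    # A headerless SAM record has at least 11 tab-separated fields.
    if first.count("\t") >= 10:
        return "SAM"
    return "unknown"


def sniff_path(path):
    """Sniff a file on disk, reading through gzip (BGZF is gzip-compatible)."""
    with open(path, "rb") as fh:
        is_gzip = fh.read(2) == b"\x1f\x8b"
    opener = gzip.open if is_gzip else open
    with opener(path, "rt") as fh:
        return sniff_lines(list(islice(fh, 4)))
```

Under these assumptions, a well-formed `reads.fq` or `reads.fq.gz` passed to `sniff_path` would be classified as FASTQ rather than SAM.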
@jmarshall
Member
jmarshall commented Jun 22, 2018

You already reported this or something very similar as #200 😄. I must dust off my “recognise particular text formats” branch…

jmarshall added a commit to jmarshall/htslib that referenced this issue Jun 26, 2018
Add htsExactFormat entries for these four file types to htslib/hts.h.

Place FASTQ in the sequence_data category as it can be considered to
be a representation of unmapped SAM. FASTA on the other hand is not,
as we don't operate on it in the same way as SAM/BAM/CRAM sequence data
and in particular we wouldn't ever want hts_open() to accept it, as
then swapping filename arguments in `samtools mpileup -f foo bar` etc
would be less detectable.

Fixes samtools#719.
jmarshall added a commit to jmarshall/htslib that referenced this issue Aug 30, 2019
Add htsExactFormat entries for these four file types to htslib/hts.h.

Place FASTQ in the sequence_data category as it can be considered to
be a representation of unmapped SAM. FASTA on the other hand is not,
as we don't operate on it in the same way as SAM/BAM/CRAM sequence data
and in particular we wouldn't ever want hts_open() to accept it, as
then swapping filename arguments in `samtools mpileup -f foo bar` etc
would be less detectable.

Add tests verifying that FASTA and FASTQ (and SAM) can be read
line-by-line via hts_open/hts_getline/hts_close.

Fixes samtools#719.
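
The categorisation argument in the commit message can be sketched in Python as follows. This is a toy model, not htslib code: `Category`, `EXACT_TO_CATEGORY` and `open_as_alignments` are hypothetical names, and the `OTHER` category given to FASTA is a placeholder, since the commit message says only that FASTA is not placed in sequence_data.

```python
from enum import Enum


class Category(Enum):
    SEQUENCE_DATA = "sequence data"
    OTHER = "other"  # placeholder: the commit does not say where FASTA goes


# Hypothetical mapping from detected exact format to category.
EXACT_TO_CATEGORY = {
    "sam": Category.SEQUENCE_DATA,
    "bam": Category.SEQUENCE_DATA,
    "cram": Category.SEQUENCE_DATA,
    "fastq": Category.SEQUENCE_DATA,  # treated as a form of unmapped SAM
    "fasta": Category.OTHER,          # deliberately not sequence data
}


def open_as_alignments(exact_format):
    """Accept only formats categorised as sequence data (toy stand-in for an opener)."""
    category = EXACT_TO_CATEGORY.get(exact_format, Category.OTHER)
    if category is not Category.SEQUENCE_DATA:
        raise ValueError(f"{exact_format} is not alignment/sequence data")
    return exact_format
```

With this split, passing a reference FASTA where alignments are expected (the swapped `-f foo bar` arguments mentioned above) fails immediately, while FASTQ is accepted alongside SAM, BAM and CRAM.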