output vcf files from phasing contain many repetitive header lines #30
Comments
Hi @ilivyatan, can you provide the header of 27909_pass.wf_snp.vcf.gz? Thanks
Any updates? Is the header readable?
Hi @ilivyatan, the header file you provided checks out fine. In the next version, additional conditions will be added to try to prevent issues with repetitive headers.
The attached file is only an excerpt of the header, up to the first column header line. AFTER that line, once the data starts, there are 913 additional header-like lines INTERLEAVED with the records, running throughout the file.
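For anyone who wants to check their own output for the same symptom, here is a minimal Python sketch that counts header-like lines appearing after the column header line. It is not part of longphase; the function names are made up for illustration, and it assumes that no data record starts with `##` and that the column header line starts with `#CHROM`.

```python
import gzip
from collections import Counter


def open_text(path):
    """Open a plain or gzip-compressed VCF as text (hypothetical helper)."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "r")


def count_interleaved_headers(lines):
    """Count '##' lines that occur after the first '#CHROM' line.

    Assumption: data records never start with '##', so any such line
    found in the body is a stray header line.
    """
    seen_column_header = False
    stray = Counter()
    for line in lines:
        if not seen_column_header:
            # Still inside the legitimate header block.
            if line.startswith("#CHROM"):
                seen_column_header = True
            continue
        if line.startswith("##"):
            stray[line.rstrip("\n")] += 1
    return stray
```

Usage would be something like `with open_text("phased_27909_mod.vcf") as fh: print(count_interleaved_headers(fh).most_common())`, which lists each stray line together with how often it repeats.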
It's in phased_27909_mod.vcf |
I noticed that the following line is missing in the phased header file you provided. Could you please confirm this for me?
By the way, could you test with the latest version, v1.5.2, to see whether the same issue occurs?
We have also encountered this problem. I've taken a look at
So it looks as though
I'd suggest if looking for header lines that these checks specifically look for
Hi @SamStudio8, thank you very much for digging into this issue. Thanks,
@SamStudio8 @ilivyatan The v1.7 release has resolved this repeated header issue. We forgot to mention it in the release notes, but SnpParser::writeLine has been rewritten as suggested by Sam.
Hi,
This is a really great tool and works very fast, thanks!
We've been trying it out in different ways, and the resulting phased VCF files often contain multiple header lines in the middle of the file.
For example, these lines appear 913 times in one file!
##FORMAT=<ID=PS,Number=1,Type=Integer,Description="Phase set identifier">
##longphaseVersion=1.5.1
##commandline="phase -s data/27909_pass.wf_snp.vcf.gz --sv-file data/27909_pass.wf_sv.vcf.gz --mod-file mod_27909.vcf -b data/27909_pass_merged.bam -r data/GRCh38.no_alt_analysis_set.fasta -t 16 -o phased_27909_mod --indels --ont "
This also occurs with joint phasing of just the SNV and SV VCF files.
Why do comments appear in the middle of the output VCF files?
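As a stopgap for files that have already been produced, something like the following Python sketch could drop the stray lines while keeping the real header. This is only a workaround under assumptions, not longphase's own behaviour: the function name is hypothetical, it treats every `##` line after the first `#CHROM` line as a stray header, and it assumes no data record begins with `##`.

```python
def strip_interleaved_headers(lines):
    """Yield VCF lines, skipping '##' lines found after the '#CHROM' line.

    Assumption: the genuine header ends at the first line starting with
    '#CHROM', and data records never start with '##'.
    """
    in_body = False
    for line in lines:
        if in_body:
            if line.startswith("##"):
                # Header-like line in the middle of the records: drop it.
                continue
        elif line.startswith("#CHROM"):
            # The column header closes the genuine header block.
            in_body = True
        yield line
```

For example, `out.writelines(strip_interleaved_headers(src))` with `src` and `out` as open text file handles would write a cleaned copy. Comparing the record count before and after is a reasonable sanity check that only header-like lines were removed.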