`vcf_parse()` fails · Issue #358 · samtools/htslib · GitHub

vcf_parse() fails #358

Closed
noporpoise opened this issue Mar 20, 2016 · 3 comments

Comments

@noporpoise
Contributor

vcf_parse fails with the following error:

[E::vcf_parse_format] Invalid character '?' in 'GT' FORMAT field at 1:25

Here is a short example:

char hdrstr[] = "##fileformat=VCFv4.2\n"
"##FILTER=<ID=PASS,Description=\"All filters passed\">\n"
"##fileDate=20160317\n"
"##FORMAT=<ID=GT,Number=1,Type=String,Description=\"Genotype\">\n"
"##reference=ref.fa\n"
"##contig=<ID=1,length=50,assembly=ref>\n"
"#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tMrT\n";

char vline[] = "1\t25\t.\tC\tT\t.\tPASS\t.\tGT\t0/1\n";

Code to parse strings hdrstr and vline into bcf_hdr_t and bcf1_t objects then print them out:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "htslib/vcf.h"

#define die(fmt,...) do { \
  fprintf(stderr, "[%s:%i] Error: %s() "fmt"\n", __FILE__, __LINE__, __func__, ##__VA_ARGS__); \
  exit(EXIT_FAILURE); \
} while(0)

...

fprintf(stdout, "%s%s\n", hdrstr, vline);
bcf1_t *v = bcf_init();
bcf_hdr_t *hdr = bcf_hdr_init("w");
if(bcf_hdr_parse(hdr, hdrstr) != 0) die("Cannot construct VCF header");
size_t vlen = strlen(vline);
kstring_t ks = {.l = vlen, .m = vlen, .s = vline};
if(vcf_parse(&ks, hdr, v) != 0) die("Cannot construct VCF entry: '%s'", vline);
memset(&ks, 0, sizeof(ks));
if(vcf_format(hdr, v, &ks) != 0) die("vcf_format failed");
int hdrslen = 0;
char *outhdr = bcf_hdr_fmt_text(hdr, 0, &hdrslen);
fprintf(stdout, "%s%s", outhdr, ks.s);
free(outhdr);
free(ks.s);
bcf_destroy(v);
bcf_hdr_destroy(hdr);

Compile:

gcc -Wall -Wextra -I../htslib -o vtest vtest.c ../htslib/libhts.a -lz

The vcf_parse() line fails with the above error.

Apologies: this issue was originally posted to bcftools issue tracker by mistake.

@noporpoise
Contributor Author

ed35e41 appears to be the offending commit. The above test passes on previous commits and fails on this one. Am I doing something wrong or is vcf_parse() really broken?

@jmarshall
Member

This is failing because vcf_parse() does not expect to see \n in its input string.
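
One way to act on this, as a minimal sketch: strip any trailing line terminator from the record text before handing it to vcf_parse(). The helper name chomp_line is hypothetical and not part of htslib; it assumes a mutable, NUL-terminated buffer such as the vline array in the example above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not an htslib function): remove trailing '\n' and
 * '\r' characters from a mutable, NUL-terminated buffer in place and
 * return the new string length. */
static size_t chomp_line(char *s)
{
    assert(s != NULL);
    size_t len = strlen(s);
    while (len > 0 && (s[len - 1] == '\n' || s[len - 1] == '\r'))
        len--;
    s[len] = '\0';   /* re-terminate at the trimmed length */
    return len;
}

/* Intended use with the example from this issue:
 *
 *   size_t vlen = chomp_line(vline);
 *   kstring_t ks = {.l = vlen, .m = vlen, .s = vline};
 *   if (vcf_parse(&ks, hdr, v) != 0) ...
 */
```

With the newline removed, the kstring_t length no longer covers the '\n', so the GT field seen by the parser ends at "0/1".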

@noporpoise
Contributor Author

That fixed it - thank you @jmarshall. I've suggested a pull request to update the function documentation.

jmarshall added a commit that referenced this issue Mar 29, 2016
Show the exact invalid character for unprintable characters too;
improves error messages such as in #358.
charles-plessy added a commit to Debian/htslib that referenced this issue Apr 25, 2016
HTSlib release 1.3.1: bug fix release, notably error checking

* Improved error checking and reporting, especially of I/O errors when
  writing output files (samtools#17, samtools#315, PR samtools#271, PR samtools#317).

* Build fixes for 32-bit systems; be sure to run configure to enable
  large file support and access to 2GiB+ files.

* Numerous VCF parsing fixes (samtools#321, samtools#322, samtools#323, samtools#324, samtools#325; PR samtools#370).
  Particular thanks to Kostya Kortchinsky of the Google Security Team
  for testing and numerous input parsing bug reports.

* HTSlib now prints an informational message when initially creating a
  CRAM reference cache in the default location under your $HOME directory.
  (No message is printed if you are using $REF_CACHE to specify a location.)

* Avoided rare race condition when caching downloaded CRAM reference sequence
  files, by using distinctive names for temporary files (in addition to O_EXCL,
  which has always been used).  Occasional corruption would previously occur
  when multiple tools were simultaneously caching the same reference sequences
  on an NFS filesystem that did not support O_EXCL (PR samtools#320).

* Prevented race condition in file access plugin loading (PR samtools#341).

* Fixed mpileup memory leak, so no more "[bam_plp_destroy] memory leak [...]
  Continue anyway" warning messages (samtools#299).

* Various minor CRAM fixes.

* Fixed documentation problems samtools#348 and samtools#358.