EOF block missing after haplotagging · Issue #105 · twolinin/longphase · GitHub

EOF block missing after haplotagging #105

Open
robert-a-forsyth opened this issue May 7, 2025 · 8 comments

Comments

@robert-a-forsyth

I ran longphase haplotag with the following command:

longphase haplotag \
    --threads 6 \
    -o CCS15_normal \
    --reference Homo_sapiens_assembly38.fasta \
    --snp-file CCS15_germline.vcf.gz \
    --bam CCS15_mapped.bam

When I run samtools quickcheck on CCS15_mapped.bam, I get no problems, but when I run it on CCS15_normal.bam, I receive "CCS15_normal.bam was missing EOF block when one should be present." This is causing issues with my downstream analysis. What could be happening?

My log file for the run is attached:

command.log

@twolinin
Owner
twolinin commented May 7, 2025

Hi @robert-a-forsyth,

Could you help me run the following commands to check the file?
samtools quickcheck -v CCS15_normal.bam
samtools view CCS15_normal.bam
samtools index CCS15_normal.bam

Thanks

@robert-a-forsyth
Author

samtools quickcheck -v CCS15_normal.bam ->

CCS15_normal.bam was missing EOF block when one should be present.
CCS15_normal.bam

as mentioned in the issue above.

The output of samtools view is too large to post.

The same goes for the index, but it also starts with this warning:
[W::bam_hdr_read] EOF marker is absent. The input is probably truncated
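
The warning above says the EOF marker is absent, but it does not say whether the file was cut off partway through a compressed block or whether every block is complete and only the final marker is missing. The Python sketch below is one way to tell the two cases apart by walking the BGZF block headers described in the SAM/BAM specification. The function name scan_bgzf_blocks is hypothetical, and this is not how samtools or longphase performs its checks; it only inspects block boundaries and does not decompress or validate the alignments.

```python
import struct

GZIP_MAGIC = b"\x1f\x8b\x08\x04"  # gzip ID1, ID2, CM=deflate, FLG=extra field present


def scan_bgzf_blocks(path):
    """Walk BGZF blocks in a file (hypothetical helper).

    Returns (complete_blocks, ended_cleanly). ended_cleanly is True when the
    file stops exactly on a block boundary and False when it stops inside a
    block or a header cannot be parsed. The count includes the EOF marker
    block if one is present.
    """
    complete_blocks = 0
    with open(path, "rb") as handle:
        while True:
            header = handle.read(12)
            if not header:
                return complete_blocks, True  # stopped on a block boundary
            if len(header) < 12 or header[:4] != GZIP_MAGIC:
                return complete_blocks, False
            xlen = struct.unpack("<H", header[10:12])[0]
            extra = handle.read(xlen)
            if len(extra) < xlen:
                return complete_blocks, False
            # Find the "BC" subfield, which stores the total block size minus 1.
            bsize = None
            pos = 0
            while pos + 4 <= xlen:
                si1, si2, slen = struct.unpack_from("<BBH", extra, pos)
                if si1 == 66 and si2 == 67 and slen == 2 and pos + 6 <= xlen:
                    bsize = struct.unpack_from("<H", extra, pos + 4)[0]
                    break
                pos += 4 + slen
            if bsize is None:
                return complete_blocks, False
            remaining = bsize + 1 - 12 - xlen  # compressed data plus CRC32 and ISIZE
            if remaining < 0:
                return complete_blocks, False
            body = handle.read(remaining)
            if len(body) < remaining:
                return complete_blocks, False  # file ends inside this block
            complete_blocks += 1
```

If scan_bgzf_blocks("CCS15_normal.bam") reports that the file ended cleanly, the data blocks themselves are whole and only the closing marker is missing; if it reports otherwise, the file stops partway through a block. Neither result identifies the cause on its own.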

@twolinin
Owner
twolinin commented May 7, 2025

It looks like this BAM file is corrupted. There could be several possible reasons, such as hard drive issues or incomplete file copying. You can try using samtools view to recover any alignments that are still usable. If you still have the raw reads, it’s better to redo the alignment.

Thanks

@robert-a-forsyth
Author

Yes, the issue is that only the output BAM is corrupted; the input BAM is not. When I rerun longphase, I consistently get a corrupted output BAM. Since samtools does not report an error with my input BAM, this suggests to me that the problem is with longphase.

@twolinin
Owner
twolinin commented May 7, 2025

Sorry, I replied too quickly. Just to confirm: when you ran the above checks on the input file CCS15_mapped.bam, there were no error messages, correct?

@robert-a-forsyth
Author

yes

@twolinin
Owner
twolinin commented May 7, 2025

samtools quickcheck only checks the file header and the end of the BAM, so I wanted to confirm whether there might still be issues with the original input. We'll look into whether haplotagging is causing any problems, and may consider adding a more thorough BAM validation step going forward.

Thanks
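
To illustrate the end-of-file part of such a check: a well-formed BAM ends with a fixed 28-byte empty BGZF block, which the SAM/BAM specification defines as the EOF marker. The Python sketch below compares the last 28 bytes of a file against that marker. The function name has_bgzf_eof is hypothetical, and this is only an approximation of what a quick check looks at, not the samtools implementation; it says nothing about the integrity of the records in between.

```python
import os

# The 28-byte empty BGZF block defined as the EOF marker in the SAM/BAM specification.
BGZF_EOF = bytes.fromhex(
    "1f 8b 08 04 00 00 00 00 00 ff 06 00 42 43"
    " 02 00 1b 00 03 00 00 00 00 00 00 00 00 00"
)


def has_bgzf_eof(path):
    """Return True if the file ends with the BGZF EOF marker (hypothetical helper)."""
    if os.path.getsize(path) < len(BGZF_EOF):
        return False
    with open(path, "rb") as handle:
        # Read exactly the final 28 bytes and compare them with the marker.
        handle.seek(-len(BGZF_EOF), os.SEEK_END)
        return handle.read() == BGZF_EOF
```

A file that triggers the "missing EOF block" message would be expected to return False here, while a file that passes on this point would return True.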

@ythuang0522
Collaborator

Hi @robert-a-forsyth, there could be several possibilities. Most of the time this error is caused by incomplete writing of the BAM. Have you rerun haplotag and reproduced the same EOF error? If the error is reproducible, it would be great if you could test our sample BAM/VCF and reference. Alternatively, you can provide a subset of your BAM and VCF for us to test (samtools view -b input.bam chr:start-end > output.bam).

PS: I noticed your reference genome contains a high number of chromosomes (e.g., alt, HLA, and decoy contigs). We generally follow Heng Li's recommendation to use the no-alt version unless you are using an ALT-aware aligner. But this may not be related to this EOF error.
