Question about the read length distribution for subtype graphs · Issue #79 · formbio/laava · GitHub

Question about the read length distribution for subtype graphs #79

Closed
dlbowie0 opened this issue Feb 4, 2025 · 9 comments
Labels
question Further information is requested

Comments

@dlbowie0
dlbowie0 commented Feb 4, 2025

The previous version of the pipeline that I was using is 79eaf54 (24/10/2024), and I pulled the updated version e425de0 (29/01/2025). I re-ran an analysis on an ssAAV dataset to compare the results. The results were the same between both versions of the pipeline (i.e. the same read counts and percentages were found for each subtype). However, after the update, the read length distribution graph no longer reflects the results table.

PREVIOUS VERSION
Image

CURRENT Version

Image

The results table shows that only a small number of reads were attributed to the full-length ssAAV. However, the graph from the current version makes it look as if the full-length ssAAV is much more prevalent.

@etal
Contributor
etal commented Feb 4, 2025

Could you show your results table, too? There is a bugfix between those two commits where previously the plot didn't account for effective_count (i.e. inferred number of SMRTbell adapters on each read); now it does, and I've reduced the bin size so the peaks are tighter. The plot now matches the violin plots earlier in the report.
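As a rough illustration of why weighting by effective_count can change the shape of the plot even when the read counts in the table stay the same, here is a minimal sketch in plain Python. It is not laava's plotting code: the function name `binned_counts`, the bin size, and the example reads with their effective_count values are all hypothetical.

```python
from collections import defaultdict


def binned_counts(reads, bin_size=100, weighted=True):
    """Sum reads into length bins.

    reads: iterable of (read_length, effective_count) pairs (hypothetical layout).
    weighted: if True, each read contributes its effective_count;
              if False, each read contributes 1.
    """
    totals = defaultdict(float)
    for length, effective_count in reads:
        # Left edge of the bin this read length falls into.
        bin_start = (length // bin_size) * bin_size
        totals[bin_start] += effective_count if weighted else 1
    return dict(sorted(totals.items()))


# Made-up example: two long reads with effective_count 2,
# three shorter reads with effective_count 1.
reads = [(4700, 2), (4720, 2), (2300, 1), (2350, 1), (2380, 1)]

unweighted = binned_counts(reads, weighted=False)
weighted = binned_counts(reads, weighted=True)

print("unweighted:", unweighted)
print("weighted:  ", weighted)
```

In this made-up case the shorter reads dominate the unweighted histogram, while the long-read bin is the tallest once the weights are applied. A results table that reports plain read counts would match the first view, not the second.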

etal added the "question" label (Further information is requested) Feb 4, 2025
@dlbowie0
Author
dlbowie0 commented Feb 5, 2025

I see that there was an update 2 days ago. I will re-run the analysis and get back to you if I have the same problem.

@dlbowie0
Author
dlbowie0 commented Feb 5, 2025

I tried the new update (commit 913a785) on the same dataset and this is the error that is generated.

Image

@etal
Contributor
etal commented Feb 12, 2025

Is this PacBio sequencing data? What do the sequencing read IDs in the BAM/SAM/FASTQ look like?

@dlbowie0
Author
dlbowie0 commented Feb 13, 2025

> Is this PacBio sequencing data? What do the sequencing read IDs in the BAM/SAM/FASTQ look like?

No, the data is from a nanopore sequencing run. This is an example of what a read ID looks like in the BAM file: f86ee2ba-e497-488d-8302-bc8e32926523

@etal
Contributor
etal commented Feb 13, 2025

OK, in that case laava's read ID regex, automatic sequencing run ID extraction, and effective_count logic won't work. Some small changes in the code will be needed to get it working.

Does that read ID have any internal structure to it, e.g. does each block separated by - have distinct meaning? I'm looking for an embedded sequencing run ID and sequence strand information if that's available.
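To make the difference between the two read ID styles concrete, here is a small sketch that tells a PacBio-style CCS read name apart from a UUID-style ID like the one quoted above. It is not laava's actual regex. The pattern assumes the common `<movie>/<ZMW>/ccs` layout with an optional `/fwd` or `/rev` suffix, treats the movie name as the run ID, and uses a made-up PacBio example; `describe_read_id` is a hypothetical helper.

```python
import re
import uuid

# Assumed PacBio CCS read-name layout: <movie>/<ZMW hole number>/ccs,
# optionally followed by /fwd or /rev for by-strand reads.
PACBIO_CCS = re.compile(
    r"^(?P<movie>m[^/]+)/(?P<zmw>\d+)/ccs(?:/(?P<strand>fwd|rev))?$"
)


def describe_read_id(read_id):
    """Return what can be inferred from a read ID (hypothetical helper)."""
    match = PACBIO_CCS.match(read_id)
    if match:
        # Assumption: the movie name serves as the sequencing run ID.
        return {
            "platform": "pacbio",
            "run_id": match["movie"],
            "strand": match["strand"],
        }
    try:
        # A UUID carries no run or strand information.
        uuid.UUID(read_id)
    except ValueError:
        return {"platform": "unknown", "run_id": None, "strand": None}
    return {"platform": "uuid", "run_id": None, "strand": None}


# The first ID is the one quoted in this thread; the second is made up.
for rid in (
    "f86ee2ba-e497-488d-8302-bc8e32926523",
    "m64011_190830_220126/1234/ccs/fwd",
):
    print(rid, "->", describe_read_id(rid))
```

For a UUID-style ID the sketch returns no run ID or strand, so any logic that depends on those fields would need a fallback, for example a run ID supplied by the user.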

@dlbowie0
Author

> OK, in that case laava's read ID regex, automatic sequencing run ID extraction, and effective_count logic won't work. Some small changes in the code will be needed to get it working.
>
> Does that read ID have any internal structure to it, e.g. does each block separated by - have distinct meaning? I'm looking for an embedded sequencing run ID and sequence strand information if that's available.

No, the blocks don't signify anything. Each read is simply assigned a random ID.

@dlbowie0
Author

I tested the new update (commit 16c61dc) and the graph still does not reflect the results table.

@dlbowie0
Author

Here is the results table and here is the graph:

Image

Image
