kallisto bustools with reference transcriptome #45
Comments
kb ref makes a reference transcriptome from a genome fasta and gtf. If you already have a transcriptome, there's no need to use kb ref. Simply use |
Thanks for the explanation!
Dear Delaney, sorry for re-opening the issue. In order to use the Moreover, is there another option to perform the counting without using this file? I would like to use a de novo assembled transcriptome, so I don't have this piece of information. Thanks!
It's just a tab-separated file with the transcript name in the first column and the gene name in the second column. You need this file to perform the counting, but if you want, you can pretend that each transcript is its own gene (i.e. put the transcript name in both columns). The main issue is that kb count will discard all multimappers (i.e. if a UMI maps to more than one gene, that UMI will not be counted). Thus, multimapping might be a big issue if you pretend each transcript belongs to a different gene. There are ways around this (e.g. if you use the --tcc option in kb count, an EM algorithm will try to probabilistically figure out what to do with the multimappers). It basically boils down to: if you have a UMI associated with transcripts A, B, and C but have no gene-level information, how do you want to count that UMI?
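For anyone who wants to try the workaround, here is a minimal Python sketch of it, not an official kb-python tool. It reads the headers of a transcriptome FASTA and writes a two-column tab-separated file with the transcript name in both columns. The function names (`identity_t2g`, `write_t2g`, `umi_gene`) are hypothetical, and the code assumes the transcript name is the header text up to the first whitespace, which should be checked against the names in your index. The small `umi_gene` helper is only a toy illustration of the discard rule described above, not the actual bustools implementation.

```python
import io


def identity_t2g(fasta_lines):
    """Yield (transcript, gene) pairs in which each transcript is its own gene."""
    for line in fasta_lines:
        if line.startswith(">"):
            # Assumption: the transcript name is the header up to the first whitespace.
            fields = line[1:].split()
            if fields:
                yield fields[0], fields[0]


def write_t2g(fasta_lines, out):
    """Write one 'transcript<TAB>gene' line per FASTA record; return the record count."""
    n = 0
    for transcript, gene in identity_t2g(fasta_lines):
        out.write(f"{transcript}\t{gene}\n")
        n += 1
    return n


def umi_gene(transcripts, t2g):
    """Toy version of the rule above: keep a UMI only if it maps to exactly one gene."""
    genes = {t2g[t] for t in transcripts}
    return genes.pop() if len(genes) == 1 else None


# Tiny in-memory example; for a real file, pass open("transcriptome.fa") instead.
fasta = io.StringIO(">A len=300\nACGT\n>B\nGGCC\n>C some description\nTTAA\n")
out = io.StringIO()
n_written = write_t2g(fasta, out)
t2g_text = out.getvalue()

# Build the transcript -> gene lookup from the file we just wrote.
identity_map = dict(line.split("\t") for line in t2g_text.splitlines())

# A UMI compatible with A, B and C is dropped when every transcript is its own gene...
discarded = umi_gene(["A", "B", "C"], identity_map)
# ...but kept if real gene-level information says all three belong to one gene.
kept = umi_gene(["A", "B", "C"], {"A": "G1", "B": "G1", "C": "G1"})
```

With the identity mapping, any UMI that is compatible with more than one transcript counts as a multimapper, which is why clustering similar transcripts first, or using the --tcc option, may matter for a de novo assembly.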
Hi Delaney, thank you very much for your explanation! Is there a way to not discard multimappers, and instead assign the count to the transcript with the most reliable alignment or something similar? To explain my context a little bit, I'm working with a non-model organism and I've obtained my own curated reference transcriptome. Now I would like to use it for single-cell analysis, so I was searching for a counting algorithm that works with a reference transcriptome. For the time being, I think I'll use your workaround to see how it behaves, and maybe perform sequence clustering on my transcriptome prior to the counting. I know it's not the perfect procedure, but I'll let you know how it goes :)
Dear team,
I'm a little bit confused about the build index step. The manual says that it builds a transcriptome index but needs as input a genomic fasta and a gff. I would like to create the count table using a reference transcriptome. Is this possible with kallisto + bustools?
Thank you,
Marta.