Closed
Description
The Python and R scripts emit several tabular (TSV) outputs, with significant duplication between them. Yet the report generation scripts depend on an .Rdata input that captures the R environment, including both the TSV contents and additional dataframes that are not serialized anywhere else.
Review the existing TSVs and the dataframes used for report generation, and change the scripts to emit TSVs that follow a more practical tabular schema. Those tables may include:
- Sample ID and other relevant input metadata (see Feature request: more annotation in final report #12)
- Read alignment info relative to the vector annotation
- Sequence variants/errors from CIGAR strings
- Vector genome type/subtype labels currently computed within the R scripts, e.g. "snapback", "other"
- Summary counts at each classification level
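As a starting point for discussion, the tables listed above could be sketched roughly as follows. This is only a minimal illustration, not the existing scripts' behavior: every column name (`sample_id`, `read_id`, `vector_type`, etc.), the helper functions, and the example rows are hypothetical, and which of "snapback" and "other" counts as a type or a subtype is shown here purely for illustration.

```python
import csv
import io
import re
from collections import Counter

# Hypothetical per-read schema; these column names are assumptions,
# not names taken from the current Python or R scripts.
READ_COLUMNS = [
    "sample_id", "read_id", "ref_start", "ref_end",
    "cigar", "vector_type", "vector_subtype",
]
SUMMARY_COLUMNS = ["sample_id", "vector_type", "vector_subtype", "read_count"]

# Matches one CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def cigar_variant_counts(cigar):
    """Sum the lengths of mismatch (X), insertion (I) and deletion (D) operations."""
    counts = Counter()
    for length, op in CIGAR_OP.findall(cigar):
        if op in "XID":
            counts[op] += int(length)
    return counts


def write_tsv(handle, columns, rows):
    """Write dict rows as tab-separated text with a header line."""
    writer = csv.DictWriter(
        handle, fieldnames=columns, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)


def summarize(rows):
    """Count reads per (sample, type, subtype) combination."""
    counts = Counter(
        (r["sample_id"], r["vector_type"], r["vector_subtype"]) for r in rows
    )
    return [
        {"sample_id": s, "vector_type": t, "vector_subtype": st, "read_count": n}
        for (s, t, st), n in sorted(counts.items())
    ]


# Made-up example rows, for illustration only.
example_rows = [
    {"sample_id": "S1", "read_id": "r1", "ref_start": 10, "ref_end": 500,
     "cigar": "10M2X3I5M1D", "vector_type": "other", "vector_subtype": "snapback"},
    {"sample_id": "S1", "read_id": "r2", "ref_start": 12, "ref_end": 480,
     "cigar": "20M", "vector_type": "other", "vector_subtype": "snapback"},
    {"sample_id": "S1", "read_id": "r3", "ref_start": 0, "ref_end": 300,
     "cigar": "15M1X", "vector_type": "other", "vector_subtype": "other"},
]

reads_tsv = io.StringIO()
write_tsv(reads_tsv, READ_COLUMNS, example_rows)

summary_tsv = io.StringIO()
write_tsv(summary_tsv, SUMMARY_COLUMNS, summarize(example_rows))
```

With one per-read table and one summary table derived from it, the report step could read plain TSVs instead of a saved R environment. The exact split of tables and columns would still need to be worked out against the existing outputs.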