The processes for splitting files are running very slowly with large numbers of input samples · Issue #120 · nf-core/rnasplice · GitHub
The processes for splitting files are running very slowly with large numbers of input samples #120
Closed
@amizeranschi

Hello,

We are attempting to run this pipeline with a rather complex scenario: 850 contrasts and 3400 samples. The processes SPLIT_FILES_TPM, SPLIT_FILES_IOE and SPLIT_FILES_IOI appear to be running very slowly.

For example, SPLIT_FILES_TPM has been running for 12 hours and has produced only 61 TPM files so far. At this rate, the process would take around 4 weeks to finish, before we could even attempt to run SUPPA2...
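For reference, the 4-week figure is a simple linear extrapolation from the progress so far. The sketch below reproduces it under the assumption that SPLIT_FILES_TPM writes one TPM file per sample (3400 in total), which I have not verified against the pipeline code; the function name is only illustrative.

```python
# Rough extrapolation of the SPLIT_FILES_TPM runtime from the observed progress.
# Assumption (unverified): the process writes one TPM file per sample.

def estimate_total_hours(files_done: int, hours_elapsed: float, files_expected: int) -> float:
    """Linearly extrapolate the total runtime from the observed throughput."""
    if files_done <= 0:
        raise ValueError("need at least one finished file to extrapolate")
    hours_per_file = hours_elapsed / files_done
    return hours_per_file * files_expected


# Numbers from this run: 61 files after 12 hours, 3400 samples in the sample sheet.
total_hours = estimate_total_hours(files_done=61, hours_elapsed=12, files_expected=3400)
total_weeks = total_hours / (24 * 7)
print(f"{total_hours:.0f} h, about {total_weeks:.1f} weeks")  # 669 h, about 4.0 weeks
```

If the process actually writes one file per contrast and condition rather than per sample, the expected file count, and therefore the estimate, would be different.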

We have successfully run these samples through nf-core/rnaseq (which took less than a week on 512 cores and 2 TB of RAM), and nf-core/differentialabundance (which only took 2 hours to run DESeq2). We now intend to run SUPPA2 via this pipeline.

Any advice for speeding things up would be much appreciated. This is the command that we are running:

nextflow run nf-core/rnasplice -r 1.0.2 -c custom.config -params-file params.yml --igenomes_ignore --genome null \
--input /path/to/sample-sheet-rnasplice.csv \
--outdir run-rnasplice -profile docker --source salmon_results \
--fasta /path/to/genome/gencode_v45_spike-ins.fasta \
--gtf /path/to/genome/gencode_v45_spike-ins.gtf \
--star_index /path/to/genome/index/star \
--salmon_index /path/to/genome/index/salmon \
--dexseq_exon false --dexseq_dtu false \
--suppa --suppa_per_local_event \
--contrasts /path/to/contrasts-rnasplice.csv \
--sashimi_plot false
