input file name collision · Issue #180 · nf-core/scdownstream · GitHub

input file name collision #180


Closed
weishwu opened this issue May 22, 2025 · 2 comments · Fixed by #181
Labels
bug Something isn't working

Comments

weishwu commented May 22, 2025

This is my first time using this pipeline, so I may have made a simple mistake.
This is the error I got:

  Process `NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:QUALITY_CONTROL:AMBIENT_RNA_REMOVAL:CELDA_DECONTX` input file name collision -- There are multiple input files for each of the following file names: 11670-OMR-1.h5ad

Below is my sample sheet. Each sample has a "sample_filtered_feature_bc_matrix.h5" and a "sample_raw_feature_bc_matrix.h5".

sample,filtered,unfiltered
11670-OMR-1,inputs/11670-OMR/10x_analysis_11670-OMR/Sample_11670-OMR-Pool01/per_sample_outs/11670-OMR-1/count/sample_filtered_feature_bc_matrix.h5,inputs/11670-OMR/10x_analysis_11670-OMR/Sample_11670-OMR-Pool01/per_sample_outs/11670-OMR-1/count/sample_raw_feature_bc_matrix.h5
11670-OMR-10,inputs/11670-OMR/10x_analysis_11670-OMR/Sample_11670-OMR-Pool01/per_sample_outs/11670-OMR-10/count/sample_filtered_feature_bc_matrix.h5,inputs/11670-OMR/10x_analysis_11670-OMR/Sample_11670-OMR-Pool01/per_sample_outs/11670-OMR-10/count/sample_raw_feature_bc_matrix.h5
11670-OMR-11,inputs/11670-OMR/10x_analysis_11670-OMR/Sample_11670-OMR-Pool01/per_sample_outs/11670-OMR-11/count/sample_filtered_feature_bc_matrix.h5,inputs/11670-OMR/10x_analysis_11670-OMR/Sample_11670-OMR-Pool01/per_sample_outs/11670-OMR-11/count/sample_raw_feature_bc_matrix.h5
11670-OMR-5,inputs/11670-OMR/10x_analysis_11670-OMR/Sample_11670-OMR-Pool01/per_sample_outs/11670-OMR-5/count/sample_filtered_feature_bc_matrix.h5,inputs/11670-OMR/10x_analysis_11670-OMR/Sample_11670-OMR-Pool01/per_sample_outs/11670-OMR-5/count/sample_raw_feature_bc_matrix.h5
11670-OMR-6,inputs/11670-OMR/10x_analysis_11670-OMR/Sample_11670-OMR-Pool01/per_sample_outs/11670-OMR-6/count/sample_filtered_feature_bc_matrix.h5,inputs/11670-OMR/10x_analysis_11670-OMR/Sample_11670-OMR-Pool01/per_sample_outs/11670-OMR-6/count/sample_raw_feature_bc_matrix.h5
11670-OMR-8,inputs/11670-OMR/10x_analysis_11670-OMR/Sample_11670-OMR-Pool01/per_sample_outs/11670-OMR-8/count/sample_filtered_feature_bc_matrix.h5,inputs/11670-OMR/10x_analysis_11670-OMR/Sample_11670-OMR-Pool01/per_sample_outs/11670-OMR-8/count/sample_raw_feature_bc_matrix.h5
12346-OMR-10,inputs/12346-OMR/10x_analysis_12346-OMR/Sample_12346-OMR-Pool01/per_sample_outs/12346-OMR-10/count/sample_filtered_feature_bc_matrix.h5,inputs/12346-OMR/10x_analysis_12346-OMR/Sample_12346-OMR-Pool01/per_sample_outs/12346-OMR-10/count/sample_raw_feature_bc_matrix.h5
12346-OMR-11,inputs/12346-OMR/10x_analysis_12346-OMR/Sample_12346-OMR-Pool01/per_sample_outs/12346-OMR-11/count/sample_filtered_feature_bc_matrix.h5,inputs/12346-OMR/10x_analysis_12346-OMR/Sample_12346-OMR-Pool01/per_sample_outs/12346-OMR-11/count/sample_raw_feature_bc_matrix.h5
12346-OMR-12,inputs/12346-OMR/10x_analysis_12346-OMR/Sample_12346-OMR-Pool01/per_sample_outs/12346-OMR-12/count/sample_filtered_feature_bc_matrix.h5,inputs/12346-OMR/10x_analysis_12346-OMR/Sample_12346-OMR-Pool01/per_sample_outs/12346-OMR-12/count/sample_raw_feature_bc_matrix.h5

And my command-line:

nextflow run nf-core/scdownstream \
  -r dev \
  --input ./sample_sheet.csv \
  --outdir ./results \
  --ambient_removal decontx \
  --doublet_detection scrublet \
  --doublet_detection_threshold 1 \
  --integration_methods seurat \
  -c nextflow_resource.cfg \
  -profile lh_standard \
  -resume

At first I thought the error occurred because the h5 files have identical names across samples (one shared name for all filtered files and another for all unfiltered files). However, the same error appeared after I renamed the files.

Thanks.

weishwu added the bug label May 22, 2025
Collaborator
nictru commented May 22, 2025

This is an actual bug in the pipeline. It occurs because you provide .h5 files, and the first thing the pipeline does is convert each of them to <sample>.h5ad. It then passes the converted files on to the next process, which is decontX in your case.

The problem is that each sample has two files (filtered and unfiltered) with the same sample id, so both are converted to the same file name. The h5 to h5ad conversion should include the filtered/unfiltered information in the file name to prevent this.

Thanks for the issue, should be a quick one.
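
To illustrate the explanation above, here is a minimal Python sketch of how naming the converted files after the sample id alone makes the filtered and unfiltered files collide, and how adding the filtered/unfiltered label avoids it. The function names and the `<sample>_<kind>.h5ad` pattern are assumptions for illustration only; the naming actually used by the fix in #181 may differ.

```python
from collections import Counter


def staged_names(sample, kinds, include_kind):
    """Return the .h5ad names a conversion step would emit for one sample.

    Both naming schemes here are hypothetical illustrations, not the
    pipeline's actual templates.
    """
    names = []
    for kind in kinds:
        # Without the kind, every file of a sample gets the same stem.
        stem = f"{sample}_{kind}" if include_kind else sample
        names.append(f"{stem}.h5ad")
    return names


def collisions(names):
    """Names that appear more than once among the inputs of a single task."""
    return sorted(name for name, count in Counter(names).items() if count > 1)


kinds = ["filtered", "unfiltered"]
before = staged_names("11670-OMR-1", kinds, include_kind=False)
after = staged_names("11670-OMR-1", kinds, include_kind=True)

print(collisions(before))  # ['11670-OMR-1.h5ad']
print(collisions(after))   # []
```

This also suggests why renaming the original .h5 inputs did not help: in this sketch the colliding name is derived from the sample id, not from the input file name.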

nictru added a commit that referenced this issue May 22, 2025
nictru linked a pull request May 22, 2025 that will close this issue
Collaborator
nictru commented May 22, 2025

Please make sure to pull the latest version of the dev branch; the problem should no longer occur.

fasterius pushed a commit to NBISweden/scdownstream that referenced this issue May 27, 2025