Feature request: add max Filesize. · Issue #24 · jvirkki/dupd · GitHub

Feature request: add max Filesize. #24


Open
tmsd2001 opened this issue Dec 25, 2019 · 4 comments

@tmsd2001

Cleaning a disk by minimum file size is pretty easy to do by hand, but gigabytes of small files are hard to compare manually. With both min and max options, everyone could work the way they like.

@jvirkki
Owner
jvirkki commented Dec 25, 2019

This will be easy to add.

I'm curious to hear more about the use case. Do you have a large quantity of big files that are known to be unique, so you'd like the scan to go faster by ignoring them?
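
For illustration only (this is a sketch, not dupd's actual code, and all names below are hypothetical): a maximum-size limit would amount to one more predicate applied while walking files, alongside the existing minimum-size check.

    #include <stdint.h>
    #include <sys/stat.h>

    /* Hypothetical sketch: should a file enter the scan, given a
     * minimum size and an optional maximum size (0 = no upper limit)?
     * Not dupd's actual implementation. */
    static int want_file(const struct stat * st,
                         uint64_t min_size, uint64_t max_size)
    {
      uint64_t size = (uint64_t)st->st_size;
      if (size < min_size) { return 0; }
      if (max_size > 0 && size > max_size) { return 0; }
      return 1;
    }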

@jvirkki jvirkki self-assigned this Dec 25, 2019
@tmsd2001
Author

On my disk there are 299 files larger than 1 GB, and while I haven't counted them yet, there will be millions of files smaller than 10 bytes from backups. The 299 large files are currently being hashed over the network and are not yet finished; it has already taken 280 minutes. I think the big files are blocking the process for now. If I have more details I can write them.

@jvirkki
Owner
jvirkki commented Dec 26, 2019

Scanning files mounted over the network (whether NFS or other) will be slow, no matter what. If there is any way to run the scan on the host which has the disk(s) locally, that would be the best approach.

If some of the files are local and some are network mounted (not sure if that is your case), you could exclude the remote ones using the -X option (see docs).

I'll add an option to exclude files larger than a given size. That said, if you have millions of files smaller than 10 bytes being read over the network, that will also be slow, likely more so than the large files (depending on how large they are and on network speed). If you're not doing so already, you might want to exclude the smallest files with -m 10 or whichever size limit makes sense for you.
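
For example (a rough sketch: -m is mentioned above, while the -p path flag and the /files directory are assumptions, so check dupd scan --help for the exact spelling):

    dupd scan -p /files -m 10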

@tmsd2001
Author

I tried it on the local PC; it is faster there even though that machine has significantly less power. With more memory it would be even faster, since I had to add a larger swap partition. In the networked setup, the network is the bottleneck.
I also noticed that many of the small files are pictures or icons. Is there an option to search only for jpg or png and so on?
