Open
Description
Since hashing already is getting done, why not save the hash to the report database.
This would allow me to merge by hash two separate dupd runs on different external drives.
I can import the sqlitedb's into python/pandas (since I'm not familiar with SQL) merge them and get a new list of possible duplicates.
eg..
`
import pandas as pd
import sqlite3
con = sqlite3.connect("dupd.db3")
dupx = pd.read_sql('SELECT * FROM duplicates WHERE each_size > 10000;', con)
`
I've tried to modify the code myself, to add hashes, but not having used C for 20 years, it's not been very successful.
I suspect it would not be difficult, and possibly quite useful to other users too.
Metadata
Metadata
Assignees
Labels
No labels