Search getting 0 results · Issue #5 · moloch--/leakdb · GitHub

Search getting 0 results #5


Open
enzyro opened this issue Aug 28, 2020 · 3 comments

enzyro commented Aug 28, 2020

Describe the bug
I'm trying to match 3M lines of emails against a dataset of 700M lines (roughly 50GB). Everything appears to run smoothly, but after a bunch of tests I can't get a single match returned, even for emails/users that I know for sure are in my dataset.
All the processes are running on an AWS instance (so I followed the server deployment steps). I tried building from source and using the released version, and I tried splitting my data into smaller files, but still no results. I also tried launching the server version and querying it over HTTP.

The really weird thing is that when I run a search on the test folder you provide, it works properly with your provided indexes.
But when I regenerate the indexes for small.txt following the doc from the wiki, I don't get any results. When I diff my generated index against the one you provide, they differ, so I'm guessing it has something to do with the index generation/sorting.
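
To narrow down where a regenerated index diverges from the provided one, a byte-level comparison can help. The following is a minimal sketch, not part of leakdb: it makes no assumption about the index layout, and the function name and file paths are hypothetical. It reports the offset of the first differing byte, which shows whether the two files differ from the very start (for example, different entry contents) or only partway through (for example, a different sort order).

```python
def first_difference(path_a, path_b, chunk_size=1 << 20):
    """Return the byte offset of the first difference, or None if identical.

    Hypothetical helper: it treats both files as opaque bytes and does
    not know anything about leakdb's index format.
    """
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if a != b:
                # Find the exact differing byte inside this chunk.
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        return offset + i
                # No differing byte in the overlap: one file is a
                # prefix of the other, so they diverge where it ends.
                return offset + min(len(a), len(b))
            if not a:
                # Both files ended at the same point with equal content.
                return None
            offset += len(a)
```

Calling `first_difference("my/email.idx", "provided/email.idx")` (both paths are placeholders) returns `None` for identical files. Comparing the two file sizes as well indicates whether the same number of entries was written.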

To Reproduce
Steps to reproduce the behavior:

  1. ./leakdb-curator --format colon-newline --recursive --target ./large-folder-containing-all --output normalized.json
  2. ./leakdb-curator --json normalized.json
  3. ./leakdb-curator search -i leakdb/email.idx -j leakdb/bloomed.json -v "xxx@gmail.com"
    Response : Found 0 results ..
  4. grep -F "xxx@gmail.com" bloomed.json
    Response : {"email": "xxx", "user": "xxx", "domain": "gmail.com", "password": "xxx"}
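
One caveat about step 4: `grep -F` matches the address anywhere in a line, so it also succeeds when the stored value differs in case or carries extra whitespace. A stricter check is sketched below. It assumes, based only on the grep output above, that bloomed.json holds one JSON object per line with an `email` field; the function name is hypothetical, and whether leakdb's search compares the whole value exactly is not confirmed in this thread.

```python
import json


def find_email(path, email, normalize=False):
    """Yield (line_number, record) for records whose email field matches.

    Assumes one JSON object per line (an assumption drawn from the grep
    output, not from leakdb documentation). With normalize=True the
    comparison ignores case and surrounding whitespace.
    """
    wanted = email.strip().lower() if normalize else email
    with open(path, "r", encoding="utf-8", errors="replace") as fh:
        for line_number, line in enumerate(fh, start=1):
            line = line.strip()
            if not line:
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip lines that are not valid JSON
            if not isinstance(record, dict):
                continue
            value = record.get("email")
            if not isinstance(value, str):
                continue
            if normalize:
                value = value.strip().lower()
            if value == wanted:
                yield line_number, record
```

If `list(find_email("bloomed.json", "xxx@gmail.com"))` is empty while the same call with `normalize=True` returns records, the stored value is not byte-for-byte equal to the query, which would be one possible reason for a lookup miss.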

I really wish I could get this to work because it looks amazing. I'm at your disposal for any questions/tests you want me to run.

Enzyro

moloch-- self-assigned this Aug 28, 2020
@flyingdan

Hey, just wondering if there are any updates on this issue. I'm making my way through the code to see if anything jumps out at me too.

@moloch-- (Owner)

Sorry, I haven't had much time to dig into it. I've been very busy. Let me know if you find something!

GlitchWitch commented Sep 15, 2022

@enzyro @flyingdan Did either of you ever find a solution to this? I am running into the same issue using the latest Linux release.
