Initial implementation of flat vector search by lintool · Pull Request #2510 · castorini/anserini · GitHub

Initial implementation of flat vector search #2510


Merged: 7 commits, May 30, 2024

Conversation

@lintool (Member) commented May 29, 2024
  • First installment: initial implementation, checked against the BEIR BGE regressions (cached queries).
  • No quantization yet.
  • No ONNX support yet.
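
For readers new to the term, "flat" search scores the query vector against every document vector exhaustively, with no graph or other approximate index structure. The following is a minimal sketch of that idea only, not the Lucene-based code in this PR: the function name `flat_search`, the inner-product scoring, and the in-memory list of vectors are all assumptions made for illustration.

```python
import heapq
from typing import List, Sequence, Tuple


def flat_search(
    query: Sequence[float],
    docs: Sequence[Tuple[str, Sequence[float]]],
    k: int,
) -> List[Tuple[str, float]]:
    """Exhaustively score every document vector and return the top-k hits.

    Hypothetical helper: scoring is a plain inner product, which is an
    assumption for this sketch rather than a statement about the PR.
    """
    scored = (
        (sum(q * d for q, d in zip(query, vector)), doc_id)
        for doc_id, vector in docs
    )
    # nlargest keeps only k candidates in memory while scanning all docs.
    top = heapq.nlargest(k, scored)
    return [(doc_id, score) for score, doc_id in top]


if __name__ == "__main__":
    toy_docs = [
        ("d1", [1.0, 0.0]),
        ("d2", [0.0, 1.0]),
        ("d3", [0.6, 0.8]),
    ]
    for doc_id, score in flat_search([1.0, 0.0], toy_docs, k=2):
        print(doc_id, score)
```

The cost is linear in the number of documents per query, which is the trade-off against approximate indexes: every vector is scored, so results are exact with respect to the scoring function.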

The following regression runs work:

python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-arguana.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-arguana.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-bioasq.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-bioasq.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-climate-fever.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-climate-fever.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-fever.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-fever.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-fiqa.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-fiqa.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-hotpotqa.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-hotpotqa.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-nfcorpus.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-nfcorpus.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-nq.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-nq.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-quora.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-quora.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-robust04.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-robust04.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-scidocs.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-scidocs.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-scifact.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-scifact.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-signal1m.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-signal1m.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-trec-covid.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-trec-covid.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-trec-news.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-trec-news.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.flat
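
The commands above differ only in the corpus name, so they can be generated rather than typed out. This is a small sketch under that assumption; the helper names `regression_name` and `regression_command` are hypothetical, and the script only prints the command lines (it does not invoke `run_regression.py` itself).

```python
# Corpus names copied from the commands listed above.
BEIR_CORPORA = [
    "arguana", "bioasq", "climate-fever",
    "cqadupstack-android", "cqadupstack-english", "cqadupstack-gaming",
    "cqadupstack-gis", "cqadupstack-mathematica", "cqadupstack-physics",
    "cqadupstack-programmers", "cqadupstack-stats", "cqadupstack-tex",
    "cqadupstack-unix", "cqadupstack-webmasters", "cqadupstack-wordpress",
    "dbpedia-entity", "fever", "fiqa", "hotpotqa", "nfcorpus", "nq",
    "quora", "robust04", "scidocs", "scifact", "signal1m",
    "trec-covid", "trec-news", "webis-touche2020",
]


def regression_name(corpus: str) -> str:
    """Build the regression identifier used in the commands above."""
    return f"beir-v1.0.0-{corpus}.bge-base-en-v1.5.flat"


def regression_command(corpus: str) -> str:
    """Return the shell command line for one corpus, including the log redirect."""
    name = regression_name(corpus)
    return (
        "python src/main/python/run_regression.py --index --verify --search "
        f"--regression {name} >& logs/log.{name}"
    )


if __name__ == "__main__":
    # Print one command per corpus; pipe the output to a shell to run them.
    for corpus in BEIR_CORPORA:
        print(regression_command(corpus))
```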

@lintool marked this pull request as draft May 29, 2024 20:52

codecov bot commented May 29, 2024

Codecov Report

Attention: Patch coverage is 86.09023%, with 37 lines in your changes missing coverage. Please review.

Project coverage is 67.07%. Comparing base (d721fc8) to head (d4a8588).
Report is 1 commit behind head on master.

| Files | Patch % | Lines |
|---|---|---|
| ...ain/java/io/anserini/search/FlatDenseSearcher.java | 75.30% | 17 Missing and 3 partials ⚠️ |
| ...nserini/index/codecs/AnseriniFlatVectorFormat.java | 75.60% | 8 Missing and 2 partials ⚠️ |
| ...ava/io/anserini/search/SearchFlatDenseVectors.java | 94.18% | 4 Missing and 1 partial ⚠️ |
| ...c/main/java/io/anserini/index/AbstractIndexer.java | 0.00% | 1 Missing ⚠️ |
| .../java/io/anserini/index/IndexFlatDenseVectors.java | 98.21% | 1 Missing ⚠️ |

Additional details and impacted files
@@             Coverage Diff              @@
##             master    #2510      +/-   ##
============================================
+ Coverage     66.66%   67.07%   +0.41%     
- Complexity     1432     1469      +37     
============================================
  Files           214      218       +4     
  Lines         12319    12585     +266     
  Branches       1507     1523      +16     
============================================
+ Hits           8213     8442     +229     
- Misses         3587     3618      +31     
- Partials        519      525       +6     


@lintool marked this pull request as ready for review May 30, 2024 11:43
@lintool merged commit 2152338 into master May 30, 2024
3 checks passed
@lintool deleted the flat branch May 30, 2024 20:13