Refactor fusion implementation to use ScoredDocs #2620

lintool · 2024-10-09T21:32:14Z

The current implementation of fusion is built around TrecRun, e.g., https://github.com/castorini/anserini/blob/master/src/main/java/io/anserini/fusion/TrecRunFuser.java

But we already have ScoredDocs:
https://github.com/castorini/anserini/blob/master/src/main/java/io/anserini/search/ScoredDocs.java

Should we refactor our implementation to use ScoredDocs?

Potential pros:

faster?
less code duplication


lilyjge · 2025-03-20T17:54:45Z

Working on it!

lilyjge · 2025-04-06T18:57:09Z

Running times for end-to-end fusion after the ScoredDocs refactoring from #2774, measured with run_fusion_regression.py. The ranx cache increases speed quite a bit, so for fairness I cleared it before each run. Times are in seconds.

Corpus	Anserini	Ranx (no cache)
trec-covid	8.41	39.22
bioasq	27.66	44.14
nfcorpus	16.95	41.11
nq	146.01	79.36
hotpotqa	307.29	115.63
fiqa	31.15	44.33
signal1m	10.83	41.21
trec-news	8.71	41.79
robust04	17.32	43.48
arguana	75.83	77.96
webis-touche2020	8.12	41.77
cqadupstack-android	32.50	44.89
cqadupstack-english	67.82	53.68
cqadupstack-gaming	68.07	53.83
cqadupstack-gis	42.33	46.31
cqadupstack-mathematica	35.56	45.95
cqadupstack-physics	47.28	47.58
cqadupstack-programmers	41.47	46.80
cqadupstack-stats	31.84	44.48
cqadupstack-tex	121.55	69.58
cqadupstack-unix	49.15	50.12
cqadupstack-webmasters	25.69	43.47
cqadupstack-wordpress	27.55	43.62
quora	370.70	135.31
dbpedia-entity	25.66	216.71
scidocs	58.29	64.45
fever	284.88	2645.57
climate-fever	72.67	802.02
scifact	17.73	42.15

lintool · 2025-04-06T19:42:55Z

Thanks @lilyjge !

Two questions:

Can you add a column with previous impl that doesn't use ScoredDocs? I want to make sure this impl is actually faster!
For instances where we're slower than ranx... you have any idea what's going on?

lilyjge · 2025-04-06T20:10:33Z

Sure! I did run it against the previous implementation and the difference could be felt, but yeah I'll put it into numbers.
As the number of queries increases, our running time grows faster than ranx's does, suggesting that ranx might scale better. Despite that, ranx seems to struggle whenever docids or query ids are not numerical. This is just a guess: fever, climate-fever, and dbpedia-entity, where ranx really struggled, all have word-based docids, while the datasets where ranx did better than ours (nq, hotpotqa, cqadupstack-english, cqadupstack-gaming, cqadupstack-tex, and quora) almost all have strictly numerical docids/topics.
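
One way to check the docid guess is to scan a run file and see whether every query id and docid is strictly numeric. A minimal sketch, assuming the usual six-column TREC run format (qid Q0 docid rank score tag); the function name and the sample lines are made up for illustration and are not part of Anserini or ranx.

```python
from typing import Iterable, Tuple


def ids_are_numeric(run_lines: Iterable[str]) -> Tuple[bool, bool]:
    """Return (all_qids_numeric, all_docids_numeric) for TREC-format run lines.

    Assumes whitespace-separated columns: qid Q0 docid rank score tag.
    """
    qids_numeric = True
    docids_numeric = True
    for line in run_lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        qid, docid = parts[0], parts[2]
        # isdecimal() is True only when every character is a decimal digit.
        qids_numeric = qids_numeric and qid.isdecimal()
        docids_numeric = docids_numeric and docid.isdecimal()
    return qids_numeric, docids_numeric


# Hypothetical sample lines, made up for illustration.
numeric_run = ["1 Q0 12345 1 9.87 demo", "1 Q0 67890 2 9.12 demo"]
wordy_run = ["75 Q0 Barack_Obama 1 9.87 demo"]
```

Since the function accepts any iterable of lines, an open file handle for a real run file can be passed in directly.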

lilyjge · 2025-04-08T14:19:44Z

Table including running times of the previous implementation, before refactoring (times in seconds):

Corpus	Previous	Anserini	Ranx (no cache)
trec-covid	9.18	8.41	39.22
bioasq	130.21	27.66	44.14
nfcorpus	42.69	16.95	41.11
nq	5142.86	146.01	79.36
hotpotqa	19875.92	307.29	115.63
fiqa	169.81	31.15	44.33
signal1m	13.83	10.83	41.21
trec-news	9.48	8.71	41.79
robust04	37.32	17.32	43.48
arguana	1354.49	75.83	77.96
webis-touche2020	8.87	8.12	41.77
cqadupstack-android	188.44	32.50	44.89
cqadupstack-english	856.34	67.82	53.68
cqadupstack-gaming	879.65	68.07	53.83
cqadupstack-gis	288.17	42.33	46.31
cqadupstack-mathematica	257.86	35.56	45.95
cqadupstack-physics	420.64	47.28	47.58
cqadupstack-programmers	290.90	41.47	46.80
cqadupstack-stats	175.51	31.84	44.48
cqadupstack-tex	2996.13	121.55	69.58
cqadupstack-unix	420.10	49.15	50.12
cqadupstack-webmasters	111.48	25.69	43.47
cqadupstack-wordpress	116.75	27.55	43.62
quora	33452.81	370.70	135.31
dbpedia-entity	80.28	25.66	216.71
scidocs	425.20	58.29	64.45
fever	16475.03	284.88	2645.57
climate-fever	966.24	72.67	802.02
scifact	47.22	17.73	42.15
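
To read the table in relative terms, the speedup is just the ratio of two running times. A short sketch using four rows copied from the table above; the variable and function names are illustrative only.

```python
# (previous, refactored Anserini, ranx without cache) in seconds,
# copied from four rows of the table above.
timings = {
    "trec-covid": (9.18, 8.41, 39.22),
    "hotpotqa": (19875.92, 307.29, 115.63),
    "quora": (33452.81, 370.70, 135.31),
    "fever": (16475.03, 284.88, 2645.57),
}


def speedup(before: float, after: float) -> float:
    """How many times faster `after` is compared with `before`."""
    return before / after


for corpus, (previous, anserini, ranx) in timings.items():
    print(
        f"{corpus}: {speedup(previous, anserini):.1f}x vs previous, "
        f"{speedup(ranx, anserini):.2f}x vs ranx"
    )
```

A ratio below 1 in the second figure means ranx was faster than the refactored implementation on that corpus.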

lintool mentioned this issue Apr 6, 2025

Basic rank fusion implementation in Anserini #2308

Closed

lilyjge mentioned this issue Apr 6, 2025

Refactor fusion implementation to use ScoredDocs #2774

Merged
