-
Notifications
You must be signed in to change notification settings - Fork 487
Adds fusion feature for Anserini #2590
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…yserini convention
pom.xml
Outdated
<scope>test</scope> | ||
</dependency> | ||
<dependency> | ||
<groupId>org.junit.jupiter</groupId> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
algin indentation with above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think there's a need for a new package src/main/java/io/anserini/trectools/
- just dump everything into io.anserini.fusion
for now?
|
||
// Constructor with reSort parameter | ||
public TrecRun(Path filepath, Boolean reSort) throws IOException { | ||
this.resetData(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consistent two space indentation?
} | ||
|
||
private void resetData() { | ||
runData = new ArrayList<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here
@@ -0,0 +1,3 @@ | |||
1 Q0 pyeb86on 1 24 reciprocal_rank_fusion_k=60 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
put all these files under a fusion/
directory please.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These test files were actually from Daniel early on; I didn't include my test files on the commit. Of course, I could move these to a fusion/
directory or just delete them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If these files are actually used, then please move. If they're not, then remove.
@@ -0,0 +1,76 @@ | |||
package io.anserini.trectools; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add boilerplate to here also?
Oh, I see, you're modeling the structure after pyserini... yea, let's do a from-ground-up redesign. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please look more carefully at the test cases?
|
||
@Before | ||
public void setUp() throws IOException { | ||
sampleFilePath = Paths.get("runs/testlong/run.neuclir22-zh-en-splade.splade.topics.neuclir22-en.splade.original-desc_title.txt"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you check? This file doesn't exist...?
Please verify that the test cases are actually doing something...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So sorry. The file seemed a bit too big for git. The plan was to bring this up when doing more tests after the code style/structure is accepted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good enough for now...
The design is following :
Did a number of tests on codes but not thorough enough. Will test more if the design is accepted.
Should close #2355 and the results align with pyserini.