Must topics files contain either only integers or only strings as topic IDs? #2119
Unanswered
luisvenezian asked this question in Q&A
Replies: 1 comment 1 reply
-
Yes, this is expected behavior. See: https://github.com/castorini/pyserini/blob/master/pyserini/query_iterator.py#L77
Decision is whether we read with
Behavior is a side effect of how the reader tries its best to determine the key type.
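To illustrate how best-effort key typing can break on mixed IDs, here is a minimal sketch. It is not Pyserini's actual code: the helper names are hypothetical, and it assumes that digit-only IDs are converted to integers, that all other IDs stay strings, and that the keys are sorted afterwards. Under those assumptions, a file with only numeric or only string IDs sorts fine, while a mixed file fails because Python 3 cannot order `int` against `str`.

```python
def parse_topic_keys(raw_ids):
    """Hypothetical helper: digit-only IDs become int, everything else stays str."""
    return [int(t) if t.isdigit() else t for t in raw_ids]


def ordered_keys(raw_ids):
    """Sort the parsed keys. Raises TypeError when int and str keys are mixed."""
    return sorted(parse_topic_keys(raw_ids))


def has_mixed_key_types(raw_ids):
    """Return True if parsing would yield more than one key type."""
    return len({type(k) for k in parse_topic_keys(raw_ids)}) > 1


if __name__ == "__main__":
    print(ordered_keys(["1998", "42"]))           # all integers: sortable
    print(ordered_keys(["ZZZZ", "AAAA"]))         # all strings: sortable
    print(has_mixed_key_types(["1998", "ZZZZ"]))  # mixed types detected
    try:
        ordered_keys(["1998", "ZZZZ"])
    except TypeError as err:
        print("mixed IDs fail to sort:", err)
```

A check like `has_mixed_key_types` could be run over a topics file before passing it to `--topics`, to catch mixed IDs early.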
-
Hi!
I began testing the idea I mentioned in this GitHub discussion. During this process, I encountered a behavior that I wasn’t expecting. I apologize in advance if this is already covered in the documentation.
Specifically, it seems that the file passed to the --topics argument must contain either only integers or only strings as topic IDs. Mixed types (e.g., having both 1998 and ZZZZ in the same file) result in an error. Below is a minimal example to reproduce the issue:
The first two files work as expected. However, when I run the following command using the third file:
I got the following:
Is this the expected behavior?
Thank you in advance for your time and support.