Hello,
I tried to run the evaluation script for InternVL-Chat-V1-5 with the latest version of VLMEvalKit (main branch) and got the following error:
Processing JSON file: imdb_multiple_mcq.json
InternVL model version: V1.5
Traceback (most recent call last):
  File "./VLMEvalKit/evaluate.py", line 86, in <module>
    main()
  File "/./VLMEvalKit/evaluate.py", line 73, in main
    ret = model.generate(question_input, dataset="MCQ")
  File "./VLMEvalKit/vlmeval/vlm/base.py", line 116, in generate
    return self.generate_inner(message, dataset)
  File "./VLMEvalKit/vlmeval/vlm/internvl/internvl_chat.py", line 467, in generate_inner
    return self.generate_v1_5(message, dataset)
  File "./VLMEvalKit/vlmeval/vlm/internvl/internvl_chat.py", line 348, in generate_v1_5
    max_num = max(1, min(self.max_num, self.total_max_num // image_num))
ZeroDivisionError: integer division or modulo by zero
I checked the input to the function generate_v1_5 in "./VLMEvalKit/vlmeval/vlm/internvl/internvl_chat.py" (line 348) and got the following:
[{'type': 'text', 'value': 'images/imdb_multiple_mcq/question_1_1.png'}, {'type': 'text', 'value': 'images/imdb_multiple_mcq/question_1_2.png'}, {'type': 'text', 'value': 'images/imdb_multiple_mcq/question_1_3.png'}, {'type': 'text', 'value': 'images/imdb_multiple_mcq/question_1_4.png'}, {'type': 'text', 'value': "Which images contain the celebrity ..."}]
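So it seems the parser treats the image paths as text. If image_num counts only the entries of type 'image', it would be 0 here, which would explain the ZeroDivisionError. I got a similar issue when I tried other InternVL models (e.g. InternVL2-1B). Could you explain how FaceXBench can be used with the latest version of VLMEvalKit?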
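For reference, here is a minimal sketch of a possible workaround on the caller side, not a confirmed fix. It assumes that VLMEvalKit expects image entries to be tagged with type 'image' and that image_num is the number of such entries; both are assumptions based on the message dump above. The helper names retag_image_paths and count_images are hypothetical and are not part of VLMEvalKit.

```python
import os

# Assumption: file extensions that should be treated as image paths.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".bmp", ".webp"}


def retag_image_paths(message):
    """Return a copy of `message` where text entries that look like
    image paths are re-tagged as type 'image' (hypothetical helper)."""
    fixed = []
    for item in message:
        ext = os.path.splitext(str(item["value"]))[1].lower()
        if item["type"] == "text" and ext in IMAGE_EXTS:
            fixed.append({"type": "image", "value": item["value"]})
        else:
            fixed.append(dict(item))
    return fixed


def count_images(message):
    """Count entries tagged as images (assumed to match image_num)."""
    return sum(1 for item in message if item["type"] == "image")


# The message observed in generate_v1_5, as reported above.
message = [
    {"type": "text", "value": "images/imdb_multiple_mcq/question_1_1.png"},
    {"type": "text", "value": "images/imdb_multiple_mcq/question_1_2.png"},
    {"type": "text", "value": "images/imdb_multiple_mcq/question_1_3.png"},
    {"type": "text", "value": "images/imdb_multiple_mcq/question_1_4.png"},
    {"type": "text", "value": "Which images contain the celebrity ..."},
]

fixed = retag_image_paths(message)
print(count_images(message), count_images(fixed))
```

With the original message the image count is zero, which is the divisor in the failing line; after re-tagging, the four paths are counted as images and the question stays a text entry. Whether the paths also need to be absolute or otherwise resolved is not something this sketch addresses.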