[run] parallelize benchmark runs #167

phisad · 2025-03-27T14:15:22Z

The adventuregame takes about 1-3 hours to run when every episode is run one after the other.

However, the episodes run independently, so running them could be parallelized.

We should add a "-p" option to the "run" command to enable parallel runs of the benchmark.

Possible side effects:

- For non-HF models this might flood the API services, so an appropriate delay between requests must be set anyway.
- For HF models we cannot easily load the model N times. Instead, the single loaded model would have to collect the requests from all threads and batch them, possibly slowing execution down again.
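
To make the first point concrete, here is a minimal sketch (not the actual runner code) of running episodes in a thread pool while enforcing a shared minimum delay between API calls. `MinDelay`, `run_episode`, and `run_parallel` are hypothetical names for illustration only; a real episode would call the remote model where the placeholder return value is.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class MinDelay:
    """Enforces a minimum gap between calls, shared across all threads."""

    def __init__(self, delay_seconds):
        self.delay = delay_seconds
        self._lock = threading.Lock()
        self._next_allowed = 0.0

    def wait(self):
        # Reserve the next free time slot under the lock, then sleep outside it
        # so other threads can reserve their own slots in the meantime.
        with self._lock:
            now = time.monotonic()
            start = max(now, self._next_allowed)
            self._next_allowed = start + self.delay
        time.sleep(max(0.0, start - now))


def run_episode(episode_id, limiter):
    # Hypothetical stand-in for one benchmark episode.
    limiter.wait()  # throttle before the (simulated) API call
    return f"episode-{episode_id}-done"


def run_parallel(episode_ids, workers=4, delay_seconds=0.01):
    limiter = MinDelay(delay_seconds)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map() returns results in input order, not completion order.
        return list(pool.map(lambda e: run_episode(e, limiter), episode_ids))


if __name__ == "__main__":
    print(run_parallel(range(5), workers=3))
```

The number of workers would map to the value passed via "-p"; the delay keeps the total request rate bounded no matter how many workers are used.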

A workaround for the HF case could be to start the model separately as a service that accepts incoming streams of requests and batches them automatically. Possibly vLLM can do this.
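
To illustrate the idea independently of any particular serving library, here is a rough sketch of such a batching front end using only the standard library. `BatchingWorker` and `batch_fn` are hypothetical names; `batch_fn` stands in for a single batched forward pass of the loaded model. This is only meant to show the mechanism, not how vLLM or any other server implements it.

```python
import queue
import threading
import time
from concurrent.futures import Future


class BatchingWorker:
    """Collects requests from many threads and runs them in batches on one model."""

    def __init__(self, batch_fn, max_batch=8, max_wait=0.02):
        self.batch_fn = batch_fn    # hypothetical: list of prompts -> list of outputs
        self.max_batch = max_batch  # upper bound on batch size
        self.max_wait = max_wait    # seconds to wait for a batch to fill up
        self._requests = queue.Queue()
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()

    def submit(self, prompt):
        # Called from episode threads; the returned future resolves to the output.
        future = Future()
        self._requests.put((prompt, future))
        return future

    def close(self):
        self._requests.put(None)  # sentinel that stops the worker loop
        self._thread.join()

    def _loop(self):
        while True:
            item = self._requests.get()
            if item is None:
                return
            batch, stop = [item], False
            deadline = time.monotonic() + self.max_wait
            # Keep collecting until the batch is full or the wait time is over.
            while len(batch) < self.max_batch:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    break
                try:
                    nxt = self._requests.get(timeout=remaining)
                except queue.Empty:
                    break
                if nxt is None:
                    stop = True
                    break
                batch.append(nxt)
            try:
                outputs = self.batch_fn([prompt for prompt, _ in batch])
                for (_, future), output in zip(batch, outputs):
                    future.set_result(output)
            except Exception as exc:
                for _, future in batch:
                    future.set_exception(exc)
            if stop:
                return
```

Each episode thread would call `submit(prompt).result()` and block until its batch has been processed, so the model is loaded only once while the episodes still run concurrently.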

phisad added the dev: enhancement label (a new feature or improvement to existing functionality) Mar 27, 2025

phisad added the priority: low label Apr 23, 2025
