Batch Processing Feature #40
Comments
Will take it for implementation! Hope to meet the standards :)
Here are some extra details:
Sorry, just saw this -- will take a swing when #53 is merged.
Hey @willccbb, any update on this? Would be super helpful to have.
@willccbb doesn't have the bandwidth. This feature is now open and back in the backlog.
Hi @Blaizzy, I'm interested in implementing the batch processing feature for MLX-VLM. After reviewing the issue requirements and the existing codebase, I understand this involves:
PR #53, which refactored the KVCache implementation, provides a good foundation for this work. I plan to implement this in stages:
My implementation will include configurable parameters for batch generation with sensible defaults:
Would you be open to my contribution? Looking forward to your feedback!
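For illustration, here is a minimal sketch of what a configurable entry point could look like; the names (`batch_generate`, `BatchGenerationConfig`) and the defaults are placeholders I picked, not the library's existing API:

```python
# A minimal sketch, assuming a dedicated entry point; batch_generate,
# BatchGenerationConfig, and the defaults below are placeholders, not the
# actual MLX-VLM API.
from dataclasses import dataclass


@dataclass
class BatchGenerationConfig:
    max_batch_size: int = 8    # image/prompt pairs processed per forward pass
    max_tokens: int = 256      # generation cap per sample
    temperature: float = 0.0   # greedy decoding by default


def batch_generate(model, processor, images, prompts, config=None):
    """Return one generated string per (image, prompt) pair, chunked by max_batch_size."""
    config = config or BatchGenerationConfig()
    if len(images) != len(prompts):
        raise ValueError("images and prompts must have the same length")
    outputs = []
    for start in range(0, len(images), config.max_batch_size):
        chunk_images = images[start:start + config.max_batch_size]
        chunk_prompts = prompts[start:start + config.max_batch_size]
        # Placeholder body: a real implementation would encode the chunk, run the
        # model with a batched KV cache (building on PR #53), and decode up to
        # config.max_tokens per sample.
        outputs.extend(
            f"<output for {img} / {p}>" for img, p in zip(chunk_images, chunk_prompts)
        )
    return outputs
```

The exact parameter set and defaults would of course be settled during review.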
Overview
The goal is to add support for efficient batch processing of inputs to the MLX-VLM library. This will allow users to process multiple images and text prompts simultaneously and generate the corresponding outputs in a single batch, improving throughput compared to running each input through the model one at a time.
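To make the performance claim concrete, here is a small, self-contained MLX snippet (not MLX-VLM code) comparing many single-row matrix multiplies against one batched multiply; the matrix sizes and batch count are arbitrary illustrative choices:

```python
# Toy illustration (not MLX-VLM code) of why batching helps: one matmul over a
# stacked batch keeps the accelerator busier than many small matmuls issued
# one by one. Rough timing only, no warm-up, so absolute numbers are noisy.
import time
import mlx.core as mx

hidden = 4096
weights = mx.random.normal((hidden, hidden))
samples = [mx.random.normal((1, hidden)) for _ in range(32)]
mx.eval(weights, samples)  # materialize inputs so timing covers only the matmuls

# Sequential: 32 separate (1 x hidden) @ (hidden x hidden) products.
start = time.perf_counter()
outs = [x @ weights for x in samples]
mx.eval(outs)
sequential = time.perf_counter() - start

# Batched: a single (32 x hidden) @ (hidden x hidden) product.
batch = mx.concatenate(samples, axis=0)
start = time.perf_counter()
out = batch @ weights
mx.eval(out)
batched = time.perf_counter() - start

print(f"sequential: {sequential * 1e3:.2f} ms, batched: {batched * 1e3:.2f} ms")
```

The batched product is generally faster because it amortizes kernel-launch and memory-traffic overhead, which is the same effect batched generation aims to exploit across the whole model.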
Use cases:
Note: Tag @Blaizzy for code reviews and questions.
Requirements
Support batched inputs:
Perform batch processing:
Generate batched outputs:
Error handling:
API design:
Documentation and examples:
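One way to meet the "Support batched inputs" and "Perform batch processing" requirements is to left-pad variable-length prompt token sequences to a common length and track the padding with a mask; the sketch below is illustrative only (the helper name and pad token id are assumptions, not part of the library):

```python
# Illustrative sketch (not the library's implementation) of left-padding
# variable-length prompt token sequences so they can be stacked into one batch.
import mlx.core as mx


def pad_batch(token_lists, pad_id=0):
    """Left-pad token id sequences to a common length and build an attention mask."""
    max_len = max(len(toks) for toks in token_lists)
    input_ids, attention_mask = [], []
    for toks in token_lists:
        pad = max_len - len(toks)
        input_ids.append([pad_id] * pad + list(toks))        # pad on the left so the
        attention_mask.append([0] * pad + [1] * len(toks))   # last real token aligns
    return mx.array(input_ids), mx.array(attention_mask)


ids, mask = pad_batch([[5, 9, 2], [7, 3], [1, 4, 6, 8]])
print(ids.shape, mask.shape)  # (3, 4) (3, 4)
```

Left padding keeps every sequence's final token in the same column, which simplifies batched autoregressive decoding.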
Implementation
Testing
Delivery
By implementing this batch processing feature, MLX-VLM will let users process multiple inputs simultaneously and efficiently, improving both the performance and the usability of the library across a range of vision-language tasks.