Fix Kimi-VL Chat template by Blaizzy · Pull Request #376 · Blaizzy/mlx-vlm · GitHub

Fix Kimi-VL Chat template #376


Merged

Blaizzy merged 1 commit into main from pc/fix-kimi-vl-chat-template on May 26, 2025

Conversation

@Blaizzy Blaizzy (Owner) commented May 26, 2025

add jinja to allowed patterns

Closes #373
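
For context, the fix amounts to adding "*.jinja" to the allow-list of file patterns that get_model_path uses when downloading model artifacts, so Kimi-VL's chat_template.jinja is fetched along with the weights and configs. A minimal sketch, assuming the helper wraps huggingface_hub's snapshot_download; the surrounding patterns are illustrative, not a verbatim copy of mlx_vlm/utils.py:

```python
from pathlib import Path
from typing import Optional

from huggingface_hub import snapshot_download


def get_model_path(path_or_hf_repo: str, revision: Optional[str] = None) -> Path:
    """Resolve a local path or download a model snapshot from the Hub.

    Illustrative sketch of the idea behind the fix: allow "*.jinja" so
    chat_template.jinja is downloaded with the rest of the artifacts.
    """
    model_path = Path(path_or_hf_repo)
    if not model_path.exists():
        model_path = Path(
            snapshot_download(
                repo_id=path_or_hf_repo,
                revision=revision,
                allow_patterns=[
                    "*.json",
                    "*.safetensors",
                    "*.py",
                    "tokenizer.model",
                    "*.txt",
                    "*.jinja",  # the fix: pull in chat template files as well
                ],
            )
        )
    return model_path
```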

@Blaizzy Blaizzy requested a review from Copilot May 26, 2025 10:48
@Blaizzy Blaizzy marked this pull request as ready for review May 26, 2025 10:48
@Copilot Copilot AI left a comment


Pull Request Overview

Adds support for downloading .jinja template files by including the extension in the allowed pattern list for model artifacts.

  • Added "*.jinja" to the file pattern whitelist in get_model_path.
Comments suppressed due to low confidence (2)

mlx_vlm/utils.py:140

  • Add or update a test case to verify that .jinja files are correctly matched and downloaded by get_model_path.
"*.jinja",

mlx_vlm/utils.py:140

  • The function's docstring or inline comment should be updated to mention that .jinja files are now supported in the allowed patterns.
"*.jinja",

@Blaizzy Blaizzy merged commit 2068970 into main May 26, 2025
2 checks passed
@Blaizzy Blaizzy deleted the pc/fix-kimi-vl-chat-template branch May 26, 2025 10:52
breynolds007 added a commit to breynolds007/mlx-vlm that referenced this pull request Jun 22, 2025
fix: black and isort requests

fix: Default to False for backward-compatible behaviour

docs: Add docs for new feature

FastAPI server (Blaizzy#321)

* adding server (chat and batch endpoints not tested - generate OK with streaming on or off )

* cleanup

* progress on openAi api

* openai working

* removing batch processing

* cleaning openai model

* removing adapter_path and resize_shape and seed

---------

Co-authored-by: Prince Canuma <prince.gdt@gmail.com>

feat: Resume training from saved adapter files (Blaizzy#327)

* feat: Add ability to resume from a saved adapter path.

* docs: Fix docs to use hyphen and not underscore

* nit: Update parameter to use --adapter-path

* fix: Order of imports

---------

Co-authored-by: Prince Canuma <prince.gdt@gmail.com>

docs: Fix command-line params in LORA.md to use dash separators (Blaizzy#331)

Fix internVL3 (Blaizzy#333)

fix models/internvl_chat/vision.py transpose issue (Blaizzy#340)

[gemma3] Improve generation quality when using multiple images (Blaizzy#342)

* working

* expand image mask in mlx

* fix batch size squeezing

* batch size fix

* working masked scatter 2d

* 3d masked scatter

* refactor masked_scatter

* integrate mlx_masked_scatter into gemma3

* remove copy

* fix whitespace

* del masked_scatter.py

* del bos workaround

* Add better MLX masked scatter

---------

Co-authored-by: Prince Canuma <prince.gdt@gmail.com>

Update utils.py (Blaizzy#322)

remove duplicate get_model_path (Blaizzy#343)

[phi3_v] Remove logit dtype conversion from float32 to uint32 (Blaizzy#329)

* [phi3_v] Remove logit dtype conversion from float32 to uint32

* Update mlx_vlm/models/phi3_v/phi3_v.py

---------

Co-authored-by: Prince Canuma <prince.gdt@gmail.com>

Make Gemma3 and Qwen2.5VL text-image input merging functions public static (Blaizzy#335)

Co-authored-by: Prince Canuma <prince.gdt@gmail.com>

fix: remove self reference in static method prepare_inputs_for_multimodal (Blaizzy#348)

Fix VLMs upcasting (Blaizzy#350)

* fix gemma upcasting

* fix internvl

* fix kimi_vl

* fix tests

bump version (Blaizzy#351)

Fixes Qwen2 VL Position Id (Blaizzy#319)

* handles case for position ids in qwen2

* remove duplicate

* fix: generation command

* resets rope_deltas

* address comments

* passes image and vdo grid to kwargs

* refactor config and remove duplicate input_ids

* fix tests

---------

Co-authored-by: Prince Canuma <prince.gdt@gmail.com>

Pixtral: make merge_input_ids function public static (Blaizzy#355)

Add test for StoppingCriteria reset (Blaizzy#357)

Fix typo in POST request example (Blaizzy#358)

Fix early exit for StoppingCriteria (Blaizzy#356)

Add max_position_embeddings to TextConfig (Blaizzy#360)

Fix revision argument not passed in load (Blaizzy#361)

* Forward revision argument in load function

* format

Add initial MkDocs setup (Blaizzy#362)

* Link examples to GitHub

* add GH actions

* format

Update deploy-docs.yml (Blaizzy#363)

Update update-changelog.yml (Blaizzy#364)

Update mkdocs configuration (Blaizzy#365)

* Update mkdocs config and remove placeholders

* Update mkdocs.yml

* add overrides

Remove hardcoded pytorch install error message (Blaizzy#369)

* Remove pytorch install message

* Fix formatting

* Fix removed change

Update processor tests (Blaizzy#370)

* update tests for processor

* fix tests

add jinja to allowed patterns (Blaizzy#376)

Fix multi images understand for Qwen2 and Qwen2.5 VL (Blaizzy#377)

* fix qwen multi images

* fix formatting

* remove unsed param

* simplify batching logic

Qwen2.5 vl fix  (Blaizzy#378)

* handles case for position ids in qwen2

* remove duplicate

* fix: generation command

* resets rope_deltas

* qwen 2.5vl position embeddings

* missing kwarg

* rope deltas

* fix

* nit

* address comments

* passes image and vdo grid to kwargs

* Updated qwen2.5vl

* config structure

* formatting

* fix tests

* fix tests and refactor config image ids

* fix inference

* fix inference

* remove unused

* fix mask bug

* fix offset indexing

* Remove unused

* remove unused

* fix qwen2_5_vl prompt

* remove unused

---------

Co-authored-by: Prince Verma <princev@lambdatest.com>
Co-authored-by: Dillon DuPont <v-ddupont@microsoft.com>
Co-authored-by: ddupont <3820588+ddupont808@users.noreply.github.com>
Co-authored-by: Prince Verma <prncvrm@gmail.com>

Bump mlx to v0.26.0 (Blaizzy#381)

bump (Blaizzy#382)

fix: Set lora-alpha default to 0.1 for backward consistency

fix: Changed to use store_false for consistency with --apply-chat-template parameter

fix: Corrected store_false vs store_true
Labels: None yet
Projects: None yet
Development

Successfully merging this pull request may close these issues.

Kimi-VL-A3B-Thinking-8bit fails to run due to apply_chat_template error
1 participant