Support for VLLM models? SmolVLM, Florence-2, etc · Issue #1576 · jolibrain/deepdetect · GitHub
Support for VLLM models? SmolVLM, Florence-2, etc #1576
Open
@cchadowitz

Description

I reached out on Gitter but I'm not sure if that's still actively used:

With all the LLM and VLM models announced and released recently, are there any plans to support these types of models in DeepDetect? I'm most interested in multimodal models such as the new HuggingFace SmolVLM Instruct models and the Microsoft Florence-2 vision models, both of which are based on the HuggingFace Transformers library.
SmolVLM is available as a variety of ONNX models, while Florence-2 is available as a PyTorch .bin file. Each comes with its own set of configs for the model, tokenizer, preprocessor, etc., so I'm not sure how much of this (or which parts) would be viable for use in DeepDetect.
