GitHub - kids/vllm: updates for model-name checking and stop tokens


vLLM

Easy, fast, and cheap LLM serving for everyone


For locally deployed models, a single machine usually serves only one model. To simplify downstream load-balanced calls with a uniform interface, the model-name validation performed at request time has been updated; in addition, <|im_end|> and <|im_start|> are now treated as default stop tokens.

A locally launched model generally uses its filesystem path as the model name, and callers must match that name exactly. For example, after launching with:

python -m vllm.entrypoints.openai.api_server --model /path/some-merged-gpt --trust-remote-code

the request must then pass that launch-time path as the model parameter:

import requests

requests.post(
    "http://serv-url.com/v1/chat/completions",
    json={
        "model": "/path/some-merged-gpt",
        "messages": [{"role": "user", "content": "hello"}],
    },
    headers={"Content-Type": "application/json"},
)

Exposing the launch path in every request is inconvenient, so in the scenario where only one model is being served, the model-name validation is bypassed: any name may be used as the model parameter. The modifications are in vllm/entrypoints/openai/serving_engine.py.

Add <|im_end|> and <|im_start|> as default stop tokens in vllm/entrypoints/openai/protocol.py.
