💥
CUDA Out Of Memory
It's a feature, NOT a bug.
Pinned
- abliteration: Make abliterated models with transformers, easy and fast
- turboderp-org/exllamav2: A fast inference library for running LLMs locally on modern consumer-class GPUs
- theroyallab/tabbyAPI: The official API server for Exllama. OAI compatible, lightweight, and fast. (A minimal client sketch follows this list.)
- hiyouga/LLaMA-Factory: Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
- SJTU-IPADS/PowerInfer: High-speed Large Language Model Serving for Local Deployment
- CrazyBoyM/llama3-Chinese-chat: Chinese post-training repository for Llama3 and Llama3.1, with fine-tuned and modified weights of interest, plus tutorial videos and docs covering training, inference, evaluation, and deployment.
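Since tabbyAPI advertises an OpenAI-compatible endpoint, any standard OpenAI client can talk to it. Below is a minimal sketch using the official `openai` Python package; the base URL, API key, and model name are assumptions for illustration, not tabbyAPI defaults you should rely on.

```python
# Minimal sketch: querying an OpenAI-compatible server (e.g. tabbyAPI)
# with the openai Python client. base_url, api_key, and model are
# placeholders -- adjust them to match your local server config.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:5000/v1",  # hypothetical local endpoint
    api_key="your-api-key",               # placeholder; use your server's key
)

response = client.chat.completions.create(
    model="local-model",  # placeholder; whatever model the server has loaded
    messages=[{"role": "user", "content": "Why does CUDA run out of memory?"}],
)
print(response.choices[0].message.content)
```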