💥
CUDA Out Of Memory
It's a feature, NOT a bug.
Pinned
- abliteration: Make abliterated models with transformers, easy and fast
- turboderp-org/exllamav2: A fast inference library for running LLMs locally on modern consumer-class GPUs
- theroyallab/tabbyAPI: The official API server for Exllama. OAI compatible, lightweight, and fast. (A minimal client sketch follows this list.)
- hiyouga/LLaMA-Factory: Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
- SJTU-IPADS/PowerInfer: High-speed Large Language Model Serving for Local Deployment
- CrazyBoyM/llama3-Chinese-chat: Chinese post-training repository for Llama3 and Llama3.1, with fine-tuned and modified weights of interest, plus tutorial videos and docs covering training, inference, evaluation, and deployment.
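Since tabbyAPI advertises an OpenAI-compatible endpoint, any standard OpenAI client can talk to it. Below is a minimal sketch using the official `openai` Python package; the base URL, API key, and model name are assumptions for illustration, not tabbyAPI defaults you should rely on.

```python
# Minimal sketch: querying an OpenAI-compatible server (e.g. tabbyAPI)
# with the openai Python client. base_url, api_key, and model are
# placeholders -- adjust them to match your local server config.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:5000/v1",  # hypothetical local endpoint
    api_key="your-api-key",               # placeholder; use your server's key
)

response = client.chat.completions.create(
    model="local-model",  # placeholder; whatever model the server has loaded
    messages=[{"role": "user", "content": "Why does CUDA run out of memory?"}],
)
print(response.choices[0].message.content)
```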