Intel - Shanghai (UTC+08:00)

Stars
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
Real time interactive streaming digital human
Open Source framework for voice and multimodal conversational AI
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
FlashInfer: Kernel Library for LLM Serving
Fast inference from large language models via speculative decoding
An Application Framework for AI Engineering
[CVPR 2025] EchoMimicV2: Towards Striking, Simplified, and Semi-Body Human Animation
LLaVA-CoT, a visual language model capable of spontaneous, systematic reasoning
Development repository for the Triton language and compiler
g1: Using Llama-3.1 70b on Groq to create o1-like reasoning chains
The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.
Evaluation, benchmarks, and scorecards targeting throughput and latency performance, accuracy on popular evaluation harnesses, safety, and hallucination
GenAI components at the microservice level; a GenAI service composer to create mega-services
1 minute of voice data can be used to train a good TTS model! (few-shot voice cloning)
Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
Standardized Serverless ML Inference Platform on Kubernetes
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs.
Noise reduction in python using spectral gating (speech, bioacoustics, audio, time-domain signals)
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel platforms ⚡
Contains the source code examples described in the "Intel® 64 and IA-32 Architectures Optimization Reference Manual"
A curated list of neural network pruning resources.
Awesome machine learning model compression research papers, quantization, tools, and learning material.