Stars
[ICML'21 Oral] I-BERT: Integer-only BERT Quantization
[ICCV 2023] I-ViT: Integer-only Quantization for Efficient Vision Transformer Inference
High-speed Large Language Model Serving for Local Deployment
Development repository for the Triton language and compiler
Efficient Triton Kernels for LLM Training
Read-only mirror of Trusted Firmware-A
[ACL 2025] iAgent: LLM Agent as a Shield between User and Recommender Systems
Evaluation data, LLM query code, and results for "Large Language Models as Zero-Shot Conversational Recommenders" (CIKM 2023).
[PGAI@CIKM 2023] PyTorch Implementation of LlamaRec: Two-Stage Recommendation using Large Language Models for Ranking
Code for the Paper "Zero-Shot Next-Item Recommendation using Large Pretrained Language Models"
Another implementation of GRU4REC using PyTorch
GRU4Rec is the original Theano implementation of the algorithm in "Session-based Recommendations with Recurrent Neural Networks" paper, published at ICLR 2016 and its follow-up "Recurrent Neural Ne…
RUCAIBox / CIKM2020-S3Rec
Forked from aHuiWang/CIKM2020-S3Rec
Code for CIKM2020 "S3-Rec: Self-Supervised Learning for Sequential Recommendation with Mutual Information Maximization"
A high-throughput and memory-efficient inference and serving engine for LLMs
SASRec: Self-Attentive Sequential Recommendation
RUCAIBox / LC-Rec
Forked from zhengbw0324/LC-Rec
[ICDE'24] Code of "Adapting Large Language Models by Integrating Collaborative Semantics for Recommendation."
(WWW'25 + Netflix) The first CRS that retrieves collaborative filtering knowledge with two-step context-aware reflection.
Large Language Model for Generative Recommendation
(WWW'24 + LinkedIn) The first RS that tightly combines LLM with ID-based RS
[PyTorch] Generative retrieval model using semantic IDs from "Recommender Systems with Generative Retrieval"
CIKM'23, Prompt Distillation for Efficient LLM-based Recommendation
HunyuanVideo: A Systematic Framework For Large Video Generation Model
Felafax is building AI infra for non-NVIDIA GPUs
SGLang is a fast serving framework for large language models and vision language models.
Example models using DeepSpeed