Stars
TinyChatEngine: On-Device LLM Inference Library
PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT
Dense Passage Retriever is a set of tools and models for the open-domain Q&A task.
brpc is an industrial-grade RPC framework written in C++, often used in high-performance systems such as search, storage, machine learning, advertisement, and recommendation. "brpc" mea…
Dynamic Tensor Rematerialization prototype (modified PyTorch) and simulator. Paper: https://arxiv.org/abs/2006.09616
The Triton Inference Server provides an optimized cloud and edge inferencing solution.