Stars
Bringing Language Models to the Most Resource Constrained Devices
Utilities intended for use with Llama models.
Enhancing Autonomous Driving Systems with On-Board Deployed Large Language Models
Efficient and easy multi-instance LLM serving
Repository to host and maintain scale-sim-v2 code
Conditional channel- and precision-pruning on neural networks
Code for "Effective Bayesian Heteroscedastic Regression with Deep Neural Networks" (NeurIPS 2023)
Energy-aware Timing Analysis of Intermittent Programs
Open source software accompanying the publication: "Improving the forward progress of Transient
Code for Adaptive Deep Neural Network Inference Optimization with EENet
LaLaRAND: Flexible Layer-by-Layer CPU/GPU Scheduling for Real-Time DNN Tasks
Overview of conditional computation and dynamic CNNs for computer vision, with a focus on reducing computational complexity
This repository contains Adaptive Early-Exit (AdaEE).
This repository contains the program used to train and evaluate a Branched DNN capable of early-exit semantic segmentation, suited for an edge-cloud co-inference scenario in smart cities.
Improve a model's accuracy by distilling knowledge into the earlier layers of the model. Improves the accuracy and performance of lightweight DNN models.
A curated list of early exiting (LLM, CV, NLP, etc.)
Improving Low-Latency Predictions in Multi-Exit Neural Networks via Block-Dependent Losses
Code and model for "Peeking into the Future: Predicting Future Person Activities and Locations in Videos", Liang et al., CVPR 2019