Stars
An AI-powered task-management system you can drop into Cursor, Lovable, Windsurf, Roo, and others.
data-to-paper: Backward-traceable AI-driven scientific research
A personal knowledge management and sharing system for VSCode
Official GitHub repo for the ICLR 2025 paper EMMA: Empowering Multi-Modal Mamba with Structural and Hierarchical Alignment
The first decoder-only multimodal state space model
arXiv LaTeX Cleaner: Easily clean the LaTeX code of your paper to submit to arXiv
🔥🔥🔥 Latest Papers, Code, and Datasets on Vid-LLMs.
A Comprehensive Benchmark and Toolkit for Evaluating Video-based Large Language Models!
[ICLR 2025] LLaVA-MoD: Making LLaVA Tiny via MoE-Knowledge Distillation
An AI-powered arXiv paper summarization website with a virtual assistant for answering questions.
Level up your GitHub profile readme with customizable cards including LOC statistics!
Codebase for Aria - an Open Multimodal Native MoE
Reading list for research topics in state-space models
Official implementation of Phi-Mamba. A MOHAWK-distilled model (Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models)
Your agent in your terminal, equipped with local tools: writes code, uses the terminal, browses the web, and uses vision.
[AAAI-25] Cobra: Extending Mamba to Multi-modal Large Language Model for Efficient Inference
PyTorch implementation of various Knowledge Distillation (KD) methods.
Strong and Open Vision Language Assistant for Mobile Devices
aider is AI pair programming in your terminal
LLaVA-JP is a Japanese VLM trained with the LLaVA method.
🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming
Gather around the table and have a discussion to catch up with the latest trends in machine learning.
Production-ready platform for agentic workflow development.
[CVPR 2024] Alpha-CLIP: A CLIP Model Focusing on Wherever You Want
[CVPR24 Highlights] Polos: Multimodal Metric Learning from Human Feedback for Image Captioning