Stars
https://huggingface.co/spaces/WildVision/vision-arena
A curated list of safety-related papers, articles, and resources focused on Large Language Models (LLMs). This repository aims to provide researchers, practitioners, and enthusiasts with insights i…
A suite for modeling video with Mamba
Official implementation of FIFO-Diffusion: Generating Infinite Videos from Text without Training (NeurIPS 2024)
✨✨[CVPR 2025] Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
DeepSeek-VL: Towards Real-World Vision-Language Understanding
LongLLaMA is a large language model capable of handling long contexts. It is based on OpenLLaMA and fine-tuned with the Focused Transformer (FoT) method.
📰 Must-read papers and blogs on LLM-based Long Context Modeling 🔥
Generative Models by Stability AI
Benchmarking LLMs with Challenging Tasks from Real Users
Toy Gaussian Splatting visualization in Unity
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
A one-stop library to standardize the inference and evaluation of all the conditional image generation models. (ICLR 2024)
Implementation of Nougat: Neural Optical Understanding for Academic Documents
LLMScore: Unveiling the Power of Large Language Models in Text-to-Image Synthesis Evaluation
Implementation of "Visualize Before You Write: Imagination-Guided Open-Ended Text Generation".
Implementation of 🦩 Flamingo, state-of-the-art few-shot visual question answering attention net out of DeepMind, in PyTorch
Language modeling via stochastic processes. Oral @ ICLR 2022.
TEACh is a dataset of human-human interactive dialogues to complete tasks in a simulated household environment.