- Focoos AI
- Turin, Italy
- in/ardaerendogru
Stars
An up-to-date list of works on Multi-Task Learning
Taskonomy: Disentangling Task Transfer Learning [Best Paper, CVPR2018]
[IROS24] A Shared Architecture for Simultaneous Depth Estimation and Semantic Segmentation from Monocular Camera Images
Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.
A comprehensive C++20 cache simulator for analyzing memory hierarchy performance with configurable cache levels, replacement policies, and inclusion strategies
The repo for "Distill Any Depth: Distillation Creates a Stronger Monocular Depth Estimator"
AlphaEdit: Null-Space Constrained Knowledge Editing for Language Models, ICLR 2025 (Outstanding Paper)
This repository contains the official implementation of "FastVLM: Efficient Vision Encoding for Vision Language Models" - CVPR 2025
The simplest, fastest repository for training/finetuning small-sized VLMs.
[CVPR25] Official repository for the paper: "SAMWISE: Infusing Wisdom in SAM2 for Text-Driven Video Segmentation"
State-of-the-art Image & Video CLIP, Multimodal Large Language Models, and More!
Official repository for "AM-RADIO: Reduce All Domains Into One"
Collection of AWESOME vision-language models for vision tasks
🚀 Train a 26M-parameter vision-language model (VLM) from scratch in just 1 hour!
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance.
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Implementing DeepSeek R1's GRPO algorithm from scratch
[CVPR2025W] Official repository for the paper: "Show or Tell? A Benchmark To Evaluate Visual and Textual Prompts in Semantic Segmentation"
Convert PDF to markdown + JSON quickly with high accuracy
PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models (NeurIPS 2024 Spotlight)
[CVPR 2025 Highlight] Official code and models for Encoder-only Mask Transformer (EoMT).
Testing adaptation of the DINOv2 encoder for vision tasks with Low-Rank Adaptation (LoRA)
Approximating neural network loss landscapes in low-dimensional parameter subspaces for PyTorch
The official project website of "ScaleKD: Strong Vision Transformers Could Be Excellent Teachers" (ScaleKD for short, accepted to NeurIPS 2024).
🚀 Lightning-fast computer vision models. Fine-tune SOTA models with just a few lines of code. Ready for cloud ☁️ and edge 📱 deployment.
Verilog HDL implementation of SDRAM controller and SDRAM model