PetrosKataras

Petros Kataras PetrosKataras

33 followers · 3 following

https://p-kataras.info/

Stars

AMAAI-Lab / SonicVerse

SonicVerse: Multi-Task Learning for Music Feature-Informed Captioning

Python 37 2 Updated Jun 19, 2025

allenai / molmo

Code for the Molmo Vision-Language Model

Python 543 39 Updated Dec 12, 2024

IDEA-Research / Grounded-SAM-2

Grounded SAM 2: Ground and Track Anything in Videos with Grounding DINO, Florence-2 and SAM 2

Jupyter Notebook 2,391 258 Updated May 26, 2025

jovanavidenovic / DAM4SAM

[CVPR 2025] "A Distractor-Aware Memory for Visual Object Tracking with SAM2"

Python 340 25 Updated Jun 26, 2025

facebookresearch / perception_models

State-of-the-art Image & Video CLIP, Multimodal Large Language Models, and More!

Jupyter Notebook 1,378 77 Updated May 28, 2025

ClaudiaCuttano / SAMWISE

[CVPR25] Official repository for the paper: "SAMWISE: Infusing Wisdom in SAM2 for Text-Driven Video Segmentation"

Python 279 13 Updated Jun 28, 2025

UniFlowMatch / UFM

UFM: A Unified Dense Image Correspondence Estimator for both Optical Flow & Wide Baseline Matching Tasks. Matches any pair of images.

Python 181 3 Updated Jun 14, 2025

rerun-io / annotation-example

Python 40 2 Updated Jun 10, 2025

SkalskiP / top-cvpr-2025-papers

This repository is a curated collection of the most exciting and influential CVPR 2025 papers. 🔥 [Paper + Code + Demo]

Python 650 35 Updated Jun 16, 2025

block / goose

an open source, extensible AI agent that goes beyond code suggestions - install, execute, edit, and test with any LLM

Rust 15,467 1,284 Updated Jul 4, 2025

sparkjsdev / spark

✨ An advanced 3D Gaussian Splatting renderer for THREE.js

TypeScript 744 40 Updated Jul 4, 2025

facebookresearch / EdgeTAM

[CVPR 2025] Official PyTorch implementation of "EdgeTAM: On-Device Track Anything Model"

Jupyter Notebook 469 32 Updated Apr 30, 2025

bryanswkim / Chain-of-Zoom

Official repository for "Chain-of-Zoom: Extreme Super-Resolution via Scale Autoregression and Preference Alignment"

Python 683 67 Updated Jun 2, 2025

showlab / OmniConsistency

The official code implementation of the paper "OmniConsistency: Learning Style-Agnostic Consistency from Paired Stylization Data."

Python 364 24 Updated Jun 8, 2025

ByteDance-Seed / Seed1.5-VL

Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning, achieving state-of-the-art performance on 38 out of 60 public benchmarks.

Jupyter Notebook 1,287 50 Updated Jun 14, 2025

PrimitiveAnything / PrimitiveAnything

[SIGGRAPH 2025] PrimitiveAnything: Human-Crafted 3D Primitive Assembly Generation with Auto-Regressive Transformer

Python 329 12 Updated May 13, 2025

UCSC-VLAA / OpenVision

OpenVision: A Fully-Open, Cost-Effective Family of Advanced Vision Encoders for Multimodal Learning

Python 280 17 Updated May 15, 2025

JanPalasek / fast-reflection-removal

Removes reflections quickly and easily.

Python 24 1 Updated Feb 10, 2024

roboflow / trackers

A unified library for object tracking featuring clean room re-implementations of leading multi-object tracking algorithms

Python 1,785 153 Updated Jul 2, 2025

lpiccinelli-eth / UniK3D

[CVPR 2025] UniK3D: Universal Camera Monocular 3D Estimation

Python 543 33 Updated Jun 11, 2025

THU-MIG / yoloe

YOLOE: Real-Time Seeing Anything [ICCV 2025]

Python 1,400 125 Updated Jun 26, 2025

HazyResearch / minions

Big & Small LLMs working together

Python 1,046 116 Updated Jul 4, 2025

QwenLM / Qwen2.5-Omni

Qwen2.5-Omni is an end-to-end multimodal model by Qwen team at Alibaba Cloud, capable of understanding text, audio, vision, video, and performing real-time speech generation.

Jupyter Notebook 3,251 253 Updated Jun 12, 2025

roboflow / rf-detr

RF-DETR is a real-time object detection model architecture developed by Roboflow, SOTA on COCO & designed for fine-tuning.

Python 2,310 254 Updated Jul 3, 2025

facebookresearch / vggt

[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer

Python 9,405 898 Updated Jul 4, 2025

Acly / krita-ai-diffusion

Streamlined interface for generating images with AI in Krita. Inpaint and outpaint with optional text prompt, no tweaking required.

Python 8,805 464 Updated Jun 30, 2025

modelcontextprotocol / python-sdk

The official Python SDK for Model Context Protocol servers and clients

Python 15,626 1,967 Updated Jul 4, 2025

jlowin / fastmcp

🚀 The fast, Pythonic way to build MCP servers and clients

Python 13,953 854 Updated Jul 4, 2025

jackaudio / jack-example-tools

Official examples and tools from the JACK project

C 44 17 Updated Jul 7, 2024

microsoft / Magma

[CVPR 2025] Magma: A Foundation Model for Multimodal AI Agents

Python 1,744 131 Updated May 30, 2025
