jun297 (Junhyeok Kim) / Starred · GitHub

Official repository of 'Visual-RFT: Visual Reinforcement Fine-Tuning' & 'Visual-ARFT: Visual Agentic Reinforcement Fine-Tuning'

Jupyter Notebook 1,933 82 Updated May 21, 2025

[ICML 2025] Official repository for paper "Scaling Video-Language Models to 10K Frames via Hierarchical Differential Distillation"

Python 145 35 Updated May 15, 2025

Quick Long Video Understanding

Python 39 3 Updated May 25, 2025

D-FINE: Redefine Regression Task of DETRs as Fine-grained Distribution Refinement [ICLR 2025 Spotlight]

Python 2,366 210 Updated Apr 11, 2025

An open collection of implementation tips, tricks and resources for training large language models

Python 473 23 Updated Mar 8, 2023

From the Transistor to the Web Browser, a rough outline for a 12 week course

6,207 484 Updated Oct 12, 2021

Implementation for Describe Anything: Detailed Localized Image and Video Captioning

Python 1,126 59 Updated May 6, 2025
Python 79 5 Updated May 6, 2025

Minimal and annotated implementations of key ideas from modern deep learning research.

Jupyter Notebook 669 61 Updated Jun 2, 2025

Roo Code (prev. Roo Cline) gives you a whole dev team of AI agents in your code editor.

TypeScript 14,875 1,541 Updated Jun 2, 2025

Solve Visual Understanding with Reinforced VLMs

Python 5,042 309 Updated May 11, 2025

A Comprehensive Evaluation Benchmark for Open-Vocabulary Detection (AAAI 2024)

Python 50 3 Updated May 7, 2024

Automatically fetch the titles of pasted links

TypeScript 583 69 Updated Dec 15, 2024

Enhanced Quick Switcher plugin for Obsidian.md

TypeScript 503 13 Updated May 17, 2025

UniDisc: A discrete diffusion model for joint multimodal generation, enabling controllable and efficient text-image synthesis, editing, and inpainting.

Python 105 5 Updated Apr 2, 2025
Python 14 3 Updated Apr 8, 2025

Official PyTorch Implementation of Opt-CWM: Self-Supervised Learning of Motion Concepts by Optimizing Counterfactuals.

Python 19 1 Updated Mar 27, 2025

A conference poster format with structure, content, creation, and presentation recommendations.

60 6 Updated Feb 16, 2025

Scaling Vision Pre-Training to 4K Resolution

162 7 Updated May 31, 2025

[ICLR'25] Official code for the paper 'MLLMs Know Where to Look: Training-free Perception of Small Visual Details with Multimodal LLMs'

Python 203 9 Updated Apr 20, 2025

Reproduction of DeepSeek-R1

Python 231 23 Updated Apr 14, 2025

The official implementation of "Grounded Chain-of-Thought for Multimodal Large Language Models"

Python 11 1 Updated Mar 21, 2025

[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"

Python 8,154 822 Updated Aug 12, 2024

CLI interfaces & config objects, from types

Python 645 32 Updated Jun 1, 2025

MultimodalC4 is a multimodal extension of c4 that interleaves millions of images with text.

Python 1 Updated Mar 15, 2025
Python 93 5 Updated May 27, 2025

Incredibly fast JavaScript runtime, bundler, test runner, and package manager – all in one

Zig 78,379 3,123 Updated Jun 2, 2025

👻 Ghostty is a fast, feature-rich, and cross-platform terminal emulator that uses platform-native UI and GPU acceleration.

Zig 31,126 839 Updated Jun 2, 2025
Jupyter Notebook 6 Updated May 30, 2025

Video Search and Streaming Agent 🕵️‍♂️

Python 469 31 Updated Jan 31, 2024