wusize

Size Wu (吴思泽) wusize

PhD student@NTU

91 followers · 38 following

Achievements

Highlights

Harmon Public

Code Release of Harmonizing Visual Representations for Unified Multimodal Understanding and Generation

Python 108 1 Other Updated May 21, 2025
anonymous Public

Updated May 16, 2025
wusize.github.io Public
Forked from academicpages/academicpages.github.io

Github Pages template for academic personal websites, forked from mmistakes/minimal-mistakes

JavaScript 2 1 MIT License Updated May 7, 2025
SimpleAR Public
Forked from wdrink/SimpleAR

Python MIT License Updated Apr 16, 2025
WISE Public
Forked from PKU-YuanGroup/WISE

WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation

Python Updated Apr 3, 2025
Show-o Public
Forked from showlab/Show-o

[ICLR 2025] Repository for Show-o, One Single Transformer to Unify Multimodal Understanding and Generation.

Python Apache License 2.0 Updated Feb 28, 2025
Janus Public
Forked from deepseek-ai/Janus

Janus-Series: Unified Multimodal Understanding and Generation Models

Python MIT License Updated Feb 1, 2025
lmms-eval Public
Forked from EvolvingLMMs-Lab/lmms-eval

Accelerating the development of large multimodal models (LMMs) with lmms-eval

Python Updated Jan 26, 2025
Open-MAGVIT2 Public
Forked from vinyesm/Open-MAGVIT2

A packaging of Open-MAGVIT2: Democratizing Autoregressive Visual Generation

Python Apache License 2.0 Updated Oct 6, 2024
RADIO Public
Forked from NVlabs/RADIO

Official repository for "AM-RADIO: Reduce All Domains Into One"

Python 1 Other Updated Aug 23, 2024
F-LMM Public

[CVPR2025] Code Release of F-LMM: Grounding Frozen Large Multimodal Models

Python 89 1 Other Updated Aug 5, 2024
OMG-Seg Public
Forked from lxtGH/OMG-Seg

OMG-LLaVA and OMG-Seg codebase

Python 1 Other Updated Aug 4, 2024
chameleon Public
Forked from facebookresearch/chameleon

Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.

Python Other Updated Jul 3, 2024
MMT-Bench Public
Forked from OpenGVLab/MMT-Bench

ICML'2024 | MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI

Python Updated Jun 18, 2024
Visual-CoT Public
Forked from deepcs233/Visual-CoT

Visual CoT: Unleashing Chain-of-Thought Reasoning in the Multi-Modal Language Model

Python Apache License 2.0 Updated May 2, 2024
DeepSpeed Public
Forked from deepspeedai/DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python Apache License 2.0 Updated Mar 25, 2024
CLIPSelf Public

[ICLR2024 Spotlight] Code Release of CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction

detection open-vocabulary vision-language-model

Python 187 10 Other Updated Feb 5, 2024
CLIM Public

[AAAI2024] Code Release of CLIM: Contrastive Language-Image Mosaic for Region Representation

detection open-vocabulary-detection

Python 29 3 Other Updated Feb 4, 2024
LLaVA-Grounding Public
Forked from UX-Decoder/LLaVA-Grounding

Python Apache License 2.0 Updated Jan 22, 2024
UNINEXT Public
Forked from MasterBin-IIAU/UNINEXT

[CVPR'23] Universal Instance Perception as Object Discovery and Retrieval

Python MIT License Updated Nov 9, 2023
CLIP Public

Jupyter Notebook MIT License Updated Nov 2, 2023
ovdet Public

[CVPR2023] Code Release of Aligning Bag of Regions for Open-Vocabulary Object Detection

object-detection open-vocabulary cvpr2023

Python 182 5 Other Updated Oct 25, 2023
OVD_Contest Public

Python 2 Apache License 2.0 Updated Oct 12, 2023
CAT-Seg Public
Forked from cvlab-kaist/CAT-Seg

Official Implementation of "CAT-Seg🐱: Cost Aggregation for Open-Vocabulary Semantic Segmentation"

Python Updated Sep 11, 2023
raw_ovd Public

Python 7 Updated Aug 23, 2023
RegionCLIP Public
Forked from microsoft/RegionCLIP

[CVPR 2022] Official code for "RegionCLIP: Region-based Language-Image Pretraining"

Python Apache License 2.0 Updated Aug 13, 2023
SAN Public
Forked from MendelXu/SAN

Open-vocabulary Semantic Segmentation

Python MIT License Updated May 9, 2023
open_clip-1 Public
Forked from mlfoundations/open_clip

An open source implementation of CLIP.

Python Other Updated Apr 23, 2023
multiview_pose Public

[ICCV2021] Code Release of Graph-Based 3D Multi-Person Pose Estimation Using Multi-View Images

Python 50 9 Apache License 2.0 Updated Mar 5, 2023
colorization Public

Code for the colorization project of the National Innovation Program.

Python 1 Updated Jul 19, 2021
