eslambakr / Starred · GitHub

Human Preference Score v2: A Solid Benchmark for Evaluating Human Preferences of Text-to-Image Synthesis

Jupyter Notebook 534 18 Updated May 24, 2024

[NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences for Text-to-image Generation

Python 1,467 74 Updated Jan 24, 2025
Python 175 9 Updated Jul 12, 2024

Official implementation of UnifiedReward & UnifiedReward-Think

Python 449 11 Updated Jun 18, 2025
Python 135 Updated Jul 1, 2025

A SOTA open-source image editing model that aims to provide performance comparable to closed-source models like GPT-4o and Gemini 2 Flash.

Python 1,491 62 Updated Jun 26, 2025

Open-source unified multimodal model

Python 4,476 375 Updated Jul 2, 2025

SGLang is a fast serving framework for large language models and vision language models.

Python 15,776 2,272 Updated Jul 7, 2025

A linear estimator on top of CLIP to predict the aesthetic quality of pictures

Jupyter Notebook 569 23 Updated Aug 15, 2022

WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation

Python 125 3 Updated Jun 12, 2025

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model with performance approaching GPT-4o.

Python 8,511 655 Updated May 29, 2025

An open-source implementation for fine-tuning the Qwen2-VL and Qwen2.5-VL series by Alibaba Cloud.

Python 914 124 Updated Jul 4, 2025

Code for paper "Super-CLEVR: A Virtual Benchmark to Diagnose Domain Robustness in Visual Reasoning"

Python 41 3 Updated Sep 8, 2023

Minimal reproduction of DeepSeek R1-Zero

Python 11,984 1,492 Updated Apr 24, 2025

Fast and memory-efficient exact attention

Python 18,222 1,785 Updated Jul 6, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 51,644 8,538 Updated Jul 7, 2025

Train transformer language models with reinforcement learning.

Python 14,474 2,018 Updated Jul 6, 2025

Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything

Jupyter Notebook 16,571 1,510 Updated Sep 5, 2024

DINO-X: The World's Top-Performing Vision Model for Open-World Object Detection and Understanding

Python 1,115 45 Updated Jun 20, 2025

Janus-Series: Unified Multimodal Understanding and Generation Models

Python 17,427 2,238 Updated Feb 1, 2025

Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation

Python 1,791 84 Updated Aug 15, 2024

Implementation code of the paper MIGE: A Unified Framework for Multimodal Instruction-Based Image Generation and Editing

Python 64 5 Updated Jun 11, 2025

Benchmark for generative image models

Jupyter Notebook 93 4 Updated Sep 9, 2023

Code release for our NeurIPS 2024 Spotlight paper "GenArtist: Multimodal LLM as an Agent for Unified Image Generation and Editing"

Jupyter Notebook 135 8 Updated Oct 23, 2024

PyTorch implementation of Transfusion, "Predict the Next Token and Diffuse Images with One Multi-Modal Model", from MetaAI

Python 1,167 52 Updated Jun 18, 2025

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen2.5-VL, Qwen2.5-Omni, Qwen2-Audio, Ovis2, InternVL3, Llava, GLM4…

Python 8,510 732 Updated Jul 7, 2025
Python 111 10 Updated Jan 27, 2025

OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340

Jupyter Notebook 4,206 367 Updated Jun 15, 2025

[ICLR 2025] VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation

Python 356 12 Updated Apr 25, 2025