billhhh (Bill Wang) / Starred · GitHub

Bill Wang billhhh

Coding

AI research scientist

211 followers · 389 following

UAE
https://huwang01.github.io

Starred repositories

AMAP-ML / GPG

GPG: A Simple and Strong Reinforcement Learning Baseline for Model Reasoning

Python 128 5 Updated May 14, 2025

BritishMachineVisionAssociation / BMVCTemplate

Paper template and author instructions for BMVC

TeX 172 105 Updated Apr 28, 2025

billhhh / KRPO_LLMs_RL

The code repository for the paper "Kalman Filter Enhanced Group Relative Policy Optimization for Language Model Reasoning"

Python 2 Updated May 16, 2025

open-thought / system-2-research

System 2 Reasoning Link Collection

833 74 Updated Mar 16, 2025

policy-gradient / GRPO-Zero

Implementing DeepSeek R1's GRPO algorithm from scratch

Python 1,331 50 Updated Apr 18, 2025

paperswithcode / releasing-research-code

Tips for releasing research code in Machine Learning (with official NeurIPS 2020 recommendations)

2,772 730 Updated May 19, 2023

sdicastro / KOVA

Kalman Optimization for Value Approximation

Python 11 1 Updated Feb 17, 2020

chunhuizhang / llm_rl

llm & rl

Jupyter Notebook 121 12 Updated May 12, 2025

philschmid / deep-learning-pytorch-huggingface

Jupyter Notebook 1,183 240 Updated Feb 27, 2025

HyperGAI / HPT

HPT - Open Multimodal LLMs from HyperGAI

Python 315 22 Updated Jun 6, 2024

willccbb / verifiers

Verifiers for LLM Reinforcement Learning

Python 961 112 Updated May 16, 2025

dnl-blkv / mcdowell-cv

A Nice-looking CV template made into LaTeX

TeX 2,352 803 Updated May 10, 2024

academicpages / academicpages.github.io

Github Pages template based upon HTML and Markdown for personal, portfolio-based websites.

HTML 14,152 46,363 Updated May 14, 2025

aburkov / theLMbook

This is the official repository for The Hundred-Page Language Models Book by Andriy Burkov

Jupyter Notebook 1,557 252 Updated May 10, 2025

huggingface / trl

Train transformer language models with reinforcement learning.

Python 13,763 1,886 Updated May 16, 2025

open-thought / tiny-grpo

Minimal hackable GRPO implementation

Python 225 29 Updated Jan 31, 2025

huggingface / open-r1

Fully open reproduction of DeepSeek-R1

Python 24,436 2,250 Updated May 16, 2025

lsdefine / simple_GRPO

A very simple GRPO implementation for reproducing R1-like LLM thinking.

Python 1,045 86 Updated Apr 3, 2025

rsshyam / GRPO

Python 59 6 Updated Jul 28, 2024

kaapana / kaapana

Kaapana is an open source toolkit for state of the art platform provisioning in the field of medical data analysis. The applications comprise AI-based workflows and federated learning scenarios wit…

Python 193 50 Updated May 16, 2025

aimagelab / mammoth

An Extendible (General) Continual Learning Framework based on Pytorch - official codebase of Dark Experience for General Continual Learning

Python 678 117 Updated May 15, 2025

Guang000 / Awesome-Dataset-Distillation

A curated list of awesome papers on dataset distillation and related applications.

HTML 1,657 146 Updated May 6, 2025

ContinualAI / avalanche

Avalanche: an End-to-End Library for Continual Learning based on PyTorch.

Python 1,889 305 Updated Mar 11, 2025

BioMedIA-MBZUAI / SALT

This repository contains the official source code for SALT: Parameter-Efficient Fine-Tuning via Singular Value Adaptation with Low-Rank Transformation.

Python 23 1 Updated Apr 16, 2025

johannes-fetz / joengine

Jo Engine is an open source 2D and 3D game engine for the Sega Saturn written in C under MIT license

C 234 35 Updated Apr 20, 2025

kvablack / ddpo-pytorch

DDPO for finetuning diffusion models, implemented in PyTorch with LoRA support

Python 573 53 Updated Mar 22, 2024

Mauville / MedCLIP

Medical image captioning using OpenAI's CLIP

Jupyter Notebook 75 15 Updated Mar 7, 2023

liaolea / MVP

Modeling Variants of Prompts for Vision-Language Models

Python 4 1 Updated Apr 1, 2025

billhhh / Rethink-Merge

The code repository for the [paper](https://arxiv.org/abs/2411.09263) "Rethinking Weight-Averaged Model-merging".

Python 2 1 Updated Mar 6, 2025

zsef123 / PointRend-PyTorch

A PyTorch implementation of PointRend: Image Segmentation as Rendering

Jupyter Notebook 379 75 Updated Feb 15, 2020

Starred topics

multihead-attention

multi-head-attention
