billhhh (Bill Wang) / Starred · GitHub

Starred repositories

GPG: A Simple and Strong Reinforcement Learning Baseline for Model Reasoning

Python 128 5 Updated May 14, 2025

Paper template and author instructions for BMVC

TeX 172 105 Updated Apr 28, 2025

The code repository for paper "Kalman Filter Enhanced Group Relative Policy Optimization for Language Model Reasoning"

Python 2 Updated May 16, 2025

System 2 Reasoning Link Collection

833 74 Updated Mar 16, 2025

Implementing DeepSeek R1's GRPO algorithm from scratch

Python 1,331 50 Updated Apr 18, 2025

Tips for releasing research code in Machine Learning (with official NeurIPS 2020 recommendations)

2,772 730 Updated May 19, 2023

Kalman Optimization for Value Approximation

Python 11 1 Updated Feb 17, 2020

llm & rl

Jupyter Notebook 121 12 Updated May 12, 2025

HPT - Open Multimodal LLMs from HyperGAI

Python 315 22 Updated Jun 6, 2024

Verifiers for LLM Reinforcement Learning

Python 961 112 Updated May 16, 2025

A Nice-looking CV template made into LaTeX

TeX 2,352 803 Updated May 10, 2024

GitHub Pages template based on HTML and Markdown for personal, portfolio-based websites.

HTML 14,152 46,363 Updated May 14, 2025

This is the official repository for The Hundred-Page Language Models Book by Andriy Burkov

Jupyter Notebook 1,557 252 Updated May 10, 2025

Train transformer language models with reinforcement learning.

Python 13,763 1,886 Updated May 16, 2025

Minimal hackable GRPO implementation

Python 225 29 Updated Jan 31, 2025

Fully open reproduction of DeepSeek-R1

Python 24,436 2,250 Updated May 16, 2025

A very simple GRPO implementation for reproducing R1-like LLM thinking.

Python 1,045 86 Updated Apr 3, 2025
Python 59 6 Updated Jul 28, 2024

Kaapana is an open source toolkit for state of the art platform provisioning in the field of medical data analysis. The applications comprise AI-based workflows and federated learning scenarios wit…

Python 193 50 Updated May 16, 2025

An Extendible (General) Continual Learning Framework based on PyTorch - official codebase of Dark Experience for General Continual Learning

Python 678 117 Updated May 15, 2025

A curated list of awesome papers on dataset distillation and related applications.

HTML 1,657 146 Updated May 6, 2025

Avalanche: an End-to-End Library for Continual Learning based on PyTorch.

Python 1,889 305 Updated Mar 11, 2025

This repository contains the official source code for SALT: Parameter-Efficient Fine-Tuning via Singular Value Adaptation with Low-Rank Transformation.

Python 23 1 Updated Apr 16, 2025

Jo Engine is an open source 2D and 3D game engine for the Sega Saturn written in C under MIT license

C 234 35 Updated Apr 20, 2025

DDPO for finetuning diffusion models, implemented in PyTorch with LoRA support

Python 573 53 Updated Mar 22, 2024

Medical image captioning using OpenAI's CLIP

Jupyter Notebook 75 15 Updated Mar 7, 2023

Modeling Variants of Prompts for Vision-Language Models

Python 4 1 Updated Apr 1, 2025

The code repository for the [paper](https://arxiv.org/abs/2411.09263) "Rethinking Weight-Averaged Model-merging".

Python 2 1 Updated Mar 6, 2025

A PyTorch implementation of PointRend: Image Segmentation as Rendering

Jupyter Notebook 379 75 Updated Feb 15, 2020