Stars
Use ChatGPT to summarize arXiv papers. Accelerate the entire research workflow: use ChatGPT for full-paper summarization, professional translation, polishing, peer review, and review responses.
An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & LoRA & vLLM & RFT)
Listwise Reward Estimation for Offline Preference-based Reinforcement Learning (ICML 2024)
Collections of robotics environments geared towards benchmarking multi-task and meta reinforcement learning
NonTrivial-MIPS is a synthesizable superscalar MIPS processor with branch prediction and FPU support, capable of booting Linux.
Official code for ICLR'25 paper [Adversarial Policy Optimization for Offline Preference-based Reinforcement Learning]
An elegant PyTorch offline reinforcement learning library for researchers.
This is the homepage of a new book entitled "Mathematical Foundations of Reinforcement Learning."
Pref-RL provides ready-to-use PbRL agents that are easily extensible.
PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)
Official codebase for "B-Pref: Benchmarking Preference-Based Reinforcement Learning"; contains scripts to reproduce experiments.
Reproduction of OpenAI and DeepMind's "Deep Reinforcement Learning from Human Preferences"
Human preference data for "Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback"
[NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences for Text-to-image Generation
Shanghai Jiao Tong University, Zhiyuan College mathematics track, Specialized Seminar Course 3 [Topics in Computational Neuroscience]
Guide on how to use Qemu to create a similar effect to Windows Subsystem for Linux on macOS. Unfinished; contributions are welcome!
All Algorithms implemented in Python
[Lumina Embodied AI Community] A technical guide to embodied AI (Embodied-AI-Guide)