mythy-xie / Starred · GitHub

Southeast University course sharing project

1,380 318 Updated Sep 30, 2020

Focus on prompting and generating

Python 44,757 6,961 Updated Jan 24, 2025

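Use ChatGPT to summarize arXiv papers. Accelerates the whole research workflow: uses ChatGPT for full-paper summarization, professional translation, polishing, peer review, and review responses.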

Python 18,896 1,947 Updated Apr 4, 2024
Jupyter Notebook 3 1 Updated Nov 19, 2023

An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & LoRA & vLLM & RFT)

Python 6,676 651 Updated May 13, 2025

Listwise Reward Estimation for Offline Preference-based Reinforcement Learning (ICML 2024)

Python 14 2 Updated Jun 18, 2024

Collections of robotics environments geared towards benchmarking multi-task and meta reinforcement learning

Python 1,441 293 Updated Apr 8, 2025

Soft Actor-Critic

Python 1,092 239 Updated Nov 29, 2023

Official code for "RAMBO: Robust Adversarial Model-Based Offline RL", NeurIPS 2022

Python 27 6 Updated Jun 2, 2023

Blogs of fellow students, collected by the world-class, all-embracing TUNA association

Python 932 136 Updated Mar 28, 2025

NonTrivial-MIPS is a synthesizable superscalar MIPS processor with branch prediction and FPU support, and it is capable of booting Linux.

SystemVerilog 599 102 Updated Jul 7, 2020

Official code for ICLR'25 paper [Adversarial Policy Optimization for Offline Preference-based Reinforcement Learning]

Jupyter Notebook 1 Updated Feb 28, 2025

An elegant PyTorch offline reinforcement learning library for researchers.

Python 329 37 Updated Apr 17, 2024

This is the homepage of a new book entitled "Mathematical Foundations of Reinforcement Learning."

MATLAB 8,852 895 Updated May 4, 2025

Pref-RL provides ready-to-use PbRL agents that are easily extensible.

Python 11 4 Updated Aug 31, 2022

PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.

Python 10,613 1,834 Updated May 9, 2025

A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)

Python 4,641 477 Updated Jan 8, 2024
Python 1 Updated Mar 1, 2021

Official codebase for "B-Pref: Benchmarking Preference-Based Reinforcement Learning"; contains scripts to reproduce experiments.

Python 121 28 Updated Nov 3, 2021

Reproduction of OpenAI and DeepMind's "Deep Reinforcement Learning from Human Preferences"

Python 318 70 Updated Nov 29, 2021

Reproduction of OpenAI and DeepMind's "Deep Reinforcement Learning from Human Preferences"

Python 29 7 Updated Jul 27, 2021

Human preference data for "Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback"

1,739 139 Updated Sep 19, 2023

[NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences for Text-to-image Generation

Python 1,397 71 Updated Jan 24, 2025

Shanghai Jiao Tong University, Zhiyuan mathematics track, Specialized Seminar 3 [Topics in Computational Neuroscience]

Jupyter Notebook 18 8 Updated Oct 26, 2023

Guide on how to use Qemu to create a similar effect to Windows Subsystem for Linux on macOS. Unfinished; contributions are welcome!

585 11 Updated Sep 18, 2022

The Python programming language

Python 66,947 31,868 Updated May 14, 2025

All Algorithms implemented in Python

Python 200,465 46,753 Updated May 14, 2025

[Lumina Embodied AI Community] Embodied AI technical guide (Embodied-AI-Guide)

5,069 327 Updated May 9, 2025

https://hrl.boyuai.com/

Jupyter Notebook 3,410 662 Updated Nov 22, 2022