trl Public
Forked from huggingface/trl
Train transformer language models with reinforcement learning.
Python Apache License 2.0 Updated Jun 18, 2024
alignment-handbook Public
Forked from huggingface/alignment-handbook
Robust recipes to align language models with human and AI preferences
Python Apache License 2.0 Updated Mar 12, 2024
self_alignment Public
Retrieval-Augmented Self-Alignment (RASA)
safe-rlhf Public
Forked from PKU-Alignment/safe-rlhf
Safe-RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
Python Apache License 2.0 Updated Dec 2, 2023
alpaca_eval Public
Forked from tatsu-lab/alpaca_eval
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
Jupyter Notebook Apache License 2.0 Updated Nov 26, 2023
RAIN Public
Forked from SafeAILab/RAIN
Official implementation of [RAIN: Your Language Models Can Align Themselves without Finetuning]
Python BSD 2-Clause "Simplified" License Updated Oct 10, 2023
OvercookedGPT Public
Forked from BladeTransformerLLC/OvercookedGPT
An OpenAI gym environment to evaluate the ability of LLMs (e.g., GPT-4, Claude) in long-horizon reasoning and task planning in dynamic multi-agent settings.
Python MIT License Updated May 15, 2023
auto_literature Public
Forked from wilmerwang/autoLiterature
Automatically arrange literature
Python Updated Apr 10, 2023
peer_bc_ct Public
Forked from Stable-Baselines-Team/stable-baselines
Mirror of Stable-Baselines: a fork of OpenAI Baselines, implementations of reinforcement learning algorithms
Python Updated Nov 22, 2021
tslda Public
Replication of the paper "Topic Modeling based Sentiment Analysis on Social Media for Stock Market Prediction".
rl-baselines-zoo Public
Forked from araffin/rl-baselines-zoo
A collection of 100+ pre-trained RL agents using Stable Baselines, training and hyperparameter optimization included.
Python MIT License Updated Nov 19, 2020
multiagent-particle-envs Public
Forked from openai/multiagent-particle-envs
Code for a multi-agent particle environment used in the paper "Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments"
Python MIT License Updated May 18, 2020
end-to-end-negotiator Public
Forked from facebookresearch/end-to-end-negotiator
Deal or No Deal? End-to-End Learning for Negotiation Dialogues
Python Other Updated May 4, 2020
baselines Public
Forked from openai/baselines
OpenAI Baselines: high-quality implementations of reinforcement learning algorithms
Python MIT License Updated Apr 26, 2020
tianshou Public
Forked from thu-ml/tianshou
An elegant, flexible, and superfast PyTorch deep Reinforcement Learning platform.
Python MIT License Updated Apr 22, 2020
PeerLoss Public
Learning with Noisy Labels by adopting a peer prediction loss function.
L_DMI Public
Forked from Newbeeer/L_DMI
Code for NeurIPS 2019 Paper, "L_DMI: An Information-theoretic Noise-robust Loss Function"
Python Updated Nov 13, 2019
torch-ac Public
Forked from lcswillems/torch-ac
Recurrent and multi-process PyTorch implementation of deep reinforcement Actor-Critic algorithms: A2C and PPO
exploration-by-disagreement Public
Forked from pathak22/exploration-by-disagreement
[ICML 2019] TensorFlow Code for Self-Supervised Exploration via Disagreement
trading_strategy Public
Course project for SJTU EE359 Data Mining (advised by Prof. Bo Yuan), in which we use reinforcement learning to choose a trading strategy.