Hongyi Guo gohsyi

🎯

Focusing

Ph.D. at Northwestern University.

32 followers · 33 following

Northwestern University
Evanston

openrlhf Public

Apache License 2.0 Updated Dec 6, 2024
tt3 Public

Python 1 Updated Nov 13, 2024
trl Public
Forked from huggingface/trl

Train transformer language models with reinforcement learning.

Python Apache License 2.0 Updated Jun 18, 2024
alignment-handbook Public
Forked from huggingface/alignment-handbook

Robust recipes to align language models with human and AI preferences

Python Apache License 2.0 Updated Mar 12, 2024
self_alignment Public

Retrieval-Augmented Self-Alignment (RASA)

Jupyter Notebook 1 Updated Feb 21, 2024
safe-rlhf Public
Forked from PKU-Alignment/safe-rlhf

Safe-RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback

Python Apache License 2.0 Updated Dec 2, 2023
alpaca_eval Public
Forked from tatsu-lab/alpaca_eval

An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.

Jupyter Notebook Apache License 2.0 Updated Nov 26, 2023
RAIN Public
Forked from SafeAILab/RAIN

Official implementation of [RAIN: Your Language Models Can Align Themselves without Finetuning]

Python BSD 2-Clause "Simplified" License Updated Oct 10, 2023
LightZero Public
Forked from opendilab/LightZero

Python Apache License 2.0 Updated Sep 29, 2023
OvercookedGPT Public
Forked from BladeTransformerLLC/OvercookedGPT

An OpenAI Gym environment to evaluate the ability of LLMs (e.g., GPT-4, Claude) in long-horizon reasoning and task planning in dynamic multi-agent settings.

Python MIT License Updated May 15, 2023
auto_literature Public
Forked from wilmerwang/autoLiterature

Automatically arrange literature

Python Updated Apr 10, 2023
peer_bc_ct Public
Forked from Stable-Baselines-Team/stable-baselines

Mirror of Stable-Baselines: a fork of OpenAI Baselines, implementations of reinforcement learning algorithms

Python Updated Nov 22, 2021
tslda Public

Replication of paper "Topic Modeling based Sentiment Analysis on Social Media for Stock Market Prediction".

Python 2 1 MIT License Updated Mar 6, 2021
troubleshooting Public

All issues I encountered, continuously updating

Updated Jan 13, 2021
cheatsheet Public

Updated Dec 24, 2020
rl-baselines-zoo Public
Forked from araffin/rl-baselines-zoo

A collection of 100+ pre-trained RL agents using Stable Baselines, training and hyperparameter optimization included.

Python MIT License Updated Nov 19, 2020
survey Public

MIT License Updated Nov 11, 2020
multiagent-particle-envs Public
Forked from openai/multiagent-particle-envs

Code for a multi-agent particle environment used in the paper "Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments"

Python MIT License Updated May 18, 2020
end-to-end-negotiator Public
Forked from facebookresearch/end-to-end-negotiator

Deal or No Deal? End-to-End Learning for Negotiation Dialogues

Python Other Updated May 4, 2020
baselines Public
Forked from openai/baselines

OpenAI Baselines: high-quality implementations of reinforcement learning algorithms

Python MIT License Updated Apr 26, 2020
tianshou Public
Forked from thu-ml/tianshou

An elegant, flexible, and superfast PyTorch deep Reinforcement Learning platform.

Python MIT License Updated Apr 22, 2020
PeerLoss Public

Learning with Noisy Labels by adopting a peer prediction loss function.

Python 35 4 MIT License Updated Mar 3, 2020
L_DMI Public
Forked from Newbeeer/L_DMI

Code for NeurIPS 2019 Paper, "L_DMI: An Information-theoretic Noise-robust Loss Function"

Python Updated Nov 13, 2019
look_for_words Public

Looking for words? Try me.

Python MIT License Updated Oct 9, 2019
gohsyi.github.io Public

CSS Other Updated Sep 19, 2019
taxi Public

CUMCM 2019, Problem C

Python Updated Sep 16, 2019
CoPiEr Public
Forked from ravi-lanka-4/CoPiEr

Co-training for Policy Learning

C Updated Aug 8, 2019
torch-ac Public
Forked from lcswillems/torch-ac

Recurrent and multi-process PyTorch implementation of deep reinforcement Actor-Critic algorithms: A2C and PPO

Python 1 MIT License Updated Jul 22, 2019
exploration-by-disagreement Public
Forked from pathak22/exploration-by-disagreement

[ICML 2019] TensorFlow Code for Self-Supervised Exploration via Disagreement

Python 1 Updated Jul 18, 2019
trading_strategy Public

Course project of SJTU EE359 Data Mining (advised by Prof. Bo Yuan), where we use reinforcement learning to decide trading strategy.

Python 5 Updated Jun 30, 2019
