Pinned
-
Trust-Region-Preference-Approximation (Public, forked from volcengine/verl)
Trust Region Preference Approximation: A simple and stable reinforcement learning algorithm for LLM reasoning
Python 10
-
Reproduce-DeepSeek-R1-Survey (Public)
This repository collects works that reproduce DeepSeek R1, along with related works on DeepSeek R1 and the DeepSeek series.
-
Diffusion-Demo (Public)
Modified version of a Google Colab notebook (https://colab.research.google.com/drive/1sjy9odlSSy0RBVgMTgP7s99NXsqglsUL?usp=sharing#scrollTo=BIc33L9-uK4q)
Python
-
jsikyoon/dreamer-torch (Public)
PyTorch version of Dreamer, following the original TensorFlow v2 code.