lafmdp (Jing-Cheng Pang) / Starred · GitHub

🎯

Focusing

Jing-Cheng Pang lafmdp

Ph.D. student at Nanjing University. Interested in reinforcement learning.

41 followers · 15 following

Nanjing University
Nanjing, Jiangsu, China
https://www.lamda.nju.edu.cn/pangjc

Lists (8)

🤖Autonomous Agent

An agent perceives its environment, acts autonomously to achieve goals, and may improve its performance by learning or acquiring knowledge.

Benchmark

Benchmark for experimental environments, algorithms, etc.

49 repositories

Efficiency

Implementing something with high efficiency.

Interesting tools 🔨

Some interesting tools

53 repositories

Models🌲

Open-sourced foundation models, language models, multi-modal models.

Paper collections📚

Paper Implementation 📄

Released algorithm implementation.

54 repositories

Tutorial 📚

Tutorials or statistical list.

13 repositories

Stars

ByteDance-Seed / Seed-Thinking-v1.5

770 13 Updated Apr 20, 2025

LAMDA-RL / KALM

Forked from CharlieBrown-v1/KALM

[NeurIPS'24] KALM: Knowledgeable Agents by Offline Reinforcement Learning from Large Language Model Rollouts

1 Updated Jan 20, 2025

LAMDA-RL / ImagineBench

A benchmark for evaluating reinforcement learning algorithms that train policies using imaginary rollouts from LLMs.

Python 7 Updated May 27, 2025

LAMDA-RL / CoLA

Python 4 Updated Mar 26, 2025

polixir / NeoRL2

Python 12 2 Updated May 20, 2025

yuanyaaa / InCLET

Official code repository for "InCLET: Large Language Model In-context Learning can Improve Embodied Instruction-following"

Python 3 1 Updated Mar 17, 2025

OpenManus / OpenManus-RL

A live-stream development of RL tuning for LLM agents

Python 2,864 397 Updated May 23, 2025

mila-iqia / babyai

BabyAI platform. A testbed for training agents to understand and execute language commands.

Python 728 151 Updated Oct 1, 2023

stepjam / RLBench

A large-scale benchmark and learning environment.

Python 1,392 270 Updated Jan 25, 2025

Trevor-emt / Reviwo

Python 6 1 Updated Mar 2, 2025

LAMDA-RL / Pretrained_BWArea_2.7B_30G

Pre-trained Models of BWArea Model

Python 9 Updated Sep 10, 2024

tsinghua-fib-lab / AgentSociety

AgentSociety: Large-scale Social Simulation to Understand Human Behaviors and Society through LLM-driven Agents

Python 303 49 Updated May 20, 2025

deepseek-ai / DeepSeek-R1

89,492 11,569 Updated Apr 9, 2025

dzhng / deep-research

An AI-powered research assistant that performs iterative, deep research on any topic by combining search engines, web scraping, and large language models. The goal of this repo is to provide the si…

TypeScript 16,391 1,683 Updated Apr 12, 2025

yenche123 / liubai

Supercharge yourself!

TypeScript 808 76 Updated May 27, 2025

NovaSky-AI / SkyThought

Sky-T1: Train your own O1 preview model within $450

Python 3,253 324 Updated May 18, 2025

CharlieBrown-v1 / KALM

Python 7 3 Updated Apr 18, 2025

THUDM / AgentBench

A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)

Python 2,587 183 Updated Jan 30, 2025

InternLM / lagent

A lightweight framework for building LLM-based agents

Python 2,134 218 Updated Mar 14, 2025

SWE-bench / SWE-bench

SWE-bench [Multimodal]: Can Language Models Resolve Real-world Github Issues?

Python 2,988 515 Updated May 22, 2025

XiaoMi / ha_xiaomi_home

Xiaomi Home Integration for Home Assistant

Python 19,861 1,015 Updated May 23, 2025

zhanshijinwat / Steel-LLM

Train a 1B LLM with 1T tokens from scratch as an individual

Jupyter Notebook 660 70 Updated Apr 27, 2025

OpenRLHF / OpenRLHF

An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & vLLM & RFT & Dynamic Sampling & Async Agent RL)

Python 6,842 669 Updated May 27, 2025

OpenDriveLab / End-to-end-Autonomous-Driving

[IEEE T-PAMI 2024] All you need for End-to-end Autonomous Driving

3,074 285 Updated Dec 17, 2024

FLAIROx / Kinetix

Reinforcement learning on general 2D physics environments in JAX. ICLR 2025 Oral.

Python 183 7 Updated Mar 22, 2025

elicassion / sugarl

Code for NeurIPS 2023 paper "Active Vision Reinforcement Learning with Limited Visual Observability"

Python 53 2 Updated Oct 10, 2024

Tencent-Hunyuan / Hunyuan3D-1

Tencent Hunyuan3D-1.0: A Unified Framework for Text-to-3D and Image-to-3D Generation

Python 3,383 262 Updated Jan 21, 2025

polixir / OfflineRL

A collection of offline reinforcement learning algorithms.

Python 185 21 Updated Nov 26, 2024

deepcs233 / Visual-CoT

[NeurIPS'24 Spotlight] Visual CoT: Advancing Multi-Modal Language Models with a Comprehensive Dataset and Benchmark for Chain-of-Thought Reasoning

Python 318 15 Updated Dec 22, 2024

yihaosun1124 / OfflineRL-Kit

An elegant PyTorch offline reinforcement learning library for researchers.

Python 337 38 Updated Apr 17, 2024
