wsyadc (Shenyang Wang) / Starred · GitHub
  • Northeastern University (China)
  • Shenyang, Liaoning, China

Starred repositories

End-to-End Navigation with VLMs

Python 86 5 Updated Apr 7, 2025

[IROS'25 Oral] WMNav: Integrating Vision-Language Models into World Models for Object Goal Navigation

Python 83 2 Updated Jun 16, 2025
Jupyter Notebook 83 2 Updated Dec 29, 2023

【CVPR 2025 Highlight】MonSter: Marry Monodepth to Stereo Unleashes Power

Python 588 33 Updated Jun 23, 2025
Python 142 21 Updated Jun 19, 2025

Code of the paper "NavCoT: Boosting LLM-Based Vision-and-Language Navigation via Learning Disentangled Reasoning" (TPAMI 2025)

Python 85 2 Updated Jun 4, 2025
Python 15 Updated May 20, 2025

translation_baseline

Python 3 Updated May 22, 2025

[ECCV 2024] Official implementation of NavGPT-2: Unleashing Navigational Reasoning Capability for Large Vision-Language Models

Python 188 12 Updated Sep 20, 2024

RAG vector retrieval (recall) example

Python 127 22 Updated Feb 14, 2024

Convert PDF to markdown + JSON quickly with high accuracy

Python 26,149 1,686 Updated Jun 27, 2025

RoboVerse: Towards a Unified Platform, Dataset and Benchmark for Scalable and Generalizable Robot Learning

Python 1,278 82 Updated Jun 27, 2025

Reading list for research topics in embodied vision

630 77 Updated Jun 13, 2025

Ideas and thoughts about the fascinating Vision-and-Language Navigation

234 16 Updated Jun 28, 2023

MM-EUREKA: Exploring the Frontiers of Multimodal Reasoning with Rule-based Reinforcement Learning

Python 670 24 Updated Jun 25, 2025

LLaVA-CoT, a visual language model capable of spontaneous, systematic reasoning

Python 2,018 77 Updated May 13, 2025

Extend OpenRLHF to support LMM RL training for reproduction of DeepSeek-R1 on multimodal tasks.

Python 781 47 Updated May 14, 2025

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python 2,812 213 Updated Jun 27, 2025

This is the first paper to explore how to effectively use RL for MLLMs and introduce Vision-R1, a reasoning MLLM that leverages cold-start initialization and RL training to incentivize reasoning ca…

Python 621 13 Updated Jun 26, 2025

A fork to add multimodal model training to open-r1

Python 1,316 63 Updated Feb 8, 2025

Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey

668 20 Updated Jun 21, 2025

R1-onevision, a visual language model capable of deep CoT reasoning.

Python 532 14 Updated Apr 13, 2025

From a nobody to a large language model (LLM) hero~ Stay tuned for more!

Jupyter Notebook 1,441 95 Updated Apr 13, 2025

WWW2025 Multimodal Intent Recognition for Dialogue Systems Challenge

Python 120 15 Updated Nov 11, 2024

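An MCP server for querying the technical documentation of mainstream agent frameworks (supports both the stdio and SSE transport protocols); covers langchain, llama-index, autogen, agno, openai-agents-sdk, mcp-doc, camel-ai, and crew-ai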

Python 117 24 Updated May 5, 2025

A lightweight, powerful framework for multi-agent workflows

Python 12,019 1,814 Updated Jun 27, 2025

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

Python 11,251 1,142 Updated Jun 27, 2025

Official code and checkpoint release for mobile robot foundation models: GNM, ViNT, and NoMaD.

Python 889 126 Updated Sep 15, 2024
Python 45 1 Updated Jun 4, 2025