University of Science and Technology of China
- China
- https://www.jsingmog.top/
Stars
Democratizing Reinforcement Learning for LLMs
SPRATeam-USTC / RFL-MSD
Forked from JingMog/RFL-MSD. [AAAI'25 Oral] "RFL: Simplifying Chemical Structure Recognition with Ring-Free Language".
Official PyTorch implementation of our paper "Multimodal Tree Decoder for Table of Contents Extraction in Document Images"
Code for paper: MATHS: Multimodal Transformer-based Human-readable Solver
A pipeline for the automatic construction of geometry problems along with step-by-step solutions.
The official implementation of our AAAI 2025 paper, DocMamba
Official implementation of Dynamic Frame Avatar with Non-autoregressive Diffusion Framework for Talking Head Video Generation
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen2.5-VL, Qwen2.5-Omni, Qwen2-Audio, Ovis2, InternVL3, Llava, GLM4…
This is the first paper to explore how to effectively use RL for MLLMs and introduce Vision-R1, a reasoning MLLM that leverages cold-start initialization and RL training to incentivize reasoning ca…
verl: Volcano Engine Reinforcement Learning for LLMs
Witness the aha moment of VLM with less than $3.
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.
Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
Janus-Series: Unified Multimodal Understanding and Generation Models
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
✨✨Latest Advances on Multimodal Large Language Models
Official repository of the paper "MuQ: Self-Supervised Music Representation Learning with Mel Residual Vector Quantization".
[ICASSP 2025] "Enhancing Multimodal Sentiment Analysis for Missing Modality through Self-Distillation and Unified Modality Cross-Attention"
This code is an example of how to use the DePlot model provided by the authors together with an LLM in your own Python files.
Pandaaaa906 / RFL-MSD
Forked from JingMog/RFL-MSD. Official implementation of our paper "RFL: Simplifying Chemical Structure Recognition with Ring-Free Language", accepted by AAAI 2025.
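A hand-written convolutional neural network kernel for node classification and link prediction on graphs, evaluated on three datasets (cora, citeseer, ppi), with an analysis of how self-loops, number of layers, DropEdge, PairNorm, activation functions, and other factors affect the model's classification and prediction performance.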
[AAAI'25 Oral] "RFL: Simplifying Chemical Structure Recognition with Ring-Free Language".