Stars
verl: Volcano Engine Reinforcement Learning for LLMs
Train transformer language models with reinforcement learning.
This is the official repository of the paper "FunReason: Enhancing Large Language Models’ Function Calling via Self-Refinement Multiscale Loss and Automated Data Refinement"
Multilingual large voice generation model, providing full-stack inference, training, and deployment capabilities.
prima.cpp: Speeding up 70B-scale LLM inference on low-resource everyday home clusters
A LaTeX Template for Dissertation Writing at the University of Electronic Science and Technology of China Since 2024
Fully open reproduction of DeepSeek-R1
Open Source Deep Research Alternative to Reason and Search on Private Data. Written in Python.
A Python library for social event detection
An Open-Source Package for Knowledge Embedding (KE)
PyTorch code for SpERT: Span-based Entity and Relation Transformer
UDPipe-based preprocessing of the ACE05 dataset
Official inference repo for FLUX.1 models
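🚀 Reverse-engineered API for the Doubao large model [specialty: powerful web search], with zero-configuration deployment and multi-token support. For testing only; for commercial use, please go to the official open platform.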
KAG is a logical form-guided reasoning and retrieval framework based on OpenSPG engine and LLMs. It is used to build logical reasoning and factual Q&A solutions for professional domain knowledge ba…
Repo for ShenNong-TCM-LLM (the “ShenNong” large model, the first Chinese-language large model for traditional Chinese medicine)
Convert PDF to markdown + JSON quickly with high accuracy
An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.
The Cradle framework is a first attempt at General Computer Control (GCC). Cradle supports agents to ace any computer task by enabling strong reasoning abilities, self-improvement, and skill curatio…
Define a schema, then use `go generate` to generate database CRUD or HTTP request code.
🔥Your Daily Dose of AI Research from Hugging Face 🔥 Stay updated with the latest AI breakthroughs! This bot automatically collects and analyzes papers from 🤗 Hugging Face's daily papers, providing …
An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & vLLM & Ray & Dynamic Sampling & Async Agentic RL)