Stars
Data/Code Repository for https://api.semanticscholar.org/CorpusID:218470122
ToolQA, a new dataset to evaluate the capabilities of LLMs in answering challenging questions with external tools. It offers two levels (easy/hard) across eight real-life scenarios.
"Simulating User Satisfaction for the Evaluation of Task-oriented Dialogue Systems" in SIGIR'21
An Open-source RL System from ByteDance Seed and Tsinghua AIR
A new tool learning benchmark aiming at well-balanced stability and reality, based on ToolBench.
[ICLR'24 spotlight] An open platform for training, serving, and evaluating large language models for tool learning.
An LLM-based autonomous agent controlling real-world applications via RESTful APIs
The implementation for ICLR 2025 Oral: From Exploration to Mastery: Enabling LLMs to Master Tools via Self-Driven Interactions.
xLAM: A Family of Large Action Models to Empower AI Agent Systems
A curated list of papers on constrained decoding for LLMs, along with their relevant code and resources.
Efficient LLM Inference over Long Sequences
Resources for our paper: "EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms"
Query your data using familiar SQL or intuitive Piped Processing Language (PPL)
🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with OpenTelemetry, Langchain, OpenAI SDK, LiteLLM, and more. 🍊YC W23
A coding agent framework that works on its own codebase.
(NeurIPS 2024) AvaTaR: Optimizing LLM Agents for Tool Usage via Contrastive Reasoning
Synthesizing High-quality Text-to-SQL Data at Scale. SynSQL-2.5M is the first million-scale cross-domain text-to-SQL dataset.
Contextual Harnessing for Efficient SQL Synthesis
This is a continuously updated handbook for readers to easily track the latest Text-to-SQL techniques in the literature and provide practical guidance for researchers and practitioners. Official re…
[EMNLP 2023 Findings] ACT-SQL: In-Context Learning for Text-to-SQL with Automatically-Generated Chain-of-Thought
Dialog2Flow: convert your dialogs to flows. This repository accompanies the paper "Dialog2Flow: Pre-training Soft-Contrastive Sentence Embeddings for Automatic Dialog Flow Extraction", accepted to …
Codebase for the paper "Reusing Embeddings: Reproducible Reward Model Research in Large Language Model Alignment without GPUs"
Suna - Open Source Generalist AI Agent
A collection of research and survey papers on real-time bidding (RTB) based display advertising techniques.