GitHub - idiap/sdialog: Synthetic Dialog Generation and Analysis with LLMs

SDialog is a modular, extensible Python toolkit for synthetic dialogue generation and analysis, designed for research and development with instruction-tuned Large Language Models (LLMs). It enables flexible, persona-driven, multi-agent dialogue simulation, orchestration, and scenario management, making it ideal for building, evaluating, and experimenting with conversational agents.

🚀 Motivation

Modern conversational AI research and applications increasingly require high-quality, flexible, and reproducible synthetic dialogues for training, evaluation, and benchmarking. SDialog addresses the need for:

Standardization: Clear definitions for dialogue, persona, and event structures.
Abstraction: Abstract interfaces for both single-agent and multi-agent dialogue generation.
Fine-grained Control: Orchestration to inject instructions, simulate user behaviors, and enforce scenario constraints.
LLM Integration: Seamless integration with instruction-tuned LLMs, prompt management, and memory handling.
Scenario and Dataset Management: Tools for managing complex scenarios, flowcharts, and persona definitions.

✨ Features

Persona-based Role-Playing: Define rich agent personas to simulate realistic conversations.
Multi-Agent Dialogue: Generate dialogues between multiple agents, each with their own persona and behavior.
Dialogue Orchestration: Control agent actions and inject instructions dynamically using orchestrators.
Scenario Management: Easily describe and manage dialogue scenarios, including flowcharts and user/system goals.
Flexible Serialization: Export dialogues and events in JSON or plain text for downstream tasks.
Integration with LLMs: Out-of-the-box support for Ollama and LangChain, with planned support for HuggingFace models.

⚡ Installation

pip install sdialog

Note: The default LLM integration requires Ollama to be running on your system. If it is not installed yet, the official install script can be used:
curl -fsSL https://ollama.com/install.sh | sh
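
Before creating agents, it can help to confirm that the Ollama server is actually reachable. The following is a minimal sketch, not part of SDialog: the helper name `ollama_is_running` is hypothetical, and it assumes Ollama listens on its default address `http://localhost:11434` and answers a plain GET on the root path with HTTP 200.

```python
import http.client
import urllib.request


def ollama_is_running(base_url: str = "http://localhost:11434", timeout: float = 2.0) -> bool:
    """Return True if an HTTP server answers with status 200 at base_url.

    Hypothetical helper for illustration; the default URL is an assumption
    based on Ollama's default port.
    """
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, DNS failure, HTTP error, or malformed reply.
        return False
```

Calling `ollama_is_running()` returns a boolean, so a script can fail early with a clear message instead of erroring on the first LLM call.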

🏁 Quick Start

Define personas, create agents, and generate a dialogue:

from sdialog import Persona, PersonaAgent

# Define personas
alice = Persona(name="Alice", role="friendly barista", personality="cheerful and helpful")
bob = Persona(name="Bob", role="customer", personality="curious and polite")

# Create agents
alice_agent = PersonaAgent("llama2", persona=alice, name="Alice")
bob_agent = PersonaAgent("llama2", persona=bob, name="Bob")

# Generate a dialogue
dialog = alice_agent.dialog_with(bob_agent)
dialog.print()
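
The Features list mentions exporting dialogues as JSON or plain text. The sketch below illustrates what those two formats might look like for a short exchange. It deliberately does not use SDialog: `Turn`, `to_json`, and `to_text` are hypothetical names introduced here, and the actual SDialog serialization API and schema may differ.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Turn:
    """Hypothetical stand-in for one dialogue turn (not an SDialog class)."""
    speaker: str
    text: str


def to_json(turns: list[Turn]) -> str:
    """Serialize turns as a JSON array of {"speaker", "text"} objects."""
    return json.dumps([asdict(t) for t in turns], ensure_ascii=False, indent=2)


def to_text(turns: list[Turn]) -> str:
    """Serialize turns as plain text, one 'Speaker: utterance' line per turn."""
    return "\n".join(f"{t.speaker}: {t.text}" for t in turns)


turns = [
    Turn("Alice", "Welcome! What can I get you?"),
    Turn("Bob", "A flat white, please."),
]
```

JSON keeps the speaker and utterance as separate fields for downstream processing, while the plain-text form is easier to read or to paste into a prompt.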

🎛️ Orchestration Example

Add orchestration to control dialogue length or simulate agent behaviors:

from sdialog.orchestrators import LengthOrchestrator, ChangeMindOrchestrator

length_orch = LengthOrchestrator(min=3, max=6)
mind_orch = ChangeMindOrchestrator(probability=0.5, reasons=["changed plans", "new information"], max_times=1)
alice_agent = alice_agent | length_orch | mind_orch
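
To make the idea of orchestration more concrete, here is a conceptual sketch of the kind of decision such components could make before each turn. It is not SDialog's implementation: the function names, parameters, and instruction strings are all assumptions chosen to mirror the `min`/`max` and `probability`/`reasons`/`max_times` arguments shown above.

```python
import random
from typing import Optional


def length_instruction(turns_so_far: int, min_turns: int, max_turns: int) -> Optional[str]:
    """Return an instruction to inject before the next turn, or None."""
    if turns_so_far < min_turns:
        # Too short: steer the agent away from ending the dialogue.
        return "Keep the conversation going; do not say goodbye yet."
    if turns_so_far >= max_turns:
        # Long enough: steer the agent toward closing the dialogue.
        return "Wrap up the conversation and say goodbye now."
    return None


def change_mind_instruction(
    rng: random.Random,
    probability: float,
    reasons: list[str],
    times_fired: int,
    max_times: int,
) -> Optional[str]:
    """With the given probability, and at most max_times, ask the agent to change its mind."""
    if times_fired >= max_times or rng.random() >= probability:
        return None
    return f"Change your mind about your last request. Reason: {rng.choice(reasons)}."
```

In this sketch an orchestrator is just a function that returns either an extra instruction or nothing, which is why several of them can be chained on the same agent.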

📚 STAR Dataset Integration

Work with the STAR dataset for scenario-driven dialogue generation:

from sdialog.datasets import STAR

STAR.set_path("/path/to/star-dataset")

scenario = {
    "Domains": ["banking"],
    "UserTask": "Open a new account",
    "WizardTask": "Assist with account opening",
    "Happy": True,
    "MultiTask": False,
    "WizardCapabilities": [{"Task": "open_account", "Domain": "banking"}]
}

system_agent, user_agent = STAR.get_agents_for_scenario(scenario, "llama2")

dialog = system_agent.dialog_with(user_agent)
dialog.print()

📖 Documentation

Documentation: Full package documentation, including installation, API reference, usage guides, and advanced examples.
API Reference: See the docstrings in the codebase for detailed documentation of all classes and functions.
Tutorials: Hands-on examples provided as Jupyter Notebooks.

🙏 Acknowledgments

This work was supported by the EU Horizon 2020 project ELOQUENCE (grant number 101070558).

This work was initially created in preparation for the 2025 Jelinek Memorial Summer Workshop on Speech and Language Technologies (JSALT 2025), as part of the work done by the "Play your Part" research group.

