Outformer is a library that lets language models generate structured outputs. It produces syntactically valid JSON every time by having the model generate only the values, while the structure defined by your schema is kept intact.
- Structured Output Generation: Generate valid JSON outputs from language models
- Schema Validation: Ensure outputs conform to your JSON schema
- Flexible Integration: Works with any Hugging Face transformer model
- Easy to Use: Simple API with minimal configuration
- Value Highlighting: Visualize generated values in your JSON structure
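The idea of generating only the values while code supplies the structure can be illustrated with a small, self-contained sketch. This is not Outformer's implementation: `fill_schema` and `fake_model` are hypothetical names, and a stub function stands in for the language model. It only shows why the result is always well-formed, since keys, brackets, and nesting never come from the model.

```python
import json
from typing import Any, Callable


def fill_schema(schema: dict, generate_value: Callable[[str, dict], Any], path: str = "") -> Any:
    """Walk the schema; the structure comes from this code, only leaf values from generate_value."""
    kind = schema["type"]
    if kind == "object":
        # Keys are copied from the schema, never produced by the model.
        return {
            key: fill_schema(sub, generate_value, f"{path}.{key}" if path else key)
            for key, sub in schema["properties"].items()
        }
    if kind == "array":
        # Simplification for this sketch: always emit exactly minItems elements.
        count = schema.get("minItems", 1)
        return [
            fill_schema(schema["items"], generate_value, f"{path}[{i}]")
            for i in range(count)
        ]
    return generate_value(path, schema)


def fake_model(path: str, schema: dict) -> Any:
    """Hypothetical stand-in for a language model: returns a placeholder per leaf."""
    if "enum" in schema:
        return schema["enum"][0]
    placeholders = {"string": f"<{path}>", "number": 0, "boolean": False}
    return placeholders[schema["type"]]


demo_schema = {
    "type": "object",
    "properties": {
        "brand": {"type": "string"},
        "gender": {"type": "string", "enum": ["Female", "Male", "Unisex"]},
        "features": {"type": "array", "minItems": 2, "items": {"type": "string"}},
    },
}

result = fill_schema(demo_schema, fake_model)
print(json.dumps(result, indent=2))
```

Swapping `fake_model` for a function that prompts a real model would change the leaf values, but not the shape of the result.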
We recommend Python 3.10+, PyTorch 2.7.0+, and transformers v4.51.3+.
Install from PyPI:
pip install outformer
Or install from source:
git clone https://github.com/milistu/outformer.git
cd outformer
pip install -e .
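To see whether an environment meets the recommended versions, a quick check along these lines may help. It is a sketch using only the standard library; `check_environment` and `parse_version` are hypothetical helpers, and the only assumption about packaging is that PyTorch is installed under the distribution name `torch`.

```python
import re
import sys
from importlib.metadata import PackageNotFoundError, version

# Minimum versions recommended in this README.
RECOMMENDED = {"torch": (2, 7, 0), "transformers": (4, 51, 3)}


def parse_version(text: str) -> tuple:
    """Turn '2.7.0+cu126' into (2, 7, 0); suffixes such as '+cu126' or 'rc1' are ignored."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    parts = tuple(int(part) for part in match.group().split(".")) if match else ()
    # Pad short versions such as '2.7' to three components before comparing.
    return parts + (0,) * max(0, 3 - len(parts))


def check_environment() -> dict:
    """Report, per component, whether it meets the recommended minimum."""
    report = {"python": "ok" if sys.version_info >= (3, 10) else "older than recommended"}
    for package, minimum in RECOMMENDED.items():
        try:
            installed = parse_version(version(package))
        except PackageNotFoundError:
            report[package] = "not installed"
            continue
        report[package] = "ok" if installed >= minimum else "older than recommended"
    return report


print(check_environment())
```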
Here's a simple example to get you started:
from outformer import Jsonformer, highlight_values
from transformers import AutoModelForCausalLM, AutoTokenizer
# Initialize model and tokenizer
model_name = "Qwen/Qwen3-1.7B"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
# Create Jsonformer instance
jsonformer = Jsonformer(model, tokenizer, max_tokens_string=30)
# Define your JSON schema
json_schema = {
    "type": "object",
    "properties": {
        "brand": {
            "type": "string",
            "description": "Brand of the product",
        },
        "model": {
            "type": "string",
            "description": "Model of the product",
        },
        "product_type": {
            "type": "string",
            "description": "Type of the product",
        },
        "gender": {
            "type": "string",
            "enum": ["Female", "Male", "Unisex"],
        },
        "color": {
            "type": "string",
            "description": "Color of the product if specified, otherwise return 'Unknown'",
        },
        "material": {
            "type": "string",
            "description": "Material of the product if specified, otherwise return 'Unknown'",
        },
        "features": {
            "type": "array",
            "minItems": 3,
            "items": {
                "type": "string",
                "description": "Features of the product that may be relevant for the customer. Extract as much as possible.",
            },
        },
    },
}
# Your input prompt
prompt = """
Extract key information from the product description:
adidas Men's Powerlift.3 Cross-Trainer Shoes
A powerful shoe with lockdown fit. Made with an extra-wide design that allows the foot to spread, these men's lifting/weight-training shoes pair a snug-fitting upper with a wide midfoot strap for extra support. A high-density die-cut wedge midsole keeps you close to the ground.
100% Synthetic leather
Imported
Rubber sole
Removable Insole
"""
# Generate structured output
generated_data = jsonformer.generate(schema=json_schema, prompt=prompt)
# Highlight generated values
highlight_values(generated_data)
The code above will generate a structured JSON output and display it with highlighted values. Here's what you'll get:
{
    "brand": "Adidas",
    "model": "Powerlift.3 Cross-Trainer Shoes",
    "product_type": "Cross-Trainer Shoes",
    "gender": "Male",
    "color": "Unknown",
    "material": "Synthetic leather",
    "features": [
        "Lockdown fit",
        "Extra-wide design",
        "High-density die-cut wedge midsole"
    ]
}
When using highlight_values(), the output is displayed in your terminal with the generated values highlighted in color, making it easy to distinguish the structure from the generated content.
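If you want to double-check a result against its schema yourself, a minimal hand-rolled checker could look like the sketch below. It assumes the generated data is a plain Python dictionary, as the example output suggests. `check_against_schema` is a hypothetical helper, not part of Outformer, and it covers only the keywords used in this README. It also treats every listed property as required, which is stricter than standard JSON Schema.

```python
def check_against_schema(data, schema, path="$"):
    """Return a list of problems; an empty list means the data passed this minimal check."""
    problems = []
    kind = schema.get("type")
    if kind == "object":
        if not isinstance(data, dict):
            return [f"{path}: expected object"]
        for key, sub in schema.get("properties", {}).items():
            if key not in data:
                problems.append(f"{path}.{key}: missing")
            else:
                problems += check_against_schema(data[key], sub, f"{path}.{key}")
    elif kind == "array":
        if not isinstance(data, list):
            return [f"{path}: expected array"]
        if len(data) < schema.get("minItems", 0):
            problems.append(f"{path}: fewer than {schema['minItems']} items")
        for i, item in enumerate(data):
            problems += check_against_schema(item, schema.get("items", {}), f"{path}[{i}]")
    elif kind == "string":
        if not isinstance(data, str):
            problems.append(f"{path}: expected string")
        elif "enum" in schema and data not in schema["enum"]:
            problems.append(f"{path}: {data!r} not in enum")
    return problems


# A trimmed version of the schema and output from the example above.
schema = {
    "type": "object",
    "properties": {
        "brand": {"type": "string"},
        "gender": {"type": "string", "enum": ["Female", "Male", "Unisex"]},
        "features": {"type": "array", "minItems": 3, "items": {"type": "string"}},
    },
}
sample = {
    "brand": "Adidas",
    "gender": "Male",
    "features": ["Lockdown fit", "Extra-wide design", "High-density die-cut wedge midsole"],
}

print(check_against_schema(sample, schema))
```

For anything beyond a quick sanity check, a full JSON Schema validator is the better tool.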
The Jsonformer class accepts several configuration parameters:
- debug (bool): Enable debug mode for a detailed view of the generation process
- max_array_length (int): Maximum number of elements in an array
- max_tokens_number (int): Maximum number of tokens for number generation
- max_tokens_string (int): Maximum number of tokens for string generation
- temperature (float): Sampling temperature for generation
- generation_marker (str): Marker for tracking generation position
- max_attempts (int): Maximum attempts for value generation
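As a sketch of how these parameters might be passed together, the snippet below collects them in one dictionary and forwards them as keyword arguments, the same way max_tokens_string is passed in the quick-start example. The values are illustrative, not the library's defaults, and `build_jsonformer` is a hypothetical helper. generation_marker is left out so that the library's own default is used.

```python
# Illustrative values only; these are NOT the library's documented defaults.
JSONFORMER_OPTIONS = {
    "debug": False,            # set to True to trace the generation process
    "max_array_length": 10,    # cap on elements per generated array
    "max_tokens_number": 6,    # token budget for each number
    "max_tokens_string": 30,   # token budget for each string
    "temperature": 0.7,        # sampling temperature
    "max_attempts": 3,         # retries per value
}


def build_jsonformer(model, tokenizer, **overrides):
    """Hypothetical helper: create a Jsonformer with the options above, plus overrides."""
    # Imported lazily so this snippet can be read and tested without the package installed.
    from outformer import Jsonformer

    options = {**JSONFORMER_OPTIONS, **overrides}
    return Jsonformer(model, tokenizer, **options)
```

With a loaded model and tokenizer, `build_jsonformer(model, tokenizer, temperature=0.2)` would then override a single option while keeping the rest.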
The following schema features are supported:
- Basic types: string, number, boolean
- Arrays with min/max items
- Objects with nested properties
- Enums for constrained string values
- Descriptions for better generation context
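A schema that exercises each of the features listed above might look like the following sketch. The order data and field names are made up for illustration. The maxItems keyword is the standard JSON Schema counterpart to minItems and is assumed here to be what "max items" refers to. `types_used` is a small hypothetical helper that reports which types a schema contains.

```python
order_schema = {
    "type": "object",
    "properties": {
        # Nested object with its own properties
        "customer": {
            "type": "object",
            "properties": {
                "name": {"type": "string", "description": "Full name of the customer"},
                "is_member": {"type": "boolean", "description": "Whether the customer has a loyalty account"},
            },
        },
        "total": {"type": "number", "description": "Order total in USD"},
        # Enum constrains the string to one of the listed values
        "status": {"type": "string", "enum": ["pending", "shipped", "delivered"]},
        # Array bounded by minItems and maxItems (maxItems support is an assumption)
        "products": {
            "type": "array",
            "minItems": 1,
            "maxItems": 5,
            "items": {"type": "string", "description": "Name of an ordered product"},
        },
    },
}


def types_used(schema):
    """Collect every 'type' value that appears anywhere in the schema."""
    found = {schema["type"]}
    for sub in schema.get("properties", {}).values():
        found |= types_used(sub)
    if "items" in schema:
        found |= types_used(schema["items"])
    return found


print(sorted(types_used(order_schema)))
# ['array', 'boolean', 'number', 'object', 'string']
```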
We welcome contributions! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
This project is licensed under the MIT License - see the LICENSE file for details.
The idea for this repository was inspired by jsonformer.
Maintainer: Milutin Studen
If you encounter any issues or have questions, please open an issue on our GitHub repository.