Adding optional structured generation to CodeAgent by akseljoonas · Pull Request #1346 · huggingface/smolagents · GitHub

Adding optional structured generation to CodeAgent #1346


Merged
merged 30 commits into from
May 22, 2025

Conversation

akseljoonas
Collaborator

Adds a boolean use_structured_generation flag to CodeAgent, which turns the "Thought: ... Code: ..." response pattern into JSON-structured generation. In internal tests, this has been shown to increase model performance on benchmarks.

If use_structured_generation is set to True, the LLM output will look like this:

{
  "thought": " ... ", 
  "code": " ... "
}

To make this work, I added a modified prompt that presents the few-shot examples in the new JSON format. Note: the LLM is deliberately prompted not to generate strictly valid JSON, since relaxing the formatting constraint improves some models' performance by removing formatting overhead.
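For illustration, a minimal sketch of consuming such a structured output (the variable names are hypothetical, not the PR's actual internals):

```python
import json

# Example structured output in the format described above.
raw_output = '{"thought": "I need to compute 2 + 2.", "code": "result = 2 + 2\\nprint(result)"}'

parsed = json.loads(raw_output)
thought = parsed["thought"]  # the model's reasoning
code = parsed["code"]        # the code block to execute
```

Compared with parsing the free-form "Thought: ... Code: ..." pattern with a regex, a JSON payload makes the two fields unambiguous to extract.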

Member
@albertvillanova albertvillanova left a comment

Thanks for the contribution! I have a few general comments and questions that could help clarify and improve the PR.

First, it would be really helpful to outline which models currently support structured generation and which don't, as well as the exact parameter names they use for this feature. This context is important to determine the appropriate level in the codebase where the new parameter should be implemented, especially if the enhancement aims to be general across models. Additionally, rather than introducing a custom model parameter name like response_format, it might be better to align with existing conventions to maintain consistency and reduce confusion.

And finally, it would be great to see specific tests covering the new structured generation feature. Tests not only verify correctness but also serve as executable documentation, which benefits future maintainers and users.

Comment on lines 181 to 210
grammar (`dict[str, str]`, *optional*): Grammar used to parse the LLM output.
response_format (`dict[str, str]`, *optional*): Response format used to parse the LLM output.
Member

Why do you replace grammar? I would say we want to support both:

  • response_format: for structured data
  • grammar: for more flexible output format that JSON Schema can't describe (like programming languages, structured natural language, etc.)

And even if we really wanted to replace grammar, I would strongly recommend starting a proper deprecation cycle to avoid breaking existing users' code.

Collaborator

  • response_format seems more aligned with the argument used by most providers, but we'd need to document it to justify the change: we need a doc showing the parameters used by the main API providers, as we discussed, @akseljoonas, and as suggested again by @albertvillanova above. This will also be necessary to track support for JSON/regex schemas across providers.
  • Agree as well with the proper deprecation cycle! For this you can use a warning as we do here:
    def logs(self):

@aymeric-roucher
Collaborator

Nice PR, thanks a lot!
Agree with @albertvillanova's comment above, in particular:

  • Why change from grammar to response_format? This can be warranted: we discussed in person that response_format might be more frequently used. But then we need to document this with a list across API providers.
  • Rather than adding this new _json_step method, re-use previous methods, as suggested above by Albert.

Finally, about separating grammar from response_format: since these parameters would do the same thing, whether we rename or not, it's probably better to keep them as a single parameter.

@akseljoonas
Collaborator Author
akseljoonas commented May 19, 2025

Sorry, I forgot to add the document showing the models that support structured generation. You can find it here. The main reason for renaming grammar was that passing it to any of the providers didn't work: in the API calls it had to be response_format anyway.

  • In general, it seems like inference/model providers are taking OpenAI's lead and are moving away from supporting structured regex and grammar generation in favour of JSON only. Right now, VLLM and HF are the only ones supporting grammar+regex afaik.
  • For MultiStepAgent, maybe it makes sense to be able to turn on the use_structured_output and use the default JSON schema for the added benefit, but still be able to provide your own custom schema if more granular control is needed.
    • In that case, we could still support regex and grammars by providing response_format = { "type": "regex", ... } instead of { "type": "json_schema", ... }. However, more mappings would then be needed for VLLM and HFInference.
    • Maybe rename use_structured_output to use_default_structured_output in that case.
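For concreteness, the two payload shapes discussed above might look like this (a sketch following the OpenAI-style convention; field names are assumptions, not the PR's final API):

```python
# JSON-schema structured output (the default case).
json_schema_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "code_agent_output",
        "schema": {
            "type": "object",
            "properties": {
                "thought": {"type": "string"},
                "code": {"type": "string"},
            },
            "required": ["thought", "code"],
        },
    },
}

# Regex-constrained output (only supported by some backends, e.g. VLLM).
regex_format = {
    "type": "regex",
    "value": r"Thought: .+?\nCode:\n```(?:py|python)?\n[\s\S]+?\n```",
}
```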

What do you think? @albertvillanova @aymeric-roucher

@aymeric-roucher
Collaborator

So, let's try to make this API as simple as possible:

  • IMO, if we have an argument grammar or response_format, it should be an argument that lets people specify their grammar/response format.
  • Here, since the vast majority of providers (cf. Aksel's doc in the screenshot below) only support JSON in terms of grammar/response_format/structured generation (for me these terms are synonymous), as opposed to regex or other types, we might as well support only JSON.
[Screenshot: table of structured-generation support across API providers]

In our specific case, the usage of grammar is simplified by a few things:

  • Only used in CodeAgent, since ToolCallingAgent already uses tool-calling APIs from models, which already have structured generation.
  • It will always be the same: one thought block and one action block (a code block in the case of CodeAgent). So there's no real need for customization, and thus no real need IMO to accept a complex object like response_format; this can be handled under the hood.

As a result, I think it's simpler to remove the grammar argument from all agents (with a deprecation warning; anyway, this arg isn't doing anything in most models), and replace it only for CodeAgent with a simple flag like use_grammar or use_structured_outputs, which when passed simply enforces that the model follows our default response_format.
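A minimal sketch of this flag-based approach (the helper and constant names here are hypothetical, for illustration only):

```python
# Default response format enforcing one "thought" field and one "code" field.
CODEAGENT_RESPONSE_FORMAT = {
    "type": "json_schema",
    "json_schema": {
        "name": "thought_and_code",
        "schema": {
            "type": "object",
            "properties": {"thought": {"type": "string"}, "code": {"type": "string"}},
            "required": ["thought", "code"],
        },
    },
}


def build_generation_kwargs(use_structured_outputs: bool) -> dict:
    """Enforce the default response_format only when the flag is set."""
    kwargs: dict = {}
    if use_structured_outputs:
        kwargs["response_format"] = CODEAGENT_RESPONSE_FORMAT
    return kwargs
```

The user-facing surface is then a single boolean; the schema itself stays an internal detail.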

tools_to_call_from: list[Tool] | None = None,
**kwargs,
) -> Generator[ChatMessageStreamDelta]:
generation_kwargs = self._prepare_completion_args(
messages=messages,
stop_sequences=stop_sequences,
grammar=grammar,
response_format=None, # Transformers doesn't support structured generation, use VLLMModel for that
Collaborator

We should avoid this behaviour, since it creates a silent deviation from user expectations, leading to hard-to-debug issues. Instead, it's fine to just raise an error inside the model if the user tries to enforce structured generation with a TransformersModel.
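A sketch of the fail-loudly behavior suggested here (class and method shapes are simplified for illustration, not the library's actual signatures):

```python
class TransformersModelSketch:
    """Illustrative stand-in for TransformersModel: reject structured generation explicitly."""

    def _prepare_completion_args(self, messages, response_format=None, **kwargs):
        if response_format is not None:
            raise ValueError(
                "TransformersModel does not support structured generation. "
                "Use VLLMModel for structured outputs."
            )
        return {"messages": messages, **kwargs}
```

Raising immediately surfaces the unsupported combination at call time instead of letting the request silently run unconstrained.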

@@ -172,6 +172,15 @@ def parse_json_blob(json_blob: str) -> tuple[dict[str, str], str]:
)


def extract_code_from_text(text: str) -> str | None:
Collaborator
@aymeric-roucher aymeric-roucher May 21, 2025

@akseljoonas I unified code blob parsing functions in one!

@aymeric-roucher aymeric-roucher force-pushed the structured-generation branch from 9d59f6b to 446c138 Compare May 21, 2025 09:49
Member
@albertvillanova albertvillanova left a comment

Just a general comment/question before diving into the PR details.

From what I see, this PR introduces an internal setting that primarily affects the agent's handling of "code" and "thought" blocks, rather than enabling broader structured output capabilities.

For instance, it is not intended to support a use case where a user requests something like

“Give me the 10 most cited references about AI agents”

and requests the final answer to conform to a user-defined JSON schema by passing use_structured_output.

Therefore, if this is indeed more of an internal formatting control, I wonder if the parameter name use_structured_output might be a bit misleading: users might reasonably expect it to influence the structure of the actual task output, not just internal representation.

Member
@albertvillanova albertvillanova left a comment

Thanks, I can fix the remaining tests and propose another way for the deprecation if you want.

self.grammar = grammar
if grammar is not None:
warnings.warn(
"The `grammar` argument is deprecated and will have no effect on agent behavior.",
Member

Better to specify the removal version:

Suggested change
"The `grammar` argument is deprecated and will have no effect on agent behavior.",
"Argument 'grammar' is deprecated and will be removed in version 1.20. ",

Member

Done.

if grammar is not None:
warnings.warn(
"The `grammar` argument is deprecated and will have no effect on agent behavior.",
DeprecationWarning,
Member

Also, we would normally suggest using use_structured_output instead; but I'm not sure in this case, because use_structured_output is a CodeAgent-only param and is not exactly equivalent.

if grammar is not None:
warnings.warn(
"The `grammar` argument is deprecated and will have no effect on agent behavior.",
DeprecationWarning,
Member

Better use a FutureWarning:

Suggested change
DeprecationWarning,
FutureWarning,

Member

Done.

@@ -922,7 +935,6 @@ def to_dict(self) -> dict[str, Any]:
"prompt_templates": self.prompt_templates,
"max_steps": self.max_steps,
"verbosity_level": int(self.logger.level),
"grammar": self.grammar,
Member

Just a quick note regarding the deprecation because I think it is not well implemented. During the deprecation cycle, the grammar parameter should maintain its previous behavior to ensure backward compatibility, just with the addition of a deprecation warning to inform users of the upcoming change.

This approach helps avoid unexpected breakages for users relying on the current behavior while giving them time to adapt.

I can take care of this if you want.
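A sketch of a backward-compatible deprecation of the kind described above (illustrative class only; the warning text follows the suggestion earlier in this thread):

```python
import warnings


class AgentSketch:
    """Illustrative only: deprecate `grammar` without breaking existing callers."""

    def __init__(self, grammar=None):
        if grammar is not None:
            warnings.warn(
                "Argument 'grammar' is deprecated and will be removed in version 1.20.",
                FutureWarning,
            )
        # Keep the previous behavior during the deprecation cycle.
        self.grammar = grammar
```

Existing code keeps working unchanged, while every call that passes grammar sees the warning until the removal version.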

@@ -172,6 +172,15 @@ def parse_json_blob(json_blob: str) -> tuple[dict[str, str], str]:
)


def extract_code_from_text(text: str) -> str | None:
"""Extract code from the LLM's output."""
pattern = r"```(?:py|python)?\s*\n(.*?)\n```"
Member

Haven't checked this in detail, but according to the system prompt, there should not be triple backticks, should there?

Collaborator Author

Yes! The code extraction from JSON is broken after the last commits. Communication issue; I'll fix it tomorrow morning.

Collaborator Author

Hey @aymeric-roucher @albertvillanova! Can you check the code extraction from JSON for any potential security issues and logic? :)

@albertvillanova albertvillanova force-pushed the structured-generation branch from 3f54ff8 to 98edc0a Compare May 21, 2025 13:41
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

code_action = json.loads(output_text)["code"]
code_action = extract_code_from_text(code_action) or code_action
try:
code_action, _ = parse_json_blob(output_text)
Collaborator

@akseljoonas I don't like this if/else: the two functions seem to be doing the same thing.
Moreover, this try/except logic does not need to be in agents.py, so let's just integrate all the logic in parse_json_blob.
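A sketch of what such a consolidated helper could look like (illustrative; not the PR's final parse_json_blob):

```python
import json
import re


def parse_code_blob(output_text: str) -> str:
    """Parse the code field from a structured output, falling back to raw text."""
    try:
        code = json.loads(output_text)["code"]
    except (json.JSONDecodeError, KeyError, TypeError):
        code = output_text
    # If the code field itself contains a fenced block, unwrap it.
    match = re.search(r"```(?:py|python)?\s*\n(.*?)\n```", code, re.DOTALL)
    return match.group(1).strip() if match else code
```

One entry point handles both the structured JSON case and the plain fenced-code case, so agents.py needs no branching.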

Collaborator

And more generally, since the issue you're trying to solve is specifically due to the InferenceClient provider Together not really respecting structured outputs for one specific model, I'm not convinced it needs handling in smolagents anyway. Let's just not enable it for now and open a follow-up PR to handle it, where you can document the specific issue, which providers/models it affects, and why it's solved with a complex regex.

`str`: The parsed code.
"""

code_match = re.search(r'"code"\s*:\s*"(.*?)"(?=\s*}\s*$|\s*,\s*")', text, re.DOTALL)
Collaborator
@aymeric-roucher aymeric-roucher May 22, 2025

The lazy (.*?) combined with the lookahead could lead to catastrophic backtracking, so this is a security risk that should be covered with test cases. Let's put this in the next PR, cf. the other comment.

@aymeric-roucher aymeric-roucher force-pushed the structured-generation branch from bedeff2 to ef94545 Compare May 22, 2025 13:00
@aymeric-roucher
Collaborator
aymeric-roucher commented May 22, 2025

Thanks @akseljoonas! Merging this, and let's handle the edge cases for some providers in a subsequent PR (I'll open it).

@albertvillanova I changed the argument to use_structured_outputs_internally to disambiguate it from returning structured outputs.

@aymeric-roucher aymeric-roucher merged commit d60c7be into main May 22, 2025
5 checks passed
4 participants