feat: custom evaluator agent for accuracy evals by manuhortet · Pull Request #3389 · agno-agi/agno · GitHub

feat: custom evaluator agent for accuracy evals #3389


Open · wants to merge 2 commits into main
Conversation

manuhortet (Contributor)

Summary

Updates AccuracyEval to support providing a custom evaluator agent.
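
The shape of the change can be sketched in plain Python. The names AccuracyEval, evaluator_agent, AccuracyAgentResponse, and _using_custom_response come from the diff in this PR; the stand-in classes and the default-fallback logic below are assumptions for illustration, not agno's actual implementation:

```python
# Illustrative sketch only: models the PR's idea of an optional custom
# evaluator agent. Class names mirror the diff; everything else is a
# stand-in, not agno code.
from dataclasses import dataclass
from typing import Optional


class AccuracyAgentResponse:
    """Stand-in for agno's default structured response model class."""


@dataclass
class FakeAgent:
    """Minimal stand-in for agno.agent.Agent."""
    response_model: type = AccuracyAgentResponse


@dataclass
class AccuracyEvalSketch:
    # The new field introduced by this PR: an optional caller-supplied agent.
    evaluator_agent: Optional[FakeAgent] = None

    def get_evaluator_agent(self) -> FakeAgent:
        # Assumed fallback: use the caller's agent when given, else a default.
        if self.evaluator_agent is not None:
            return self.evaluator_agent
        return FakeAgent()

    def _using_custom_response(self) -> bool:
        # Mirrors the helper in the diff: any response model class other
        # than the default AccuracyAgentResponse counts as "custom".
        if not self.evaluator_agent:
            return False
        return self.evaluator_agent.response_model is not AccuracyAgentResponse


class MyResponse:
    """Stand-in for a user-defined response model."""


default_eval = AccuracyEvalSketch()
custom_eval = AccuracyEvalSketch(
    evaluator_agent=FakeAgent(response_model=MyResponse)
)
print(default_eval._using_custom_response())  # False
print(custom_eval._using_custom_response())   # True
```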

Type of change

  • [ ] Bug fix
  • [x] New feature
  • [ ] Breaking change
  • [ ] Improvement
  • [ ] Model update
  • [ ] Other:

Checklist

  • Code complies with style guidelines
  • Ran format/validation scripts (./scripts/format.sh and ./scripts/validate.sh)
  • Self-review completed
  • Documentation updated (comments, docstrings)
  • Examples and guides: Relevant cookbook examples have been included or updated (if applicable)
  • Tested in clean environment
  • Tests added/updated (if applicable)

@manuhortet manuhortet requested a review from a team as a code owner May 28, 2025 09:24
    # This is the agent we will use to perform the evaluation
    evaluator_agent = Agent(
        model=OpenAIChat(id="o4-mini"),
        tools=[CalculatorTools(enable_all=True)],

Contributor comment: does it need calculator tools?

        expected_output="$1,739,130.43",
    )

    result: Optional[AccuracyResult] = evaluation.run(print_results=True)

Contributor comment: How do you now access that custom response?

    """Check if the evaluator agent is using a custom response model"""
    if not self.evaluator_agent:
        return False
    return self.evaluator_agent.response_model is not AccuracyAgentResponse

Contributor comment: shouldn't this use isinstance?

Contributor comment: Or actually a type comparison? I'm not sure, just asking.
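
The review question can be settled with a quick pure-Python check: response_model holds a class object, not an instance, so an identity comparison with `is not` is the natural fit, while `isinstance` answers a different question. The class names below are stand-ins, not agno's real code:

```python
# Sketch of the is/isinstance/issubclass distinction behind the review thread.

class AccuracyAgentResponse:        # stand-in for the default model class
    pass

class CustomResponse:               # stand-in for a user-supplied class
    pass

class SubclassedDefault(AccuracyAgentResponse):
    pass

# Identity comparison: exact-class check, as written in the diff.
print(CustomResponse is not AccuracyAgentResponse)         # True  -> "custom"
print(AccuracyAgentResponse is not AccuracyAgentResponse)  # False -> "default"

# isinstance compares an *instance* against a class; passing the class
# object itself asks whether that class object is an instance of the
# default, which it is not:
print(isinstance(CustomResponse, AccuracyAgentResponse))   # False

# If subclasses of the default should still count as "default", the
# right tool would be issubclass, not isinstance:
print(issubclass(SubclassedDefault, AccuracyAgentResponse))  # True
print(SubclassedDefault is not AccuracyAgentResponse)        # True (identity treats it as custom)
```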

@@ -419,7 +455,11 @@ def run_with_output(
    )

    if result is not None:
        self.result.results.append(result)
        if self._using_custom_response():
            print(f"Evaluator Agent response: {result}")

Contributor comment: log?
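
The "log?" suggestion would swap the bare `print` for the stdlib logging module. A minimal sketch of what that could look like; the logger name, function name, and message format here are placeholders, not agno's conventions:

```python
import logging

# Placeholder logger name; a real change would follow the library's convention.
logger = logging.getLogger("agno.eval.accuracy")


def report_custom_response(result: object, using_custom_response: bool) -> None:
    """Log the raw evaluator response instead of printing it."""
    if using_custom_response:
        # Lazy %s formatting defers str(result) until the record is emitted.
        logger.debug("Evaluator Agent response: %s", result)


logging.basicConfig(level=logging.DEBUG)
report_custom_response({"score": 9}, using_custom_response=True)
```

This keeps raw model output out of normal eval output unless the caller opts into debug-level logging.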

Labels: None yet
Projects: None yet
2 participants