feat: tool calling custom interfaces tasks extension #636
base: development
9326585 to a70250c
8fa85fd to f84f5b6
using const variables for defining
add missing licenses
89293c0 to d996e02
adjusted prompts
added default validators for some tasks
@boczekbartek I didn't change any logic in these last commits, but I moved a lot of code, so please let me know if this is reasonably clear and makes sense now, because I'm a little dizzy from all these changes, haha. These commits also include some changes to the basic tasks: I removed or moved some redundant code that I missed in the last PR.
Purpose
Add new custom interfaces tasks to the tool calling benchmark, and test and adjust the task prompts and the system prompt.
For now, custom interfaces tasks revolve around checking the interface of a given topic/service and then publishing to it or calling it once. Also, the majority of them aren't predefined and ready to import.
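To illustrate the kind of check such a task implies, here is a minimal sketch, not the code from this PR. The `ToolCall` record, the tool names `get_interface` and `publish_message`, and the argument keys `msg_type` and `topic` are all hypothetical names chosen for this example. It passes only if the agent looked up the interface first and then published exactly once.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

# Hypothetical tool names, used only for this illustration.
GET_INTERFACE_TOOL = "get_interface"
PUBLISH_TOOL = "publish_message"


@dataclass
class ToolCall:
    """One tool invocation made by the agent (hypothetical record type)."""

    name: str
    args: Dict[str, Any] = field(default_factory=dict)


def checked_interface_then_published_once(
    calls: List[ToolCall], topic: str, msg_type: str
) -> bool:
    """Return True if the interface was queried before a single publish."""
    interface_idx = [
        i
        for i, call in enumerate(calls)
        if call.name == GET_INTERFACE_TOOL and call.args.get("msg_type") == msg_type
    ]
    publish_idx = [
        i
        for i, call in enumerate(calls)
        if call.name == PUBLISH_TOOL and call.args.get("topic") == topic
    ]
    # The interface must be checked at least once, the publish must happen
    # exactly once, and the first interface check must come before it.
    return (
        bool(interface_idx)
        and len(publish_idx) == 1
        and interface_idx[0] < publish_idx[0]
    )
```

The same shape would apply to a service: swap the publish tool for a call tool and match on the service name instead of the topic.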
Proposed Changes
Changed prompts in existing tasks
Added 3 hard tasks that require calling multiple services/topics
Total of 12 different tasks
Predefined a total of 18 tasks
Added tests for predefined tasks
Refactored timeout to depend on number of required calls
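The timeout change in the last item could look roughly like the sketch below. This is an assumption about the general idea only: the function name `task_timeout` and both constants are hypothetical and are not taken from the PR.

```python
# Hypothetical values, chosen only for illustration.
BASE_TIMEOUT_S = 10.0      # fixed overhead for any task
PER_CALL_TIMEOUT_S = 5.0   # extra budget for each required tool call


def task_timeout(required_calls: int) -> float:
    """Return a timeout in seconds that grows linearly with required calls."""
    if required_calls < 0:
        raise ValueError("required_calls must be non-negative")
    return BASE_TIMEOUT_S + PER_CALL_TIMEOUT_S * required_calls
```

With these assumed constants, a hard task that needs three calls would get a larger budget than a single-call task, so multi-step tasks are not cut off by a limit tuned for simple ones.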
Issues
Testing
rai_bench/tool_calling_agent/tasks/basic.py
if they make sense to you
rai_bench/tool_calling_agent/predefined/tasks_tasks.py
if they make sense to you