GitHub - giganticode/throwbench: ThrowBench Code LLM benchmark

giganticode/throwbench


ThrowBench

This repository contains the data for ThrowBench. Data is given in JSONL format. Records have the following fields:

  • bug_id: The RunBugRun bug id
  • code: Program code
  • exception_type: The name of the exception thrown, or no_exception if none thrown. This is the target label.
  • exception_message: The full message of the exception thrown
  • language: The language, either c_sharp, java, python or ruby
  • input: The program input that triggers the exception
  • inputs: Other triggering inputs (input was randomly selected from this list)
  • locs: Program length (in lines of code)
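As a minimal sketch of how records with these fields could be read, the snippet below parses a JSONL file (one JSON object per line) and counts the target labels per language. The helper names `load_records` and `label_distribution` and the file name in the usage comment are illustrative assumptions, not part of this repository.

```python
import json
from collections import Counter
from pathlib import Path


def load_records(path):
    """Read a JSONL file: one JSON object per non-empty line."""
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


def label_distribution(records):
    """Count records per (language, exception_type) pair."""
    return Counter((r["language"], r["exception_type"]) for r in records)


# Usage (the file name is a placeholder, not a path taken from this repository):
# records = load_records("throwbench.jsonl")
# for (language, label), count in label_distribution(records).most_common(10):
#     print(language, label, count)
```

Since exception_type is the target label, programs that run cleanly appear under no_exception in the same counter.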

Model Outputs

Model outputs can be found in the results directory. Files have the following fields:

  • model: Model name
  • predicted_exception: Answer given by model
  • actual_exception: Ground-truth answer
  • bug_id: See above
  • output: Full model output
  • language: See above
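The fields above are enough to score a model. The sketch below computes accuracy per model and language, assuming the result rows have already been parsed into dictionaries and that a prediction counts as correct only when predicted_exception equals actual_exception exactly. Both the function name and the exact-match rule are assumptions; the scoring in run.py may differ.

```python
from collections import defaultdict


def accuracy_by_model_and_language(rows):
    """Return {(model, language): fraction of exact-match predictions}."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for row in rows:
        key = (row["model"], row["language"])
        totals[key] += 1
        # Assumed scoring rule: exact string match against the ground truth.
        if row["predicted_exception"] == row["actual_exception"]:
            correct[key] += 1
    return {key: correct[key] / totals[key] for key in totals}
```

Grouping by language as well as model keeps per-language differences visible instead of averaging them away.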

Scripts

The evaluation script can be found in run.py.
