Description
Hi,
I'm using MultiPL-E to evaluate models on different languages (latest version, and it passed make test). But I found that the scores on Java and TypeScript are not quite aligned with the trend on the other languages. I checked some outputs and noticed that a lot of the errors are SyntaxError, even though the code is logically correct.
(1) Is it possible that something is wrong with the env? For example, import org.javatuples.*; is very suspicious, since the code often runs well after removing this import.
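(2) Besides status, can we also log the complete error output from the compiler/executor, so we have a better idea of what errors happened? In the examples below, stdout and stderr are both empty, so the only information is the SyntaxError status.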
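For (2), something along these lines is what I have in mind. This is only a minimal sketch: the class name RunWithDiagnostics, the Result holder, and the choice of -1 as the exit code on timeout are all made up for illustration and are not taken from MultiPL-E's code. It runs a command, waits with a timeout, and keeps the full stdout and stderr text instead of discarding it.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.TimeUnit;

class RunWithDiagnostics {
    /** Hypothetical holder that mirrors the fields of a result record. */
    static final class Result {
        final int exitCode;
        final String stdout;
        final String stderr;
        final boolean timedOut;

        Result(int exitCode, String stdout, String stderr, boolean timedOut) {
            this.exitCode = exitCode;
            this.stdout = stdout;
            this.stderr = stderr;
            this.timedOut = timedOut;
        }
    }

    /** Runs the command and returns everything it printed, not just a status. */
    static Result run(List<String> command, long timeoutSeconds)
            throws IOException, InterruptedException {
        // Send both streams to temp files so a large compiler log
        // cannot fill a pipe buffer and block the child process.
        Path out = Files.createTempFile("run", ".out");
        Path err = Files.createTempFile("run", ".err");
        try {
            Process p = new ProcessBuilder(command)
                    .redirectOutput(out.toFile())
                    .redirectError(err.toFile())
                    .start();
            boolean finished = p.waitFor(timeoutSeconds, TimeUnit.SECONDS);
            if (!finished) {
                p.destroyForcibly();
                p.waitFor();
            }
            // -1 on timeout is an arbitrary choice made for this sketch.
            int exitCode = finished ? p.exitValue() : -1;
            String stdout = new String(Files.readAllBytes(out), StandardCharsets.UTF_8);
            String stderr = new String(Files.readAllBytes(err), StandardCharsets.UTF_8);
            return new Result(exitCode, stdout, stderr, !finished);
        } finally {
            Files.deleteIfExists(out);
            Files.deleteIfExists(err);
        }
    }

    public static void main(String[] args) throws Exception {
        // Demo command: the JVM that is running this program, asked for its version.
        String java = System.getProperty("java.home") + "/bin/java";
        Result r = run(Arrays.asList(java, "-version"), 30);
        System.out.println("exit_code: " + r.exitCode);
        System.out.println("stdout: " + r.stdout);
        System.out.println("stderr: " + r.stderr);
    }
}
```

Replacing the demo command with the actual compile command for a failing program should show the compiler's own message, which would tell us whether these are real syntax errors or an environment problem.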
Thank you!
Rui
Example of Java (HumanEval_28_concatenate):
{
"program": "import java.util.*;\nimport java.lang.reflect.*;\nimport org.javatuples.*;\nimport java.security.*;\nimport java.math.*;\nimport java.io.*;\nimport java.util.stream.*;\nclass Problem {\n // Concatenate array list of strings into a single string\n // >>> concatenate((new ArrayList<String>(Arrays.asList())))\n // (\"\")\n // >>> concatenate((new ArrayList<String>(Arrays.asList((String)\"a\", (String)\"b\", (String)\"c\"))))\n // (\"abc\")\n public static String concatenate(ArrayList<String> strings) {\n StringBuilder sb = new StringBuilder();\n for (String s : strings) {\n sb.append(s);\n }\n return sb.toString();\n }\n public static void main(String[] args) {\n assert(concatenate((new ArrayList<String>(Arrays.asList()))).equals((\"\")));\n assert(concatenate((new ArrayList<String>(Arrays.asList((String)\"x\", (String)\"y\", (String)\"z\")))).equals((\"xyz\")));\n assert(concatenate((new ArrayList<String>(Arrays.asList((String)\"x\", (String)\"y\", (String)\"z\", (String)\"w\", (String)\"k\")))).equals((\"xyzwk\")));\n }\n\n}\n",
"timestamp": 1690088869,
"stdout": "",
"stderr": "",
"exit_code": -1,
"status": "SyntaxError"
},
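To test the guess in (1) for a record like the one above, a small probe could check whether the javatuples classes are visible at all. This is a sketch under assumptions: ClasspathProbe and isLoadable are hypothetical names, and it must be run with the same classpath that the evaluation uses for Java programs. org.javatuples.Pair is one of the classes shipped in the javatuples jar. If the jar is missing at compile time, javac rejects the wildcard import because the package does not exist, which would explain why removing the import makes the program run.

```java
class ClasspathProbe {
    /** Returns true if the named class can be found on the current classpath. */
    static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers.
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String name = "org.javatuples.Pair";
        System.out.println(name + " loadable: " + isLoadable(name));
    }
}
```

If this reports false inside the evaluation container, the SyntaxError status on Java is more likely a missing dependency than a problem with the generated code.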
Example of TS (HumanEval_0_has_close_elements):
{
"program": "//Check if in given array of numbers, are any two numbers closer to each other than\n// given threshold.\n// >>> has_close_elements([1.0, 2.0, 3.0], 0.5)\n// false\n// >>> has_close_elements([1.0, 2.8, 3.0, 4.0, 5.0, 2.0], 0.3)\n// true\nfunction has_close_elements(numbers: number[], threshold: number): boolean {\n for (let i = 0; i < numbers.length - 1; i++) {\n for (let j = i + 1; j < numbers.length; j++) {\n if (Math.abs(numbers[i] - numbers[j]) < threshold) {\n return true;\n }\n }\n }\n return false;\n}\n\ndeclare var require: any;\nconst assert = require('node:assert');\n\n\nfunction test() {\n let candidate = has_close_elements;\n assert.deepEqual(candidate([1.0, 2.0, 3.9, 4.0, 5.0, 2.2], 0.3),true);\n assert.deepEqual(candidate([1.0, 2.0, 3.9, 4.0, 5.0, 2.2], 0.05),false);\n assert.deepEqual(candidate([1.0, 2.0, 5.9, 4.0, 5.0], 0.95),true);\n assert.deepEqual(candidate([1.0, 2.0, 5.9, 4.0, 5.0], 0.8),false);\n assert.deepEqual(candidate([1.0, 2.0, 3.0, 4.0, 5.0, 2.0], 0.1),true);\n assert.deepEqual(candidate([1.1, 2.2, 3.1, 4.1, 5.1], 1.0),true);\n assert.deepEqual(candidate([1.1, 2.2, 3.1, 4.1, 5.1], 0.5),false);\n}\n\ntest();",
"timestamp": 1690088212,
"stdout": "",
"stderr": "",
"exit_code": -1,
"status": "SyntaxError"
},