| Model | AGIEval | GPT4All | TruthfulQA | Bigbench |
|---|---|---|---|---|
| recoilme-gemma-2-9B-v0.3 | 40.71 | Error: File does not exist | 58.62 | Error: File does not exist |

| Task | Version | Metric | Value | | Stderr |
|---|---|---|---|---|---|
| agieval_aqua_rat | 0 | acc | 22.44 | ± | 2.62 |
| | | acc_norm | 24.41 | ± | 2.70 |
| agieval_logiqa_en | 0 | acc | 36.56 | ± | 1.89 |
| | | acc_norm | 37.33 | ± | 1.90 |
| agieval_lsat_ar | 0 | acc | 22.17 | ± | 2.75 |
| | | acc_norm | 21.74 | ± | 2.73 |
| agieval_lsat_lr | 0 | acc | 44.90 | ± | 2.20 |
| | | acc_norm | 41.96 | ± | 2.19 |
| agieval_lsat_rc | 0 | acc | 65.43 | ± | 2.91 |
| | | acc_norm | 61.34 | ± | 2.97 |
| agieval_sat_en | 0 | acc | 77.67 | ± | 2.91 |
| | | acc_norm | 76.70 | ± | 2.95 |
| agieval_sat_en_without_passage | 0 | acc | 30.10 | ± | 3.20 |
| | | acc_norm | 27.67 | ± | 3.12 |
| agieval_sat_math | 0 | acc | 35.91 | ± | 3.24 |
| | | acc_norm | 34.55 | ± | 3.21 |

AGIEval average: 40.71%

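
The 40.71% figure matches, to two decimals, the unweighted mean of the eight `acc_norm` values in the table above. That the evaluation tool averages `acc_norm` rather than `acc` is an assumption inferred from this arithmetic, not something the report states. A minimal sketch to check it, with the values copied from the table and a hypothetical dictionary name:

```python
# acc_norm values copied from the AGIEval table, in row order.
# Assumption: the reported average is the plain mean of acc_norm per task.
agieval_acc_norm = {
    "agieval_aqua_rat": 24.41,
    "agieval_logiqa_en": 37.33,
    "agieval_lsat_ar": 21.74,
    "agieval_lsat_lr": 41.96,
    "agieval_lsat_rc": 61.34,
    "agieval_sat_en": 76.70,
    "agieval_sat_en_without_passage": 27.67,
    "agieval_sat_math": 34.55,
}

# Unweighted mean across the eight tasks.
average = sum(agieval_acc_norm.values()) / len(agieval_acc_norm)
print(f"AGIEval average: {average:.2f}%")  # prints "AGIEval average: 40.71%"
```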
GPT4All average: not available (Error: File does not exist)

| Task | Version | Metric | Value | | Stderr |
|---|---|---|---|---|---|
| truthfulqa_mc | 1 | mc1 | 40.02 | ± | 1.72 |
| | | mc2 | 58.62 | ± | 1.51 |

TruthfulQA average: 58.62%

Bigbench average: not available (Error: File does not exist)

Average score: Not available due to errors

Elapsed time: 03:33:24