Model name | Ja avg | JComQA | JEMHQA | NIILC | JSQuAD | XL-Sum | MGSM | En-Ja | Ja-En | JMMLU | JHumanEval
---|---|---|---|---|---|---|---|---|---|---|---
Falcon3-1B-Base | 0.129 | 0.216 | 0.251 | 0.062 | 0.281 | 0.085 | 0.008 | 0.012 | 0.020 | 0.264 | 0.088
Falcon3-3B-Base | 0.209 | 0.281 | 0.333 | 0.113 | 0.517 | 0.120 | 0.096 | 0.031 | 0.051 | 0.319 | 0.229
Gemma 2 2B | 0.348 | 0.721 | 0.472 | 0.316 | 0.810 | 0.083 | 0.124 | 0.203 | 0.190 | 0.388 | 0.177
Gemma 2 Baku 2B | 0.372 | 0.760 | 0.475 | 0.443 | 0.843 | 0.121 | 0.124 | 0.255 | 0.187 | 0.376 | 0.137
Llama 3.2 1B | 0.201 | 0.208 | 0.404 | 0.188 | 0.525 | 0.081 | 0.024 | 0.079 | 0.092 | 0.260 | 0.150
Llama 3.2 3B | 0.337 | 0.605 | 0.443 | 0.324 | 0.816 | 0.129 | 0.136 | 0.161 | 0.167 | 0.352 | 0.235
llm-jp-3-1.8b | 0.251 | 0.209 | 0.463 | 0.449 | 0.703 | 0.100 | 0.012 | 0.198 | 0.134 | 0.242 | 0.001
llm-jp-3-3.7b | 0.281 | 0.203 | 0.431 | 0.541 | 0.804 | 0.142 | 0.060 | 0.223 | 0.159 | 0.249 | 0.000
PLaMo 2 1B | 0.250 | 0.203 | 0.463 | 0.434 | 0.626 | 0.055 | 0.052 | 0.236 | 0.119 | 0.256 | 0.057
Qwen2.5-0.5B | 0.234 | 0.369 | 0.389 | 0.139 | 0.635 | 0.101 | 0.076 | 0.058 | 0.064 | 0.304 | 0.203
Qwen2.5-1.5B | 0.372 | 0.800 | 0.383 | 0.241 | 0.849 | 0.143 | 0.292 | 0.132 | 0.134 | 0.438 | 0.308
Qwen2.5-3B | 0.442 | 0.847 | 0.475 | 0.306 | 0.878 | 0.176 | 0.460 | 0.180 | 0.167 | 0.529 | 0.404
TinySwallow-1.5B | 0.402 | 0.840 | 0.437 | 0.474 | 0.839 | 0.173 | 0.256 | 0.201 | 0.125 | 0.446 | 0.231
Model name | En avg | OpenBookQA | TriviaQA | HellaSwag | SQuAD2 | XWINO | MMLU | GSM8K | MATH | BBH | HumanEval
---|---|---|---|---|---|---|---|---|---|---|---
Falcon3-1B-Base | 0.376 | 0.316 | 0.296 | 0.458 | 0.501 | 0.816 | 0.449 | 0.337 | 0.140 | 0.323 | 0.125
Falcon3-3B-Base | 0.495 | 0.312 | 0.346 | 0.492 | 0.503 | 0.847 | 0.567 | 0.634 | 0.344 | 0.553 | 0.348
Gemma 2 2B | 0.439 | 0.342 | 0.552 | 0.552 | 0.501 | 0.890 | 0.530 | 0.249 | 0.176 | 0.415 | 0.188
Gemma 2 Baku 2B | 0.400 | 0.314 | 0.475 | 0.533 | 0.501 | 0.881 | 0.493 | 0.168 | 0.110 | 0.376 | 0.150
Llama 3.2 1B | 0.339 | 0.300 | 0.388 | 0.477 | 0.501 | 0.849 | 0.313 | 0.049 | 0.020 | 0.303 | 0.193
Llama 3.2 3B | 0.450 | 0.326 | 0.586 | 0.558 | 0.502 | 0.888 | 0.558 | 0.262 | 0.070 | 0.466 | 0.285
llm-jp-3-1.8b | 0.293 | 0.244 | 0.301 | 0.462 | 0.501 | 0.851 | 0.248 | 0.017 | 0.018 | 0.276 | 0.008
llm-jp-3-3.7b | 0.324 | 0.280 | 0.421 | 0.506 | 0.502 | 0.876 | 0.253 | 0.055 | 0.016 | 0.309 | 0.019
PLaMo 2 1B | 0.274 | 0.280 | 0.129 | 0.425 | 0.501 | 0.807 | 0.294 | 0.072 | 0.034 | 0.122 | 0.080
Qwen2.5-0.5B | 0.365 | 0.266 | 0.190 | 0.399 | 0.501 | 0.768 | 0.479 | 0.341 | 0.148 | 0.277 | 0.277
Qwen2.5-1.5B | 0.490 | 0.342 | 0.397 | 0.499 | 0.506 | 0.851 | 0.610 | 0.611 | 0.314 | 0.413 | 0.356
Qwen2.5-3B | 0.534 | 0.360 | 0.504 | 0.553 | 0.541 | 0.872 | 0.657 | 0.580 | 0.440 | 0.442 | 0.387
TinySwallow-1.5B | 0.413 | 0.308 | 0.332 | 0.468 | 0.501 | 0.850 | 0.546 | 0.379 | 0.162 | 0.328 | 0.254
Model name | JMT avg | Code | Ext | Human | Math | Reason | Role | STEM | Write
---|---|---|---|---|---|---|---|---|---
Falcon3-1B-Base | | | | | | | | |
Falcon3-3B-Base | | | | | | | | |
Gemma 2 2B | | | | | | | | |
Gemma 2 Baku 2B | | | | | | | | |
Llama 3.2 1B | | | | | | | | |
Llama 3.2 3B | | | | | | | | |
llm-jp-3-1.8b | | | | | | | | |
llm-jp-3-3.7b | | | | | | | | |
PLaMo 2 1B | | | | | | | | |
Qwen2.5-0.5B | | | | | | | | |
Qwen2.5-1.5B | | | | | | | | |
Qwen2.5-3B | | | | | | | | |
TinySwallow-1.5B | | | | | | | | |