| Model name | Ja avg | JComQA | JEMHQA | NIILC | JSQuAD | XL-Sum | MGSM | En-Ja | Ja-En | JMMLU | JHumanEval |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Aya Expanse 8B | 0.445 | 0.922 | 0.467 | 0.385 | 0.867 | 0.211 | 0.608 | 0.261 | 0.206 | 0.521 | 0.001 |
| Falcon3-7B-Instruct | 0.363 | 0.684 | 0.436 | 0.152 | 0.816 | 0.177 | 0.320 | 0.094 | 0.126 | 0.415 | 0.416 |
| Falcon3-10B-Instruct | 0.367 | 0.690 | 0.221 | 0.122 | 0.853 | 0.192 | 0.392 | 0.108 | 0.135 | 0.442 | 0.515 |
| Gemma 2 9B IT | 0.534 | 0.931 | 0.532 | 0.526 | 0.876 | 0.149 | 0.636 | 0.273 | 0.239 | 0.623 | 0.559 |
| Llama 3 8B Instruct | 0.430 | 0.880 | 0.417 | 0.385 | 0.891 | 0.126 | 0.424 | 0.214 | 0.202 | 0.468 | 0.296 |
| Llama-3-ELYZA-JP-8B | 0.471 | 0.897 | 0.498 | 0.496 | 0.906 | 0.169 | 0.436 | 0.250 | 0.185 | 0.487 | 0.388 |
| Llama 3 heron brain 8B v0.3 | 0.488 | 0.923 | 0.493 | 0.569 | 0.906 | 0.218 | 0.456 | 0.277 | 0.217 | 0.499 | 0.318 |
| Llama 3 Swallow 8B Instruct | 0.480 | 0.911 | 0.496 | 0.517 | 0.905 | 0.128 | 0.492 | 0.253 | 0.227 | 0.481 | 0.394 |
| Llama 3 Youko 8B Instruct | 0.468 | 0.920 | 0.481 | 0.517 | 0.899 | 0.209 | 0.472 | 0.256 | 0.191 | 0.469 | 0.262 |
| Llama 3.1 8B Instruct | 0.470 | 0.880 | 0.447 | 0.407 | 0.886 | 0.148 | 0.516 | 0.218 | 0.200 | 0.509 | 0.488 |
| Llama 3.1 Swallow 8B Instruct v0.1 | 0.505 | 0.924 | 0.587 | 0.574 | 0.917 | 0.138 | 0.508 | 0.282 | 0.228 | 0.530 | 0.366 |
| Llama 3.1 Swallow 8B Instruct v0.2 | 0.514 | 0.929 | 0.560 | 0.599 | 0.915 | 0.137 | 0.528 | 0.288 | 0.227 | 0.550 | 0.408 |
| Llama 3.1 Swallow 8B Instruct v0.3 | 0.510 | 0.924 | 0.528 | 0.583 | 0.896 | 0.191 | 0.532 | 0.281 | 0.229 | 0.544 | 0.394 |
| llm-jp-3-13b-instruct | 0.436 | 0.894 | 0.339 | 0.638 | 0.901 | 0.151 | 0.324 | 0.252 | 0.203 | 0.468 | 0.188 |
| Mistral-NeMo-Instruct-2407 (12B) | 0.500 | 0.927 | 0.497 | 0.484 | 0.905 | 0.176 | 0.552 | 0.240 | 0.205 | 0.548 | 0.469 |
| Mistral-NeMo-Minitron 8B Instruct | 0.478 | 0.892 | 0.498 | 0.380 | 0.578 | 0.000 | 0.556 | 0.199 | 0.193 | 0.510 | 0.496 |
| Mistral-7B-Instruct-v0.3 | 0.378 | 0.754 | 0.447 | 0.268 | 0.870 | 0.205 | 0.224 | 0.163 | 0.177 | 0.403 | 0.267 |
| Phi-4 | 0.580 | 0.945 | 0.608 | 0.507 | 0.923 | 0.219 | 0.796 | 0.283 | 0.231 | 0.689 | 0.598 |
| Qwen2-7B-Instruct | 0.478 | 0.888 | 0.390 | 0.379 | 0.897 | 0.127 | 0.576 | 0.206 | 0.190 | 0.571 | 0.555 |
| Qwen2.5-7B-Instruct | 0.498 | 0.915 | 0.429 | 0.391 | 0.891 | 0.168 | 0.632 | 0.210 | 0.192 | 0.623 | 0.532 |
| Qwen2.5-14B-Instruct | 0.553 | 0.953 | 0.588 | 0.519 | 0.902 | 0.140 | 0.680 | 0.193 | 0.160 | 0.708 | 0.691 |
| Swallow-MS-7b-instruct-v0.1 | 0.394 | 0.758 | 0.490 | 0.446 | 0.864 | 0.158 | 0.172 | 0.227 | 0.187 | 0.419 | 0.215 |
| Swallow-7b-instruct-v0.1 | 0.353 | 0.599 | 0.491 | 0.531 | 0.837 | 0.153 | 0.128 | 0.228 | 0.179 | 0.352 | 0.027 |
| Tanuki-8B-dpo-v1.0 | 0.311 | 0.278 | 0.284 | 0.370 | 0.670 | 0.102 | 0.428 | 0.238 | 0.183 | 0.306 | 0.251 |

| Model name | En avg | OpenBookQA | TriviaQA | HellaSwag | SQuAD2 | XWINO | MMLU | GSM8K | MATH | BBH | HumanEval |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Aya Expanse 8B | 0.539 | 0.384 | 0.591 | 0.605 | 0.664 | 0.892 | 0.628 | 0.756 | 0.284 | 0.590 | 0.000 |
| Falcon3-7B-Instruct | 0.618 | 0.394 | 0.517 | 0.611 | 0.525 | 0.855 | 0.705 | 0.773 | 0.542 | 0.711 | 0.551 |
| Falcon3-10B-Instruct | 0.633 | 0.424 | 0.503 | 0.640 | 0.549 | 0.875 | 0.730 | 0.793 | 0.462 | 0.729 | 0.627 |
| Gemma 2 9B IT | 0.649 | 0.432 | 0.658 | 0.605 | 0.659 | 0.904 | 0.723 | 0.779 | 0.394 | 0.719 | 0.613 |
| Llama 3 8B Instruct | 0.605 | 0.388 | 0.670 | 0.583 | 0.611 | 0.892 | 0.657 | 0.745 | 0.306 | 0.646 | 0.554 |
| Llama-3-ELYZA-JP-8B | 0.495 | 0.318 | 0.551 | 0.523 | 0.600 | 0.882 | 0.587 | 0.558 | 0.164 | 0.321 | 0.449 |
| Llama 3 heron brain 8B v0.3 | 0.551 | 0.362 | 0.656 | 0.569 | 0.581 | 0.901 | 0.622 | 0.578 | 0.222 | 0.641 | 0.381 |
| Llama 3 Swallow 8B Instruct | 0.560 | 0.370 | 0.655 | 0.585 | 0.567 | 0.899 | 0.633 | 0.592 | 0.244 | 0.639 | 0.419 |
| Llama 3 Youko 8B Instruct | 0.507 | 0.406 | 0.613 | 0.599 | 0.559 | 0.897 | 0.597 | 0.562 | 0.152 | 0.402 | 0.287 |
| Llama 3.1 8B Instruct | 0.627 | 0.366 | 0.699 | 0.592 | 0.600 | 0.904 | 0.680 | 0.743 | 0.376 | 0.690 | 0.624 |
| Llama 3.1 Swallow 8B Instruct v0.1 | 0.563 | 0.388 | 0.649 | 0.615 | 0.598 | 0.891 | 0.624 | 0.605 | 0.236 | 0.642 | 0.379 |
| Llama 3.1 Swallow 8B Instruct v0.2 | 0.574 | 0.380 | 0.625 | 0.603 | 0.607 | 0.887 | 0.634 | 0.620 | 0.264 | 0.649 | 0.474 |
| Llama 3.1 Swallow 8B Instruct v0.3 | 0.566 | 0.396 | 0.629 | 0.593 | 0.570 | 0.884 | 0.629 | 0.622 | 0.266 | 0.626 | 0.445 |
| llm-jp-3-13b-instruct | 0.432 | 0.342 | 0.534 | 0.594 | 0.516 | 0.892 | 0.506 | 0.243 | 0.046 | 0.438 | 0.205 |
| Mistral-NeMo-Instruct-2407 (12B) | 0.608 | 0.406 | 0.726 | 0.645 | 0.606 | 0.911 | 0.683 | 0.721 | 0.274 | 0.537 | 0.571 |
| Mistral-NeMo-Minitron 8B Instruct | 0.634 | 0.452 | 0.719 | 0.639 | 0.624 | 0.909 | 0.701 | 0.754 | 0.274 | 0.663 | 0.601 |
| Mistral-7B-Instruct-v0.3 | 0.541 | 0.408 | 0.677 | 0.652 | 0.576 | 0.905 | 0.621 | 0.500 | 0.160 | 0.563 | 0.346 |
| Phi-4 | 0.677 | 0.378 | 0.682 | 0.647 | 0.646 | 0.903 | 0.802 | 0.899 | 0.556 | 0.654 | 0.601 |
| Qwen2-7B-Instruct | 0.582 | 0.396 | 0.547 | 0.615 | 0.593 | 0.886 | 0.707 | 0.626 | 0.504 | 0.304 | 0.643 |
| Qwen2.5-7B-Instruct | 0.604 | 0.428 | 0.519 | 0.624 | 0.569 | 0.877 | 0.742 | 0.739 | 0.688 | 0.217 | 0.636 |
| Qwen2.5-14B-Instruct | 0.614 | 0.438 | 0.592 | 0.656 | 0.680 | 0.890 | 0.800 | 0.761 | 0.666 | 0.029 | 0.632 |
| Swallow-MS-7b-instruct-v0.1 | 0.436 | 0.360 | 0.500 | 0.587 | 0.510 | 0.886 | 0.526 | 0.215 | 0.082 | 0.441 | 0.256 |
| Swallow-7b-instruct-v0.1 | 0.376 | 0.330 | 0.481 | 0.550 | 0.501 | 0.880 | 0.407 | 0.124 | 0.034 | 0.359 | 0.094 |
| Tanuki-8B-dpo-v1.0 | 0.406 | 0.334 | 0.283 | 0.469 | 0.501 | 0.816 | 0.377 | 0.487 | 0.178 | 0.333 | 0.288 |

| Model name | JMT avg | Code | Ext | Human | Math | Reason | Role | STEM | Write |
|---|---|---|---|---|---|---|---|---|---|
| Aya Expanse 8B | 0.637 | 0.494 | 0.718 | 0.855 | 0.398 | 0.433 | 0.737 | 0.677 | 0.787 |
| Falcon3-7B-Instruct | 0.377 | 0.549 | 0.506 | 0.340 | 0.406 | 0.257 | 0.299 | 0.340 | 0.317 |
| Falcon3-10B-Instruct | 0.413 | 0.509 | 0.545 | 0.382 | 0.480 | 0.356 | 0.335 | 0.373 | 0.324 |
| Gemma 2 9B IT | 0.736 | 0.652 | 0.765 | 0.857 | 0.614 | 0.673 | 0.811 | 0.713 | 0.800 |
| Llama 3 8B Instruct | 0.529 | 0.467 | 0.706 | 0.692 | 0.310 | 0.433 | 0.542 | 0.532 | 0.546 |
| Llama-3-ELYZA-JP-8B | 0.587 | 0.389 | 0.706 | 0.647 | 0.426 | 0.613 | 0.684 | 0.533 | 0.697 |
| Llama 3 heron brain 8B v0.3 | 0.497 | 0.362 | 0.566 | 0.602 | 0.315 | 0.426 | 0.586 | 0.567 | 0.550 |
| Llama 3 Swallow 8B Instruct | 0.427 | 0.411 | 0.575 | 0.476 | 0.309 | 0.305 | 0.499 | 0.438 | 0.406 |
| Llama 3 Youko 8B Instruct | 0.616 | 0.464 | 0.757 | 0.769 | 0.414 | 0.487 | 0.695 | 0.583 | 0.753 |
| Llama 3.1 8B Instruct | 0.519 | 0.420 | 0.830 | 0.550 | 0.514 | 0.349 | 0.502 | 0.479 | 0.504 |
| Llama 3.1 Swallow 8B Instruct v0.1 | 0.581 | 0.427 | 0.738 | 0.675 | 0.527 | 0.453 | 0.615 | 0.593 | 0.624 |
| Llama 3.1 Swallow 8B Instruct v0.2 | 0.612 | 0.534 | 0.748 | 0.705 | 0.565 | 0.475 | 0.646 | 0.579 | 0.646 |
| Llama 3.1 Swallow 8B Instruct v0.3 | 0.705 | 0.562 | 0.756 | 0.869 | 0.610 | 0.512 | 0.783 | 0.748 | 0.803 |
| llm-jp-3-13b-instruct | 0.588 | 0.373 | 0.556 | 0.816 | 0.371 | 0.526 | 0.730 | 0.614 | 0.715 |
| Mistral-NeMo-Instruct-2407 (12B) | 0.616 | 0.515 | 0.698 | 0.702 | 0.512 | 0.481 | 0.669 | 0.660 | 0.691 |
| Mistral-NeMo-Minitron 8B Instruct | 0.567 | 0.547 | 0.684 | 0.649 | 0.545 | 0.454 | 0.564 | 0.549 | 0.541 |
| Mistral-7B-Instruct-v0.3 | 0.428 | 0.488 | 0.540 | 0.435 | 0.354 | 0.392 | 0.409 | 0.405 | 0.401 |
| Phi-4 | 0.769 | 0.692 | 0.929 | 0.795 | 0.914 | 0.544 | 0.754 | 0.688 | 0.840 |
| Qwen2-7B-Instruct | 0.646 | 0.512 | 0.771 | 0.719 | 0.687 | 0.514 | 0.683 | 0.563 | 0.717 |
| Qwen2.5-7B-Instruct | 0.665 | 0.599 | 0.741 | 0.719 | 0.637 | 0.541 | 0.744 | 0.624 | 0.713 |
| Qwen2.5-14B-Instruct | 0.762 | 0.673 | 0.829 | 0.798 | 0.828 | 0.571 | 0.815 | 0.743 | 0.841 |
| Swallow-MS-7b-instruct-v0.1 | 0.400 | 0.358 | 0.421 | 0.501 | 0.222 | 0.349 | 0.458 | 0.444 | 0.449 |
| Swallow-7b-instruct-v0.1 | 0.419 | 0.324 | 0.401 | 0.519 | 0.275 | 0.344 | 0.535 | 0.494 | 0.462 |
| Tanuki-8B-dpo-v1.0 | 0.529 | 0.461 | 0.597 | 0.562 | 0.495 | 0.377 | 0.589 | 0.509 | 0.643 |
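
The "Ja avg", "En avg", and "JMT avg" columns appear to be unweighted means of the per-task scores in the same row. The source does not state the averaging rule, so this is an assumption. The sketch below recomputes the mean for a few rows copied from the tables and compares it with the reported value; the helper names (`mean_score`, `matches_reported`, `check_rows`) and the 0.001 tolerance are illustrative choices, not part of the original evaluation.

```python
def mean_score(task_scores):
    """Unweighted mean of the per-task scores in one table row."""
    return sum(task_scores) / len(task_scores)


def matches_reported(reported_avg, task_scores, tol=1e-3):
    """True if the recomputed mean is within `tol` of the reported average.

    The tolerance is an assumption: it allows for the reported values
    being rounded or truncated to three decimals.
    """
    return abs(mean_score(task_scores) - reported_avg) < tol


# Rows copied from the tables: name -> (reported average, per-task scores).
ROWS = {
    "Gemma 2 9B IT (Ja)": (
        0.534,
        [0.931, 0.532, 0.526, 0.876, 0.149, 0.636, 0.273, 0.239, 0.623, 0.559],
    ),
    "Phi-4 (En)": (
        0.677,
        [0.378, 0.682, 0.647, 0.646, 0.903, 0.802, 0.899, 0.556, 0.654, 0.601],
    ),
    "Aya Expanse 8B (JMT)": (
        0.637,
        [0.494, 0.718, 0.855, 0.398, 0.433, 0.737, 0.677, 0.787],
    ),
    "Mistral-NeMo-Minitron 8B Instruct (Ja)": (
        0.478,
        [0.892, 0.498, 0.380, 0.578, 0.000, 0.556, 0.199, 0.193, 0.510, 0.496],
    ),
}


def check_rows(rows):
    """Map each row name to whether its reported average is reproduced."""
    return {
        name: matches_reported(reported, scores)
        for name, (reported, scores) in rows.items()
    }


if __name__ == "__main__":
    for name, ok in check_rows(ROWS).items():
        print(f"{name}: {'ok' if ok else 'mismatch'}")
```

Under this assumption the first three rows reproduce their reported averages, while the Japanese-task row for Mistral-NeMo-Minitron 8B Instruct does not (the listed scores average to about 0.430 against a reported 0.478). A failed check of this kind may point to a transcription issue in that row or to a different averaging rule, so that row's values are worth confirming against the original evaluation.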