Model | Ja avg | JComQA | JEMHQA | NIILC | JSQuAD | XL-Sum | MGSM | En-Ja | Ja-En | JMMLU | JHumanEval
---|---|---|---|---|---|---|---|---|---|---|---
Gemma 2 27B | 0.546 | 0.936 | 0.553 | 0.573 | 0.916 | 0.194 | 0.596 | 0.295 | 0.251 | 0.659 | 0.490
Llama 3 70B | 0.569 | 0.946 | 0.606 | 0.589 | 0.922 | 0.228 | 0.664 | 0.286 | 0.252 | 0.705 | 0.491
Llama 3 Swallow 70B | 0.594 | 0.968 | 0.675 | 0.684 | 0.923 | 0.239 | 0.708 | 0.307 | 0.255 | 0.706 | 0.477
Llama 3 Youko 70B | 0.571 | 0.946 | 0.602 | 0.610 | 0.923 | 0.242 | 0.684 | 0.292 | 0.250 | 0.704 | 0.463
Llama 3.1 70B | 0.566 | 0.946 | 0.616 | 0.603 | 0.925 | 0.228 | 0.672 | 0.287 | 0.257 | 0.669 | 0.462
Llama 3.1 Swallow 70B v0.1 | 0.593 | 0.955 | 0.645 | 0.678 | 0.923 | 0.272 | 0.684 | 0.320 | 0.259 | 0.709 | 0.487
Llama 3.3 Swallow 70B v0.4 | 0.629 | 0.967 | 0.671 | 0.732 | 0.924 | 0.283 | 0.776 | 0.327 | 0.260 | 0.742 | 0.604
Qwen2-72B | 0.593 | 0.960 | 0.620 | 0.561 | 0.926 | 0.238 | 0.768 | 0.275 | 0.241 | 0.782 | 0.561
Qwen2.5-72B | 0.623 | 0.972 | 0.611 | 0.619 | 0.930 | 0.279 | 0.828 | 0.287 | 0.252 | 0.804 | 0.648
Sarashina2-70B | 0.530 | 0.929 | 0.717 | 0.668 | 0.929 | 0.190 | 0.488 | 0.313 | 0.243 | 0.592 | 0.235
Swallow 70B | 0.519 | 0.920 | 0.626 | 0.689 | 0.920 | 0.225 | 0.480 | 0.304 | 0.231 | 0.579 | 0.220
Swallow-MX 8x7B v0.1 | 0.506 | 0.922 | 0.533 | 0.577 | 0.917 | 0.263 | 0.444 | 0.272 | 0.209 | 0.565 | 0.358
Yi-1.5 34B | 0.468 | 0.869 | 0.461 | 0.332 | 0.899 | 0.238 | 0.520 | 0.219 | 0.208 | 0.591 | 0.346
Model | En avg | OpenBookQA | TriviaQA | HellaSwag | SQuAD2 | XWINO | MMLU | GSM8K | MATH | BBH | HumanEval
---|---|---|---|---|---|---|---|---|---|---|---
Gemma 2 27B | 0.655 | 0.412 | 0.780 | 0.675 | 0.549 | 0.921 | 0.754 | 0.757 | 0.438 | 0.760 | 0.508
Llama 3 70B | 0.689 | 0.440 | 0.826 | 0.690 | 0.618 | 0.920 | 0.787 | 0.801 | 0.446 | 0.829 | 0.527
Llama 3 Swallow 70B | 0.672 | 0.430 | 0.823 | 0.682 | 0.628 | 0.923 | 0.774 | 0.817 | 0.414 | 0.734 | 0.499
Llama 3 Youko 70B | 0.671 | 0.436 | 0.829 | 0.690 | 0.610 | 0.922 | 0.785 | 0.797 | 0.408 | 0.826 | 0.412
Llama 3.1 70B | 0.671 | 0.450 | 0.829 | 0.690 | 0.605 | 0.920 | 0.786 | 0.798 | 0.434 | 0.655 | 0.546
Llama 3.1 Swallow 70B v0.1 | 0.679 | 0.428 | 0.826 | 0.690 | 0.612 | 0.927 | 0.772 | 0.809 | 0.380 | 0.806 | 0.540
Llama 3.3 Swallow 70B v0.4 | 0.711 | 0.424 | 0.817 | 0.683 | 0.641 | 0.920 | 0.802 | 0.863 | 0.496 | 0.754 | 0.709
Qwen2-72B | 0.702 | 0.418 | 0.790 | 0.677 | 0.673 | 0.915 | 0.842 | 0.893 | 0.560 | 0.643 | 0.608
Qwen2.5-72B | 0.709 | 0.416 | 0.760 | 0.685 | 0.693 | 0.901 | 0.861 | 0.870 | 0.626 | 0.727 | 0.554
Sarashina2-70B | 0.491 | 0.388 | 0.537 | 0.628 | 0.675 | 0.917 | 0.630 | 0.011 | 0.206 | 0.639 | 0.281
Swallow 70B | 0.543 | 0.416 | 0.761 | 0.643 | 0.522 | 0.920 | 0.659 | 0.503 | 0.108 | 0.655 | 0.240
Swallow-MX 8x7B v0.1 | 0.589 | 0.348 | 0.773 | 0.651 | 0.538 | 0.919 | 0.692 | 0.574 | 0.298 | 0.686 | 0.410
Yi-1.5 34B | 0.650 | 0.402 | 0.708 | 0.662 | 0.754 | 0.910 | 0.774 | 0.743 | 0.394 | 0.763 | 0.385
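The "Ja avg" and "En avg" columns above appear to be the unweighted mean of the ten per-task scores in each row. A minimal sketch verifying this for the Gemma 2 27B rows, with values copied from the tables (the assumption that the average is unweighted is mine):

```python
# Check that the reported averages are the plain mean of the ten task
# scores, using the Gemma 2 27B rows from the tables above.
ja_scores = [0.936, 0.553, 0.573, 0.916, 0.194, 0.596, 0.295, 0.251, 0.659, 0.490]
en_scores = [0.412, 0.780, 0.675, 0.549, 0.921, 0.754, 0.757, 0.438, 0.760, 0.508]

ja_avg = round(sum(ja_scores) / len(ja_scores), 3)
en_avg = round(sum(en_scores) / len(en_scores), 3)
print(ja_avg, en_avg)  # 0.546 0.655, matching the reported Ja avg and En avg
```

The same check reproduces the averages for the other rows, which suggests no task is weighted differently when computing the aggregate.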
Model | JMT avg | Code | Ext | Human | Math | Reason | Role | STEM | Write
---|---|---|---|---|---|---|---|---|---
Gemma 2 27B | | | | | | | | | |
Llama 3 70B | | | | | | | | | |
Llama 3 Swallow 70B | | | | | | | | | |
Llama 3 Youko 70B | | | | | | | | | |
Llama 3.1 70B | | | | | | | | | |
Llama 3.1 Swallow 70B v0.1 | | | | | | | | | |
Llama 3.3 Swallow 70B v0.4 | | | | | | | | | |
Qwen2-72B | | | | | | | | | |
Qwen2.5-72B | | | | | | | | | |
Sarashina2-70B | | | | | | | | | |
Swallow 70B | | | | | | | | | |
Swallow-MX 8x7B v0.1 | | | | | | | | | |
Yi-1.5 34B | | | | | | | | | |