
MuhammadSaqlainAslam/tmmlu-leaderboard


TMMLU+ Leaderboard 🇹🇼

GitHub Pages License: MIT

📌 About TMMLU+

TMMLU+ (Traditional Chinese Massive Multitask Language Understanding) is a benchmark for evaluating Large Language Models (LLMs) within the linguistic and cultural context of Taiwan.

The benchmark covers 66 subjects including STEM, Social Sciences, Humanities, and professional certifications, providing a rigorous standard for Traditional Chinese NLP evaluation.

📊 Live Interactive Leaderboard

Our interactive dashboard allows you to explore model performance in detail:

  • Search & Filter: Find specific models instantly.
  • Visual Analytics: Compare performance via Discipline Radar Maps and Category Bar Charts.
  • Nested Drill-down: Expand models to see Major Disciplines and individual subject scores.
  • General Benchmarks: Includes evaluations for DRCD, TW-RAG, GSM8K, and more.
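The drill-down from individual subjects to Major Disciplines can be pictured as a grouping step. The sketch below is illustrative only: this README does not describe how the dashboard aggregates scores, so the unweighted mean, the function name `discipline_averages`, and the subject-to-discipline mapping are all assumptions.

```python
from collections import defaultdict


def discipline_averages(subject_scores, subject_to_discipline):
    """Group per-subject scores by discipline and take the unweighted mean.

    Assumption: the real leaderboard may weight subjects differently
    (for example by question count); this is only a sketch.
    """
    grouped = defaultdict(list)
    for subject, score in subject_scores.items():
        grouped[subject_to_discipline[subject]].append(score)
    return {name: sum(vals) / len(vals) for name, vals in grouped.items()}


# Hypothetical subjects and scores, not taken from results/benchmark.csv.
example_scores = {"physics": 0.6, "chemistry": 0.4, "law": 0.5}
example_mapping = {"physics": "STEM", "chemistry": "STEM", "law": "Social Sciences"}
example_result = discipline_averages(example_scores, example_mapping)
```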

👉 Access the Interactive Leaderboard Here

📂 Repository Structure

├── .github/ISSUE_TEMPLATE/  # Model submission form configuration
├── docs/
│   └── index.html           # Website Frontend (Plotly, PapaParse, Bootstrap)
├── results/
│   └── benchmark.csv        # Central Data Source
└── README.md                # Project Documentation

🚀 How to Submit Results

We welcome contributions from the research community! To add your model:

  1. Prepare Data: Ensure results match the format in results/benchmark.csv.
  2. Submit an Issue: Click the "Submit Your Model Results" button on the live website.
  3. Pull Request: Fork this repo, add your model's column to the CSV, and submit a PR.
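Adding a model column to the CSV might look like the sketch below. It assumes that rows are subjects and columns are models, and that the first column is named `subject`. That column name, the helper `add_model_column`, and the sample data are hypothetical, so check the actual header of results/benchmark.csv before adapting it.

```python
import csv
import io


def add_model_column(csv_text, model_name, scores, key_column="subject"):
    """Return csv_text with one extra column holding `scores` for `model_name`.

    `key_column` is an assumed header name, not confirmed by the repository.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames or [])
    if key_column not in fieldnames:
        raise ValueError(f"missing key column: {key_column!r}")
    if model_name in fieldnames:
        raise ValueError(f"model already present: {model_name!r}")

    rows = list(reader)
    # Refuse partial submissions: every existing row needs a score.
    missing = [row[key_column] for row in rows if row[key_column] not in scores]
    if missing:
        raise ValueError(f"no score supplied for: {missing}")

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames + [model_name],
                            lineterminator="\n")
    writer.writeheader()
    for row in rows:
        row[model_name] = scores[row[key_column]]
        writer.writerow(row)
    return out.getvalue()


# Hypothetical data; the real file has its own subjects and models.
sample_csv = "subject,model_a\nphysics,0.5\nlaw,0.4\n"
updated_csv = add_model_column(sample_csv, "my_model", {"physics": 0.61, "law": 0.47})
```

To work on the real file, read it with `open("results/benchmark.csv", newline="", encoding="utf-8")`, pass its contents through the function, and write the result back before opening the PR.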

📄 Citation

If you utilize this benchmark or leaderboard in your research, please cite:

@misc{aslam2025tmmluplus,
  author = {Aslam, Muhammad Saqlain},
  title = {TMMLU+ Leaderboard: Traditional Chinese Massive Multitask Language Understanding Benchmark},
  year = {2025},
  publisher = {GitHub},
  journal = {GitHub Repository},
  howpublished = {\url{https://github.com/MuhammadSaqlainAslam/tmmlu-leaderboard}}
}

Maintained by: Muhammad Saqlain Aslam

Dedicated to the Traditional Chinese NLP Community.
