Scientific NLP, Science of Science, Recommendation Systems, OS, Rust
Tencent
China
Stars
LLM-eval
Evaluation tools for LLMs
6 repositories
Code for the paper "Evaluating Large Language Models Trained on Code"
CMMLU: Measuring massive multitask language understanding in Chinese
Official github repo for C-Eval, a Chinese evaluation suite for foundation models [NeurIPS 2023]
A Framework for the Systematic Evaluation of Chat-Optimized Language Models as Conversational Agents and an Extensible Benchmark
A framework for few-shot evaluation of language models.
Do Multilingual Language Models Think Better in English?