Stars
This repository includes a benchmark and code for the paper "Evaluating LLMs at Detecting Errors in LLM Responses".
Arena-Hard-Auto: An automatic LLM benchmark.
This is the repo for the paper "Shepherd: A Critic for Language Model Generation".
Codebase for the ICLR 2024 paper "Fast-DetectGPT: Efficient Zero-Shot Detection of Machine-Generated Text via Conditional Probability Curvature".
A survey and reflection on the latest research breakthroughs in LLM-generated text detection, covering data, detectors, metrics, current issues, and future directions.
potato: portable text annotation tool
A library for advanced large language model reasoning
[ACL 2023] Reasoning with Language Model Prompting: A Survey
KokoMind: Can LLMs Understand Social Interactions?
Reasoning with Language Model is Planning with World Model
ChatGLM-6B: An Open Bilingual Dialogue Language Model.
Instruct-tune LLaMA on consumer hardware
Source code and data for "The Magic of IF: Investigating Causal Reasoning Abilities in Large Language Models of Code" (Findings of ACL 2023).
WikiWhy is a new benchmark for evaluating LLMs' ability to explain cause-and-effect relationships. It is a QA dataset containing 9,000+ "why" question-answer-rationale triplets.
Repo for "Generating Flashbacks in Stories" (NAACL 2022).
Repo for the NAACL paper "IMHO Fine-Tuning Improves Claim Detection".
Python client for Moss: A System for Detecting Software Similarity
Assessing Humor in Edited News Headlines
EMNLP 2021 - CTC: A Unified Framework for Evaluating Natural Language Generation