Shwai-He

shwaihe Shwai-He

27 followers · 24 following

Lists (1)

✨ Inspiration

2 repositories

Stars

QwenLM / ParScale

Parallel Scaling Law for Language Model — Beyond Parameter and Inference Time Scaling

Python 354 14 Updated May 17, 2025

ChenZiHong-Gavin / MoE-Visualizer

MoE-Visualizer is a tool designed to visualize the selection of experts in Mixture-of-Experts (MoE) models.

Python 11 1 Updated Apr 8, 2025

QwenLM / Qwen2.5-Omni

Qwen2.5-Omni is an end-to-end multimodal model by Qwen team at Alibaba Cloud, capable of understanding text, audio, vision, video, and performing real-time speech generation.

Jupyter Notebook 3,044 235 Updated May 28, 2025

EvolvingLMMs-Lab / lmms-eval

Accelerating the development of large multimodal models (LMMs) with one-click evaluation module - lmms-eval.

Python 2,527 285 Updated May 29, 2025

open-compass / opencompass

OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.

Python 5,420 579 Updated May 29, 2025

horseee / Awesome-Efficient-LLM

A curated list for Efficient Large Language Models

Python 1,689 135 Updated Apr 23, 2025

meta-math / MetaMath

MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models

Python 432 39 Updated Feb 1, 2024

HITsz-TMG / UMOE-Scaling-Unified-Multimodal-LLMs

The code for "Uni-MoE: Scaling Unified Multimodal Models with Mixture of Experts"

Python 720 44 Updated May 13, 2025

mbzuai-oryx / Awesome-LLM-Post-training

Awesome Reasoning LLM Tutorial/Survey/Guide

Python 1,662 120 Updated May 26, 2025

deepseek-ai / DeepEP

DeepEP: an efficient expert-parallel communication library

Cuda 7,719 779 Updated May 28, 2025

deepseek-ai / FlashMLA

FlashMLA: Efficient MLA decoding kernels

Cuda 11,575 837 Updated Apr 29, 2025

Zanette-Labs / efficient-reasoning

Python 64 6 Updated Apr 13, 2025

Zhen-Tan-dmml / LLM4Annotation

557 23 Updated May 28, 2025

vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 48,507 7,668 Updated May 29, 2025

FlagOpen / FlagEmbedding

Retrieval and Retrieval-augmented LLMs

Python 9,785 716 Updated May 28, 2025

unslothai / unsloth

Finetune Qwen3, Llama 4, TTS, DeepSeek-R1 & Gemma 3 LLMs 2x faster with 70% less memory! 🦥

Python 39,595 3,129 Updated May 29, 2025

MinishLab / model2vec

Fast State-of-the-Art Static Embeddings

Python 1,689 88 Updated May 29, 2025

huggingface / open-r1

Fully open reproduction of DeepSeek-R1

Python 24,610 2,271 Updated May 28, 2025

MIT-LCP / mimic-cxr

Code, documentation, and discussion around the MIMIC-CXR database

Jupyter Notebook 279 58 Updated Jul 13, 2020

deepseek-ai / DeepSeek-V3

Python 97,265 15,802 Updated Apr 9, 2025

microsoft / CryptoNets

CryptoNets is a demonstration of the use of Neural-Networks over data encrypted with Homomorphic Encryption. Homomorphic Encryptions allow performing operations such as addition and multiplication …

C# 292 75 Updated Jul 16, 2024

mosaicml / llm-foundry

LLM training code for Databricks foundation models

Python 4,246 561 Updated May 29, 2025

marcoszh / BatchCrypt

Python 87 18 Updated May 27, 2020

bigcode-project / bigcode-evaluation-harness

A framework for the evaluation of autoregressive code generation language models.

Python 946 243 Updated Oct 31, 2024

s1ghhh / lm-evaluation-harness

Forked from EleutherAI/lm-evaluation-harness

A framework for few-shot evaluation of language models.

Python 2 Updated Apr 21, 2024

1 Updated Nov 5, 2024

locuslab / wanda

A simple and effective LLM pruning approach.

Python 751 104 Updated Aug 9, 2024

google-deepmind / alphafold3

star-history / star-history

EnnengYang / Awesome-Model-Merging-Methods-Theories-Applications

alphadl / Awesome-LLM-Compression
