Shanghai, China (UTC +08:00) · weigao266.github.io
LASP Public
Forked from OpenNLPLab/LASP: Linear Attention Sequence Parallelism (LASP)
Python · Updated Mar 13, 2025
Linear-MoE Public
Forked from OpenSparseLLMs/Linear-MoE
Megatron-LM Public
Forked from NVIDIA/Megatron-LM: Ongoing research training transformer models at scale
Python · Other · Updated Dec 12, 2024
LLaMA-MoE-v2 Public
Forked from OpenSparseLLMs/LLaMA-MoE-v2: LLaMA-MoE v2: Exploring Sparsity of LLaMA from the Perspective of Mixture-of-Experts with Post-Training
Python · Apache License 2.0 · Updated Nov 26, 2024
DeepSpeed-LASP Public
Forked from deepspeedai/DeepSpeed: DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Python · Apache License 2.0 · Updated May 23, 2024
fairscale-CO2 Public
The Fairscale framework with CO2 integrated.
Python · MIT License · Updated Apr 29, 2024
fairseq-CO2 Public
Forked from facebookresearch/fairseq: Example of using CO2 within Fairseq.
Python · MIT License · Updated Apr 28, 2024
ring-attention-pytorch Public
Forked from lucidrains/ring-attention-pytorch: Implementation of 💍 Ring Attention, from Liu et al. at Berkeley AI, in PyTorch
Python · MIT License · Updated Apr 17, 2024
TNL-MoE Public
Forked from pjlab-sys4nlp/llama-moe: TNL-MoE: Building Mixture-of-Experts from TransNormerLLM (TNL) with Continual Pre-training
Python · Apache License 2.0 · Updated Feb 29, 2024
DataDriven-POPF Public
Data-driven probabilistic optimal power flow (POPF) using probabilistic methods
Hard Public
Exercise code for "Learn Python the Hard Way"
Python · Apache License 2.0 · Updated Mar 29, 2020