loulianzhang (lzlou) / Starred · GitHub

[NeurIPS 2024] How do Large Language Models Handle Multilingualism?

Python 34 3 Updated Nov 8, 2024

Trying to prototype a multimodal llm which can take text and audio as input and then output text.

Jupyter Notebook 9 2 Updated Jul 31, 2024

Build your own visual reasoning model

Jupyter Notebook 365 20 Updated May 16, 2025

Open neural machine translation models and web services

Python 691 76 Updated Dec 12, 2024

s1: Simple test-time scaling

Python 6,372 744 Updated Apr 4, 2025

A lightweight reproduction of DeepSeek-R1-Zero with in-depth analysis of self-reflection behavior.

Python 237 10 Updated Apr 15, 2025

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen2.5-VL, Qwen2.5-Omni, Qwen2-Audio, Ovis2, InternVL3, Llava, GLM4…

Python 7,606 644 Updated May 18, 2025

Democratizing Reinforcement Learning for LLMs

Jupyter Notebook 3,251 303 Updated May 13, 2025
Python 2 Updated Oct 10, 2024

Fully open reproduction of DeepSeek-R1

Python 24,452 2,250 Updated May 18, 2025

Multilingual Generative Pretrained Model

Jupyter Notebook 206 22 Updated May 13, 2024

BLEURT implementation in PyTorch

Python 32 5 Updated Jan 19, 2023

[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Python 3,004 247 Updated May 9, 2025

Quantized Attention achieves speedups of 2-3x and 3-5x compared to FlashAttention and xformers, respectively, without losing end-to-end metrics across language, image, and video models.

Cuda 1,499 106 Updated May 2, 2025

Paper reproduction of Google SCoRE (Training Language Models to Self-Correct via Reinforcement Learning)

Jupyter Notebook 139 23 Updated Sep 21, 2024
Python 93 16 Updated Dec 12, 2024

STACL simultaneous translation model with PaddlePaddle

JavaScript 9 1 Updated Aug 1, 2022

LLama3 Chinese personal version

39 2 Updated Apr 26, 2024

Open Multilingual Chatbot for Everyone

1,256 73 Updated May 1, 2025

Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.

Shell 21,297 1,403 Updated May 16, 2025

Finetune Qwen3, Llama 4, TTS, DeepSeek-R1 & Gemma 3 LLMs 2x faster with 70% less memory! 🦥

Python 38,881 3,045 Updated May 18, 2025

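Llama3 and Llama3.1 Chinese post-training repository: fine-tuned and modified weights of interest, plus tutorial videos and documentation on training, inference, evaluation, and deployment.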

Python 4,148 340 Updated May 7, 2025

Best practice for training LLaMA models in Megatron-LM

Python 650 57 Updated Jan 2, 2024

State-of-the-art LLM-based translation models.

Ruby 524 40 Updated Apr 9, 2025

MAD: The first work to explore Multi-Agent Debate with Large Language Models :D

Python 374 40 Updated Jan 14, 2025

GEMBA — GPT Estimation Metric Based Assessment

Python 118 23 Updated Jul 30, 2024

The official Python library for the OpenAI API

Python 26,713 3,897 Updated May 16, 2025

Open-Source Toolkit for End-to-End Speech Recognition leveraging PyTorch-Lightning and Hydra.

Python 700 116 Updated Oct 23, 2023

PyTorch implementation of "Transformer Transducer: A Streamable Speech Recognition Model with Transformer Encoders and RNN-T Loss" (ICASSP 2020)

Python 104 19 Updated Feb 27, 2022