Stars
A course on LLM inference serving on Apple Silicon for systems engineers.
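A GUI tool for extracting hard-coded subtitles (hardsubs) from videos and generating srt files. Text recognition runs locally, with no third-party API required. A deep-learning-based video subtitle extraction framework that includes subtitle region detection and subtitle content extraction.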
Implementation of Reinforcement Learning Algorithms. Python, OpenAI Gym, Tensorflow. Exercises and Solutions to accompany Sutton's Book and David Silver's course.
ICCV 2023: CLIPN for Zero-Shot OOD Detection: Teaching CLIP to Say No
Repo for the Deep Reinforcement Learning Nanodegree program
A high-performance distributed training framework for Reinforcement Learning
Hands-on Deep Reinforcement Learning, published by Packt
PyTorch implementation of Soft Actor-Critic (SAC), Twin Delayed DDPG (TD3), Actor-Critic (AC/A2C), Proximal Policy Optimization (PPO), QT-Opt, PointNet..
A simple, well-styled PPO implementation. Based on my Medium series: https://medium.com/@eyyu/coding-ppo-from-scratch-with-pytorch-part-1-4-613dfc1b14c8
Example models using DeepSpeed
Policy Gradient is all you need! A step-by-step tutorial for well-known PG methods.
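A visual no-code/code-free web crawler/spider. 易采集: a visual tool for browser automation testing, data collection, and web crawling that lets you design and run crawler tasks graphically, without writing code. Also known as ServiceWrapper, an intelligent service-wrapping system for web applications.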
This is the homepage of a new book entitled "Mathematical Foundations of Reinforcement Learning."
This repository contains the official implementation of "FastVLM: Efficient Vision Encoding for Vision Language Models" - CVPR 2025
The simplest, fastest repository for training/finetuning small-sized VLMs.
Minimal implementation of clipped objective Proximal Policy Optimization (PPO) in PyTorch
Open-Qwen2VL: Compute-Efficient Pre-Training of Fully-Open Multimodal LLMs on Academic Resources
Implementing DeepSeek R1's GRPO algorithm from scratch
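PDF scientific paper translation with preserved formatting. AI-based full-text bilingual translation of PDF documents that fully preserves the layout; supports services such as Google/DeepL/Ollama/OpenAI and is available via CLI/GUI/MCP/Docker/Zotero.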
A programmer's guide to cooking at home (Simplified Chinese only).
Train transformer language models with reinforcement learning.