- University of Toronto
- Toronto, Canada
- https://universea.github.io/
- @pengsongzhang96
Stars
Autonomous Generalist Scientist / AI Scientist / Agent Scientist / Robot Scientist
🤗 smolagents: a barebones library for agents that think in code.
Fair-code workflow automation platform with native AI capabilities. Combine visual building with custom code, self-host or cloud, 400+ integrations.
A comprehensive list of excellent research papers, models, datasets, and other resources on Vision-Language-Action (VLA) models in robotics.
An open-source, code-first Python toolkit for building, evaluating, and deploying sophisticated AI agents with flexibility and control.
An open protocol enabling communication and interoperability between opaque agentic applications.
An MCP for searching and downloading academic papers from multiple sources such as arXiv, PubMed, and bioRxiv.
PyPaperBot is a Python tool for downloading scientific papers using Google Scholar, Crossref, SciHub, and SciDB.
StyleShot: A SnapShot on Any Style. A model that can transfer any style to any content and generates high-quality, personalized stylized images without per-image fine-tuning.
This repository serves as a comprehensive knowledge hub, curating cutting-edge research papers and developments across 25+ specialized domains.
CoTracker is a model for tracking any point (pixel) on a video.
A plugin that automatically downloads PDFs of Zotero items from Sci-Hub.
Official implementation of "Exploring Temporally-Aware Features for Point Tracking" (CVPR 2025)
A GUI Agent application based on UI-TARS (Vision-Language Model) that allows you to control your computer using natural language.
A lightweight, powerful framework for multi-agent workflows
Train your AI self, amplify you, bridge the world
Open-Source Chrome extension for AI-powered web automation. Run multi-agent workflows using your own LLM API key. Alternative to OpenAI Operator.
peng-zhihui / AimRT
Forked from AimRT/AimRT
A high-performance runtime framework for modern robotics.
🦉 OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation
A simple screen parsing tool toward a pure-vision-based GUI agent.
[CVPR 2024] Upscale-A-Video: Temporal-Consistent Diffusion Model for Real-World Video Super-Resolution
Finetune Qwen3, Llama 4, TTS, DeepSeek-R1 & Gemma 3 LLMs 2x faster with 70% less memory! 🦥
Magic to turn Cursor/Windsurf into 90% of Devin
Collect every awesome work about r1!