City Detect
- https://www.linkedin.com/in/jonathanrichardson7/
Stars
D-FINE: Redefine Regression Task of DETRs as Fine-grained Distribution Refinement [ICLR 2025 Spotlight]
DSPy: The framework for programming—not prompting—language models
OpenMMLab Detection Toolbox and Benchmark
Pre-trained Deep Learning models and demos (high quality and extremely fast)
An MIT-licensed implementation of YOLOv9, YOLOv7, YOLO-RD
[CVPR 2024] Official RT-DETR (RTDETR paddle pytorch), Real-Time DEtection TRansformer, DETRs Beat YOLOs on Real-time Object Detection. 🔥 🔥 🔥
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer
Creation of annotated datasets from scratch using Generative AI and Foundation Computer Vision models
LLM2CLIP makes SOTA pretrained CLIP models more SOTA than ever.
RetinaFace: Deep Face Detection Library for Python
Code to Blur Human Faces and Vehicle License Plates in Video and Images using a SoTA Object Detection model YOLOv8
An open-source implementation for fine-tuning the Qwen2-VL and Qwen2.5-VL series by Alibaba Cloud.
REVISE: A Tool for Measuring and Mitigating Bias in Visual Datasets --- https://arxiv.org/abs/2004.07999
Famous Vision Language Models and Their Architectures
This repository is a curated collection of links to various courses and resources about Artificial Intelligence (AI)
InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions
MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone
Orchestrate zero-shot computer vision models
An Awesome List of Open-Source Data Engineering Projects
Implementing best practices for PySpark ETL jobs and applications.
This is a repo with links to everything you'd ever want to learn about data engineering
The best place to learn data engineering. Built and maintained by the data engineering community.
Data Engineering Practice Problems
User-friendly AI Interface (Supports Ollama, OpenAI API, ...)
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
✨✨Latest Advances on Multimodal Large Language Models
Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.