Stars
OBS plugin for local speech recognition and captioning using AI
EyeTrax – webcam-based eye tracking made simple
The Gaze Correction Camera project is an advanced real-time gaze correction system designed to enhance video communication by improving eye contact. Leveraging state-of-the-art computer vision and …
naklecha / fashionAI
Forked from Nutlope/roomGPT
Take a picture of a person, then modify clothing or explore fashion using our AI.
Virtual Clothing Assistant: a custom implementation of ViTON that lets users try on different clothing virtually
AI Lip Syncing application, deployed on Streamlit
This project is a digital human that can talk and listen to you. It uses OpenAI's GPT to generate responses, OpenAI's Whisper to transcribe the audio, Eleven Labs to generate voice and Rhubarb Lip …
DocTranslator is a powerful document AI translation tool that supports multiple file formats, OpenAI APIs, batch operations, multi-threading, and Docker deployment for efficient translation tasks! …
I developed the state-of-the-art YOLOv5x6-TTA for image manipulation detection
Classifies a given image as authentic or tampered by doing two levels of analysis. Implemented using PyTorch.
Benchmarking library for image manipulation detection.
Paper: CVPR2018, Learning Rich Features for Image Manipulation Detection
IFAKE is an application for detecting image and video forgery, designed to help users verify the authenticity of digital media. This repository also contains the AI model and dataset that we develo…
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
A JS magnify lens effect for images. This project allows you to add a magnifying glass effect to any image, complete with adjustable magnification via the scroll wheel.
Real-time browser-based Voice Activity Detection (VAD) using JavaScript and the Web Audio API. A modular and easily expandable web application template for integrating voice-triggered functionality…
Lightweight GPT-4 Vision processing over the Webcam
The AI-Powered Virtual Calculator lets users draw math equations using hand gestures. The system processes the input with OpenCV and sends it to Google's Gemini API, which generates detailed soluti…
pix2tex: Using a ViT to convert images of equations into LaTeX code.
Convert images of LaTeX math equations into LaTeX code.
Math OCR model that outputs LaTeX and markdown
AI-powered math chatbot with Gemini, Whisper, file upload, and voice input
Augmented reality project to try out rings in mobile environment
React.js + Three.js 3D room planner & product configurator (bundled version)
3D model viewer with high quality rendering and glTF2.0/GLB export
An Omegle-like site that lets two random users connect to each other through live video and chat
MedLSAM: Localize and Segment Anything Model for 3D Medical Images