VirtualBar

A Touch Bar, but worse, using computer vision and ML. Done as part of my COSC428 Computer Vision project.

Inspired by Anish Athalye's MacBook touchscreen

Using a mirror, the webcam sees the keyboard; the system detects two- and three-finger gestures on the not-a-Touch-Bar area above the keyboard and uses them to control system volume and brightness.

[GIF: three-finger gesture demo]

Brightness control is achieved using nriley/brightness.

The gesture area is detected by taking the mean of each row to get a 1 px wide image, applying a Sobel filter to find the derivative, and using the result to detect horizontal edges. Heuristics then pick the pair of edges that corresponds to the gesture area above the keyboard.
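The edge-detection step above can be sketched as follows. This is an illustrative Python/NumPy sketch, not the project's Swift/Metal code (which runs on the GPU); the function name and threshold are invented for the example.

```python
import numpy as np

def find_horizontal_edges(frame: np.ndarray, threshold: float = 10.0):
    """frame: (H, W) grayscale image. Returns indices of rows with strong horizontal edges."""
    row_means = frame.mean(axis=1)           # collapse each row: a 1 px wide image
    kernel = np.array([-1.0, 0.0, 1.0])      # 1-D Sobel-like derivative kernel
    deriv = np.convolve(row_means, kernel, mode="same")
    return np.nonzero(np.abs(deriv) > threshold)[0]

# A bright horizontal band (the gesture area) between darker regions produces
# a pair of strong edges; heuristics then pick the pair above the keyboard.
frame = np.zeros((100, 64))
frame[40:60, :] = 255.0                      # synthetic bright gesture band
edges = find_horizontal_edges(frame)         # edges cluster around rows 40 and 60
```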

Hands are detected using Apple's Vision framework. It doesn't work well, so the full hand needs to be in frame. MediaPipe Hands is better, but I could barely get it to compile and had no idea where to start with porting it to macOS and integrating it with this system.

Works on my 2020 MacBook Air. It will probably work on 13" MacBook Pros, on 15" MacBook Pros with some minor tweaking, and possibly on the 2021 MacBook Pros with some more tweaking.

Requires macOS Big Sur or later. Tested on an M1 Mac, so it might be very slow on Intel ones.

If you want to run this and don't have a MacBook Air, go to MetalView.swift, set useLiveCamera to false, and set videoPath to the absolute path of the example video file, footage/virtualbar_example_footage.mov, found in the repo.

Code

  • MetalView is responsible for setting up the camera. It outputs 720p video at 30 fps. If the useLiveCamera variable is set to false, input from a video file is used instead. The start and stop times, as well as the playback rate (slowDown), can be set in the VideoFeed constructor.
  • Renderer is responsible for the main render loop as well as processing. When a frame arrives, it is converted to a Metal texture. Then:
    • Straighten straightens the image and corrects lens distortion; the resulting texture is passed to ImageMean
    • ImageMean determines the active area and sets the static activeArea array to a list of candidate areas, the first being the top candidate. Values are [yMin, yMax], where 0 is the bottom and 1 is the top
      • ActiveAreaDetector is responsible for detecting possible active areas within a single frame
      • ActiveAreaSelector keeps a record of active areas from previous frames
    • FingerDetector runs the hand pose detector on the original frame (not the straightened one) and sets the volume/brightness as required.
      • GestureRecognizer is responsible for determining gesture start/stop, and the type of gesture
      • When a gesture is detected, it tells ActiveAreaSelector to lock the current top candidate

About

A virtual TouchBar through a camera and mirror
