Studio AI

Interactive Studio Artificial Intelligence

About

Studio AI is a client/server application for an interactive studio Artificial Intelligence (AI), represented by a dynamically rendered avatar. The avatar receives its inputs via a microphone device connected to a Speech-to-Text engine, performs its reasoning process with a Text-to-Text (Chat) engine, and sends its outputs through a Text-to-Speech engine for driving an AI avatar whose audio and video streams are injected back into the studio production process. The result is an AI avatar the people on the studio stage can interact with in nearly real-time. This is intended for including an AI participant in a discussion or Q&A round.
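
The following TypeScript sketch illustrates this pipeline in simplified form. It is only an assumption about the general control flow; the interfaces SpeechToText, ChatEngine, and AvatarRenderer are hypothetical placeholders, not the actual Studio AI classes or vendor SDK APIs.

    /*  hypothetical pipeline sketch -- all names below are
        illustrative placeholders, not actual Studio AI code  */
    interface SpeechToText   { transcribe (audio: Buffer): Promise<string> }  /*  e.g. backed by Deepgram  */
    interface ChatEngine     { complete   (prompt: string): Promise<string> } /*  e.g. backed by OpenAI    */
    interface AvatarRenderer { speak      (text: string): Promise<void> }     /*  e.g. backed by HeyGen    */

    /*  one round-trip of the interaction loop:
        microphone audio -> transcript -> chat answer -> avatar speech  */
    async function interact (
        stt:    SpeechToText,
        chat:   ChatEngine,
        avatar: AvatarRenderer,
        audio:  Buffer
    ): Promise<void> {
        const question = await stt.transcribe(audio)    /*  Speech-to-Text  */
        const answer   = await chat.complete(question)  /*  Text-to-Text    */
        await avatar.speak(answer)                      /*  Text-to-Speech  */
    }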

The Speech-to-Text engine is based on the Deepgram cloud service, the Text-to-Text engine is based on the OpenAI ChatGPT cloud service, and the Text-to-Speech engine is based on the HeyGen Interactive Avatar cloud service. Currently Studio AI works at least for English and German languages.

NOTICE: As a consequence, to use Studio AI you need API keys for all three of these cloud services.

[collage image]

Screenshots

The following five screenshots give an impression of Studio AI. The first three screenshots show the settings dialogs of the CONTROL mode. The fourth screenshot shows the control dialog of the CONTROL mode (with German language examples). The fifth screenshot shows the client in RENDER mode within OBS Studio.

[screenshots 1-5]

Architecture

Studio AI is written in TypeScript and consists of a central Node.js-based server component and an HTML5 Single-Page Application (SPA) as the client component. The client component, in turn, runs in two distinct modes: an interactive control mode and an autonomous avatar rendering mode. The clients communicate with each other through their bi-directional WebSocket connections to the server.
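
As a rough illustration of this communication pattern, the following sketch shows a minimal WebSocket relay built on the ws package, where every message from one client is forwarded to all other connected clients. This is only a sketch of the general mechanism under assumed details (plain WebSocket on port 12345); it is not the actual Studio AI server code.

    import { WebSocketServer, WebSocket } from "ws"

    /*  minimal relay sketch: broadcast every message received from one
        client (e.g. control mode) to all other clients (e.g. render mode)  */
    const wss = new WebSocketServer({ port: 12345 })  /*  port assumed for illustration only  */
    const clients = new Set<WebSocket>()

    wss.on("connection", (ws) => {
        clients.add(ws)
        ws.on("message", (data) => {
            for (const peer of clients)
                if (peer !== ws && peer.readyState === WebSocket.OPEN)
                    peer.send(data.toString())
        })
        ws.on("close", () => { clients.delete(ws) })
    })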

[architecture diagram]

The core of the application can be found in the following software components:

Usage (Production)

  • Under Windows/macOS/Linux install Node.js for the server run-time, Google Chrome for the client run-time (control mode) and either OBS Studio or vMix for the client run-time (renderer mode).

  • Create and use local working copy:
    git clone https://github.com/rse/studio-ai && cd studio-ai

  • Provide the API keys for the required cloud services (see the environment-loading sketch after this list):
    echo "STUDIOAI_DEEPGRAM_API_TOKEN=\"<token1>\"" >.env
    echo "STUDIOAI_OPENAI_API_TOKEN=\"<token2>\"" >>.env
    echo "STUDIOAI_HEYGEN_API_TOKEN=\"<token3>\"" >>.env

  • Install all dependencies:
    npm install --production

  • Run the production build-process once:
    npm start build

  • Run the bare server component:
    npm start server

  • Open the client component (control mode) in Google Chrome:
    https://127.0.0.1:12345/

  • Use the client component (renderer mode) in OBS Studio or vMix browser sources:
    https://127.0.0.1:12345/#/render
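
As a hedged illustration of the .env step above, a Node.js server could load the three tokens with the dotenv package roughly as follows. The variable names come from the commands above; the fail-fast check is an illustrative addition, not the actual Studio AI startup code.

    import * as dotenv from "dotenv"

    /*  load .env into process.env (sketch of the general mechanism)  */
    dotenv.config()

    /*  the variable names match the ones written to .env above  */
    const required = [
        "STUDIOAI_DEEPGRAM_API_TOKEN",
        "STUDIOAI_OPENAI_API_TOKEN",
        "STUDIOAI_HEYGEN_API_TOKEN"
    ]
    for (const name of required)
        if (!process.env[name])
            throw new Error(`missing required environment variable: ${name}`)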

Usage (Development)

  • Under Windows/macOS/Linux install Node.js for the server run-time and Google Chrome for the client run-time (both control mode and renderer mode), plus Visual Studio Code with its TypeScript, ESLint and VueJS extensions.

  • Create and use local working copy:
    git clone https://github.com/rse/studio-ai && cd studio-ai

  • Provide the API keys for the required cloud services:
    echo "STUDIOAI_DEEPGRAM_API_TOKEN=\"<token1>\"" >.env
    echo "STUDIOAI_OPENAI_API_TOKEN=\"<token2>\"" >>.env
    echo "STUDIOAI_HEYGEN_API_TOKEN=\"<token3>\"" >>.env

  • Install all dependencies:
    npm install

  • Run the development build-process once:
    npm start build-dev

  • Run the development build-process and server component continuously:
    npm start dev

  • Open the client component (control mode) in Google Chrome:
    https://127.0.0.1:12345/

  • Open the client component (renderer mode) in Google Chrome:
    https://127.0.0.1:12345/#/render
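
The #/render URL fragment suggests that the SPA selects its mode via hash-based routing. The following sketch is a hypothetical illustration of that idea only; the function and type names are made up and do not reflect the actual Studio AI client code.

    /*  hypothetical sketch: derive the client mode from the URL hash  */
    type ClientMode = "control" | "render"

    function currentMode (): ClientMode {
        return window.location.hash.startsWith("#/render") ? "render" : "control"
    }

    /*  usage: start the matching user interface  */
    if (currentMode() === "render")
        console.log("starting autonomous avatar rendering mode")
    else
        console.log("starting interactive control mode")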

History

The Studio AI application was inspired by a prototype application from msg systems ag, which employees of its public sector division and AI cross-division initially crafted for controlling an AI avatar during the panel discussion at the Nordl@nder Digital conference in September 2024. This prototype application was based on an earlier version of the HeyGen Interactive Avatar Demo for the HeyGen Streaming API.

In October 2024 Dr. Ralf S. Engelschall, CTO of msg group, initially integrated this prototype application into his msg Filmstudio. Unfortunately, the implementation did not allow a seamless studio integration. As a result, he took only the ideas of the prototype application and developed Studio AI from scratch in order to allow a more robust integration into a studio production process.

See Also

Copyright & License

Copyright © 2024 Dr. Ralf S. Engelschall
Licensed under GPL 3.0
