
aggiacca/talon-config

Talon Config

What is this?

Talon Voice is a computer voice control engine developed by lunixbochs that, among other things, aims to bring "full desktop computer proficiency to people who have limited or no use of their hands".

To make use of it, you need to provide a configuration for the engine. Several ready-to-use configurations are available, such as knausj_talon for the beta version or 2shea's starter pack for the release version.

This repository is another configuration set, based upon a heavily modified knausj_talon, taking bits and pieces from the talonvoice examples, as well as some snippets from the Slack channel.

Why another configuration set?

I have a non-native accent, a somewhat noisy environment, and the wrong mic for the job. And I wanted to understand how Talon configurations work.

As a result, I use a different spelling alphabet, always employ prefixed command words, and structure the directories around isolated features instead of large categories. The changed alphabet and commands make working with this configuration set slower, but the added redundancy makes for a bit more reliable recognition.

Should I use this?

This configuration is almost unusably incomplete. Only use it if you face the same problems as I do. Otherwise the community-supported configurations will very likely suit your needs much better.

About the folder structure

As far as possible, each folder implements only a small number of strongly related features. Talon is meant to load the folders in lexicographic order, but this is currently (2020-04-29) bugged: for now it only orders the leaves, i.e. the files in the most deeply nested folders. However, this configuration does not currently depend on the load order at all. The numeric prefixes are mainly there to sort the folders for us humans, illustrating a path from very low-level, common scripts to more and more high-level ones.
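To illustrate what lexicographic ordering means for numbered folders, here is a minimal sketch. The folder names are hypothetical and not taken from this repository; the point is only that numeric prefixes sort as expected when they are zero-padded.

```python
# Hypothetical folder names, for illustration only.
unpadded = ["10_apps", "2_editing", "01_core"]
padded = ["10_apps", "02_editing", "01_core"]

# Lexicographic order compares character by character, so "10" sorts
# before "2" unless the numbers are zero-padded to the same width.
unpadded_order = sorted(unpadded)
padded_order = sorted(padded)

print(unpadded_order)
print(padded_order)
```

With unpadded prefixes the folder starting with "10" ends up ahead of the one starting with "2", which is why a fixed-width prefix is the safer convention.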

Other Talon tips

Linux

  • ImGUI issues: on some Linux distributions, ImGUI will crash Talon with a segmentation fault after help and/or command history have been used a few times. To work around this, I set software=True in the imgui annotations. If the cause were a driver issue rather than Mesa, forcing software rendering by setting the environment variable LIBGL_ALWAYS_SOFTWARE=1 would also be an option.
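As a rough sketch of the environment-variable route, the helper below builds an environment with LIBGL_ALWAYS_SOFTWARE=1 that could be passed to a launcher. The function name and the commented-out launch path are hypothetical; only the variable name comes from the tip above.

```python
import os


def software_rendering_env(base=None):
    """Return a copy of the environment with software rendering forced."""
    env = dict(os.environ if base is None else base)
    env["LIBGL_ALWAYS_SOFTWARE"] = "1"
    return env


env = software_rendering_env()
print(env["LIBGL_ALWAYS_SOFTWARE"])

# Hypothetical launch; adjust the path to wherever Talon is installed:
# import subprocess
# subprocess.Popen(["/path/to/talon"], env=env)
```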

Microphones, Linux, PulseAudio, Focusrite Scarlett

Talon requests a mono channel from your microphone, as is proper. However, when using mono on the stereo Focusrite Scarlett Solo USB interface, either PulseAudio or the interface imposes a hard limit at -6 dB, not through reduced gain but through plain numerical limiting. This causes unnecessary audio artifacts which, when they occur, do nothing to improve the recognition rate. Reducing the mic gain on the interface, on the other hand, can cause the VAD (voice activity detection) to miss your commands.

The solution is to use PulseAudio for providing a virtual mono audio source, which in the background uses the interface in stereo mode, and copies the left channel to its one and only mono channel. Thankfully, this is a single line.

For those who also use a Focusrite Scarlett, see the utils/setup-microphone.sh script for this.
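The script itself is not reproduced here. As a minimal sketch of the idea, assuming PulseAudio's module-remap-source is used for the virtual source, the helper below builds a pactl call that exposes the left channel of a stereo source as a mono source. The helper name, the source name talon_mono and the master device name are placeholders; the real device name can be found with `pactl list short sources`.

```python
import shlex


def remap_source_command(master, source_name="talon_mono"):
    """Build a pactl call exposing the left channel of `master` as a mono source."""
    return [
        "pactl", "load-module", "module-remap-source",
        f"source_name={source_name}",       # name of the new virtual source
        f"master={master}",                 # the stereo source to read from
        "channels=1",                       # the virtual source is mono
        "master_channel_map=front-left",    # take only the left input channel
        "channel_map=mono",                 # and present it as the mono channel
    ]


# Placeholder device name, not a real one.
cmd = remap_source_command("alsa_input.usb-EXAMPLE.analog-stereo")
print(shlex.join(cmd))
```

The command is only printed, not executed, so the sketch can be inspected safely before running the resulting line in a shell.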

Talon Getting Started and Documentation

Talon Modes

Talon supports so-called modes. Modes are like a context for contexts: some contexts (in .py or .talon files) may be available only in a few modes or in a single one. As a result, each mode can offer the user its own set of commands.

In a .py file, use ctx.matches = 'mode: all' to set the mode; in a .talon file, add a line mode: all to the header.

  • command: activates voice commands
  • hotkey: activates hotkey commands
  • noise: activates noise commands
  • dictation: when using voice primarily for dictating text instead of issuing commands
  • sleep: while sleeping
  • dragon: activates Dragon NaturallySpeaking dictation mode
mode.enable(name)
mode.disable(name)
mode.toggle(name)
mode.save()
mode.restore()
speech.disable()
speech.enable()
mod.mode('name', desc='this is what the mode is for')
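
To make the "context for contexts" idea concrete, here is a toy model in plain Python. It is not Talon's implementation and does not use the Talon API: the class, the context names and the exact semantics of save/restore and of the "all" mode are assumptions made for illustration, loosely mirroring the function names listed above.

```python
class ModeRegistry:
    """Toy model of mode gating; NOT Talon's implementation."""

    def __init__(self, active=("command",)):
        self.active = set(active)
        self._saved = None

    def enable(self, name):
        self.active.add(name)

    def disable(self, name):
        self.active.discard(name)

    def toggle(self, name):
        if name in self.active:
            self.disable(name)
        else:
            self.enable(name)

    def save(self):
        # Assumption: save() snapshots the currently active modes.
        self._saved = set(self.active)

    def restore(self):
        # Assumption: restore() reinstates the last snapshot, if any.
        if self._saved is not None:
            self.active = set(self._saved)

    def matches(self, required_mode):
        # Assumption: "all" matches regardless of which modes are active.
        return required_mode == "all" or required_mode in self.active


# Hypothetical contexts, each mapped to the mode it requires.
contexts = {
    "wake word": "all",
    "editor commands": "command",
    "free text": "dictation",
}


def available(modes):
    """Names of the contexts whose required mode is currently matched."""
    return sorted(name for name, mode in contexts.items() if modes.matches(mode))


modes = ModeRegistry()
in_command = available(modes)

modes.save()
modes.disable("command")
modes.enable("dictation")
in_dictation = available(modes)

modes.restore()
after_restore = available(modes)

print(in_command, in_dictation, after_restore, sep="\n")
```

Switching from command to dictation swaps which contexts are available, while the context bound to "all" stays reachable throughout.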

List of Talon config-sets

License

Everything is available under the Unlicense, which means as public domain as possible. See the file LICENSE for details.

If you contribute from other sources, please check that their license is compatible. (MIT requires attribution in source.)
