How It Works

Until now, there has been no proven, legitimate way for programmers (or anyone doing fancy stuff) to replace their keyboard with voice control. Many people use voice dictation software to dictate text, but out of the box all voice dictation programs fall drastically short when it comes time to actually write and edit code or other non-prose text.

VoiceCode is a multifaceted methodology. Just as the combination of keyboard and mouse forms a complete computer input system, VoiceCode is a multi-part input system.

VoiceCode Has 6 Primary Components

Dragon Dictate

* sold separately

Dragon is hands-down the industry leader in voice recognition. No need to reinvent the wheel here. Once calibrated, and with correct microphone placement, Dragon approaches 100% accuracy at converting your speech to text.

Comprehensive Command Set

* included

The Command Set is the heart of the system. It represents all the phrases that the computer will respond to. A command can be as simple as “Swick” (switches to the most recently used application) or as complicated as “Chib Copterm Tabaish” (selects whichever line of text the mouse is hovering over, copies it, switches to the Terminal, opens a new tab, and pastes the copied text).

There are 3 main types of commands:

Static Commands – translate one-to-one to an action the computer should perform. These are things like clicking the mouse or pressing the “Enter” key.

Variable Commands – are similar to static commands, except that they take a variable as an argument. For example, the command “Doon sixty-five” presses the down-arrow key 65 times. The static part is “Doon” and the variable is “sixty-five”, but any number would work in this case. Another example is “Duke Hello World”. The static part is “Duke” and the variable is “Hello World”. This command double-clicks the word the mouse is hovering over and immediately replaces it with the text “hello world”.
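The split between the static part and the variable part can be sketched in a few lines of JavaScript. This is only an illustration, not VoiceCode's actual code: the command table, the action names, and the number-word parser are all assumptions made up for the example.

```javascript
// Hypothetical number words (0 to 99). A real system would more likely
// lean on the speech engine's own number handling.
const NUMBER_WORDS = new Map([
  ["zero", 0], ["one", 1], ["two", 2], ["three", 3], ["four", 4],
  ["five", 5], ["six", 6], ["seven", 7], ["eight", 8], ["nine", 9],
  ["ten", 10], ["eleven", 11], ["twelve", 12], ["thirteen", 13],
  ["fourteen", 14], ["fifteen", 15], ["sixteen", 16], ["seventeen", 17],
  ["eighteen", 18], ["nineteen", 19], ["twenty", 20], ["thirty", 30],
  ["forty", 40], ["fifty", 50], ["sixty", 60], ["seventy", 70],
  ["eighty", 80], ["ninety", 90],
]);

// Simplified: sums the words, which is correct for forms like "sixty-five"
// but does not reject odd input such as "five five".
function parseSpokenNumber(text) {
  const words = text.toLowerCase().split(/[\s-]+/).filter(Boolean);
  if (words.length === 0) return null;
  let total = 0;
  for (const word of words) {
    if (!NUMBER_WORDS.has(word)) return null;
    total += NUMBER_WORDS.get(word);
  }
  return total;
}

// Hypothetical command table: the static word selects the action and
// declares what kind of variable follows it.
const VARIABLE_COMMANDS = new Map([
  ["doon", { argument: "number", action: "press-down-arrow" }],
  ["duke", { argument: "text", action: "double-click-and-replace" }],
]);

function parseVariableCommand(phrase) {
  const [head, ...rest] = phrase.trim().split(/\s+/);
  const spec = VARIABLE_COMMANDS.get(head.toLowerCase());
  if (!spec) return null; // not a known variable command
  const spoken = rest.join(" ");
  if (spec.argument === "number") {
    const count = parseSpokenNumber(spoken);
    return count === null ? null : { action: spec.action, count };
  }
  // Text variables are lowercased here, matching the "hello world" example.
  return { action: spec.action, text: spoken.toLowerCase() };
}
```

Under these assumptions, `parseVariableCommand("Doon sixty-five")` yields an action with a `count` of 65, and `parseVariableCommand("Duke Hello World")` yields an action carrying the text "hello world".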

Grammar Commands – are similar to variable commands, except that the variable part is parsed and interpreted through an external linguistic grammar model. Every grammar command begins with a static portion called the “namespace”, which determines the grammar model that should interpret the dynamic portion. The grammars for the various commands are easily customized and extended by editing the grammar files, which are simple JSON or YAML files loaded by the software. For example, the command “doom way sibble hello” falls into the namespace “doom”, which means “down”. The grammar checks the next word to see how far down you want to go: “way” means all the way to the bottom. It then parses “sibble”, which means ‘select the entire line the cursor is on’. Finally, it sees the word “hello” and knows that it is to be inserted as text. In combination, the command replaces the last line of whatever file you are working in with the text “hello”.
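To make the “doom way sibble hello” walkthrough concrete, here is a rough JavaScript sketch of a namespace-driven interpreter. The grammar layout (`direction`, `distances`, `modifiers`) and the step names are invented for illustration; the real grammar file format is not described here, so treat every field name as an assumption.

```javascript
// Hypothetical grammar, written as JSON to mirror the idea of a grammar
// file loaded by the software. The schema is an assumption.
const GRAMMAR_JSON = `{
  "doom": {
    "direction": "down",
    "distances": { "way": "to-end" },
    "modifiers": { "sibble": "select-line" }
  }
}`;
const GRAMMARS = JSON.parse(GRAMMAR_JSON);

// Own-property lookup, so words like "constructor" never match by accident.
function lookup(table, key) {
  return Object.prototype.hasOwnProperty.call(table, key) ? table[key] : undefined;
}

function interpretGrammarCommand(phrase) {
  const words = phrase.trim().toLowerCase().split(/\s+/).filter(Boolean);
  const grammar = lookup(GRAMMARS, words[0]); // the namespace picks the grammar
  if (!grammar) return null;

  const steps = [];
  let index = 1;

  // Optional distance word, e.g. "way" = all the way to the bottom.
  const distance = lookup(grammar.distances, words[index]);
  if (distance !== undefined) {
    steps.push({ type: "move", direction: grammar.direction, distance });
    index += 1;
  }

  // Any number of modifier words, e.g. "sibble" = select the current line.
  while (index < words.length) {
    const modifier = lookup(grammar.modifiers, words[index]);
    if (modifier === undefined) break;
    steps.push({ type: modifier });
    index += 1;
  }

  // Whatever remains is treated as text to insert.
  const text = words.slice(index).join(" ");
  if (text) steps.push({ type: "insert-text", text });
  return steps;
}
```

With this made-up grammar, `interpretGrammarCommand("doom way sibble hello")` produces three steps in order: move down to the end, select the line, and insert the text "hello". Adding a new distance or modifier word only requires editing the JSON, which is the kind of extensibility the paragraph above describes.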

Here’s a quick, non-comprehensive list of some of the capabilities that these three types of commands enable:

  • Arrow-key commands
  • Text-selection commands
  • Keyboard-shortcut commands (e.g. command-option-R and friends)
  • Editing and formatting commands
  • Text case commands to format variables: “Sentence case”, “Title Case”, “ALL CAPS”, “all lower”, “snake_case”, “camelCase”, “UpperCamel”, “SCREAMING_SNAKE”, “spinal-case”, “Title-Spinal”, “lowerslam”, “UPPERSLAM”, etc.
  • Launching and switching applications
  • Opening drop-down menus
  • Switching windows or tabs
  • Initiating text-snippets or code-snippets
  • Really anything you can think of . . . and in any combination.
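
As one example from the list above, the text case commands boil down to small formatting functions applied to the dictated words. The sketch below is an assumption about how such formatters could look in JavaScript; the style keys (`camel`, `screamingSnake`, and so on) are hypothetical names, not VoiceCode's spoken commands.

```javascript
const capitalize = (word) => word.charAt(0).toUpperCase() + word.slice(1);

// Hypothetical formatter table covering some of the styles listed above.
const CASE_FORMATTERS = new Map([
  ["snake", (words) => words.join("_")],
  ["camel", (words) => words.map((w, i) => (i === 0 ? w : capitalize(w))).join("")],
  ["upperCamel", (words) => words.map(capitalize).join("")],
  ["screamingSnake", (words) => words.join("_").toUpperCase()],
  ["spinal", (words) => words.join("-")],
  ["titleSpinal", (words) => words.map(capitalize).join("-")],
  ["lowerSlam", (words) => words.join("")],
  ["upperSlam", (words) => words.join("").toUpperCase()],
]);

// Normalize the dictated phrase to lowercase words, then apply the style.
function formatDictation(style, dictated) {
  const formatter = CASE_FORMATTERS.get(style);
  if (!formatter) throw new Error(`Unknown case style: ${style}`);
  const words = dictated.toLowerCase().split(/\s+/).filter(Boolean);
  return formatter(words);
}
```

For instance, `formatDictation("camel", "user account id")` gives `userAccountId`, and `formatDictation("screamingSnake", "user account id")` gives `USER_ACCOUNT_ID`.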

VoiceCode Language Processor (written in JavaScript)

* included

At the core of variable commands and grammar commands lies the VoiceCode Language Processor (VCLP). This is the code that works behind the scenes to make it simple and easy to modify or extend existing command grammars, or to add your own.

Command Execution Framework

* included

Once a command is processed and understood, we need a way to actually get the computer to perform the specified action. This is done through AppleScript and shell scripts (PC and Linux support is under development).
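As a rough idea of what the AppleScript route can look like, the JavaScript sketch below builds an `osascript` invocation that presses an arrow key a given number of times through System Events. It only constructs the command and does not run it. The function name and return shape are assumptions for this example, not part of VoiceCode.

```javascript
// macOS virtual key codes for the arrow keys.
const ARROW_KEY_CODES = new Map([
  ["left", 123],
  ["right", 124],
  ["down", 125],
  ["up", 126],
]);

// Build (but do not execute) an osascript command line that presses an
// arrow key `times` times. Each script line is passed as its own -e option.
function buildArrowKeyCommand(direction, times = 1) {
  const code = ARROW_KEY_CODES.get(direction);
  if (code === undefined) throw new Error(`Unknown direction: ${direction}`);
  if (!Number.isInteger(times) || times < 1) {
    throw new Error("times must be a positive integer");
  }
  const lines = [
    'tell application "System Events"',
    `repeat ${times} times`,
    `key code ${code}`,
    "end repeat",
    "end tell",
  ];
  return { file: "osascript", args: lines.flatMap((line) => ["-e", line]) };
}
```

On a Mac, the returned `file` and `args` could be handed to Node's `child_process.execFile`. Sending keystrokes through System Events generally requires the calling application to be granted accessibility permission.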

Custom Phonetic Alphabet for individual letters, numbers, symbols, and punctuation

* included

When it comes time to write something, whether code or anything else, you’re destined to run into words that are not in the vocabulary of the voice-to-text processor. If it is a word you are going to use consistently, you can add it to Dragon’s vocabulary, but if it’s just something quick, it would be nice to be able to spell it easily, without having to switch to some special “spelling mode” of dictation. The Phonetic Alphabet makes this possible, much as the NATO phonetic alphabet is used to disambiguate spoken letters and numbers in the military (e.g. “Alpha”, “Bravo”, “Charlie”, “Niner”, etc.). With our phonetic alphabet, however, speed is of the essence, so each character is only a single syllable, allowing you to spell something like “src” with a simple 3-syllable phrase. Or what if you wanted to dictate “imageio”? You would simply say “image” followed immediately by a 2-syllable phrase meaning “io”, and it would come out correctly. Each single-syllable phoneme has been painstakingly chosen to avoid ambiguities, especially when used in combination. No combination of them will sound like any other English word. And the alphabet includes letters, numbers, symbols, punctuation, and symbol-combos like ‘=>’, ‘{ }’, or ‘||=’.
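The spelling behavior can be pictured as a lookup table from single-syllable words to characters. In the JavaScript sketch below, every alphabet word (“sib”, “rup”, “cag”, “ik”, “oz”, “zarp”) is an invented placeholder, since the actual VoiceCode alphabet is not listed here. The joining rule (spelled characters attach directly to whatever came before) is likewise a simplifying assumption.

```javascript
// Placeholder alphabet: these words are made up for illustration and are
// NOT the real VoiceCode phonetic alphabet.
const PHONETIC_ALPHABET = new Map([
  ["sib", "s"],
  ["rup", "r"],
  ["cag", "c"],
  ["ik", "i"],
  ["oz", "o"],
  ["zarp", "=>"], // a symbol-combo entry maps one word to several characters
]);

// Convert a recognized phrase to text. Alphabet words become characters
// glued onto the preceding output; ordinary words are separated by spaces.
function spell(phrase) {
  let output = "";
  for (const token of phrase.trim().split(/\s+/).filter(Boolean)) {
    const characters = PHONETIC_ALPHABET.get(token.toLowerCase());
    if (characters !== undefined) {
      output += characters;
    } else {
      output += (output ? " " : "") + token;
    }
  }
  return output;
}
```

With these placeholder words, `spell("sib rup cag")` returns `src` and `spell("image ik oz")` returns `imageio`, with no separate spelling mode involved.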

SmartNav (by NaturalPoint)

* sold separately

While not required in order to use VoiceCode, SmartNav is nothing short of amazing. It makes the hands-free experience twice as awesome. SmartNav is a mouse replacement that uses infrared light to track the movement of a tiny reflector placed on a hat or headband. Once set up, the mouse moves EXACTLY where you look on the screen. It feels like magic, and it is much quicker AND more accurate than a normal mouse or trackpad. I even use it to sketch in Photoshop with better results than a mouse. The real power comes when you combine this mouse-freedom with voice control. For example, if you are looking at a word on the screen, you just say “dub cop” and the system performs a mouse double-click followed by a command-C keystroke, so the word you are looking at is copied to the clipboard.
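The “dub cop” example shows how spoken commands chain: each word expands to one or more low-level actions, and the actions run in the order spoken. A minimal JavaScript sketch of that expansion follows. The action objects and the table are hypothetical shapes chosen for illustration, not VoiceCode's internal representation.

```javascript
// Hypothetical static command table: each spoken word maps to a list of
// low-level actions.
const STATIC_COMMANDS = new Map([
  ["dub", [{ type: "mouse", action: "double-click" }]],
  ["cop", [{ type: "keystroke", key: "c", modifiers: ["command"] }]],
]);

// Expand a spoken phrase into a flat, ordered list of actions.
function expandPhrase(phrase) {
  const actions = [];
  for (const word of phrase.trim().toLowerCase().split(/\s+/).filter(Boolean)) {
    const steps = STATIC_COMMANDS.get(word);
    if (!steps) throw new Error(`Unknown command: ${word}`);
    actions.push(...steps);
  }
  return actions;
}
```

Here `expandPhrase("dub cop")` returns two actions: the double-click first, then the command-C keystroke. Because the pointer already sits wherever SmartNav has placed it, no coordinates need to be spoken.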

"The mission is to eliminate computer injuries while increasing productivity and pushing the boundaries of computer-interaction and automation."