STT Configuration

STT stands for Speech to Text. It is software that transforms spoken words and sentences into text, which is how Naomi understands you.

| Engine name | Privacy respect | Type | Self-hosting | Requests (free account) | Quality | Platform |
| --- | --- | --- | --- | --- | --- | --- |
| Wit.ai | 👎 | Online | NO | Unlimited | Really good | Any |
| Google Cloud STT | 👎 | Online | NO | ? | Really good | Any |
| AT&T Speech API | 👎 | Online | NO | ? | ? | Any |
| Pocketsphinx | 👍 | Offline | YES | Unlimited | ? | Linux 🐧 |
| Mozilla DeepSpeech | 👍 | Offline | YES | ? | ? | Linux 🐧 |
| Julius | 👍 | Offline | YES | Unlimited | ? | Linux 🐧 |
| Kaldi | 👍 | Online | YES | Unlimited | ? | Linux 🐧 |

Pick one of the engines above, then follow the instructions below for the engine you selected.

Note: Online solutions are generally more accurate and easier to use. For privacy reasons, or to use Naomi without internet access, we recommend an offline solution.
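
To make the trade-off concrete, the comparison table can be read as data and filtered by what matters to you. The following is only a minimal sketch and is not part of Naomi: the `Engine` class and the `pick_engines` function are hypothetical names introduced for illustration, and the rows are transcribed from the table above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Engine:
    """One row of the comparison table (hypothetical helper, not a Naomi class)."""
    name: str
    respects_privacy: bool
    kind: str          # "Online" or "Offline", exactly as listed in the table
    self_hosted: bool


# Rows transcribed from the comparison table.
ENGINES = [
    Engine("Wit.ai", False, "Online", False),
    Engine("Google Cloud STT", False, "Online", False),
    Engine("AT&T Speech API", False, "Online", False),
    Engine("Pocketsphinx", True, "Offline", True),
    Engine("Mozilla DeepSpeech", True, "Offline", True),
    Engine("Julius", True, "Offline", True),
    Engine("Kaldi", True, "Online", True),
]


def pick_engines(need_offline: bool, need_privacy: bool) -> List[str]:
    """Return the names of engines that satisfy the given constraints."""
    return [
        e.name
        for e in ENGINES
        if (not need_offline or e.kind == "Offline")
        and (not need_privacy or e.respects_privacy)
    ]


# Engines usable without internet access that also respect privacy.
print(pick_engines(need_offline=True, need_privacy=True))
```

Relaxing `need_offline` to `False` also returns Kaldi, since the table lists it as a self-hosted engine of type Online.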

Wit.ai

You will need a token that you receive for free by registering an account on the Wit.ai website

Follow steps here
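
To show what the token is used for, here is a minimal sketch of how a client might prepare a speech request for Wit.ai. It is not Naomi's own code. It assumes the Wit.ai HTTP API accepts WAV audio posted to `https://api.wit.ai/speech` with the token sent as a bearer token; `build_wit_request` and the placeholder token are hypothetical. The request is only built, not sent.

```python
import urllib.request

# Assumed Wit.ai speech endpoint; check the Wit.ai documentation before relying on it.
WIT_SPEECH_URL = "https://api.wit.ai/speech"


def build_wit_request(token: str, wav_bytes: bytes) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying WAV audio."""
    return urllib.request.Request(
        WIT_SPEECH_URL,
        data=wav_bytes,
        method="POST",
        headers={
            # The token from your Wit.ai account goes here.
            "Authorization": "Bearer " + token,
            "Content-Type": "audio/wav",
        },
    )


# Placeholder token and placeholder audio bytes, for illustration only.
req = build_wit_request("YOUR_WIT_AI_TOKEN", b"RIFF")
print(req.get_method(), req.full_url)
```

Sending the request would be a matter of passing `req` to `urllib.request.urlopen`. Where Naomi itself stores the token is covered by the linked steps, not by this sketch.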

Google Cloud STT

You will need a token that you receive for free by registering an account on the Google Cloud website

Follow steps here

Note: Do not forget to enable billing! Even if you use the free account, you still have to enable it in order for the engine to work!
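
For orientation, the sketch below shows the kind of request body a client might send to Google Cloud STT. It is not Naomi's own code. It assumes the v1 REST method `speech:recognize`, which takes a JSON body with a `config` object and base64-encoded audio; `build_recognize_body` and its default values are hypothetical, and credential handling is deliberately left out.

```python
import base64
import json


def build_recognize_body(audio_bytes: bytes,
                         language_code: str = "en-US",
                         sample_rate_hz: int = 16000) -> str:
    """Return a JSON request body for a short, uncompressed 16-bit PCM clip."""
    body = {
        "config": {
            "encoding": "LINEAR16",             # assumed audio encoding
            "sampleRateHertz": sample_rate_hz,  # must match the recording
            "languageCode": language_code,
        },
        "audio": {
            # Binary audio is carried as base64 text inside the JSON body.
            "content": base64.b64encode(audio_bytes).decode("ascii"),
        },
    }
    return json.dumps(body)


# Placeholder audio bytes, for illustration only.
print(build_recognize_body(b"\x00\x01\x02\x03"))
```

The sample rate and language code must match your recording and your spoken language; the defaults here are only examples.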

AT&T STT

You will need a token that you receive for free by registering an account on the AT&T developer program website

Follow steps here

Pocketsphinx

Install with the following instructions

Follow steps here

DeepSpeech

Install with the following instructions

Follow steps here

Julius

Note: You will need to train your own acoustic model, which is a very complex task that we do not provide support for!

Install with the following instructions

Follow steps here

Kaldi Server

Install with the following instructions

Follow steps here