STT Configuration
STT stands for Speech to Text. It is software that transforms spoken words and sentences into text, which is how Naomi understands you.
Engine name | Privacy Respect | Type | Self Hosting | Requests (free account) | Quality | Platform |
---|---|---|---|---|---|---|
Wit.ai | 👎 | Online | NO | Unlimited | Really good | Any |
Google Cloud STT | 👎 | Online | NO | ? | Really good | Any |
AT&T Speech API | 👎 | Online | NO | ? | ? | Any |
Pocketsphinx | 👍 | Offline | YES | Unlimited | ? | Linux 🐧 |
Mozilla DeepSpeech | 👍 | Offline | YES | ? | ? | Linux 🐧 |
Julius | 👍 | Offline | YES | Unlimited | ? | Linux 🐧 |
Kaldi | 👍 | Online | YES | Unlimited | ? | Linux 🐧 |
Pick one of the engines above, then follow the instructions below for the engine you selected.
Note: Online solutions are generally more accurate and easier to use. For privacy, or to use Naomi without internet access, we recommend an offline solution.
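The trade-off described in the table and note above can be sketched as a small filter. This is only an illustration in Python: the `SttEngine` class and the `candidates` function are hypothetical names, not part of Naomi, and the values are copied from the "Privacy Respect", "Type" and "Self Hosting" columns of the table.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SttEngine:
    """One row of the comparison table (hypothetical helper, not a Naomi class)."""
    name: str
    respects_privacy: bool  # "Privacy Respect" column
    offline: bool           # "Type" column: True for Offline, False for Online
    self_hosted: bool       # "Self Hosting" column


# Values transcribed from the table above.
ENGINES = [
    SttEngine("Wit.ai", False, False, False),
    SttEngine("Google Cloud STT", False, False, False),
    SttEngine("AT&T Speech API", False, False, False),
    SttEngine("Pocketsphinx", True, True, True),
    SttEngine("Mozilla DeepSpeech", True, True, True),
    SttEngine("Julius", True, True, True),
    SttEngine("Kaldi", True, False, True),
]


def candidates(need_offline: bool, need_privacy: bool) -> List[str]:
    """Return the names of the engines that satisfy the given requirements."""
    return [
        engine.name
        for engine in ENGINES
        if (engine.offline or not need_offline)
        and (engine.respects_privacy or not need_privacy)
    ]


if __name__ == "__main__":
    # Engines usable without internet access that also respect privacy.
    print(candidates(need_offline=True, need_privacy=True))
```

Calling `candidates(need_offline=False, need_privacy=False)` returns every engine in the table, so the filter only narrows the list when a requirement is actually set.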
Wit.ai
You will need a token that you receive for free by registering an account on the Wit.ai website
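To show roughly how such a token is used, here is a minimal Python sketch that builds (but does not send) an HTTP request carrying recorded audio to Wit.ai. It assumes the public Wit.ai speech endpoint `https://api.wit.ai/speech` with a bearer token and WAV audio; the helper name `build_wit_request` is hypothetical, and Naomi's own Wit.ai plugin may work differently.

```python
import urllib.request

# Assumption: the public Wit.ai speech endpoint; check the Wit.ai docs
# for the current URL and any required version parameter.
WIT_SPEECH_URL = "https://api.wit.ai/speech"


def build_wit_request(token: str, wav_bytes: bytes) -> urllib.request.Request:
    """Build a POST request that carries WAV audio and the account token.

    Hypothetical helper for illustration only; nothing is sent here.
    """
    return urllib.request.Request(
        WIT_SPEECH_URL,
        data=wav_bytes,
        headers={
            "Authorization": "Bearer " + token,  # token from your Wit.ai account
            "Content-Type": "audio/wav",
        },
        method="POST",
    )


if __name__ == "__main__":
    request = build_wit_request("YOUR_TOKEN_HERE", b"")
    print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would perform the actual call, which requires a valid token and internet access.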
Google Cloud STT
You will need a token that you receive for free by registering an account on the Google Cloud website
Note: Do not forget to enable billing! Even if you use the free account, you still have to enable it in order for the engine to work!
AT&T STT
You will need a token that you receive for free by registering an account on the AT&T developer program website
Pocketsphinx
Install with the following instructions
DeepSpeech
Install with the following instructions
Julius
Note: You will need to train your own acoustic model, which is a very complex task that we do not provide support for!
Install with the following instructions
Kaldi Server
Install with the following instructions