Speech to Text Apps

Speech to Text

  • DeepSpeech: simpler to use, although generally weaker in recognition quality than the toolkits below.
  • Kaldi: supports hybrid NN-HMM and lattice-free MMI models. It is widely used both in research and in production.
  • Lingvo: the open-source version of Google's speech recognition toolkit, with support mostly for end-to-end models.
  • ESPnet: well regarded and well known, also for end-to-end models.
  • RASR + RETURNN: very good as well, both for end-to-end and hybrid NN-HMM models, but free for non-commercial use only; commercial use requires a licence (disclaimer: I work at the university chair which develops these frameworks).
  • Wav2Letter: the toolkit by Facebook.
  • Silero: Speech to Text models (snakers4/silero-models)
  • Coqui: STT and TTS
  • Datasets
    • English: TED-LIUM, LibriSpeech, etc.

Speech to Text Indonesian Support