speech to text (ASR, STT)
online
API of the Zishuo (字说) app
API of the Sogou Input Method (搜狗输入法) APK
Microsoft STT
https://github.com/cuberwr/bilibiliSTT
several free STT services
https://github.com/1c7/Translate-Subtitle-File
offline
https://github.com/metavoiceio/metavoice-src
pyannote segments audio by speaker (speaker diarization) and detects voice activity
speechbrain a comprehensive speech AI library covering most speech-related tasks
vosk
paddlespeech
paper of Google USM (Universal Speech Model), a step toward Google's goal of supporting 1000 languages
whisper.cpp performs fast speech-to-text on the CPU rather than the GPU
whisperx improves timestamp accuracy with forced alignment
whisper by openai, multilingual with translation available, can recognize speech over background music and noise, and in audio with silence
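
A minimal sketch of calling whisper from Python, assuming the openai-whisper package and ffmpeg are installed. The helper names (format_timestamp, segments_to_lines, transcribe) and the file name audio.mp3 are hypothetical and only for illustration; whisper.load_model and model.transcribe are the library calls used.

```python
from typing import Any, Dict, List


def format_timestamp(seconds: float) -> str:
    """Render a time offset in seconds as HH:MM:SS.mmm."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{millis:03d}"


def segments_to_lines(segments: List[Dict[str, Any]]) -> List[str]:
    """Turn whisper-style segments (start, end, text) into printable lines."""
    return [
        f"[{format_timestamp(s['start'])} -> {format_timestamp(s['end'])}] "
        f"{s['text'].strip()}"
        for s in segments
    ]


def transcribe(path: str, model_name: str = "base",
               translate: bool = False) -> Dict[str, Any]:
    """Run whisper on an audio file (hypothetical wrapper, not part of whisper)."""
    # Imported lazily so the helpers above work without whisper installed.
    import whisper

    model = whisper.load_model(model_name)
    # task="translate" asks whisper to output English instead of the source language.
    task = "translate" if translate else "transcribe"
    return model.transcribe(path, task=task)


# Example usage (assumes a local file named audio.mp3 exists):
#   result = transcribe("audio.mp3")
#   print(result["language"])
#   print("\n".join(segments_to_lines(result["segments"])))
```

The segment timestamps printed this way come straight from whisper; whisperx is the option listed above when tighter alignment is needed.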