Project structure of: james4ever0/whisper
whisper
Open-source speech recognition library and tools
CHANGELOG.md
Latest openai/whisper project updates, fixes, and features.
data
model-card.md
Whisper model card: speech recognition, translation, and known challenges.
notebooks
Notebooks for ASR, WER calculation, and multilingual tasks.
pyproject.toml
Python code formatting settings
README.md
Whisper: Multitask Transformer for diverse language speech recognition.
requirements.txt
Python packages and Triton version requirements.
setup.py
Set up Python package with requirements.
tests
Test audio, normalization, timing, and transcription functionality.
whisper
Core package: speech-to-text transcription and language detection.
__init__.py
Package entry point: model download, verification, and listing of available models.
__main__.py
Transcribes audio using Whisper CLI.
audio.py
Audio preprocessing for NumPy arrays or Torch tensors.
decoding.py
Decoding for the Whisper model, including audio language detection.
model.py
Audio-to-text model: encoder-decoder, convolutional layers, attention
normalizers
Text normalizers for multiple languages.
timing.py
Median filtering, dynamic time warping, and text-to-audio alignment.
tokenizer.py
Text tokenization built on the tiktoken library.
transcribe.py
Speech transcription with Whisper model, customizable.
triton_ops.py
Triton kernels for parallel dynamic time warping and a CUDA median filter.
utils.py
Utilities: compression ratio, timestamp formatting, and result writers (WriteTSV, WriteJSON).
version.py
Sets version of "whisper" to 20231117
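
The utils.py entry above mentions a compression ratio and timestamp formatting. As a rough illustration of what such helpers might look like, here is a minimal standalone sketch. The function names, signatures, and output format are assumptions chosen for this example; the listing does not show the actual code in utils.py, so treat this as one plausible approach and not the library's implementation.

```python
import zlib


def compression_ratio(text: str) -> float:
    """Ratio of raw UTF-8 byte length to zlib-compressed length (hypothetical helper)."""
    raw = text.encode("utf-8")
    return len(raw) / len(zlib.compress(raw))


def format_timestamp(seconds: float) -> str:
    """Format non-negative seconds as HH:MM:SS.mmm (assumed output format)."""
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


if __name__ == "__main__":
    # Highly repetitive text compresses well, so its ratio is large.
    print(compression_ratio("la la la " * 50))
    print(format_timestamp(3661.5))
```

Repetitive text yields a high ratio, so a value well above that of ordinary prose could serve as a signal that a transcript has degenerated into repeated output. Whether and how the project uses the ratio this way is not stated in the listing.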