Project structure of: james4ever0/whisper
whisper Open-source speech recognition library and tools
CHANGELOG.md Latest openai/whisper project updates, fixes, and features.
data
model-card.md Whisper AI: Speech Recognition, Translation, Challenges
notebooks Notebooks for ASR, WER calculation, and multilingual tasks.
pyproject.toml Python code formatting settings
README.md Whisper: Multitask Transformer for diverse language speech recognition.
requirements.txt Python packages and Triton version requirements.
setup.py Set up Python package with requirements.
tests Test audio, normalization, timing, and transcription functionality.
whisper Whisper: Library for audio-to-text, efficient language detection and transcription.
__init__.py Model download, checksum verification, and listing of available models.
__main__.py Entry point that runs the Whisper transcription CLI.
audio.py Audio loading and preprocessing (pad/trim, log-Mel spectrogram) for NumPy arrays or Torch tensors.
decoding.py Audio language detection and token decoding for the Whisper model.
model.py Audio-to-text model: encoder-decoder, convolutional layers, attention
normalizers Text normalizers for multiple languages.
timing.py Median filtering, dynamic time warping, word-level text-to-audio timestamp alignment.
tokenizer.py Efficient text tokenization with Tiktoken library support.
transcribe.py Speech transcription with Whisper model, customizable.
triton_ops.py Triton kernels for parallel dynamic time warping and median filtering on CUDA.
utils.py Utilities: compression ratio, timestamp formatting, and result writers (WriteTSV, WriteJSON).
version.py Sets version of "whisper" to 20231117
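
The utils.py entry above mentions a compression ratio and timestamp formatting. As a rough illustration of what such helpers might look like, here is a minimal sketch using only the standard library. The function names, signatures, and output format are assumptions chosen for this example and are not copied from the repository.

```python
import zlib


def compression_ratio(text: str) -> float:
    """Raw UTF-8 size divided by zlib-compressed size (assumed definition).

    Highly repetitive text compresses well and therefore scores high.
    """
    raw = text.encode("utf-8")
    return len(raw) / len(zlib.compress(raw))


def format_timestamp(seconds: float, always_include_hours: bool = False) -> str:
    """Format non-negative seconds as [HH:]MM:SS.mmm (assumed layout)."""
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    # Hours are shown only when non-zero, unless explicitly requested.
    prefix = f"{hours:02d}:" if always_include_hours or hours > 0 else ""
    return f"{prefix}{minutes:02d}:{secs:02d}.{ms:03d}"


if __name__ == "__main__":
    print(format_timestamp(3661.5))  # 01:01:01.500
    print(format_timestamp(5.25))    # 00:05.250
    print(compression_ratio("ab" * 500) > 1.0)  # True
```

A ratio like this is one plausible way to flag degenerate, repetitive transcriptions, since repeated phrases compress far better than ordinary speech.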