The MOST Accurate Speech-to-Text in 2025 💥 Nvidia Parakeet Python Tutorial 💥

82 likes, 2,228 views

parakeet-tdt-0.6b-v2 is a 600-million-parameter automatic speech recognition (ASR) model for high-quality English transcription. It supports punctuation, capitalization, and accurate timestamp prediction. Try the demo here: https://huggingface.co/spaces/nvidia/...

This XL variant of the FastConformer [1] architecture integrates the TDT [2] decoder and is trained with full attention, which lets it transcribe audio segments of up to 24 minutes in a single pass. The model achieves an RTFx (inverse real-time factor: seconds of audio processed per second of compute) of 3380 on the HF-Open-ASR leaderboard with a batch size of 128. Note: RTFx performance may vary depending on dataset audio duration and batch size.
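
To make the RTFx figure concrete, here is a small arithmetic sketch. It assumes the usual definition (audio duration divided by processing time) and plugs in the leaderboard value of 3380 quoted above. The function names are illustrative, not part of any library. Since the quoted number is batched throughput at batch size 128, read the result as a rough lower bound on compute time, not as a latency prediction for a single file.

```python
def rtfx(audio_seconds: float, processing_seconds: float) -> float:
    """Inverse real-time factor: seconds of audio handled per second of compute."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


def estimated_processing_seconds(audio_seconds: float, rtfx_value: float) -> float:
    """Compute time implied by a given RTFx for a given amount of audio."""
    if rtfx_value <= 0:
        raise ValueError("rtfx_value must be positive")
    return audio_seconds / rtfx_value


# Value quoted in the description (HF-Open-ASR leaderboard, batch size 128).
LEADERBOARD_RTFX = 3380.0

# The longest segment the model is said to handle in a single pass: 24 minutes.
audio_seconds = 24 * 60

estimate = estimated_processing_seconds(audio_seconds, LEADERBOARD_RTFX)
print(f"{audio_seconds} s of audio at RTFx {LEADERBOARD_RTFX:.0f}: about {estimate:.2f} s of compute")
```

In practice you would measure `rtfx` on your own hardware and audio, because the note above says the figure depends on audio duration and batch size.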

Key Features

Accurate word-level timestamp predictions
Automatic punctuation and capitalization
Robust performance on spoken numbers and song lyrics
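
One common use of word-level timestamps is building subtitles. The sketch below groups timed words into SRT cues. It assumes each word arrives as a dict with `word`, `start`, and `end` keys (times in seconds). That shape is an assumption for illustration, so adapt it to whatever your transcription call actually returns. The helper names and the `max_words` parameter are hypothetical.

```python
def format_srt_time(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words: list[dict], max_words: int = 7) -> str:
    """Group timed words into SRT cues of at most `max_words` words each.

    Assumed input shape (hypothetical): {"word": str, "start": float, "end": float}.
    """
    cues = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        start = format_srt_time(chunk[0]["start"])
        end = format_srt_time(chunk[-1]["end"])
        text = " ".join(w["word"] for w in chunk)
        cues.append(f"{len(cues) + 1}\n{start} --> {end}\n{text}\n")
    return "\n".join(cues)


# Hand-written sample data, not real model output.
sample_words = [
    {"word": "Hello", "start": 0.0, "end": 0.4},
    {"word": "world.", "start": 0.5, "end": 0.9},
]
print(words_to_srt(sample_words))
```

Because the model already emits punctuation and capitalization, the cue text can be used as-is without a separate cleanup step.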

Colab used in the video:

https://colab.research.google.com/dri...

❤️ If you want to support the channel ❤️
Support here:
Patreon - / 1littlecoder
Ko-Fi - https://ko-fi.com/1littlecoder

🧭 Follow me on 🧭
Twitter - / 1littlecoder
