Wav2vec2 Large Xlsr 53 Th

airesearch

The wav2vec2-large-xlsr-53-th model is a fine-tuned version of the pretrained wav2vec2-large-xlsr-53 model for Automatic Speech Recognition (ASR) in Thai. It is trained on the Thai Common Voice Corpus 7.0 dataset. Because written Thai does not separate words with spaces, text is segmented with tokenizers such as syllable_tokenize and word_tokenize (PyThaiNLP) and deepcut. The model is benchmarked using Word Error Rate (WER) and Character Error Rate (CER), each reported with and without spell correction. The fine-tuning and evaluation code is provided in the repository. Note that the APIs referenced in the benchmark are not fine-tuned on Common Voice 7.0 data.
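
To make the two benchmark metrics concrete, here is a minimal sketch of how WER and CER are commonly computed from an edit distance. It is not the repository's evaluation code. The function names (`edit_distance`, `wer`, `cer`) are hypothetical, and the example assumes the Thai text has already been split into tokens by a tokenizer such as those named above.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: minimum substitutions, insertions, and deletions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete a reference item
                            curr[j - 1] + 1,      # insert a hypothesis item
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def wer(ref_tokens: Sequence[str], hyp_tokens: Sequence[str]) -> float:
    """Word Error Rate over pre-tokenized sequences (tokenization is assumed done)."""
    if not ref_tokens:
        raise ValueError("reference must contain at least one token")
    return edit_distance(ref_tokens, hyp_tokens) / len(ref_tokens)


def cer(ref_text: str, hyp_text: str) -> float:
    """Character Error Rate, ignoring spaces (an assumption of this sketch)."""
    ref_chars = ref_text.replace(" ", "")
    hyp_chars = hyp_text.replace(" ", "")
    if not ref_chars:
        raise ValueError("reference must contain at least one character")
    return edit_distance(ref_chars, hyp_chars) / len(ref_chars)


if __name__ == "__main__":
    # Hypothetical example: the hypothesis drops one of four reference tokens.
    reference = ["ฉัน", "ชอบ", "กิน", "ข้าว"]
    hypothesis = ["ฉัน", "ชอบ", "ข้าว"]
    print(wer(reference, hypothesis))  # one deletion out of four tokens: 0.25
```

Since WER depends on where token boundaries fall, the same transcript can score differently under different tokenizers, which is presumably why results are reported per tokenizer alongside CER.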

Use cases

The wav2vec2-large-xlsr-53-th model can support several Thai ASR use cases. It can transcribe audio files, converting spoken Thai into written text for documentation, which may be useful in media and entertainment, education, and market research. It can be integrated into voice-controlled applications so that users can operate devices and software with spoken Thai commands. It can also support speech data analysis and research on spoken Thai. Possible products include Thai transcription applications, virtual assistants with Thai support, speech analytics tools, and Thai language learning platforms with speech recognition capabilities.
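
As an illustration of the transcription use case, the following is a rough sketch using the Hugging Face `transformers` speech-recognition pipeline. Several things here are assumptions rather than facts from this page: the Hub id `airesearch/wav2vec2-large-xlsr-53-th`, that the checkpoint loads through the generic pipeline, and the helper names `load_asr` and `transcribe`, which are hypothetical. Reading audio from a file path also typically requires `ffmpeg` to be installed.

```python
from typing import Callable, Optional

# Assumed Hugging Face Hub id, built from the creator and model name on this page.
MODEL_ID = "airesearch/wav2vec2-large-xlsr-53-th"


def load_asr() -> Callable:
    """Build an ASR pipeline; downloads the model weights on first use."""
    # Imported lazily so the rest of the module works without transformers installed.
    from transformers import pipeline
    return pipeline("automatic-speech-recognition", model=MODEL_ID)


def transcribe(audio_path: str, asr: Optional[Callable] = None) -> str:
    """Return the transcript for one audio file.

    `asr` is any callable that maps a file path to a dict with a "text" key,
    which is the shape the transformers ASR pipeline returns.
    """
    if asr is None:
        asr = load_asr()
    result = asr(audio_path)
    return result["text"].strip()
```

Passing the pipeline in as an argument lets an application load the model once and reuse it across files, for example `asr = load_asr()` followed by `transcribe("clip.wav", asr=asr)`, where `clip.wav` is a placeholder file name.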

automatic-speech-recognition

Pricing

Cost per run: $- USD
Avg run time: - seconds
Prediction hardware: -

Creator Models

| Model | Cost | Runs |
| --- | --- | --- |
| Wangchanberta Base Att Spm Uncased | $? | 20,824 |
| Bert Base Multilingual Cased Finetune Qa | $? | 39 |
| Wangchanberta Base Wiki Newmm | $? | 105 |
| Wangchanberta Base Wiki Spm | $? | 97 |
| Bert Base Multilingual Cased Finetuned | $? | 17 |

Overview

Summary of this model and related resources.

| Property | Value |
| --- | --- |
| Creator | airesearch |
| Model Name | Wav2vec2 Large Xlsr 53 Th |
| Description | Finetuning wav2vec2-large-xlsr-53 on Thai Common Voice 7.0 Read more on ou... |
| Tags | automatic-speech-recognition |
| Model Link | View on HuggingFace |
| API Spec | View on HuggingFace |
| Github Link | No Github link provided |
| Paper Link | No paper link provided |

Popularity

How popular is this model, by number of runs? How popular is the creator, by the sum of all their runs?

| Property | Value |
| --- | --- |
| Runs | 66,429 |
| Model Rank | |
| Creator Rank | |

Cost

How much does it cost to run this model? How long, on average, does it take to complete a run?

| Property | Value |
| --- | --- |
| Cost per Run | $- |
| Prediction Hardware | - |
| Average Completion Time | - |