NVIDIA
Rank:
Average Model Cost: $0.0000
Number of Runs: 574,189
Models by this creator
speakerverification_en_titanet_large
The speakerverification_en_titanet_large model is the large variant of TitaNet, a model for extracting speaker embeddings from speech. The embeddings can be used for tasks such as speaker verification and diarization. The model is trained with the NVIDIA NeMo toolkit and has around 23 million parameters. It accepts 16 kHz (16,000 Hz) mono-channel audio files as input and returns speaker embeddings as output. It was trained on a composite dataset comprising thousands of hours of English speech, including VoxCeleb, Fisher, Switchboard, LibriSpeech, and SRE data. Performance is reported as Equal Error Rate (EER%) for speaker verification and Diarization Error Rate (DER%) for diarization. NVIDIA Riva does not currently support the model for deployment.
$-/run
342.9K
Huggingface
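As a rough illustration of how the speaker embeddings described above are typically used for verification, the sketch below compares two embeddings by cosine similarity and accepts the pair as the same speaker when the score reaches a threshold. Everything here is an assumption for illustration: the 4-dimensional vectors are toy stand-ins for real embeddings, and the function names and the 0.7 threshold are hypothetical rather than taken from NeMo or the model card.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def same_speaker(emb_a, emb_b, threshold=0.7):
    """Accept the trial when the similarity reaches the threshold.

    The default threshold is a hypothetical value; in practice it is
    tuned on held-out trials, for example at the equal error rate point.
    """
    return cosine_similarity(emb_a, emb_b) >= threshold


# Toy vectors standing in for real speaker embeddings (assumed values).
enrolled = [0.9, 0.1, 0.3, 0.2]
same_voice = [0.8, 0.2, 0.35, 0.15]
other_voice = [-0.1, 0.9, -0.4, 0.2]

print(same_speaker(enrolled, same_voice))
print(same_speaker(enrolled, other_voice))
```

Raising the threshold trades false acceptances for false rejections; the EER mentioned above is the operating point where the two error rates are equal.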
mit-b0
The mit-b0 model is the smallest (b0) Mix Transformer (MiT) encoder from the SegFormer family, pre-trained on ImageNet-1k for image classification. It is a hierarchical Transformer encoder released without a segmentation decode head, so it can classify images into the ImageNet categories directly or serve as a backbone for fine-tuning on downstream tasks such as semantic segmentation. Typical classification uses include scene recognition and image tagging.
$-/run
111.3K
Huggingface
segformer-b5-finetuned-ade-640-640
The segformer-b5-finetuned-ade-640-640 model is a semantic segmentation model. It uses the SegFormer architecture with the largest (b5) encoder and has been fine-tuned on the ADE20K dataset at a resolution of 640x640 pixels. Given an input image, it predicts a segmentation mask in which every pixel is assigned one of the ADE20K classes.
$-/run
41.4K
Huggingface
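The segmentation mask described above is usually obtained by taking, for each pixel, the class with the highest score. The sketch below shows that step on plain nested lists; the `logits_to_mask` helper, the 3-class setup, and the 2x2 score values are all hypothetical and only stand in for the much larger per-class score maps a segmentation model produces.

```python
def logits_to_mask(logits):
    """Turn per-class score maps into a mask of class indices.

    logits[c][y][x] is the score of class c at pixel (y, x).
    Returns mask[y][x], the index of the highest-scoring class.
    """
    num_classes = len(logits)
    height = len(logits[0])
    width = len(logits[0][0])
    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            # Per-pixel argmax over the class dimension.
            best = max(range(num_classes), key=lambda c: logits[c][y][x])
            row.append(best)
        mask.append(row)
    return mask


# Hypothetical scores for 3 classes on a 2x2 image.
toy_logits = [
    [[2.0, 0.1], [0.3, 0.2]],  # class 0
    [[0.5, 1.5], [0.1, 0.9]],  # class 1
    [[0.1, 0.2], [1.2, 0.4]],  # class 2
]

mask = logits_to_mask(toy_logits)
print(mask)  # [[0, 1], [2, 1]]
```

In practice the score maps are often produced at a lower resolution than the input image and are upsampled before this argmax step.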
segformer-b0-finetuned-ade-512-512
The segformer-b0-finetuned-ade-512-512 model is a semantic segmentation model that has been fine-tuned on the ADE20K dataset. It is based on the SegFormer architecture, which pairs a hierarchical Transformer encoder with a lightweight all-MLP decode head. This model uses the smallest (b0) encoder and was fine-tuned at a resolution of 512x512 pixels. It assigns a class label to every pixel in an image, which makes it useful for tasks such as scene parsing and scene understanding.
$-/run
40.5K
Huggingface
mit-b5
The mit-b5 model is the largest (b5) Mix Transformer (MiT) encoder from the SegFormer family, pre-trained on ImageNet-1k for image classification. Given a new image, it predicts the most likely ImageNet category. Because it is released without a segmentation decode head, it is mainly used as a backbone for fine-tuning on downstream tasks such as semantic segmentation and scene understanding.
$-/run
8.6K
Huggingface
segformer-b5-finetuned-cityscapes-1024-1024
The segformer-b5-finetuned-cityscapes-1024-1024 model is a semantic segmentation model based on the SegFormer architecture with the b5 Mix Transformer (MiT) encoder, fine-tuned on the Cityscapes dataset at a resolution of 1024x1024 pixels. Given an input image, it generates a pixel-wise segmentation map in which each pixel is assigned a class such as road, building, or car. The model is useful for applications such as autonomous driving, urban planning, and mapping.
$-/run
7.6K
Huggingface
mit-b4
The mit-b4 model is an image classification model: the b4-sized Mix Transformer (MiT) encoder from NVIDIA's SegFormer family, pre-trained on ImageNet-1k. It learns patterns and features from images and predicts the class label of a given image. It can serve applications such as scene understanding and visual search, or act as a backbone for fine-tuning on semantic segmentation.
$-/run
6.8K
Huggingface
segformer-b3-finetuned-ade-512-512
The segformer-b3-finetuned-ade-512-512 model is a semantic segmentation model based on the SegFormer architecture. It has been fine-tuned on the ADE20K dataset, a large-scale scene parsing benchmark, at a resolution of 512x512 pixels. The model takes an input image and outputs a pixel-level prediction, assigning a label to each pixel in the image. It can be used for tasks such as semantic segmentation and scene parsing.
$-/run
6.1K
Huggingface