Nvidia

Rank:

Average Model Cost: $0.0000

Number of Runs: 574,189

Models by this creator

speakerverification_en_titanet_large

nvidia

The speakerverification_en_titanet_large model is the large version of the TitaNet model, designed to extract speaker embeddings from speech. The embeddings can be used for tasks such as speaker verification and diarization. The model is trained with the NVIDIA NeMo toolkit and has around 23 million parameters. It accepts 16 kHz (16,000 Hz) mono-channel audio files as input and returns speaker embeddings as output. It was trained on a composite dataset comprising thousands of hours of English speech, including the VoxCeleb, Fisher, Switchboard, LibriSpeech, and SRE datasets. Performance is reported as Equal Error Rate (EER%) for speaker verification and Diarization Error Rate (DER%) for diarization. The model is not currently supported by NVIDIA Riva for deployment.
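To make the verification use case concrete, here is a minimal sketch of how two speaker embeddings might be compared once they have been extracted. It assumes cosine similarity as the scoring function, which is a common choice but is not stated on this page. The function names, the 0.7 threshold, and the three-dimensional toy vectors are all hypothetical; real embeddings are much higher-dimensional, and a threshold would normally be tuned on held-out data.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)

def same_speaker(emb_a, emb_b, threshold=0.7):
    """Accept the pair as one speaker if the score reaches the threshold.

    The default threshold is an illustrative assumption, not a tuned value.
    """
    return cosine_similarity(emb_a, emb_b) >= threshold

# Toy embeddings standing in for model output (hypothetical values).
enrolled = [0.9, 0.1, 0.4]
test_same = [0.8, 0.2, 0.5]
test_other = [-0.3, 0.9, -0.1]

print(same_speaker(enrolled, test_same))
print(same_speaker(enrolled, test_other))
```

Sweeping the threshold over a labeled set of trial pairs and finding the point where false accepts equal false rejects is what yields the EER figure mentioned above.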

$-/run

342.9K

Huggingface

mit-b0

The mit-b0 model is the smallest Mix Transformer (MiT) encoder from the SegFormer family, pre-trained for image classification on ImageNet-1k. It is a hierarchical Transformer that learns multi-scale image features and predicts a single category for each input image. Because the checkpoint contains only the pre-trained encoder, it is mainly intended as a backbone to be fine-tuned for downstream tasks such as semantic segmentation. Used as-is, it suits applications such as scene recognition and image tagging.

$-/run

111.3K

Huggingface

segformer-b0-finetuned-ade-512-512

The segformer-b0-finetuned-ade-512-512 model is a semantic segmentation model: the smallest (B0) SegFormer variant, fine-tuned on the ADE20K dataset at a resolution of 512x512 pixels. The SegFormer architecture pairs a hierarchical Transformer encoder with a lightweight all-MLP decode head. The model assigns a class label to every pixel in an image, making it useful for tasks such as scene parsing and scene understanding.

$-/run

40.5K

Huggingface

mit-b5

The mit-b5 model is the largest Mix Transformer (MiT) encoder from the SegFormer family, pre-trained for image classification on ImageNet-1k. Given a new image, it predicts the most likely category the image belongs to. Because the checkpoint contains only the pre-trained encoder, it is mainly intended as a backbone to be fine-tuned for downstream tasks such as semantic segmentation. Used as-is, it suits applications such as object recognition and scene understanding.

$-/run

8.6K

Huggingface

segformer-b5-finetuned-cityscapes-1024-1024

The segformer-b5-finetuned-cityscapes-1024-1024 model is a semantic segmentation model based on the SegFormer architecture, with the MiT-B5 (Mix Transformer) encoder as its backbone. It has been fine-tuned on the Cityscapes dataset at a resolution of 1024x1024 pixels. The model takes an input image and generates a pixel-wise segmentation map in which each pixel is assigned a class such as road, building, or car. It is useful for tasks such as autonomous driving, urban planning, and mapping.
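The pixel-wise segmentation map described above can be illustrated with a small sketch. It assumes the model's raw output is a set of per-class score grids (logits) and that the map is obtained by taking, for each pixel, the class with the highest score. The function name, the three-class list, its ordering, and the 2x2 toy scores are hypothetical and do not correspond to the actual Cityscapes label IDs.

```python
def segmentation_map(logits):
    """Turn per-class score grids into a per-pixel label map.

    logits[c][y][x] is the score of class c at pixel (y, x);
    the result holds, for every pixel, the index of the best-scoring class.
    """
    num_classes = len(logits)
    height = len(logits[0])
    width = len(logits[0][0])
    return [
        [max(range(num_classes), key=lambda c: logits[c][y][x])
         for x in range(width)]
        for y in range(height)
    ]

# Hypothetical class order and toy scores for a 2x2 image.
CLASS_NAMES = ["road", "building", "car"]
toy_logits = [
    [[2.0, 0.1], [1.5, 0.2]],  # scores for "road"
    [[0.5, 3.0], [0.3, 0.1]],  # scores for "building"
    [[0.1, 0.2], [0.4, 2.5]],  # scores for "car"
]

label_map = segmentation_map(toy_logits)
named_map = [[CLASS_NAMES[i] for i in row] for row in label_map]
print(label_map)
print(named_map)
```

In practice the score grids are tensors rather than nested lists, and they may need to be resized to the input resolution before the per-pixel maximum is taken.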

$-/run

7.6K

Huggingface

segformer-b3-finetuned-ade-512-512

The segformer-b3-finetuned-ade-512-512 model is a semantic segmentation model based on the SegFormer architecture. It has been fine-tuned at a resolution of 512x512 pixels on the ADE20K dataset, a large-scale scene parsing benchmark. The model takes an input image and outputs a pixel-level prediction, assigning a label to each pixel in the image. It can be used for tasks such as object segmentation and scene parsing.

$-/run

6.1K

Huggingface

Similar creators