Models

Search through the list of amazing models below!

stable-diffusion (stability-ai)

The stable-diffusion model is a text-to-image diffusion model that can generate highly realistic images from textual input. It is a latent diffusion model: it iteratively denoises random noise in a compressed latent space and then decodes the result into an image.

$0.090/run · 97.0M runs · Replicate
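
For reference, a minimal sketch of running this model through Replicate's Python client; the input fields shown (prompt, width, height) and the list-of-URLs output are assumptions drawn from the public model page, so check there before relying on them.

```python
import replicate

# A sketch of calling stability-ai/stable-diffusion via the Replicate Python
# client. Requires the REPLICATE_API_TOKEN environment variable; pinning a
# specific model version ("stability-ai/stable-diffusion:<version>") is
# recommended in practice.
output = replicate.run(
    "stability-ai/stable-diffusion",
    input={
        "prompt": "an astronaut riding a horse, photorealistic",
        "width": 512,
        "height": 512,
    },
)

# The model returns a list of URLs to the generated image(s).
for url in output:
    print(url)
```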

wav2vec2-large-xlsr-53-english

The wav2vec2-large-xlsr-53-english model is an automatic speech recognition (ASR) model that converts spoken language into written text. It is built on the wav2vec 2.0 architecture and the cross-lingual speech representations (XLSR) approach, and is trained specifically for English. It can accurately transcribe speech for applications such as transcription services, voice assistants, and voice command recognition systems.

$-/run · 68.5M runs · Huggingface
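
A minimal transcription sketch using the transformers ASR pipeline; the Hub namespace in the repository id below is an assumption, so confirm the exact id on Hugging Face.

```python
from transformers import pipeline

# Automatic speech recognition with the English wav2vec2 XLSR model.
# The "jonatasgrosman/" namespace is an assumption about the Hub repo id;
# adjust it to the actual repository.
asr = pipeline(
    "automatic-speech-recognition",
    model="jonatasgrosman/wav2vec2-large-xlsr-53-english",
)

# Transcribe a local audio file (16 kHz mono audio suits wav2vec2 models best).
result = asr("meeting_recording.wav")
print(result["text"])
```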

bert-base-uncased

BERT-base-uncased is a pretrained language model trained on a large corpus of English text with a masked language modeling objective, which lets it learn an internal representation of the English language. The pretrained model can be used for tasks such as sequence classification, token classification, and question answering. It was trained on the BookCorpus dataset and English Wikipedia with a vocabulary size of 30,000, on 4 cloud TPUs with a batch size of 256, using the Adam optimizer. When fine-tuned on downstream tasks, BERT-base-uncased performs well on tasks like sentiment analysis and text classification. Note, however, that the model can make biased predictions, and this bias carries over to all fine-tuned versions of the model.

$-/run · 49.6M runs · Huggingface
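
A minimal sketch of the masked language modeling usage described above, using the transformers fill-mask pipeline.

```python
from transformers import pipeline

# bert-base-uncased was pretrained with a masked language modeling objective,
# so it can fill in [MASK] tokens out of the box.
fill_mask = pipeline("fill-mask", model="bert-base-uncased")

for prediction in fill_mask("The capital of France is [MASK]."):
    print(prediction["token_str"], round(prediction["score"], 3))
```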

blip

Bootstrapping Language-Image Pre-training (BLIP) is a technique that integrates text and image data to improve language understanding and generation tasks. In a pre-training stage, the model learns joint representations from paired image and text data; these representations are then fine-tuned for specific downstream tasks such as image captioning or text generation. BLIP achieves state-of-the-art results on a variety of tasks and can be applied to a wide range of applications that involve both text and image data.

$0.001/run · 44.9M runs · Replicate
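
A sketch of image captioning with BLIP through Replicate; the "salesforce/blip" identifier and the input field names are assumptions, not taken from this listing.

```python
import replicate

# Caption a local image with BLIP on Replicate. The "salesforce/blip" model
# identifier and the "task"/"image" input names are assumptions; verify them
# on the model page before relying on this.
with open("photo.jpg", "rb") as image_file:
    caption = replicate.run(
        "salesforce/blip",
        input={"task": "image_captioning", "image": image_file},
    )

print(caption)
```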

gfpgan

GFPGAN is a practical face restoration algorithm designed for improving the quality of old photos or AI-generated faces. It uses a generator network to enhance facial details by upscaling low-resolution images and refining facial features. The model's objective is to generate high-quality, realistic face images that restore and improve the appearance of the input images.

$0.003/run · 44.1M runs · Replicate
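
A sketch of restoring a photo with GFPGAN through Replicate; the "tencentarc/gfpgan" identifier and the input field names are assumptions.

```python
import replicate
import urllib.request

# Restore a degraded face photo with GFPGAN on Replicate. The
# "tencentarc/gfpgan" identifier and the "img"/"scale" input names are
# assumptions; check the model page.
with open("old_family_photo.jpg", "rb") as image_file:
    restored_url = replicate.run(
        "tencentarc/gfpgan",
        input={"img": image_file, "scale": 2},
    )

# The output is a URL pointing at the restored image; download it locally.
urllib.request.urlretrieve(restored_url, "restored_photo.png")
```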

clip-features

The clip-features model uses the clip-vit-large-patch14 architecture to extract features from text and images. Given an image and text as input, it returns the corresponding CLIP embeddings, which can be used for tasks such as image classification, object detection, and image generation. The model provides a compact representation of the input that captures both visual and textual information, enabling cross-modal understanding and analysis.

$0.001/run · 41.3M runs · Replicate
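
Rather than the Replicate endpoint, here is a sketch of the same idea using the underlying openai/clip-vit-large-patch14 checkpoint with transformers, comparing an image embedding against text embeddings.

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# clip-features is built on CLIP ViT-L/14; the same kind of embeddings can be
# computed locally with the openai/clip-vit-large-patch14 checkpoint.
model = CLIPModel.from_pretrained("openai/clip-vit-large-patch14")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-large-patch14")

image = Image.open("cat.jpg")
texts = ["a photo of a cat", "a photo of a dog"]

inputs = processor(text=texts, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# Normalise and compare the image embedding against each text embedding.
image_emb = outputs.image_embeds / outputs.image_embeds.norm(dim=-1, keepdim=True)
text_emb = outputs.text_embeds / outputs.text_embeds.norm(dim=-1, keepdim=True)
print(image_emb @ text_emb.T)  # cosine similarities, shape (1, 2)
```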

controlnet-scribble

The controlnet-scribble model is a text-to-image model that generates detailed images from rough scribbled drawings. It is based on ControlNet, which attaches a conditioning network to a pretrained diffusion model so that generation is guided by both a text prompt and the input scribble. This makes it useful for image synthesis tasks where detailed images need to be produced from simple sketches and short descriptions.

$0.044/run · 29.5M runs · Replicate
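
A sketch of the scribble-to-image flow through Replicate; the "jagilley/controlnet-scribble" identifier and the input field names are assumptions.

```python
import replicate

# Turn a rough scribble into a detailed image with controlnet-scribble on
# Replicate. The "jagilley/controlnet-scribble" identifier and the
# "image"/"prompt" input names are assumptions; confirm them on the model page.
with open("scribble.png", "rb") as scribble_file:
    output = replicate.run(
        "jagilley/controlnet-scribble",
        input={
            "image": scribble_file,
            "prompt": "a cozy wooden cabin in a snowy forest, detailed illustration",
        },
    )

# The output is typically a list of URLs to the generated image(s).
for url in output:
    print(url)
```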

xlm-roberta-large

XLM-RoBERTa is a multilingual version of the RoBERTa model, pre-trained on a large corpus of text from 100 languages. It is trained using a masked language modeling (MLM) objective, where 15% of the words in a sentence are randomly masked and the model has to predict them; this allows the model to learn a bidirectional representation of the sentence. The model can be used to extract features for downstream tasks such as classification or question answering, and it can also be fine-tuned for specific tasks.

$-/run · 23.7M runs · Huggingface
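
A minimal sketch of the feature-extraction use mentioned above, pooling hidden states from the model with transformers.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Extract contextual features from xlm-roberta-large for use in a downstream task.
tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-large")
model = AutoModel.from_pretrained("xlm-roberta-large")

sentences = ["XLM-RoBERTa covers 100 languages.", "Ce modèle est multilingue."]
inputs = tokenizer(sentences, return_tensors="pt", padding=True)

with torch.no_grad():
    hidden_states = model(**inputs).last_hidden_state  # (batch, tokens, 1024)

# Mean-pool over tokens (ignoring padding) to get one vector per sentence.
mask = inputs["attention_mask"].unsqueeze(-1)
sentence_embeddings = (hidden_states * mask).sum(dim=1) / mask.sum(dim=1)
print(sentence_embeddings.shape)  # torch.Size([2, 1024])
```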

MedNER-CR-JA

MedNER-CR-JA is a model for named entity recognition (NER) in Japanese medical documents. It identifies and classifies entities such as diseases, symptoms, treatments, and anatomical terms: given a Japanese medical document as input, it outputs the recognized entity mentions along with their entity labels. It can be used by running the provided predict.py script with the necessary files in the same folder. The model was evaluated in the NTCIR-16 Real-MedNLP Task and achieved competitive results.

$-/run · 20.1M runs · Huggingface

codeformer

CodeFormer is a robust face restoration algorithm that works on old photos as well as AI-generated faces. It uses advanced AI techniques to analyze and process images, making degraded faces look as good as new. The model is designed to be highly effective at restoring and enhancing the quality of old and degraded photographs and at cleaning up artifacts in AI-generated faces. It can be useful for tasks like photo restoration, face enhancement, and other image-to-image applications.

$0.005/run · 18.0M runs · Replicate
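
A sketch of face restoration with CodeFormer through Replicate; the "sczhou/codeformer" identifier and the fidelity parameter shown below are assumptions.

```python
import replicate

# Restore a low-quality face image with CodeFormer on Replicate. The
# "sczhou/codeformer" identifier and the input names ("image",
# "codeformer_fidelity") are assumptions; check the model page.
with open("degraded_portrait.jpg", "rb") as image_file:
    restored_url = replicate.run(
        "sczhou/codeformer",
        input={
            "image": image_file,
            # Lower fidelity trades faithfulness to the input for more detail;
            # higher fidelity stays closer to the original face.
            "codeformer_fidelity": 0.7,
        },
    )

print(restored_url)
```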

gpt2

GPT-2 is a transformer model pretrained on a large corpus of English data with a self-supervised objective: predicting the next word in a sentence. A causal attention mask ensures that each prediction relies only on past tokens. GPT-2 can be used for text generation as-is or fine-tuned for downstream tasks. The training data consists of unfiltered internet content, which may introduce bias into its predictions. The model was trained on a dataset called WebText, built from web pages linked from Reddit, with texts tokenized using a byte-level version of Byte Pair Encoding (BPE) and a vocabulary size of 50,257. The model achieves impressive results without fine-tuning, although the training duration and exact details were not disclosed.

$-/run · 17.8M runs · Huggingface
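
A minimal text-generation sketch with the transformers pipeline.

```python
from transformers import pipeline, set_seed

# GPT-2 predicts the next token given the previous ones, so it can be used
# directly for open-ended text generation.
generator = pipeline("text-generation", model="gpt2")
set_seed(42)  # make the sampled continuations reproducible

outputs = generator(
    "Once upon a time, there was a language model that",
    max_new_tokens=40,
    do_sample=True,
    num_return_sequences=2,
)
for out in outputs:
    print(out["generated_text"])
```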

xlm-roberta-base

XLM-RoBERTa is a multilingual version of the RoBERTa model, pre-trained on a large amount of CommonCrawl data covering 100 languages. It uses the masked language modeling (MLM) objective, randomly masking words in a sentence and predicting them, so the model learns a bidirectional representation of the sentence that can be used for downstream tasks such as classification and token labeling. It is primarily intended to be fine-tuned on specific tasks, and it can be used directly with a pipeline for masked language modeling.

$-/run · 16.3M runs · Huggingface
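
The masked language modeling pipeline mentioned above, in minimal form; note that RoBERTa-style models use <mask> rather than [MASK].

```python
from transformers import pipeline

# xlm-roberta-base fills in <mask> tokens; being multilingual, the same
# pipeline works across languages.
unmasker = pipeline("fill-mask", model="xlm-roberta-base")

for prediction in unmasker("La capitale de la France est <mask>."):
    print(prediction["token_str"], round(prediction["score"], 3))
```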
