Ydshieh

Average Model Cost: $0.0000

Number of Runs: 9,987

Models by this creator

vit-gpt2-coco-en

ydshieh

vit-gpt2-coco-en is an image-to-text model that pairs a Vision Transformer (ViT) encoder with a GPT-2 decoder. Given an image as input, it generates a textual description (caption) of it. The model is trained on the COCO (Common Objects in Context) dataset.
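As a rough illustration of the image-to-text flow described above, the following minimal sketch uses the Hugging Face `transformers` library. The helper names `caption_image` and `load_components`, the generation settings (`max_length=16`, `num_beams=4`), and the file name `example.jpg` are assumptions made for illustration and are not taken from this page; the model identifier `ydshieh/vit-gpt2-coco-en` is assumed from the creator and model names listed here.

```python
def caption_image(image, model, image_processor, tokenizer,
                  max_length=16, num_beams=4):
    """Generate captions for one image (hypothetical helper).

    The generation settings are illustrative defaults, not values
    documented on this page.
    """
    # The ViT side expects preprocessed pixel values.
    pixel_values = image_processor(images=image, return_tensors="pt").pixel_values
    # The GPT-2 side decodes token ids conditioned on the image features.
    output_ids = model.generate(pixel_values, max_length=max_length,
                                num_beams=num_beams)
    captions = tokenizer.batch_decode(output_ids, skip_special_tokens=True)
    return [caption.strip() for caption in captions]


def load_components(model_id="ydshieh/vit-gpt2-coco-en"):
    """Load model, image processor, and tokenizer (model id is assumed)."""
    # Imported lazily so the helper above can be used without transformers.
    from transformers import (AutoTokenizer, ViTImageProcessor,
                              VisionEncoderDecoderModel)
    model = VisionEncoderDecoderModel.from_pretrained(model_id)
    image_processor = ViTImageProcessor.from_pretrained(model_id)
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model.eval()
    return model, image_processor, tokenizer


if __name__ == "__main__":
    from PIL import Image

    model, image_processor, tokenizer = load_components()
    # "example.jpg" is a placeholder path; substitute any local RGB image.
    image = Image.open("example.jpg").convert("RGB")
    print(caption_image(image, model, image_processor, tokenizer))
```

Passing the model, image processor, and tokenizer in as arguments keeps the captioning step separate from the download step, so the same helper can be reused with a locally cached or fine-tuned checkpoint.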

$-/run

5.3K

Hugging Face
