intfloat
Rank:
Average Model Cost: $0.0000
Number of Runs: 1,045,343
Models by this creator
e5-small-v2
The e5-small-v2 model is a text embedding model trained with weakly-supervised contrastive pre-training. It has 12 layers and produces 384-dimensional embeddings. It can be used to encode queries and passages, for example from the MS-MARCO passage ranking dataset. The model only supports English texts, and long inputs are truncated to a maximum of 512 tokens. It can be loaded through the sentence_transformers library for a range of natural language processing tasks; a usage sketch follows this entry.
$-/run
815.4K runs
Huggingface
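A minimal sketch of how the model might be loaded through sentence_transformers, assuming the "query: "/"passage: " prefix convention described for the E5 family; the example texts are purely illustrative.

```python
# Minimal sketch: encoding texts with intfloat/e5-small-v2 via sentence_transformers.
# E5 models expect a "query: " or "passage: " prefix on each input text.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("intfloat/e5-small-v2")

texts = [
    "query: how much protein should a female eat",
    "passage: As a general guideline, the average protein requirement "
    "for women ages 19 to 70 is about 46 grams per day.",
]

# Returns a (2, 384) array; with normalize_embeddings=True the vectors are L2-normalized.
embeddings = model.encode(texts, normalize_embeddings=True)

# On normalized vectors, cosine similarity reduces to a plain dot product.
similarity = embeddings[0] @ embeddings[1]
print(similarity)
```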
e5-large-v2
The e5-large-v2 model is a text embedding model trained through weakly-supervised contrastive pre-training. It has 24 layers and an embedding size of 1024. The model can be used to encode queries and passages, for example from the MS-MARCO passage ranking dataset (see the encoding sketch after this entry). It is designed for English texts and truncates inputs to a maximum of 512 tokens. Training details and benchmark evaluation can be found in the associated research paper.
$-/run
64.2K runs
Huggingface
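The same encoding can also be done without sentence_transformers. Below is a sketch using the transformers AutoModel/AutoTokenizer interface with attention-mask average pooling, the pooling scheme commonly used with E5 checkpoints; the input texts are illustrative.

```python
# Sketch: encoding queries and passages with intfloat/e5-large-v2 using transformers.
# Embeddings are the attention-mask-weighted average of the last hidden states.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("intfloat/e5-large-v2")
model = AutoModel.from_pretrained("intfloat/e5-large-v2")

texts = [
    "query: what is a text embedding",
    "passage: A text embedding maps a piece of text to a fixed-size dense vector.",
]

# Inputs longer than 512 tokens are truncated, matching the model's limit.
batch = tokenizer(texts, max_length=512, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    outputs = model(**batch)

# Average pooling over non-padding positions, then L2 normalization.
mask = batch["attention_mask"].unsqueeze(-1).float()
embeddings = (outputs.last_hidden_state * mask).sum(dim=1) / mask.sum(dim=1)
embeddings = F.normalize(embeddings, p=2, dim=1)  # shape: (2, 1024)

print(embeddings @ embeddings.T)  # pairwise cosine similarities
```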
e5-small
The E5-small model is a text embedding model that uses weakly-supervised contrastive pre-training. It has 12 layers and an embedding size of 384. It can be used to encode queries and passages from the MS-MARCO passage ranking dataset, and it has been evaluated on the BEIR and MTEB benchmarks. However, it only supports English texts and truncates long texts to a maximum of 512 tokens; a short truncation sketch follows this entry.
$-/run
48.3K runs
Huggingface
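A small sketch of what the 512-token limit means in practice: anything past the limit is dropped at tokenization time. The toy passage below is made up for illustration.

```python
# Sketch: long inputs must be truncated to 512 tokens before encoding with e5-small.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("intfloat/e5-small")

long_passage = "passage: " + "embedding " * 2000  # far longer than the model can take

encoded = tokenizer(long_passage, max_length=512, truncation=True)
print(len(encoded["input_ids"]))  # 512: everything past the limit is discarded
```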
e5-base-v2
The e5-base-v2 model is a text embedding model trained using weakly-supervised contrastive pre-training. It consists of 12 layers and has an embedding size of 768. The model can be used to encode queries and passages from datasets such as MS-MARCO for tasks such as passage ranking (a ranking sketch follows this entry). It is designed for English texts and truncates long texts to a maximum of 512 tokens. Training details and benchmark evaluation results can be found in the associated research paper.
$-/run
45.5K runs
Huggingface
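A sketch of how the model could rank a handful of passages against a query, assuming sentence_transformers loading and dot-product scoring on normalized embeddings; the passages are invented for illustration, not taken from any real dataset.

```python
# Sketch: ranking a small set of passages against a query with intfloat/e5-base-v2.
# Scores are dot products of L2-normalized embeddings (i.e. cosine similarities).
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("intfloat/e5-base-v2")

query = "query: how do e5 models handle long documents"
passages = [  # illustrative passages
    "passage: E5 models truncate inputs longer than 512 tokens.",
    "passage: MS-MARCO is a passage ranking dataset built from web search queries.",
    "passage: The capital of France is Paris.",
]

q_emb = model.encode([query], normalize_embeddings=True)
p_emb = model.encode(passages, normalize_embeddings=True)

scores = (q_emb @ p_emb.T).ravel()
for idx in np.argsort(-scores):  # highest-scoring passage first
    print(f"{scores[idx]:.3f}  {passages[idx]}")
```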
e5-base
The e5-base model is a text embedding model designed to extract useful features from text. Its sentence-level embeddings can serve as input features for tasks such as text classification, semantic search, clustering, and sentiment analysis (see the classifier sketch after this entry). The model is trained on a large amount of text data and captures semantic information, producing meaningful vector representations of input text. It is a versatile starting point for building more complex NLP pipelines.
$-/run
30.9K runs
Huggingface
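A sketch of using the embeddings as fixed features for a downstream classifier, here a scikit-learn logistic regression on a tiny, made-up sentiment set. The "query: " prefix follows the usual E5 convention for single-text inputs, which is an assumption on this listing's part.

```python
# Sketch: e5-base embeddings as fixed features for a small sentiment classifier.
# The labelled examples below are illustrative only.
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression

model = SentenceTransformer("intfloat/e5-base")

texts = [
    "query: this movie was wonderful",
    "query: absolutely terrible, a waste of time",
    "query: I loved every minute of it",
    "query: boring and far too long",
]
labels = [1, 0, 1, 0]  # 1 = positive, 0 = negative

features = model.encode(texts, normalize_embeddings=True)  # shape: (4, 768)

clf = LogisticRegression().fit(features, labels)
print(clf.predict(model.encode(["query: a delightful surprise"], normalize_embeddings=True)))
```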
multilingual-e5-base
The Multilingual-E5-base model is a text embedding model trained using weakly-supervised contrastive pre-training. It has 12 layers and an embedding size of 768. The model is initialized from xlm-roberta-base and trained on a mixture of multilingual datasets. It supports the 100 languages covered by xlm-roberta, although performance may degrade for low-resource languages. Training proceeds in two stages: contrastive pre-training with weak supervision followed by supervised fine-tuning. The model can be used to encode queries and passages from the MS-MARCO passage ranking dataset (a cross-lingual sketch follows this entry). Long texts are truncated to at most 512 tokens. The model has been evaluated on the BEIR and MTEB benchmarks.
$-/run
20.4K runs
Huggingface
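A sketch of cross-lingual matching: a query in one language scored against passages in others, assuming sentence_transformers loading; the example sentences are illustrative.

```python
# Sketch: cross-lingual matching with intfloat/multilingual-e5-base.
# Queries and passages in different languages land in the same vector space.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("intfloat/multilingual-e5-base")

texts = [
    "query: how many people live in Berlin",
    "passage: Berlin hat rund 3,7 Millionen Einwohner.",   # German, answers the query
    "passage: 東京は日本で最も人口の多い都市です。",           # Japanese, unrelated to the query
]

emb = model.encode(texts, normalize_embeddings=True)
print("query vs German passage:  ", float(emb[0] @ emb[1]))
print("query vs Japanese passage:", float(emb[0] @ emb[2]))
```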
e5-large
The E5-large model is a text embedding model trained using weakly-supervised contrastive pre-training. It consists of 24 layers and has an embedding size of 1024. It can be used to encode queries and passages from the MS-MARCO passage ranking dataset. The model has been evaluated on the BEIR and MTEB benchmarks, and its evaluation results can be reproduced using the unilm/e5 repository (an evaluation sketch follows this entry). However, it only works for English texts, and long texts are truncated to a maximum of 512 tokens.
$-/run
16.3K runs
Huggingface
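A sketch of running a single MTEB task, assuming the mteb Python package and its MTEB(tasks=...) interface. This plain SentenceTransformer wrapper does not add the "query: "/"passage: " prefixes, so scores will not exactly match the published numbers; the unilm/e5 repository remains the reference for full reproduction.

```python
# Sketch: evaluating intfloat/e5-large on one MTEB task (assumes the `mteb` package).
from mteb import MTEB
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("intfloat/e5-large")

# A single small task rather than the full benchmark suite.
evaluation = MTEB(tasks=["STSBenchmark"])
results = evaluation.run(model, output_folder="results/e5-large")
print(results)
```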
e5-large-unsupervised
This model is similar to e5-large but without supervised fine-tuning. It has 24 layers and an embedding size of 1024, and it can be used to encode queries and passages from the MS-MARCO passage ranking dataset in the same way as the other E5 models (a comparison sketch follows this entry). Training details are described in "Text Embeddings by Weakly-Supervised Contrastive Pre-training" (Liang Wang, Nan Yang, Xiaolong Huang, Binxing Jiao, Linjun Yang, Daxin Jiang, Rangan Majumder, Furu Wei, arXiv 2022, https://arxiv.org/pdf/2212.03533.pdf), and the unilm/e5 repository can be used to reproduce evaluation results on the BEIR and MTEB benchmarks. The model only works for English texts; long texts are truncated to at most 512 tokens.
$-/run
1.7K runs
Huggingface
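A sketch comparing the unsupervised checkpoint against the fine-tuned e5-large on one query/passage pair, assuming both checkpoints load through sentence_transformers the way the other E5 releases do; if not, the transformers average-pooling pattern shown earlier applies instead.

```python
# Sketch: the unsupervised checkpoint is loaded exactly like the fine-tuned one;
# only the weights differ (no supervised fine-tuning stage).
from sentence_transformers import SentenceTransformer

pair = [
    "query: symptoms of the common cold",
    "passage: Typical cold symptoms include a runny nose, sore throat, and cough.",
]

for name in ["intfloat/e5-large-unsupervised", "intfloat/e5-large"]:
    model = SentenceTransformer(name)
    emb = model.encode(pair, normalize_embeddings=True)
    print(f"{name}: cosine similarity = {float(emb[0] @ emb[1]):.3f}")
```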
multilingual-e5-large
$-/run
1.4K runs
Huggingface
e5-base-unsupervised
This model is similar to e5-base but without supervised fine-tuning. It has 12 layers and an embedding size of 768, and it can be used to encode queries and passages from the MS-MARCO passage ranking dataset in the same way as the other E5 models. Training details are described in "Text Embeddings by Weakly-Supervised Contrastive Pre-training" (Liang Wang, Nan Yang, Xiaolong Huang, Binxing Jiao, Linjun Yang, Daxin Jiang, Rangan Majumder, Furu Wei, arXiv 2022, https://arxiv.org/pdf/2212.03533.pdf), and the unilm/e5 repository can be used to reproduce evaluation results on the BEIR and MTEB benchmarks. The model only works for English texts; long texts are truncated to at most 512 tokens.
$-/run
1.3K runs
Huggingface