Castorini
Rank:
Average Model Cost: $0.0000
Number of Runs: 59,318
Models by this creator
monot5-base-msmarco-10k
The monot5-base-msmarco-10k model is a T5-base reranker fine-tuned on the MS MARCO passage dataset for 10,000 steps (one epoch). Compared with models trained longer on MS MARCO, it tends to generalize better in zero-shot settings, i.e., on datasets other than MS MARCO. It can be used for document reranking; the linked resources provide usage examples and guidelines.
$-/run
26.5K
Huggingface
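The monoT5 scoring scheme described above can be sketched in a few lines. This is a minimal illustration, not the official PyGaggle implementation: the prompt template and the true/false-token scoring follow the monoT5 formulation, but the `rerank` helper, its decoding details, and the `transformers` usage are assumptions for this sketch.

```python
import math


def monot5_prompt(query: str, doc: str) -> str:
    # monoT5 casts reranking as text generation over this template;
    # the model is trained to emit "true" or "false" after "Relevant:".
    return f"Query: {query} Document: {doc} Relevant:"


def prob_true(true_logit: float, false_logit: float) -> float:
    # Softmax over just the "true"/"false" token logits yields the
    # relevance probability used to sort candidate passages.
    m = max(true_logit, false_logit)
    et = math.exp(true_logit - m)
    ef = math.exp(false_logit - m)
    return et / (et + ef)


def rerank(model, tokenizer, query, docs):
    """Hypothetical helper: score each doc from the first decoder step.

    `model` and `tokenizer` are assumed to come from, e.g.:
        T5ForConditionalGeneration.from_pretrained("castorini/monot5-base-msmarco-10k")
        T5Tokenizer.from_pretrained("castorini/monot5-base-msmarco-10k")
    """
    import torch

    true_id = tokenizer.encode("true")[0]
    false_id = tokenizer.encode("false")[0]
    scored = []
    for doc in docs:
        inputs = tokenizer(monot5_prompt(query, doc),
                           return_tensors="pt", truncation=True, max_length=512)
        dec = torch.tensor([[model.config.decoder_start_token_id]])
        with torch.no_grad():
            logits = model(**inputs, decoder_input_ids=dec).logits[0, 0]
        scored.append((prob_true(logits[true_id].item(),
                                 logits[false_id].item()), doc))
    return sorted(scored, reverse=True)
```

The pure helpers make the scoring rule explicit: only the two logits for "true" and "false" matter, and passages are ordered by the normalized probability of "true".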
unicoil-noexp-msmarco-passage
$-/run
7.9K
Huggingface
ance-dpr-question-multi
This model is converted from the original ANCE repo and packaged for Pyserini. For details on how to use it, see the ANCE experiments in Pyserini.
$-/run
6.3K
Huggingface
tct_colbert-v2-hnp-msmarco
This model reproduces a variant of the TCT-ColBERT-V2 dense retrieval models described in the accompanying paper; a reproduction report is available in Pyserini.
$-/run
3.7K
Huggingface
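TCT-ColBERT distills ColBERT's late interaction into a single query embedding, so retrieval reduces to an inner-product search over precomputed passage embeddings. A minimal brute-force sketch of that final step (real deployments search a FAISS index, e.g. via Pyserini, rather than scanning every vector):

```python
def inner_product(q, d):
    # Relevance score: dot product of query and passage embeddings.
    return sum(x * y for x, y in zip(q, d))


def dense_search(query_vec, doc_vecs, k=3):
    # Brute-force top-k by inner product; stands in for a FAISS index.
    scores = [(inner_product(query_vec, d), i) for i, d in enumerate(doc_vecs)]
    scores.sort(reverse=True)
    return scores[:k]
```

The same scoring applies to all the TCT-ColBERT and ANCE checkpoints listed here; only the encoder that produces the vectors changes.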
monot5-3b-msmarco-10k
This model is a T5-3B reranker fine-tuned on the MS MARCO passage dataset for 10k steps (one epoch). For details on how to use it, see pygaggle.ai. Papers describing the model: "Document Ranking with a Pretrained Sequence-to-Sequence Model" and "No Parameter Left Behind: How Distillation and Model Size Affect Zero-Shot Retrieval"; the latter reports state-of-the-art results on the BEIR benchmark.
$-/run
3.3K
Huggingface
tct_colbert-msmarco
This model reproduces the TCT-ColBERT dense retrieval approach described in the accompanying paper. For details on how to use it, see the experiments in Pyserini.
$-/run
3.2K
Huggingface
tct_colbert-v2-msmarco
This model reproduces a variant of the TCT-ColBERT-V2 dense retrieval models described in the accompanying paper; a reproduction report is available in Pyserini.
$-/run
2.8K
Huggingface
ance-msmarco-passage
Model Card for ance-msmarco-passage

Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations. It is primarily designed to provide effective, reproducible, and easy-to-use first-stage retrieval in a multi-stage ranking architecture.

Model Details
Developed by: Castorini
Shared by: Hugging Face
Model type: Information retrieval
Language(s) (NLP): en
License: More information needed
Parent Model: RoBERTa
Resources for more information: GitHub repo; associated paper

Bias, Risks, and Limitations
Significant research has explored bias and fairness issues with language models (see, e.g., Sheng et al. (2021) and Bender et al. (2021)). Predictions generated by the model may include disturbing and harmful stereotypes across protected classes; identity characteristics; and sensitive, social, and occupational groups. Users (both direct and downstream) should be made aware of the model's risks, biases, and limitations.

Training, Evaluation, and Environmental Impact
Training data, training procedure, evaluation setup, and hardware details are not documented ("more information needed" throughout the original card). Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

Technical Specifications
For bag-of-words sparse retrieval, Anserini (written in Java) provides custom parsers and ingestion pipelines for common document formats used in IR research.

Model Card Authors
Castorini, in collaboration with Ezi Ozoani and the Hugging Face team.

How to Get Started with the Model
Use the code below to get started with the model.
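The card's quick-start snippet did not survive extraction. Below is a plausible sketch using Pyserini's dense-search API; the module path `pyserini.search.faiss`, the `AnceQueryEncoder` class, and the prebuilt index name `msmarco-passage-ance-bf` are assumptions that vary across Pyserini versions, and `ance_search_demo` is a hypothetical helper, not the card's original code.

```python
def format_hits(hits):
    # Normalize Pyserini-style hit objects (with .docid and .score
    # attributes) into plain (docid, score) tuples so downstream code
    # does not depend on the searcher's return type.
    return [(h.docid, float(h.score)) for h in hits]


def ance_search_demo(query: str, k: int = 5):
    """Hypothetical end-to-end usage; requires pyserini (and Java).

    Module paths and the prebuilt index name are assumptions and may
    differ across Pyserini releases.
    """
    from pyserini.search.faiss import FaissSearcher, AnceQueryEncoder

    encoder = AnceQueryEncoder("castorini/ance-msmarco-passage")
    searcher = FaissSearcher.from_prebuilt_index("msmarco-passage-ance-bf", encoder)
    return format_hits(searcher.search(query, k=k))
```

Keeping the index access behind a function means the heavy downloads (model weights plus a multi-gigabyte prebuilt index) happen only when the demo is actually called.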
$-/run
2.0K
Huggingface
unicoil-msmarco-passage
$-/run
2.0K
Huggingface
monot5-small-msmarco-10k
$-/run
1.7K
Huggingface