51la5

Rank:

Average Model Cost: $0.0000

Number of Runs: 3,528

Models by this creator

roberta-large-NER

xlm-roberta-large-finetuned-conll03-english

Table of Contents
- Model Details
- Uses
- Bias, Risks, and Limitations
- Training
- Evaluation
- Environmental Impact
- Technical Specifications
- Citation
- Model Card Authors
- How To Get Started With the Model

Model Details

Model Description
The XLM-RoBERTa model was proposed in Unsupervised Cross-lingual Representation Learning at Scale by Alexis Conneau, Kartikay Khandelwal, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzmán, Edouard Grave, Myle Ott, Luke Zettlemoyer and Veselin Stoyanov. It is based on Facebook's RoBERTa model released in 2019. It is a large multilingual language model trained on 2.5 TB of filtered CommonCrawl data. This model is XLM-RoBERTa-large fine-tuned on the English version of the conll2003 dataset.

- Developed by: See associated paper
- Model type: Multi-lingual language model
- Language(s) (NLP) or Countries (images): XLM-RoBERTa is a multilingual model trained on 100 different languages; see the GitHub Repo for the full list. This model is fine-tuned on a dataset in English.
- License: More information needed
- Related Models: RoBERTa, XLM
- Parent Model: XLM-RoBERTa-large
- Resources for more information: GitHub Repo, Associated Paper

Uses

Direct Use
The model is a language model. It can be used for token classification, a natural language understanding task in which a label is assigned to some tokens in a text.

Downstream Use
Potential downstream use cases include Named Entity Recognition (NER) and Part-of-Speech (PoS) tagging. To learn more about token classification and other potential downstream use cases, see the Hugging Face token classification docs.

Out-of-Scope Use
The model should not be used to intentionally create hostile or alienating environments for people.

Bias, Risks, and Limitations

CONTENT WARNING: Readers should be made aware that language generated by this model may be disturbing or offensive to some and may propagate historical and current stereotypes.

Significant research has explored bias and fairness issues with language models (see, e.g., Sheng et al. (2021) and Bender et al. (2021)). In the context of tasks relevant to this model, Mishra et al. (2020) explore social biases in NER systems for English and find systematic bias in existing NER systems: they fail to identify named entities from different demographic groups (though this paper did not look at BERT). For example, using a sample sentence from Mishra et al. (2020):

Recommendations
Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model.

Training
See the following resources for training data and training procedure details:
- XLM-RoBERTa-large model card
- CoNLL-2003 data card
- Associated paper

Evaluation
See the associated paper for evaluation details.

Environmental Impact
Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).
- Hardware Type: 500 32GB Nvidia V100 GPUs (from the associated paper)
- Hours used: More information needed
- Cloud Provider: More information needed
- Compute Region: More information needed
- Carbon Emitted: More information needed

Technical Specifications
See the associated paper for further details.

Citation
BibTeX:
APA: Conneau, A., Khandelwal, K., Goyal, N., Chaudhary, V., Wenzek, G., Guzmán, F., ... & Stoyanov, V. (2019). Unsupervised cross-lingual representation learning at scale. arXiv preprint arXiv:1911.02116.

Model Card Authors
This model card was written by the team at Hugging Face.
How To Get Started With the Model
Use the code below to get started with the model. You can use this model directly within a pipeline for NER.
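The original code snippet was not preserved in this listing. As a minimal sketch, assuming the checkpoint is available on the Hugging Face Hub under the id xlm-roberta-large-finetuned-conll03-english and that the transformers library is installed, a NER pipeline might look like this:

```python
# Minimal sketch of NER with a token-classification pipeline.
# The model id below is an assumption based on the card's title;
# adjust it to the exact Hub id you intend to use.
from transformers import AutoModelForTokenClassification, AutoTokenizer, pipeline

model_id = "xlm-roberta-large-finetuned-conll03-english"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForTokenClassification.from_pretrained(model_id)

# aggregation_strategy="simple" merges subword pieces into whole-word entities.
ner = pipeline("ner", model=model, tokenizer=tokenizer, aggregation_strategy="simple")
print(ner("Hello, I'm Omar and I live in Zürich."))
```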

Runs: 3.3K
Platform: Huggingface

distilbert-base-sentiment

DistilBERT base uncased finetuned SST-2

Table of Contents
- Model Details
- How to Get Started With the Model
- Uses
- Risks, Limitations and Biases
- Training

Model Details

Model Description: This model is a fine-tuned checkpoint of DistilBERT-base-uncased, fine-tuned on SST-2. It reaches an accuracy of 91.3 on the dev set (for comparison, the BERT bert-base-uncased version reaches an accuracy of 92.7).

- Developed by: Hugging Face
- Model Type: Text Classification
- Language(s): English
- License: Apache-2.0
- Parent Model: For more details about DistilBERT, we encourage users to check out this model card.
- Resources for more information: Model Documentation

How to Get Started With the Model
Example of single-label classification:

Uses
This model can be used for topic classification. You can use the raw model for either masked language modeling or next sentence prediction, but it is mostly intended to be fine-tuned on a downstream task. See the model hub to look for fine-tuned versions on a task that interests you.

The model should not be used to intentionally create hostile or alienating environments for people. In addition, the model was not trained to be a factual or true representation of people or events, and therefore using the model to generate such content is out of scope for its abilities.

Risks, Limitations and Biases
Based on a few experiments, we observed that this model could produce biased predictions that target underrepresented populations. For instance, for sentences like "This film was filmed in COUNTRY", this binary classification model will give radically different probabilities for the positive label depending on the country (0.89 if the country is France, but 0.08 if the country is Afghanistan), even though nothing in the input indicates such a strong semantic shift. In this colab, Aurélien Géron made an interesting map plotting these probabilities for each country.

We strongly advise users to thoroughly probe these aspects on their use cases in order to evaluate the risks of this model. We recommend looking at the following bias evaluation datasets as a place to start: WinoBias, WinoGender, Stereoset.

Training
The authors use the Stanford Sentiment Treebank (sst2) corpus for the model, with the following hyperparameters:
- learning_rate = 1e-5
- batch_size = 32
- warmup = 600
- max_seq_length = 128
- num_train_epochs = 3.0
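The single-label classification example referenced above was not preserved in this listing. A minimal sketch, assuming the checkpoint is published on the Hub as distilbert-base-uncased-finetuned-sst-2-english (the id is an assumption, not stated in this listing):

```python
# Minimal sketch of single-label sentiment classification with this model.
# The Hub id below is an assumption; substitute the exact checkpoint you use.
from transformers import pipeline

classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)
print(classifier("I love this movie!"))
# Output is a list of dicts, e.g. [{"label": "POSITIVE", "score": ...}]
```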

Runs: 52
Platform: Huggingface

bert-large-NER

bert-large-NER

Model description
bert-large-NER is a fine-tuned BERT model that is ready to use for Named Entity Recognition and achieves state-of-the-art performance for the NER task. It has been trained to recognize four types of entities: location (LOC), organization (ORG), person (PER) and miscellaneous (MISC).

Specifically, this model is a bert-large-cased model that was fine-tuned on the English version of the standard CoNLL-2003 Named Entity Recognition dataset. If you'd like to use a smaller BERT model fine-tuned on the same dataset, a bert-base-NER version is also available.

Intended uses & limitations
You can use this model with the Transformers pipeline for NER. This model is limited by its training dataset of entity-annotated news articles from a specific span of time, so it may not generalize well to all use cases in different domains. Furthermore, the model occasionally tags subword tokens as entities, and post-processing of results may be necessary to handle those cases.

Training data
This model was fine-tuned on the English version of the standard CoNLL-2003 Named Entity Recognition dataset. The training dataset distinguishes between the beginning and continuation of an entity so that, if there are back-to-back entities of the same type, the model can output where the second entity begins. As in the dataset, each token will be classified as one of the following classes:

CoNLL-2003 English Dataset Statistics
This dataset was derived from the Reuters corpus, which consists of Reuters news stories. You can read more about how this dataset was created in the CoNLL-2003 paper.

Training procedure
This model was trained on a single NVIDIA V100 GPU with the recommended hyperparameters from the original BERT paper, which trained and evaluated the model on the CoNLL-2003 NER task.

Eval results
The test metrics are a little lower than the official Google BERT results, which encoded document context and experimented with CRF. More on replicating the original results here.

BibTeX entry and citation info
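The card's pipeline example was not preserved in this listing. A minimal sketch of the pipeline usage it describes, assuming the checkpoint is published on the Hub as dslim/bert-large-NER (the id is an assumption here):

```python
# Minimal sketch of NER with this model via the Transformers pipeline.
# "dslim/bert-large-NER" is an assumed Hub id; adjust it if yours differs.
from transformers import AutoModelForTokenClassification, AutoTokenizer, pipeline

model_id = "dslim/bert-large-NER"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForTokenClassification.from_pretrained(model_id)

# aggregation_strategy merges subword pieces back into whole-word entities,
# which also helps with the subword-tagging limitation mentioned above.
ner = pipeline("ner", model=model, tokenizer=tokenizer, aggregation_strategy="simple")
print(ner("My name is Wolfgang and I live in Berlin."))
```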

Runs: 27
Platform: Huggingface

distilbert-base-NER

DistilBERT base uncased, fine-tuned for NER using the conll03 English dataset. Note that this model is not sensitive to capital letters: "english" is treated the same as "English". For the case-sensitive version, please use elastic/distilbert-base-cased-finetuned-conll03-english.

Versions
- Transformers version: 4.3.1
- Datasets version: 1.3.0

Training
After training, we update the labels to match the NER-specific labels from the conll2003 dataset.
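The label-update step described above could look roughly like the sketch below. This is not the maintainer's actual script; the checkpoint paths and the label ordering are assumptions (the order shown matches the conll2003 NER tag set as distributed on the Hugging Face Hub):

```python
# Rough sketch: remap a fine-tuned checkpoint's generic labels to the
# conll2003 NER tag names. Paths and label order are assumptions.
from transformers import AutoModelForTokenClassification

conll2003_labels = [
    "O", "B-PER", "I-PER", "B-ORG", "I-ORG", "B-LOC", "I-LOC", "B-MISC", "I-MISC",
]

model = AutoModelForTokenClassification.from_pretrained("path/to/finetuned-checkpoint")
model.config.id2label = {i: label for i, label in enumerate(conll2003_labels)}
model.config.label2id = {label: i for i, label in enumerate(conll2003_labels)}
model.save_pretrained("path/to/finetuned-checkpoint-with-ner-labels")
```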

Runs: 16
Platform: Huggingface
