[](#xlm-roberta-large-xnli)xlm-roberta-large-xnli
=================================================

[](#model-description)Model Description
---------------------------------------

This model takes [xlm-roberta-large](https://huggingface.co/xlm-roberta-large) and fine-tunes it on a combination of NLI data in 15 languages. It is intended to be used for zero-shot text classification, such as with the Hugging Face [ZeroShotClassificationPipeline](https://huggingface.co/transformers/master/main_classes/pipelines.html#transformers.ZeroShotClassificationPipeline).

[](#intended-usage)Intended Usage
---------------------------------

This model is intended to be used for zero-shot text classification, especially in languages other than English. It is fine-tuned on XNLI, which is a multilingual NLI dataset. The model can therefore be used with any of the languages in the XNLI corpus:

*   English
*   French
*   Spanish
*   German
*   Greek
*   Bulgarian
*   Russian
*   Turkish
*   Arabic
*   Vietnamese
*   Thai
*   Chinese
*   Hindi
*   Swahili
*   Urdu

Since the base model was pre-trained trained on 100 different languages, the model has shown some effectiveness in languages beyond those listed above as well. See the full list of pre-trained languages in appendix A of the [XLM Roberata paper](https://arxiv.org/abs/1911.02116)

For English-only classification, it is recommended to use [bart-large-mnli](https://huggingface.co/facebook/bart-large-mnli) or [a distilled bart MNLI model](https://huggingface.co/models?filter=pipeline_tag%3Azero-shot-classification&search=valhalla).

#### [](#with-the-zero-shot-classification-pipeline)With the zero-shot classification pipeline

The model can be loaded with the `zero-shot-classification` pipeline like so:

    from transformers import pipeline
    classifier = pipeline("zero-shot-classification",
                          model="joeddav/xlm-roberta-large-xnli")
    

You can then classify in any of the above languages. You can even pass the labels in one language and the sequence to classify in another:

    # we will classify the Russian translation of, "Who are you voting for in 2020?"
    sequence_to_classify = "     2020 ?"
    # we can specify candidate labels in Russian or any other language above:
    candidate_labels = ["Europe", "public health", "politics"]
    classifier(sequence_to_classify, candidate_labels)
    # {'labels': ['politics', 'Europe', 'public health'],
    #  'scores': [0.9048484563827515, 0.05722189322113991, 0.03792969882488251],
    #  'sequence': '     2020 ?'}
    

The default hypothesis template is the English, `This text is {}`. If you are working strictly within one language, it may be worthwhile to translate this to the language you are working with:

    sequence_to_classify = "A quin vas a votar en 2020?"
    candidate_labels = ["Europa", "salud pblica", "poltica"]
    hypothesis_template = "Este ejemplo es {}."
    classifier(sequence_to_classify, candidate_labels, hypothesis_template=hypothesis_template)
    # {'labels': ['poltica', 'Europa', 'salud pblica'],
    #  'scores': [0.9109585881233215, 0.05954807624220848, 0.029493311420083046],
    #  'sequence': 'A quin vas a votar en 2020?'}
    

#### [](#with-manual-pytorch)With manual PyTorch

    # pose sequence as a NLI premise and label as a hypothesis
    from transformers import AutoModelForSequenceClassification, AutoTokenizer
    nli_model = AutoModelForSequenceClassification.from_pretrained('joeddav/xlm-roberta-large-xnli')
    tokenizer = AutoTokenizer.from_pretrained('joeddav/xlm-roberta-large-xnli')
    
    premise = sequence
    hypothesis = f'This example is {label}.'
    
    # run through model pre-trained on MNLI
    x = tokenizer.encode(premise, hypothesis, return_tensors='pt',
                         truncation_strategy='only_first')
    logits = nli_model(x.to(device))[0]
    
    # we throw away "neutral" (dim 1) and take the probability of
    # "entailment" (2) as the probability of the label being true 
    entail_contradiction_logits = logits[:,[0,2]]
    probs = entail_contradiction_logits.softmax(dim=1)
    prob_label_is_true = probs[:,1]
    

[](#training)Training
---------------------

This model was pre-trained on set of 100 languages, as described in [the original paper](https://arxiv.org/abs/1911.02116). It was then fine-tuned on the task of NLI on the concatenated MNLI train set and the XNLI validation and test sets. Finally, it was trained for one additional epoch on only XNLI data where the translations for the premise and hypothesis are shuffled such that the premise and hypothesis for each example come from the same original English example but the premise and hypothesis are of different languages.

## Model Overview

The `xlm-roberta-large-xnli` model is based on the XLM-RoBERTa large model and is fine-tuned on a combination of Natural Language Inference (NLI) data in 15 languages. This makes it well-suited for zero-shot text classification tasks, especially in languages other than English. Compared to similar models like [bart-large-mnli](https://aimodels.fyi/models/huggingFace/bart-large-mnli-facebook) and [bert-base-uncased](https://aimodels.fyi/models/huggingFace/bert-base-uncased-google-bert), the `xlm-roberta-large-xnli` model leverages multilingual pretraining to extend its capabilities across a broader range of languages.

## Model Inputs and Outputs

### Inputs
- **Text sequences**: The model can take in text sequences in any of the 15 languages it was fine-tuned on, including English, French, Spanish, German, and more.
- **Candidate labels**: When using the model for zero-shot classification, you provide a set of candidate labels that the input text should be classified into.

### Outputs
- **Label probabilities**: The model outputs a probability distribution over the provided candidate labels, indicating the likelihood of the input text belonging to each class.

## Capabilities

The `xlm-roberta-large-xnli` model is particularly adept at zero-shot text classification tasks, where it can classify text into predefined categories without any specific fine-tuning on that task. This makes it useful for a variety of applications, such as sentiment analysis, topic classification, and intent detection, across a diverse range of languages.

## What Can I Use It For?

You can use the `xlm-roberta-large-xnli` model for zero-shot text classification in any of the 15 supported languages. This could be helpful for building multilingual applications that need to categorize text, such as [customer service chatbots](https://aimodels.fyi/creators/huggingFace/joeddav) that can understand and respond to queries in multiple languages. The model could also be fine-tuned on domain-specific datasets to create custom classification models for specialized use cases.

## Things to Try

One interesting aspect of the `xlm-roberta-large-xnli` model is its ability to handle cross-lingual classification, where the input text and candidate labels can be in different languages. You could experiment with this by providing a Russian text sequence and English candidate labels, for example, and see how the model performs. Additionally, you could explore ways to further fine-tune the model on your specific use case to improve its accuracy and effectiveness.