distilbert-base-multilingual-cased-sentiments-student
=====================================================

This model was distilled from a zero-shot classification pipeline on the Multilingual Sentiment dataset using this [script](https://github.com/huggingface/transformers/tree/main/examples/research_projects/zero-shot-distillation).

The Multilingual Sentiment dataset is in fact annotated, but for the sake of this example the annotations are ignored and the text is treated as unlabeled.

    Teacher model: MoritzLaurer/mDeBERTa-v3-base-mnli-xnli
    Teacher hypothesis template: "The sentiment of this text is {}."
    Student model: distilbert-base-multilingual-cased
    

Inference example
-----------------

    from transformers import pipeline
    
    distilled_student_sentiment_classifier = pipeline(
        model="lxyuan/distilbert-base-multilingual-cased-sentiments-student", 
        return_all_scores=True
    )
    
    # english
    distilled_student_sentiment_classifier("I love this movie and i would watch it again and again!")
    >> [[{'label': 'positive', 'score': 0.9731044769287109},
      {'label': 'neutral', 'score': 0.016910076141357422},
      {'label': 'negative', 'score': 0.009985478594899178}]]
    
    # malay
    distilled_student_sentiment_classifier("Saya suka filem ini dan saya akan menontonnya lagi dan lagi!")
    >> [[{'label': 'positive', 'score': 0.9760093688964844},
      {'label': 'neutral', 'score': 0.01804516464471817},
      {'label': 'negative', 'score': 0.005945465061813593}]]
    
    # japanese
    distilled_student_sentiment_classifier("私はこの映画が大好きで、何度も見ます！")
    >> [[{'label': 'positive', 'score': 0.9342429041862488},
      {'label': 'neutral', 'score': 0.040193185210227966},
      {'label': 'negative', 'score': 0.025563929229974747}]]
    
    

Training procedure
------------------

Notebook link: [here](https://github.com/LxYuan0420/nlp/blob/main/notebooks/Distilling_Zero_Shot_multilingual_distilbert_sentiments_student.ipynb)

### Training hyperparameters

The results can be reproduced with the following command:

    python transformers/examples/research_projects/zero-shot-distillation/distill_classifier.py \
    --data_file ./multilingual-sentiments/train_unlabeled.txt \
    --class_names_file ./multilingual-sentiments/class_names.txt \
    --hypothesis_template "The sentiment of this text is {}." \
    --teacher_name_or_path MoritzLaurer/mDeBERTa-v3-base-mnli-xnli \
    --teacher_batch_size 32 \
    --student_name_or_path distilbert-base-multilingual-cased \
    --output_dir ./distilbert-base-multilingual-cased-sentiments-student \
    --per_device_train_batch_size 16 \
    --fp16
    
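The command above reads two plain-text files. As a minimal sketch, assuming the script expects one unlabeled example per line in `--data_file` and one class name per line in `--class_names_file`, they could be prepared like this. The helper name `write_distillation_inputs` and the sample texts are hypothetical and not part of the original project.

```python
import tempfile
from pathlib import Path


def write_distillation_inputs(texts, class_names, out_dir):
    """Write train_unlabeled.txt and class_names.txt (hypothetical helper).

    Assumption: the distillation script reads one example per line and
    one class name per line.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    data_file = out_dir / "train_unlabeled.txt"
    class_names_file = out_dir / "class_names.txt"

    # A newline inside an example would split it into several examples,
    # so collapse all whitespace runs and drop empty texts.
    cleaned = [" ".join(text.split()) for text in texts if text.strip()]

    data_file.write_text("\n".join(cleaned) + "\n", encoding="utf-8")
    class_names_file.write_text("\n".join(class_names) + "\n", encoding="utf-8")
    return data_file, class_names_file


with tempfile.TemporaryDirectory() as tmp:
    data_file, class_names_file = write_distillation_inputs(
        texts=["I love this movie!", "Saya suka filem ini."],
        class_names=["positive", "neutral", "negative"],
        out_dir=Path(tmp) / "multilingual-sentiments",
    )
    print(data_file.read_text(encoding="utf-8"))
    print(class_names_file.read_text(encoding="utf-8"))
```

The class names are inserted into the hypothesis template, so they should read naturally in the sentence "The sentiment of this text is {}."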

If you are training this model on Colab, make the following changes to `distill_classifier.py` to avoid out-of-memory errors:

    ###### modify L78 to disable fast tokenizer 
    default=False,
    
    ###### update dataset map part at L313
    dataset = dataset.map(tokenizer, input_columns="text", fn_kwargs={"padding": "max_length", "truncation": True, "max_length": 512})
    
    ###### add following lines to L213
    del model
    print("Manually deleted Teacher model, free some memory for student model.")
    
    ###### add following lines to L337
    trainer.push_to_hub()
    tokenizer.push_to_hub("distilbert-base-multilingual-cased-sentiments-student")
      
    

### Training log

    
    Training completed. Do not forget to share your model on huggingface.co/models =)
    
    {'train_runtime': 2009.8864, 'train_samples_per_second': 73.0, 'train_steps_per_second': 4.563, 'train_loss': 0.6473459283913797, 'epoch': 1.0}
    100%|| 9171/9171 [33:29<00:00,  4.56it/s]
    [INFO|trainer.py:762] 2023-05-06 10:56:18,555 >> The following columns in the evaluation set don't have a corresponding argument in `DistilBertForSequenceClassification.forward` and have been ignored: text. If text are not expected by `DistilBertForSequenceClassification.forward`,  you can safely ignore this message.
    [INFO|trainer.py:3129] 2023-05-06 10:56:18,557 >> ***** Running Evaluation *****
    [INFO|trainer.py:3131] 2023-05-06 10:56:18,557 >>   Num examples = 146721
    [INFO|trainer.py:3134] 2023-05-06 10:56:18,557 >>   Batch size = 128
    100%|| 1147/1147 [08:59<00:00,  2.13it/s]
    05/06/2023 11:05:18 - INFO - __main__ - Agreement of student and teacher predictions: 88.29%
    [INFO|trainer.py:2868] 2023-05-06 11:05:18,251 >> Saving model checkpoint to ./distilbert-base-multilingual-cased-sentiments-student
    [INFO|configuration_utils.py:457] 2023-05-06 11:05:18,251 >> Configuration saved in ./distilbert-base-multilingual-cased-sentiments-student/config.json
    [INFO|modeling_utils.py:1847] 2023-05-06 11:05:18,905 >> Model weights saved in ./distilbert-base-multilingual-cased-sentiments-student/pytorch_model.bin
    [INFO|tokenization_utils_base.py:2171] 2023-05-06 11:05:18,905 >> tokenizer config file saved in ./distilbert-base-multilingual-cased-sentiments-student/tokenizer_config.json
    [INFO|tokenization_utils_base.py:2178] 2023-05-06 11:05:18,905 >> Special tokens file saved in ./distilbert-base-multilingual-cased-sentiments-student/special_tokens_map.json
    
    
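The figures in the log are mutually consistent, which a little arithmetic can confirm. The sketch below assumes a single training device (so the per-device batch size of 16 is the total batch size) and that the 146,721 evaluated examples are the same ones used for the one training epoch. Neither assumption is stated explicitly in the log.

```python
import math

num_examples = 146_721        # "Num examples" in the evaluation log
train_batch_size = 16         # --per_device_train_batch_size (single device assumed)
eval_batch_size = 128         # "Batch size" in the evaluation log
train_runtime_s = 2009.8864   # 'train_runtime' in the training summary

# The last batch may be partial, so round up.
train_steps = math.ceil(num_examples / train_batch_size)
eval_steps = math.ceil(num_examples / eval_batch_size)

samples_per_second = num_examples / train_runtime_s
steps_per_second = train_steps / train_runtime_s

print(train_steps)                    # 9171, matches the training progress bar
print(eval_steps)                     # 1147, matches the evaluation progress bar
print(round(samples_per_second, 1))   # 73.0, matches 'train_samples_per_second'
print(round(steps_per_second, 3))     # 4.563, matches 'train_steps_per_second'
```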

### Framework versions

*   Transformers 4.28.1
*   PyTorch 2.0.0+cu118
*   Datasets 2.11.0
*   Tokenizers 0.13.3

## Model Overview

`distilbert-base-multilingual-cased-sentiments-student` is a sentiment classifier distilled from a zero-shot classification pipeline on the Multilingual Sentiment dataset. It was created by [lxyuan](https://aimodels.fyi/creators/huggingFace/lxyuan) using knowledge distillation: a larger "teacher" model ([MoritzLaurer/mDeBERTa-v3-base-mnli-xnli](https://aimodels.fyi/models/huggingFace/mDeBERTa-v3-base-mnli-xnli-moritzlaurer)) labels the text, and a smaller "student" model (`distilbert-base-multilingual-cased`) is trained to reproduce those predictions. According to the training log, the student agrees with the teacher on 88.29% of examples while being more lightweight.

The model classifies a piece of text as positive, neutral, or negative. Unlike its zero-shot teacher, the student has a fixed set of these three labels and does not accept arbitrary candidate labels. The examples cover English, Malay, and Japanese, and the multilingual base model may also handle other languages. This makes it useful for applications that need sentiment analysis across several languages without language-specific annotated training data.

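To make the teacher/student relationship concrete, here is a minimal pure-Python sketch of a soft-label distillation loss. It assumes the student is trained with cross-entropy against the teacher's full probability distribution over the three labels. The actual training script may differ in its details, and the function names and numbers below are illustrative only.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    log_total = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_total for x in logits]


def soft_label_cross_entropy(student_logits, teacher_probs):
    """Cross-entropy of the student's distribution against teacher soft labels."""
    log_probs = log_softmax(student_logits)
    return -sum(t * lp for t, lp in zip(teacher_probs, log_probs))


# Illustrative teacher distribution over (positive, neutral, negative).
teacher_probs = [0.90, 0.07, 0.03]

agreeing_student = [3.0, 0.5, -0.5]      # highest logit on "positive"
disagreeing_student = [-0.5, 0.5, 3.0]   # highest logit on "negative"

print(soft_label_cross_entropy(agreeing_student, teacher_probs))
print(soft_label_cross_entropy(disagreeing_student, teacher_probs))
```

The loss is lower when the student's distribution is close to the teacher's, so minimizing it over many unlabeled texts pushes the student toward the teacher's predictions without using any human annotations.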
## Model Inputs and Outputs

### Inputs
- **Text**: A piece of text, in any of the supported languages (English, Malay, Japanese, etc.), to be classified for sentiment.

### Outputs
- **Sentiment scores**: With `return_all_scores=True`, one list per input text containing three dictionaries (one per label), each with the following keys:
  - `label`: The sentiment label ('positive', 'neutral', or 'negative')
  - `score`: The probability of the corresponding sentiment label

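The nested output can be awkward to work with directly. The following sketch flattens it into one `{label: score}` dictionary per input and picks the highest-scoring label. The helper names `scores_by_label` and `top_label` are hypothetical, and the sample data is the English example output from this page rather than a live model call.

```python
def scores_by_label(pipeline_output):
    """Convert [[{'label': ..., 'score': ...}, ...], ...] to a list of dicts."""
    return [
        {item["label"]: item["score"] for item in per_input}
        for per_input in pipeline_output
    ]


def top_label(scores):
    """Return the label with the highest score."""
    return max(scores, key=scores.get)


# Output copied from the English inference example.
example_output = [[
    {"label": "positive", "score": 0.9731044769287109},
    {"label": "neutral", "score": 0.016910076141357422},
    {"label": "negative", "score": 0.009985478594899178},
]]

for scores in scores_by_label(example_output):
    print(top_label(scores), scores)
```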
## Capabilities
The `distilbert-base-multilingual-cased-sentiments-student` model classifies the sentiment of multilingual text. For example:

```python
from transformers import pipeline

distilled_student_sentiment_classifier = pipeline(
    model="lxyuan/distilbert-base-multilingual-cased-sentiments-student", 
    return_all_scores=True
)

# English
distilled_student_sentiment_classifier("I love this movie and i would watch it again and again!")
# Output: [[{'label': 'positive', 'score': 0.9731044769287109},
#           {'label': 'neutral', 'score': 0.016910076141357422},
#           {'label': 'negative', 'score': 0.009985478594899178}]]

# Malay
distilled_student_sentiment_classifier("Saya suka filem ini dan saya akan menontonnya lagi dan lagi!")
# Output: [[{'label': 'positive', 'score': 0.9760093688964844},
#           {'label': 'neutral', 'score': 0.01804516464471817},
#           {'label': 'negative', 'score': 0.005945465061813593}]]

# Japanese
distilled_student_sentiment_classifier("私はこの映画が大好きで、何度も見ます！")
# Output: [[{'label': 'positive', 'score': 0.9342429041862488},
#           {'label': 'neutral', 'score': 0.040193185210227966},
#           {'label': 'negative', 'score': 0.025563929229974747}]]
```

## What Can I Use It For?
The `distilbert-base-multilingual-cased-sentiments-student` model can be used in a variety of applications that require multilingual sentiment analysis, such as:

- **Social media monitoring**: Analyzing customer sentiment across multiple languages on social media platforms.
- **Product reviews**: Aggregating and analyzing product reviews from customers in different countries and languages.
- **Market research**: Gauging public opinion on various topics or events in a global context.
- **Customer service**: Automatically detecting the sentiment of customer inquiries or feedback in different languages.

By using this distilled and efficient model, you can build sentiment analysis pipelines that are fast, scalable, and capable of handling text in multiple languages.

## Things to Try
One interesting aspect of this model is that it was trained without human sentiment labels: the zero-shot predictions of the larger "teacher" model served as the training targets for the smaller "student" model.

You could compare the model's predictions and inference speed with those of the original teacher model, [MoritzLaurer/mDeBERTa-v3-base-mnli-xnli](https://aimodels.fyi/models/huggingFace/mDeBERTa-v3-base-mnli-xnli-moritzlaurer), to see how much accuracy the distillation process trades for efficiency.

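One simple way to quantify such a comparison is the fraction of texts on which both models predict the same label, in the spirit of the "Agreement of student and teacher predictions" line in the training log. The sketch below assumes agreement means matching top labels. The function name and the label lists are hypothetical placeholders for predictions you would collect from the two models.

```python
def agreement(student_labels, teacher_labels):
    """Fraction of positions where student and teacher predict the same label."""
    if len(student_labels) != len(teacher_labels):
        raise ValueError("prediction lists must have the same length")
    if not student_labels:
        raise ValueError("prediction lists must not be empty")
    matches = sum(s == t for s, t in zip(student_labels, teacher_labels))
    return matches / len(student_labels)


# Placeholder predictions; in practice these come from running both models.
student = ["positive", "neutral", "negative", "positive"]
teacher = ["positive", "negative", "negative", "positive"]

print(f"{agreement(student, teacher):.2%}")  # 75.00%
```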
Additionally, you could explore using this model as a starting point for further fine-tuning on domain-specific sentiment analysis tasks, potentially leading to even better performance for your particular use case.