Herbert Large Cased

allegro

herbert-large-cased

HerBERT

HerBERT is a BERT-based language model trained on Polish corpora using Masked Language Modelling (MLM) and Sentence Structural Objective (SSO) with dynamic masking of whole words. For more details, please refer to: HerBERT: Efficiently Pretrained Transformer-based Language Model for Polish.

Model training and experiments were conducted with transformers in version 2.9.

Corpus

HerBERT was trained on six different corpora available for the Polish language:

Tokenizer

The training dataset was tokenized into subwords using a character-level byte-pair encoding (CharBPETokenizer) with a vocabulary size of 50k tokens. The tokenizer itself was trained with the tokenizers library.

We kindly encourage you to use the Fast version of the tokenizer, namely HerbertTokenizerFast.

Usage

Example code:

License

CC BY 4.0

Citation

If you use this model, please cite the following paper:

Authors

The model was trained by the Machine Learning Research Team at Allegro and the Linguistic Engineering Group at the Institute of Computer Science, Polish Academy of Sciences.

You can contact us at: klejbenchmark@allegro.pl
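The model description mentions "dynamic masking of whole words" without showing what that involves. The sketch below illustrates the general idea in plain Python and is not the authors' training code. It assumes that subword tokens carry a `</w>` end-of-word suffix (the usual convention for character-level BPE) and that the mask token is written `<mask>`. The helper names `group_into_words` and `mask_whole_words` are hypothetical. It also simplifies the usual BERT recipe by always substituting the mask token instead of sometimes keeping or randomizing the original token.

```python
import random

END_OF_WORD = "</w>"  # assumption: suffix marking the last subword of a word
MASK = "<mask>"       # assumption: textual form of the mask token


def group_into_words(tokens):
    """Return one list of token indices per whole word."""
    words, current = [], []
    for i, tok in enumerate(tokens):
        current.append(i)
        if tok.endswith(END_OF_WORD):
            words.append(current)
            current = []
    if current:  # trailing pieces without an end-of-word marker
        words.append(current)
    return words


def mask_whole_words(tokens, mask_prob=0.15, rng=None):
    """Mask every subword of a randomly chosen word, or none of them.

    Returns (masked_tokens, labels); labels holds the original token at
    masked positions and None elsewhere.
    """
    if rng is None:
        rng = random.Random()
    masked = list(tokens)
    labels = [None] * len(tokens)
    for word in group_into_words(tokens):
        # One random draw per word, so a word is never partially masked.
        if rng.random() < mask_prob:
            for i in word:
                labels[i] = tokens[i]
                masked[i] = MASK
    return masked, labels


# "Dynamic" masking: draw a fresh mask each time the sentence is seen,
# instead of fixing one mask during preprocessing.
example = ["Her", "BERT</w>", "jest</w>", "mo", "de", "lem</w>"]
for epoch in range(3):
    masked_tokens, _ = mask_whole_words(example, mask_prob=0.5)
    print(epoch, masked_tokens)
```

Because a new mask is sampled on every call, the same sentence can expose different words to the MLM objective in different epochs, which is the difference between dynamic masking and a mask fixed once at preprocessing time.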
feature-extraction

Creator Models

Model                              Cost   Runs
Plt5 Base                          $?     2,200
Plt5 Large                         $?     624
Herbert Klej Cased Tokenizer V1    $?     362
Plt5 Small                         $?     1,215
Herbert Klej Cased V1              $?     770

Overview

Summary of this model and related resources.

Property       Value
Creator        allegro
Model Name     Herbert Large Cased
Description    HerBERT is a BERT-based Language Model trained on P...
Tags           feature-extraction
Model Link     View on HuggingFace
API Spec       View on HuggingFace
GitHub Link    No GitHub link provided
Paper Link     No paper link provided

Popularity

How popular is this model, by number of runs? How popular is the creator, by the sum of all their runs?

Property        Value
Runs            1,631
Model Rank
Creator Rank

Cost

How much does it cost to run this model? How long, on average, does it take to complete a run?

Property                   Value
Cost per Run               $-
Prediction Hardware        -
Average Completion Time    -