Bongsoo
Rank:
Average Model Cost: $0.0000
Number of Runs: 3,658
Models by this creator
moco-sentencebertV2.0
$-/run
3.0K
Huggingface
klue-cross-encoder-v1
$-/run
343
Huggingface
kpf-cross-encoder-v1
A cross-encoder fine-tuned from the jinmang2/kpfbert model, trained with the SentenceTransformers Cross-Encoder class.

Training
- Training sequence: sts(10)-sts(10)
- STS: seed=111, epoch=10, lr=1e-4, eps=1e-6, warm_step=10%, max_seq_len=128, train_batch=128 (small models: 32) (albert 13m/7G)
- Training code, evaluation code, test code: see here

| Model | korsts | klue-sts | glue(stsb) | stsb_multi_mt(en) |
|-------|--------|----------|------------|-------------------|
| albert-small-kor-cross-encoder-v1 | 0.8455 | 0.8526 | 0.8513 | 0.7976 |
| klue-cross-encoder-v1 | 0.8262 | 0.8833 | 0.8512 | 0.7889 |
| kpf-cross-encoder-v1 | 0.8799 | 0.9133 | 0.8626 | 0.8027 |

Usage and Performance
Pre-trained models can be used like this: the model predicts scores for the pairs ('Sentence 1', 'Sentence 2') and ('Sentence 3', 'Sentence 4'). You can also use this model without sentence_transformers, via the Transformers AutoModel class; minimal sketches of both follow below.
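Both sketches assume the model is published as bongsoo/kpf-cross-encoder-v1 on the Hugging Face Hub; the sentence pairs are placeholders.

```python
# Minimal sketch with sentence-transformers.
from sentence_transformers import CrossEncoder

model = CrossEncoder("bongsoo/kpf-cross-encoder-v1")
scores = model.predict([
    ("Sentence 1", "Sentence 2"),
    ("Sentence 3", "Sentence 4"),
])
print(scores)  # one similarity score per pair
```

The same scores can be computed without sentence_transformers. This assumes the cross-encoder was saved with a single-logit sequence-classification head, as is typical for models trained with the Cross-Encoder class:

```python
# Minimal sketch with plain Transformers.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bongsoo/kpf-cross-encoder-v1")
model = AutoModelForSequenceClassification.from_pretrained("bongsoo/kpf-cross-encoder-v1")

# Tokenize the two pairs jointly: first sentences vs. second sentences.
features = tokenizer(
    ["Sentence 1", "Sentence 3"],
    ["Sentence 2", "Sentence 4"],
    padding=True, truncation=True, return_tensors="pt",
)
with torch.no_grad():
    scores = model(**features).logits.squeeze(-1)  # one logit per pair
print(scores)
```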
$-/run
125
Huggingface
moco-sentencedistilbertV2.1
$-/run
36
Huggingface
kpf-sbert-v1.1
This is a sentence-transformers model: it maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for tasks like clustering or semantic search. It is the jinmang2/kpfbert model fine-tuned as a SentenceBERT model (kpf-sbert-v1 with one additional round of NLI-STS training).

Evaluation Results
The corpora used to measure performance are the Korean (kor) and English (en) evaluation sets below:
- Korean: korsts (1,379 sentence pairs) and klue-sts (519 sentence pairs)
- English: stsb_multi_mt (1,376 sentence pairs) and glue:stsb (1,500 sentence pairs)
The metric is cosine-similarity Spearman correlation (cosin.spearman); for the evaluation code, see here.

| Model | korsts | klue-sts | glue(stsb) | stsb_multi_mt(en) |
|-------|--------|----------|------------|-------------------|
| distiluse-base-multilingual-cased-v2 | 0.7475 | 0.7855 | 0.8193 | 0.8075 |
| paraphrase-multilingual-mpnet-base-v2 | 0.8201 | 0.7993 | 0.8907 | 0.8682 |
| bongsoo/albert-small-kor-sbert-v1 | 0.8305 | 0.8588 | 0.8419 | 0.7965 |
| bongsoo/klue-sbert-v1.0 | 0.8529 | 0.8952 | 0.8813 | 0.8469 |
| bongsoo/kpf-sbert-v1.0 | 0.8590 | 0.8924 | 0.8840 | 0.8531 |
| bongsoo/kpf-sbert-v1.1 | 0.8750 | 0.8900 | 0.8863 | 0.8554 |

For an automated evaluation of this model, see the Sentence Embeddings Benchmark: https://seb.sbert.net

Training
The jinmang2/kpfbert model was trained in the sequence sts(10)-distil(10)-nli(3)-sts(10)-nli(3)-sts(10). The model was trained with the parameters:

Common: do_lower_case=1, correct_bios=0, polling_mode=mean

1. STS
- Corpus: korsts (5,749) + kluestsV1.1 (11,668) + stsb_multi_mt (5,749) + mteb/sickr-sts (9,927) + glue stsb (5,749) (total: 38,842)
- Params: lr: 1e-4, eps: 1e-6, warm_step=10%, epochs: 10, train_batch: 128, eval_batch: 64, max_token_len: 72
- Training code: see here

2. Distillation
- Teacher model: paraphrase-multilingual-mpnet-base-v2 (max_token_len: 128)
- Corpus: news_talk_en_ko_train.tsv (English-Korean dialogue/news parallel corpus: 1.38M)
- Params: lr: 5e-5, eps: 1e-8, epochs: 10, train_batch: 128, eval/test_batch: 64, max_token_len: 128 (matched to the teacher model)
- Training code: see here

3. NLI
- Corpus: training (967,852): kornli (550,152), kluenli (24,998), glue-mnli (392,702); evaluation (3,519): korsts (1,500), kluests (519), gluests (1,500)
- HyperParameters: lr: 3e-5, eps: 1e-8, warm_step=10%, epochs: 3, train/eval_batch: 64, max_token_len: 128
- Training code: see here

Citing & Authors
bongsoo
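A minimal usage sketch for this card, assuming the Hub id bongsoo/kpf-sbert-v1.1 (the Korean sentences are placeholders):

```python
# Encode sentences into 768-dim vectors with sentence-transformers.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("bongsoo/kpf-sbert-v1.1")
embeddings = model.encode(["이것은 예시 문장입니다.", "각 문장은 벡터가 됩니다."])
print(embeddings.shape)  # (2, 768): one dense vector per sentence
```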
$-/run
25
Huggingface
albert-small-kor-sbert-v1.1
This is a sentence-transformers model: it maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for tasks like clustering or semantic search. It is a SentenceBERT model built from the albert-small-kor-v1 model.

Usage (Sentence-Transformers)
Using this model is easy once you have sentence-transformers installed; see the first sketch at the end of this entry.

Usage (HuggingFace Transformers)
Without sentence-transformers, you first pass your input through the transformer model, then apply the right pooling operation on top of the contextualized word embeddings; see the second sketch at the end of this entry.

Evaluation Results
The corpora used to measure performance are the Korean (kor) and English (en) evaluation sets below:
- Korean: korsts (1,379 sentence pairs) and klue-sts (519 sentence pairs)
- English: stsb_multi_mt (1,376 sentence pairs) and glue:stsb (1,500 sentence pairs)
The metric is cosine-similarity Spearman correlation (cosin.spearman); for the evaluation code, see here.

| Model | korsts | klue-sts | glue(stsb) | stsb_multi_mt(en) |
|-------|--------|----------|------------|-------------------|
| distiluse-base-multilingual-cased-v2 | 0.7475 | 0.7855 | 0.8193 | 0.8075 |
| paraphrase-multilingual-mpnet-base-v2 | 0.8201 | 0.7993 | 0.8907 | 0.8682 |
| bongsoo/albert-small-kor-sbert-v1 | 0.8305 | 0.8588 | 0.8419 | 0.7965 |
| bongsoo/klue-sbert-v1.0 | 0.8529 | 0.8952 | 0.8813 | 0.8469 |
| bongsoo/kpf-sbert-v1.1 | 0.8750 | 0.8900 | 0.8863 | 0.8554 |
| bongsoo/albert-small-kor-sbert-v1.1 | 0.8526 | 0.8833 | 0.8484 | 0.8286 |

For an automated evaluation of this model, see the Sentence Embeddings Benchmark: https://seb.sbert.net

Training
The albert-small-kor-v1 model was trained only with sts(10)-distil(10) (adding nli-sts training lowered the scores). kpf-sbert-v1.1 was used as the teacher model. The model was trained with the parameters:

Common: do_lower_case=1, correct_bios=0, polling_mode=cls

1. STS
- Corpus: korsts (5,749) + kluestsV1.1 (11,668) + stsb_multi_mt (5,749) + mteb/sickr-sts (9,927) + glue stsb (5,749) (total: 38,842)
- Params: lr: 1e-4, eps: 1e-6, warm_step=10%, epochs: 10, train_batch: 32, eval_batch: 64, max_token_len: 72
- Training code: see here

2. Distillation
- Teacher model: kpf-sbert-v1.1 (max_token_len: 128)
- Corpus: news_talk_ko_en_train.tsv (Korean-English dialogue/news parallel corpus: 1.38M)
- Params: lr: 5e-5, epochs: 10, train_batch: 32, eval/test_batch: 64, max_token_len: 128 (matched to the teacher model)
- Training code: see here

Full Model Architecture

Citing & Authors
bongsoo
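Minimal sketches of the two usage paths described above, assuming the Hub id bongsoo/albert-small-kor-sbert-v1.1; the CLS-token pooling in the second sketch follows the card's polling_mode=cls setting, and the sentences are placeholders.

```python
# 1) With sentence-transformers installed.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("bongsoo/albert-small-kor-sbert-v1.1")
print(model.encode(["이것은 예시 문장입니다."]).shape)  # (1, 768)
```

```python
# 2) Without sentence-transformers: run the transformer, then pool.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bongsoo/albert-small-kor-sbert-v1.1")
encoder = AutoModel.from_pretrained("bongsoo/albert-small-kor-sbert-v1.1")

encoded = tokenizer(["이것은 예시 문장입니다."], padding=True, truncation=True,
                    return_tensors="pt")
with torch.no_grad():
    out = encoder(**encoded)
embeddings = out.last_hidden_state[:, 0]  # CLS-token pooling (polling_mode=cls)
print(embeddings.shape)  # (1, 768)
```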
$-/run
24
Huggingface
albert-small-kor-v1
$-/run
22
Huggingface
albert-small-kor-sbert-v1
$-/run
18
Huggingface
moco-sentencedistilbertV2.0
$-/run
18
Huggingface
mdistilbertV3.1
A model built from distilbert-base-multilingual-cased by adding vocab and further training on the moco-corpus-kowiki2022 corpus (kowiki202206 + 3.2M sentences extracted from MOCOMSYS).
- vocab: 159,552 entries (40,004 added to the original BERT vocab of 119,548: 30,000 Korean words + 10,000 English words + 4 others)
- About 7,000 more words than mdistilbertV2.1; the Korean words were extracted with MeCab. Trained for 12 epochs (mdistilbertV2.1: 8).

Usage (HuggingFace Transformers)
1. MASK example and results (see the first sketch at the end of this entry)
2. Embedding example: mean pooling (mean_pooling) is used; (cls pooling, max pooling) results (see the second sketch at the end of this entry)

Training
MLM (Masked Language Model) training
- Input model: distilbert-base-multilingual-cased
- Corpus: training: bongsoo/moco-corpus-kowiki2022 (7.6M); evaluation: bongsoo/moco_eval
- HyperParameters: learning rate: 5e-5, epochs: 12, batch size: 32, max_token_len: 128
- vocab: 159,552 entries (as above)
- Output model: mdistilbertV3.1 (size: 634MB)
- Training time: 90h on 1 GPU (24GB, about 16.5GB used)
- Training loss: 2.1154, evaluation loss: 2.5275
- Training code: see here; perplexity evaluation code: see here

Model Config

Citing & Authors
bongsoo
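Minimal sketches of the MASK and embedding examples listed above, assuming the Hub id bongsoo/mdistilbertV3.1 (the Korean prompt and sentence are placeholders):

```python
# 1) MASK example: fill-mask with the extended-vocab model.
from transformers import pipeline

fill = pipeline("fill-mask", model="bongsoo/mdistilbertV3.1")
prompt = f"대한민국의 수도는 {fill.tokenizer.mask_token} 이다."
for pred in fill(prompt):
    print(pred["token_str"], round(pred["score"], 4))
```

```python
# 2) Embedding example: mean pooling over token embeddings, masking padding.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bongsoo/mdistilbertV3.1")
model = AutoModel.from_pretrained("bongsoo/mdistilbertV3.1")

encoded = tokenizer(["이것은 예시 문장입니다."], padding=True, truncation=True,
                    return_tensors="pt")
with torch.no_grad():
    token_embeddings = model(**encoded).last_hidden_state
# Weight each token by its attention mask so padding does not skew the mean.
mask = encoded["attention_mask"].unsqueeze(-1).float()
mean_pooled = (token_embeddings * mask).sum(1) / mask.sum(1)
print(mean_pooled.shape)
```

CLS or max pooling can be swapped in by taking `last_hidden_state[:, 0]` or a masked maximum instead of the mean.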
$-/run
17
Huggingface