Bongsoo

Rank:

Average Model Cost: $0.0000

Number of Runs: 3,658

Models by this creator

moco-sentencebertV2.0

bongsoo

Platform did not provide a description for this model.

Runs: 3.0K

Huggingface

kpf-cross-encoder-v1

kpf-cross-encoder-v1 is a cross-encoder fine-tuned from the jinmang2/kpfbert model. It was trained using the SentenceTransformers Cross-Encoder class.

Training: the model was trained with an sts(10)-sts(10) schedule. STS settings: seed=111, epochs=10, lr=1e-4, eps=1e-6, warm_step=10%, max_seq_len=128, train_batch=128 (32 for the small model) (albert 13m/7G). The training, evaluation, and test code are linked from the original model card.

Evaluation results:

Model                                korsts   klue-sts  glue(stsb)  stsb_multi_mt(en)
albert-small-kor-cross-encoder-v1    0.8455   0.8526    0.8513      0.7976
klue-cross-encoder-v1                0.8262   0.8833    0.8512      0.7889
kpf-cross-encoder-v1                 0.8799   0.9133    0.8626      0.8027

Usage and Performance: the pre-trained model predicts a relatedness score for each sentence pair, e.g. ('Sentence 1', 'Sentence 2') and ('Sentence 3', 'Sentence 4'); a sketch follows below. The model can also be used without sentence_transformers, via the Transformers AutoModel class.
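The usage note above originally pointed at a code snippet that was stripped when the card was scraped. A minimal sketch of scoring sentence pairs with the sentence-transformers CrossEncoder class; the model id and the example pairs are assumptions:

```python
from sentence_transformers import CrossEncoder

# max_length=128 mirrors the max_seq_len reported in the training settings.
model = CrossEncoder("bongsoo/kpf-cross-encoder-v1", max_length=128)

# One relatedness score is predicted per (sentence, sentence) pair.
scores = model.predict([
    ("Sentence 1", "Sentence 2"),
    ("Sentence 3", "Sentence 4"),
])
print(scores)
```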

Runs: 125

Huggingface

kpf-sbert-v1.1

kpf-sbert-v1.1: This is a sentence-transformers model: it maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for tasks like clustering or semantic search. It was created by fine-tuning the jinmang2/kpfbert model as a Sentence-BERT model (kpf-sbert-v1 with one additional round of NLI-STS training).

Evaluation Results: performance is measured on the Korean (kor) and English (en) evaluation corpora below. Korean: korsts (1,379 sentence pairs) and klue-sts (519 sentence pairs). English: stsb_multi_mt (1,376 sentence pairs) and glue:stsb (1,500 sentence pairs). The metric is cosine-similarity Spearman correlation; the evaluation code is linked from the original model card.

Model                                    korsts   klue-sts  glue(stsb)  stsb_multi_mt(en)
distiluse-base-multilingual-cased-v2     0.7475   0.7855    0.8193      0.8075
paraphrase-multilingual-mpnet-base-v2    0.8201   0.7993    0.8907      0.8682
bongsoo/albert-small-kor-sbert-v1        0.8305   0.8588    0.8419      0.7965
bongsoo/klue-sbert-v1.0                  0.8529   0.8952    0.8813      0.8469
bongsoo/kpf-sbert-v1.0                   0.8590   0.8924    0.8840      0.8531
bongsoo/kpf-sbert-v1.1                   0.8750   0.8900    0.8863      0.8554

For an automated evaluation of this model, see the Sentence Embeddings Benchmark: https://seb.sbert.net

Training: the jinmang2/kpfbert model was trained with an sts(10)-distil(10)-nli(3)-sts(10)-nli(3)-sts(10) schedule, using the following parameters.

Common: do_lower_case=1, correct_bios=0, pooling_mode=mean

1. STS. Corpus: korsts (5,749) + kluestsV1.1 (11,668) + stsb_multi_mt (5,749) + mteb/sickr-sts (9,927) + glue stsb (5,749) (total: 38,842). Parameters: lr: 1e-4, eps: 1e-6, warm_step=10%, epochs: 10, train_batch: 128, eval_batch: 64, max_token_len: 72. Training code is linked from the original model card.

2. Distillation. Teacher model: paraphrase-multilingual-mpnet-base-v2 (max_token_len: 128). Corpus: news_talk_en_ko_train.tsv (an English-Korean conversation/news parallel corpus, 1.38M pairs). Parameters: lr: 5e-5, eps: 1e-8, epochs: 10, train_batch: 128, eval/test_batch: 64, max_token_len: 128 (matched to the teacher model's 128). Training code is linked from the original model card.

3. NLI. Corpus: training (967,852): kornli (550,152), kluenli (24,998), glue-mnli (392,702) / evaluation (3,519): korsts (1,500), kluests (519), gluests (1,500). Hyperparameters: lr: 3e-5, eps: 1e-8, warm_step=10%, epochs: 3, train/eval_batch: 64, max_token_len: 128. Training code is linked from the original model card.

Citing & Authors: bongsoo
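The card presents this as an embedding model for clustering or semantic search, but the accompanying snippet was stripped. A minimal semantic-similarity sketch; the model id is taken from this page, and the query/candidate sentences are made-up examples:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("bongsoo/kpf-sbert-v1.1")

# 768-dimensional embeddings for one query and two candidate sentences.
query = model.encode("오늘 서울 날씨가 어때요?", convert_to_tensor=True)
candidates = model.encode(
    ["서울의 오늘 기상 상태를 알려줘.", "저녁 메뉴로 무엇을 먹을까?"],
    convert_to_tensor=True,
)

# Cosine similarity ranks the candidates against the query.
print(util.cos_sim(query, candidates))
```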

Runs: 25

Huggingface

albert-small-kor-sbert-v1.1

albert-small-kor-sbert-v1.1: This is a sentence-transformers model: it maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for tasks like clustering or semantic search. It was created by turning the albert-small-kor-v1 model into a Sentence-BERT model.

Usage (Sentence-Transformers): using the model is straightforward once sentence-transformers is installed; a sketch follows below.

Usage (HuggingFace Transformers): without sentence-transformers, pass your input through the transformer model and then apply the right pooling operation on top of the contextualized word embeddings; a sketch also follows below.

Evaluation Results: performance is measured on the Korean (kor) and English (en) evaluation corpora below. Korean: korsts (1,379 sentence pairs) and klue-sts (519 sentence pairs). English: stsb_multi_mt (1,376 sentence pairs) and glue:stsb (1,500 sentence pairs). The metric is cosine-similarity Spearman correlation; the evaluation code is linked from the original model card.

Model                                    korsts   klue-sts  glue(stsb)  stsb_multi_mt(en)
distiluse-base-multilingual-cased-v2     0.7475   0.7855    0.8193      0.8075
paraphrase-multilingual-mpnet-base-v2    0.8201   0.7993    0.8907      0.8682
bongsoo/albert-small-kor-sbert-v1        0.8305   0.8588    0.8419      0.7965
bongsoo/klue-sbert-v1.0                  0.8529   0.8952    0.8813      0.8469
bongsoo/kpf-sbert-v1.1                   0.8750   0.8900    0.8863      0.8554
bongsoo/albert-small-kor-sbert-v1.1      0.8526   0.8833    0.8484      0.8286

For an automated evaluation of this model, see the Sentence Embeddings Benchmark: https://seb.sbert.net

Training: the albert-small-kor-v1 model was trained with an sts(10)-distil(10) schedule only (adding NLI-STS training on top lowered the correlation scores), using kpf-sbert-v1.1 as the teacher model. Parameters:

Common: do_lower_case=1, correct_bios=0, pooling_mode=cls

1. STS. Corpus: korsts (5,749) + kluestsV1.1 (11,668) + stsb_multi_mt (5,749) + mteb/sickr-sts (9,927) + glue stsb (5,749) (total: 38,842). Parameters: lr: 1e-4, eps: 1e-6, warm_step=10%, epochs: 10, train_batch: 32, eval_batch: 64, max_token_len: 72. Training code is linked from the original model card.

2. Distillation. Teacher model: kpf-sbert-v1.1 (max_token_len: 128). Corpus: news_talk_ko_en_train.tsv (a Korean-English conversation/news parallel corpus, 1.38M pairs). Parameters: lr: 5e-5, epochs: 10, train_batch: 32, eval/test_batch: 64, max_token_len: 128 (matched to the teacher model's 128). Training code is linked from the original model card.

Full Model Architecture: see the original model card.

Citing & Authors: bongsoo
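The Sentence-Transformers usage section above lost its code snippet in scraping. A minimal sketch, assuming the standard sentence-transformers API and the bongsoo/albert-small-kor-sbert-v1.1 model id; the example sentences are placeholders:

```python
# pip install -U sentence-transformers
from sentence_transformers import SentenceTransformer

sentences = ["이것은 한국어 문장 임베딩 테스트입니다.", "Each sentence is mapped to a 768-dimensional vector."]

model = SentenceTransformer("bongsoo/albert-small-kor-sbert-v1.1")
embeddings = model.encode(sentences)
print(embeddings.shape)  # expected: (2, 768)
```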

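For the HuggingFace Transformers path, the card says to run the inputs through the transformer and apply the right pooling operation on top of the token embeddings. The sketch below uses mean pooling; that pooling choice is an assumption (the training notes list cls pooling), so check the model's bundled pooling config before relying on it:

```python
import torch
from transformers import AutoTokenizer, AutoModel

def mean_pooling(model_output, attention_mask):
    # Average the token embeddings, ignoring padding positions.
    token_embeddings = model_output[0]
    mask = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    return torch.sum(token_embeddings * mask, 1) / torch.clamp(mask.sum(1), min=1e-9)

tokenizer = AutoTokenizer.from_pretrained("bongsoo/albert-small-kor-sbert-v1.1")
model = AutoModel.from_pretrained("bongsoo/albert-small-kor-sbert-v1.1")

sentences = ["이것은 한국어 문장 임베딩 테스트입니다.", "Each sentence is mapped to a vector."]
encoded = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    model_output = model(**encoded)

sentence_embeddings = mean_pooling(model_output, encoded["attention_mask"])
print(sentence_embeddings.shape)
```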

Runs: 24

Huggingface

mdistilbertV3.1

mdistilbertV3.1 is a model trained by extending distilbert-base-multilingual-cased with vocabulary from the moco-corpus-kowiki2022 corpus (kowiki202206 plus 3.2M sentences extracted from MOCOMSYS).

Vocab: 159,552 entries: 40,004 entries (30,000 Korean words + 10,000 English words + 4 manual entries) added to the original BERT model vocab of 119,548. That is roughly 7,000 more words than mdistilbertV2.1, and the Korean words were extracted with MeCab. Training ran for 12 epochs (mdistilbertV2.1: 8).

Usage (HuggingFace Transformers): 1. a MASK-filling example, and 2. an embedding example using mean pooling (cls pooling and max pooling are also possible). Sketches of both follow below.

Training: MLM (Masked Language Model) training. Input model: distilbert-base-multilingual-cased. Corpus: training: bongsoo/moco-corpus-kowiki2022 (7.6M), evaluation: bongsoo/moco_eval. Hyperparameters: learning rate: 5e-5, epochs: 12, batch size: 32, max_token_len: 128. Vocab: 159,552 entries, as above. Output model: mdistilbertV3.1 (size: 634MB). Training time: 90h on 1 GPU (24GB, 16.5GB used). Training loss: 2.1154, evaluation loss: 2.5275. The training code and the perplexity evaluation code are linked from the original model card.

Model Config: see the original model card.

Citing & Authors: bongsoo
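The MASK example in the usage section lost its code and output. A minimal fill-mask sketch, assuming the bongsoo/mdistilbertV3.1 model id and the standard transformers pipeline API; the example sentence is a placeholder:

```python
from transformers import pipeline

# DistilBERT-style models use the [MASK] token for masked-language-model inference.
fill_mask = pipeline("fill-mask", model="bongsoo/mdistilbertV3.1")

for prediction in fill_mask("대한민국의 수도는 [MASK] 이다."):
    print(prediction["token_str"], round(prediction["score"], 4))
```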

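The embedding example similarly lost its code. A sketch of the mean-pooling approach the card describes (cls or max pooling could be substituted), again assuming the bongsoo/mdistilbertV3.1 model id:

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bongsoo/mdistilbertV3.1")
model = AutoModel.from_pretrained("bongsoo/mdistilbertV3.1")

sentences = ["대한민국의 수도는 서울이다.", "한국어 문장 임베딩을 계산한다."]
encoded = tokenizer(sentences, padding=True, truncation=True, max_length=128, return_tensors="pt")

with torch.no_grad():
    output = model(**encoded)

# Mean pooling: average the token embeddings, masking out padding tokens.
mask = encoded["attention_mask"].unsqueeze(-1).float()
embeddings = (output.last_hidden_state * mask).sum(1) / mask.sum(1).clamp(min=1e-9)
print(embeddings.shape)
```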

Runs: 17

Huggingface
