https://github.com/BM-K/Sentence-Embedding-is-all-you-need
Korean-Sentence-Embedding
Korean sentence embedding repository. You can download the pre-trained models and run inference right away. The repository also provides an environment in which you can train your own models.
Quick tour

    import torch
    from transformers import AutoModel, AutoTokenizer

    def cal_score(a, b):
        # Cosine similarity between the rows of a and b, scaled by 100.
        # 1-D inputs are promoted to shape (1, dim) first.
        if len(a.shape) == 1: a = a.unsqueeze(0)
        if len(b.shape) == 1: b = b.unsqueeze(0)
        a_norm = a / a.norm(dim=1)[:, None]
        b_norm = b / b.norm(dim=1)[:, None]
        return torch.mm(a_norm, b_norm.transpose(0, 1)) * 100

    model = AutoModel.from_pretrained('BM-K/KoSimCSE-roberta-multitask')
    tokenizer = AutoTokenizer.from_pretrained('BM-K/KoSimCSE-roberta-multitask')

    sentences = ['치타가 들판을 가로 질러 먹이를 쫓는다.',   # A cheetah chases prey across a field.
                 '치타 한 마리가 먹이 뒤에서 달리고 있다.',  # A cheetah is running behind its prey.
                 '원숭이 한 마리가 드럼을 연주한다.']        # A monkey is playing the drums.

    inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    # With return_dict=False the model returns a tuple whose first item is the last hidden state.
    embeddings, _ = model(**inputs, return_dict=False)

    # embeddings[i][0] is the first-token vector of sentence i.
    score01 = cal_score(embeddings[0][0], embeddings[1][0])
    score02 = cal_score(embeddings[0][0], embeddings[2][0])

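
The `cal_score` helper above returns cosine similarity scaled by 100. As a minimal sketch of how such scores might be used, the plain-Python snippet below ranks candidate vectors against a query with the same formula. The names `cosine_score` and `rank_by_score` and the toy 3-dimensional vectors are illustrative assumptions and are not part of the repository; real vectors would come from the model.

```python
import math

def cosine_score(a, b):
    """Cosine similarity of two equal-length vectors, scaled by 100 like cal_score."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 100.0 * dot / (norm_a * norm_b)

def rank_by_score(query, candidates):
    """Return (index, score) pairs sorted from most to least similar to the query."""
    scored = [(i, cosine_score(query, c)) for i, c in enumerate(candidates)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy vectors standing in for sentence embeddings (made up for illustration).
query = [1.0, 0.0, 0.0]
candidates = [[0.0, 1.0, 0.0], [2.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
ranking = rank_by_score(query, candidates)
```

Because the score depends only on the angle between vectors, the candidate `[2.0, 0.0, 0.0]` ranks first even though its length differs from the query's.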
Performance
- Semantic Textual Similarity test set results
Model | AVG | Cosine Pearson | Cosine Spearman | Euclidean Pearson | Euclidean Spearman | Manhattan Pearson | Manhattan Spearman | Dot Pearson | Dot Spearman |
---|---|---|---|---|---|---|---|---|---|
KoSBERT†SKT | 77.40 | 78.81 | 78.47 | 77.68 | 77.78 | 77.71 | 77.83 | 75.75 | 75.22 |
KoSBERT | 80.39 | 82.13 | 82.25 | 80.67 | 80.75 | 80.69 | 80.78 | 77.96 | 77.90 |
KoSRoBERTa | 81.64 | 81.20 | 82.20 | 81.79 | 82.34 | 81.59 | 82.20 | 80.62 | 81.25 |
KoSentenceBART | 77.14 | 79.71 | 78.74 | 78.42 | 78.02 | 78.40 | 78.00 | 74.24 | 72.15 |
KoSentenceT5 | 77.83 | 80.87 | 79.74 | 80.24 | 79.36 | 80.19 | 79.27 | 72.81 | 70.17 |
KoSimCSE-BERT†SKT | 81.32 | 82.12 | 82.56 | 81.84 | 81.63 | 81.99 | 81.74 | 79.55 | 79.19 |
KoSimCSE-BERT | 83.37 | 83.22 | 83.58 | 83.24 | 83.60 | 83.15 | 83.54 | 83.13 | 83.49 |
KoSimCSE-RoBERTa | 83.65 | 83.60 | 83.77 | 83.54 | 83.76 | 83.55 | 83.77 | 83.55 | 83.64 |
KoSimCSE-BERT-multitask | 85.71 | 85.29 | 86.02 | 85.63 | 86.01 | 85.57 | 85.97 | 85.26 | 85.93 |
KoSimCSE-RoBERTa-multitask | 85.77 | 85.08 | 86.12 | 85.84 | 86.12 | 85.83 | 86.12 | 85.03 | 85.99 |
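
The columns above pair a similarity measure (cosine, Euclidean, Manhattan, dot product) with a correlation coefficient (Pearson, Spearman) against gold similarity labels, and the AVG values in the table equal the mean of the eight metric columns. The following plain-Python sketch shows one way such numbers could be computed. It is not the repository's evaluation script: negating the distances so that larger means more similar is an assumption, and the function names, toy vectors, and gold scores are all made up for illustration.

```python
import math

def cosine_sim(a, b):
    num = sum(x * y for x, y in zip(a, b))
    return num / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def euclidean_sim(a, b):
    # Negated distance (assumption): larger value means more similar.
    return -math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan_sim(a, b):
    # Negated distance (assumption): larger value means more similar.
    return -sum(abs(x - y) for x, y in zip(a, b))

def dot_sim(a, b):
    return sum(x * y for x, y in zip(a, b))

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))

# Toy sentence-pair vectors and gold similarity labels (illustrative only).
pairs = [([1.0, 0.0], [1.0, 0.0]),
         ([1.0, 0.0], [1.0, 1.0]),
         ([1.0, 0.0], [0.0, 1.0])]
gold = [5.0, 3.0, 0.5]

measures = {"cosine": cosine_sim, "euclidean": euclidean_sim,
            "manhattan": manhattan_sim, "dot": dot_sim}

# For each measure: (Pearson x 100, Spearman x 100), matching the table's scale.
results = {}
for name, fn in measures.items():
    scores = [fn(a, b) for a, b in pairs]
    results[name] = (100 * pearson(scores, gold), 100 * spearman(scores, gold))

# AVG: mean of all eight metric values.
avg = sum(v for pair in results.values() for v in pair) / 8
```

On real data, `pairs` would hold the model's embeddings for each test sentence pair and `gold` the human-annotated similarity scores.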