layout | title | parent | has_children |
---|---|---|---|
default | roberta-base | Rankings | true |
[comment]: # (This page contains a link to a table with the ranking and performance of all ranked roberta-base models. In addition, it contains a table with the baseline and the 10 best models. The original ranking was done by finetuning only the classification head of the model (linear probing) over the MNLI dataset. The best models by this ranking were then ranked by their average accuracy after finetuning over the 36 datasets (except for the stsb dataset, where we used the Spearman correlation instead of accuracy).)
Ranking and performance of all 1277 ranked roberta-base models (full table). The top 386 models by linear-probing score were fully tested, i.e. finetuned on each of the 36 datasets.
Notes:
- The baseline results can be found here
- While the average improvement is small, many datasets show large gains
- ColD Fusion variations were removed to avoid cluttering the table
rank | model_name | avg | mnli_lp | 20_newsgroup | ag_news | amazon_reviews_multi | anli | boolq | cb | cola | copa | dbpedia | esnli | financial_phrasebank | imdb | isear | mnli | mrpc | multirc | poem_sentiment | qnli | qqp | rotten_tomatoes | rte | sst2 | sst_5bins | stsb | trec_coarse | trec_fine | tweet_ev_emoji | tweet_ev_emotion | tweet_ev_hate | tweet_ev_irony | tweet_ev_offensive | tweet_ev_sentiment | wic | wnli | wsc | yahoo_answers |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
baseline | roberta-base | 76.22 | nan | 85.28 | 89.77 | 66.58 | 50.35 | 78.69 | 67.77 | 83.53 | 48.70 | 77.30 | 90.99 | 85.11 | 93.90 | 72.47 | 86.98 | 87.87 | 61.22 | 83.94 | 92.41 | 90.71 | 88.42 | 72.40 | 94.12 | 56.68 | 89.92 | 97.11 | 87.76 | 46.30 | 81.82 | 52.89 | 71.56 | 84.55 | 71.03 | 65.48 | 54.79 | 63.27 | 72.40 |
1 | ibm/ColD-Fusion | 78.47 | 86.09 | 85.82 | 89.80 | 66.26 | 51.94 | 81.38 | 87.50 | 83.32 | 72.00 | 78.63 | 91.14 | 88.10 | 93.86 | 73.53 | 87.30 | 87.01 | 63.72 | 85.58 | 92.40 | 91.11 | 91.84 | 85.20 | 95.41 | 56.38 | 91.30 | 97.00 | 90.40 | 46.31 | 83.04 | 54.44 | 77.93 | 85.93 | 70.43 | 68.65 | 47.89 | 60.58 | 71.87 |
2 | gustavecortal/roberta_emo | 78.47 | 84.87 | 85.82 | 90.23 | 66.08 | 52.16 | 81.62 | 89.29 | 83.41 | 71.00 | 77.50 | 90.70 | 86.10 | 93.78 | 73.01 | 86.82 | 88.24 | 64.07 | 88.46 | 92.88 | 90.95 | 91.37 | 83.39 | 95.76 | 57.47 | 91.51 | 97.20 | 91.20 | 45.99 | 82.48 | 52.49 | 75.64 | 86.63 | 70.87 | 68.50 | 46.48 | 63.46 | 72.27 |
3 | jakub014/ColD-Fusion-finetuned-convincingness-acl2016 | 78.39 | 84.05 | 86.13 | 89.17 | 66.68 | 52.22 | 81.44 | 85.71 | 82.84 | 67.00 | 77.77 | 91.10 | 85.60 | 93.56 | 71.84 | 87.53 | 87.50 | 64.03 | 91.35 | 93.21 | 91.28 | 91.93 | 86.28 | 95.41 | 58.28 | 91.34 | 97.20 | 88.80 | 46.49 | 83.46 | 54.95 | 73.98 | 85.93 | 70.88 | 66.93 | 49.30 | 62.50 | 72.50 |
4 | jakub014/ColD-Fusion-finetuned-convincingness-IBM | 78.36 | 85.08 | 85.98 | 89.37 | 67.26 | 51.31 | 81.56 | 89.29 | 82.84 | 75.00 | 77.00 | 91.26 | 87.60 | 94.18 | 72.43 | 87.23 | 89.22 | 64.17 | 88.46 | 92.22 | 90.90 | 91.46 | 85.20 | 95.53 | 57.65 | 91.52 | 97.40 | 87.40 | 46.24 | 82.83 | 54.01 | 75.51 | 85.23 | 69.85 | 66.77 | 47.89 | 57.69 | 71.60 |
5 | janeel/muppet-roberta-base-finetuned-squad | 78.04 | 83.24 | 84.89 | 89.67 | 67.16 | 53.59 | 82.39 | 82.14 | 81.88 | 62.00 | 77.77 | 91.34 | 85.60 | 94.12 | 72.95 | 86.55 | 89.46 | 64.25 | 87.50 | 92.70 | 91.00 | 90.71 | 83.75 | 95.99 | 58.14 | 91.29 | 97.00 | 90.60 | 46.46 | 82.20 | 54.38 | 80.10 | 84.88 | 71.85 | 70.22 | 39.44 | 63.46 | 71.93 |
6 | mwong/roberta-base-climate-evidence-related | 77.21 | 55.09 | 85.13 | 89.93 | 66.54 | 50.22 | 72.40 | 77.70 | 83.22 | 84.60 | 77.70 | 89.65 | 84.60 | 93.99 | 73.14 | 87.12 | 89.96 | 87.12 | 83.65 | 92.29 | 89.93 | 88.93 | 72.20 | 95.07 | 54.71 | 73.14 | 96.80 | 87.40 | 46.57 | 81.42 | 51.65 | 71.43 | 85.12 | 70.34 | 54.93 | 54.93 | 63.46 | 72.40 |
7 | k4black/roberta-base-e-snli-classification-nli-base | 77.06 | 80.54 | 85.42 | 89.30 | 66.54 | 51.66 | 79.88 | 78.57 | 83.32 | 59.00 | 77.30 | 90.70 | 86.20 | 93.96 | 73.21 | 86.80 | 85.78 | 62.09 | 81.73 | 92.35 | 91.04 | 88.09 | 80.14 | 94.38 | 56.47 | 90.97 | 97.80 | 87.60 | 46.40 | 81.42 | 52.83 | 68.88 | 83.60 | 69.36 | 69.75 | 56.34 | 63.46 | 71.77 |
8 | facebook/muppet-roberta-base | 77.00 | 84.75 | 90.00 | 89.77 | 86.50 | 52.59 | 82.17 | 80.36 | 81.21 | 65.00 | 85.17 | 52.59 | 46.10 | 91.74 | 73.01 | 93.04 | 88.97 | 64.15 | 94.14 | 84.48 | 91.25 | 58.10 | 39.44 | 67.06 | 94.84 | 91.58 | 85.58 | 96.80 | 82.76 | 51.11 | 76.02 | 84.77 | 71.57 | 87.07 | 66.61 | 91.10 | 63.46 | 71.90 |
9 | WillHeld/roberta-base-mnli | 76.93 | 86.22 | 83.48 | 90.07 | 84.50 | 50.75 | 80.18 | 82.14 | 80.63 | 72.00 | 77.43 | 50.75 | 45.65 | 92.98 | 70.34 | 91.76 | 88.48 | 62.81 | 81.73 | 82.67 | 91.20 | 86.21 | 57.75 | 65.84 | 94.15 | 89.82 | 96.00 | 85.00 | 78.61 | 52.15 | 70.15 | 83.60 | 70.23 | 86.98 | 66.77 | 89.86 | 65.38 | 71.27 |
10 | deepakvk/roberta-base-squad2-finetuned-squad | 76.89 | 61.13 | 85.41 | 89.37 | 66.62 | 52.22 | 79.11 | 69.64 | 82.74 | 55.00 | 77.60 | 90.65 | 88.80 | 93.43 | 71.84 | 86.49 | 88.24 | 63.51 | 85.58 | 92.84 | 90.69 | 87.52 | 77.26 | 93.12 | 56.61 | 90.09 | 97.80 | 89.00 | 45.60 | 81.14 | 53.50 | 71.56 | 83.84 | 70.01 | 69.59 | 56.34 | 63.46 | 72.00 |
Download full models ranking table: [csv](./results/roberta-base_table.csv)
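
The note above says the average improvement is small while individual datasets can show large gains. Below is a minimal sketch of how to check this by computing per-dataset differences against the baseline. It uses a small excerpt copied from the table above (the `roberta-base` baseline and `ibm/ColD-Fusion`, a few columns only) rather than the downloadable file. The helper name `gains_over_baseline` is hypothetical, and the assumption that the CSV uses the same column names as the table header has not been verified.

```python
import csv
import io

# Excerpt of the table above: baseline and the top-ranked model, a few columns only.
# Assumption: the downloadable CSV uses the same column names as the table header.
SAMPLE_CSV = """model_name,avg,cb,copa,rte,wnli
roberta-base,76.22,67.77,48.70,72.40,54.79
ibm/ColD-Fusion,78.47,87.50,72.00,85.20,47.89
"""


def gains_over_baseline(rows, baseline_name="roberta-base"):
    """Return {model_name: {column: score - baseline score}} for non-baseline rows."""
    by_name = {row["model_name"]: row for row in rows}
    baseline = by_name[baseline_name]
    gains = {}
    for name, row in by_name.items():
        if name == baseline_name:
            continue
        gains[name] = {
            col: round(float(row[col]) - float(baseline[col]), 2)
            for col in row
            if col != "model_name"
        }
    return gains


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
gains = gains_over_baseline(rows)

# Print each column's difference from the baseline, largest gain first.
for model, deltas in gains.items():
    for col, delta in sorted(deltas.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{model:20s} {col:6s} {delta:+.2f}")
```

To run this on the full table, the same function could be given `csv.DictReader(open(path))` rows, provided the file has a `model_name` column. Cells containing `nan` (such as the baseline's `mnli_lp`) would yield `nan` differences and may need to be filtered out first.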