Really needs a "both wrong" "both ok" option #1

bjj · 2024-02-05T04:04:05Z

After running through some test prompts, I found many instances where nothing separates the two answers (they're both wrong by exactly the same amount or in the same way). There's probably something more statistically valid than picking at random in those cases.

Contextualist · 2024-02-05T06:29:10Z

Thanks for the feedback! The concern is indeed valid. I will need to think about how ties should be handled, though.

I did not implement the "both wrong" / "both ok" options for two reasons:

1. A tie conflicts with the elimination-based tournament process, which lets you first pick the better response among those from each model before comparing responses from different models.
2. Personally, a two-way decision feels less mentally taxing than a three- or four-way one.

I need to think about how to handle ties in elimination matches, or whether to replace elimination with something else.
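
To make the trade-off concrete, here is a minimal sketch of one possible way a tie verdict could coexist with an elimination bracket. It is not what the project currently does. The names `run_elimination` and `judge`, and the verdict strings `"a"`, `"b"`, and `"tie"`, are hypothetical and chosen only for illustration. The assumption is that on a tie one side still advances at random so the bracket can continue, but the match is recorded as half a win for each side instead of a full win for the random pick.

```python
import random
from collections import defaultdict


def run_elimination(candidates, judge, rng=None):
    """Single-elimination bracket that tolerates ties (illustrative sketch).

    `judge(a, b)` is a hypothetical callback returning "a", "b", or "tie".
    Returns (champion, scores), where scores maps candidate -> points.
    """
    if not candidates:
        raise ValueError("need at least one candidate")
    rng = rng or random.Random()
    scores = defaultdict(float)
    remaining = list(candidates)

    while len(remaining) > 1:
        next_round = []
        # Pair up neighbours; with an odd count the last one gets a bye.
        for i in range(0, len(remaining) - 1, 2):
            a, b = remaining[i], remaining[i + 1]
            verdict = judge(a, b)
            if verdict == "tie":
                # Record the tie honestly, then advance one side at random
                # only so that the bracket can keep going.
                scores[a] += 0.5
                scores[b] += 0.5
                next_round.append(rng.choice([a, b]))
            else:
                winner = a if verdict == "a" else b
                scores[winner] += 1.0
                next_round.append(winner)
        if len(remaining) % 2 == 1:
            next_round.append(remaining[-1])
        remaining = next_round

    return remaining[0], dict(scores)


def longer_wins(a, b):
    """Toy judge: the longer string wins, equal lengths are a tie."""
    if len(a) == len(b):
        return "tie"
    return "a" if len(a) > len(b) else "b"


champion, scores = run_elimination(["aa", "b", "ccc", "d"], longer_wins)
```

With this approach the UI could still offer a "both wrong" / "both ok" choice, and the random advancement would no longer be counted as a real preference. Whether half-point scoring is statistically sound for the final ranking is an open question.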
