Better averaging of judges ratings #1281

epugh · 2025-03-15T12:23:27Z

Is your feature request related to a problem? Please describe.
Today we just straight up average. @david-fisher doesn't love this!

Describe the solution you'd like
If I have 3 or more ratings, then I take the three highest ratings.

If all three agree, I am happy and use that value.

If all three do NOT agree, then I take the min of all three.
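
A rough sketch of that rule in Python, not a definitive implementation. The function name `combine_ratings` is hypothetical, and the behavior with fewer than three ratings is an assumption (this request doesn't specify it); the sketch falls back to the plain average we use today.

```python
from statistics import mean


def combine_ratings(ratings):
    """Combine several judges' ratings for one query/doc pair into one value."""
    if not ratings:
        return None  # assumption: no ratings means no combined rating

    if len(ratings) < 3:
        # Assumption: with fewer than three ratings, keep the plain average.
        return mean(ratings)

    # Take the three highest ratings.
    top_three = sorted(ratings, reverse=True)[:3]

    if len(set(top_three)) == 1:
        # All three agree, so use the shared value.
        return top_three[0]

    # The three disagree, so take the lowest of the three.
    return min(top_three)
```

Since the min of three identical values is that value, both branches reduce to "the third-highest rating"; the branches are kept separate here only to mirror the description.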

jvia · 2025-04-07T23:46:57Z

Curious why this would be a better approach.

david-fisher · 2025-04-08T11:41:04Z

When combining the judgments from multiple judges, we can make a variety of assumptions. If we assume:

1. Judges are generally subject matter experts (SMEs)
2. Judges are rational
3. Judges have individual biases when interpreting relevance

Appealing to assumption 1, we take the three highest of all the available judgments. Note that many additional judgments could all agree with these three. Here we are being optimistic.
If the judges do not agree, we appeal to assumptions 2 and 3. If a rational judge deems the candidate poorer than the other judge or judges do, we trust that judge and use their value. Here we are being both pessimistic about the individual judgment and optimistic about the direction of the bias. That is, we expect judges to overrate in the general case.

As cases grow, and the relative quality of the judges becomes clearer, alternative combinations could be used.
