⚠️ Biases to watch out for when using LLMs as judges.
- Position bias: the model tends to prefer whichever answer is shown first. A common fix is to run the comparison twice with the order swapped (A vs. B, then B vs. A) and take the majority verdict.
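The order-swapping fix can be sketched in a few lines. This is a minimal, runnable sketch: `call_judge` is a hypothetical stand-in for a real LLM API call, stubbed here (it just prefers the longer answer) so the control flow can execute.

```python
def call_judge(answer_first: str, answer_second: str) -> str:
    """Hypothetical judge: returns 'first' or 'second'.
    Stubbed to prefer the longer answer, standing in for a real LLM call."""
    return "first" if len(answer_first) >= len(answer_second) else "second"

def debiased_compare(a: str, b: str) -> str:
    """Judge A-vs-B and B-vs-A; accept the verdict only if both orders agree."""
    v1 = call_judge(a, b)  # A shown first
    v2 = call_judge(b, a)  # B shown first
    winner1 = "A" if v1 == "first" else "B"
    winner2 = "B" if v2 == "first" else "A"
    if winner1 == winner2:
        return winner1
    return "tie"  # verdict flipped with order: position-dependent, treat as a tie

print(debiased_compare("short answer", "a much longer, more detailed answer"))
```

Note that with two judgments "majority" degenerates to agreement: if the two orders disagree, the result is order-dependent and is best recorded as a tie (or re-run with more samples).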
- Verbosity bias: the model tends to prefer longer, more detailed answers regardless of whether the content is actually better. Counter it by addressing length in the rubric or by explicitly penalizing it.
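One way to address length in the rubric is to spell it out in the judge prompt. A sketch of what that might look like; the wording is illustrative, not a canonical prompt.

```python
# Illustrative rubric that explicitly neutralizes verbosity bias.
RUBRIC = """You are comparing two answers to the same question.
Judge ONLY on correctness, relevance, and clarity.
Do NOT reward an answer for being longer or more detailed
unless the extra detail is needed to answer the question.
If both answers are equally correct, prefer the more concise one.
Reply with exactly 'A' or 'B'."""

def build_judge_prompt(question: str, answer_a: str, answer_b: str) -> str:
    """Assemble the full prompt sent to the judge model."""
    return (f"{RUBRIC}\n\nQuestion: {question}\n\n"
            f"Answer A: {answer_a}\n\nAnswer B: {answer_b}")
```

An alternative is a hard penalty after the fact, e.g. subtracting a small amount from the judge's score per token over a budget, but an explicit rubric instruction is the simpler first step.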
- Self-enhancement bias: the model prefers answers that resemble what it would have generated itself. If the model produced a given sentence, it probabilistically believed that was the best output, so as a judge it naturally favors its own style. To avoid this, use different models for generation and evaluation.
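Separating the two roles is mostly a configuration decision. A minimal sketch, where the model names and the `complete` function are placeholders rather than a real API:

```python
GENERATOR_MODEL = "model-a"  # produces the candidate answers
JUDGE_MODEL = "model-b"      # evaluates them; deliberately a different model

def complete(model: str, prompt: str) -> str:
    """Placeholder for a real LLM API call."""
    return f"[{model}] response to: {prompt}"

def generate_and_evaluate(question: str) -> str:
    """Generate with one model, then judge with another."""
    answer = complete(GENERATOR_MODEL, question)
    return complete(JUDGE_MODEL, f"Rate this answer: {answer}")
```

Keeping the two model names as separate, explicit settings makes it hard to accidentally let a model grade its own output.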
