Mitigating Perceptual Judgment Bias in Multimodal LLM-as-a-Judge via Perceptual Perturbation and Reward Modeling


Recent multimodal large language models (MLLMs) have demonstrated strong reasoning ability, yet their reliability as automated evaluators remains limited by a critical weakness: when visual evidence conflicts with textual cues, MLLM judges tend to reward plausible nar... Why it matters: this is a fresh arXiv paper with likely relevance to current AI/ML workflows. Primary citation: http://arxiv.org/abs/2606.02578v1. This item is included as a signal from a recent arXiv submission, not as a settled claim; the facts are still developing.
