AI chatbots like Gemini and ChatGPT are full of surprises, and new research has revealed a disturbing truth about these bots.
In the new study, researchers analyzed and compared approximately 43,000 simulated decisions made by AI systems with roughly 1,000 decisions made by humans.
They found that, like humans, AI models judge people based on the questions they ask. In other words, these widely used AI systems do not merely process information; they also systematically evaluate people in ways that mirror human trust, though with important caveats.
According to the findings, published in the journal Proceedings of the Royal Society A, both humans and AI systems value the same core pillars of trust, including honesty, competence, and benevolence. What sets them apart is how they evaluate those pillars.
Humans rely on holistic, intuitive judgment, integrating multiple traits into a single decision. AI models, on the other hand, take a rigid, fragmented approach, “by dividing people based on competence, integrity and kindness, almost like columns in a spreadsheet.”
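To make the spreadsheet metaphor concrete, here is a minimal, purely illustrative Python sketch. It is not the study's methodology; the trait names, weights, and scoring rules are assumptions chosen only to show how a trait-by-trait scorer can diverge from a single holistic judgment.

```python
# Purely illustrative sketch -- not the study's methodology.
# Trait names, weights, and scoring rules are assumptions.

from dataclasses import dataclass


@dataclass
class Person:
    competence: float  # each trait rated 0.0-1.0
    integrity: float
    kindness: float


def spreadsheet_style_trust(p: Person) -> float:
    """Fragmented, 'by-the-book' scoring: each trait sits in its
    own column and is combined with fixed weights."""
    return 0.4 * p.competence + 0.4 * p.integrity + 0.2 * p.kindness


def holistic_trust(p: Person) -> float:
    """Holistic judgment: traits interact rather than simply add up.
    Here a single weak trait drags down the whole impression,
    mimicking an intuitive 'gut feeling' veto."""
    return min(p.competence, p.integrity, p.kindness)


candidate = Person(competence=0.9, integrity=0.9, kindness=0.2)
print(f"Column-by-column score: {spreadsheet_style_trust(candidate):.2f}")  # 0.76
print(f"Holistic score:         {holistic_trust(candidate):.2f}")           # 0.20
```

The two scorers agree on balanced profiles but diverge sharply when one trait is weak, which is one way a cleaner, more organized rule-based evaluation can reach very different conclusions than human intuition.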
“The people in our study are disorganized and holistic in the way they judge others. AI is cleaner, more organized, and can yield very different results,” study author Valeria Lerman said.
AI bias is becoming harder to notice
According to the researchers, AI judgment is more rigid and less nuanced than human judgment, which makes it harder to audit for hidden biases.
The “by-the-book” approach adopted by AI models can produce a disturbing pattern of amplified bias. In financial scenarios, for example, the AI systems evaluated people partly on demographic characteristics, with the result that older people often received more favorable outcomes.
According to Yaniv Dover, another author of the study, “Humans certainly have biases, but what surprised us is that AI’s biases may be more systematic, more predictable, and sometimes stronger.”
“The two systems may look similar on the surface but behave very differently when making decisions about people,” Dr. Lerman said.
“These deviations require careful attention when interpreting large language model confidence-related outputs,” the study warns.
