When Xiaochu Zhang learnt to write computer code as a high-school student, he had a dream: to build a machine that could fool people into thinking it was human. Over the years, he dropped the coding and studied cognitive psychology, eventually becoming a professor in the field at the University of Science and Technology of China.
But the advent of chatbots capable of producing elaborate, coherent texts thanks to artificial intelligence (AI) reminded him of his old goal. “We can talk to an AI just like to a person,” Zhang said. “This surprised me.”
Zhang decided to combine his teenage dream with his current expertise and designed an experiment to explore whether people can distinguish texts produced by humans from those produced by machines. The results were recently published in Advanced Science.
Subconscious cues
He was struck to find that, on a conscious level, people are remarkably bad at spotting which texts were produced by humans and which by chatbots. Even more surprisingly, subconscious brain activity distinguishes far more reliably whether a person or a machine produced each text.
In the experiment, participants were presented with dialogues between a person and an interlocutor — either a chatbot or another person. They were asked to assess the personality of the interlocutor using well-established parameters, and they underwent an fMRI scan that recorded their neural activity as they read the texts. While their brains were being scanned, participants were asked to say whether they thought the interlocutor was a human or a machine.
The researchers found that, no matter who had produced the text, a specific set of regions in the participants’ brains, known as the mentalizing network, was activated. This network works to gauge the personality and intentions of the interlocutor. “When we hear language patterns, we need to infer the inner thoughts and intentions of others by observing or imagining their speech, actions and facial expressions,” said Zhengde Wei, associate research fellow at the University of Science and Technology of China and first author of the paper, “and the mentalizing network is used for that”.
The researchers observed that, while many people were deceived into thinking that machine interlocutors were in fact human, the activity in the mentalizing network usually gave away the true identity of the interlocutor. The sample size is relatively small, so the results will need to be validated with a larger number of chats and assessors, but the implications are potentially far-reaching.
“It’s fascinating,” said Anna Ivanova, postdoctoral researcher at MIT Quest for Intelligence who was not involved in the paper, “because […] you would think that we would activate our language-processing regions when reading those texts, which we do, but in addition to that we automatically construct this mental model of the speaker in our heads”.
The results imply that, at a subconscious level, the brain can distinguish between humans and machines, even if it can’t tell them apart consciously.
A confusing personality
In the personality test, people did assign less consistent values to chatbots than to humans. This is not all that surprising, given that machines learn from many different sources and have no real intentions, tastes, or emotions.
Still, for Ivanova, the personality rating opens the door to designing a practical way to tell humans and machines apart. “What it tells me […] is that people are bad at figuring out whether the chatbot wrote the text or the human [did], but they pick up on some cues that would help them make this decision, they just don’t use them.” Perhaps, she suggested, people could be trained to spot those cues and leverage them to determine the authorship of a text reliably.
The authors also suggest that their experiment could help build chatbots that use language more like humans do, making for more seamless interactions between people and machines. However, teachers are increasingly worried about the potential for fraud if students start handing in machine-produced work that is original (i.e., plagiarism-free) but not written by them.
While developing ways to detect the true authorship might help spot cases of fraud, Zhang is more intrigued by the possibility of AI replacing teachers, rather than students, and by the effect this might have on our brains. “We learn from humans, so we are humans. If we learn from non-humans, we do not know what will happen […] so it’s very dangerous to use AI chats in education because children would unconsciously learn from non-humans,” he argued.
This is why a growing number of voices are calling for legal norms to guide the development of AI tools and channel their huge potential into beneficial outcomes.
“We just need really strong regulations and ethics-based frameworks on how to use them, how to make sure that the big models are not used to deceive people,” Ivanova said. “Figuring out how to minimize the harms and maximize the benefits is probably what we’ll have to do, because stopping this is essentially impossible at this point.”
Reference: Zhengde Wei et al., Implicit perception of differences between NLP-produced and human-produced language in the mentalizing network, Advanced Science (2023). DOI: 10.1002/advs.202203990
Feature image credit: Christin Hume on Unsplash