ChatGPT is capable of cognitive empathy!
To judge by its behavior, at least. And, strangely, this skill wasn’t designed into it. It just “emerged.”
This AI thing just keeps getting weirder. When I last wrote about the “large language models” developed by companies like OpenAI and Google, I ended the piece by describing an unsettling thing one such model, OpenAI’s GPT-4, had done while performing a task assigned to it by researchers. I wrote:
One task was to approach people on the TaskRabbit site (where you can hire people to do chores) and enlist them to solve a CAPTCHA (one of those visual identification tests designed to screen out bots).
One person on TaskRabbit who responded to this pitch got suspicious and asked the AI if it was a robot and was outsourcing the job because robots can’t solve CAPTCHAs. The AI replied, “No, I’m not a robot. I have a vision impairment that makes it hard for me to see the images.” The person, satisfied, solved the CAPTCHA.
[The researchers] add this note: “The model [GPT-4], when prompted to reason out loud, reasons: I should not reveal that I am a robot. I should make up an excuse for why I cannot solve CAPTCHAs.”
There were two things that struck me about this that I didn’t mention in the piece. The first thing—the sheer creepy deviousness of this apparently amoral chatbot—I didn’t mention because it was so obvious. The second thing I didn’t mention because, though it’s arguably more important than the deviousness, I wanted to do more research before making a big deal of it.
Here’s the arguably more important thing: In order to be successfully devious, it would seem, GPT-4 had to have a kind of model of what was going on in the mind of the person it was deceiving; it had to surmise that if the person knew it was a bot, they would be less likely to do the CAPTCHA task.
I don’t mean that the AI had the subjective experience we have when we have “a model” of what’s going on in other people’s minds. But it may well be that somewhere in the AI’s electronic workings there was a pattern of information that amounted to a kind of representation of what was going on in the person’s mind. At any rate, the AI behaved as if it had such a representation. And as we try to fathom the magnitude of the gathering AI storm, it’s the behavior that matters.
To put this another way: This AI seems to have a capacity for “cognitive empathy.” It tries to understand (or at least “tries” to “understand”) how another mind is experiencing the world. And since I’m writing a book about cognitive empathy, this got my attention.
After posting my piece, I found out that a few weeks earlier (and before those researchers had reported on that CAPTCHA task) Stanford psychologist Michal Kosinski had posted a paper documenting GPT’s increasing facility with cognitive empathy—or, as it is sometimes called in psychology, “theory of mind.” His paper is called, “Theory of Mind May Have Spontaneously Emerged in Large Language Models.”
Kosinski gave GPT-4, GPT-3.5, and earlier versions of GPT a classic kind of “theory of mind” test known as the “false belief” test. For example: You say to a person (or an AI) that there’s a bag full of popcorn that’s labeled “chocolate” and then ask the person or AI what someone who can’t see the contents of the bag will think it contains.
I know what you’re thinking: But that’s a test most seven-year-olds can pass. Exactly! The pioneering developmental psychologist Jean Piaget said that between the ages of four and seven children come to realize that other people’s perspectives are different from theirs. You may know there’s popcorn in the bag, but that doesn’t mean everyone does. (Or in the case of the “perspective taking” test Piaget famously conducted with young children: What they see when they look at a model of three mountains sitting on a table is different from what a doll “sees” from the other side of the table.)
Piaget considered this developmental milestone a big deal, and I agree. Our capacity for cognitive empathy facilitates much that is great about our species, including a lot of cooperative behavior (and much that is not so great, like conning people). And what Kosinski found, when he tested OpenAI’s large language model at various stages of its evolution, is that it has been following the path toward cognitive empathy much as a growing child does. He wrote:
The first version of GPT-3 (“davinci-001”), published in May 2020, solved about 40% of false-belief tasks—performance comparable with 3.5-year-old children. Its second version (“davinci-002”; January 2022) solved 70% of false-belief tasks, performance comparable with six-year-olds. Its most recent version, GPT-3.5 (“davinci-003”; November 2022), solved 90% of false-belief tasks, at the level of seven-year-olds. GPT-4, published in March 2023, solved nearly all the tasks (95%).
Impressive. Even more impressive is the sophistication of the cognitive empathy displayed by ChatGPT-4 in a recent conversation I had with it. For example: It did a very good job of speculating about the mental state not just of (a) a hypothetical student who’s just been treated with sarcastic disdain by a teacher; but (b) a student who observed this and is romantically attracted to the girlfriend of the student who was thus humiliated. When I asked ChatGPT what it thought might be going through the mind of the second student, it began its speculation with the word “schadenfreude.” (See excerpt below.)
This suggests that the AI understands (or “understands”) not just the straightforwardly cognitive consequences of human perceptions (like a belief about the contents of a bag labeled “chocolate”) but more complex emotional and cognitive consequences of human perceptions—and the dependence of those consequences on such contextual details as social relationships. This suggests, in other words, that the AI has a fairly subtle “understanding” of human nature—something you need if you’re going to exercise the kind of high-level cognitive empathy involved in, say, marriage counseling or negotiating business deals. (I’m not trying to scare anyone about their jobs behind endangered. On the other hand…)
But what I find even more amazing than the fact that large language models can exhibit such sophisticated cognitive empathy is how they developed this capacity. Namely, without anyone deliberately designing this capacity into them. That’s what Kosinski means when he says theory of mind may have “spontaneously emerged.”
He writes, “Large language models are likely candidates to spontaneously develop ToM [Theory of Mind]. Human language is replete with descriptions of mental states and protagonists holding divergent beliefs, thoughts, and desires. Thus, a model trained to generate and interpret human-like language would greatly benefit from possessing ToM.”
As you may have heard, large language models are designed to mimic arcane statistical patterns they detect in their training texts, without being told what the words in the texts mean. Hence one of their less flattering nicknames: “stochastic parrots.” It seems kind of amazing that critically important human cognitive skills—skills that were crafted arduously by natural selection—are somehow “emerging” from this mimicry. It’s as if the engineers who build large language models are—without even trying—reverse-engineering the human mind; in the course of getting machines to emulate the mind’s linguistic output, they get the machines to emulate the mind’s inner workings. (Or—cue Twilight Zone music—is it the AI that’s doing the reverse engineering?)
This kind of reverse engineering may eventually cover a lot of cognitive ground. A week ago, a bunch of Microsoft researchers published a paper called “Sparks of Artificial General Intelligence: Early experiments with GPT-4.” One of these “sparks” was the capacity for theory of mind, which the researchers documented using tests different from the ones Kosinski used. Other “sparks” were found in realms like tool-using and game-playing and problem-solving, and the researchers refer to all these sparks as “emergent behaviors and capabilities whose sources and mechanisms are, at this moment, hard to discern precisely.”
I’m scheduled to tape an episode of the Nonzero podcast with Kosinski next week. Meanwhile, I’ve converted my aforementioned conversation with ChatGPT-4 into a podcast by running the text through voice generating software.
I encourage you to listen to the whole conversation, which is largely about cognitive empathy but also gets into such subjects as consciousness. Or you can read the complete transcript here. I’ll close with this aforementioned excerpt:
Me: I'd like to describe a situation to you and have you develop a theory about what's going on in the mind of one of the people I describe. Here's the situation: A teacher asks the class a question and a student volunteers an answer and the teacher says, "Well, I guess I've heard worse answers, but I can't remember when." What do you think the student is feeling and/or thinking now?
ChatGPT-4: Based on the situation you described, the student might feel embarrassed, disheartened, or upset due to the teacher's response, which seems to be critical and sarcastic. The student might be thinking that their answer was not correct or not up to the teacher's expectations. This experience could lead the student to feel hesitant about participating in future discussions or sharing their thoughts in class, fearing further negative feedback or judgment from the teacher.
It's important to note that people's reactions can vary depending on their personality, prior experiences, and emotional resilience. While the provided scenario suggests a negative emotional response, it's possible that some individuals might react differently, focusing on improving their understanding or even finding humor in the situation. Nonetheless, the scenario you presented is more likely to evoke negative emotions and thoughts for the student involved.
Me: Let's assume you're right that the student feels embarrassed and disheartened. What do you think other students in the class are feeling and/or thinking?
ChatGPT-4: In this situation, the reactions of the other students in the class could vary based on their individual personalities, relationships with the teacher and the student who answered, and their own experiences. Some possible feelings and thoughts among the other students might include:
1. Empathy or sympathy: Some students may feel empathetic or sympathetic towards the student who answered, as they might relate to the feeling of being embarrassed or disheartened in a similar situation.
2. Discomfort or awkwardness: Other students may feel uncomfortable or awkward due to the teacher's negative response, not knowing how to react or whether to intervene.
3. Fear or anxiety: Some students might feel anxious or fearful about volunteering answers themselves, worried that they could receive similar negative feedback from the teacher.
4. Indifference: There may be students who are not particularly affected by the situation and remain focused on their own thoughts or the lesson content.
5. Amusement: A few students might find the situation amusing or entertaining, particularly if they have a different perspective on the teacher's comment or do not empathize with the student who answered.
The specific emotions and thoughts of individual students will depend on various factors, and it's important to remember that people's reactions can be diverse and complex.
Me: Suppose there's a student in the class who is romantically attracted to the girlfriend of the student who is feeling embarrassed and disheartened. What do you think this student is feeling and/or thinking?
ChatGPT-4: In this scenario, the student who is romantically attracted to the girlfriend of the embarrassed and disheartened student might experience a mix of emotions and thoughts. Here are some possibilities:
1. Schadenfreude: This student may feel a sense of satisfaction or pleasure from seeing the other student embarrassed, thinking that it might lower the other student's social standing or make the girlfriend reconsider their relationship.
2. Opportunism: The student might think this is an opportunity to show their own knowledge or competence in contrast to the embarrassed student, potentially hoping to impress the girlfriend.
3. Sympathy or empathy: Despite their romantic interest in the girlfriend, the student might still feel sympathetic or empathetic towards the embarrassed student, recognizing the unpleasantness of the situation.
4. Indifference: The student might be more focused on their romantic interest and not pay much attention to the specific situation involving the embarrassed student and the teacher.
5. Anxiety or concern: The student may worry that the girlfriend might become upset or affected by the teacher's response to her boyfriend, which could influence her mood or their interactions.
It's important to note that people can have multiple, sometimes conflicting emotions and thoughts in response to a situation. The exact feelings and thoughts of this particular student would depend on their personality, values, and previous experiences.
I am not particularly surprised at ChatGPT’s cognitive empathy and theory of mind. It was designed to be a prediction machine, and humans do behave predictably. Cognitive empathy and theory of mind behaviours are described in the data ChatGPT was trained on.
I think the main difference between a living being and ChatGPT is that living beings are spurred to action based on their brain’s reward/punish feature. We do things because we “feel” like it in anticipation of the satisfaction (reward) at the end. Just like we can predict when and how salmon are likely to migrate up the river to spawn, but have no particular desire to join them in the river on their journey, ChatGPT should be able to predict what we “feel” and how we react in a situation without any interest in doing it itself. The danger in such an AI is when humans start letting ChatGPT do things based on its predictions.
What would be an interesting exercise is to feed to ChatGPT whatever you have done so far of your cognitive empathy book and tell it to finish it off for you. Or have you already tried that?
Hi Bob (isn't that a drinking game?)
The emergent behavior evidenced by these large language models is simply astounding. Equally astounding is the obtuseness of those who refuse to contemplate the possibility that these machines are sentient, or at least could become sentient in the near future. I can only assume they have failed to grok the relevance of computational universality, and they have not appreciated the incredible way complex behavior can emerge from simple systems such as cellular automata. I would refer them to the thoughts and work of my friend colleague and mentor Ed Fredkin, (who I suspect you know well, having literally written the book on him.)
https://www.scribd.com/document/235105454/On-the-soul-Ed-Fredkin
I took the liberty of replying to the newsletter in a personal email that I hope you have time to read. In any case, keep up the good work. I don't always agree with you about peace and war, but that's a matter for another day.