60 Comments
Feb 23 · Liked by Robert Wright

My guess is that a human who knows the name of Tom Cruise's mother could still have trouble going the other way around if asked who her son is. I think of Tom Cruise as a big word balloon in somebody's head with a bunch of factoids springing from it, whereas his mom is just one of those factoids and is therefore harder to index when mentioned out of context.


Gemini is tougher on itself than you are!

https://g.co/gemini/share/46031316c8ca

Feb 23 · edited Feb 23 · Liked by Robert Wright

On Searle: I haven't read his arguments for a long time, but when I did, with a friend, sometime around 1995, our impression after a long deciphering process was that it was really about consciousness, without Searle saying so out loud. To us, the Chinese Room sounded like a critique of functionalism as a theory of mind.

On understanding: I'd define it even more loosely than Bob does, as something like "a cognitive system integrating incoming information (perception) into its representations in a way that allows appropriate action". The point there is information meeting a representation.

Then there can be many levels of understanding, depending on how sophisticated a system's representations are. At the extreme, in everyday life, people sometimes describe even ordinary computers as "not understanding something".

From this viewpoint, my feeling about the LLM "understanding" discussion is that it's mostly about:

1. Some people have a philosophical fixation on what it means to _really_ understand, in a somewhat or fully metaphysical sense. It may be about "semantic grounding", consciousness, intentionality, and such. Linguists seem to have an issue with grounding and semantics in a way that I, as a non-linguist, don't get.

2. The other thing is that LLMs have representations, which is self-evident, but some people want to emphasize that those representations are inaccurate or superficial, while others want to emphasize that they are surprisingly accurate, deep, or general. But shallowness and inaccuracy have no common measure, so this is little more than pointing fingers in different directions.

The third relevant issue is the architecture of LLMs, though people arguing about LLMs don't talk about it much. LLMs generate one token at a time, so their internal state is about that token only, conditioned on the "tape" they have already output; the tape is their working memory. And they cannot edit the tape, so there is no iteration or internal reflection layer. If you want that from an LLM, you need to use chain-of-thought (CoT) prompting or some such, to simulate self-reflection on the tape. That is all very counterintuitive to us, but it explains many of the stupid errors. LLMs are intuition machines: they spit out whatever comes to mind without the ability to think it through before saying it.

They still must have a plan for the output in some sense, a superposition of meaning always concentrated in the prediction of the next token: a probability distribution that collapses a little with each token, as the tape conditioning them grows longer and longer.
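For concreteness, here is a minimal sketch of that token-by-token loop. It assumes the Hugging Face transformers library and the small gpt2 checkpoint purely for illustration; the prompt, output length, and sampling choice are arbitrary, and nothing here is specific to any particular production model.

```python
# Autoregressive decoding: at each step the model produces only a probability
# distribution over the next token, conditioned on the tape generated so far.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

# The "tape": the prompt plus everything generated so far.
tape = tokenizer("The Chinese Room argument claims that", return_tensors="pt").input_ids

with torch.no_grad():
    for _ in range(20):
        logits = model(tape).logits[:, -1, :]         # scores for the next token only
        probs = torch.softmax(logits, dim=-1)         # the "superposition" over the vocabulary
        next_token = torch.multinomial(probs, 1)      # collapse to a single token
        tape = torch.cat([tape, next_token], dim=-1)  # the tape is the only working memory

print(tokenizer.decode(tape[0]))
```

In this picture, chain-of-thought prompting simply arranges for the intermediate reasoning to be written onto the same tape before the final answer, rather than adding a separate reflection mechanism.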


A sighted man, a blind man and an LLM walk into a bar. The sightless bartender has a standing policy, for purposes of jokes or thought experiments, of giving free drinks to whoever best evokes in him an understanding of what it means to be sighted.

Each takes a turn telling tales of blindlessness to the blind.

Who wins? Who understands?


Bob,

This article illustrates three things that I find very frustrating in your writing on AI:

1. Ipse dixit arguments. In arguing that AI has "semantics," you place great emphasis on what you call the "map of semantic space." But giving it the name "map of semantic space" is just labeling. Someone else might give it a different label. To demonstrate that AI systems have semantics in the way that humans do, you need to do more than define AI's operations as involving "semantic space." To truly prove your case, you need to rely on the substance of how the systems operate--and to explain what, specifically, you are referring to when you argue that AI systems are doing more than pattern recognition / pattern application.

2. Confusing correlation with "understanding." AI systems generate speech that often appears consistent with our understanding of words--that is, AI systems generate speech that often correlates with our understanding of the underlying words in the text. That is no surprise, given humans speak and write in ways that reflect their understanding of words, and the AI systems follow the patterns of human speech and writing contained in the training data. You are asserting that the remarkable ability of AI systems to generate text that correlates with our understanding means that they must *have* understanding that is at least in some sense similar to ours. But correlation alone does not prove your case, especially given point 3 below.

3. Not engaging with the counterexamples. Gary Marcus gave you a host of examples that demonstrated that the fact that AI systems generate text that often correlates with our understanding of the underlying meaning of words does not mean that they have understanding in the way that we do. He showed, again and again, that when you take an LLM system outside of the patterns in its training data, it quickly falls apart--and does so in ways that demonstrate a lack of command of the underlying meaning of the words. Beyond the Tom Cruise example, you simply don't engage with those types of cases--and you basically say, "well someday computer scientists may fix those problems, and I bet they'll do so in ways similar to how humans reason." See point 1 above.

I've separately commented on other aspects of your AI positions, and I won't belabor those critiques here. I find your writing on AI particularly frustrating given how much respect I have for you. I think you're an incredible thinker and writer. I'll keep reading and listening because of that.


Granted, Searle's Chinese Room argument is dead, AI has some kind of multi-dimensional vector semantic mapping, and the issue of consciousness may be barking up the wrong tree. But AI does not have even a toddler's understanding of the real (physical) world. As our human understanding develops over roughly the first 10 years of childhood, we gradually gain object permanence, a logical sense of cause and effect, and conservation of weights and volumes. OpenAI's Sora, as demonstrated just this month, shows weird and impossible things, like a basketball player making a turn-around jump shot with the player's head briefly turning into a second basketball halfway through, or a grandma in the kitchen reaching to stir her dish while the spoon magically appears in her hand as she reaches toward the bowl. But I suppose that, to be fair, I should give AI another 10 years before evaluating whether it is capable of a human level of understanding.


No they don't.

Based on the prompt, they assign probabilities to the words related to the input and pick the words with the highest probability.

LLMs are still stochastic parrots.

Training does not add to their "understanding"; more and more training data simply makes those probabilities slightly better.
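Whatever one makes of the "stochastic parrot" framing, the mechanical claim here can be made concrete with a toy example. Everything below (the five-word vocabulary, the target word, the learning rate) is an arbitrary assumption, chosen only to show that a training signal nudges next-token probabilities rather than adding any new mechanism.

```python
# A few gradient steps on a made-up five-word vocabulary: the only effect
# is that probability mass shifts toward the observed continuation.
import torch

vocab = ["parrot", "understands", "predicts", "thinks", "computes"]
logits = torch.zeros(len(vocab), requires_grad=True)  # untrained: uniform probabilities
target = torch.tensor(2)                              # pretend the training text continues with "predicts"
optimizer = torch.optim.SGD([logits], lr=0.5)

for _ in range(3):
    loss = torch.nn.functional.cross_entropy(logits.unsqueeze(0), target.unsqueeze(0))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    print([round(p, 3) for p in torch.softmax(logits, dim=-1).tolist()])
```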


It seems this AI thing has just revealed that everyone had a different definition of "understanding".

I agree that any definition that invokes consciousness is pretty much useless for answering the question "Do AI's understand?" - because there's just no way to know if anyone but me is conscious (and even that is questionable...).

But I'd argue that Bob's definition isn't much better:

' “Understanding” means employing structures of information processing that are functionally comparable to the structures of information processing that, in the human brain, are critical to understanding. '

Whose human brain? Does this definition imply that humans are incapable of not understanding things? Why invoke humans at all in the definition? What does "functionally comparable" even mean here? How do you functionally compare structures? How could you ever quantify that? Why does the definition have "understanding" in it? What does THAT "understanding" mean?

Colloquially, when we say someone "understands" something, we're not talking about their brain structures - we're talking about their ability to predict something despite not having seen that exact thing before. Someone who "understands" Putin's mindset would have been more likely to predict the invasion of Ukraine. An AI who "understands" people would be able to predict how they might react to being subtly insulted. Kepler "understood" planetary motion because he could say where a planet was going to be in the future. Confronted with a new planet (its speed and trajectory), he would also be able to predict its motion.

So all that said - here is my humble proposal for the definition of "understanding":

"Understanding" is the ability to predict an outcome, given a context, despite having never observed that context before. The degree of understanding is a product of the correctness of the prediction and the degree to which past observed contexts have been dissimilar from the current one.

So if you're able to predict an outcome correctly because you've seen exactly that thing happen before: low understanding.

If you've never seen this context before and you are unable to predict the outcome: low understanding.

If you've never seen this context before and you are still able to predict the outcome: high understanding.

Humans need not be involved in the definition; they just tend to be good at finding parsimonious explanations for things, and thereby at generalising well outside the bounds of the contexts they've seen so far.
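Here is one hypothetical way to operationalize that definition. The function name, the clamping, and the distance measure are my own illustrative assumptions; the only thing taken from the proposal above is the product of prediction correctness and context novelty.

```python
# Understanding as correctness weighted by how novel the context is relative
# to everything the predictor has already seen.
from typing import Callable, Sequence

Vector = Sequence[float]

def understanding_score(
    correctness: float,               # 0.0 (completely wrong) .. 1.0 (exactly right)
    context: Vector,                  # the current context, as a feature vector
    seen_contexts: Sequence[Vector],  # contexts observed so far
    distance: Callable[[Vector, Vector], float],
) -> float:
    if not seen_contexts:
        novelty = 1.0                 # nothing seen before: maximally novel
    else:
        nearest = min(distance(context, c) for c in seen_contexts)
        novelty = min(1.0, nearest)   # clamp so the score stays in [0, 1]
    return correctness * novelty      # high only if correct *and* unfamiliar

euclidean = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

# A correct prediction in a context nearly identical to past experience scores low;
# the same correctness in a distant context scores high.
print(understanding_score(0.9, [1.0, 1.0], [[1.0, 1.1]], euclidean))  # ~0.09
print(understanding_score(0.9, [5.0, 5.0], [[1.0, 1.1]], euclidean))  # 0.9
```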


It's a safe hedge to put single quotation marks around 'understand' in the title, because saying that LLMs understand things is like saying calculators 'understand' math. I just finished reading Max Bennett's convincing "A Brief History of Intelligence," in which the author discusses how understanding partially arises from mammals' ability to simulate models of the world in the mind/imagination. In humans, this simulation of the concrete world contextualizes language use. Gary Marcus wrote about a case in which DALL-E can't even create an image that hides an elephant in the background. The issue? It doesn't really understand what it's being asked to do. It cannot simulate this possibility, because an image like that isn't in its training data. This 'training data' wasn't programmed into the human brain either, but we can easily imagine such an image and easily recognize it should an artist produce one. That's real understanding, going beyond statistical probabilities.


The fact that these systems are multimodal is totally irrelevant to the questions about understanding (and intentionality, for that matter). We can grasp abstract functions, and when we do we can distinguish logically incompatible but empirically equivalent functions. This ability is essential for understanding (in any classical sense, at least).

There is no evidence that LLMs--multimodal or not--can do this. In fact, there is plenty of evidence that they cannot. That these systems can generate outputs matching those generated by systems that do understand is obviously not evidence of understanding; otherwise we would have to say that hand calculators (or abacuses, for that matter) understand arithmetic.
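For what it's worth, the "logically incompatible but empirically equivalent" point can be made concrete with a toy example in the spirit of Kripke's "quus": two functions that agree on every case actually observed yet diverge elsewhere. The cutoff and the observed cases below are arbitrary assumptions.

```python
# Two functions that are empirically equivalent on the observed cases but
# logically incompatible: matching observed outputs does not settle which
# function is being computed, let alone whether it is grasped.
def plus(x: int, y: int) -> int:
    return x + y

def quus(x: int, y: int) -> int:
    return x + y if x < 1000 and y < 1000 else 5

observed = [(2, 3), (10, 7), (400, 599)]  # everything seen "in training"
print(all(plus(x, y) == quus(x, y) for x, y in observed))  # True: empirically equivalent
print(plus(2000, 2000), quus(2000, 2000))                  # 4000 vs 5: logically incompatible
```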


I disagree. A Turing machine, and all equivalent computers, are strictly sequential machines. That limits simultaneous large-scale data interactions, which are, IMO, a necessary precondition for consciousness. Do you really think a Turing tape, clicking left sometimes and right sometimes over a very long period, is a basis for consciousness? At each click (or each instruction cycle of a computer) very little is happening: only a few registers or memory locations change. Between clicks nothing is happening, just a static data configuration. Despite arriving at a deterministic computation result after many clicks, or cycles, there is no moment where consciousness is possible. The Chinese Room argument actually closely parallels this. Of course, I am arguing that consciousness is essential to declaring that a machine "has understanding". As a humorous aside, I will accept that AI is truly here when it can handle customer service calls as well as a competent human would. At present there is nothing remotely like that.
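To make the "very little happens per click" picture concrete, here is a minimal sketch of a single Turing-machine step. The transition table and input are arbitrary assumptions (the machine just overwrites 0s with 1s); the point is only that each step touches one tape cell, the head position, and the control state.

```python
# One Turing-machine "click": read a symbol, write a symbol, move the head,
# change state. Nothing else in the configuration changes.
from typing import Dict, Tuple

# (state, symbol) -> (new_symbol, move, new_state); move is -1 (left) or +1 (right)
RULES: Dict[Tuple[str, str], Tuple[str, int, str]] = {
    ("scan", "0"): ("1", +1, "scan"),
    ("scan", "1"): ("1", +1, "scan"),
    ("scan", "_"): ("_", -1, "halt"),
}

def step(tape: Dict[int, str], head: int, state: str) -> Tuple[int, str]:
    symbol = tape.get(head, "_")
    new_symbol, move, new_state = RULES[(state, symbol)]
    tape[head] = new_symbol
    return head + move, new_state

tape, head, state = {0: "0", 1: "1", 2: "0"}, 0, "scan"
while state != "halt":
    head, state = step(tape, head, state)
print(tape)  # {0: '1', 1: '1', 2: '1', 3: '_'}
```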


Bob, you've occasionally given the impression that you think consciousness might be substrate-independent, but it occurred to me halfway through reading this that I couldn't recall you ever having explicitly said so. (You might say, since you frequently emphasize the fact that subjective experience is undetectable by third-party observers, that it's not really worth making a stand either way, since you can't make a clear argument for it.) Regardless, do you have any strongly held beliefs or intuitions about a) whether non-organic-brain-based processing systems can have subjective experience and, if so, b) whether the current LLMs are at that point?


Bob, commenters like Tom have reprinted ChatGPT comments that directly contradict your AI positions. So, are you saying, "yes, AI 'understands' things, but it doesn't understand itself?"

And if you hold that position, why? At a nuts-and-bolts level, why would ChatGPT "understand" but not understand itself?


This discussion brings to mind Bob's discussion of Lemoine and his finding that "The transcript was indeed kind of amazing," while others found it rather trivial. I suppose that depended on one's viewpoint concerning 'consciousness'. I think we have a similar case here with 'understanding'. Nevertheless, I just reviewed my yearly subscription hoping for fewer posts on this matter. (And really, the promises of AI that I was reading about in 1981 are still far away in 2024.)


The "Chinese Room" argument never held water. Instead of a machine processing the symbols, suppose it was a person who could read and write Chinese. Now, obviously, there is some other entity in the room that can process these symbols with a complete and full understanding and consciousness. That other person fully understands the requests and responses, even though John Searle does not.
