Beyond Grammar: On the Appearance and Reality of Prediction in the Brain

This blogpost is a continuation of a dialogue with Richard Gipps that started with his comments on Anil Seth's book 'Being You'. Here is his latest response.

I am grateful to Richard Gipps for his continued engagement with me on this issue. I questioned the value of extending this exchange further, particularly since I greatly admire Gipps and have no desire to prolong a dialogue just for the sake of it. However, I think I do have meaningful things to say in response to the points brought up by Gipps in his last post, and this offers an opportunity for further clarification.

#1. Orbits

What I have been trying to say is that when it comes to the movements of objects, there are aspects, or relationships, or facts (if you will) about how things are that transcend any grammatical rule we may employ to talk about them.

For instance, take this rule as expressed by Gipps: “What's properly said to orbit what (the sun orbits the earth, or the earth orbits the sun) depends purely on a decision as to what we set as our reference frame.”

Imagine a small screw (say, from the debris of a satellite) floating in space in Earth’s proximity. From almost any reference point, this screw is in orbit around the Earth, but from the reference point of the screw itself, the rule would say that the Earth can properly be said to move around it, and therefore be said to be in orbit around the screw. This grammatical rule takes relative motion to be the only thing relevant to an orbital relationship. Implicit in the rule is also the idea that any frame of reference has as much validity as any other: Earth’s frame of reference is no more objective or valid than that of the screw.

Even if we stick by this rule and its very counterintuitive assertion that Earth is in orbit around a small screw in space, we can meaningfully say that the rule doesn’t take into account how objects behave in a gravitational field – in particular, the paths objects take in a curved spacetime as specified by Einstein’s field equations. The field equations say things about the curvature of spacetime due to Earth’s mass and the curvature of spacetime due to the screw’s mass, and these in turn say certain things about the paths objects will take in a given spacetime. From Earth’s frame of reference, the screw takes a path based on how Earth has curved the spacetime; from the screw’s frame of reference, the Earth may appear to move around it, but the Earth is not taking a path around the screw in a spacetime that is curved around the screw. Based on our grammatical rule, we may still insist that Earth is in orbit around the screw from the screw’s frame of reference, but any scientific explanation of the gravitational relationship will have to go beyond the grammatical rule to describe motions in a gravitational field.
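To put a rough number on that asymmetry, here is a toy Python sketch comparing the standard gravitational parameter GM, which sets the scale of each body’s gravitational influence (and hence of the spacetime curvature it contributes). The value of G and Earth’s mass are standard figures; the 10-gram mass of the screw is my own assumption for illustration:

```python
# Toy comparison: how strongly does each body curve spacetime?
# GM (m^3/s^2), the "standard gravitational parameter", sets the scale
# of a body's gravitational influence. The screw's mass is an assumed figure.
G = 6.674e-11          # gravitational constant, m^3 kg^-1 s^-2
m_earth = 5.97e24      # mass of Earth, kg
m_screw = 0.01         # an assumed 10-gram screw, kg

gm_earth = G * m_earth
gm_screw = G * m_screw

# The ratio is ~6 x 10^26: the two frames are symmetric for describing
# relative motion, but wildly asymmetric for gravitational dynamics.
print(f"GM(Earth) / GM(screw) = {gm_earth / gm_screw:.3g}")
```

The point of the toy numbers is only that, whatever frame we adopt for describing motion, the dynamical contribution of the two bodies differs by some twenty-six orders of magnitude.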

The lesson: we have to think beyond the grammatical rules. Disagreements regarding what can be said to orbit what may be resolved by agreeing on arbitrary grammatical rules, but there are physical relationships governed by the laws of physics that exist outside of our grammatical rules, and different grammatical rules may be more or less aligned with them. The fact that motion is relative to a frame of reference doesn’t mean that paths taken by objects in spacetime continuum are also condemned to this sort of relativity.

[BTW: At least one astronomical dictionary defines orbit as “The path of a celestial body in a gravitational field [emphasis mine]. The path is usually a closed one about the focus of the system to which it belongs, as with those of the planets about the Sun, or the components of a binary system about their common center of mass… To define the size, shape, and orientation of the orbit, seven quantities must be determined by observation. These are known as orbital elements…” (the semimajor axis, the eccentricity, the inclination, the longitude of the ascending node, the longitude of perihelion, the epoch, and the period.)]

To tie this to the science of perception, we can get into grammatical debates about terms like “prediction”… when can something be properly said to “predict” something?… and these debates are not trivial, because we do need to avoid muddled use, but my contention is that there are processes hypothesized to happen in the brain that some scientists describe using the term “prediction” or “inference”, and these processes need to be taken into account regardless of the grammatical rules we ultimately employ.


#2. Intermediary levels of cognitive scientific terms

Gipps: “to be 100% clear about this: I'm not trying to rule out a priori that enquiries and explanations framed in cognitive scientific terms are possible. My method is different: it's to urge that those who posit such a level a) aren't clear about what they mean, and b) rather look as if they've got in an unwitting muddle.”

That's a helpful clarification.


#3. Computers and Predictions

Gipps: “why is it that we say that the phone is doing something like predicting but that (my imagined) pancreas is not? Well, the only disanalogy I can see between them is that what the phone is involved with, even though of course it knows nothing of it (since it's not a knower), is semantic information or meaning. The marks on the phone's screen count as information because of how we relate to them, because of the place this artefact enjoys in our rich communicative, social, lives.”


This reminds me of something Bennett and Hacker said: “The computer calculates” means no more than “The computer goes through the electromechanical processes necessary to produce the results of a calculation without any calculation.” (Neuroscience and Philosophy: Brain, Mind, and Language, page 151)

I believe Gipps would agree with this statement. And this, I think, points to a deeper disagreement. I will not attempt to resolve this disagreement here – because it is too deep and I’m not sure that it can be resolved here – but it is worth pointing out. Unlike Bennett and Hacker, I don’t think that it is simply the case that the results of a calculation are produced without any calculation; I think that a relationship between abstract mathematical entities is embodied in a physical system. The embodiment of such a mathematical relationship is independent of the place the computer has in human lives. (I suspect this takes us into a sort of Platonism with regards to abstract mathematical entities. Something David Chalmers said in his most recent book Reality+ is on my mind: “The basic idea of structural information as strings of bits is an abstract mathematical idea, but strings of bits gain causal powers once they’re embodied in physical systems, such as punched cards and computers.” Gipps will likely consider this very muddled! That’s okay. We just have to note that a disagreement exists on this point and move on.)

To go back to the Bennett and Hacker quote – aside from the reality or non-reality of calculation – it provides another possible point of disanalogy, although the difference is one of degree: “electromechanical processes necessary to produce the results of a calculation without any calculation.” When we talk of the pancreas predicting in the hypothetical scenario under discussion, the biochemical processes that produce the results of a prediction without any prediction differ considerably in complexity from the corresponding processes in the brain. We don’t, for instance, have to invoke “prediction error” or “internal models” or “updating priors.” Another difference in the case of brain vs pancreas/kidney is that when we hypothesize that the brain predicts something, we see this process as occupying an intermediate explanatory link between the relevant mental phenomenon (say, perception) and neurological processes; even if we hypothesize that the pancreas predicts something, that predictive process is not tied to any greater explanatory role.


#4. Information

Gipps: “The marks on the phone's screen count as information because of how we relate to them”


Non-semantic information exists in many systems in nature, including biological systems. The “genetic code” is one example of such information: the sequence of base pairs determines to a great degree which proteins will be formed by the cell machinery. Computers also carry non-semantic information that can be discovered by any observer capable of such discovery, just as the relationship between DNA and proteins can be discovered by any observer capable of such discovery. If a computer program factorizes a number (breaks it down into its prime factors) – say 6734 into 2 x 7 x 13 x 37 – a physical process has taken place in the circuit of the computer that embodies mathematical relationships, and any observer (including aliens or other computers) capable of recognizing the physical process and the mathematical relationship will be able to discern it.
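As a concrete illustration – a minimal sketch, not any particular machine’s circuitry – here is trial-division factorization in Python. Whatever hardware runs it, the physical process embodies the same mathematical relationship, and that relationship (6734 = 2 × 7 × 13 × 37) holds independently of anyone attributing meaning to the marks on the screen:

```python
def prime_factors(n: int) -> list[int]:
    """Factorize n into primes by trial division."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:   # divide out each prime factor completely
            factors.append(d)
            n //= d
        d += 1
    if n > 1:               # whatever remains is itself prime
        factors.append(n)
    return factors

print(prime_factors(6734))  # -> [2, 7, 13, 37]
```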


#5. Physical Information and Prediction

Gipps: “What I can't yet see, however, is that this notion of physical information is going to get us anywhere when it comes to making sense of what it is for a brain to (in some or other similar-to-our-normal-use-of-the-terms sense) make inferences or predictions.”


I think addressing this difficulty – specifying the process of inference or prediction in non-semantic terms – first requires appreciating that non-semantic “models” and “representations” can exist in the brain. Consider the sensory and motor homunculi: the neurological “maps,” “models,” or “representations” of different parts of the body in the brain, which are employed in motor and sensory functioning. This is a non-semantic form of information. The physical relationship between the homunculus and the body exists independent of any meaning an observer attributes to it.

The second step is appreciating that non-semantic information can be manipulated in a manner that resembles “prediction.” Let’s say I hurt my hand, and I feel pain in my hand. For me to feel pain in my hand, a corresponding neurocognitive process has to take place in which the brain uses the sensory homunculus to generate an experience of pain that is localized to a certain anatomical region. One can speak of the brain using a model to “predict” that the source of the pain is my hand; the fact that it is a prediction only becomes apparent when the prediction goes awry, e.g. when I experience phantom pain in my amputated hand. The brain predicts that the source of the signals in the pain nerves is my hand, but it makes a mistake, because there is no hand there.

Prediction here is a metaphor, but it is a metaphor that nonetheless refers to, or hypothesizes, a process that can be empirically investigated. We may say that, based on XYZ grammatical rules, it is not proper to say that the brain predicts that the pain is coming from the amputated hand. If so, OK, fine, it’s not “prediction” within the context of those grammatical rules, but something is happening and we must call it something, and if the scientists studying the phenomena think “prediction” works well enough, we may as well posit new grammatical rules and call it “prediction”.
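A deliberately crude sketch of the point, with entirely hypothetical channel labels: the “map” here is just a fixed pairing of input channels with body parts, and the mistaken attribution in phantom pain falls out of that pairing persisting after the hand is gone:

```python
# Toy "somatotopic map" - a fixed, non-semantic pairing of nerve channels
# with body parts. The channel names are hypothetical, for illustration only.
body_map = {"hand_afferent": "hand", "knee_afferent": "knee"}

def locate_pain(channel: str) -> str:
    # The map "predicts" the source of an incoming signal. After an
    # amputation the mapping persists, so signals on the hand channel
    # are still attributed to the hand - the "prediction" goes awry.
    return body_map[channel]

print(locate_pain("hand_afferent"))  # -> "hand", even if no hand is there
```

Nothing in this pairing requires an observer to confer meaning on it; the mapping, and its capacity to misattribute, are features of the system itself.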

Let me make this more palatable for Gipps: all he has to accept is that “brain predicts the source of the signals in the pain nerves” refers to neurocognitive processes that produce the result of a prediction without any prediction (akin to Bennett and Hacker on computers calculating). 

A similar sort of thing happens in other cases of perception. We are dealing with the hypothesis that there are models of incoming sensory input in the brain (“predictions”), which are compared to the actual sensory input; the discrepancy between the two is noted, and the discrepancy is then used to update the model. “This is the sense in which unconscious perceptual inference is inference: internal models are refined through prediction error minimization such that Bayesian inference is approximated.” (Hohwy, 2018)
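The hypothesized loop can be caricatured in a few lines of code. This is an illustrative toy, not any specific model from the predictive-processing literature: an internal estimate is compared with the input, and the prediction error – the noted discrepancy – drives the update:

```python
# Minimal prediction-error-minimization loop (an illustrative sketch):
# an internal estimate mu is repeatedly compared with the sensory input,
# and a fraction of the prediction error is used to update the model.
def update_model(mu: float, sensory_input: float, lr: float = 0.1) -> float:
    prediction_error = sensory_input - mu   # discrepancy is noted...
    return mu + lr * prediction_error       # ...and used to update the model

mu = 0.0                     # initial internal estimate
for _ in range(100):
    mu = update_model(mu, sensory_input=5.0)

print(round(mu, 3))          # the estimate has converged on the input
```

Whether we call what this loop does “prediction” is a grammatical question; that a process of this general shape occurs in the brain is the empirical hypothesis.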


Philosopher and cognitive scientist Andy Clark wrote in a widely cited 2013 article about predictive processing: “The best current evidence tends to be indirect, and it comes in two main forms. The first (which is highly indirect) consists in demonstrations of precisely the kinds of optimal sensing and motor control that the “Bayesian brain hypothesis” suggests. Good examples here include compelling bodies of work on cue integration showing that human subjects are able optimally to weight the various cues arriving through distinct sense modalities, doing so in ways that delicately and responsively reflect the current (context-dependent) levels of uncertainty associated with the information from different channels. This is beautifully demonstrated, in the case of combining cues from vision and touch, by Bayesian models such as that of Helbig and Ernst (2007). Similar results have been obtained for motion perception, neatly accounting for various illusions of motion perception by invoking statistically valid priors that favor slower and smoother motions – see Weiss et al. (2002) and Ernst (2010). Another example is the Bayesian treatment of color perception (see Brainard 2009), which again accounts for various known effects (here, color constancies and some color illusions) in terms of optimal cue combination.

The success of the Bayesian program in these arenas is impossible to doubt…

More promising in this regard [establishing the shape of mechanisms] are other forms of indirect evidence, such as the ability of computational simulations of predictive coding strategies to reproduce and explain a variety of observed effects. These include non-classical receptive field effects, repetition suppression effects, and the bi-phasic response profiles of certain neurons involved in low-level visual processing.” (my emphasis)
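The cue-integration results Clark mentions are standardly modeled as reliability-weighted (inverse-variance) averaging: each sensory channel is weighted by how uncertain it currently is, and the combined estimate is more reliable than either alone. A minimal sketch, with made-up numbers:

```python
# Optimal (maximum-likelihood) cue combination: each cue is weighted by
# its reliability, i.e. the inverse of its variance. Numbers are made up.
def combine_cues(x_vision: float, var_vision: float,
                 x_touch: float, var_touch: float) -> tuple[float, float]:
    w_v = 1.0 / var_vision
    w_t = 1.0 / var_touch
    estimate = (w_v * x_vision + w_t * x_touch) / (w_v + w_t)
    combined_var = 1.0 / (w_v + w_t)   # lower than either cue's variance
    return estimate, combined_var

# Vision is the more reliable cue here, so it dominates the estimate.
est, var = combine_cues(x_vision=10.0, var_vision=1.0,
                        x_touch=12.0, var_touch=4.0)
print(est, var)  # estimate lands nearer the visual cue; variance shrinks
```

The human data Clark cites are striking precisely because subjects’ weighting tracks the current, context-dependent uncertainty of each channel, as this formula requires.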


If various perceptual phenomena can be neatly explained by invoking “statistically valid priors,” and if computational simulations of predictive coding strategies can explain these phenomena, then what is the reality of these priors and predictions? What are the neurocognitive processes necessary to produce such a result?


Gipps: “This, then, is the difficulty I see for the cognitive scientific project as it's typically spelled out.  On the one hand it's urged that the brain is making predictions, inferences, etc., not in a metaphorical sense but in something like the literal sense. To support this it's pointed out that artefacts like computers and phones do after all make something like predictions, process information, etc. However then when it's pointed out that these artefacts are only said to engage in meaning-related activity in a derivative concessionary sense, because of the place we confer on them within our normative practices, and that the brain enjoys no such role - its role being instead its causal contribution to our capacity to engage in such practices - then notions of information etc which don't have to do with ordinary meaning are instead invoked. But the difficulty now is that causal operations on meaningless physical information look simply nothing like predictions and inferences in anything like their ordinary forms.”


I have attempted to show that causal operations on meaningless physical information (e.g., the information in the sensory homunculus) can look something like predictions in their ordinary form. Neuroscientists routinely invoke such predictions, and if there is no there there – if there isn’t even a neurocognitive process being referred to that produces the appearance of prediction – why is it that, in the words of Andy Clark, “the success of the Bayesian program in these arenas is impossible to doubt”?