Linear perspective is an extremely compelling visual illusion. And it is an illusion. You know the train tracks do not really converge. If they did, the train would derail.
Why do parallel lines look like they converge in the distance? I think it is a learned response to living in a world of pictures. That interpretation is so overlearned, it occurs without explicit awareness. But it should be possible to unpack that illusion.
Linear perspective was invented in the early 1400s by an Italian architect (Brunelleschi) and simultaneously by others trying to draw and paint, as a way to represent three-dimensional space on a two-dimensional sheet. It is a good invention and it works pretty well. We do see "depth" in a flat picture, even though that is not literally possible because there is no depth in a flat picture.
In the 600 years since then, the technique has become so universal that pictures drawn without it don’t look right.
A famous 1525 woodcut by the artist Albrecht Dürer shows a draftsman using strings to represent the rays of light coming from the corners of his object. His canvas swings out of the way while he sights down the strings to the object. Then he puts the canvas back in place and makes marks where the strings would hit it. Connect the dots, and you have a mathematically correct map of what the actual object looks like from that point of view.
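For readers who like to see the arithmetic, the string method can be sketched in a few lines of Python. This is only an illustration under my own assumptions: an idealized pinhole "eye" at the origin looking straight ahead, a flat canvas at distance d in front of it, and made-up names (project, rail_gap_on_canvas) and measurements that do not come from any historical source. Each string is a straight line from the eye to a point on the object, and the mark goes where that line crosses the canvas.

```python
def project(point, d=1.0):
    """Mark where the string from the eye to `point` crosses the canvas.

    Assumptions (mine, for illustration): the eye is at the origin,
    looking along +z, and the canvas is the plane z = d.
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("the point must be in front of the eye (z > 0)")
    # Similar triangles: the string hits the canvas at (d*x/z, d*y/z).
    return (d * x / z, d * y / z)


def rail_gap_on_canvas(half_width, eye_height, z, d=1.0):
    """Horizontal gap, on the canvas, between two parallel rails at depth z.

    The rails sit `eye_height` below the eye and `half_width` to either
    side of the line of sight. Both values are hypothetical inputs.
    """
    left = project((-half_width, -eye_height, z), d)
    right = project((half_width, -eye_height, z), d)
    return right[0] - left[0]


if __name__ == "__main__":
    # The rails stay 1.5 units apart in the world at every depth;
    # only the marks on the canvas move closer together.
    for z in (2.0, 4.0, 8.0, 16.0):
        gap = rail_gap_on_canvas(half_width=0.75, eye_height=1.6, z=z)
        print(f"depth {z:5.1f}: gap on canvas = {gap:.4f}")
```

In this sketch the gap on the canvas halves every time the depth doubles, while the rails themselves never change their spacing. The convergence belongs to the marks on the flat sheet, not to the object being drawn.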
But what is it a drawing of? The sides of a walkway do not really converge in the distance. If they did, you couldn’t walk it to the end. So a perspective drawing is surely not an accurate representation of reality. Strings or no strings, the perspective drawing is a fantasy, like a unicorn. It is something that does not exist in the world.
Why then does the unrealistic perspective drawing look so convincingly real, at least with respect to depth? Is it because when we look at the world we actually see it wrong? When you look down a long walkway, not a picture of one, are you seeing the world wrongly? We know from practical experience in the world that parallel lines do not converge in the distance, so why do we see convergence? We have to make a mental correction: “The sides look like they converge, but really they don’t.”
I have spent a lot of time looking at scenes (not pictures, but situations) where there seemed to be converging perspective lines. Much to my wife’s consternation, I will often stop to stare down a long hotel hallway like the one shown here, and ask myself, “Do those walls really look like they converge, or am I only imagining it?” I will walk up and down such hallways, trying to understand what I am seeing, and hoping I won’t get reported to hotel security.
And my conclusion is this: I do NOT see the walls converging. I can talk myself into it, but if I turn off my metacognition, what I see is a continuously unfolding horizon and continuous visual information moving around my head. At no time do I worry that the walls are closing in on me.
If I stand motionless and look down the hall as if I were a camera taking a picture, then I CAN see convergence. But that’s because I am pretending to be a camera. I am using metacognition, the ability we all have to introspect on our own mental experience. Metacognition is what allows you to answer the question, "What are you thinking about?" To answer, you must think about your thinking process. For a visual scene, the question, "What do you see?" encourages metacognition. You must think about your visual experience. Instead of just having a visual experience, you are now one step removed from it. You have stepped back from your natural experience and instead you are now using metacognition to examine your own mental imagery.
We learn visual metacognition when we learn to understand pictures. For the last six centuries virtually all pictures used the linear perspective technique of representation, which is derived from metacognition, not simple visual perception.
It is extremely difficult to set aside the ways of seeing that you have unconsciously assumed all your life and which your culture claims are the “correct” ways of seeing. When you look at a photograph, you automatically apply metacognition; in fact, you must. To understand a picture, you must abandon your natural, egocentric point of view and take up the special imaginary point of view implied by the picture so you can imagine you are looking at the scene depicted. It is a sophisticated shift in personal frame of reference, but we do it with ease.
It looks like the perspective lines are “in” the photograph because that’s how we have been taught to interpret such artifacts. A photograph is a two-dimensional representation of a three-dimensional scene, so we apply Brunelleschi’s rules to it. Whipping out a Sharpie and drawing the perspective lines on the photograph only proves my point that we are using an artist’s metacognitive way of seeing the photo. But we don't normally do that when we are using ordinary cognition, as opposed to metacognition, in the real world.
When I look down the grocery aisle as if I were a camera, I am strongly tempted to see converging lines. Look at them there in the picture! I can do that, but why would I?
Never, when I have been shopping for a box of pasta, have I experienced converging parallel lines. Never. I just go to the pasta section and get what I want, and proceed to the next aisle. Not a single time have I worried if my shopping cart would fit out the narrow opening in the far end of the aisle. Linear perspective just does not come up when you need a box of pasta.
However, if I stand at the head of the aisle and imagine I am a camera, or an artist, and mentally "step back" from my experience, then pop! There are the convergence lines. But they are purely an intellectual, metacognitive, culturally contrived way of seeing, an overlay on my natural experience. I have been taught to see convergence lines and so I do. But when I do, I am not looking at the world any more, but instead, looking at my own mental imagery of the world.
Now that I have learned how to unlearn that cultural habit, I no longer see perspective lines unless I want to. That demonstrates to me that apprehension of linear perspective is not a native property of the biological visual system.
Discussions of the perspective illusion always present pictures to illustrate the points being made. I have done that here, too. I admit that the convergence lines are there in the pictures. Of course they are. But that’s because they are pictures! Metacognition is required to understand pictures, just as it is to read a map, a floor plan, or a blueprint. Understanding pictures is a culturally acquired skill. There is perspectival convergence in the pictures because that’s how we have learned to interpret pictures.
But if you observe the real world, not pictures, you can, with practice, get back into your natural attitude (non-metacognitive) way of perceiving, and there will be no linear convergence. Try it. Walk some hotel hallways and some grocery aisles and some railroad tracks. What do you really see? You will find that the sides do not close in on you. You do not really see linear convergence unless you imagine you are "looking at a scene" instead of being in the world. We CAN take an attitude of detachment toward our perceptual experience, but that is a learned, introspective skill.
It seems to me it would not be too hard to test this hypothesis experimentally, with infants and non-human animals. You could test them for discrimination of natural scenes with and without perspective elements (what most people would call perspective elements, like railroad tracks and hallways). My hypothesis is that there would be no discrimination between scenes conventionally interpreted as containing convergence and those without. It would be hard to do this without pictures, but not impossible. I have noticed recently that some hotels go to great lengths to break up the perspective effect in their hallways by using alcoves, varied lighting and wall colors, and non-linear carpet patterns. It should be possible to find comparable but contrasting hallways.
Then you could train the animals or infants to discriminate comparable perspective and non-perspective drawings, then test them on the natural scenes again. If the training were effective, the post-test should show discrimination of scenes with convergent and non-convergent elements. However, it might be difficult to accomplish the pictorial training, as the skill takes a long time to acquire.
Why does it matter whether we really see linear perspective in the world or just apply that cultural interpretation to what we see? I think it matters for two reasons.
1. It matters if we are seeing the world wrongly. We know the train tracks do not converge in reality but we see that they do. That’s wrong, a perceptual error. Well, if we see the world wrongly in that case, what else are we seeing wrong? What if it's all wrong? Is the world anything like what we think it is?
This is a fundamental question in the philosophy of perception. So-called “realists” believe that we see what is really out there. Sure we make errors, but over time, we generally understand what the world is really like.
On the other hand, “representationalists” say that our brain forms a neurological representation of the world and that is all we have to go on. We do not know anything for sure about the world in-itself. We know only what our brain represents for us, and that includes convergence lines of perspective.
I am a realist, and I have argued endlessly with representationalists about this. Representationalists tend to be interested in robotics and machines that can “represent” the world in computer memory. Representationalists use the illusion of linear perspective to argue, “It looks like the train tracks converge, but they don’t really. Therefore realism in perception is simply not true.”
My argument now is, “I deny that it looks like the train tracks converge. That is a learned attitude, an introspection, not natural perception.”
2. The second reason the perspective illusion matters is that it highlights a fundamental error people make about visual perception. The eye is not like a camera and does not work like a camera. Yes, the eye has a lens and a pupil (like a camera's aperture), and the retina is analogous to a film. But the analogy is flawed and deeply misleading because an eye is a component of an active sensory system in an exploratory animal. A camera is an inert machine.
We do not look at our retinas, ever. The retinal image is nothing like the image on a film, and nobody ever sees it. A camera is passive, but vision is active, exploratory, selective, and cognitive. Recent developments in sensory substitution amply demonstrate how "mental" perception is (e.g., blind people learn to “see” from video signals translated into vibrations, sounds, pin pricks on their backs, or electrical signals to the brain).
Once you shake free of the erroneous camera analogy, you are free to see the world in your natural attitude, not through an arbitrary cultural lens.
Are the seats smaller in the back of this train?