Showing posts with label forcing. Show all posts
Showing posts with label forcing. Show all posts

Wednesday, August 8, 2018

How a Fields Medal led to a mathematical roller-coaster journey

By Keith Devlin

You can follow me on Twitter @profkeithdevlin


First, congratulations to Caucher Birkar, Alessio Figalli, Peter Scholze, and Akshay Venkatesh on being awarded the Fields Medal, an award that for regular “Devlin’s Angle” readers needs neither introduction nor description. (If it does, use Google.)

With Fields Medals awarded only (at most) once every four years to mathematicians who produce truly exceptional mathematics before they turn forty, few of us who enter the field come close to getting one. (Indeed, in some ways they are more akin to Olympic Gold Medals than the Nobel Prizes with which they are usually compared. Few club athletes will get one of those either.)  On the other hand, many of us earn our doctorates, or build our careers, by understanding a new approach or mastering a new technique that led to a Fields Medal. Just as Bill Gates copied the groundbreaking Macintosh interface to create Windows, so too it can pay off handsomely, and often quickly, for a young mathematician to “reverse-engineer” a Fields-Medal-winning new result and try to use it to solve a different – though often related – problem.

In fact, sometimes, the medal-winning breakthrough has such broad applicability that is initiates an entire new subfield of mathematics. That was exactly how I began my mathematical career almost a half a century ago. An interest in computing, initiated by a high-school summer internship writing software for British Petroleum (using the very first digital computer delivered to the city I grew up in), stayed with me throughout my undergraduate years, culminating with me interviewing for a job at IBM on graduation. But I was put off by the strong corporate culture and the lack of intellectual freedom I feared would come with joining Big Blue. Instead, I decided to go for a PhD in the general area of computing. Unfortunately, this was before Computer Science was a recognized discipline, and though it was possible to pursue graduate research related to computing, mathematically speaking there was not much of a “there” there back then.

The one mathematically-intriguing little “there” was a relatively new subject called Automata Theory that I had come across references to. Moreover, one of the pioneers of that field, John Shepherdson, was a professor of mathematics at the University of Bristol, just over a hundred miles from London, where I had just graduated. As chair of department, Shepherdson had built up a strong research team of experts in different branches of Mathematical Logic, the subfield of mathematics that provided the mathematical tools for Automata Theory. And so it was then that I applied to do a PhD at Bristol. That was in the fall of 1968.

Once I arrived in Bristol, everything changed. Among the mathematics graduate-student community at Bristol University, all the buzz – and there was a lot of it – was about an emerging new field called Axiomatic Set Theory. Actually, the field itself was not new. But as a result of a Fields Medal winning new result, it had recently blossomed into an exciting new area of research.

Not long after Georg Cantor’s introduction of abstract Set Theory in the late 19th century, Bertrand Russell came up with his famous paradox, concerning the set of all sets that are not members of themselves. To escape from the paradox – more accurately, to rescue the appealing, natural notion of using abstract sets as the basic building block out of which to construct all mathematical objects – Ernst Zermelo formulated a seemingly simple set of axioms to legislate the formation of sets. With an important addition from Abraham Fraenkel in 1925, that axiom system seemed to provide an adequate basis for the construction of all the objects of mathematics, while avoiding Russell’s Paradox. Zermelo-Fraenkel Set Theory, as it became known, rapidly came to be regarded as the "Grand Unified Theory" of mathematics, the basic system on which everything else is built. It was generally referred to as ZFC, the “C” denoting the Axiom of Choice, a basic principle Zermelo included but which was sufficiently controversial that its use was often acknowledged explicitly.

While the ZFC axiom system was indeed sufficient to ground all of mathematics, there were a small number of seemingly-simple questions about sets that no one could answer using just those axioms. The most notorious by far went back to Cantor himself: Cantor’s Continuum Problem asks how many real numbers there are? Of course, one answer is that there are an infinite number of such. But the ZFC axioms allow the construction of an entire system of infinite numbers of increasing size, together with an arithmetic, that can be used to provide a “count” of any set whatsoever. The smallest such infinite number, aleph-0, is the number of natural numbers. After aleph-0, the next infinite number is aleph-1. Then aleph-2, and so on. (It’s actually a lot more complicated than that, but let’s leave that to one side for now.)

Cantor showed that the real continuum, the set of all real numbers, has an infinite size strictly larger than aleph-0, so it must be at least aleph-1. But which aleph exactly was it? It’s tempting (on the grounds of pure laziness) to assume it’s aleph-1, an assumption known as the Continuum Hypothesis (CH). But there is no known evidence to support such an assumption.

In 1940, Kurt Goedel contructed a set-theoretic model of ZFC in which CH is true, thereby demonstrating that CH could never be proved false. But that does not imply that it is true. Maybe it was possible to construct another model in which CH was false. If so, then CH would be completely undecidable, based on the ZFC axioms. This would mean that the ZFC axioms are not sufficient to answer all reasonable questions about sets.

In 1963, Paul Cohen, a young mathematician at Stanford University, found such a model. Using an ingenious new method for constructing models of set theory that he called forcing, Cohen was able to create a model of ZFC in which CH is false. That result earned him the Fields Medal in 1966.

By 1968, when I went to the University of Bristol to commence my doctoral work, Cohen’s new method of forcing had been shown to have wide applicability, making it possible to prove that a number of long-standing, unanswered mathematical questions were in fact undecidable in the ZFC system. This opened up an exciting new pathway to getting a PhD. Learn how to use the method of forcing and then start applying it to unsolved mathematical problems, of which there was no shortage. Large numbers of beginning graduate students did just that, and by the time I joined a group of them, a few months after arriving at Bristol, the field was red hot. My interest in computation did not go away, but it would be over two decades before I would pick it up again. At 21 years of age, with a newly minted bachelors degree in mathematics under my belt, I had a mathematical research career to build, and axiomatic set theory was by far the most exciting field to do it in. I jumped onto the roller coaster and joined in the fun.

Working in my newly chosen field was just like working in any other branch of mathematics. Each day, you woke up and attempted to prove various mathematical statements using logically rigorous reasoning. To an observer looking over your shoulder, doing that involved scribbling formulas on paper and manipulating them in an attempt to construct a proof, just like any other branch of mathematics. The “rules of the game” were exactly the same as in any other branch of mathematics as well. The only difference was the nature of the answers you obtained – on the rare occasion when you did so. (Mathematics research is 95% failure. Actually, the failure rate may be higher than that; we have a far worse batting average than any professional baseball hitter.) In what those of us in this new field called “classical mathematics,” the goal was to prove statements about mathematical objects were true or false. In the new mathematics of undecidability proofs, the goal was to prove that statements about mathematical objects were undecidable (in the ZFC system). In both cases, the result was a rigorous mathematical statement (a theorem) justified by a rigorous mathematical argument (a proof).

From the perspective of mathematics as a whole, this meant that, thanks to Cohen, mathematicians had a new way to answer a mathematical question. Classically, there had been just two possibilities: true and false. If you can do neither, you had failed to find an answer.  Now, there was a third possibility: (provably) undecidable. What had previously been failure could now become success. Absence of a definite answer could be replaced by getting a definitive answer. Lack of knowledge could be replaced by knowledge.

The two decades following Cohen saw a whole range of unsolved mathematical problems proved undecidable, as a whole army of us jumped into the fray. Some results were easy. Success came quickly to those smart enough or lucky enough (or both) to find an unsolved problem that yielded relatively easily to the forcing technique. Others took much longer to resolve, and a few resisted all attempts (and have done so to this day). But by the start of the 1980s, the probability of success had dropped to that in other areas of mathematics. From then on, for most young mathematicians, getting a PhD by solving an undecidability problem meant finding some relatively minor variant of a result someone else had already obtained. The party was over.

Looking back, I realize that I was simply very lucky to be starting my mathematical career when a productive new subfield was just starting up. By going to the University of Bristol to do a PhD in Automata Theory, I found myself in the right place at the right time to jump ship and have the time of my life. When the field started to settle down and slowdown in the 1980s, I started to lose interest. Not in mathematics, just in that particular area as my main research focus. My main interest shifted elsewhere as my attention was caught by some new mathematics being developed at Stanford (as with forcing, Stanford turned out to be the place that generated the new ideas that caught my attention). That new mathematics was closely intertwined with my earlier high school interest in computing. But that’s another story.

Monday, June 3, 2013

Will Cantor’s Paradise Ever Be of Practical Use?

We really have no way of knowing what early humans thought when they gazed up at the sky. Since everyday practical experience is, by definition, limited to a very small region of space and time, it requires considerable cognitive sophistication to conceive of something – say the night sky – “going on for ever,” let alone to ponder whether that means it is “infinite,” or indeed what “infinite” actually means.

What we do know is that the ancient Greeks made what may have been the first substantial attempt to analyze the notion of infinity, with Zeno of Elea (ca 490-430 BCE) of particular note for his discussion of a number of (seeming) paradoxes that arise from the assumption that space and time are (or are not) infinitely divisible.

Archimedes’ (ca 287-112 BCE) calculations of areas and volumes made implicit use of infinity, and from today’s perspective can be recognized as the forerunner of integral calculus.

Skillful formal – though by modern standards not rigorous – use of the infinitely large and the infinitely small was made by Isaac Newton and Gottfried Leibniz in their development of modern infinitesimal calculus in the seventeenth century, though it was not until the nineteenth century when Bernard Bolzano, Augustin-Louis Cauchy, and Karl Weierstrass finessed the lurking problems of infinity by means of the famous (and for many a first-year mathematics major, infamous) epsilon-delta definitions of limits and continuity.

But none of these developments was about infinity as an entity; the focus rather was on the unending nature of certain processes, starting with counting. It was Georg Cantor (1845 – 1918) who really tackled infinity head on. His proof that the set of real numbers cannot be put into one-one correspondence with the natural numbers, and hence is of a larger order of infinitude, led to a series of papers, published in a remarkable ten-year period between 1874 and 1884, that formed the basis for modern abstract set theory, including the development of a fully formed arithmetical theory of infinite numbers (or “cardinals”).

Reactions to Cantor’s revolutionary new ideas ranged from outraged condemnation to fulsome praise. Henri PoincarĂ© called Cantor’s work a “grave disease” that threatened to infect mathematics, and Leopold Kronecker described Cantor as a “scientific charlatan” and a “corrupter of youth.” Ludwig Wittgenstein, writing long after Cantor's death, complained that mathematics had become “ridden through and through with the pernicious idioms of set theory,” a theory he dismissed as “utter nonsense,” “laughable,” and “wrong.”

At the other end of the spectrum, in 1904, in the UK the Royal Society awarded Cantor its highest award, the Sylvester Medal, and in Germany David Hilbert declared that “No one shall expel us from the Paradise that Cantor has created.”

Having devoted the early part of my professional career to work in (infinitary) set theory, starting with my Ph.D. in “large cardinal theory,” completed in 1971, and moving on to work on alternative universes of sets (a particularly hot topic after Paul Cohen’s introduction of the method of forcing in 1963), in the early 1980s my interests started to shift elsewhere, to questions about information, communication, and human reasoning.

I found myself temporarily back in the world of set theory and the arithmetic of infinite numbers recently, when I was approached by the organizers of the World Science Festival to moderate a panel discussion on the topic of infinity and a more in-depth follow-up the following day.

Both discussions raised the question as to whether study of infinity – in particular the hierarchy of larger infinities that Cantor bequeathed to us – would ever have any practical applications. As panelist Hugh Woodin remarked at one point in the discussion, it is a foolish mathematician who declares that a particular piece of mathematics will not find applications. For instance, G. H. Hardy’s famous statement (in his book A Mathematician’s Apology) that his work in number theory would never find practical application, proved to be spectacularly wrong less than a century later, when number theory became the foundation for internet security.

Hardy’s observation was based on his familiarity of the world he lived in, a world in which the World Wide Web was not even a dream. Today, we cannot know what the world of tomorrow will look like. On the other hand, whatever our children and grandchildren will take for granted, their world will surely be finite, which makes it unlikely that Cantor’s theory – and the almost a century of development in set theory since then – will have practical use.

Or does it? What about calculus? Infinitesimal (!) calculus not only has applications in the modern world, but much of the science, technology, medicine, and even financial structure the underpins our world depends on calculus for its very existence. Applications don’t get more real than that.

True, but the dependence on infinities you find in calculus is essentially asymptotic. What really drives calculus is the unending nature of certain processes on the natural numbers. Talk of “infinitely large” or “infinitely small” is little more than a manner of speaking. Indeed, the epsilon-delta definitions (which do not involve infinities or infinitesimals) are a way to formalize that manner of speaking, effectively eliminating any actual infinite or infinitesimal quantities.

In contrast, much of the work on infinity (more precisely, infinities) carried out in the second half of the twentieth century (when I was working in that area) focused on properties of sets that made their cardinalities super-infinities of different orders: inaccessible cardinals, Ramsey cardinals, measurable cardinals, compact cardinals, supercompact cardinals, Woodin cardinals, and so on. Infinities which dwarfed into invisibility the puny cardinality of the set of natural numbers. Indeed, each one in that sequence dwarfed all its predecessors into invisibility. How could that work find an application?

I’ll lay my cards on the table. I think the chances are that it won’t. But I don’t think it is impossible. Indeed, I began to suspect a possible application in the very domain I worked in after I left set theory.

[This may of course be nothing more than a reflection of having at my disposal a large hammer which made everything look a bit like a nail. But let’s press on.]

The post 9/11 world saw me involved in a series of Defense Department projects the first being improving intelligence analysis (and the others essentially variants of that).

In today’s information rich world, the major nations can be assumed to have access to all the information they need to predict (and hopefully thence prevent) the majority of terrorist attacks. The trouble is, the few data points which must be identified and connected together to determine the likelihood of a future attack are just a tiny few in an overwhelming ocean of data. Even in the era of cloud computing, identifying the key information is analogous to using the naked eye to find a handful of proverbial needles in a non-proverbial field of haystacks.

To all intents and purposes, the available data is infinite. The only hope is to impose some structure on the data that makes it possible for humans and computers to work together on it, narrowing down the focus to the regions more likely to be of significance. Though modern computing systems can sift through massive (finite) amounts of data in a relatively short time, they need to be programmed, and writing those programs (at least, some kinds of them) will require some structure on those large sets of data. A possible place to find the appropriate structure(s) is infinitary set theory. In other words, to develop the appropriate structures, assume the data is infinite. View the infinite as a theoretical simplification of the very large finite. (Economists sometimes make a similar simplifying assumption about economies.)

Do I think this is likely? Frankly, no. But then, neither could Hardy conceive of any practical application of his work in number theory. [Incidentally, like Hardy, I don’t think mathematics needs applications to justify itself. It’s just that the question of application is what this article is about!]

The discussion about large cardinals you will find in those panel discussions at the World Science Festival might seem impossibly abstract and far removed from the everyday world. Indeed, it is. But the questions being discussed all resulted from a process of rigorous, logical investigation that arose directly from late nineteenth century attempts to understand heat flow. History tells us that what begins in the real world, very often ends up being used in the real world.

Prediction is hard, particularly about the future.

Incidentally, how did I end up working on a project for the DoD? They asked me. I might not be the only person to speculate about a possible use of Cantor’s paradise. This is your taxpayer dollars at work.