Turns out that the development of computer learning algorithms has stumbled upon an obstacle. This obstacle is the set theory that cannot be solved due to fundamental reasons.
Amir Iegudajov and his colleagues from the University of Tel Aviv found out that machine learning algorithm has stumbled upon a fundamental mathematical paradox discovered by Georg Cantor and Kurt Goedel, the great mathematicians of the 19-20 centuries. The issue was addressed in the article published in Nature Machine Intelligence on January, 7.
Background: Famous paradoxes of the 20th century
The best way to illustrate the paradox discovered by the mathematician Bertrand Russell is with the problem on two catalogs. The problem situation is the following. All the books in the library must be added to one of the two catalogs. The first catalog is meant for the books that include a self-reference. The second catalog is meant for the books with no self-reference. Since the catalogs are books, they must be cataloged too. As for the first catalog, you can either add a self-reference to it or leave it as it is. Either way, the condition of the problem is met. As for the second catalog, it belongs in neither catalog and you can’t leave it uncatalogued. Either way, the condition is violated.
Pondering over Russell’s paradox, Kurt Goedel invented his famous “incompleteness theorem.” Let’s take a system of mathematical axioms and list all possible mathematical statements that follow from the axioms. (It’s something like a library catalog). There will always be a true mathematical statement that doesn’t belong on the list, just like the second catalog in the example above. Therefore, any system, even an infinite one, is always incomplete. There always will be a true statement that doesn’t follow from the axiom system. Mathematicians call such statements “undecidable.” Even if you call categorize such an undecidable statement as an axiom and add it to the list, the new system will still be incomplete, because there will be an indefensible and irrefutable statement for its tool.
One example of Goedel’s undecidable statement is the continuum hypothesis by Georg Cantor. Comparing infinite sets of numbers, the German mathematician discovered that they are different in “power.” For example, sets of natural, rational and real numbers are infinite. However, while you can compare natural and rational numbers (because these sets are of equal power), you can’t do the same with real numbers because they’re more “densely” packed.
Cantor formed the following question: Are there sets that are more powerful than a set of natural numbers but less powerful than a set of real numbers? Unfortunately, Cantor failed to find the answer to his question. However, in 1940, Goedel proved that this is actually an example of an undecidable statement in the set theory. One can say that there is no such thing as sets of intermediate power, and this statement will become a part of a consistent mathematical system. At the same time, one can say just the opposite, and that will also be a consistent system of statements, although different from the first one.
The English mathematician Alan Turing elaborated Goedel’s idea by applying it to computing algorithms. He proved that the list of “all possible algorithms that solve the problem” will lack an algorithm that would determine whether the problem can be solved by a random algorithm. Based on that discovery, the British mathematician Roger Penrose made a well-founded hypothesis saying that human reasoning cannot be imitated through algorithms. Therefore, the artificial intelligence, in the strict sense of the word, is impossible to create because some of the problems solved by the human brain might be Turing’s undecidable algorithms.
Machine learning paradox
In the 20th century, it seemed that Goedel’s undecidable statements were pretty abstract and had nothing to do with applied problems. However, a few years ago, a group of theoretical physicists headed by Toby Cubitt managed to prove that Goedel’s undecidability is also present in the quantum gap problem, where it’s impossible to theoretically determine whether a big plate of metal has superconductive properties.
The authors of the article were involved in studying machine learning, which is a field of an even more applicable nature. Using finite sets of data, an algorithm learns how to recognize, say, kitten images. The goal of learning is considered achieved if the algorithm learns to unmistakably find kitten images in infinite data sets.
Iegudajov and his colleagues studied the correlation between learning and data compression. They found out that compression of information is closely related to the Cantor’s continuum hypothesis (which, as mentioned earlier, is mathematically undecidable). There is an infinite number of ways for selecting a smaller data set out of an infinitely large one. However, the “power” of such an infinite data set was unknown. What Iegudajov and the others did is to show that the “power” was undecidable. That is, if we assume that Cantor’s theory is true, there always will be a small set of learning data using which the algorithm will learn how to make find-a-kitten predictions in an arbitrarily large sample. However, if we admit the existence of data sets of intermediate power, there would be no data sample that could guarantee success.
According to the authors, the detected paradox is extremely important for understanding data compression principles which sum up the machine learning. At the same time, its practical relevance causes doubt, because infinite data sets are a mathematical abstraction. Nonetheless, the studies, which outline the fundamental boundaries of algorithmic thinking, help estimate the prospects of AI systems and further explore the phenomenon of human intelligence.