The Ratiocinator: The Universal Universe, Part IV: Computational Complexity

Let me start this post on a personal note and apologize for the unannounced break -- I was on vacation in Norway, and, due to poor planning on my part, had an exam right afterwards, which kept me from attending to the blog as I planned.
It all turned out for the best in the end, though, since during my stay in Norway, I stumbled across this great essay by Scott Aaronson (discussed on his blog here), which alerted me to something I haven't paid a lot of attention to, but should have -- the field of computational complexity, and especially its implications for philosophy. Having now had some time to digest the contents, I think there's some important issues for me to think, and write, about.
Of course, I was aware of the field of computational complexity and its basic notions prior to reading that essay, but thought of it as in some way concerning 'merely' problems of practicality -- making the classic philosopher's error (not that I'm a philosopher, but this apparently doesn't keep me from committing their errors) of being content with showing something 'in principle' possible, ignoring the issues involved in making it possible 'in practice' as mere technicality.
Aaronson, now, has taken up the challenge of breaking this unfortunate habit, and does so with great clarity, showing that often enough, thinking about merely what is possible in principle, or, in more appropriate terms, what is computable, without thinking about the difficulty involved in actually undertaking that computation, misses most of the good stuff. One particularly interesting argument is his resolution of the so-called 'waterfall problem', which essentially poses that any physical process can be interpreted as implementing any computation, making thus the proposal that sentience can come about merely through computation somewhat suspect -- apparently forcing us to conclude that if there is a computation that gives rise to sentience, every (or nearly every) physical system can be viewed as implementing that computation, and hence, giving rise to sentience.

But first, let's have a little look at the fundamentals of computational complexity. Basically, computational complexity concerns itself with the question of how difficult it is for a computer to solve a given problem. To that end, one takes a measure of the size of some instance of the problem, and investigates how the resources the computer needs to solve said problem scale with this size.
For instance, if the problem is multiplying two numbers of n bits length, a 'naive' algorithm takes roughly n² steps; however, if the problem is conversely factoring a number of length n bits, i.e. finding those prime numbers that if multiplied yield said number, a similarly naive algorithm takes 2ⁿ steps to completion, which is hugely less efficient (in both cases, there exist more efficient algorithms, but the qualitative difference remains so far).
One should not underestimate the magnitude of the difference between the two examples: for just a 32-bit input, the multiplication problem can be solved in roughly 1000 steps, while factoring needs a whooping 4.3 billion steps -- depending on input size and the speed of the computational hardware you have access to, this distinction between polynomial and exponential dependence on problem size may well equate to the distinction of a few seconds or minutes of computation versus a time greater than the age of the universe (of course, to a modern computer, 4.3 billion steps isn't actually all that much).
To sort problems according to their difficulty, complexity theorists have devised a number of complexity classes, whose ordering principle is that two problems lie in the same complexity class provided their solution is (asymptotically) similarly resource-intensive. One of the most well-known such classes is called simply P, the class of problems to which a solution can be found in 'polynomial time' -- i.e. for which the number of steps it takes to find a solution scales as some power of the problem size; another is called NP, which, contrary to what is occasionally claimed, does not mean 'non-polynomial', but rather, 'nondeterministic polynomial', and roughly comprises those problems for which it takes polynomial time to check whether a proposed solution is correct. Thus, multiplication lies in P, factoring in NP (since in order to check whether a proposed factorization is correct, one has to multiply the factors). The question of whether or not P equals NP is one of the central open questions in complexity, with strong evidence -- but no conclusive proof -- pointing towards the case of both being unequal, i.e. to the existence of problems whose solutions can be checked, but not found, efficiently. The importance that is placed on this problem can be gauged by the fact that it is one of only seven 'Millennium Problems', the solution of which carries a prize of US$ 1,000,000. (If you'd like to know more about complexity and complexity classes, you could do worse than peruse the Complexity Zoo, created and curated by the very same Scott Aaronson.)

The Meaning of Computations
Now, what good does that do us?
There's one problem in particular that I have previously had only a somewhat muddy response to that really benefits from Aaronson's discussion, which is loosely the question: given a certain computation, what does this computation mean?
At first, this hardly seems like a deep question: computations that we are used to present their meaning to us in a straightforward way, such as in the form of a printout, or display on a screen. But remember the lessons of computational universality: any universal machine can implement any other, so there is a way of translating between the computation done by one machine and the computation done by another, such that each computation of one gets mapped to a computation by the other. The display on the screen translates our computer's computation into something a human brain can understand; however, other translations are possible.
We've met this ambiguity before, in the previous post in this series, in Deutsch's argument regarding the alleged impossibility of finding any true explanation for the nature of our world -- for any theory of everything one might come up with, many others can be found, making reference to different fundamental entities and interactions, yet being just as phenomenologically adequate. We'd be in the position of entities in a computer simulation trying to figure out the hardware and software their universe runs on -- but their universe might be written in LISP and run on a Mac just as well as it could be a Perl program executed on a Windows PC, with no way for them to tell.
Here, we're faced, in a sense, with the flipside of this issue: just as there is ambiguity in what computation realizes a given physical system or process, there is a similar ambiguity in the computation that a physical system carries out.
To my knowledge, the argument was first raised by Hilary Putnam, who formulated the following theorem:

Every physical system implements every finite state automaton (FSA).

Here, a finite state automaton is an abstract model of computation, similar to, but strictly weaker than, a Turing machine: as the name implies, it can be in one of finitely many states; thus, it in a sense works with finite resources, as opposed to the Turing machine's infinite memory tape. Strictly speaking, all physical computers are finite automata, since there's no such thing as infinite memory; they only acquire the characteristics of a Turing machine in the limit, if you imagine continuously adding more memory to them.
It's perhaps not obvious upfront that the above theorem is true. But consider the evolution of, say, a waterfall. At the start, the waterfall may be in one of finitely many states s_i, and after some time has elapsed, it will have evolved to a state t_i. The waterfall's evolution thus constitutes a function from initial to final states f(s_i) = t_i, and, if we accept the laws of physics to be reversible, that function will associate each initial with exactly one final state. But what we take these final states to mean is arbitrary; we could, for instance, set up an interpretation such that the initial states represent natural numbers in some ordering, and the final states represent natural numbers in another -- meaning that we can take the evolution of the waterfall to implement any function from natural numbers to natural numbers (or, more accurately, from a finite subset to another finite subset thereof). The same goes, for example, for strings of bits -- and computing functions from strings of bits to strings of bits is ultimately all our computers do, as well. So the waterfall's evolution can be viewed to implement any computation our computers can execute.
This, or so the argument goes, deals a deadly blow to computational theories of mind: for, if there exists a computation that gives rise to a mind, then every physical system -- provided it possesses sufficiently many states -- can be thought of as executing that computation; thus, either all physical systems must be conscious -- moreover, must be conscious in all the ways it is possible to be conscious: for, if there is a computation that produces my conscious experience, and there is a computation that produces your conscious experience, or anybody else's, then a system can be seen to implement all those possibilities --, or, there can be no computation that gives rise to a mind. In other words, computation is purely syntactic -- works just at the level of symbols -- but leaves arbitrary the semantics, the meanings of those symbols; yet those meanings are what's central to consciousness and subjective experience and all that.
This would seem to be a quite devastating conclusion if you're a computationalist -- like myself, as is obvious from past postings -- but Aaronson sees the crux of this argument in the complexity of the implementation of interpreting a physical system as implementing a given computation.
To see this, we first need to introduce the concept of a reduction. In complexity theory, a problem is said to reduce to another problem roughly if, given the solution to that second problem, the first problem's solution becomes simpler. Thus, if you had in hand a list of solutions to the second problem, or a device that readily provides these solutions (called an oracle), the complexity of solving the first problem becomes greatly reduced.
In the present case, the waterfall can be viewed as such an oracle -- it provides solutions to a certain kind of problem (the evolution of a waterfall), which then are used to solve a totally different kind of problem (for definiteness, let's say that problem is coming up with a good chess move). Now, the crucial point is that the reduction is useful just in the case that appealing to the oracle confers an advantage -- if the asymptotic complexity of the problem remains the same, then one might just as well do without the oracle. The algorithm that 'interprets' the oracle's output -- the waterfall's final state -- as a solution to the problem of finding a good chess move does the same work an algorithm that just computes that chess move would -- the waterfall doesn't really share any of the computational load. However, it's clear that for certain other problems -- say, the problem of simulating a waterfall -- the waterfall-oracle is extremely useful: the computational task, quite difficult without aid, becomes trivial.
So there are computations that a physical system implements in a more natural way than others, and for some -- one might conjecture the majority -- it will be no help at all; the work needed to interpret the system as implementing that computation is the same as just doing the computation itself.
An analogy may help to fully drive this point home. Consider a water-driven flour mill. At its heart, there is a simple apparatus that translates the energy imparted by, say, a waterfall driving a mill-wheel, into the grinding action of two stones versus one another, in order to grind the grain down to flour. Now, a miller, intimately familiar with this construction, comes -- say, through inheritance -- into possession of a plot of land. Naturally, he will want to build a mill on his new land. However, there is no waterfall available anywhere on it to drive his mill!
But, the miller is a clever sort, so, single-handedly advancing the state of technology, he invents an ingenious motor-driven pump system capable of pumping water up to a certain height, from which it is then allowed to fall, driving the mill wheel to grind the grain; at the bottom, the water is once more pumped up, closing the circuit (of course, this will require additional fuel for the motor; on this blog, we obey the laws of thermodynamics).
Of course, the ludicrousness of this scheme is apparent -- instead of using his motor to pump water, he could just have directly used it to drive the mill wheel, saving himself a lot of trouble. The waterfall drives the mill only apparently -- it could entirely be done without, just as it could be if there is a chess program computing chess moves as efficiently without a waterfall as the program that interprets the waterfall's evolution as chess-move computation does.
The conclusion is then that if the waterfall does not yield a reduction in the complexity of the problem of chess-move computation, then it can't be said to implement this computation. There thus are computations that it is natural to associate with a physical system -- the system's own evolution being the prime example --, while for the vast majority of other computations, a system can't be meaningfully said to implement them, as they are as efficiently (or inefficiently, as the case may be) possible without the system's aid at all.
Thus, most physical systems won't implement computations giving rise to consciousness after all -- saving us from having to apologize to every stone we stub our toes on, adding insult to injury.

Reasonable and Unreasonable Explanations
Having now a bit of a handle on what one might mean by the 'meaning' of a computation, we can extend the reasoning to further ameliorate the problem raised by Deutsch of finding a good explanation of the fundamental nature of the world in a computational universe.
Again, the argument goes something like: if we live in a universe that is in some sense equal to a computer executing a computation, then it is impossible for us to deduce the precise nature of this computation -- every computational theory we might come up with is in principle (but see above!) equally suited to doing so.
In the last article in this series, I have argued that this problem can be addressed by considering only theories that are 'well-suited' to human processing -- Newtonian gravity thus yielding a good explanation of planetary orbits, while electrodynamics doesn't, even though it is in principle possible to use it to compute them -- if through no other way than calculating the evolution of a computer, which essentially is nothing but an evolving electromagnetic field, that is tasked with computing orbital trajectories (alternatively, one might consider using the equations of hydrodynamics to calculate electromagnetic fields, by calculating the evolution of a waterfall implementing the necessary computation).
With the tools of computational complexity in hand, this reasoning can be made much more precise. Now, we can say that a theory of some physical phenomenon is only a good theory if it facilitates efficient computation of this phenomenon; in particular, a reduction of a phenomenon to some known theory is only practical if it actually lessens the computational load -- thus, hydrodynamics is not a good theory of electromagnetism, since there exists a theory (Maxwell's electrodynamics) able to implement the same calculations at less computational cost. Reducing electrodynamics to hydrodynamics is thus as pointless as reducing chess-move computations to waterfalls -- all the work is done by the reduction itself, so to speak.
Thus, contrary to Deutsch, we can find theories able to describe the world that are 'more natural' than others, at the very least. This does not address the question of 'what's really out there', perhaps: the universe might be 'really' something quite different from the mathematics we use to describe it. But I'm not sure there is a way the universe really is anymore than there is a path the photon really takes in a double-slit experiment. Perhaps the fact that there is a most efficient, i.e. least (or at least less) complex, way to describe it, is all we really get -- and need: just as the evolution of the waterfall being most efficiently seen as a computation of the waterfall itself justifies seeing the waterfall as a waterfall, rather than a chess-move computing system, or a conscious mind, maybe the computation running 'beneath' the observable reality of the universe being most efficiently seen as the evolution of the universe justifies seeing this computation as the universe. In this view, the computation would be primary, and complexity considerations would justify picking out one 'meaning' of this computation (or a small class of possibly related ones, perhaps).

Computing Universes
There's yet another problem that might be helped by these insights. In recent years, a few proposals have come up of what is sometimes called 'ensemble theories of everything', or 'ETOE's (some examples are due to Jürgen Schmidhuber, Russel K. Standish, and Max Tegmark). The basic idea of most ETOEs is that it is easier, in a well-defined way, to compute/generate all possible universes, than it is to just compute one -- the reason for this is that essentially it's easy to write a short program that produces all possible outputs, but producing any given output may be hard -- indeed, most possible outputs don't have a short program describing them (this ties in with a concept known as Kolmogorov complexity, about which much more in the next post).
So what you can do is to set up a computer that first generates, then executes all possible programs in a dove-tailing fashion (i.e. something like: perform the first step of the first program; then, perform the first step of the second program, and the second step of the first program; then, perform the first step of the third, the second step of the second, and the third step of the first program; and so on). Eventually, any possible program will be generated, and run -- thus, if, say, there is a program computing our universe, this program, too, will be generated and run.
For definiteness, let's fix a Turing machine taking as input binary strings, and outputting binary strings as well. It's fed with a master program, whose task it is to generate every possible binary sequence in some order, interpret each such sequence as a program, and execute it in the dovetailed manner outlined above. Sooner or later, one of its computations will correspond to our universe -- we'd have to wait a bit, sure, but it saves a whole lot of effort in creating universes.
But wait a moment -- how do we tell which of those computations corresponds to our universe? Sure, we could fix some interpretation -- some way to look at the output generated by all the programs --, and under this interpretation (say, a translation of the binary strings to output on a screen in some way), eventually, we'll find our universe within the outputs. But, this interpretation is completely arbitrary! In fact, we can find an interpretation such that any output conforms to our universe. Worse, in the absence of an interpretation, everything we have really are just binary strings -- Turing machines don't output universes, they output symbols! Thus, we seem to be dependent on somebody or something interpreting the outputs in some certain, arbitrary way in order for this scheme to give rise to our universe, or in fact to any universe at all. I've called this problem in the past the 'Cartesian computer', because of its resemblance to Dennett's Cartesian theater (see this previous post): some meaning-giving entity must be invoked that 'looks at' the computer's output, and recognizes it as our universe. This is a deeply unsatisfactory situation, in my opinion.
But, knowing what we know now, we can point to a possible resolution of this problem: for most (probably almost all) possible interpretations, the computational work needed to implement them will be intractable; contrariwise, natural interpretations exist that correspond to true reductions of the task of interpreting -- to a certain extent, then, this removes the arbitrariness of the interpretation, and helps fix one computation, or a small class of related computations, as 'computing our universe'.

Samstag, 3. September 2011

The Universal Universe, Part IV: Computational Complexity

Keine Kommentare:

Kommentar veröffentlichen