October 28, 2011

At first sight it may sound like a bad joke, indeed.

Turing not only provided many important theoretical insights on computing [1], including the Universal Turing Machine (UTM); he and his group at Bletchley Park also created a working prototype that employed these theoretical results [2].

Turing Computation

In order to clarify what non-Turing computing could be, we first have to inspect a bit closer how Turing-computing is defined. On Wikipedia one can find the following explanation in standard language:

With this encoding of action tables as strings it becomes possible in principle for Turing machines to answer questions about the behavior of other Turing machines. Most of these questions, however, are undecidable, meaning that the function in question cannot be calculated mechanically. For instance, the problem of determining whether an arbitrary Turing machine will halt on a particular input, or on all inputs, known as the Halting problem, was shown to be, in general, undecidable in Turing’s original paper. Rice’s theorem shows that any non-trivial question about the output of a Turing machine is undecidable.

A universal Turing machine can calculate any recursive function, decide any recursive language, and accept any recursively enumerable language. According to the Church-Turing thesis, the problems solvable by a universal Turing machine are exactly those problems solvable by an algorithm or an effective method of computation, for any reasonable definition of those terms.
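The undecidability of the Halting problem mentioned in the quote can be made tangible with a small sketch. It is only an illustration of the diagonal argument, not a proof: `make_contrarian` and `pessimist` are hypothetical names of ours, and the "decider" is a deliberately naive stand-in for any procedure that claims to predict halting.

```python
def make_contrarian(claimed_halts):
    """Build a program that does the opposite of what the claimed
    halting decider predicts about that very program."""
    def contrarian():
        if claimed_halts(contrarian):
            while True:          # predicted to halt, so loop forever
                pass
        return "halted"          # predicted to loop, so halt at once
    return contrarian


def pessimist(program):
    """Hypothetical decider that claims no program ever halts."""
    return False


contrarian = make_contrarian(pessimist)
prediction = pessimist(contrarian)   # the decider says: does not halt
outcome = contrarian()               # yet the program halts immediately
```

A decider that answered "halts" instead would be refuted in the same way, since the contrarian program would then loop forever. Whatever the decider answers about this program, the program does the opposite.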

One could add, firstly, that any recursive algorithm can be linearized (and vice versa). Secondly, algorithms are defined as procedures that produce a defined result after a finite amount of time.

Here is already the first problem in computational theory. What is a result? Is it a fixed value, or would we accept a probability density or even a class of those (like Dirac’s delta) also as a result? Even non-deterministic Turing machines yield unique results. The alternative of an indeterminable result sounds quite counter-intuitive and I suppose that it indeed can not be subsumed under the classical theory of computability. It would simply mean that the results of a UTM are only weakly predictable. We will return to that point a bit later.

Another issue is induced by problem size. While analytic undecidability makes it impossible for the putative computational procedure to stop, sheer problem size may render a problem as good as undecidable. Solution spaces can be really large, beyond 10^2000 possible solutions. Compare this to the estimated 10^80 atoms of visible matter in the whole universe. Such solution spaces are also an indirect consequence of Quine’s principle of underdetermination of an empirical situation, which results in the epistemological fact of indeterminacy of any kind of translation. We will discuss this elsewhere (not yet determined chapter…) in more detail.
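To get a feeling for these magnitudes, here is a small back-of-the-envelope calculation. The two sizes are the ones quoted above; the checking rate of one candidate per nanosecond per atom is a purely hypothetical assumption of ours, chosen to be absurdly generous.

```python
import math

ATOMS_IN_UNIVERSE = 10**80      # estimate quoted in the text
SOLUTION_SPACE = 10**2000       # size quoted in the text

# Number of binary decisions needed to span the solution space.
bits_needed = math.ceil(2000 * math.log2(10))

# Hypothetical rate: every atom tests one candidate per nanosecond.
CHECKS_PER_SECOND_PER_ATOM = 10**9

# Seconds needed for an exhaustive search (exact integer arithmetic).
seconds = SOLUTION_SPACE // (ATOMS_IN_UNIVERSE * CHECKS_PER_SECOND_PER_ATOM)

print(bits_needed)          # length of a bit string spanning the space
print(len(str(seconds)))    # number of decimal digits of the search time
```

So a bit string of only a few thousand positions is enough to span a solution space that no exhaustive search could ever traverse, even if the whole visible universe took part in the search.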

From the perspective of an entity searching through such a large solution space it does not matter very much whether the solution space is ill-defined or vast; from the perspective of the machine controller (“user”) both cases belong to the same class of problems: There is no analytic solution available. Let us now return to the above cited question about the behavior of other entities. Even for the trivial case that the interactee is a Turing machine, the question about the behavior is undecidable. That means that any kind of interaction can not be computed using a UTM, particularly those between epistemic beings. Besides the difficulties this raises for the status of simulation, this means that we need an approach which is not derived from or included in the paradigm established by the Church-Turing thesis.

The UTM as the abstract predecessor of today’s digital computers is based on the operations of writing and deleting symbols. Before a UTM can start to work, the task to be computed needs to be encoded. Once the task has actually been encoded, including the rules necessary to accomplish the computation, everything that happens is just an almost material moving of semantically empty graphemes. (We avoid calling the 0 and 1 “symbols” here, since “symbol” is a compound concept and could hence introduce complications to our investigation.) During the operations of the UTM, the total amount of information is constantly decreasing. Moreover, a UTM is not only initially completely devoid of any meaning, it remains semantically empty during the whole period it works on the task. Any meaning concerning the UTM remains forever outside the UTM. This remains true even if the UTM were to operate at the speed of light.
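The following minimal sketch of a single-tape machine may illustrate this moving of graphemes. It is a toy under our own assumptions, not a universal machine: the rule table `INVERT`, the state names and the blank grapheme `_` are all hypothetical choices made for the example.

```python
def run_turing_machine(rules, tape, state="start", blank="_", max_steps=10_000):
    """Run a single-tape machine. `rules` maps (state, grapheme) to
    (grapheme_to_write, head_move, next_state)."""
    cells = dict(enumerate(tape))        # sparse tape: position -> grapheme
    head, steps = 0, 0
    while state != "halt":
        if steps >= max_steps:
            raise RuntimeError("step budget exhausted before halting")
        grapheme = cells.get(head, blank)
        write, move, state = rules[(state, grapheme)]
        cells[head] = write              # overwrite the current cell
        head += {"L": -1, "R": 1}[move]  # move the head one cell
        steps += 1
    if not cells:
        return ""
    lo, hi = min(cells), max(cells)
    return "".join(cells.get(i, blank) for i in range(lo, hi + 1)).strip(blank)


# Hypothetical rule table: replace each 0 by 1 and each 1 by 0,
# then halt at the first blank cell.
INVERT = {
    ("start", "0"): ("1", "R", "start"),
    ("start", "1"): ("0", "R", "start"),
    ("start", "_"): ("_", "R", "halt"),
}

result = run_turing_machine(INVERT, "1011")
print(result)
```

Nothing in the rule table "knows" that we read the result as a bitwise complement. That reading is assigned from outside the machine, which only looks up a table entry, overwrites a cell and moves.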

Note that we are not discussing the architecture of an actual physical computing device. Everybody uses devices that are built according to the von Neumann architecture. There are very few (artificial) computers on this earth not following this paradigm. Yet, it is unclear why DNA computers or even quantum computers should not fall into this category. The processing in these computers is different from that of an instance that computes based on traditional logics, physically realized as transistors. Yet, the von Neumann architecture does not make any proposal about the processor except that there needs to be one. Such fancy computers still need persistent storage, a bus system, and encoding and decoding devices.

As said, our concern is not the architecture or, even more trivially, different speeds of calculation. Likewise, the question of non-Turing computing is not a matter of accuracy. For instance, it is sometimes claimed that a UTM can simulate an analog neural net with arbitrary accuracy. (More on that later!) The issue at stake has much more to do with the role of encoding, the status of information and being an embodied entity than with the question of how to arrange physical components.

Our suggestion here is that any kind of computer could probably be used in such a way that it turns into a non-Turing computer. In order to deal with this question we first have to discuss the contemporary concept of “computation.”


To get clear about the concept of “computation” does not include the attempt to find an answer to the question “What is computation?”, as for instance Jack Copeland did [3]. Such a question can not be included in any serious attempt of getting clear about it, precisely because it is not an ontological question. There are numerous attempts to define computation, then to invoke some intuitively “clear” or otherwise “indisputable” “facts”, only in order to claim an ontological status of the respective proposal. This of course is ridiculous, at least nowadays after the Linguistic Turn. Yet, the conflation of definitory means and ontic status is just (very) naive metaphysics, if not to say esoterism in scientifically looking clothes. The only thing we can do is to get clear about possible “reasonable” ways of usage of the concepts in question.

In philosophy of mind and cognitive science, and thus also for our investigation of machine-based epistemology, the interest in getting clear about computation is given by two issues. First, there is the question whether, and if so to what extent, the brain can be assigned “a computational interpretation.” To address this question we have to clarify what “computing” could mean and whether the concept of “brain” could match any of the reasonable definitions for computing. Second, as a matter of fact we know before any such investigation that we, in order to create a machine able to follow epistemological topics, have at least to start with some kind of programming. The question here is simply how to start practically. This concerns methods, algorithms, or machine architectures. A hidden but important derivative of this question concerns the possible schemes of differentiation of an initial artifact, which indeed is likely to be just a piece of software running on a contemporary standard digital computer.

These questions that are related to the mind are not in the focus of this chapter. We will return to them elsewhere. First, and that’s our interest here, we have to clarify the usage of the concept of computation. Francesco Nir writes [4]:

According to proponents of computationalism, minds are computers, i.e., mechanisms that perform computations. In my view, the main reason for the controversy about whether computationalism is accurate in its current form, or how to assess its adequacy is the lack of a satisfactory theory of computation.

It is obvious that not only the concepts of computation, brain and mind are at stake and have to be clarified, but also the concept of theory. If we followed a completely weird concept of “theory,” i.e. if our attempts tried to follow an impossible logical structure, we would have no chance to find appropriate solutions for those questions. We would not even be able to find appropriate answers about the role of our actions. This, of course, is true for any work; hence we will discuss the issue of “theory” in detail in another chapter. Similarly, it would definitely be too limited to conceive of a computer just as a digital computer running some algorithm (all of them are finite by definition).

The history of computation as an institutionalized activity starts in the Middle Ages. Of course, people performed calculations long before. The ancient Egyptians even used algorithms for problems that can’t be written in a closed form. In classical antiquity, there were algorithms to calculate pi or square roots. Yet, only in the Middle Ages did the concept of “computare” get a definite institutional, i.e. functional meaning. It referred to the calculation of future Easter dates. The first scientific attempts to define computation start mainly with works published by Alan Turing and Alonzo Church, which were later combined into the so-called Church-Turing Thesis (CTT).

The CTT is a claim about effectively computable functions, nothing more, nothing less. Turing found that everything which is computable in finite time (and hence also on a finite stretch of tape) by his a-machine (later called Turing machine) is also computable in the λ-calculus, and vice versa. As an effect, computability is equated with the series of actions a Turing machine can perform. As stated above, even Universal Turing Machines (UTM) can’t solve the Halting problem. There are even functions that can’t be computed by a UTM.

It has been claimed that computation is just the sequential arrangement of input, transformation, and output. Yet, as Copeland and Nir correctly state, citing Searle therein, this would render even a wall into a computer. So we need something more exact. Copeland ends with the following characterization:

“It is always an empirical question whether or not there exists a labelling of some given naturally occurring system such that the system forms an honest model of some architecture-algorithm specification. And notwithstanding the truism that ‘syntax is not intrinsic to physics’ the discovery of this architecture-algorithm specification and labelling may be the key to understanding the system’s organisation and function.”

The strength of this attempt is the incorporation of the relation between algorithm and (machine) architecture into the theory. The weakness is given by the term “honest,” which is completely misplaced in the formal arguments Copeland builds up. If we remember that “algorithm” means “definite results in finite time and space” we quickly see that Copeland’s concept of computation is by far too narrow.

Recently, Wilfried Sieg tried to clarify the issues around computation and computability in a series of papers [5,6]. Similarly to Nir (see above), he starts his analysis writing:

“To investigate calculations is to analyze symbolic processes carried out by calculators; that is a lesson we owe to Turing. Taking the lesson seriously, I will formulate restrictive conditions and well motivated axioms for two types of calculators, namely, for human (computing) agents and mechanical (computing) devices. 1 My objective is to resolve central foundational problems in logic and cognitive science that require a deeper understanding of the nature of calculations. Without such an understanding, neither the scope of undecidability and incompleteness results in logic nor the significance of computational models in cognitive science can be explored in their proper generality.” [5]

Sieg follows (and improves) largely an argument originally developed by Robin Gandy. Sieg characterizes it (p.12):

“Gandy’s Central Thesis is naturally formulated as the claim that any mechanical device can be represented as a dynamical system satisfying the above principles.”

By this he means four limiting principles that prevent everything from being regarded as a computer. He then proceeds:

“I no longer take a Gandy machine to be a dynamical system 〈S, F〉 (satisfying Gandy’s principles), but rather a structure M consisting of a structural class S of states together with two kinds of patterns and operations on (instantiations of) the latter;”

[decorations by W.Sieg]

What is a dynamical system for Sieg and Gandy? Just before (p.11), Sieg describes it as follows:

“Gandy’s characterization […] is given in terms of discrete dynamical systems 〈S, F〉, where S is the set of states and F governs the system’s evolution. More precisely, S is a structural class, i.e., a subclass of the hereditarily finite sets HF over an infinite set U of atoms that is closed under ∈-isomorphisms, and F is a structural operation from S to S, i.e., a transformation that is, roughly speaking, invariant under permutations of atoms. These dynamical systems have to satisfy four restrictive principles.”

[decorations by W.Sieg]

We may drop further discussion of these principles, since they just add further restrictions. From the last two quotes one can see two important constraints. First, the dynamical systems under consideration are of a discrete character. Second, any transformation leads from a well-defined (and unique) state to another such state.

The basic limitation is already provided in the very first sentence of Sieg’s paper: “To investigate calculations is to analyze symbolic processes carried out by calculators;” There are two basic objections, which lead us to deny Sieg’s claim that his approach provides the basis for a general account of computation.

Firstly, from epistemology it is clear that there are no symbols out in the world. As a matter of principle, we can’t even transfer symbols directly between brains or minds. We just say so as an abbreviation. Even if our machine worked completely mechanically, Sieg’s approach would be insufficient to explain a “human computor.” His analysis is valid only for machines belonging (as a subclass) to the group of Turing machines that run finite algorithms. Hence, his analysis also suffers from the same restrictions. Turing machines can not make any proposal about other Turing machines. We may summarize this first point by saying that Sieg thus commits the same misunderstanding as the classical (strong) notion of artificial intelligence did. Meanwhile there has been a large, extensive and somewhat bewildering debate about symbolism and sub-symbolism (in connectionism) that only stopped due to the exhaustion of the participants and the practical failure of strong AI.

The second objection against Sieg’s approach comes from Wittgenstein’s philosophy. According to Wittgenstein, we can not have a private language [8]. In other words, our brains can not have a language of thinking, as such a homunculus arrangement would always be private by definition. Searle and Putnam agree on that in rare concordance. Hence it is also impossible that our brain is “doing calculations” as something that is different from the activities we perform when we calculate with pencil and paper, or sand, or a computer and electricity. This brings us to a widespread misunderstanding about what computers really do. Computers do not calculate. They do not calculate in the same respect as our human brain does not calculate. Computers just perform moves, deletions and, according to their theory, sometimes also insertions into a string of atomic graphemes. Computers do not calculate in the same way as the pencil is not calculating while we use it to write formulas or numbers. The same is true for the brain. What we call calculation is the assignment of meaning to a particular activity that is embedded in the Lebenswelt, the general fuzzy “network”, or “milieu” of rules and acts of rule-following. Meaning, on the other hand, is not a mental entity, as Wilhelm Vossenkuhl emphasizes throughout his interpretation of Wittgenstein’s work.

The obvious fact that we as humans are capable of using language and symbols again brings to the foreground the question which we already addressed elsewhere (in our editorial essay): How do words acquire meaning? (van Fraassen), or in terms of the machine-learning community: How to ground symbols? Whatever the answer will be (we will propose one in the chapter about conditions), we should not fallaciously take the symptom (using language and symbols) for the underlying process, “cause”, or structure. Using language clearly does not indicate that our brain is employing language to “have thoughts.”

There are still other suggestions about a theory of computation. Yet, they either can be subsumed under the three approaches discussed here, provided by Copeland, Nir, and Sieg, or they fall short of the distinction between Turing computability, calculation and computation, or they are merely confused by the shortfalls of reductionist materialism. An example is the article by Goldin and Wegner, where they basically equate computation with interaction [9].

As an intermediate result we can state that there is no theory of computation so far that would be appropriate to serve as a basis for the debate around epistemological and philosophical issues concerning our machines and our mind. So, how to conceive of computation?

Computation: An extended Perspective

All of the theories of computation refer to the concept of algorithm. Yet, even deterministic algorithms may run forever if the solution space is defined in a self-referential manner. There are also a lot of procedures that can be made to run on a computer, which follow “analytic rules” and will never stop running. (By “analytic rule” we understand a definite, completely determined and encoded rule that may be run on a UTM.)

Here we meet again the basic intention of Turing: His work in [1] has been about the calculability of functions. In other words, time is essentially excluded by his notion (and also in Sieg’s and Gandy’s extensions of Turing’s work). It does not matter, whether the whole of all symbol manipulations are accomplished in a femto-second or in a giga-second. Ontologically, there is just a single block: the function.

At this point we can easily recognize the different ways of branching off from the classical, i.e. Turing-theory based, understanding of computation. Since Turing’s concept is well-defined, there are obviously several ways to conceive of something different. These, however, boil down to three principles.

  • (1) referring to (predefined) symbols;
  • (2) referring to functions;
  • (3) based on uniquely defined states.

Any kind of Non-Turing computation can be subsumed to either of these principles. These principles may also be combined. For instance, algorithms in the standard definition as given first by Donald Knuth refer to all three of them, while some computational procedures like the Game of Life, or some so-called “genetic algorithms” (which are not algorithms by definition) do not necessarily refer to (2) and (3). We may loosely distinguish weakly Non-Turing (WNT) structures from strongly Non-Turing (SNT) structures.
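For readers who do not know the Game of Life, here is a minimal sketch of its update rule. The function name `life_step` and the representation of the board as a set of coordinates are our own choices for illustration. Of course, run on a standard digital computer each generation is just another data structure; the sketch only shows how little the local rule says about the patterns it produces.

```python
from collections import Counter


def life_step(alive):
    """One synchronous update; `alive` is a set of (x, y) live cells."""
    # Count, for every cell next to a live cell, how many live neighbours it has.
    neighbour_counts = Counter(
        (x + dx, y + dy)
        for x, y in alive
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # Birth with exactly 3 neighbours, survival with 2 or 3.
    return {
        cell
        for cell, n in neighbour_counts.items()
        if n == 3 or (n == 2 and cell in alive)
    }


# A row of three cells flips between horizontal and vertical.
blinker = {(0, 1), (1, 1), (2, 1)}
next_generation = life_step(blinker)
```

The rule mentions only a cell and its eight neighbours. Notions like "blinker" or "oscillation" do not appear in it; they belong to the description of what we observe on the board.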

All of the three principles vanish, and thus the story about computation changes completely, if we allow for a signal horizon inside the machine process. Immediately, we would have myriads of read/write devices all working on the same tape. Note that this situation is not parallel processing, where one would have lots of Turing machines, each of them working on its own tape. Such parallelism is equivalent to a single Turing machine, just working faster. Of course, exactly this is intended in standard parallel processing as it is implemented today.

Our shared-tape parallelism is strikingly different. Here, even though we would still have “analytic rules,” the effect of the signal horizon could be world-breaking. I guess exactly this was the basis for Turing’s interest in the question of the principles of morphogenesis [10]. Although we have only determinate rules, we find the emergence of properties that can’t be predicted on the basis of those rules, neither quantitatively nor, even more importantly, qualitatively. There is not even the possibility of a language on the lower level to express what has emerged from it. Such an embedding turns our analytic rules into “mechanisms.”

Due to the determinateness of the rules we still may talk about computational processes. Yet, there are no calculations of functions any more. The solution space gets extended by performing the computation. It is an empirical question to what extent we can use such mechanisms and systems built from such mechanisms to find “solutions.” Note, that such solutions are not intrinsically given by the task. Nevertheless, they may help us from the perspective of the usage to proceed.

A lot of debates about deterministic chaos, self-organization, and complexity are invoked by such a turn. At least the topic of complexity we will discuss in detail elsewhere. Notwithstanding, we may call any process that is based on mechanisms and that extends the solution space by its own activity Proper Non-Turing Computation.

Non-Turing Computation

We now have to discuss the concept of Non-Turing Computation (NTC) more explicitly. We will not yet talk about Non-deterministic Turing Machines (NTM), nor about exotic relativistic computers, i.e. Turing machines running in a black hole or its vicinity. Note also that as long as we perform an activity that is finally going to be interpreted as a solution for a function, we are still in the area defined by Turing’s theory, whether such an activity is based on so-called analog computers, DNA or quantum dots. A good example for such a misunderstanding is given in [11]. MacLennan [12] emphasizes that Turing’s theory is based on a particular model (or class of models) and its accompanying axiomatics. Based on a different model we achieve a different way of computation. Although MacLennan provides a set of definitions of “computation” against the background of what he labels “natural computation,” his contribution remains too superficial for our purposes. (He also does not distinguish between mechanistic and mechanismic.)

First of all, we distinguish between “calculation” and “computation.” Calculating is completely within the domain of the axiomatic use of graphemes (again, we avoid using “symbol” here). An example is 71+52. How do we know that the result is 123? Simply by following the determinate rules that are all based on mathematical axioms. Such calculations do not add anything new, even if a particular one has been performed for the first time ever. Their solution space is axiomatically confined. Thus, UTM and λ-calculus are equivalent, as are mathematical calculation and calculations performed by a UTM or by humans. Calculation, then, is equivalent to following the defined deterministic rules. We achieve the results by combining a mathematical model and some “input” parameters. Note that this “superposition” destroys information. Remarkably, neither the UTM nor its physical realization as a package consisting of digital electronics and a particular kind of software can be conceived as a body, not even metaphorically.
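The example 71+52 can be spelled out as pure rule-following on strings of graphemes. The sketch below is our own illustration with a hypothetical function name; it works through the numerals from right to left with a carry, as in school arithmetic. For brevity it still leans on Python’s built-in addition for the single-digit sums, which a strict version would replace by a lookup table.

```python
def add_decimal(a: str, b: str) -> str:
    """Add two non-negative decimal numerals digit by digit with carry."""
    digits = "0123456789"
    width = max(len(a), len(b))
    a, b = a.rjust(width, "0"), b.rjust(width, "0")   # pad to equal length
    result, carry = [], 0
    for da, db in zip(reversed(a), reversed(b)):      # rightmost digit first
        total = digits.index(da) + digits.index(db) + carry
        result.append(digits[total % 10])             # grapheme to write down
        carry = total // 10                           # 0 or 1, carried left
    if carry:
        result.append(digits[carry])
    return "".join(reversed(result))


print(add_decimal("71", "52"))
```

The procedure never leaves the confined space of numerals. Every possible outcome is fixed in advance by the rules, which is exactly what we mean by calculation here.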

In contrast to that, by introducing a signal horizon we get processes that provoke a basic duality. On the one hand they are based on rules which can be written down explicitly; they may even be “analytic.” Nevertheless, if we run these rules under the condition of a signal horizon we get (strongly) emergent patterns and structures. The description of those patterns or structures can not be reduced to the descriptions of the rules (or the rules themselves) in principle. This is valid even for those cases where the rules on the micro-level are indeed algorithms, i.e. rules delivering definite results in finite time and space.

Still, we have a lot of elementary calculations, but the result is not given by the axioms according to which we perform these calculations. Notably, introducing a signal horizon is equivalent to introducing the abstract body. So how to call calculations that extend their own axiomatic basis?

We suggest that this kind of processes could be called Non-Turing Computation, despite the fact that Turing was definitely aware about the constraints of the UTM, and despite the fact that it was Turing who invented the reaction-diffusion-system as a Non-UTM-mechanism.
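To give an impression of what a reaction-diffusion mechanism looks like on the level of its local rules, here is a small numerical sketch. It is our own toy, in the spirit of a ring of cells with two linearly interacting substances; the coefficients, the step size and the function names are hypothetical choices, not taken from Turing’s paper. Being a simulation on a digital machine, it illustrates only the local rules, not the physical actualization we are concerned with.

```python
def ring_step(u, v, dt=0.1, du=0.02, dv=1.0):
    """One explicit Euler step for two substances on a ring of cells."""
    n = len(u)
    new_u, new_v = [], []
    for i in range(n):
        left, right = (i - 1) % n, (i + 1) % n
        # Exchange with the two neighbouring cells (discrete diffusion).
        lap_u = u[left] - 2 * u[i] + u[right]
        lap_v = v[left] - 2 * v[i] + v[right]
        # Hypothetical linear kinetics: u activates, v inhibits.
        new_u.append(u[i] + dt * (1.0 * u[i] - 1.0 * v[i] + du * lap_u))
        new_v.append(v[i] + dt * (2.0 * u[i] - 1.5 * v[i] + dv * lap_v))
    return new_u, new_v


def spread(values):
    """Difference between the largest and the smallest concentration."""
    return max(values) - min(values)


def simulate(n_cells=20, steps=200, kick=0.01):
    """Start from rest, disturb a single cell, and iterate the local rule."""
    u, v = [0.0] * n_cells, [0.0] * n_cells
    u[0] = kick
    for _ in range(steps):
        u, v = ring_step(u, v)
    return u, v


u, v = simulate()
print(spread(u))
```

With these particular numbers a ring in which every cell is disturbed by the same amount relaxes back to rest, while a disturbance of a single cell grows into strong differences between neighbouring cells, because the inhibiting substance spreads much faster than the activating one. Other coefficients behave differently, so the sketch should be taken as one instance, not as a general statement.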

The label Non-Turing Computation just indicates that

  • – there is a strong difference between calculations under conditions of functional logics (λ-calculus) and calculations in an abstract and, of course, also in a concrete body, implied by the signal horizon and the related symmetry breaking; the first may be called (determinate) calculation, the latter (indeterminate) computation
  • – the calculations on the micro-level extend the axiomatic basis on the macro-level, leading to the fact that “local algorithmicity” does not coincide any longer with “global algorithmicity”;
  • – nevertheless all calculations on the micro-level may be given explicitly as (though “local”) algorithms.

Three notes are indicated here. Firstly, it does not matter for our argument, whether in a real body there are actually strict rules “implemented” as in a digital computer. The assumption that there are such rules plays the role of a worst-case assumption. If it is possible to get a non-deterministic result despite the determinacy of calculations on the micro-level, then we can proceed with our intention, that a machine-based epistemology is possible. At the same time this argument does not necessarily support either the perspective of functionalism (claiming statefulness of entities) or that of computationalism (grounding on “algorithmic framework”).

Secondly, despite the simplicity and even analyticity of local algorithms, a UTM is not able to calculate a physical actualization of a system that performs non-Turing computations. The reason is that it is not defined in a way that it could. One of the consequences of embedding trivial calculations into a factual signal horizon is that the whole “system” has no defined state any more. Of course we can interpret the appearance of such a system and classify it. Yet, we can not claim anymore that the “system” has a state which could be analytically defined or recognized as such. Such a “system” (like the reaction-diffusion systems) can not be described with a framework that allows only unique states, such as the UTM, nor can a UTM represent such a system. Here, many aspects come to the fore that are closely related to complexity. We will discuss them over there!

The third note finally concerns the label itself. Non-Turing computation could be any computation based on a customizable engine where there is no symbolic encoding, or where there are no identifiable states while the machine is running. Besides complex systems, there are other architectures, such as so-called analog computers. In some quite justifiable way, we could indeed conceive of the simulation of a complex self-organizing system as an analog computer. Another possibility is given by evolvable hardware, like FPGAs, even though the actual programming is still based on symbolic encoding. Finally, it has been suggested that any mapping of real-world data (e.g. sensory input) that are representable only by real numbers to a finite set of intensions is also non-Turing computation [13].

What is the result of an indeterminate computation, or, in order to use the redefined term, Non-Turing computation? We are not allowed to expect “unique” results anymore. Sometimes, there might be several results at the same time. A solution might be even outside of the initial solution space, causing a particular blindness of the instance performing non-Turing computations against the results of its activities. Dealing with such issues can not be regarded as an issue of a theory of calculability, or any formal theory of computation. Formal theories can not deal with self-induced extension of solution spaces.

The Role of Symbols

Before we draw a conclusion, we have to discuss the role of symbols. Here we have, of course, to refer to semiotics. (…)

keywords: CS Peirce, symbolism, (pseudo-) sub-symbolism, data type in NTC as actualization of associativity (which could be quite different) network theory (there: randolation)


Our investigation of computation and Non-Turing Computation yields a distinction between different ways of actualizing Non-Turing computation. Yet, there is one particular structure that is so different from Turing’s theory that it can not even be compared to it. Naturally, this addresses the penultimate precondition of Turing machines: axiomatics. If we perform a computation in the sense of strong rule-following, which could even be based on predefined symbols, we nevertheless may end up with a machine that extends its own axiomatic basis. For us, this seems to be the core property of Non-Turing Computation.

Yet, such a machine has not been built so far. We provided just the necessary conditions for it. It is clear that mainly the software is missing for an actualization of such a machine. If in some near future such a machine would exist, however, this also would have consequences concerning the status of the human mind, though rather undramatic ones.

Our contribution to the debate about the relation of “computers” and “minds” spans three aspects. Firstly, it should be clear that the traditional frame of “computationalism,” mainly based on the equivalence to the UTM, can be recognized as an inappropriate hypothesis. For instance, questions like “Is the human brain a computer?” can be identified as inadequate, since it is not a priori clear what a computer should be (besides falling thereby into the anti-linguistic trap). David King even asked (more garbageful) “Is the human mind a Turing machine?” [14]. King concludes that:

“So if we believe that we are more than Turing machines, a belief in a kind of Cartesian dualist gulf between the mental and the physical seems to be concomitant.”

He arrives at that (wrong) conclusion by some (deeply non-Wittgensteinian) reflections about the actual infinite and Cantor’s (non-sensical) ideas about it. It is simply an ill-posed question whether the human mind can solve problems a UTM can’t. Most of the problems we as humans deal with all day long can not be “solved” (within the same day), and many can not even be represented to a UTM, since this would require definite encoding into a string of graphemes. Indeed, we can deal with those problems without solving them “analytically.” King is not aware of the poison of analyticity imported through the direct comparison with the UTM.

This brings us to the second aspect, the state of mechanisms. The denial of the superiority, or even the equality, of brains and UTMs does not amount to the acceptance of some top-down principle, as King suggests in the passage cited above. UTMs, as any other algorithmic machine, are finite state automata (FSA). FSA, and even probabilistic or non-deterministic FSA, totalize the mechanics such that they become equivalent to a function, as Turing himself clearly stated. Yet, the brain and mind could be recognized as something that indeed rests on very simple (material) mechanisms, while these mechanisms (say algorithms) are definitely not sufficient to explain anything about the brain or the mind. From that perspective we could even conclude that we can only build such a machine if we fully embrace the transcendental role of so-called “natural” languages, as it has been recognized by Wittgenstein and others.

The third and final aspect of our results concerns the effect of these mechanisms on the theory. Since the elementary operations are still mechanical and maybe even finite and fully determined, it is fully justified to call such a process a calculation. Molecular operations are indeed highly determinate, yet only within the boundaries of quantum phenomena, and not to forget the thermal noise on the level of the condition of the possible. Organisms invest a lot to improve the signal-to-noise ratios up to a digital level. Yet, this calculation is not a standard computation for two reasons: First, these processes are not programmable. They are as they are, as a matter of fact and by means of the factual matter. Secondly, the whole process is not a well-defined calculation any more. There is not even a state. At the borderlines between matter, its formation (within processes of interpretation themselves part of that borderline zone), and information something new is appearing (emerging?) that can’t be covered by the presuppositions of the lower levels.

As a model then (and we anyway always have to model in each single “event”; we will return to that elsewhere) we could refer to axiomatics. It is an undeniable fact that we as persons can think more and in more generality than amoebas or neurons. Yet, even in the case of reptiles, dogs, cats or dolphins, we could not say “more” anymore; it is a “different” rather than a “more” that we have to apply to describe the relationship between our thinking and theirs. Still, dogs or chimpanzees did not develop the insight into the limitations of the λ-calculus.

As a conclusion, we could describe the “Non-Turing computation” with regard to the stability of its own axiomatic basis. Non-Turing computation extends its own axiomatic basis. From the perspective of the integrated entity, however, we can call it differentiation, or abstract growth. We already appreciated Turing’s contribution on that topic above. Just imagine to imagine images like those before actually having seen them…

There are some topics that directly emerge from these results, forming kind of a (friendly) conceptual neighborhood.

  • – What is the relation between abstract growth / differentiation and (probabilistic) networks?
  • – Part of the answer to this first issue is likely given by the phenomenon of a particular transition from the probabilistic to the propositional, which also plays a role concerning the symbolic.
  • – We have to clarify the notion “extending an axiomatic basis”. This relates us further to evolution, and particularly to the evolution of symbolic spaces, which in turn is related to category theory and some basic notions about the concepts of comparison, relation, and abstraction.
  • – The relationship of Non-Turing Computation to the concepts of “model” and “theory.”
  • – Is there an ultimate boundary for that extension, some kind of conditional system that can’t be surpassed, and how could we speak about that?
  • [1] Alan M. Turing (1936), On Computable Numbers, With an Application to the Entscheidungsproblem. Proceedings of the London Mathematical Society, Series 2, Volume 42 (1936), p.230-265.
  • [2] Andrew Hodges, Alan Turing.
  • [3] B. Jack Copeland (1996), What is Computation? Synthese 108: 335-359.
  • [4] Nir Fresco (2008), An Analysis of the Criteria for Evaluating Adequate Theories of Computation. Minds & Machines 18:379–401.
  • [5] Sieg, Wilfried, “Calculations by Man and Machine: Conceptual Analysis” (2000). Department of Philosophy. Paper 178.
  • [6] Sieg, Wilfried, “Church Without Dogma: Axioms for Computability” (2005). Department of Philosophy. Paper 119.
  • [7] Wilhelm Vossenkuhl, Ludwig Wittgenstein. 2003.
  • [8] Ludwig Wittgenstein, Philosophical Investigations §201; see also the Internet Encyclopedia of Philosophy
  • [9] Goldin and Wegner
  • [10] Alan M. Turing (1952), The Chemical Basis of Morphogenesis. Phil.Trans.Royal Soc. London. Series B, Biological Sciences, Vol.237, No. 641. (Aug. 14, 1952), pp. 37-72.
  • [11] Ed Blakey (2011), Computational Complexity in Non-Turing Models of Computation: The What, the Why and the How. Electronic Notes Theor Comp Sci 270: 17–28.
  • [12] Bruce J MacLennan (2009), Super-Turing or Non-Turing? Extending the Concept of Computation. Int. J. Unconvent. Comp., Vol 5 (3-4),p.369-387.
  • [13] Thomas M. Ott, Self-organised Clustering as a Basis for Cognition and Machine Intelligence. Thesis, ETH Zurich, 2007.
  • [14] David King (1996), Is the human mind a Turing machine? Synthese 108: 379-389.



Evolution in Associative Systems

October 26, 2011

When did you have the last thought

of which you think that it was a really novel one?

The fact is that this is happening probably just right now, according to neuroscientist Gerald Edelman. Edelman argues that this is a direct consequence of an ‘unlabeled world’. He denies that the instructionist concept (now better known as computationalism) can solve the puzzle that brain-bearing organisms can behave adaptively. He writes [1]:

To survive in its eco-niche, an organism must either inherit or create criteria that enable it to partition the world into perceptual categories according to its adaptive needs. Even after that partition occurs as a result of experience, the world remains to some extent an unlabeled place full of novelty.

Even the very basic standard process of categorizing perceived signals thus has to be inventive all the time. Edelman is convinced, like us, that a standard computer program as well as the whole class of rule-based systems cannot be inventive. What is needed is a framework that is able to genuinely create and establish novelty on its own. The candidate principle for him (and us alike) is evolution. Edelman proposes a model of group selection on the level of neurons.

It [m: this theory] argues that the ability of organisms to categorize an unlabeled world and behave in an adaptive fashion arises not from instruction or information transfer, but from processes of selection upon variation.

He continues to explain this idea in the following figure.

The most important part for the process of categorization and building up internal “representations” is in the bottom row, to which he comments:

Reentry. Binding of functionally segregated maps occurs in time through parallel selection and the correlation of the various maps’ neuronal groups. This process provides a fundamental basis for perceptual categorization. Dots at the ends of some of the active reciprocal connections indicate parallel and more or less simultaneous strengthening of synapses facilitating certain reentrant paths. Synaptic strengthening (or weakening) can occur in both the intrinsic and extrinsic reentrant connections of each map.

Re-entrant processes are indeed very important in order to get things clear in a group of units of an associative arrangement. Usually, modeling in associative systems leads to smooth arrangements, where a large number of conflicting activations are present. The boundaries between concepts remain unclear. A first-step solution is provided by reentrant signalling within the associative structure; Edelman formulates:

Such reentrant activity is constructive: because of its reciprocal and recursive properties and its parallel structure, reentry leads to new neuronal responses and it can resolve conflicts arising between the synaptic activities of different mapped areas […]. It should be sharply differentiated from feedback. Feedback is concerned with error correction and defined inputs and outputs, whereas reentry has no necessary preferred direction and no predefined input or output function.

According to Edelman, this reentrant mapping triggers a selection process. He first proposed this theory of selection on the neural level, working as a standard process in the working brain, in 1978 [2], and in an extended version in 1987 [3]. Edelman’s proposals for the linkage between processes on the neural level, perceptual categorization and even higher “functions” of the brain are highly plausible.

Yet, throughout his work, he uses just a rather intuitive notion of evolution. Well, one can imagine fuzzily how the creative play between variation and selection is going on according to Edelman’s scheme. Yet, he does not indicate exactly why this process should be called “evolutionary.” After all, not every selective process is already an evolutionary process. His whole theory hence suffers from the lack of a proper adaptation of the theory of evolution into the domain of neuroscience.

Evolutionary theory (ET) is a well-defined theoretical framework that “explains” the variety of living organisms on earth. The modern, and extended, version of the ET, developed by Ernst Mayr, Dobzhansky, Lorenz, Maynard-Smith, Richard Dawkins, Manfred Eigen and Peter Schuster among many others, incorporates genetics, mathematical models of population dynamics and mathematical game theory; it also uses insights from complex systems theory.

Despite some mathematical work about the dynamics of selective processes in natural evolution, a generalizing formalization of evolutionary theory, or of generalized evolutionary processes themselves, is still missing (as far as I can tell), which I find quite astonishing. Of course, one can find a lot of mathematical-looking papers; there is a whole journal (J.theor.Biol.) hosting such. Yet, we are convinced that a generalization of evolutionary processes based on formal representation should not be limited to just representing population dynamics by some formulas bringing in some notion of probability, while at the same time keeping the evolutionary vocabulary unchanged, i.e. still talking about species, mutations, pheno-genotypes, gene frequencies and so on. These are biological terms. Even if those terms are represented using the formalism of probability theory, the result is still a representation of biological structures. It is a widespread misunderstanding in theoretical work about evolution that such endeavors represent a generalization; in fact, no generalization has been achieved. As a consequence, a transfer into questions about cultural evolution is inappropriate.

As a consequence, we can adopt the idea of evolution only vaguely and imprecisely into other domains outside of biology (culture, economy, general systems theory). The task of generalizing natural evolution cannot be solved while keeping the terms of biology. We do not have to describe biological mechanisms using the generalizing language of mathematics; instead we have to generalize the theory itself, that is, we have to abstract its most basic terms. This does not mean that we must dismiss all the mechanisms invented by natural evolution once and for all. But they cannot be elements of a general theory of evolution.

This is what the rest of this paper is about. As a result, we will have a formulation of evolutionary theory at our disposal that is not only much more general than the biological version of it, but can also be transferred readily to any other scientific domain without using metaphors or introducing theoretical shortfalls. Examples of such domains are economy, social sciences, or urbanism.

Formalizing the Theory of Evolution

From a bird’s-eye view one could say that evolution is the historical creation of information through complexity. The concept of evolution has been conceived primarily as a concept from biology for more than 150 years. Almost since its first coherent formulation by Darwin (2003) there have also always been attempts to apply it to the progression of human culture. Generally, those attempts have been refuted, because a more or less direct transfer implicitly denies cultural achievements. Thus, we have to reconstruct a more general form of evolutionary theory in order to import it to a theory about the change of cities. Unfortunately, we can provide only a brief outline of the reformulation here.

In biology, evolutionary theory has been revised or extended several times. Nevertheless, its core can still be compressed to the following two propositions (the plus sign does not mean arithmetic addition here), which express the basic elements of this theory (you may find more complicated looking versions here [4], but they all boil down to the following):

Evolution = Variation + Heredity + Selection      (1)

Fitness = number of offspring in the second filial generation F2      (2)

The concept of fitness is the core of the operationalization of evolutionary theory. As a measure it is only meaningful within a system of competing species. There are, of course, a lot of side conditions one has to be aware of, and the mechanisms regarding the single terms of these equations are still under investigation. These equations reflect the characteristics of biological matter, i.e. genes and physiology making up a body, which is immersed in a population of bodies, similar (within a species) and different ones (competing species). A brain, or even a group of neurons, does not have such a structure; thus we have to extract the abstract structure from the equations above in order to harvest the benefit.

Fortunately, this is quite easy. Our key element is a probabilized version of memory, where memory is a possibly structured random process that renders bits of information unreliable up to their complete deletion. This concept is very different from the concept of memes, which refers “directly” to the metaphor of the gene. Neuronal maps, however, are not defined by something discrete like genes. In neuronal maps (or any other non-biological entity that one wants to view “under the light of evolution”) the mechanism of information transfer is completely different from, and totally incommensurable with, that of a biological entity like a “species.” Above all, there is no such thing as a “neuronal code” or even a “neuronal language,” as has been known for quite some time. The same conclusion can be drawn from Wittgenstein’s philosophical work. The concept of the “meme” is ill-designed from the ground up.

We start by conceiving heredity and selection as (abstract) memories. Heredity is obviously a highly accurate long-term memory, actualized as lasting, replicable structures, called DNA, although DNA is not the only lasting memory in eukaryotic organisms. Nowadays it is well established that there is an epigenetic pathway (external to DNA/RNA) for passing information between generations. Obviously, epigenetic transmission of information is less stable than DNA-based transmission. After a few generations the trace is lost. Nevertheless, it is a strong memory, potentially very important to explain evolutionary phenomena.

Selection, on the other hand, is just the inverse of forgetting, or the difference established by it. Yet, if we try not to get trapped by the stickiness of words, we can clearly recognize that selection is just a particular memory function, especially as selection is an effect on the level of populations.

Finally, variation can be conceptualized as a randomness operator acting on those two kinds of memories. We may call it impreciseness, or an influence on the capability to evolve or on the speed of evolution; in any case we have some kind of transformation of some parts of the content of the memories, for whatsoever reason or mechanism.

Now we can reformulate all ingredients of the classical and basic evolutionary theory in terms of probabilistic memory.

Lemma 1: An organism is a device which “exists as” and which can maintain an assemblage M, called a probabilistic memory configuration. M consists of different kinds of memories m_i, each of different duration and resolution. “Probabilistic” here means that the parameters, such as duration or resolution, are not determined by some fixed value, but by probability densities. From the theoretical perspective, these densities may be interpreted as outcomes of random processes; from the perspective of scientific model identification, the “probabilistic turn” allows us to move from the question “what is it?” to the question about mechanisms and their orchestration.
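To make Lemma 1 a bit more tangible, here is a minimal sketch of how such a memory configuration might be represented in code. The class names, the choice of an exponential density for duration and a Gaussian density for resolution, and all numbers are assumptions made purely for illustration; the lemma itself does not prescribe any particular densities.

```python
import random
from dataclasses import dataclass


@dataclass
class Memory:
    """One memory m_i; its parameters are densities, not fixed values."""
    name: str
    mean_duration: float    # mean of an exponential density (assumed)
    mean_resolution: float  # mean of a Gaussian density (assumed)
    sd_resolution: float

    def sample(self, rng):
        # Draw one concrete (duration, resolution) pair from the densities.
        duration = rng.expovariate(1.0 / self.mean_duration)
        resolution = max(0.0, rng.gauss(self.mean_resolution, self.sd_resolution))
        return duration, resolution


@dataclass
class MemoryConfiguration:
    """The assemblage M: a collection of memories of different kinds."""
    memories: list

    def sample(self, rng):
        return {m.name: m.sample(rng) for m in self.memories}


rng = random.Random(0)
M = MemoryConfiguration([
    Memory("long_term", mean_duration=1000.0, mean_resolution=0.9, sd_resolution=0.05),
    Memory("short_term", mean_duration=10.0, mean_resolution=0.5, sd_resolution=0.2),
])
draw = M.sample(rng)
```

Each call to `M.sample` yields a different concrete realization of the same configuration, which is the sense in which the “organism” here is a distribution rather than a fixed record.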

Any embedding “evolutionary” process picks at least two different “memories” from that ensemble. We will return to this threshold condition about the number of memories in a moment. If the durations are sufficiently different, that organism as defined in (3) will be a “subject” of evolution, i.e. a “species” embedded in contingent historical constraints. For reasons of principle, a species should not be defined by an identity relation, as is done in evolutionary biology. In biology, a species is defined in a positive definite manner: if there is a DNA configuration that is in principle provably unique, then this DNA configuration is a species. Another individual has to have the same configuration (here is the identity relation) in order to establish or belong to that assumed species. This way of defining a species is deeply unsuitable, not only in biology (see the chapter about logics, where we will deal with the issue of “identity”).

Instead, in the next lemma we define it as a limit process, which expresses the compatibility of two different memory configurations.

Lemma 2: In a population of size n, we define the limit of the probability for a unification of two different sets of probabilistic memory configurations M (e.g. biological organisms, structures of cultural ensembles) under conditions of potential interaction. If the unification of two memory configurations cannot take place in a sustainable manner, or the probability for such unification tends towards zero, the two configurations can be conceived as “separated & competing probabilistic sustainability programs,” i.e. a quasi-species. A (quasi-)species [Q] can be defined by the limit value 0 of the following limit process:

If the memory configurations cannot overlap, we may call them a species. Yet, even in biology it is well known that this separation is never complete. A salient example is ducks, of which many “species” could interbreed. Yet, they rarely do. The main reason is different and incompatible rituals before mating. In other words, the incompatibility is in the realm of information. Yet, sometimes it happens nevertheless, because some bytes in the ritual overlap.
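The duck example can be turned into a toy estimate of the unification probability. The following sketch assumes, purely for illustration, that a mating ritual is a short bit string, that each population emits each bit with its own probability, and that a “unification” happens only if two sampled rituals agree completely. None of these modelling choices come from the lemma; they merely show how a probability that tends towards zero separates two configurations.

```python
import random


def sample_ritual(profile, rng):
    # profile[i] is the (assumed) probability that bit i of the ritual is 1.
    return [1 if rng.random() < p else 0 for p in profile]


def unification_probability(profile_a, profile_b, trials, rng):
    # Monte Carlo estimate: fraction of encounters with fully matching rituals.
    hits = 0
    for _ in range(trials):
        if sample_ritual(profile_a, rng) == sample_ritual(profile_b, rng):
            hits += 1
    return hits / trials


rng = random.Random(1)
# Two configurations with nearly the same ritual profile.
same = unification_probability([0.95] * 4, [0.95] * 4, 5000, rng)
# Two configurations with nearly opposite ritual profiles.
apart = unification_probability([0.95] * 4, [0.05] * 4, 5000, rng)
```

In this toy setting `apart` is small but not exactly zero, which mirrors the observation that the separation between quasi-species is never complete.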

Species, in the classical as well as in the probabilistic framework, do not only maintain memories in their inside. Any multi-particle system may establish some memory as a phenomenon on the level of the collection / population. The information stored therein may well be significant or even essential for the persistence or stability of the whole (quasi-)species.

Lemma 3: Given a population of organisms, the combination (synchronic union) of at least two cross-linked memories M with different duration d and, optionally, of different temporal resolution r, modified by an operator for synchronous randomness, results in an evolution, if most of the individuals contain such a memory structure.

Evolution results in the creation of stable memory configurations, which are able to maintain themselves across the boundary of the individual. A plethora of mechanisms can be observed in the case of biological evolution. Yet, for our abstraction here, only the effect is interesting.

Just one question remains to be answered: Why should there be at least two different kinds of memories in the population of entities in order to allow for evolution? Of course, in an organism, even in the simplest ones, we find many more levels and kinds of memories than just two. Yet, two memories are presumably the lower threshold. The reason is mirrored in two conflicting requirements: First, it is necessary for the entity to maintain long-term stability, which includes reproducible structural information. This memory needs to be a private memory, i.e. accessible only to the hosting individual itself. Secondly, the generation of variability requires a second memory outside of and largely independent from the first one. This second memory may be actualized as a “private” memory or as a memory shared between individuals.
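Read in conventional terms, the two-memory condition resembles a minimal loop of copying, filtering and perturbing. The sketch below is one possible reading under strong assumptions: the long-duration memory is a single number per individual that is copied across generations, the short-duration memory is the population-level record of which individuals persisted in the current round, and the randomness operator is Gaussian noise. The function name, the target value and all parameters are hypothetical.

```python
import random


def evolve(generations, pop_size, noise, target, rng):
    # Long-duration memory: one trait value per individual, copied onward.
    population = [0.0] * pop_size
    for _ in range(generations):
        # Short-duration, population-level memory: who persisted this round.
        survivors = sorted(population, key=lambda x: abs(x - target))[: pop_size // 2]
        # Randomness operator acting on the copied content.
        population = [rng.choice(survivors) + rng.gauss(0.0, noise)
                      for _ in range(pop_size)]
    return population


rng = random.Random(2)
final = evolve(generations=100, pop_size=40, noise=0.1, target=3.0, rng=rng)
mean_trait = sum(final) / len(final)
```

With the noise set to zero the population can never leave its starting value, and without the survivor list there is nothing that stabilizes a direction; in this toy setting both memories plus the randomness operator are needed for any change to accumulate.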

Already on the level of genomes we find two levels of memory. In bacteria, we find the plasmid. In eukaryotes (higher organisms), we find diploidy, with the interesting case of reweighing diploidy and haploidy in moss. Diploidy opens the possibility for a whole new game of (sexual) recombination without endangering the stability of the memory on the lower level. The further progression of memory types then is the population with its network effects, social systems for stabilizing amorphous network effects, and finally individual cognitive capabilities. Each of those additional levels is itself not simple, but already a multi-layered instance of memory, with a sheer explosion of levels for cognition. Humans even learnt to externalize cognition into culture, beginning with language, simple drawings, writing, cities, books, and finally digital engines. These few and indeed coarse examples may already be sufficient to indicate the possibilities of the reformulated theory of evolution by means of probabilistic memory.

Lemma 4: Finally we can define fitness. First we set a term for the risk of going extinct. A quasi-species Q disappears if the probability of finding it in a distribution approaches zero, and the closer that probability comes to zero, the higher that risk is. If that quasi-species “is able” to deal with that risk, this mapping should be zero. The fitness of the quasi-species as an informational setup can then be expressed again as a probability, namely that this capability to deal with the risk is >0. Taken all together we get:

This formulation of fitness has several advantages as compared to the classical version that is in use in biology. First, any threat of circularity is removed. It has often been said that survival of the fittest simply means “survival of the survivor.” Although this accusation does not respect the aspect of mutuality in evolutionary processes, it nevertheless poses a difficulty for traditional evolutionary theories. Secondly, the measurement of fitness itself becomes clearer. We don’t have to count the number of F2 offspring, nor do we have to rely on any arbitrary measure of this kind at all. Fertility, and even differential fertility, is rarely linked to the probability of going extinct. Often we can measure just that a population is shrinking, for instance due to climate change: it may prove absolutely impossible to find or even quantify something like a differential fertility, especially by resorting to fertility towards F2. Which “species” should be compared for that? Impossible knowledge about the future would be necessary for that, creating a circularity in reasoning. Thirdly, we can set up stochastic simulations of populations much more easily and also on a more abstract level, which helps to strengthen the results of proposals about fitness. The traditional setup of fitness is only feasible for direct competition. Yet, even Darwin himself expressed doubts about such an assumption.
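A stochastic simulation of the kind alluded to might look like the following sketch. It assumes a simple birth-death process with a population cap and estimates the probability that a quasi-species is still present after a fixed number of steps; reading this persistence probability as “fitness” is our illustrative interpretation of Lemma 4, and the function names and all rates are hypothetical.

```python
import random


def goes_extinct(start, birth, death, steps, cap, rng):
    # One run of a toy birth-death process; True if the population hits zero.
    n = start
    for _ in range(steps):
        births = sum(1 for _ in range(n) if rng.random() < birth)
        deaths = sum(1 for _ in range(n) if rng.random() < death)
        n = min(cap, max(0, n + births - deaths))
        if n == 0:
            return True
    return False


def persistence_probability(birth, death, runs, rng, start=10, steps=60, cap=100):
    # Fraction of runs in which the quasi-species is still found at the end.
    extinct = sum(goes_extinct(start, birth, death, steps, cap, rng)
                  for _ in range(runs))
    return 1.0 - extinct / runs


rng = random.Random(3)
robust = persistence_probability(birth=0.12, death=0.08, runs=200, rng=rng)
fragile = persistence_probability(birth=0.08, death=0.12, runs=200, rng=rng)
```

Note that nothing in this estimate requires counting offspring in F2 or comparing against a named competitor; only the probability of still finding the quasi-species enters.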

Synopsis and Similar Work

Our first achievement is clearly the replacement of biological terms by abstract concepts. On the achieved level of abstraction, all the mechanisms that can be observed in (or hypothesized about) natural evolution are not really interesting. Of course, “nature” has been quite inventive during the last 3.6 billion years, not only regarding the form of life, but also with regard to the mechanisms of evolution. There are plenty of them. But what is the analogue of diploidy in culture? Or even that of the gene? Or where is the implementation of the genotype-phenotype separation? Claiming the existence of memes [5] as such an analogue does not help much, because the mechanisms are vastly different between society and biological cells. It is one of the lessons that can be learnt in biology to ask about the mechanism [e.g. 6].

In our perspective, it is nonsense to keep biological terms, rewrite them with a bit of math, and then to impose them on explanatory schemes about the evolution of culture. The result is simply nonsense. It is simply a category mistake to claim that cultural evolution is Darwinian, as Richerson and Boyd [7] meant.

The situation resulting from the theoretical gap that we tried to fill here is disorder, at least. In their excellent paper, Lipo and Madsen [8] diagnose rightly that there has been no proper transfer of evolutionary theory from biology to anthropology. Yet, they stick to biological terms, too.

As we already mentioned above, the problem of generalizing natural evolution cannot be solved while keeping the terms of biology. We do not have to describe biological mechanisms using the generalizing language of mathematics; instead we have to generalize the theory itself, that is, we have to find abstractions for its most basic terms.

Precisely this we did. Notably, we need not refer to entities like genes anymore, and we could also remove the notions of selection, variation or heredity. These terms from the theory of natural selection are all special instances of abstract probabilistic memory processes. There is no categorical difference any more between Darwinian, Lamarckian or neutral evolution. If a discourse about evolutionary phenomena mentions those, or if such a discourse invokes notions of phenotype, species, etc., we can be absolutely sure that the respective argument is not general enough for a general theory of evolution.

Darwinian, Lamarckian or neutral evolution differ just in the way the memory processes are instantiated. Of course, it has to be expected that natural selection (we already remarked that before) will “create and establish” a lot of different mechanisms that create better adaptivity of the evolutionary process itself [9]. So, if we are too close to our subject (the evolutionary phenomenon), we just arrive at the opinion that there are these different types of evolution.

We suppose that the formulas above are the first available and proper proposal of a theoretical structure that allows a transfer of evolutionary theory into anthropology (or into the realm of machine-based epistemology) and into the question about cultural evolution, and that satisfies the following conditions:

  • – it is theoretically sound;
  • – it does not reduce culture to biology, even not implicitly;
  • – the core ideas of evolutionary theories remain intact;
  • – the theory can be used to derive proper models (check the chapter about theory of theory!)



The achieved advantages are completely due to the complete move to probability functions. (Here, and about probabilistic modeling in general, I owe much to Albert Tarantola’s work.) In our perspective, even species “are” just distributions, but not representations of a book full of genetic code, presumed to be unique and well-defined. These entities (formerly: “species”) need not even be sharply distinctive! A similar proposal has been made by evolutionary biologist Manfred Eigen in 1977 [10], who coined the term “quasi-species.” For our context, these entities can be anything; the only requirement is that this “anything” could be conceived as a stacked collection of memories (biologically: “organism”) in the sense given above. Note that this conceptualization of evolution allows for Darwinian as well as for Lamarckian processes, for individual-based selection as well as for group selection, for horizontal (by vectors) as well as for vertical transfer of information (through inheritance). Thus we may conclude that it is truly a generalization of the biological notion of evolution, at first. We suggest that it is an appropriate generalization of any evolutionary process.

The proposed memory dynamics (which we could call generalized evolutionary theory) is closely linked to the concepts of complexity and information, and probably not complete without them. Even without them it seems, however, that it should well be possible to build operable models of memory stacks, in other words, to design the capability for evolutionary change a priori. The free parameters for a prospective “memory design” that will exhibit properties of an evolution are the number of memory layers, their mutual dynamics and the inner properties / mechanisms of the memories.

We may conclude that we now have a concept of evolution at our disposal that we can apply to domains outside the natural history of biological species, without committing naive equivalence enforcements. We achieved this by a mapping of the biological version of evolutionary theory onto the framework of probabilistic memories. As you can see, we successfully replaced the notion of “species” by a much more abstract concept. This abstraction is key for the correct, non-metaphorical transfer of the theory into other domains.

From a philosophical perspective the proposed approach bears an interesting point. In medieval philosophy (scholastics), the concept of “species” was not associated with the classification of organisms. Rather, the question was, first, how it could be possible to recognize things as belonging together and, second, how it could be possible to have a unique name for them. The expectation was that the universe (as a work of God) should not be unordered or unstable. Note that we find prototypical identity thinking here. A remnant of that can be traced even nowadays in the “representationalist fallacy,” and also in the more intense forms of realism. Our hope is that the probabilization of species and the structure of evolution helps to overcome those naive stances.

It is finally a complete adoption of the “informational turn” into evolutionary biology. As long as species and evolution as the most central concepts in biology are not defined in terms of probability densities, biology can’t deal with aspects of information in a “natural” manner, thus remaining in the materialist corner of science.

Evolution and Self-Organizing Maps

Interestingly, Edelman, who writes about the brain on the level of neurons, also uses the concept of “maps,” established by groups of neurons. The role of these maps in Edelman’s theory is the coupling from perceptual “input” to the decision towards the action then ultimately taken. Of course, one has to take self-directed restructuring, whether implicit or explicit, of the cognitive apparatus into account, too. Not all actions need to result in motor action. In Edelman’s model, reentrant processes occurring in these maps are responsible for perceptual categorization. We will demonstrate in another chapter how processes around a SOM can be arranged such that they start to categorize and to distill crisp, if not to say “idealistic,” representations from their observational data.

Anyway, we still need to clarify where and how (abstract) evolutionary processes can take place in, by or with a Self-Organizing Map, or a group thereof.

In a collection, if not to say population, of SOMs we may refer to an individual SOM as an individual entity only if it does not share too much actual memory with the other instances. If we instantiated these individual entities as standard SOMs, we would have the situation that our entities comprise only a single type of memory (see Lemma 1). Fortunately, there are some different methods we could apply in order to add further levels of memory to render single entities into “individuals.”

Remember that we will need at least two levels of memory in order to satisfy the minimal condition for evolutionary processes. Evolutionary processes inside the “machine” are in turn mandatory for the capability to find new solutions that are not pre-programmed in any sense beyond a certain abstract materiality (see the chapter about associativity). One of these possibilities is the regulation of the connectivity in the SOM, that is, to allow for a symmetry-break with regard to the spread of information. In the standard SOM this spread is fully symmetric: it follows the pattern of a disc (radial symmetry) and it is the same decay function everywhere in the SOM. Another possibility to introduce a further level of memory is provided by the 2-layer approach as realized in the WebSom. Notably, the WebSom draws heavily on a probabilization of the input. Furthermore, pullulation and some other principles of growth naturally lead to a multi-memory entity, in other words to an “individual.”
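The difference between the fully symmetric spread and a symmetry-broken one can be sketched with two neighbourhood functions. A Gaussian decay is assumed here as the standard kernel; the anisotropic variant with separate radii per axis is only one hypothetical way to break the symmetry and is not a description of any particular SOM implementation.

```python
import math


def radial_influence(dx, dy, radius):
    # Standard case: the influence depends only on the distance to the winner.
    return math.exp(-(dx * dx + dy * dy) / (2.0 * radius * radius))


def anisotropic_influence(dx, dy, radius_x, radius_y):
    # Symmetry-broken case (assumed): a different reach along each grid axis.
    return math.exp(-(dx * dx) / (2.0 * radius_x ** 2)
                    - (dy * dy) / (2.0 * radius_y ** 2))


# Two nodes at the same distance from the winning node, in different directions.
east = radial_influence(2, 0, 1.5)
north = radial_influence(0, 2, 1.5)
east_broken = anisotropic_influence(2, 0, 3.0, 1.0)
north_broken = anisotropic_influence(0, 2, 3.0, 1.0)
```

If the radii themselves were allowed to vary per node and over time, the pattern of connectivity would carry information of its own, which is the sense in which it could serve as a further level of memory.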

The next level of integration is the quasi-species (Lemma 2). The differentiation of the population of SOMs into quasi-species need not be implemented as such if we apply the more advanced growing schemes, i.e. plant-like or animal-like growth. Separate entities will emerge as a consequence of growth. Differentiation will of course not happen if we apply a crystal-like growth scheme. The growth of coordinated swarms, on the other hand, does not provide sufficient stability for learning.

The connectivity between individuals will organize itself appropriately and quasi-species will appear quite naturally as a result of the process itself, if we allow for two principles. First, there should be a signal horizon within the population. Second, SOM individuals should communicate not on the data level, but on the “behavioral” level. Interacting on the “data level” would enforce homogeneity among the SOM entities in many cases.

In order to start the evolutionary process we just need to organize some kind of signal horizon, or, resulting in almost the same effects, some kind of scarcity (time, energy). Any kind of activity, whether on the level of the population or on the level of individual SOMs, then has to be stopped before its analytical end. Given the vast number of possible models for a properties vector of length n>100(0)+, learning and exploring the parameter space will never be complete anyway. The appropriate means to counteract that combinatorial explosion (see the chapter about modeling) is… evolution! On the individual level, animals are able to optimize their search behavior very close to theoretical expectations. The so-called “marginal value theorem” (originally due to Charnov; see Davies [11]) has been proposed to organize (and to predict) the optimal amount of time to search for something under conditions of ignorance.
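The stopping rule suggested by the marginal value theorem can be illustrated by a small numerical sketch. We assume a gain curve with diminishing returns and a fixed “travel time” that has to be spent before the next search can start. The concrete gain function and all parameter values are hypothetical and chosen for illustration only.

```python
import math

def gain(t, g_max=10.0, rate=0.5):
    """Cumulative gain after searching for time t (diminishing returns, assumed form)."""
    return g_max * (1.0 - math.exp(-rate * t))

def marginal_gain(t, g_max=10.0, rate=0.5):
    """Derivative of gain(): the instantaneous yield at time t."""
    return g_max * rate * math.exp(-rate * t)

def optimal_stop(travel_time, horizon=50.0, step=0.001):
    """Grid search for the residence time that maximizes gain per total time."""
    best_t, best_rate = 0.0, 0.0
    for i in range(1, int(horizon / step) + 1):
        t = i * step
        overall_rate = gain(t) / (travel_time + t)
        if overall_rate > best_rate:
            best_t, best_rate = t, overall_rate
    return best_t, best_rate

t_near, rate_near = optimal_stop(travel_time=1.0)
t_far, rate_far = optimal_stop(travel_time=5.0)
print(f"cheap switching: stop after {t_near:.2f}; costly switching: stop after {t_far:.2f}")
```

At the optimum the marginal gain equals the average gain rate over the whole cycle, and the search is abandoned long before the gain curve saturates. The more expensive it is to switch to a new search, the longer it pays to stay.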

Finally, Lemma 4 about fitness still awaits discussion. Yet, for now we are not interested in measuring the fitness of classes of SOMs in our SOM world, though this could change, of course.

(The thoughts about formalization of evolutionary theory have been partially published in another context in [12].)

  • [1] Gerald M. Edelman (1993), Neural Darwinism: Selection and Reentrant Signaling in Higher Brain Function. Neuron, 10(2): 115-125.
  • [2] Gerald M. Edelman, Group selection and phasic re-entrant signalling: a theory of higher brain function. In: G. M. Edelman and V. B. Mountcastle (eds.), The Mindful Brain. MIT Press, 1978.
  • [3] Gerald M. Edelman, Neural Darwinism: the theory of neuronal group selection. Basic Books, 1987.
  • [4] Richard Lewontin, The Genetic Basis of Evolutionary Change. Columbia University Press, New York 1974
  • [5] Susan Blackmore
  • [6] Paul Patrick Bateson (ed.), The Development and integration of behaviour. 1991
  • [7] Peter J. Richerson, Robert Boyd, Evolution: The Darwinian Theory of Social Change, in: Schelkle W., Krauth W.H., Kohli M., Ewarts G. (eds.), Paradigms of Social Change: Modernization, Development, Transformation, Evolution, Campus Verlag, Frankfurt 2000.
  • [8] Carl P. Lipo, Mark E. Madsen (1999), The Evolutionary Biology of Ourselves: Unit Requirements and Organizational Change in United States History. arXiv:adap-org/9901001v1
  • [9] Wolfgang Wagner, Evolution der Evolutionsfähigkeit. In: Dress, A., Hendrichs, H., G. Küppers: Selbstorganisation. Die Entstehung von Ordnung in Natur und Gesellschaft. München 1986.
  • [10] Manfred Eigen, …
  • [11] Davies, the marginal value theorem, in Krebs Davies. Ecology.
  • [12] Klaus Wassermann (2011), Sema Città: Deriving Elements for an applicable City Theory. in: Conf.Proc. 29th eCAADe, RESPECTING FRAGILE PLACES, University of Ljubljana (Slovenia), 21-24 September 2011, pp.134-142.


How to Grow Associative Maps?

October 25, 2011 § Leave a comment

It is probably only partially correct to claim that the borders of the world are constituted by the borders of language. Somehow it seems more appropriate to think that the borders of the world are given by the borders of the gardens in which the associative maps are growing. Well, I admit, we then would have to discuss what came first, those gardens or language. Leaving this aside for a moment, the most essential problem can be put forward in a simple manner:

How to run the garden of associativity?

Before we start I want to briefly recapitulate the chapter about growth. There we investigated what we called “abstract growth.” We related it to a general notion of differentiation, described and operationalized by the concept of “signal strength length.” Based on that, we explored the possibility of software systems that grow by themselves, i.e. without external programmers. Here, we will take the next step by explicating the candidate structure for such a system. Besides the abstract aspect of growth and differentiation, we also have to keep in mind the medial, or if you like, the communicological aspects of the regulation of the actual growth, which needs to be instantiated into a technical representation without losing too much of the architectural structure.

Unfortunately we can not follow the poetic implications of our introduction. Instead, we will try to identify all possible ways of how our candidate structure can grow. The issues raised by the concept of associativity will be described in a dedicated chapter.

The structure that we have to choose as the basic “segmental” unit needs to be an associative structure by itself. For that, any machine learning algorithm would do. Yet, it is not reasonable to take a closed algorithmic procedure for that job, since this would constrain the future functional role of the unit (which is necessarily unknown at implementation time) too much. Moreover, the unit should provide robustness and flexibility. For many reasons, the Self-Organizing Map (SOM) is the best choice for the basic unit. A particularly salient property of the SOM is the fact that it is a network which can change its (abstract) symmetry parameters, something which Artificial Neural Networks can’t achieve as easily. This means that different types of symmetry breaks can be triggered in a SOM, and in turn, that the topology of the connectivity may change even locally. In this way, a single MAP may separate into two (or several). The attractive property of such separation processes in SOM is that any degree of connectivity between the parts can establish itself in a self-organized manner. In other words, in the SOM the units may develop any kind of division of “labor” (storage, transmission, control), and any unit can split off and develop into a fully-fledged further MAP.

A probabilistic manifold of networks can grow and differentiate in several different ways; any of the growth patterns we described elsewhere (see chapter about growth) may apply:

  • – growth by accretion on the level of atoms, completely independent of context;
  • – ordered growth due to needs, i.e. controlled by inner mechanisms, mainly at the “tips” of an arrangement, or similar to that, metameric growth as in the case of worms;
  • – differentiation by melting, folding and pullulation inside a particular map.

Maps that have been split off from existing ones may lose all direct links to their “mother map” after a certain period of time. They would then receive messages through the common and anonymous messaging mechanism. In this way a population of SOMaps will be created. Yet, the entities of this population may develop second-order ties, or, to use a graph-theoretic term, cliques, depending just on the stream of data flowing in and being processed by the population.

The additional SOMs, whether created through separation or “de novo” by duplication, need not all work on the same level of integration. If, by default and for any SOM in the population, the data from external sources are enriched by variables about the SOM itself, a high-level division of labor will emerge spontaneously as soon as the whole system is put under time or energy constraints.

It is pretty clear that this garden of associativity has to run in a fully autonomous manner. There is no place for a gardener in this architecture. Hence, growth must be regulated from within. This can be effectively achieved by two mechanisms: reinforcement based on usage, and even simple evolutionary selection processes on the basis of scarcity of time, absolute space, or the analogs to energy or supply of matter.
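The interplay of the two regulating mechanisms can be sketched in a few lines. The following is a deliberately crude illustration under our own assumptions: each map is reduced to a name and a “strength” value, usage adds to the strength, all strengths decay over time, and a fixed capacity stands in for scarce space. The names `MapEntity`, `reinforce` and `select` are hypothetical. In a truly autonomous population the selection step would not be a central procedure but an effect of local competition.

```python
from dataclasses import dataclass

@dataclass
class MapEntity:
    name: str
    strength: float = 1.0

def reinforce(population, used_names, reward=1.0, decay=0.9):
    """Usage-based reinforcement: everything decays, maps that were used gain strength."""
    for entity in population:
        entity.strength *= decay
        if entity.name in used_names:
            entity.strength += reward

def select(population, capacity):
    """Selection under scarcity: only the `capacity` strongest maps survive."""
    ranked = sorted(population, key=lambda entity: entity.strength, reverse=True)
    return ranked[:capacity]

population = [MapEntity("a"), MapEntity("b"), MapEntity("c")]
for _ in range(3):
    reinforce(population, used_names={"a", "c"})
population = select(population, capacity=2)
print([entity.name for entity in population])
```

Maps that are never addressed by the incoming stream of data fade out, while no external instance has to decide which map is “useful.”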

Although such a system might appear distantly similar to ant hive or swarm simulations, with a single, yet complete SOM taking the place of the individual entity, we would like to deny such a relationship.

Of course, the idea of growing SOMs has been around for some time. Examples are [1] or [2]. Yet, these papers or systems conducted neither an analysis of growth processes beforehand, nor of the broader epistemological context, probably because they have been created by software engineers; hence these approaches remain rather limited, although they point in the right direction.

  • [1]
  • [2]


Machine-based Episteme/Epistemology

October 24, 2011 § Leave a comment

It is pretty clear that merely thinking of machines that are able to understand immediately triggers epistemological issues. If such a machine were able to build previously non-existent representations of the outer world in a non-deterministic manner, a wide range of epistemological implications would be invoked for that machine-being, and these epistemological implications are largely the same as for humans or other cognitively well-developed organic life.

Epistemology investigates the conditions for “knowledge” and the philosophical consequences of knowing. “Knowledge” is notoriously difficult to define, and there are many misunderstandings around, including the “soft stone” of so-called “tacit knowledge”; yet for us it simply denotes a bundle consisting of

  • – a dynamic memory
  • – capacity for associative modeling, i.e. adaptively deriving rules about the world
  • – ability to act upon achieved models and memory
  • – self-oriented activity regarding the available knowledge
  • – already present information is used to extend the capabilities above

Note that we do not demand communication about knowledge. For several reasons, and based on Wittgenstein’s theory of meaning, we think that knowledge can not be transmitted or communicated. Recently, Linda Zagzebski [1] arrived at the same result, starting from a different perspective. She writes that “[…] knowledge is not an output of an agent, instead it is a feature of the agent.” In agreement with at least some reasonably justified philosophical positions we thus propose that it is also reasonable to conceive of such a machine as mentioned before as being capable of knowledge. Accordingly, it is appropriate to assign the capability for knowing to the machine. The knowledge that is comprised or constituted by the machine is not accessible to us as “creators” of the machine, for the very same reason of the difference of the Lebenswelten.

Yet, the knowledge acquired by the machine is also not “directly” accessible for the machine itself. In contrast to rationalist positions, knowledge can’t be separated from the whole of a cognitive entity. The only thing that is possible is to translate it into publicly available media like language, to negotiate a common usage of words and their associated link structures, and to debate about the mutually private experiences.

Resorting to the software running on the machine and checking the content of the machine will not be possible either. Software that enables knowing can’t be decomposed in order to serve as an explanation of that knowledge. The only thing one will find is a distant analog to our neurons. Reductionism will work for the machine as little as it works for the human mind.

Yet, such machine-knowledge is not comparable to human knowledge. The reason for that is not an issue of type, or extent. The reason is given by the fact that the Lebenswelt of the machine, that is the totality of all relations to the outer world and of all transformations of the perceiving and acting entity, the machine, would be completely different from ours. It will not make any sense to try to simulate any kind of human-like knowledge in that machine. It always will be drastically different.

The only possibility to speak about the knowing and the knowledge of the machine is through epistemological concepts. For us it doesn’t seem promising to engage in fields like “Cognitive Informatics,” since informatics (computer science) can not deal with cognition for rather fundamental reasons: Cognition is not Turing-computable.

The bridging bracket between the brains and minds of machine and human being is the theory of knowing. Consequently, we have to apply epistemology to deal with machines that possibly know. The conditions for that knowledge could turn out to be strange; moreover, we should try to develop the theory of machine-based knowledge from the perspective of the machine. It is important to understand that attempts like the Turing-Test [2] are inappropriate, for several reasons: (i) they follow the behavioristic paradigm, (ii) they do not offer the possibility to derive scales for comparison, (iii) no fruitful questions can be derived from them.

Additionally, there are some arguments pointing to the implicit instantiation of a theory as soon as something is going to be modeled. In other words, a machine which is able to know already has a—probably implicit—theory about it, and this also means about itself. That theory would originate in the machine (despite the fact that it can’t be a private theory). Hence, we open a branch and call it machine-based epistemology.

Some Historical Traces of ‘Contacts’

(between two strange disciplines)

Regarding the research about and the construction of “intelligent” machines, the relevance of thinking in epistemological terms was recognized quite early. In 1963, A. Wallace published a paper entitled “Epistemological Foundations of Machine Intelligence” [3], which quite unfortunately is not available except for its already remarkable abstract:

Abstract : A conceptual formulation of the Epistemological Foundations of Machine Intelligence is presented which is synthesized from the principles of physical and biological interaction theory on the one hand and the principles of mathematical group theory on the other. This synthesis, representing a fusion of classical ontology and epistemology, is generally called Scientific Epistemology to distinguish it from Classical General Systems theory. The resulting view of knowledge and intelligence is therefore hierarchical, evolutionary, ecological, and structural in character, and consequently exhibits substantial agreement with the latest developments in quantum physics, foundations of mathematics, general systems theory, bio-ecology, psychology, and bionics. The conceptual formulation is implemented by means of a nested sequence of structural Epistemological-Ontological Diagrams which approximate a strong global interaction description. The mathematico-physical structure is generalized from principles of duality and impotence, and the techniques of Lie Algebra and Lie Continuous Group theory.

As far as it is possible to get an impression of the actual but lost full paper, Wallace’s approach is formal and mathematical. Biological interaction theory at that time was a fork from mathematical information theory, at least in the U.S., where this paper originates. Another small weakness could be indicated by the notion of “hierarchical knowledge and intelligence,” pointing to some remnant of positivism. Anyway, the proposed approach was unfortunately never followed up. Yet we will see in our considerations about modeling that the reference to structures like Lie group theory could not have worked out in a satisfying manner.

Another early instance of bringing epistemology into the research about “artificial intelligence” is McCarthy [4,5], who coined the term “Artificial Intelligence.” Yet, his perspective appears far too limited. First, he starts with the reduction of epistemology to first-order logic:

“We have found first order logic to provide suitable languages for expressing facts about the world for epistemological research.” […]

Philosophers emphasize what is potentially knowable with maximal opportunities to observe and compute, whereas AI must take into account what is knowable with available observational and computational facilities.

Astonishingly, he does not mention any philosophical argument in the rest of the paper except in the last paragraph:

“More generally, we can imagine a metaphilosophy that has the same relation to philosophy that metamathematics has to mathematics. Metaphilosophy would study mathematical systems consisting of an “epistemologist” seeking knowledge in accordance with the epistemology to be tested and interacting with a “world”. […] AI could benefit from building some very simple systems of this kind, and so might philosophy.”

McCarthy’s stance towards philosophy is typical for the whole field. Besides the presumptuous suggestion of a “metaphilosophy” and rather nonchalantly subsuming it to mathematics, he misses the point of epistemology, even as he refers to the machine as an “observer”: A theory of knowledge is about the conditions of the possibility for knowledge. McCarthy does not care about the implications of his moves for that possibility, or vice versa.

Important progress on the issue of the state of machines was contributed not by the machine technologists themselves, but by philosophers, namely Putnam, Fodor, Searle and Dennett in the English-speaking world, and also by French philosophers like Serres (in his “Hermes” series) and Guattari. The German systems theorists like von Foerster and Luhmann and their fellows never went beyond cybernetics, so we can omit them here. In 1998, Wellner [6] provided a proposal for epistemology in the field of “Artificial Life” (what a terrible wording…). Yet, his attempt to contribute to the epistemological discussion turns out to be inspired by Luhmann’s perspective, and the “first step” he proposes is simply to equip robots with sensors, i.e. in the end it is not really a valuable attempt to deal with epistemology in affairs of epistemic machines.

In 1978, Daniel Dennett [7] reframed the so-called “Frame Problem” of AI, of which McCarthy and Hayes [4] had already become aware about ten years earlier. Dennett asks how

“a cognitive creature … with many beliefs about the world” can update those beliefs when it performs an act so that they remain “roughly faithful to the world”? (cited acc. to [8])

Recently, Dreyfus [9] and Wheeler [10] (who disagrees with Dreyfus about the reasoning behind it) called the Frame Problem an illusory pseudo-problem, created by the adherence to Cartesian assumptions. Wheeler described it as:

“The frame problem is the difficulty of explaining how non-magical systems think and act in ways that are adaptively sensitive to context-dependent relevance.”

Wheeler as well as Dreyfus recognize the basic problem(s) in the architecture of mainstream AI, and they identify Cartesianism as the underlying principle of these difficulties, i.e. the claim of analyticity, reducibility and identifiability. Yet, neither of the two has so far proposed a stable solution. Heideggerian philosophy with its situationistic appeal does not help to clarify the epistemological affairs, neither of machines nor of humans.

Our suggestion is the following: Firstly, a general solution should be found for how to conceive the (semi-)empirical relationship between beings that have some kind of empirical coating. Secondly, this general solution should serve as a basis to investigate the differences, if there are any, between machines and humans, regarding their epistemological affairs with the “external” world. This endeavor we label “machine-based epistemology.”

Machine-based Epistemology

If a machine, or better, a synthetic body that was established as a machine in the moment of its instantiation, were able to act freely, it would face the same epistemological problems as we humans, starting with basic sensory perception and not ending with linking a multi-modal integration of sensory input to adequate actions. Therefore machine-based epistemology (MBE) is the appropriate label for the research program that is dedicated to learning processes implemented on machines. We avoid invoking the concept of agents here, since this already brings in a lot of assumptions.

Note that MBE should not be confused with so-called “Computer Epistemology”, which is concerned just with the design of so-called man-machine-interfaces [11]. We are not concerned with epistemological issues arising through the usage of computers, of course.

It is clear that the term machine learning is missing the point; it is a purely technical term. Machine learning is about algorithms and programmable procedures, not about the reflection of the conditions of those. Thus, it does not recognize the context into which learning machines are embedded, and in turn it also misses the consequences. In some way machine learning is not about learning about machines. It remains a pure engineering discipline.

As a consequence, one can find a lot of nonsense in the field of machine learning, especially concerning so-called ontologies and meta-data, but also about the topic of “learning” itself. There is the nonsensical term of “reinforcement learning”… which kind of learning could not be about (differential) reinforcement?

The other label Machine-based Epistemology is competing with is “Artificial Intelligence.” Check out the editorial text “Where is the Limit” for arguments against the label “AI.” The conclusion was that AI is too close to cybernetics and mathematical information theory, that it is infected by romanticism and it is difficult to operationalize, that it does not appropriately account for cultural effects onto the “learning subject.” Since AI is not connected natively to philosophy, there is no adequate treatment of language: AI never took the “Linguistic Turn.” Instead, the so-called philosophy of AI poses silly questions about “mental states.”

MBE is concerned with the theory of machines that possibly start to develop autonomous cognitive activity; you may call this “thinking.” You also may conceive it as a part of a “philosophy of mind.” Both notions, thinking and mind, may work in the pragmatics of everyday social situations; for a stricter investigation I think they are counter-productive: We should pay attention to language in order not to get vexed by it. If there is no “philosophy of unicorns,” then probably there also should not be a “philosophy of mind.” Both labels, thinking and mind, pretend to define a real and identifiable entity, although exactly this should be one of the targets for a clarification. Those labels can easily cause the misunderstanding of separable subjects. Instead, we could call it “philosophy of generalized mindfulness”, in order to avoid anthropomorphic chauvinism.

As a theory, MBE is not driven by engineering, as is the case for AI; just the other way round, MBE itself is driving engineering. It somehow brings philosophical epistemology into the domain of engineering computer systems that are able to learn. As such it is natively linked, in an already well-established manner, to other fields in philosophy. This, finally, helps to avoid posing silly questions or following silly routes.

  • [1] Linda Zagzebski, contribution to: Jonathan Dancy, Ernest Sosa, Matthias Steup (eds.), “A Companion to Epistemology”, Vol. 4, pp.210; here p.212.
  • [2] Alan Turing (1950), Computing machinery and intelligence. Mind, 59(236): 433-460.
  • [3] Wallace, A. (1963), EPISTEMOLOGICAL FOUNDATIONS OF MACHINE INTELLIGENCE. Information for the defense Community (U.S.A.), Accession Number : AD0681147
  • [4] McCarthy, J. and Hayes, P.J. (1969) Some Philosophical Problems from the Standpoint of Artificial Intelligence. Machine Intelligence 4, pp.463-502 (eds Meltzer, B. and Michie, D.). Edinburgh University Press.
  • [5] McCarthy, J. 1977. Epistemological problems of artificial intelligence. In IJCAI, 1038-1044.
  • [6] Jörg Wellner 1998, Machine Epistemology for Artificial Life In: “Third German Workshop on Artificial Life”, edited by C. Wilke, S. Altmeyer, and T. Martinetz, pp. 225-238, Verlag Harri Deutsch.
  • [7] Dennett, D. (1978), Brainstorms, MIT Press, p.128.
  • [8] Murray Shanahan (2004, rev.2009), The Frame Problem, Stanford Encyclopedia of Philosophy, available online.
  • [9] H.L. Dreyfus, (2008), “Why Heideggerian AI Failed and How Fixing It Would Require Making It More Heideggerian”, in The Mechanical Mind in History, eds. P.Husbands, O.Holland & M.Wheeler, MIT Press, pp. 331–371.
  • [10] Michael Wheeler (2008), Cognition in Context: Phenomenology, Situated Robotics and the Frame Problem. Int.J.Phil.Stud. 16(3), 323-349.
  • [11] Tibor Vamos, Computer Epistemology: A Treatise in the Feasibility of the Unfeasible or Old Ideas Brewed New. World Scientific Pub, 1991.



October 24, 2011 § Leave a comment

Representation always has been some kind of magic.

Something could have been there—including all its associated power—without being physically there. Magic, indeed, and involving much more than that.

Literally, if we take the early Latin roots as a measure, it means to present something again, to place something again or in an emphasized style before something else or somebody, usually by means of a placeholder, the so-called representative. Not surprisingly, it is closely related to simulacrum, which stands for “likeness, image, form, representation, portrait.”

Bringing the notion of the simulacrum onto the table is dangerous, since it refers not only to one of the oldest philosophical debates, but also to a central one: What do we see by looking at the world? How can it be that we trust the images produced by our senses, imaginations, apprehensions? Consider only Plato’s famous answer, which we will not even cite here due to its distracting characteristics, and you can feel the philosophical vortices, if not twisters, caused by the philosophical image theory.

It is impossible to deal here with the issues raised by the concepts of representation and simulacrum in any more general sense; we have to focus on our main subject, the possibility of machine-based epistemology and its conditions.

The idea behind machine-based epistemology is to provide a framework for talking about the power of (abstract and concrete) machines to know and to know about the conditions of that (see the respective chapter for more details). Though by “machine” we do not understand a living being here, at least not a priori, it is something produced. Let us call the producer in a simplified manner a “programmer.” In stark contrast to that, the morphological principles of living organisms are the result of a really long and contingent history of an unimaginable 3.6 billion years. Many properties, as well as their generalizations, are historical necessities, and all properties of all living beings constitute a miraculous co-evolutionary fabric of dynamic relations. In the case of the machine, there are only few historical necessities, for better or worse. The programmer has to define necessities, the modality of senses, the chain of classifications, the kind of materiality, etc. Among all these decisions there is one class that is predominantly important:

How to represent external entities?

Quite naturally, as “engineers” of cognitive machines we can not really evade the old debate about what is in our brains and minds, and what’s going on there while we are thinking, or even just recognizing a triangle as a triangle. Our programmer could take a practical stance to this question and reformulate it as: How could she or he achieve that the program will recognize any triangle?

It needs to be able to distinguish it from any other figure, even if the program has never been confronted with an “ideal” template or prototype. It also needs to identify quite incorrect triangles, e.g. from hand drawings, as triangles. It even should be able to identify virtual figures, which exist only in their negativity, like the Kanizsa-triangle. For years, computer scientists proposed logical propositions and shape grammars as a solution, and failed completely. Today, machine learning in all its facets is popular, of course. This choice alone, however, is not yet the solution.

The new questions then have been (and still are): What to present to the learning procedure? How to organize the learning procedures?

Here we have to take care of a threatening misunderstanding, actually of two misunderstandings, approaching the concept of “data” from opposite directions. Data are of course not “just there.” One needs a measurement device, which in turn is based on a theory, and then on a particular way to derive models and devices from that theory. In other words, data are dependent on the culture. So far, we agree with Putnam about that. Nevertheless, given the body of a cognitive entity, that entity, whether human, animal or machine, finds itself “gestellt” into a particular actuality of measurement in any single situation. The theory about the data is a priori, yet within the particular situation the entity finds “raw data.” Both theory and data impose severe constraints on what can be perceived by or even known to the cognitive entity. Given the data, the cognitive entity will try to construct diagnostic / predictive models, including schemes of interpretations, theories, etc. The important question then concerns the relationship between the a priori conditions regarding the cognitive entity and the possibly derived knowledge.

On the other hand, we have to defend ourselves against the second misunderstanding. Data may be conceived as (situational) “givens”, as the Latin root of the word suggests. Yet, this givenness is not absolute. Somewhat more appropriately, we may conceive data as intermediate results of transformations. This renders any given method into some kind of abstract measurement device. The label of “data” we usually just use for those bits whose conditions of generation we can not influence.

Consider for instance a text. For the computer a text is just a non-random series of graphemes. We as humans can identify a grammar in human languages. For many years, if not decades, people thought that computers would understand language as soon as grammar had been implemented. The research by Chomsky [1], Jackendoff [2] and Pinker [3], among others, is widely recognized today, resulting in the concepts of phrase structure grammar, x-bar syntax or head-driven syntax. Yet, large research projects with hundreds of researchers (e.g. “verbmobil”) not only did not reach their self-chosen goals, they failed completely on the path to implementing an understanding of language. Even today, for most languages there is no useful parser available; the best parser for the German language achieves around 85-89% accuracy, which is disastrous for real applications.

Another approach is to bring in probabilistic theories. Particularly n-grams and Markov-models have been favored. While the first one is an incredibly stupid idea for the representation of a text, Markov-models are more successful. It can be shown that they are closely related to Bayes belief networks and thus also to artificial neural networks, though the latter employ completely different mechanisms as compared to Markov-models. Yet, from the very mechanism and the representation that is created as/by the Markov-model, it is more than obvious that there is no such thing as language understanding.

Quite obviously, language as text can not be represented as a grammar plus a dictionary of words. Doing so, one would be struck by the “representational fallacy,” which not only has been criticized by Dreyfus recently [4]; it is a matter of fact that representationalist approaches in machine learning failed completely. Representational cognitivism claims that we have distinct image-like engrams in our brain when we are experiencing what we call thinking. Its proponents should have read Wittgenstein first (e.g. On Certainty), before starting expensive research programs. That experience about one’s own basic mental affairs is as little directly accessible as any other thing we think or talk of. A major summary of many objections against the representationalist stance in theories about the mind, as well as a substantial contribution, is Rosenfield’s “The Invention of Memory” [6]. Rosenfield argues strongly against the concept of “memory as storage,” in the same vein as Edelman, a position with which we fully agree.

It does not help much either to resort to “simple” mathematical or statistical models, i.e. models effectively based on an analytical function, as opposed to models based on a complex system. Conceiving language as a mere “random process” of whatsoever kind simply does not work, be it those silly n-grams or sophisticated Hidden Markov Models. There are open source packages on the web you can use to try it yourself.
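To make the point about text as a “random process” tangible, here is a minimal sketch of a bigram (first-order Markov) text model in Python. It is not taken from any of the packages alluded to above; the toy corpus and the function names build_bigram_model and generate are assumptions made purely for illustration. The model stores nothing but which word followed which, and generation is a random walk over that table.

```python
import random
from collections import defaultdict

def build_bigram_model(text):
    """Map each word to the list of words observed directly after it."""
    words = text.split()
    model = defaultdict(list)
    for current, following in zip(words, words[1:]):
        model[current].append(following)
    return model

def generate(model, start, length, seed=0):
    """Random walk over the bigram table; stops early at a dead end."""
    rng = random.Random(seed)
    output = [start]
    for _ in range(length - 1):
        candidates = model.get(output[-1])
        if not candidates:
            break  # the last word was never followed by anything
        output.append(rng.choice(candidates))
    return output

# Toy corpus, chosen only for illustration.
corpus = "the dog chased the cat and the cat chased the mouse"
model = build_bigram_model(corpus)
sample = generate(model, "the", 8)
print(" ".join(sample))
```

Every adjacent pair in the output has been seen in the corpus, so the result looks locally plausible. Yet the table contains nothing beyond adjacency counts, which is the sense in which such a model can not be said to understand anything.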

But what then “is” a text, how does a text unfold its effects? Which aspects should be presented to the learning procedure, the “pattern detection engine,” such that the regularities could be appropriately extracted and a re-presentation could be built? Taking semiotics into account, we may add links between words. Yet, this involves semantics. Peter Janich has been arguing convincingly that the separation of syntax and semantics should be conceived of as just another positivist/cyberneticist myth [5]. And on which “level” should links be regarded as significant signals? If there are such links, any text renders immediately into a high-dimensional non-trivial and above all dynamic network…

An interesting idea has been proposed by the research group around Teuvo Kohonen. They invented a procedure they call the WebSom [7]. You can find material on the web about it; in addition, we will discuss it in great detail within our sections devoted to the SOM. There are two key elements of this approach:

  • (1) It is a procedure which inherently abstracts from the text.
  • (2) the text is not conceived—and (re-)presented—as “words”, i.e. distinct lexicographical primitives; instead words are mapped into the learning procedure as a weighted probabilistic function of their neighborhood.
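The second element can be sketched in a few lines of Python. Note that this is not the WebSom procedure itself, only a hedged illustration of the underlying idea: each word is described by a normalized, distance-weighted distribution over the words in its neighborhood. The window size, the 1/distance weighting and the names neighborhood_profiles and overlap are assumptions of this sketch.

```python
from collections import Counter, defaultdict

def neighborhood_profiles(tokens, window=2):
    """For each word, accumulate distance-weighted counts of nearby words
    and normalize them into a probability distribution."""
    counts = defaultdict(Counter)
    for i, word in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                # Assumed weighting: closer neighbors count more.
                counts[word][tokens[j]] += 1.0 / abs(i - j)
    profiles = {}
    for word, ctr in counts.items():
        total = sum(ctr.values())
        profiles[word] = {nb: w / total for nb, w in ctr.items()}
    return profiles

def overlap(p, q):
    """Shared probability mass of two profiles (1.0 means identical)."""
    return sum(min(p.get(k, 0.0), q.get(k, 0.0)) for k in set(p) | set(q))

tokens = "the dog chased the cat and the cat chased the mouse".split()
profiles = neighborhood_profiles(tokens)
print(overlap(profiles["dog"], profiles["mouse"]))
```

In such a presentation a word is no longer a crisp lexical primitive. Two different words may share a large part of their neighborhood distributions, and it is this overlap that a learning procedure gets to see.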

Particularly seminal is the second of the key properties, the probabilization into overlapping neighborhoods. While we usually think that words are crisp entities arranged into a structured series, where the structure follows a grammar, or is identical with it, this is not necessarily appropriate, not even for our own brain. The “atom” of human language is most likely not the word. Until today, most (if not all) people engaged in computational linguistics think that the word, or some very close abstraction of it, plus some accidentia, forms the basic entity, the indivisible of language.

We propose that this attitude is utterly infected by some sort of pre-socratic and romantic cosmology, geometry and cybernetics. We can not even know which representation is the “best”, or even an appropriate one. Even worse, the appropriateness of the presentation of raw data to the learning procedure via various pre-processors and preparation of raw data (series of words) is not independent of the learning procedure. We see that the problems with presentation and representation reach far into the field of modeling.

Although we can’t know in principle how to perform measurements in the most appropriate manner, as a matter of fact we will perform some form of measurement. Yet, this initial “raw data” does not “represent” anything, not even the entity being subject of the measurement. Only a predictive model derived from those observations can represent an entity, and it does so only in a given context largely determined by some purpose.

Whatsoever such an initial and multiple presentation of an entity will look like, it is crucial, in my opinion, to use a probabilized preparation of the basic input data. Yet, components of such preparations not only comprise the raw input data, but also the experience of the whole engine, i.e. a kind of semantic influence, acquired by learning. Further (potential) components of a particular small section of a text, say a few words, are any kind of property of the embedding text, of any extent: not only words as lexemes, but also words as learned entities, as structural elements, then also sentences and their structural (syntactical) properties, semantic or speech-pragmatic markers, etc., and of course also a list of properties as Putnam proposed already in 1979 in “The Meaning of ‘Meaning’” [8].

Taken together we can state that the input to the association engine consists of probability distributions over arbitrarily chosen “basic” properties. As we will see in the chapter on modeling, these properties are not to be confused with objective facts to be found in the external world. There we also will see how we can operationalize these insights into implementation. In order to enable a machine to learn how to use words as items of a language, we should not present words in their propositional form to it. Any entity has to be measured as an entity from a random distribution and represented as a multi-dimensional probability distribution. In other words, we deny the possibility to transmit any particular representation into the machine (or another mind as well). A particular manifold of representations has to be built up by the cognitive entity itself in direct response to requirements of the environment, which is just to be conceived as the embedding for “situations.” In the modeling chapter we will provide arguments for the view that this linkage to requirements does not result in behavioristic associativism, the simple linkage between stimulus and response according to the framework proposed by Watson and Pavlov. Target-oriented modeling in the multi-dimensional case necessarily leads to a manifold of representations. Not only the input is appropriately described by probability distributions, but also the output of learning.

And where is the representation of the learned subject? What does it look like? This question is almost sense-free, since it would require us to separate input, output, processing, etc.; it would deny the inherent manifoldness of modeling. In short, it is a deeply reductionist question. The learning entity is able to behave, react, anticipate, and to measure, hence just the whole entity is the representation.

The second important anatomical property of an entity able to acquire the capability to understand texts is the inherent abstraction. Above all, we should definitely not follow the flat world approach of the positivist ideology. Note that the programmer should not only not build a dictionary into the machine; he also should not pre-determine the kind of abstraction the engine develops. This necessarily involves internal differentiation, which is another word for growth.

  • [1] Noam Chomsky (to be completed…)
  • [2] Jackendoff
  • [3] Steven Pinker 1994?
  • [4] Hubert L Dreyfus, How Representational Cognitivism Failed and is being replaced by Body/World Coupling. p.39-74, in: Karl Leidlmair (ed.), After Cognitivism: A Reassessment of Cognitive Science and Philosophy, Springer, 2009.
  • [5] Peter Janich. 2005.
  • [6] Israel Rosenfield, The Invention of Memory: A New View of the Brain. New York, 1988.
  • [7] WebSom
  • [8] Hilary Putnam, The Meaning of “Meaning”. 1979.


Mental States

October 23, 2011 § Leave a comment

The issue we are dealing with here is the question whether we are justified to assign “mental states” to other people on the basis of our experience, that is, based on weakly valid predictions and the use of some language upon them.

Hilary Putnam, in an early writing (at least before 1975), used the notion of mental states, and today almost everybody does so. In the following passage he tries to justify the reasonability of the inference of mental states (italics by H.Putnam, colored emphasis by me); I think this passage is not compatible with his results any more in “Representation and Reality”, although most people particularly from computer sciences cite him as a representative of a (rather crude) machine-state functionalism:

“These facts show that our reasons for accepting it that others have mental states are not an ordinary induction, any more than our reasons for accepting it that material objects exist are an ordinary induction. Yet, what can be said in the case of material objects can also be said here: our acceptance of the proposition that others have mental states is both analogous and disanalogous to the acceptance of ordinary empirical theories on the basis of explanatory induction. It is disanalogous insofar as ‘other people have mental states’ is, in the first instance, not an empirical theory at all, but rather a consequence of a host of specific hypothesis, theories, laws, and garden variety empirical statements that we accept. […] It is analogous, however, in that part of the justification for the assertion that other people have mental states is that to give up the proposition would require giving up all of the theories, statements, etc., that we accept implying that proposition; […] But if I say that other people do not have minds, that is if I say that other people do not have mental states, that is if I say that other people are never angry, suspicious, lustful, sad, etc., I am giving up propositions that are implied by the explanations that I give on specific occasions of the behavior of other people. So I would have to give up all of these explanations.”

Suppose we observe someone for a few minutes while he or she is getting increasingly stressed/relaxed, and suddenly the person starts to shout and to cry, or to smile. More professionally, suppose we use a coding system like the one proposed by Ekman and Friesen, the famous “Facial Action Coding System,” recently popularized by the TV series “Lie to Me.” Are we allowed to assign that person a “mental state”?

Of course, we intuitively and instinctively start trying to guess what’s going on with the person, in order to make some prediction or diagnosis (which essentially is the same thing), for instance because we feel inclined to help, to care, to console the person, to flee, or to be chummy with her. Yet, is such a diagnosis, probably taking place in the course of mutual interpretation of almost non-verbal behavior, the same as assigning “mental states”?

We are deeply convinced that the correct answer is ‘NO’.

The answer to this question is somewhat important for an appropriate handling of machines that start to be able to open their own epistemology, which is the correct phrase for the flawed notion of “intelligent” machines. Our answer rests on two different pillars. We invoke complexity theory, and a philosophical argument as well. Complexity theory forbids states for empirical reasons; the philosophical argument forbids its usage regarding the mind due to the fact that empirical observations never can be linked to statefulness, neither by language nor by mathematics. Statefulness is then identified as a concept from the area of (machine) design.

Yet, things are a bit tricky. Hence, we have to extend the analysis a bit. In addition, we have to refer to what we said (or will say) about theory and modeling.

Reductionism, Complexity, and the Mental

Since the concept of “mental state” involves the concept of state, our investigation has to follow two branches. Besides the concept of “state” we have the concept of the “mental,” which still is a very blurry one. The compound concept of “mental state” merely seems not to be blurry, because of the state-part. But what if the assignment of states to the personal inner life of the conscious vis-a-vis is not justified? We think indeed that we are not allowed to assign states to other persons, at least when it comes to philosophy or science about the mind (if you would like to call psychology a ‘science’). In this case, the concept of the mental remains blurry, of course. One could suspect that the saying of “mental state” just arose to create the illusion of a well-defined topic when talking about the mind or mindfulness.

“State” denotes a context of empirical activity. It assumes that there have been preceding measurements yielding a range of different values, which we a posteriori classify and interpret. As a result of these empirical activities we distinguish several levels of rather similar values, give them a label and call them a “state.” This labeling remains always partially arbitrary by principle. Looking backward we can see that the concept of “state” invokes measurability, interpretation and, above all, identifiability. The language game of “state” excludes basic non-identifiability. Though we may speak about a “mixed state,” which still assumes identifiability in principle, there are well-known cases of empirical subjects to which we can not assign any distinct value in principle. Prigogine [2] gave many examples, and even one analytic one, based on number theory. In short, we can take it for granted that complex systems may traverse regions in their parameter space where it is not possible to assign anything identifiable. In some sense, the object does not exist as a particular thing, it just exists as a trajectory, or more precisely, a compound made from history and pure potential. A slightly more graspable example for those regions are the bifurcation “points” (which are not really points for real systems).
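A toy illustration of bifurcations, which is not one of Prigogine's examples: the logistic map x → r·x·(1−x) is a standard textbook system, brought in here as an assumption of this sketch. The code counts how many distinguishable “levels” the long-run orbit keeps visiting for a few values of the parameter r. The function names and the chosen values of r are made up for illustration.

```python
def logistic_orbit(r, x0=0.2, transient=1000, samples=200):
    """Iterate x -> r*x*(1-x), discard the transient, return the tail."""
    x = x0
    for _ in range(transient):
        x = r * x * (1.0 - x)
    tail = []
    for _ in range(samples):
        x = r * x * (1.0 - x)
        tail.append(x)
    return tail

def distinct_levels(r, digits=6):
    """Number of distinguishable values the orbit keeps visiting."""
    return len({round(x, digits) for x in logistic_orbit(r)})

# Illustrative parameter values on both sides of several bifurcations.
for r in (2.8, 3.2, 3.5, 3.9):
    print(r, distinct_levels(r))
```

For small r the orbit settles on a single value that one might happily label as a “state”; past the first bifurcation it alternates between two values; for larger r the set of visited values no longer collapses into a handful of labels. The sketch only shows how the very possibility of labeling depends on where in parameter space the system happens to be.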

A clearly visible experimental example is provided by arrangements like so-called Reaction-Diffusion-Systems [3]. How to describe such a system? An atomic description is not possible, if we try to refer to any kind of rules. The reason is that the description of a point in their parameter space around the indeterminate area of bifurcation is the description of the whole system itself, including its trajectory through phase space. Now, who would deny that the brain, and the mind springing off from it, is something which by far exceeds in complexity those “simple” complex systems which are used as “model systems” in the laboratory, in Petri dishes, or even in computer simulations?

So, we conclude that brains can not “have” states in the analytic sense. But what about meta-stability? After all, it seems that the trajectories of psychological or behavioral parameters are somehow predictable. The point is that the concept of meta-stability does not help very much. That concept directly refers to complexity, and thus it refers to the whole “system,” including a large part of its history. As realists, or scientists believing in empiricism, we would not gain anything. We may summarize that there is no possible reduction of the brain to a perspective that would justify the usage of the notion of “state.”

But what about the mind? Let the brain be chaotic, the mind need not, probably. Nobody knows. Yet, an optimistic reductionist could argue for its possibility. Is it then allowed to assign states to the mind, that is, to separate the brain from the mind with respect to stability and “statefulness”? Firstly, again the reductionist would lose all his points, since in this case the mind and its states would turn into something metaphysical, if not from “another reality.” Secondly, measurability would fall apart, since mind is nothing you could measure as an explanans. It is not possible to split off the mind of a person from that very person, at least not for anybody who would try to justify the assignment of states to minds, brains or “mental matter.” The reason is a logical one: Such an attempt would commit a petitio principii.

Obviously, we have to resort to the perspective of language games. Of course, everything is a language game, we knew that even before refuting the state as an appropriate concept to describe the brain. Yet, we have demonstrated that even an enlightened reductionist, in the best case a contemporary psychologist, or probably also William James, must acknowledge that it is not possible to speak scientifically (or philosophically) about states concerning mental issues. Before starting with the state as a Language Game I would first like to visit the concepts of automata in their relation to language.

Automata, Mechanism, and Language

Automata are positive definite, meaning that they consist of a finite set of well-defined states. At any point in time they are exactly defined, even if the particular automaton is a probabilistic one. Well, complexity theory tells us that this is not possible for real objects. Yet, “we” (i.e. computer hardware engineers) learned to suppress deviations far enough in order to build machines which come close to what is called the “Universal Turing Machine,” i.e. nowadays physical computers. A logical machine, or a “logics machine”, if you like, then is an automaton. Therefore, standard computer programs are perfectly predictable. They can be stopped, hibernated, restarted etc., and weeks later you can proceed at the last point of your work, because the computer did not change a single one of more than 8’000’000’000 dual-valued bits. All of the software running on computers is completely defined at any point in time. Hence, logical machines exist outside of time, at least from their own perspective. Moreover, it is perfectly reasonable to assign them “states,” and the sequence of these states is fully reversible in the sense that either a totality of the state can be stored and mapped onto the machine, or that it can be identically reproduced.
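What “storing the totality of the state and mapping it back onto the machine” means can be shown with a deliberately tiny sketch. The three-state automaton below, its transition table and the names ToyAutomaton and run are invented for illustration only. The whole machine is snapshotted, run further, then restored from the snapshot and run again on the same input.

```python
import copy

class ToyAutomaton:
    """A tiny deterministic automaton: states 0, 1, 2; input symbols 'a', 'b'."""
    TRANSITIONS = {
        (0, "a"): 1, (0, "b"): 0,
        (1, "a"): 2, (1, "b"): 0,
        (2, "a"): 2, (2, "b"): 1,
    }

    def __init__(self):
        self.state = 0
        self.history = []

    def step(self, symbol):
        # The next state depends only on the current state and the input.
        self.state = self.TRANSITIONS[(self.state, symbol)]
        self.history.append(self.state)
        return self.state

def run(machine, symbols):
    """Feed a sequence of symbols and return the visited states."""
    return [machine.step(s) for s in symbols]

machine = ToyAutomaton()
run(machine, "aba")
snapshot = copy.deepcopy(machine)   # store the complete state ("hibernate")
first = run(machine, "ab")          # continue the original machine
restored = copy.deepcopy(snapshot)  # map the stored state back onto a machine
second = run(restored, "ab")        # replay the same input
print(first == second)
```

The continuation after restoring is identical to the original continuation, however much time passes between snapshot and restore. It is exactly this property that makes the notion of “state” reasonable for logical machines.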

For a long period of time, people thought that such a thing would be an ideal machine. Since it was supposed to be ideal, it was also a matter of God, and in turn, since God could not do nonsense (as it was believed), the world had to be a machine. In essence, this was the reasoning in the startup-phase of the Renaissance; remember Descartes’s or Leibniz’s ideas about machines. Later, Laplace claimed perfect predictability for the universe, if he could measure everything, as he said. Not quite randomly Leibniz also thought about the possibility to create any thought by combination from a rather limited set of primitives, and in that vein he also proposed binary encoding. Elsewhere we will discuss whether real computers as simulators of logic machines can just and only behave deterministically (they do not…).

Note that we are not just talking about the rather trivial case of Finite State Automata. We explicitly include the so-called Universal-Turing-Machine (UTM) into our considerations, as well as Cellular Automata, for which some interesting rules are known, producing unpredictable though not random behavior. The common property of all these entities is the positive definiteness. It is important to understand that physical computers must not be conceived as UTMs. The UTM is a logical machine, while the computer is a physical instance of it. At the same time it is more, but also less than a UTM. The UTM consists of operations virtually without a body and without matter, and thus also without the challenge of a time viz. signal horizon: things, which usually cause trouble when it comes to exactness. The particular quality of the unfolding self-organization in Reaction-Diffusion-Systems is, besides other design principles, dependent on effective signal horizons.
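As a hedged illustration of a cellular automaton with “unpredictable though not random behavior,” here is a minimal sketch of an elementary cellular automaton. The choice of Rule 30, a commonly cited example, is our assumption and is not named in the text above; the periodic boundary, the grid width and the function names are likewise made up for this sketch.

```python
def step(cells, rule=30):
    """One synchronous update of an elementary cellular automaton
    with periodic boundary conditions."""
    n = len(cells)
    out = []
    for i in range(n):
        left, center, right = cells[i - 1], cells[i], cells[(i + 1) % n]
        # The three-cell neighborhood selects one bit of the rule number.
        index = (left << 2) | (center << 1) | right
        out.append((rule >> index) & 1)
    return out

def run(width=31, steps=15, rule=30):
    """Start from a single live cell in the middle and collect all rows."""
    cells = [0] * width
    cells[width // 2] = 1
    rows = [cells]
    for _ in range(steps):
        cells = step(cells, rule)
        rows.append(cells)
    return rows

for row in run():
    print("".join("#" if c else "." for c in row))
```

Every cell is exactly defined at every step, and running the program twice gives the identical pattern. The irregular look of the pattern does not make the machine any less positive definite.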

Complex systems are different, and so are living systems (see posts about complexity). Their travel through parameter space is not reversible. Even “simple” chemical processes are not reversible. So, neither the brain nor the mind could be described as reversible entities. Even if we could measure a complex system at a given point in time “perfectly,” i.e. far beyond quantum mechanical thresholds (if such a statement makes any sense at all), even then the complex system will return to increasing unpredictability, because such systems are able to generate information [4]. Besides stability, they are also deeply nested, where each level of integration can’t be reduced to the available descriptions of the next lower level. Standard computer programs are thus an inappropriate metaphor for the brain as well as for the mind. Again, there is the strategic problem for the reductionist trying to defend the usage of the concept of states to describe mental issues, as reversibility would a priori assume complete measurability, which first has to be demonstrated, before we could talk about “states” in the brain or “in” the mind.

So, we drop the possibility that either the brain or the mind is an automaton. A philosophically inspired biological reductionist then probably will resort to the concept of mechanism. Mechanisms are habits of matter. They are micrological and more local with respect to the more global explanandum. Mechanisms do not claim a deterministic causality for all the parts of a system, as the naive mechanists of earlier days did. Yet, referring to mechanisms imports the claim that there is a linkage between probabilistic micrological (often material) components and a reproducible overall behavior of the “system.” The micro-component can be modeled deterministically or probabilistically following very strong rules; the overall system then shows some behavior which can not be described by the terms appropriate for the micro-level. Adapted to our case of mental states that would lead us to the assumption that there are mechanisms. We could not say that these mechanisms lead to states, because the reductionist first has to prove that mechanisms lead to stability. However, mechanisms do not provide any means to argue on the more integrated level. Thus we conclude that, funnily enough, resorting to the concept of probabilistic mechanism includes the assumption that it is not appropriate to talk about states. Again a bad card for the reductionist heading for the states in the mind.

Instead, systems theory uses concepts like open systems, dynamic equilibrium (which actually is not an equilibrium), etc. The result of the story is that we can not separate a “something” in the mental processes that we could call a state. We have to speak about processes. But that is a completely different game, as Whitehead was the first to demonstrate.

The assignment of a “mental state” itself is empty. The reason is that there is nothing we could compare it with. We only can compare behavior and language across subjects, since any other comparison of two minds always includes behavior and language. This difficulty is nicely demonstrated by the so-called Turing-test, as well as Searle’s example of the Chinese Room. Both examples describe situations where it is impossible to separate something in the “inner being” (of computers, people or chambers with Chinese dictionaries); it is impossible, because that “inner being” has no neighbor, as Wittgenstein would have said. As already said, there is nothing which we could compare with. Indeed, Wittgenstein said so about the “I” and refuted its reasonability, ultimately arriving at a position of “realistic solipsism.” Here we have to oppose the misunderstanding that an attitude like ours denies the existence of mental affairs of other people. It is totally o.k. to believe, and to act according to this belief, that other people have mental affairs in their own experience; but it is not o.k. to call that a state, because we can not know anything about the inner experience of private realities of other people which would justify the assignment of the quality of a “state.” We also could refer to Wittgenstein’s example of pain: it is nonsense to deny that other people have pain, but it is also nonsense to try to speak about the pain of others in a way that claims private knowledge. It is even nonsense to speak about one’s own pain in a way that would claim private knowledge, not because it is private, but because it is not a kind of knowledge. Although we are used to thinking that we “know” the pain, we do not. If we did, we could speak exactly about it, and for others it would not be unclear in any sense, much like: I know that 5>3, or things like that. But it is not possible to speak in this way about pain.
There is a subtle translation or transformation process in between the physiological process of releasing prostaglandin at the cellular level and the final utterance of the sentence “I have a certain pain.” The sentence is public, and mandatorily so. Before that sentence, the pain has no face and no location even for the person feeling the pain.

You might say, o.k. there is physics and biology and molecules and all the things we have no direct access to either. Yet, again, these systems behave deterministically, at least some of them we can force to behave regularly. Electrons, atoms and molecules do not have individuality beyond their materiality, they can not be distinguished, they have no memory, and they do not act in their own symbolic space. If they did, we would have the same problem as with the mental affairs of our conspecifics (and chimpanzees, whales, etc.).

Some philosophers, particularly those calling themselves analytic, claim that not only feelings like happiness, anger etc. require states, but also that intentions would do so. This, however, would make it even harder to justify the assignment of states to mental affairs, since intentions are the result of activities and processes in the brain and the mind. Yet, from that perspective one could try to claim that mental states are the result of calculations or deterministic processes. As for mathematical calculations, there could be many ways leading to the same result. (The identity theory between physical and mental affairs has been refuted first by Putnam 1967 [5].) On the level of the result we unfortunately can not tell anything about the way it was achieved. This asymmetry is even true for simple mathematics.

Mental states are often conceived as “dispositions”; just before, we talked about anger and happiness, notwithstanding more “theoretical” concepts. Regarding this usage of “state,” I suppose it is circular, or empty. We can not talk about the other’s psychic affairs except through the linkage we derive by experience. This experience links certain types of histories or developments with certain outcomes. Yet, there is no fixation of any kind, and especially not in the sense of a finite state automaton. That means that we are mapping probability densities to each other. It may be natural to label those, but we can not claim that these labels denote “states.” Those labels are just that: labels. Perhaps negotiated into some convention, but still, just labels. Not to be aware of this means to forget about language, which really is a pity in case of “philosophers.” The concept of “state” is basically a concept that applies to the design of (logical) machines. For these reasons it is thus not possible to use “state” as a concept where we attempt to compare (hence to explain) different entities, one of which is not the result of design. Thus, it is also not possible to use “states” as a kind of “explaining principle” for any kind of further description.

One way to express the reason for the failure of the supervenience claim is that it mixes matter with information. A physical state (if that would be meaningful at all) can not be equated with a mind state, in any of its possible ways. If the physical parameters of a brain change, the mind affairs may or may not be affected in a measurable manner. If the physical state remains the same, the mental affairs may remain the same; yet, this does not matter: Since any sensory perception alters the physical makeup of the brain, a constant brain would be simply dead.

If we accepted the computationalist hypothesis about the brain/mind, we would have to call the “result” a state, or the “state” a result. Both alternatives feel weird at least with respect to a dynamic entity like the brain, though they even feel weird with respect to arithmetic. There is no such thing in the brain as a finite algorithm that stops when finished. There are no “results” in the brain, something even hard-core reductionistic neurobiologists would admit. Yet, again, exactly this determinability would have to be demonstrated by the reductionist in order to justify the usage of “state”; he can not refer to it as an assumption.

The misunderstanding is quite likely caused by the private experience of stability in thinking. We can calculate 73+54 with stable results. Yet, this does not tell anything about the relation between matter and mind. The same is true for language. Again, the hypothesis underlying the claim of supervenience is denying the difference between matter and information.

Besides the fact that the reductionist is running again into the same serious tactical difficulties as before, this now is a very interesting point, since it is related to the relation of brain and mind on the one side and actions and language on the other. Where do the words we utter come from? How is it possible to express thoughts such that it is meaningful?

Of course, we do not run a database with a dictionary inside it in our head. Not only do we not do so; in that way it would not be possible to produce and to understand language at all, even to the slightest extent. Secondly, we learn language; it is not innate. Even the capability to learn language is not innate, contrary to a popular guess. Just think about Kaspar Hauser, who never mastered it better than a 6-year-old child. We need an appropriately trained brain to become able to learn a language. Were the capability for language innate, we would not have difficulties learning any language. We all know that the opposite is true, many people having severe difficulties learning even a single one.

Now, the questions of (1) how to become able to learn a language and (2) how to program a computer so that it becomes able to understand language are closely related. The programmer can NOT put the words into the machine a priori, as that would be self-delusory. Moreover, the meaning of something can not be determined a priori without referring to the whole Lebenswelt. That’s the result of Wittgenstein’s philosophy as well as Putnam’s final conclusion. Meaning is not a mental category, even though it always requires several brains to create something we call “meaning” (emphasis on several). The words are somewhere in between, between the matter and the culture. In other words, there must be some kind of process that includes modeling, binding, symbolization, habituation, directed both to its substrate, the brain matter, and to its supply, the cultural life.

We will discuss this aspect elsewhere in more detail. Yet, for the reductionist trying to defend the usage of the concept of states for the description of mental affairs, this special dynamics between the outer world and the cognitively established reality, which embeds our private use of language, is the final defeat for state-oriented reductionisms.

Nevertheless we humans often feel inclined to use that strange concept. The question is why do we do so, and what is the potential role of that linguistic behavior? If we take the habit of assigning a state to mental affairs of other people as a language game, a bunch of interesting questions come to the fore. These are by far too complex and too rich to be discussed here. Language games are embedded into social situations, and after all, we always have to infer the intentions of our partners in discourse, we have to establish meaning throughout the discourse, etc. Assigning a mental state to another being probably just means “Hey, look, I am trying to understand you! Would you like to play the mutual interpretation game?” That’s ok, of course, for the pragmatics of a social situation, like any invitation to mutual inferentialism [6], and like any inferentialism it is even necessary, from the perspective of the pragmatics of a given social situation. Yet, this designation of understanding should not mistake the flag for the message. Demonstrating such an interest need not even be a valid hypothesis within the real-world situation. Ascribing states in this way, as an invitation for inferring my own utterances, is even unavoidable, since any modeling requires categorization. We just have to resist assigning these activities any kind of objectivity that would refer to the inner mental affairs of our partner in discourse. In real life, doing so instead is inevitably and always a sign of deep disrespect of the other.

In philosophy, Deleuze and Guattari in their “Thousand Plateaus” (p.48) were among the first to recognize the important abstract contribution Darwin made by means of his theory. He opened the possibility of replacing types and species by populations, and degrees by differential relations. Darwin himself, however, was not able to complete this move. It took another 100 years until Manfred Eigen coined the term quasi-species as an increased density in a probability distribution. Talking about mental states is nothing but a fallback into Linnaean times, when science was the endeavor to organize lists according to an uncritical use of concepts.

Some Consequences

The conclusion is that we cannot use the concept of state for dealing with mental or cognitive affairs in any imaginable way without stumbling into serious difficulties. We should definitely drop it from our vocabulary about the mind (and the brain as well). Assuming mental states in other people renders those other people into deterministic machines. Thus, doing so would even have serious ethical consequences. Unfortunately, many works by many philosophers are rendered into mere garbage by mistakenly referring to this bad concept of “mental states.”

Well, what are the consequences for our endeavor of machine-based epistemology?

The most salient one is that we cannot use digital computers to produce language understanding as long as we use these computers as deterministic machines. If we still want to try (and we do), then we need mechanisms that introduce aspects that

  • are (at least) non-deterministic;
  • produce manifolds with respect to representations, both on the structural level and “content-wise”;
  • start with probabilized concepts instead of compound symbolic “whole-sale” items (see also the chapter about representation);
  • acknowledge the impossibility of analyzing a kind of causality or, equivalently, states inside the machine in order to “understand” the process of language at a microscopic level: claiming ‘mental states’ is a garbage state, whether it is assigned to people or to machines.

Fortunately enough, we found further important constraints for our implementation of a machine that is able to understand language. Of course, we need further ingredients, but for now these results are seminal. You may wonder about such mechanisms and the possibility of implementing them on a computer. Be sure, they are there!

  • [1] Hilary Putnam, Mind, Language and Reality. Cambridge University Press, 1979. p.346.
  • [2] Ilya Prigogine.
  • [3] Reaction-Diffusion-Systems: Gray-Scott-systems, Turing-systems
  • [4] Grassberger, 1988. Physica A.
  • [5] Hilary Putnam, 1967, ‘The Nature of Mental States’, in Mind, Language and Reality, Cambridge University Press, 1975.
  • [6] Robert Brandom, Making It Explicit. 1994.


Connectivity, Levels, and Boxes

October 22, 2011

In programming, one is constantly creating boxes, or, to be more precise, compartments. Even though there is the holy grail of completely closed and re-usable software objects, programmers are fighting all the time against blurring the boundaries of structures they themselves have created. Almost since their invention, programming languages have thus been designed to support structured programming (ALGOL 60, Pascal), which resulted in the object-oriented paradigm around the mid-1990s. In the ideal programming world of a 1990s software engineer, those “objects” are completely independent; they are black boxes regarding the way they do their job; hence, some technicians say, they behave. More profanely, they create an enclosed globality; they represent something like a transportable software agent. Of course, these agents do not “act”; they have to perform in an exactly defined way.

Similarly, databases got reasonably structured around 1985, and for more than 20 years almost any database system followed the relational approach. A similar story happened for networking, i.e. for the task of connecting the various parts of a computing system across the physical box of a computer. For decades, and still for the vast majority of installations, engineers followed the strict belief in the paraphrase of “divide & conquer,” which results invariably in a strict hierarchical system. The famous “Server/Client” architecture is legend.

In practice, perfect hierarchies rarely match. They fail increasingly the further away you are from the processor of the computer. An operating system may work in a hierarchical structure, a storage system might, as might a software system like SAP R/3, but not the people in social processes. Who would claim that thoughts appearing in natural brains follow a hierarchical structure? Thus we cannot claim that thoughts are results of applied logic. Just remember the great failure of Prolog… For a few years now, things have fortunately been changing, or at least it feels that way. There are link-based (Neo4J) or document=object-based (CouchDB) databases. Yet, if you think of so-called storage clouds (e.g. Apple’s iCloud), you are taking the donkey for the… They only look like non-hierarchical systems; in fact, internally they are still strongly hierarchical. What yields the impression is the contact between the human user (you) as a body (which is much slower than the speed of light) and the processing speed/bandwidth of networks.

Back to our endeavor of intelligent systems here. Note that we really mean “intelligent” on its own, not just intelligently engineered, or marketed. The system should not repeat programmed steps (like the ugly Deep Blue…) at the speed of light. We make the first reasonable assumption that an intelligent (computer) system is built from more or less separated modules. We would prefer to avoid assigning “functions” or “functional roles” to them for the time being. These modules have to be connected. As for the content, the programmer should not determine in advance which modules are connected in which way. Additionally, there is the practical question of the structural level on which these modules of the envisioned computing “ecosystem” should be connected.

Since around 1998 several standards for linking computer systems have emerged: ORB, Web Services, SOAP, AJAX, WebSockets, and lately REST. Of those, the last one stands apart, because it is the only one which enforces a strict reference to the behavior of the linked systems, and only that. There are only a few cases where the raw data coincide with “behavior,” namely physical sensors. Linking instances of software, however, is a different game. In order to achieve robustness against future changes, software components should be linked only on the “behavioral” level. Probably this is the problem of object request brokers (ORB), Web Services, and SOAP, none of which spread as intended, although they have been pushed into the market. They are not only too complicated; I am convinced they encourage doing the wrong things when it comes to linking different computing systems. With regard to a more thorough and philosophically sound thinking, it is probably even worse that they pretend to cover semantics, while, quite to the contrary, they drown in extensive, heavy-duty syntactical rules. Yet, no amount of syntax whatsoever will “create” semantics. ORB (and similarly RMI, or Akka) even require the same format on the binary level, as they exchange objects or signatures of objects, which are always proprietary to programming languages and their compilers. Of course, once you have decided to use just Java and nothing else, Akka is quite an elegant solution for the challenge of distributed computing!

What is wrong with those approaches with regard to intelligent systems? They are trying to standardize the interaction on the level of structural field definitions, instead of on the level of behavior, i.e. results. Thus, they are always highly specialized right from the beginning. And specialists are always in danger of becoming extinct. You only need to change a single bit, and the whole machinery will stop working, and along with it, the interaction. In other, more theoretical terms, we can state that the linked systems have to have an isomorphic interface on the level of semantics. Can you see the architectural misconception? Astonishingly, this misunderstanding happened on top of a long history of abstraction concerning standards for network transport. While for more hardware-related things there is the OSI model stack, these lessons were seemingly forgotten when it came to interactions where semantics played a role.

What we obviously need is a much stricter decoupling of the transport layer from the transported stuff, and, at the same time, a similarly strong decoupling between the source of a message and any potential corresponding receptor. The only thing that is shared between a SOURCE and a RECEPTOR is a kind of loose contract about the type of data exchanged. Secondly, they never get into contact directly. Any exchange is always mediated by some kind of relaying instance. For the time being, we call it a “message board.” Take it for now as a mixture of a telephone relay station, a kind of medium, a blackboard, a bunch of neurons and their fibers, or the stage of a theater. The message board establishes links between participants (sources or receptors) in a probabilistic manner; it is a kind of medium for the participants, or better, a kind of active milieu. In this way, links are not necessarily transparent any more. Instead, the activity of the message board allows for the usage-driven emergence of new codes. From a communicological point of view, the message board may be conceived as a milieu, offering different topologies of relaying (1:1 -> n:m) as well as different modes of establishing linkages. Participants may transfer rules for matching SOURCEs and RECEPTORs.
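The relaying idea described above can be made a little more tangible with a minimal sketch. It is only an illustration under stated assumptions, not our actual implementation: the class name `MessageBoard`, the methods `register` and `post`, and the parameter `link_probability` are hypothetical names introduced here. Sources and receptors share nothing but a content type, never touch each other, and the board decides probabilistically which links it establishes.

```python
import random
from collections import defaultdict


class MessageBoard:
    """Relays payloads between sources and receptors that never meet directly."""

    def __init__(self, link_probability=0.5, seed=None):
        # link_probability is a hypothetical parameter: the chance that the
        # board links a registered receptor to a given posted message.
        self.link_probability = link_probability
        self.rng = random.Random(seed)
        # The only shared "contract" is the content type used as the key.
        self.receptors = defaultdict(list)

    def register(self, content_type, receptor):
        """A receptor announces the type of data it is willing to receive."""
        self.receptors[content_type].append(receptor)

    def post(self, content_type, payload):
        """A source posts a payload; the board chooses the receptors."""
        delivered = 0
        for receptor in self.receptors[content_type]:
            if self.rng.random() < self.link_probability:
                receptor(payload)  # the payload itself is opaque to the board
                delivered += 1
        return delivered


def demo():
    # With probability 1.0 every matching receptor is linked (n:m relaying).
    board = MessageBoard(link_probability=1.0, seed=42)
    inbox_a, inbox_b = [], []
    board.register("text", inbox_a.append)
    board.register("text", inbox_b.append)
    board.register("image", inbox_b.append)
    count = board.post("text", "hello")
    return count, inbox_a, inbox_b
```

Lowering `link_probability` below 1.0 makes the links non-transparent to the participants: a source cannot know in advance which receptors, if any, will be reached.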

We have said that participants exchange a contract about the type of document; that is not completely true, because in this case we would again have to negotiate the interaction, which in turn would require that we standardize on the level of fields, i.e. the bit-structure of variables. Exactly this causes the mess with SOAP. Our proposal is different, leaning towards the pattern provided by biological bodies, in particular by the way the nervous system “encodes” stimuli “for” particular brain regions, or by the way the endocrine system is organized. The fact is that neither the humoral system nor the nervous system first defines a code to negotiate and then negotiates the interaction and the exchange of the data. The same is true for the immune system and its matching against infecting agents. In all these cases, the match is built into the matter, directly. In those natural systems, an effect is solicited either through repeated use, or through a large population of almost identical “interactions” between participants of such a matching game. Whether they match or not is not part of the game. They are simply there. Any response is a matter of secondary processing (intracellular amplifying biochemical chains).

This means that on both sides of the message board there have to be populations of entities which match across the medium a priori regarding the transport. Any processing then takes place inside the entities. Whether the transmitted package is processed or not is NOT subject of the transmission game anymore. This principle is diametrically opposed to approaches like Web Services, SOAP, ORB, or RMI. In those frameworks, once a message has been transmitted and matched, it is mandatory that it is also processed. That’s fine for systems where one is happy with database look-ups, i.e. in purely syntactical systems. We propose that it is unsuitable for systems that shall show some capability for semantics. The separation between successful transmission and the decision about processing is a principle that we constantly employ in language-based communication, and it is the core principle for messaging in biological systems. We negotiate neither the words nor the design of the ears and the vocal apparatus. They match a priori to any communication.
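A small sketch may illustrate the separation between successful transmission and the decision about processing. All names here (`Receptor`, `can_handle`, `broadcast`) are hypothetical and chosen for illustration only; the upper-casing merely stands in for whatever internal processing an entity performs.

```python
class Receptor:
    """An entity that always receives, but decides on its own whether to process."""

    def __init__(self, name, can_handle):
        self.name = name
        # can_handle is a hypothetical predicate private to the entity;
        # the transport knows nothing about it.
        self.can_handle = can_handle
        self.received = []
        self.processed = []

    def receive(self, package):
        # Transmission succeeds unconditionally.
        self.received.append(package)
        # Processing is a separate, internal decision.
        if self.can_handle(package):
            self.processed.append(package.upper())  # placeholder for real work
            return True
        return False


def broadcast(population, package):
    """Deliver one package to a whole population; report who processed it."""
    return {receptor.name: receptor.receive(package) for receptor in population}
```

For example, with one receptor that only handles short packages and one that only handles long ones, `broadcast(population, "hi")` reaches both, yet only the first one processes it. In an RPC-style framework, by contrast, delivery and processing would be one and the same mandatory step.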

In the case of a cognitive system that has just booted, or one that is in the booting process (like post-natal animals, if you like), the problem is a different one. It is solved by two conditions: the ability to associate in certain receiving modules and the emerging regularity of inputs to the same sections of the population of those associative modules. The result is again a “body” where the match between sender and receiver need not be negotiated. In the case of the immune system we know of “priming” the creation of receptors through “exposing” certain patterns, which are realized as molecular configurations. Yet, it is not reasonable to design the connectivity of modules in a cognitive system following the assumption that the whole system is always in a kind of peri-natal state. Instead, we assume a body made from matching “material” parts. Only if this condition is fulfilled, so our guess, will sustainable learning be possible.

Practically, then, we create the message board as an instance that fulfills the following conditions:

  • it is able to run on any physical transport protocol, such as UDP, TCP, FTP, or HTTP (in a “RESTful” manner);
  • on top of these transport protocols, a transmission procedure is implemented: a transactional process, actualized as a stack of simple and type-free XML-sets, which are known to the participants, too;
  • optionally, the participants can deliver contextual matching rules to the message board;
  • the semantic content is completely wrapped, i.e. black-boxed from the perspective of the message board, using a text-based encoding (base64).

If the receptor receiving the message can handle it, it will do so, if not, it won’t. If the population of receptors is comparatively small (as always in technical systems), the receptor also will return a trace (kind of “feedback”) to the message board, signalling the acceptance / denial of the package.
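The wrapping and the feedback trace can be sketched as follows. This is a minimal illustration, not the actual wire format: the element names `message`, `contentType`, `body` and `trace`, as well as the function names, are assumptions made here for the sake of the example. The payload travels base64-encoded inside a simple XML envelope, so the board never needs to look at it.

```python
import base64
import xml.etree.ElementTree as ET


def wrap(payload: bytes, content_type: str) -> str:
    """Black-box a payload: base64 text inside a simple XML envelope."""
    envelope = ET.Element("message")  # hypothetical element names throughout
    ET.SubElement(envelope, "contentType").text = content_type
    ET.SubElement(envelope, "body").text = base64.b64encode(payload).decode("ascii")
    return ET.tostring(envelope, encoding="unicode")


def unwrap(xml_text: str):
    """Recover content type and raw payload on the receptor's side."""
    envelope = ET.fromstring(xml_text)
    content_type = envelope.findtext("contentType")
    payload = base64.b64decode(envelope.findtext("body"))
    return content_type, payload


def receive(xml_text: str, accepted_types):
    """A receptor handles the package if it can and returns a trace either way."""
    content_type, payload = unwrap(xml_text)
    accepted = content_type in accepted_types
    trace = ET.Element("trace")
    trace.set("status", "accepted" if accepted else "denied")
    return (payload if accepted else None), ET.tostring(trace, encoding="unicode")
```

A receptor registered for `"text"` would return the payload together with an "accepted" trace; one registered only for `"image"` would return nothing but a "denied" trace. In both cases the transmission itself has succeeded.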

Such a framework is highly suited to connecting members of a population of entities that are able to perform associative learning, where those entities exist as separated “behaving objects,” just linked together by means of the messages they exchange through the message board(s). From a bird’s-eye view, the message board may not be conceived as a blackboard anymore; it is more like an active glue between neuronal instances. In neurobiology, such an entity is called glia. As has recently been discovered for the biological glia, the message glue possesses its own capabilities for processing, for amplifying, dispatching or repeating signals. We do not claim, of course, that our system of connecting artificial neuronal collections is like the glia or a simulation of it. Yet, we think that we turned away from the perspective which tries to render the transmission medium strictly invisible, and which tries to negotiate all the time between non-matching “bodies.”

Where Am I?

You are currently viewing the archives for October, 2011 at The "Putnam Program".