Forms of Life

July 12, 2012

It’s Time to Change. At least a bit.

And at least, again.

As readers of this blog, you may already know that I readily exhibit my preference for the philosophy of Ludwig Wittgenstein, as well as for that of Gilles Deleuze. The former is known as a philosopher of language. The latter is not yet known as a philosopher of biology, especially of evolution. Neither of them explained their subject. They worked with it. Of course, both of them did lots of other things as well. Anyway.

We started this blog as an investigation of some aspects of the future of machines. Hopefully, we came close(r) to what could be called philosophy. At least with regard to the two men mentioned above, I feel that I have been working on their foundations. A gnome on the shoulders of giants, perhaps. Anyway.

Philosophy without reference to life and its forms remains irrelevant. “What is philosophy?” Deleuze and Guattari asked towards the end of the last century. What “is” it, indeed? A technique? A cure? A style? Touching the wall and stepping across the border? Sustainably practiced consciousness while talking to someone else? Maybe. We can feel clearly that the simplicity of this question is somewhat deceiving.

In the future, we will refer to the concepts we discussed (discovered? (re-)invented?) in the previous essays, using them to comment on things I come across. Contingently.

One of these areas is architecture, or to be more precise, urbanism. To me it seems that there is only very little theory in this field, and where there is some, it is quite limited. I mean, there are tons of models around, but almost no theory. Even in Koolhaas’ writings, e.g. in Singapore Songlines. AMO/OMA does a lot of empirical research, in many directions, but where he refers to concepts like semiotics, he falls behind. Architects or so-called theoreticians in architecture often import certain patterns such as semiotics and grammar from linguistics, or sociological stuff like feminism or the “inevitable” critique of capitalism. But these imports do not represent theory in architecture, as theory not only provides a frame for modeling; it provides a deep milieu with its own dimensionality (see this for more details), which would include an awareness of the style that shows up in one’s own modeling. The pretended theories are merely templates for the interpretation of endless lists of phenomena. Some even try to turn architecture into a science. Or into some kind of machine. Or into some kind of psychoanalysis. None of this can provide theories, as little as historical accounts can. We will hence deal (again) with the question of theory (in architecture).

Architecture is at a crossroads. It has been sitting at that crossing of roads for quite some time now. Probably since Versailles, or S,M,L,XL. Probably since Pruitt-Igoe and its blast. Or Venturi’s visit to Las Vegas. Who knows. Architecture has always behaved as a crystallization site, a catalytic seed of growth and differentiation for the Forms of Life into which it was embedded and to which it has been contributing (of course, that story is a mutual one): the visible part of all those sediments, strata, and layers that we call the history of culture.

Yet, things started to change, I think. Architecture and its products neither provide something (as functions) nor represent anything anymore. Hence, it is probably misplaced to ask about the any in architecture (see the “any conferences”). The stuff got active. Or will, or is currently becoming active. That stuff came to life. And this issue we can’t leave uncommented! Wittgenstein and Deleuze will contribute through my assimilations.

There is more than one aspect that these developments in the domains of architecture or urbanism share with our original topic of machine-based episteme, or machines with mental capabilities. If you are a programmer, you probably know about the concept of “design patterns”. That concept was introduced into architecture by Christopher Alexander, who was originally trained as a mathematician. Remarkably, he also referred to the behavioral sciences. Besides that, there was of course also the notion of the “city as a machine”, or, some time ago, the “city as organism”. Both metaphors have probably been taken too seriously in their time. Yet Koolhaas, in the already mentioned Singapore Songlines, stated:

“I have tried to decipher its reverse alchemy, understand its genealogy, do an architectural genome project, re-create its architectural songlines.” [p.1017, his emphasis]

My impression is that Koolhaas tried to find some structural analogue which would allow him to impose some reasonable order onto his empirical findings. Yet, he did not express it in this way. Maybe due to a missing theory. The problem with the genome is, well, it’s not really a “problem”, at least not for a biologist, that a genome needs an apparatus for translation, an egg, a mother. What, then, is the corresponding relation between the female and the machine here?

Honestly, Koolhaas also brings in the conceptual pattern of the songline. Did he refer to popular music of our times? Or to that of Mozart and the particular relations between the libretto and the music? In any case, the songline is in utter need of the music. Unfortunately, Koolhaas never asks about the music of the city, the music that the city is playing. Otherwise we would have met the composer Johannes Sistermanns, or he would have discovered the power of associativity (as an abstract concept).

Thus, the first piece(s) will be a reconsideration of Koolhaas’ quite influential writings “Generic City”, “Junkspace” and “Singapore Songlines”.
۞

Formalization and Creativity as Strongly Singular Terms

February 16, 2012

Formalization is based on the use of symbols.

In the last chapter we characterized formalization as a way to give a complicated thing a symbolic form that lives within a system of other forms.

Here, we will first discuss a special property of the concepts of formalization and creativity, one that they share for instance with language. We call this property strong singularity. Then, we will sketch some consequences of this state.

What does “Strongly Singular” mean?

Before discussing (briefly) the adjacent concept of “singular terms”, I would like to add a note on the newly introduced term “strong singularity”.

The ordinary Case

Let us take ordinary language, even if this may be a difficult thing to theorize about. At least, everybody is able to use it. We can do a lot of things with language; what these doings have in common, however, is that we use it in social situations, mostly in order to elicit two “effects”: First, we trigger some interpretation or even inference in our social companion; secondly, we indicate that we did just that. As a result, a common understanding emerges, formally taken, a homeomorphism, which in turn may then serve as the basis for the assignment of so-called “propositional content”. Only then can we “talk about” something, that is, only then are we able to assign a reference to something that is external to the exchanged speech.

As said, this is the usual working of language. For instance, by saying “Right now I am hearing my neighbor exercising piano.” I can refer to common experience, or at least to a construction you would call an imagination (it is always a construction anyway). This way I refer to an external subject and its relations, a fact. We can build sentences about it, about which we could even say whether they correspond to reality or not. But, of course, this already would be a further interpretation. There is no direct access to the “external world”.

In this way we can gain (fallaciously) the impression that we can refer to external objects by means of language. Yet, this is a fallacy, based on an illegitimate shortcut, as we have seen. Nevertheless, for most parts of our language(s) it is possible to refer to external or externalized objects by exchanging the mutual inferential / interpretational assignments as described above. I can say “music” and it is pretty clear what I mean by that, even if the status of the mere utterance of a single word is somewhat deficient: it is not determined whether I intended to refer to music in general, e.g. as the totality of all pieces or as a cultural phenomenon, or to a particular piece, to a possibility of its instantiation, or to the factual instance right now. Notwithstanding this divergent variety, it is possible to trigger interpretations and to start a talk between people about music, while we neither have to play nor to listen to music at that moment.

The same holds for structural terms that regulate interpretation predominantly by their “structural” value. It is not that important for us here whether the externalization is directed to objects or to the speech itself. There is an external, even a physical justification for starting to engage in the language game about such entities.

Something different…

Now, this externalization is not possible for some terms. The most obvious is “language”. We can neither talk about language without language, nor even think “language” or have the “feeling” of language without practicing it. We also can’t investigate language without using or practicing it. Any “measurement” of language inevitably uses language itself as the means of measurement, and this includes any interpretation of speech in language as well. This self-referentiality further leads to interesting phenomena, such as “n-isms” like the dualism in quantum physics, where we also find a conflation of scales. If we fail to take this self-referentiality into consideration, we will inevitably create faults or pseudo-paradoxes.

The important issue about that is that there is no justification of language which could be expressed outside of language; hence there is no (foundational) justification for it at all. We find a quite unique setting, which corrodes any attempt at a “closed”, i.e. formal, analysis of language.

The extension of the concept “language” is at the same time an instance of it.

It is absolutely not surprising that any attempt at a fully mechanic, i.e. a priori determined or algorithmic analysis of language must fail. Wittgenstein thus arrived at the conclusion that language is ultimately embedded as a practice in the life form [1] (we would prefer the term “performance” instead). He demanded that justifications (of language games as rule-following) have to come to an end1; for him it was fallacious to think that a complete justification—or ultimate foundation—would be possible.

Just to emphasize it again: the particular uniqueness of terms like language is that they cannot be justified outside of themselves. Analytically, they start with a structural singularity. Thus the term “strong singularity”, which differs significantly from the concept of the so-called “singular term” as it is widely known. We will discuss the latter below.

The term “strong singularity” indicates the absence of any possibility for an external justification.

In §329 of the Philosophical Investigations, Wittgenstein notes:

When I think in language, there aren’t “meanings” going through my mind in addition to the verbal expressions: the language is itself the vehicle of thought.

It is quite interesting to see that symbols do not possess this particular property of strong singularity. Although they are a structural part of language, they do not share this property. Hence we may conceive of this as a remarkable instance of a Deleuzean double articulation [2] in the midst of thinking itself. There would be a lot to say about it, but it would not fit here.

Further Instances

Language now shares the property of strong singularity with formalization. We can neither have the idea nor the feeling of formalization without formalization, and we cannot even perform formalization without prior higher-order formalization. There is no justification of formalization which could be expressed outside of formalization; hence there is no (foundational) justification for it at all. The parallel is obvious: Would it then be necessary, for instance, to conclude that formalization is embedded in the life form much in the same way as is the case for language? That mere performance precedes logics? Precisely this could be concluded from the whole of Wittgenstein’s philosophical theory, as Colin Johnston suggested [3].

Performative activity precedes any possibility of applying logics in the social world; formulated the other way round, we can say that transcendental logics gets instantiated into an applicable quasi-logics. Against this background, the idea of truth functions determining a “pure” or ideal truth value is rendered into an importunate misunderstanding. Yet, formalization and language are not only similar with regard to this self-referentiality; they are also strictly different. Nevertheless, so the hypothesis we try to strengthen here, formalization resembles language in that we cannot have the slightest thought or even any mental operation without formalization. It is even the other way round, in that any mental operation invokes a formalizing step.

Formalization and language are not the only entities which exhibit self-referentiality and which cannot be defined from any kind of outside stance. Theory, model and metaphor belong to the family too, not to forget thinking, hence creativity, at large. A peculiar representative of these terms is the “I”. Close relatives, though not as critical as the former ones, are concepts like causality or information. All these terms are not only self-referential, they are also cross-referential. Discussing any of them automatically involves the others. Many instances of deep confusion derive from the attempt to treat them separately, across many domains from the neurosciences, sociology, computer science and mathematics up to philosophy. Since digital technologies are deeply based on formalization and have been developing further into a significant deep structure of our contemporary life form, any area where software technology is pervasively used is endangered by the same misunderstandings. One of these areas is architecture and city-planning, or more generally, any discipline where language or the social is involved as a target of the investigation.

There is a last point to note about self-referentiality. Self-referentiality may well lead to a situation that we have described as “complexity”. From this perspective, self-referentiality is a basic condition for the potential of novelty. It is thus interesting to see that this potential is directly and natively implanted into some concepts.

Singular Terms

Now we will briefly discuss the concept of the “singular term” as it is usually referred to. Yet, there is no full agreement about this issue of singular terms, in my opinion mainly due to methodological issues. Many proponents of analytical philosophy simply “forget that they are speaking”, in the sense mentioned above.

The analytical perspective

Anyway, according to the received view, names are singular terms. It is said that the referents of singular terms are singular things or objects, even if they are immaterial, like the unicorn. The complete distinctive list of singular terms would look like this:

  • proper names (“Barack Obama”);
  • labeling designations (“U.S. President”);
  • indexical expressions (“here”, “this dog”).

Such singular terms are distinguished from so-called general terms. Following Tugendhat [4], who refers in turn to Strawson [5], the significance of a general term F consists in the conditions that must be fulfilled for F to match one or several objects. In other words, the significance of a singular term is given by a rule for identification, while the significance of a general term is given by a rule for classification. As a consequence, singular terms require knowledge about general terms.
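Just to make this analytical scheme tangible, here is a toy sketch in Python. The tiny “world” and all names in it are hypothetical illustrations of ours, not part of Tugendhat’s or Strawson’s apparatus; a general term appears as a rule for classification, a singular term as a rule for identification that presupposes such classifications.

```python
# Toy rendering of the analytical view: general terms classify,
# singular terms identify. World and names are purely hypothetical.

world = [
    {"name": "Barack Obama", "kind": "person", "role": "U.S. President"},
    {"name": "Angela Merkel", "kind": "person", "role": "Chancellor"},
    {"name": "Fido", "kind": "dog", "role": None},
]

def is_person(obj):
    """General term: a rule for classification, matching one or several objects."""
    return obj["kind"] == "person"

def the_us_president(context):
    """Singular term: a rule for identification, picking out exactly one object.
    Note how it presupposes the classification provided by general terms."""
    candidates = [o for o in context if is_person(o) and o["role"] == "U.S. President"]
    assert len(candidates) == 1, "a singular term must identify a unique object"
    return candidates[0]

print(the_us_president(world)["name"])   # -> Barack Obama
```

Note that the sketch silently presupposes exactly what we will put into question below: that identification could be settled before the sentence, i.e. the whole social exchange, is understood.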

Such statements are typical for analytical philosophy.

There are serious problems with this. Even the labeling is misleading. It is definitely NOT the term that is singular. Singular is at most a particular contextual event, which we decided to address by a name. Labelings and indexical expressions are not necessarily “singular,” and quite frequently the same holds for names. Think about “John Smith” first as a name, then as a person… This mistake is quite frequent in analytic philosophy. We can trace it even into the philosophy of mathematics [6], when it comes to certain claims of set theory about infinity.

The relevance for the possibility of machine-based episteme

There can be little doubt, as we have already expressed elsewhere, that human cognition can’t be separated from language. Even the use of the most primitive tools, let alone their production and distribution, requires the capability for at least a precursor of language, some first steps into languagability.

We know by experience that, in our mother tongue, we can understand sentences that we never heard before. Hence, understanding of language (quite likely, like any understanding) is bottom-up, not top-down, at least in the beginning of the respective processes. Thus we have to ask about the sub-sentential components of a sentence.

Such components are singular terms. Imagine some perfectly working structure that comprises the capability for arbitrary classification as well as the capability for non-empirical analogical thinking based on dynamic symmetries. The machine would not only be able to perform the transition from extensions to intensions, it would even be able to abstract the intension into a system of meta-algebraic symmetry relations. Such a system, or better, its programmer, would then be faced with the problem of naming and labeling. Somehow the intensions have to be made addressable. A private index does not help, since such an index would be without any value for communication purposes.

The question is how to make the machine refer to the proper names. We will see elsewhere (forthcoming: “Waves, Words, and Images“) that this question will lead us to the necessity of multi-modality in processing linguistic input, e.g. language and images together in the same structure (which is just another reason to rely on self-organizing maps and our methodology of modeling).
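To give at least a rough hint of what that could look like, here is a minimal self-organizing map in Python/numpy. It is a toy written for this essay, not our actual methodology; the “multi-modal” input is simulated by hypothetical feature vectors, imagined as concatenated word and image features, that get mapped onto one and the same grid.

```python
import numpy as np

def train_som(data, grid=(8, 8), epochs=30, lr0=0.5, sigma0=3.0, seed=0):
    """Minimal self-organizing map: rows of `data` are mapped onto a 2D grid."""
    rng = np.random.default_rng(seed)
    h, w = grid
    weights = rng.random((h, w, data.shape[1]))
    coords = np.dstack(np.mgrid[0:h, 0:w])          # (h, w, 2) node coordinates
    steps, t = epochs * len(data), 0
    for _ in range(epochs):
        for x in rng.permutation(data):
            lr = lr0 * (1 - t / steps)              # decaying learning rate
            sigma = sigma0 * (1 - t / steps) + 0.5  # shrinking neighborhood
            # best-matching unit: the node whose weight vector is closest to x
            d = np.linalg.norm(weights - x, axis=2)
            bmu = np.unravel_index(np.argmin(d), d.shape)
            # pull the BMU and its grid neighbors towards x
            dist2 = ((coords - np.array(bmu)) ** 2).sum(axis=2)
            nbh = np.exp(-dist2 / (2 * sigma ** 2))
            weights += lr * nbh[..., None] * (x - weights)
            t += 1
    return weights

# Hypothetical multi-modal input: word features and image features
# concatenated into one vector per observed "event".
events = np.random.default_rng(1).random((200, 12))
som_weights = train_som(events)
```

The point of the toy is merely structural: heterogeneous modalities end up as neighborhoods on one and the same map, and it is to such regions, not to pre-programmed symbols, that names would have to be attached.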

Refutation of the analytical view

The analytical position about singular terms does not provide any help or insight into the particular differential quality of terms as words that denote a concept.2 Analytical statements as cited above are inconsistent, if not self-contradictory. The reason is simple. Words as placeholders for concepts cannot have a particular meaning attached to them by principle. The meaning, even that of subsentential components, is an issue of interpretation, and the meaning of a sentence is given not only by its own totality; it also depends on the embedding of the sentence itself into the story or the social context in which it is performed.

Since “analytic” singular terms require knowledge about general terms, and the general terms are only determined once the sentence is understood, it is impossible to identify or classify individual terms, whether singular or general, before the propositional content of the sentence is clear to the participants. That propositional content of the sentence, however, is, as Robert Brandom convincingly argues in chapter 6 of [7], only accessible through its role in the inferential relations between the participants of the talk as well as the relations between sentences. Thus we can easily see that the analytical concept of singular terms is empty, if not self-nullifying.

The required understanding of the sentence is missing in the analytical perspective; the object dominates over the sentence, which runs against any real-life experience. Hence, we’d also say that the primacy of interpretation is not fully respected. What we’d need instead is a kind of bootstrapping procedure that works within a social situation of exchanged speech.

Robert Brandom moves this bootstrapping into the social situation itself, which starts with a medial symmetry between language and socialness. There is, coarsely speaking, a rather fixed choreography to accomplish that. First, the participants have to be able to maintain what Brandom calls a deontic account. The sequence starts with a claim, which includes the assignment of a particular role. This role must be accepted and returned, which is established by signalling that the inference / interpretation will be done. Both the role and the acceptance are dependent on the claim, on the deontic status of the participants, and on the intended meaning. (Now I have summarized about 500 pages of Brandom’s book… but, as said, it is a very coarse summary!)

Brandom (chp. 6) investigates the issue of singular terms. For him, the analytical perspective is not acceptable, since for him, as is the case for us, there is a primacy of interpretation.

Brandom refutes the claim of analytical philosophy that singular names designate single objects. Instead he strives to determine the necessity and the characteristics of singular terms by a scheme that distinguishes particular structural (“syntactical”) and semantic conditions. These conditions further diverge between the two classes of possible subsentential structures, the singular terms (ST) and predicates (P). Syntactically, STs take the role of being substituted-for / substituting, while Ps take the structural role of providing a frame for such substitutions; semantically, STs are characterized exclusively by so-called symmetric substitution-inferential commitments (SICs), whereas Ps also take asymmetric SICs. Those inferential commitments link the deontic, i.e. ultimately the socialness of linguistic exchange, to the linguistic domain of the social exchange. We may hence also characterize the whole situation as described by Brandom as a cross-medial setting, where socialness and the linguistic domain mutually provide each other a medial embedding.
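Our own crude rendering of this asymmetry, certainly not Brandom’s formal apparatus, may illustrate it: substitution licenses are directed edges; for singular terms every license comes with its converse, for predicates it need not.

```python
# Toy rendering of substitution-inferential commitments (SICs):
# a license (a -> b) commits us to infer S(b) from S(a).
# ST licenses are symmetric, P licenses may be one-way only.

st_licenses = {("Mark Twain", "Samuel Clemens")}           # ST: identity-like
st_licenses |= {(b, a) for (a, b) in st_licenses}          # hence symmetric

p_licenses = {("is a dog", "is an animal")}                # P: one-way only

def infer(claim, licenses):
    """All claims reachable from one (subject, predicate) claim via licenses."""
    results, changed = {claim}, True
    while changed:
        changed = False
        for (a, b) in licenses:
            for (subj, pred) in list(results):
                new = {(b, pred)} if subj == a else set()
                new |= {(subj, b)} if pred == a else set()
                fresh = new - results
                if fresh:
                    results |= fresh
                    changed = True
    return results

for c in sorted(infer(("Mark Twain", "is a dog"), st_licenses | p_licenses)):
    print(c)
# The symmetric ST license also yields "Samuel Clemens is a dog";
# the asymmetric P license yields "... is an animal", but never its converse.
```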

Interestingly, this simultaneous cross-mediality also represents a “region”, or a context, where materiality (of the participants) and immateriality (of information qua interpretation) overlap. We find, so to speak, an event-like situation just before the symmetry-break that we may identify as meaning. In some respect, Brandom’s scheme provides us with the pragmatic details of a Peircean sign situation.

The Peirce-Brandom Test

This has been a very coarse sketch of one aspect of Brandom’s approach. Yet, we have seen that language understanding cannot be understood if we neglect the described cross-mediality. We therefore propose to replace the so-called Turing test by a procedure that we propose to call the Peirce-Brandom test. That test would prove the capability to take part in semiosis, and the choreography of the interaction scheme would guarantee that references and inferences are indeed performed. In contrast to the Turing test, the Peirce-Brandom test can’t be “faked”, e.g. by a “Chinese Room” (Searle [8]). Besides, to find out whether the interaction partner is a “machine” or a human we should not ask them anything, since the question as a grammatical form of social interaction corroborates the complexity of the situation. We just should talk to it/her/him. The Searlean homunculus inside the Chinese room would not be able to look up anything anymore. He would have to be able to think in Chinese and as Chinese, q.e.d.

Strongly Singular Terms and the Issue of Virtuality

The result of Brandom’s analysis is that the label of singular terms is somewhat dispensable. These terms may be taken as if they point to a singular object, but there is no necessity for that, since their meaning is not attached to the reference to the object, but to their role in performing the discourse.

Strongly singular terms are strikingly different from those (“weakly”) singular terms. Since they found themselves while being practiced, through their self-referential structure, it is not possible to find any “incoming” dependencies. They are seemingly isolated on their passive side; there are only outgoing dependencies towards other terms, i.e. other terms are dependent on them. Hence we could also call them “(purely) active terms”.

What we can experience here in a quite immediate manner is pure potentiality, or virtuality (in the Deleuzean sense). Language imports potentiality into material arrangements, which is something that programming languages or any other finite state automaton can’t accomplish. That’s the reason why we have been heftily denying, all along, the reasonability of talking about states when it comes to the brain or the mind.

Now, at this point it is perfectly clear why language can be conceived as ongoing creativity. Without ongoing creativity, the continuous actualization of the virtual, there wouldn’t be anything that would take place, there would not “be” language. For this reason, the term creativity belongs to the small group of strongly singular terms.

Conclusion

In this series of essays about the relation between formalization and creativity we have achieved an important methodological milestone. We have found a consistent structural basis for the terms language, formalization and creativity. The common denominator for all of those is self-referentiality. On the one hand this becomes manifest in the phenomenon of strong singularity, on the other hand this implies an immanent virtuality for certain terms. These terms (language, formalization, model, theory) may well be taken as the “hot spots” not only of the creative power of language, but also of thinking at large.

The aspect of immanent virtuality implies a highly significant methodological move concerning the starting point for any reasoning about strongly singular terms. Yet, this we will check out in the next chapter.

Part 1: The Formal and the Creative, Introduction

Part 3: A Pragmatic Start for a Beautiful Pair


Notes

1. Wittgenstein repeatedly expressed this from different perspectives. In the Philosophical Investigations [1], PI §219, he states: “When I obey the rule, I do not choose. I obey the rule blindly.” In other words, there is usually no reason to give, although one always can think of some reasons. Yet, it is also true that (PI §10) “Rules cannot be made for every possible contingency, but then that isn’t their point anyway.” This leads us to §217: “If I have exhausted the justifications I have reached bedrock, and my spade is turned. Then I am inclined to say: ‘This is simply what I do’.” Rules are never intended to remove all possible doubt, thus PI §485: “Justification by experience comes to an end. If it did not it would not be justification.” Later, Quine accordingly proved from a different perspective what today is known as the indeterminacy of empirical reason (“Word and Object”).

2. There are, of course, other interesting positions, e.g. that elaborated by Wilfrid Sellars [9], who distinguished different kinds of singular terms: abstract singular terms (“triangularity”) and distributive singular terms (“the red”), in addition to standard singular terms. Yet, the problem from which the analytical position suffers also hits the position of Sellars.

References

  • [1] Ludwig Wittgenstein, Philosophical Investigations.
  • [2] Gilles Deleuze, Félix Guattari, Mille Plateaux (A Thousand Plateaus).
  • [3] Colin Johnston (2009). Tractarian objects and logical categories. Synthese 167: 145–161.
  • [4] Ernst Tugendhat, Traditional and Analytical Philosophy, 1976.
  • [5] P.F. Strawson, 1974.
  • [6] Victor Rodych, “Wittgenstein’s Philosophy of Mathematics”, The Stanford Encyclopedia of Philosophy (Summer 2011 Edition), Edward N. Zalta (ed.), http://plato.stanford.edu.
  • [7] Robert Brandom, Making It Explicit, 1994.
  • [8] John Searle (1980). Minds, Brains and Programs. Behavioral and Brain Sciences 3 (3), 417–424.
  • [9] Wilfrid Sellars, Science and Metaphysics: Variations on Kantian Themes, Ridgeview Publishing Company, Atascadero, California [1967] 1992.

۞

Theory (of Theory)

February 13, 2012

Thought is always abstract thought,

so thought is always opposed to work involving hands. Isn’t it? It is generally agreed that there are things like theory and practice, which are believed to belong to different realms. Well, we think that this perspective is inappropriate and misleading. Deeply linked to this first problem is a second one, the distinction between model and theory. Indeed, there are ongoing discussions in current philosophy of science about those concepts.

Frequently one can meet the claim that theories are about predictions. It is indeed the received view. In this essay we try to reject precisely this received view. As an alternative, we offer a Wittgensteinian perspective on the concept of theory, with some Deleuzean, dedicatedly post-Kantian influences. This perspective we could call a theory about theory. It will turn out that this perspective not only is radically different from the received view, it also provides some important, otherwise unachievable benefits (in still rather imprecise wording) concerning both “practical” as well as philosophical aspects. But let us first start with some examples.

Even before that, let me state clearly that there is much more about theory than can be mentioned in a single essay. Actually, this essay is based on a draft for a book on the theory of theory that comprises some 500 pages…

The motivation to think about theory derives from several hot spots. Firstly, it is directly and intrinsically implied by the main focus of the first “part” of this blog on the issue of (the possibility for a) machine-based episteme. We as humans can only know because we can willingly take part in a game that could appropriately be described as mutual and conscious theorizing-modeling induction. If machines should ever develop the capability for their own episteme, for their autonomous capability to know, they necessarily have to be able to build theories.

A second strain of motivation comes from the field of complexity. There are countless publications stating that it is not possible to derive a consistent notion of complexity, ranging from Niklas Luhmann [1986] to Hermann Haken [2012], leading either to a rejection of the idea that it is a generally applicable concept, or to an empty generalization, or to a reduction. Obviously, people are stating that there is no possibility for a theory of complexity. On the other hand, complexity is more and more accepted as a serious explanatory scheme across disciplines, from material science to biology, sociology and urbanism. Complexity is also increasingly a topic in the field of machine-based episteme, e.g. through the concept of self-organizing maps (SOM). This divergence needs to be clarified, and to be dissolved, of course.

The third thread of motivation is given by another field where theory has usually been regarded as something exotic: urbanism and architecture. Is talking about architecture, e.g. its history, without actually using this talking in the immediate context of organizing and raising a building, already “theory”? Are we allowed to talk in this way at all, thereby splitting talking and doing? Another issue in these fields is the strange subject of planning. Plans are neither models nor theory, nor operation, and planning often fails, not only in architecture, but also in the IT industry. In order to understand the status of plans, we first have to get clear about the abundant parlance that distinguishes “theory” and “practice”.

Quite obviously, a proper theory of theory in general, that is, not just a theory about a particular theory, is also highly relevant for what is known as the theory of theory change, or, in terms often used in the field of Artificial Intelligence, belief revision. If we do not have a proper theory of theory at our disposal, we will also not be able to talk reasonably about what it could mean to change a belief. Actually, the topic of beliefs is so relevant that we will discuss it in a dedicated essay. For the time being, we just want to point out the relevance of our considerations here. Later, we will include a further short remark about it.

For these reasons it is vital in our opinion (and for us) to understand the concept of theory better than it is possible on the basis of current mainstream thinking on the subject.

Examples

In line with that mainstream attitude it has been said, for instance, that Einstein’s theory predicted—or: Einstein predicted from his theory—the phenomenon of gravitational lensing of light. In Einstein’s universe, there is no absoluteness regarding the straightness of a line, because space itself has a curvature that is parametrized. Another example is the so-called Standard Model, or Standard Interpretation, in particle physics. Physicists claim that this model is a theory and that it is the best available theory for making correct predictions about the behavior of matter. The core of this theory is given by the relation between two elements, the field and its respective mediating particle, a view which is a descendant of Einstein’s famous equation relating energy, mass and the speed of light. Yet, the field theory leads to the problem of infinite regress, which they hope to solve in the LHC “experiments” currently performed at CERN in Geneva. The ultimate particle that also should “explain” gravity is called the Higgs boson. The general structure of the Standard Model, however, is a limit process: the resting mass of the particles is thought to become larger and larger, such that the Higgs boson is the last possible particle, leaving gravitation and the graviton still unexplained. There is also a pretty arrangement of the basic types of elementary particles that is reminiscent of the periodic table in chemistry. Anyway, by means of that Standard Model it is possible to build computers, or at least logical circuits, where a bit is represented by just some 20 electrons. Likewise, Einstein’s theory has a direct application in GPS, where a highly accurate common time base shared between the satellites is essential.

Despite these successes there are still large deficits in the theory. Physicists say that they have not detected gravitational waves so far, which are said to be predicted by their theory. Well, physics does not even offer any insight about the genesis of electric charges and magnetism. These are treated as phenomena, leaving a strange gap between the theory and the macroscopic observations (note that the Standard Model does NOT allow decoherence into a field, but rather only into particles). Moreover, physicists do not have even the slightest clue about some mysterious entities in the universe that they call “dark matter” and “dark energy”, except that these exert positive or negative gravitational force. I personally tend to rate this as one of the largest (bad) jokes of science ever: building and running the LHC (around 12 billion $ so far) on the one hand, and at the same time seriously taking the road back into mythic medieval language. We again meet dark ages in physics, not only dark matter and dark energy.

Traveling Dark Matter in a particular context, reflecting and inducing new theories: The case of Malevich and his holy blackness.1

Anyway, that’s not our main topic here. I cited these examples just to highlight the common usage of the concept of theory, according to which a theory is a more or less mindful collection of proposals that can be used to make predictions about worldly facts.

To be Different, or not to be Different…

But what is then the difference between theories and models? The concept of model is itself an astonishing phenomenon. Today, it is almost ubiquitous. We can hardly imagine anymore that only a few generations ago, back in the 19th century, the concept of model was used mainly by architects. Presumably, it was the progress made in physics at the beginning of the 20th century, together with the foundational crisis in mathematics, that initiated the career of the concept of model (for an overview in German see this collection of pages and references).

One of the usages of the concept of model refers to the “direct” derivation of predictions from empirical observations. We can take some observations about a process D, e.g. an illness of the human body, where we know the outcome (cured or not), and then we could try to build an “empiric” model that links the observations to the outcome. Observations can include the treatment(s), of course. It is clear that predictions and diagnoses are almost synonyms.
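Purely as an illustration of such a “direct” empiric model, here is a minimal sketch in Python using scikit-learn. The data are entirely synthetic stand-ins for observations about D; nothing medical should be read into them.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for observations about "process D":
# each row is one case (symptoms, treatments), outcome = cured or not.
rng = np.random.default_rng(0)
X = rng.random((300, 5))                                    # observations incl. treatment
y = (X[:, 0] + 0.5 * X[:, 3]                                # hidden regularity ...
     + 0.1 * rng.standard_normal(300) > 0.8).astype(int)    # ... plus noise

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = LogisticRegression().fit(X_train, y_train)

# Prediction and diagnosis coincide: the model just links observations to outcome.
print("accuracy:", model.score(X_test, y_test))
```

Note that the whole normative part, the choice of features, of the model class, of the error measure, is invisible in the code; this is precisely the part we will attribute to theory.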

Where is the theory here? Many claim that there is no theory in modeling in general, and particularly that no theory is possible in the case of medicine and pharmacology. Statistical techniques are usually regarded as some kind of method. Since there is no useful generalization, it is believed that a “theory” would not be different from stating that the subject is alive. It is claimed that we are always directly faced with the full complexity of living organisms, thus we have to reduce our perspective. But stop: shouldn’t we take the notion of complexity here already as a theory?

For Darwin’s theory of natural selection it is also not easy to draw a separating line between the concepts of model and theory. Darwin indeed argued on a quite abstract level, which led to the situation that people think his theory cannot be readily tested. Some people thus feel inclined to refer to the great designer, or likewise to the Spaghetti monster. Others, notably often physicists, chemists or mathematicians, tried to turn Darwin’s theory into a system that actually could be tested. For the time being we leave this as an open issue, but we will return to it later.

Today it is generally acknowledged that measurement always implies a theory. From that we can directly conclude that the same should hold for modeling. Modeling implies a theory, as measurement implies a particular model. In the latter case the model is often actualized by the materiality or the material arrangement of the measurement device. Both the material aspects and the immaterial design aspects, which mainly concern informational filtering, establish at least implicitly a particular normativity, a set of normative rules that we can call a “model.” This aspect of the normativity of models (and of theories alike) is quite important; we should keep it in mind.

For the former relation, the implication of theories by modeling, we may expect a similar dependency. Yet, as long as we do not clearly distinguish models and theories, theories would simply be some kind of more general model. If we do not discern them, we would not need both. Actually, precisely this is the current state of affairs, at least in the mainstreams across various disciplines.

Reframing. Into the Practice of Languagability.

It is one of the stances inherited from materialism to pose questions about a particular subject in an existential, or if you like, ontological manner. Existential questions take the form “What is X?”, where the “is” already claims the possibility of an analytical treatment, implied by the sign for equality. In turn this equality, provoked by the existential parlance, claims that the equation is a lossless representation. We are convinced that this approach destroys any chance for sustainable insights already in the first move. This holds even for the concepts of “model” or “theory” themselves. Nevertheless, the questions “What is a model?” or “What is a theory?” can frequently be met (e.g. [1] p.278).

The deeper reason for these difficulties is that the existential question implies the primacy of the identity relation. Yet, the only possible identity relation is a=a, the tautology, which of course is empirically empty. Though we can write a=b, this is not an identity relation anymore. Either it is a claim, or it is based on empirical arguments, which means it is always a claim. In any case, one has to give further criteria upon which the identity a=b appears justified. The selection of those criteria lies far outside of the relation itself. It invokes the totality of the respective life form. The only conclusion we can draw from this is that the identity relation is transcendent. Despite its necessity, it cannot be part of the empirical world. The same is hence true for logic.

Claiming the identity relation for empirical facts, i.e. for any kind of experience and hence also for any thought, is self-contradictory. It implies a normativity that remains deliberately hidden. We all know about the late and always disastrous consequences of materialism on the societal level, irrespective of choosing the Marxist or the capitalist flavor.

There are probably only two ways of rejecting materialism, and thus also of avoiding its implications. Both of them reject the primacy of the identity relation, yet in slightly different ways. The first one is Deleuze’s transcendental difference, which he developed in his philosophy of the differential (e.g. in Difference & Repetition, or his book about the Fold and Leibniz). The second one is Wittgenstein’s proposal to take logic as a consequence of performance, or more precisely, as an applicable quasi-logic, and to conceive of logic as a transcendental entity. Both ways are closely related, though developed independently of each other. Of course, there are common traits shared by Deleuze and Wittgenstein, such as rejecting what had been known as “academic philosophy” in their time. All that philosophy had been positioned just as “footnotes to Plato”, Kant or Hegel.

In our reframing of the concept of theory we have been inspired by both Deleuze and Wittgenstein, yet we follow the Wittgensteinian track more explicitly in the following.

Actually, the move is quite simple. We just have to drop the assumption that entities “exist” independently. Even if we erode that idealistic independence only slightly, we are ultimately forced to acknowledge that everything we can say, know or do is mediated by language, or more generally by the conditions that imply the capability for language, in short by languagability.

In contrast to so-called “natural languages”—which actually is a revealing term—languagability is not a dualistic, bivalent off-or-on concept. It is applicable to any performing entity, including animals and machines. Hence, languagability is not only the core concept for the foundation of the investigation of the possibility of machine-based episteme. It is essential for any theory.

Following this track, we stop asking ontological questions. We even drop ontology as a whole. Questions like “What is a Theory?”, “What is Language?” etc. are almost free of any possible sense. Instead, it appears much more reasonable to accept the primacy of languagability and to ask about the language game in which a particular concept plays a certain role. The question that promises progress therefore is:

What can we say about the concept of theory as a language game?

To our knowledge, the “linguistic turn” has not been performed in the philosophy of science so far, let alone in disciplines like computer science or architecture. The consequence is a considerable mess in the respective disciplines.

Theory as a Language Game

One of the first implications of the turn towards the primacy of languagability is the vanishing of the dualism between theory and practice. Any practice requires rules, which in turn can only be referred to in the space of languagability. Of course, there is more to rule-following than the rule. Speech acts were first stratified by Austin [2] into locutionary, illocutionary and perlocutionary parts. There might be even further ones, implying evolutionary issues or the play as story-telling. (Later we will call these aspects “delocutionary”.) On the other hand, it is also true that one cannot pretend to follow a rule, as Wittgenstein recognized [3].

It is interesting in this respect that the dualistic, opposing contrast between theory and practice has not been the classical view; not just by chance it appeared as late as in the early 17th century [4]. Originally, theory just meant “to look at, to speculate”, a pairing that is interesting in itself.

Ultimately, rules are embedded in the totality of a life form (“Lebensform” in the Wittgensteinian, non-phenomenological sense), including the complete “system” of norms in charge at a given moment. Yet, most rules are themselves regulated, by more abstract ones that set the conditions for the less abstract ones. The result is of course not a perfect hierarchy; the collection of rules being active in a Lebensform is not an analytic endeavor. We already mentioned this layered system in another chapter (about “comparing”) and labeled it “orthoregulation” there. Rules are orthoregulated; without orthoregulation rules would not be rules.

This rooting of rules in the Forms of Life (Wittgenstein), the communal aspect (Putnam), the Field of Proposals (“Aussagefeld”, Foucault) or the Plane of Immanence provoked by attempting to think consistently (Deleuze), which are just different labels for closely related aspects, prevents the ultimate justification, the justifiable idea, and the presence of logical truth values or truth functions in actual life.

It is now important to recognize and to keep in mind that rules about rules do not refer to any empirical entity that could be found as a material or informational fact! Rules about rules refer to the regulated rules only. Of course, usually even the meta-rules are embedded in the larger context of valuation; the whole system should work somehow, that is, the whole system should allow for the creation of predictive models. Here we find the link to risk (avoidance) and security.

Taking an empiricist or pragmatic stance also for the “meta”-rules that are part of the orthoregulative layer, we could well say that the empirical basis of the ortho-rules is other, less abstract and less general rules.

Now we can apply the principle of orthoregulation to the subject of theory. Several implications are immediately and clearly visible, namely, and most importantly, that

  • theories are not about the prediction of empirical “non-normative” phenomena; the subject of Popper’s falsificationism is the model, not the theory;
  • theories cannot be formalized, because they are at least partially normative;
  • facts can’t be “explained”, as far as “explanations” are conceived to be non-normative entities.

It is clear that the standard account of the status of scientific theories is not compatible with that (which actually is a compliment). Mathias Frisch [5] briefly discusses some of the issues. Particularly, he dismisses the stance that

“the content of a theory is exhausted by its mathematical formalism and a mapping function defining the class of its models.” (p.7)

This approach is also shared by the influential Bas van Fraassen, especially in his 1980 book [6]. In contrast to this claim, we definitely reject that there is any necessary consistency between models and the theory from which they have been derived, or among the family of models that could be associated with a theory. Life forms (Lebensformen) cannot and should not be evaluated by means of “consistency”, unless you are a social designer, like the one who, for instance, has been inventing a variant of idealism by practicing in and on Syracuse… The rejection of a formal relationship between theories and models includes the rejection of the set-theoretic perspective on models. Since theories are normative, they can’t be formalized, and it is close to a scandal to claim ([6], p.43) that

Any structure which satisfies the axioms of a theory…is called a model of that theory.

The problem here is mainly the claim that theories consist of or contain axioms. Norms never have been and never will be “axiomatic.”

There is a theory about belief revision that has been quite influential for the discipline or field that is called “Artificial Intelligence” (we dismiss this term/name, since it is either empty or misleading). This theory is known under the label AGM theory, where the acronym derives from the initials of the names of its three proponents Alchourrón, Gärdenfors, and Makinson [7]. The history of its adoption by computer scientists is a story in itself [8]; what we can take from it here is that computer scientists believe the AGM theory to be relevant for the update of so-called knowledge bases.
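For readers who have not met it: the core operations of AGM can be stated in a few lines. The following is a deliberately crude sketch in Python. Real AGM operates on logically closed sets of sentences and selection functions, which we omit entirely here; only the Levi identity, revision as contraction by the negation followed by expansion, is rendered faithfully.

```python
# Toy belief base of literals ("p" or "~p"), NOT a logically closed set
# as real AGM theory demands.

def neg(p):
    return p[1:] if p.startswith("~") else "~" + p

def expand(K, p):        # K + p : simply add the belief
    return K | {p}

def contract(K, p):      # K - p : give up the belief
    return K - {p}

def revise(K, p):        # Levi identity: K * p = (K - ~p) + p
    return expand(contract(K, neg(p)), p)

K = {"bird(tweety)", "flies(tweety)"}
K = revise(K, "~flies(tweety)")      # new information: Tweety does not fly
print(K)                             # {'bird(tweety)', '~flies(tweety)'}
```

Even this caricature displays the decisive move that we criticize below: beliefs are treated as sentences under (here: trivialized) logical closure, such that revising them becomes a purely formal operation.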

Despite its popularity, the AGM theory is seriously flawed, as Neil Tennant has been pointing out [9] (we will criticize his results in another essay about beliefs (scheduled)). A nasty discussion mainly characterized by mutual accusations started (see [10] as an example), which is typical for deficient theories.

Within AGM, and similar to Fraassen’s account of the topic, a theory is equal to a set of beliefs, which in turn is conceived as a logically closed set of sentences. There are several mistakes here. First, they apply truth-functional logic as a foundation. This is not possible, as we have seen elsewhere. Second, a belief is not a belief anymore as soon as we conceive of it as a proposition, i.e. a statement within logic, i.e. under logical closure. It would be a claim, not a belief. Yet, claims belong to a different kind of game. If one wanted to express the fact that we can’t know anything precisely, e.g. due to the primacy of interpretation, we simply could take the notion of risk, which is part of a general concept of model. A further defect in AGM theory, and in any similar approach that tries to formalize the notion of theory completely, is that they conflate propositional content with the form of the proposition. Robert Brandom demonstrates in an extremely thorough way why this is a mistake, and why we are forced to the view that propositional content “exists” only as a mutual assignment between entities that talk to each other (chapter 9.3.4 in [11]). The main underlying reason for this is the primacy of interpretation.

In turn we can conclude that the AGM theory, as well as any attempt to formalize theory, can be conceived as a viable theory only if the primacy of interpretation is inadequate. Yet, this creates the problem of how we are tied to the world. The only alternative would be to claim that this is going on somehow “directly”. Of course, such claims are either 100% nonsense or 100% dictatorship.

Regarding the application of the faulty AGM theory to computer science we find another problem: knowledge can’t be saved to a hard disk, as little as that is possible for information. Only a strongly reductionist perspective, which is almost a caricature of what could be called knowledge, allows one to take that route.

We already argued elsewhere that a model can contain neither the conditions of its applicability nor those of its actual application. The same applies of course to theories. As a direct consequence, we have to investigate the role of conditions (we do this in another chapter).

Theories are precisely the “instrument” for organizing the conditions for building models. It is the property of being an instrument about conditions that renders them into an entity that is inevitably embedded in community. We could even bring in Heidegger’s concept of the “Gestell” (scaffold) here, which he coined in the context of his reflections on technology.

The subject of theories is models, not proposals about the empirical world, as far as we exclude models from the empirical world. The subject of Popper’s falsificationism is the realm of models. In the chapter about modeling we determined models as tools for anticipation, given the expectation of weak repeatability. These anticipations can fail, hence they can be tested and confirmed. Inversely, we can also say that every theoretical construct that can be tested is an anticipation, i.e. a model. Theoretical constructs that cannot be tested are theories. Mathias Frisch ([5], p.42) writes:

I want to suggest that in accepting a theory, our commitment is only that the theory allows us to construct successful models of the phenomena in its domain, where part of what it is for a model to be successful is that it represents the phenomenon at issue to whatever degree of accuracy is appropriate in the case at issue. That is, in accepting a theory we are committed to the claim that the theory is reliable, but we are not committed to its literal truth or even just of its empirical consequences.

We agree with him concerning the dismissal of truth or empirical content regarding theories. Yet, the term “reliable” could still be misleading. One would never say that a norm is reliable. Norms themselves can’t be called reliable, only their following. You do not just obey a norm; the norm is also something that has been fixed as the result of a social process, as a habit of a social group. In a wider perspective, we probably could assign that property, since we tend to expect that a norm supports us in doing so. If a norm did not support us, it would not “work,” and in the long run it would be replaced, often in a catastrophically sweeping event. That “working” of a norm is, however, almost unobservable by the individual, since it belongs to the Lebensform. We also should keep in mind that as far as we would refer to such a reliability, it is not directed towards the prediction, at least not directly; it refers just to the possibility of creating predictive models.

From safe grounds we now can reject all the attempts to formalize theories along the line Carnap-Sneed-Stegmüller-Moulines [12, 13, 14, 15]. The “intended usage” of a theory (Sneed/Stegmüller) cannot be formalized, since it is related to the world, not just to an isolated subject. Scientific languages (Carnap’s enterprise) are hence not possible.

Of course, it is possible to create models about modeling, i.e. taking models as an empirical subject. Yet, such models are still not a theory, even if they look quite abstract. They are simply models, which imply or require a theory. Here lies the main misunderstanding of the folks cited above.

The turn towards languagability includes the removal of the dualistic contrast between theory and practice. This dualism is replaced by a structural perspective according to which theory and practice are co-extensive. Still, there are activities that we would not call a practice or an action, standing so to speak before any rule. Such activities are performances. Not least, this is also the reason why performance art is… art.

Heinrich Lüber, the Swiss performance artist, standing on top of a puppet shaped as himself. What is not visible here: he stood there for 8 hours, in the water on the shore of the French Atlantic coastline.

Besides performance (art) there are no activities that would be free of rules, or equivalently, free of theory. Modeling in particular is of course a practice, quite in contrast to theory. Another important issue we can derive from our distinction is that any model implies a theory, even if the model just consists of a particular molecule, as is the case in the perception mechanisms of individual biological cells.

Another question, which we have to distinguish sharply from that about the reach of theories, is whether the models predict well. And of course, just like norms, theories too can be inappropriate.

Theories are simply there. Theories denote what can be said about the influence of the general conditions—as present in the embedding “Lebenswelt”—onto the activity of modeling.

Theories thus can be described by the following three properties:

  • (1) A theory is the (social) practice of determining the conditions for the actualization of virtuals, the result of which are models.
  • (2) A theory acts as a synthesizing milieu, which facilitates the orthoregulated instantiation of models that are anticipatively related to the real world (where the “real world” satisfies the constraints of Wittgensteinian solipsism).
  • (3) A theory is a language-generating language game.

Theories, Models, and in between

Most of the constructs called “theory” are nothing other than a hopeless mixture of models and theories, committing serious naturalistic fallacies in comparing empirical “facts” with normative conditions. We will give just a few examples of this.

It is generally acknowledged that some of Newton’s formulas constitute his theory of gravitation. Yet, it is not a theory; it is a model. It allows for direct and, on the mesocosmic scale, even almost lawful predictions about falling objects or astronomical satellites. Newton’s theory, however, is given by his belief in a certain theological cosmology. Due to this theory, which entails absoluteness, Newton was unable to discover relativity.

Similar is the case of Kepler. For a long time (more than 20 years) Kepler’s theory entailed the belief in a pre-established cosmic harmony that could be described by Euclidean geometry, which itself was considered at that time as a direct link to divine regions. The first model that Kepler constructed to fulfill this theory comprised the inscription of Platonic solids into the planetary orbits. But those models failed. Based on better observational data he derived different models, yet still within the same theory. Only when he dropped the role of the geometrical approach in his theory was he able to find his laws about the celestial ellipses. In other words, he dropped most of his theological orthoregulations.

Einstein’s work on relativity, finally, is clearly a model, as there is not just one formula. Einstein’s theory is not related to the space-time structure of the macroscopic universe. Instead, the conditions for deriving the quantitative and qualitative predictions are related to certain beliefs in the non-randomness of the universe. His conflict with quantum theory is well-known: “God does not play dice.”

The contemporary Standard Model in particle physics is exactly that: a model. It is not a theory. The theory behind the Standard Model is logical flatness and materialism. It is a considerable misunderstanding of most physicists to accuse proponents of string theory of not providing predictions. They can not, because they are thinking about a theory. Yet, string theorists themselves do not properly understand the epistemic role of their theory either.

A particular case is given by Darwin’s theory. Darwin of course did not distinguish perfectly or explicitly between models and theories; that was not possible in his days. Yet, throughout his writings and the organization of his work we can detect that he implicitly followed that distinction. From Darwin’s writings we know that he was deeply impressed by the non-random manifoldness in the domain of life. Precisely this represented the core of his theory. His formulations about competition, sexual selection, or inheritance are just particular models. In our chapter about the abstract structure of evolution we formulated a model about evolutionary processes in a quite abstract way. Yet, it is still a model, within almost the same theory that Darwin once followed.2

There is a quite popular work about the historical dynamics of theory, Thomas Kuhn’s “The Structure of Scientific Revolutions”, which is not a theory, but just a model. For large parts it is not even a model, but just a bad description, for which he coined the slogan of the “paradigm shift”. There is almost no reflection in it. Above all, it is certainly not a theory about theory, nor a theory about the evolution of theories. He had to fail, since he does not distinguish between theories and models at all.

So, leaving these examples behind: how do models and theories relate practically? Is there a transition between them?

Model of Theory, Theory of Model, and Theory of Theory

I think we can derive from these examples a certain relativity regarding the life-cycle of models and theories. Theories can be transformed into models through the removal of those parts that refer to the Lebenswelt, while models can be transformed into theories if the orthoregulative part of the models gets focused (or extracted from theory-models).

Obviously, what we just did was to describe a mechanism. We proposed a model. In the same way it represents a model to use the concept of the language game for deriving a structure for the concept of theory. Plainly spoken, so far we created a model about theory.

As we have seen, this model also comprises proposals about the transition from model to theory. This transition may take two different routes, according to our model about theory. The first route is taken if a model gets extended by habits and further, mainly socially rooted, orthoregulations, until the original model appears just as a special case. The abstract view might still be only implicit, but it may be derived explicitly if the whole family of models that are possible within those orthoregulations is concretely going to be constructed. The second route draws upon a proceeding abstraction, thereby introducing the necessity of instantiation. It is this necessity that decouples the former model from its capability to predict something.

Both routes, either by adding orthoregulations explicitly or implicitly through abstraction, turn the former model de actio into a milieu-like environment: a theory.

As productive milieus, theories comprise all components that allow the construction and the application of models:

  • – families of models as ensembles of virtualized models;
  • – rules about observation and perception, including the processes of encoding and decoding;
  • – infrastructural elements like alphabets or indices;
  • – axiomatically introduced formalizations;
  • – procedures of negotiation, procedures of standardization, and other orthoregulations up to arbitrary order.

The model of model, on the other hand, we already provided here, where we described it as a 6-tuple representing different, incommensurable domains. No path can be thought of from one of these domains to any of the others. These six domains are, by their labels:

  • (1) usage U
  • (2) observations O
  • (3) featuring assignates F on O
  • (4) similarity mapping M
  • (5) quasi-logic Q
  • (6) procedural aspects of the implementation P

or, taken together: m ≔ ⟨U, O, F, M, Q, P⟩

This model of model is probably the most abstract and general model that is not yet a theory. It provides all the docking stations required to attach the realm of norms. Thus, it would be only a small step to turn this model into a theory. That step towards a theory of model would include statements about two further dimensions: (1) the formal status and (2) the epistemic role of models. The first issue is largely covered by identifying models as a category (in the sense of category theory). The second is related to the primacy of interpretation, that is, to a world view that is structured by (Peircean) sign processes and transcendental differences (in the Deleuzean sense).
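For illustration only, the six domains can be written down as a plain data structure. The following minimal sketch in Python is our own rendering; the class name and the field types are assumptions, not part of the model itself:

```python
from typing import Any, Callable, NamedTuple

class GeneralizedModel(NamedTuple):
    """The six incommensurable domains of the generalized model.

    Field types are illustrative assumptions; the tuple only fixes
    that all six domains must be given for a model to be complete.
    """
    U: Any                          # usage
    O: Any                          # potential observations
    F: Any                          # featuring assignates on O
    M: Callable[[Any, Any], float]  # similarity mapping
    Q: Any                          # quasi-logic
    P: Any                          # procedural aspects of the implementation
```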

The last twist concerns the theory of theory. There are good reasons to assume that for a theory of theory we need to invoke transcendental categories. In particular, a theory of theory can’t contain any positive definite proposal, since in that case it would automatically turn into a model. A theory of theory can be formulated only as a self-referential, self-generating structure within transcendental conditions, where this structure can act as a borderless container for any theory about any kind of Lebensform. (This is the work of the chapter about the Choreosteme.)

Remarkably, we thus can not say that we could apply a theory to itself, as a theory is a positive definite thing, even if it contained only proposals about conditions (yet, this is not possible either). Of course, this play between (i) ultimately transcendent conditions, (ii) mere performance that is embedded in a life form, and finally (iii) the generation of positivity within this field constitutes a quite peculiar “three-body problem” of mental life and (proto-)philosophy. We will return to that in the chapter about the choreosteme, where we will also discuss the issue of “images of thought” (Gilles Deleuze) or, in slightly different terms, the “idioms of thinking” (Bernhard Waldenfels).

Conclusion

Finally, there should be our ceterum censeo, some closing remarks about the issue of machine-based episteme, or even machine-based epistemology. Already at the beginning of this chapter we declared our motivation. But what can we derive and “take home” in terms of constructive principles?

Our general goal is to establish—or to get clear about—some minimal set of necessary conditions for a “machinic substrate” such that we could assign to it, in a fully justified manner, the property of “being able to understand.”

One of the main results in this respect was that modeling is nothing that could be thought of as running independently, as an algorithm, in such a way that we could regard this modeling as sufficient for ascribing the machine the capability to understand. More precisely, it is not even the machine that is modeling; it is the programmer, or the statistician, the data analyst etc., who switched the machine into the ON-state. For modeling, knowing, and theorizing, the machine would have to act autonomously.

On the other hand, performing modeling inevitably implies a theory. We just have to keep this theory somehow “within” the machine, or more precisely, within the sign processes that take place inside the machine. The ability to build theories necessarily implies self-referentiality of the informational processes. Our perspective here is that the macroscopic effects of self-referentiality, such as the ability to build theories, or consciousness, can not be “programmed”; they have to be a consequence of the im-/material design aspects of the processes that make up these effects…

Another insight, though not a heavily surprising one, is that the ability to build theories refers to social norms. Without social norms there is no theorizing. It is not mathematics or science that would be necessary; it is just the presence and accessibility of social norms. We could briefly call it education. Here we are aligned with theories (i.e. mostly models) that point to the social origins of higher cognitive functions. It is quite obvious that some kind of language is necessary for that.

The road to machine-based episteme thus does not imply a visit to the realms of robotics. There we will meet only insects and …robots. The road to episteme leads through languagability, and anything that is implied by that, such as metaphors or analogical thinking. These subjects will be the topic of the next chapters. Yet, this also defines the programming project accompanying this blog: implementing the ability to understand textual information.


Notes

1. The image in the middle of this triptych shows the situation in the first installation at the exhibition in Petrograd in 1915, arranged by Malevich himself. He put the “Black Square” exactly at the place where traditionally the Christian cross was to be found in Russian living rooms at that time: up in the corner, under the ceiling. This way, he invoked a whole range of reflections about the dynamics of symbols and habits.

2. Other components of our theory of evolutionary processes entail the principle of complexity, and the primacy of difference and the primacy of interpretation.

This article has been created on Oct 21st, 2011, and has been republished in a considerably revised form on Feb 13th, 2012.

References

  • [1] Stathis Psillos, Martin Curd (eds.), The Routledge Companion to Philosophy of Science. Taylor & Francis, London and New York 2008.

  • [2] Austin, Speech Act Theory;
  • [3] Wittgenstein, Philosophical Investigations;
  • [4] etymology of “theory”; “theorein”
  • [5] Mathias Frisch, Inconsistency, Asymmetry, and Non-Locality: A Philosophical Investigation of Classical Electrodynamics. Oxford 2005.
  • [6] Bas van Fraassen, The Scientific Image. Oxford University Press, Oxford 1980.

  • [7] Alchourron, C., Gärdenfors, P. and Makinson, D. (1985). On the Logic of Theory Change: Partial Meet Contraction and Revision Functions. Journal of Symbolic Logic, 50, 510-30.
  • [8] Raúl Carnota and Ricardo Rodríguez (2011). AGM Theory and Artificial Intelligence. In: Belief Revision Meets Philosophy of Science. Logic, Epistemology, and the Unity of Science, Vol. 21, 1-42.

  • [9] Neil Tennant (1997). Changing the Theory of Theory Change: Reply to My Critics. Brit. J. Phil. Sci. 48, 569-586.

  • [10] Hansson, S. O. and Rott, H. (1995). ‘How Not to Change the Theory of Theory Change: A Reply to Tennant’, British Journal for the Philosophy of Science, 46, pp. 361-80.
  • [11] Robert Brandom, Making it Explicit. 1994.
  • [12] Carnap
  • [13] Sneed
  • [14] Wolfgang Stegmüller
  • [15] Moulines

۞

Associativity

December 19, 2011 § Leave a comment

Initially, the meaning of ‘associativity’ seems to be pretty clear.

According to common sense, it denotes the capacity or the power to associate entities, to establish a relation or a link between them. Yet, there is a different meaning in mathematics that almost appears as a kind of mockery of the common sense. Due to these very divergent meanings we first have to clarify our usage before discussing the concept.

A Strange Case…

In mathematics, associativity is defined as a neutrality of the result of a compound operation with respect to the “bundling,” or association, of the individual parts of the operation. The formal statement is:

A binary operation ∘ (relating two arguments) on a set S is called associative if it satisfies the associative law:

x∘(y∘z) = (x∘y)∘z for all x, y, z ∈ S

This, however, is just the opposite of “associative,” as it demands independence from any particular association: whichever association we establish between the elements of S, it must not make any difference to the result.
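The demand can be probed mechanically. The following minimal sketch in Python (the helper is_associative is our own, hypothetical name) tests the law on a small set of values; note that even floating-point addition violates it:

```python
def is_associative(op, values):
    """Check x∘(y∘z) == (x∘y)∘z for all triples drawn from `values`."""
    return all(op(x, op(y, z)) == op(op(x, y), z)
               for x in values for y in values for z in values)

ints = range(-3, 4)
print(is_associative(lambda x, y: x + y, ints))  # True: addition is associative
print(is_associative(lambda x, y: x - y, ints))  # False: subtraction is not
# Even floating-point addition violates the law, due to rounding:
print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))    # False
```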

Maybe some mathematician in the 19th century hated the associative power of so many natural structures. Subsequently, modernism contributed its part to establishing the corruption of the obvious etymological roots.

In mathematics the notion of associativity—let us call it I-associativity in order to indicate the inverted meaning—is an important part of fundamental structures like “classic” (Abelian) groups or categories.

Groups are important since they describe the basic symmetries within the “group” of operations that together form an algebra. Groups cover anything that could be done with sets. Note that the central property of sets is their enumerability. (Hence, a notion of “infinite” sets is nonsense; it simply contradicts itself.) Yet, there are examples of quite successful, say: abundantly used, structures that are not based on I-associativity, the most famous being the Lie algebra, the infinitesimal structure underlying any Lie group. Lie groups allow one to conceive of continuous symmetry; hence they are much more general than the Abelian groups that essentially emerged from the generalization of geometry. Even in the case of such “non-associative” structures, however, the term still refers to the inverted meaning.

With respect to categories we can say that so far, and quite unfortunately, there is not yet something like a category theory that would not rely on I-associativity, a fact that is quite telling in itself. Of course, category theory is also quite successful, yet…

Well, anyway, we would like to indicate that we are not dealing with I-associativity here in this chapter. In contrast, we are interested in the phenomenon of associativity as it is indicated by the etymological roots of the word: The power to establish relations.

A Blind Spot…

In some way the particular horror creationes so abundant in mathematics is comprehensible. If a system started to establish relations, it also would establish novelty by means of those relations (something that simply did not exist before). So far, it has not been possible for mathematics to deal symbolically with the phenomenon of novelty.

Nevertheless it is astonishing that a Google raid on the term “associativity” reveals only slightly more than 500 links (Dec. 2011), of which the vast majority consists simply of mirrored copies of the Wikipedia entry that covers the mathematical notion of I-associativity. Some other links are related to computer science, which basically refers to the same issue, just sailing under a different flag. Remarkably, only one (1) single link, from an open source robotics project [1], mentions associativity as we will do here.

Not very surprisingly, one can find an intense linkage between “associative” and “memory,” though not in the absolute number of sources (also around ~600), but in the number of citations. According to Google Scholar, Kohonen and his Self-Organizing Map [2] are cited 9000+ times, followed by Anderson’s account of human memory [3], accumulating 2700 citations.

Of course, there are many entries on the web referring to the word “associative,” which, however, is an adjective. Our impression is that the capability to associate has not made its way into a more formal consideration, nor has it been regarded as a capability that deserves a dedicated investigation. This deficit may well be considered a continuation of a much older story of a closely related neglect, namely that of the relation, as Mertz pointed out [4, ch.6], since associativity is just the dynamic counterpart of the relation.

Formal and (Quasi-)Material Aspects

In a first attempt, we could conceive of associativity as the capability to impose new relations between some entities. For Hume (in his “Treatise”; see Deleuze’s book about him), association was close to what Kant later dubbed “faculty”: the power to do something, in this case to relate ideas. However, such wording is inappropriate, as we have seen (or: will see) in the chapters about modeling and categories and models. Speaking about relations and entities implies set theory; yet, models and modeling can’t be covered by set theory, or only very exceptionally so. Since category theory seems to match the requirements and the structure of models much better, we also adopt its structure and its wording.

Associativity then may be taken as the capability to impose arrows between objects A, B, C such that at least A ⊆ B ⊆ C, but usually A ⋐ B ⋐ C, and furthermore A ≃ C, where “≃” means “taken to be identical despite non-identity.” In set-theoretic terms we would have used the notion of the equivalence class. Such arrows may be identified with the generalized model, as we argue in the chapter about the category of models. The symbolized notion of the generalized abstract model looks like this (for details jump over to the page about modeling):

m ≔ ⟨U, O, F, M, Q, P⟩   (eq. 1)

where U=usage; O=potential observations; F=featuring assignates on O; M=similarity mapping; Q=quasi-logic; P=procedural aspects of implementation.

Those arrows representing the (instances of a generalized) model are functors mediating between categories. We also may say that the model potentially imposes a manifold of partially ordered sets (posets) onto the initial collection of objects.

Now we can start to address our target, the structural aspects of associativity, more directly. We are interested in the necessary and sufficient conditions for establishing an instance of an object that is able (or develops the capability) to associate objects in the aforementioned sense. In other words, we need an abstract model for it. Yet, here we are not interested in the basic, that is transcendental conditions for the capability to build up associative power.

Let us start more practically, but still immaterially. The best candidates we can think of are Self-Organizing Maps (SOM) and particularly parameterized Reaction-Diffusion Systems (RDS); both of them can be subsumed into the class of associative probabilistic networks, which we describe in more technical detail in another chapter. Of course, not all networks exhibit the emergent property of associativity. We may roughly distinguish between associative networks and logistic networks [5]. Both the SOM and the RDS are also able to create manifolds of partial orderings. Another example from this family is the Boltzmann machine, which, however, has some important theoretical and practical drawbacks, even in its generalized form.

Next, we depict the elementary processes of the SOM and the RDS, respectively. SOM and RDS can be seen as instances located at the distant endpoints of a particular scale that expresses the topology of the network. The topology expresses the arrangement of quasi-material entities that serve as persistent structure, i.e. as a kind of memory. In the SOM, these entities are called nodes, and they are positioned in a more or less fixed grid (albeit there is a variant of the SOM, the neural gas, where the grid is more fluid). The nodes do not move around. In contrast to the SOM, the entities of an RDS float around freely. Yet, RDS are simulated much like the SOM, assuming cells in a grid and stuffing them with a certain memory.

Inspecting those elementary processes, we of course again find transformations. More important, however, is another structural property common to both of them. Both networks are characterized by a dynamically changing field of (attractive) forces. Just the locality of those forces differs between SOM and RDS, leading to a greater degree of parallelism in the RDS and to multiple areas of the same quality. In the SOM, each node is unique.

The forces in both types of networks exhibit, however, the property of locality: there are one or more centers where the force is strong, and a neighborhood that is established through a stochastic decay of the strength of this force. Usually, in the SOM as well as in the RDS, the decay is assumed to be radially symmetric, but this is not a necessary condition.
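To illustrate the elementary process just described, here is a minimal sketch of a single SOM update step in Python/NumPy; grid size, learning rate, and neighborhood radius are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
grid = rng.random((10, 10, 3))      # 10x10 nodes, each with a 3-dim "memory"

def som_step(grid, x, eta=0.5, sigma=2.0):
    # 1. find the winner: the node whose memory is closest to the input x
    d = np.linalg.norm(grid - x, axis=-1)
    wi, wj = np.unravel_index(np.argmin(d), d.shape)
    # 2. neighborhood: the force decays radially around the winner
    ii, jj = np.indices(d.shape)
    h = np.exp(-((ii - wi) ** 2 + (jj - wj) ** 2) / (2 * sigma ** 2))
    # 3. adaptation: every node drifts towards x, weighted by the local force
    grid += eta * h[..., None] * (x - grid)
    return grid

for _ in range(1000):               # feed random inputs; order emerges anyway
    grid = som_step(grid, rng.random(3))
```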

After all, are we now allowed to ask ‘Where does this associativity come from?’ The answer is clearly ‘no.’ Associativity is a holistic property of the arrangement as a whole. It is the result of the co-presence of some properties like

  • – stochastic neighborhoods that are hosting an anisotropic and monotone field of forces;
  • – a certain, small memory capacity of the nodes; note that the nodes are not “points”: in order to have a memory they need some corporeality. In turn this opens the way to think of a separation of the function of that memory from a variable host that provides a container for that memory;
  • – strong flows, i.e. a large number of elementary operations acting on that memory, producing excitatory waves (long-range correlations) of finite velocity;

The result of the interaction of those properties can not be described on the level of the elements of the network itself, or any of its parts. What we will observe is a complex dynamics of patterns due to the superposition of antagonistic forces, which are modeled either explicitly in the case of the RDS, or more implicitly in the case of the SOM. Thus both networks also present the property of self-organization, though this aspect is much more dominantly expressed in the RDS as compared to the SOM. The important issue is that the whole network, and even more importantly, the network together with its local persistence (“memory”), “causes” the higher-level phenomenon.

We also could say that it is the quasi-material body that is responsible for the associativity of the arrangement.

The Power of a Capability

So, what is this associativity thing about? As we have said above, associativity imposes a potential manifold of partial orderings upon an arbitrary open set.

Take a mixed herd of gnus and zebras as the open set without any particular ordering, put some predators like hyenas or lions into this herd, and you will get multiple partially ordered sub-populations. In this case, the associativity emerges through particular rules of defense, attack, and differential movement. The result of the process is a particular probabilistic order, clearly an immaterial aspect of the herd, despite the fact that we are dealing with fleshy animals.

The interesting thing in both the SOM and the RDS is that a quasi-body provides a capability that transforms an immaterial arrangement. The resulting immaterial arrangement is nothing else than information. In other words, something specific, namely a persistent contrast, has been established from something larger and unspecific, i.e. noise. Taking the perspective of the results, i.e. with respect to the resulting information, we always can see that the association creates new information. The body, i.e. the materially encoded filters and rules, has a greater weight in the RDS, while in the case of the SOM the stabilization aspect is more dominant. In any case, the associative quasi-body introduces breaks of symmetry, establishes them, and stabilizes them. If this symmetry breaking is aligned to some influences, feedback or reinforcement acting from the surroundings onto the quasi-body, we may well call the whole process (a simple form of) “learning.”

Yet, this change in the informational setup of the whole “system” is mirrored by a material change in the underlying quasi-body. Associative quasi-bodies are therefore representatives of the transition from the material to the immaterial, or in more popular terms, of the body-mind dualism. As we have seen, there is no conflict between those categories, as the quasi-body showing associativity provides a double-articulating substrate for differences. Also, we can see that these differences are transformed from horizontal differences (such as 7−5=2) into vertical, categorical differences (such as the differential). If we would like to compare those vertical differences we need … category theory! …or a philosophy of the differential!

Applications

Early in the 20th century, the concept of association was adopted by behaviorism. Simply recall Pavlov’s dog and the experiments of Skinner and Watson. The key term in behaviorism, as a belated echo of 17th century hyper-mechanistics (the support of a strictly mechanical world view), is conditioning, which appears in various forms. Yet, conditioning always remains a 2-valued relation, practically achieved as an imprinting, a collision between two inanimate entities, despite the wording of behaviorists who equate their conditioning with “learning by association.” What else should learning be? Nevertheless, behaviorist theory commits the mistake of thinking that this “learning” is a passive act. As you can see here, psychologists still strongly believe in this weird concept. They write: “Note that it does not depend on us doing anything.” Utter nonsense, nothing else.

In contrast to imprinting, imposing a functor onto an open set of indeterminate objects is not only an exhausting activity, it is also a multi-valued “relation,” or simply, a category. If we analyzed the process of imprinting, we would find that even “imprinting” can’t be covered by a 2-valued relation.

Nevertheless, other people took the medium as the message. For instance, Steven Pinker criticized the view that association is sufficient to explain the capability of language. In doing so, he commits the same mistake as the behaviorists, just from the opposite direction. How else should we acquire language, if not by some kind of learning, even if it is a particular type of learning? The blind spot of Pinker seems to be randomization, i.e. he is not able to leave the actual representation of a “signal” behind.

Another field of application for the concept of associativity is urban planning, or urbanism, albeit associativity is rarely recognized there as a conceptual or even as a design tool [cf. 6]. It is obvious that urban environments can be conceived as a multitude of high-dimensional probabilistic networks [7].

Machines, Machines, Machines, ….Machines?

Associativity is the property of a persistent (quasi-material) arrangement to act onto a volatile stream (e.g. information, entropy) in such a way as to establish a particular immaterial arrangement (the pattern, or association), which in turn is reflected by material properties of the persistent layer. Equivalently, we may say that the process leading to an association is encoded in the material arrangement itself. The establishment of the first pattern is the work of the (quasi-)body. Only for this reason is it possible to build associative formal structures like the SOM or the RDS.

Yet, the notion of “machine” would be misplaced. We observe strict determinism only on the level of the elementary micro-processes. Each of the vast number of individual micro-events is indeed uniquely parameterized, sharing only the same principle or structure. In such cases we can not speak of a single machine any more, since a mechanical machine has a singular and identifiable state at any point in time. The concept of “state” holds neither for the RDS nor for the SOM. What we see here is much more like a vast population of similar machines, each of which is not even stable across time. Instead, we need to adopt the concept of mechanism, as it is in use in chemistry, physiology, or biology at large. Since both principles, SOM and RDS, show the phenomenon of self-organization, we can not even say that they represent a probabilistic machine. The notion of the “machine” can’t be applied to the SOM or the RDS, despite the fact that we can write down the principles for the micro-level in simple and analytic formulas. Yet, we can’t assume any kind of mechanics for the interaction of those micro-machines.

It is now exciting to see that a probabilistic, self-organizing process used to create a model by means of associating principles loses the property of being a machine, even as it is running on a completely deterministic machine, the simulation of a Universal Turing Machine.

Associativity is a principle that transcends the machine, and even the machinic (Guattari). Associative arrangements establish persistent differences; hence we can say that they create proto-symbols. Without associativity there is no information. Of course, the inverse is also true: wherever we find information or an association, we also must expect associativity.

۞

  • [1] iCub
  • [2] Kohonen, Teuvo, Self-Organization and Associative Memory. Springer Series in Information Sciences, vol.8, Springer, New York 1988.
  • [3] Anderson J.R., Bower G.H., Human Associative Memory. Erlbaum, Hillsdale (NJ) 1980.
  • [4] Mertz, D. W., Moderate Realism and its Logic, New Haven: Yale 1996.
  • [5] Wassermann, K. (2010), Associativity and Other Wurban Things – The Web and the Urban as merging Cultural Qualities. 1st international workshop on the urban internet of things, in conjunction with: internet of things conference 2010 in Tokyo, Japan, Nov 29 – Dec 1, 2010. (pdf)
  • [6] Dean, P., Rethinking representation. the Berlage Institute report No.11, episode Publ. 2007.
  • [7] Wassermann, K. (2010). SOMcity: Networks, Probability, the City, and its Context. eCAADe 2010, Zürich. September 15-18, 2010. (pdf)

Non-Turing-Computing

October 28, 2011 § Leave a comment

At first sight it may sound like a bad joke, indeed.

Turing not only provided many important theoretical insights on computing [1], including the Universal Turing Machine (UTM); he and his group at Bletchley Park also created a working prototype that employed the theoretical results [2].

Turing Computation

In order to clarify what non-Turing computing could be, we first have to inspect a bit more closely how Turing computing is defined. On Wikipedia one can find the following explanation in standard language:

With this encoding of action tables as strings it becomes possible in principle for Turing machines to answer questions about the behavior of other Turing machines. Most of these questions, however, are undecidable, meaning that the function in question cannot be calculated mechanically. For instance, the problem of determining whether an arbitrary Turing machine will halt on a particular input, or on all inputs, known as the Halting problem, was shown to be, in general, undecidable in Turing’s original paper. Rice’s theorem shows that any non-trivial question about the output of a Turing machine is undecidable.

A universal Turing machine can calculate any recursive function, decide any recursive language, and accept any recursively enumerable language. According to the Church-Turing thesis, the problems solvable by a universal Turing machine are exactly those problems solvable by an algorithm or an effective method of computation, for any reasonable definition of those terms.

One could add, firstly, that any recursive algorithm can be linearized (and vice versa). Secondly, algorithms are defined as procedures that produce a definite result after a finite span of time.

Here is already the first problem in computational theory. What is a result? Is it a fixed value, or would we also accept a probability density, or even a class of those (like Dirac’s delta), as a result? Even non-deterministic Turing machines yield unique results. The alternative of an indeterminable result sounds quite counter-intuitive, and I suppose that it indeed can not be subsumed under the classical theory of computability. It would simply mean that the results of a UTM are only weakly predictable. We will return to that point a bit later.

Another issue is induced by problem size. While analytic undecidability causes the impossibility for the putative computational procedure to stop, sheer problem size may render a problem quasi-undecidable. Solution spaces can be really large, beyond 10^2000 possible solutions. Compare this to the estimated 10^80 atoms of visible matter in the whole universe. Such solution spaces are also an indirect consequence of Quine’s principle of underdetermination of an empirical situation, which results in the epistemological fact of the indeterminacy of any kind of translation. We will discuss this elsewhere (not yet determined chapter…) in more detail.

From the perspective of an entity searching through such a large solution space it does not matter very much whether the solution space is ill-defined or vast; from the perspective of the machine controller (“user”) both cases belong to the same class of problems: there is no analytic solution available. Let us now return to the above-cited question about the behavior of other entities. Even for the trivial case that the interactee is a Turing machine, the question about its behavior is undecidable. That means that any kind of interaction can not be computed using a UTM, particularly those between epistemic beings. Besides the difficulties this raises for the status of simulation, it means that we need an approach which is neither derived from nor included in the paradigm established by the Church-Turing thesis.

The UTM as the abstract predecessor of today’s digital computers is based on the operations of writing and deleting symbols. Before a UTM can start to work, the task to be computed needs to be encoded. Once the task has actually been encoded, including the rules necessary to accomplish the computation, everything that happens is just an almost material moving of semantically empty graphemes. (We avoid calling the 0 and 1 “symbols” here, since “symbol” is a compound concept, hence it could introduce complications to our investigation.) During the operations of the UTM, the total amount of information is constantly decreasing. Moreover, a UTM is not only initially completely devoid of any meaning, it remains semantically empty during the whole period it works on the task. Any meaning concerning the UTM remains forever outside the UTM. This remains true even if the UTM would operate at the speed of light.
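To keep this concrete, the following minimal sketch in Python drives a transition table over a sparse tape; the rule set (a binary increment) is our own hypothetical toy. Nothing in the table “knows” what the marks mean:

```python
def run_tm(tape, rules, state="start", pos=0, blank="_"):
    """Drive a transition table over a sparse tape until the halt state."""
    while state != "halt":
        sym = tape.get(pos, blank)
        write, move, state = rules[(state, sym)]
        tape[pos] = write
        pos += -1 if move == "L" else 1
    return tape

# Toy rule set: binary increment, head starting on the least significant bit.
rules = {
    ("start", "1"): ("0", "L", "start"),  # 1 -> 0, the carry moves left
    ("start", "0"): ("1", "L", "halt"),   # the carry is absorbed
    ("start", "_"): ("1", "L", "halt"),   # the number grows by one digit
}
tape = {0: "1", 1: "0", 2: "1"}           # "101" = 5, most significant bit at 0
print(run_tm(tape, rules, pos=2))         # {0:'1', 1:'1', 2:'0'} = "110" = 6
```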

Note that we are not discussing the architecture of an actual physical computing device. Everybody uses devices that are built according to the von Neumann architecture. There are very few (artificial) computers on this earth not following this paradigm. Yet, it is unclear why DNA computers or even quantum computers should not fall into this category. These computers’ processing is different from an instance that computes based on traditional logics, physically realized as transistors. Yet, the von Neumann architecture does not make any proposal about the processor except that there needs to be one. Such fancy computers still need persistent storage, a bus system, and encoding and decoding devices.

As said, our concern is not about the architecture, or, even more trivially, about different speeds of calculation. Hence, the question of non-Turing computing is also not a matter of accuracy. For instance, it is sometimes claimed that a UTM can simulate an analog neural net with arbitrary accuracy. (More on that later!) The issue at stake has much more to do with the role of encoding, the status of information, and being an embodied entity than with the question of how to arrange physical components.

Our suggestion here is that probably any kind of computer could be used in a way that turns it into a non-Turing computer. In order to deal with this question we first have to discuss the contemporary concept of “computation.”

Computation

To get clear about the concept of “computation” does not include the attempt to find an answer to the question “What is computation?”, as for instance Jack Copeland did [3]. Such a question can not be part of any serious attempt at getting clear about it, precisely because it is not an ontological question. There are numerous attempts to define computation, then to invoke some intuitively “clear” or otherwise “indisputable” “facts,” only in order to claim an ontological status for the respective proposal. This of course is ridiculous, at least nowadays, after the Linguistic Turn. The conflation of definitory means and ontic status is just (very) naive metaphysics, if not to say esoterism in scientifically looking clothes. The only thing we can do is to get clear about possible “reasonable” ways of using the concepts in question.

In philosophy of mind and cognitive science, and thus also for our investigation of machine-based epistemology, the interest in getting clear about computation is given by two issues. First, there is the question whether, and if yes, to what extent, the brain can be assigned “a computational interpretation.” To address this question we have to clarify what “computing” could mean and whether the concept of “brain” could match any of the reasonable definitions of computing. Second, as a matter of fact we know before any such investigation that, in order to create a machine able to follow epistemological topics, we have at least to start with some kind of programming. The question here is simply how to start practically. This concerns methods, algorithms, or machine architectures. A hidden but important derivative of this question is about the possible schemes of differentiation of an initial artifact, which indeed is likely to be just a software running on a contemporary standard digital computer.

These questions related to the mind are not the focus of this chapter. We will return to them elsewhere. First, and that is our interest here, we have to clarify the usage of the concept of computation. Francesco Nir writes [4]:

According to proponents of computationalism, minds are computers, i.e., mechanisms that perform computations. In my view, the main reason for the controversy about whether computationalism is accurate in its current form, or how to assess its adequacy is the lack of a satisfactory theory of computation.

It is obvious that not only the concepts of computation, brain, and mind are at stake and have to be clarified, but also the concept of theory. If we would follow a completely weird concept of “theory,” i.e. if our attempts would try to follow an impossible logical structure, we would have no chance to find appropriate solutions for those questions. We even would not be able to find appropriate answers about the role of our actions. This, of course, is true for any work; hence we will discuss the issue of “theory” in detail in another chapter. Similarly, it would definitely be too limited to conceive of a computer just as a digital computer running some algorithm (all of which are finite by definition).

The history of computation as an institutionalized activity starts in the medieval ages. Of course, people performed calculations long before. The ancient Egyptians even used algorithms for problems that can’t be written in a closed form. In classical antiquity, there were algorithms to calculate pi or square roots. Yet, only in the medieval ages did the concept of “computare” get a definite institutional, i.e. functional, meaning. It referred to the calculation of the future Easter dates. The first scientific attempts to define computation start mainly with works published by Alan Turing and Alonzo Church, which were later combined into the so-called Church-Turing Thesis (CTT).

The CTT is a claim about effectively computable functions, nothing more, nothing less. Turing found that everything which is computable in finite time (and hence also on a finite tape) by his a-machine (later called the Turing machine) is equivalent to the λ-calculus. As an effect, computability is equated with the series of actions a Turing machine can perform. As stated above, even Universal Turing Machines (UTM) can’t solve the Halting problem. There are even functions that can’t be decided by a UTM.

It has been claimed that computation is just the sequential arrangement of input, transformation, and output. Yet, as Copeland and Nir correctly state, citing Searle therein, this would render even a wall into a computer. So we need something more exact. Copeland ends with the following characterization:

“It is always an empirical question whether or not there exists a labelling of some given naturally occurring system such that the system forms an honest model of some architecture-algorithm specification. And notwithstanding the truism that ‘syntax is not intrinsic to physics’ the discovery of this architecture-algorithm specification and labelling may be the key to understanding the system’s organisation and function.”

The strength of this attempt is the incorporation of the relation between algorithm and (machine) architecture into the theory. The weakness is given by the term “honest,” which is completely misplaced in the formal arguments Copeland builds up. If we remember that “algorithm” means “definite results in finite time and space” we quickly see that Copeland’s concept of computation is by far too narrow.

Recently, Wilfried Sieg tried to clarify the issues around computation and computability in a series of papers [5,6]. Similarly to Nir (see above), he starts his analysis writing:

“To investigate calculations is to analyze symbolic processes carried out by calculators; that is a lesson we owe to Turing. Taking the lesson seriously, I will formulate restrictive conditions and well motivated axioms for two types of calculators, namely, for human (computing) agents and mechanical (computing) devices. My objective is to resolve central foundational problems in logic and cognitive science that require a deeper understanding of the nature of calculations. Without such an understanding, neither the scope of undecidability and incompleteness results in logic nor the significance of computational models in cognitive science can be explored in their proper generality.” [5]

Sieg follows (and improves) largely an argument originally developed by Robin Gandy. Sieg characterizes it (p.12):

“Gandy’s Central Thesis is naturally formulated as the claim that any mechanical device can be represented as a dynamical system satisfying the above principles.”

By which he meant four limiting principles that prevent that everything is regarded as a computer. He then proceeds:

“I no longer take a Gandy machine to be a dynamical system 〈S, F〉 (satisfying Gandy’s principles), but rather a structure M consisting of a structural class S of states together with two kinds of patterns and operations on (instantiations of) the latter;”

[decorations by W.Sieg]

What is a dynamical system for Sieg and Gandy? Just before (p.11), Sieg describes it as follows:

“Gandy’s characterization […] is given in terms of discrete dynamical systems 〈S, F〉, where S is the set of states and F governs the system’s evolution. More precisely, S is a structural class, i.e., a subclass of the hereditarily finite sets HF over an infinite set U of atoms that is closed under ∈-isomorphisms, and F is a structural operation from S to S, i.e., a transformation that is, roughly speaking, invariant under permutations of atoms. These dynamical systems have to satisfy four restrictive principles.”

[decorations by W.Sieg]

We may drop further discussion of these principles, since they just add further restrictions. From the last two quotes one can see two important constraints. First, the dynamical systems under consideration are of a discrete character. Second, any transformation leads from a well-defined (and unique) state to another such state.

The basic limitation is already provided in the very first sentence of Sieg’s paper: “To investigate calculations is to analyze symbolic processes carried out by calculators.” There are two basic objections, which lead us to deny Sieg’s claim that his approach provides the basis for a general account of computation.

Firstly, from epistemology it is clear that there are no symbols out in the world. We can’t even transfer symbols between brains or minds in a direct manner in principle. We just say so in a very abbreviative manner. Even if our machine would work completely mechanically, Sieg’s approach would be insufficient to explain a “human computor.” His analysis is valid just and only for machines belonging (as a subclass) to the group of Turing machines that run finite algorithms. Hence, his analysis also suffers from the same restrictions. Turing machines can not make any proposal about other Turing machines. We may summarize this first point by saying that Sieg thus commits the same misunderstanding as the classical (strong) notion of artificial intelligence did. Meanwhile there is a large, extensive and somewhat bewildering debate about symbolism and sub-symbolism (in connectionism) that stopped only due to the exhaustion of the participants and the practical failure of strong AI.

The second objection against Sieg’s approach comes from Wittgenstein’s philosophy. According to Wittgenstein, we can not have a private language [8]. In other words, our brains can not have a language of thinking, as such a homunculus arrangement would always be private by definition. Searle and Putnam agree on that in rare concordance. Hence it is also impossible that our brain is “doing calculations” as something different from the activities when we perform calculations with pencil and paper, or sand, or a computer and electricity. This brings us to an abundant misunderstanding about what computers really do. Computers do not calculate. They do not calculate in the same respect as our human brain does not calculate. Computers just perform moves, deletions and—according to their theory—sometimes also an insertion into a string of atomic graphemes. Computers do not calculate, in the same way as the pencil is not calculating while we use it to write formulas or numbers. The same is true for the brain. What we call calculation is the assignment of meaning to a particular activity that is embedded in the Lebenswelt, the general fuzzy “network,” or “milieu,” of rules and acts of rule-following. Meaning, on the other hand, is not a mental entity, as Wilhelm Vossenkuhl emphasizes throughout his interpretation of Wittgenstein’s work.

The obvious fact that we as humans are capable of using language and symbols brings a question back to the foreground which we have already addressed elsewhere (in our editorial essay): How do words acquire meaning? (van Fraassen), or in terms of the machine-learning community: How to ground symbols? Whatever the answer will be (we will propose one in the chapter about conditions), we should not fallaciously take the symptom—using language and symbols—for the underlying process, “cause,” or structure. Using language clearly does not indicate that our brain employs language to “have thoughts.”

There are still other suggestions for a theory of computation. Yet, they either can be subsumed under the three approaches discussed here, provided by Copeland, Nir, and Sieg, or they fall short of the distinction between Turing computability, calculation, and computation, or they are merely confused by the shortfalls of reductionist materialism. An example is the article by Goldin and Wegner where they basically equate computation with interaction [9].

As an intermediate result we can state that there is no theory of computation so far that would be appropriate to serve as a basis for the debate around the epistemological and philosophical issues concerning our machines and our mind. So, how to conceive of computation?

Computation: An extended Perspective

Any of the theories of computation refers to the concept of algorithm. Yet, even deterministic algorithms may run forever if the solution space is defined in a self-referential manner. There are also a lot of procedures that can be made to run on a computer which follow “analytic rules” and will never stop running. (By “analytic rules” we understand a definite and completely determined and encoded rule that may be run on a UTM.)

Here we meet again the basic intention of Turing: his work in [1] has been about the calculability of functions. In other words, time is essentially excluded by his notion (and also in Sieg’s and Gandy’s extensions of Turing’s work). It does not matter whether the whole of all symbol manipulations is accomplished in a femtosecond or in a gigasecond. Ontologically, there is just a single block: the function.

Here at this point we can easily recognize the different ways of branching off from the classical, i.e. Turing-theory-based understanding of computation. Since Turing’s concept is well-defined, there are obviously several ways to conceive of something different. These, however, boil down to three principles.

  • (1) referring to (predefined) symbols;
  • (2) referring to functions;
  • (3) based on uniquely defined states.

Any kind of Non-Turing computation can be subsumed under one of these principles. These principles may also be combined. For instance, algorithms in the standard definition, as first given by Donald Knuth, refer to all three of them, while some computational procedures like the Game of Life, or some so-called “genetic algorithms” (which are not algorithms by definition), do not necessarily refer to (2) and (3). We may loosely distinguish weakly Non-Turing (WNT) structures from strongly Non-Turing (SNT) structures.
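As a concrete illustration of such a procedure, here is a minimal sketch of the Game of Life in Python/NumPy (our own toy setup, not taken from any source): the update rule refers to predefined marks (principle 1), yet it computes no function and needs no terminal state.

```python
import numpy as np

def life_step(grid):
    """One synchronous update of Conway's rules on a toroidal grid."""
    # count the eight neighbors by summing shifted copies of the grid
    n = sum(np.roll(np.roll(grid, di, axis=0), dj, axis=1)
            for di in (-1, 0, 1) for dj in (-1, 0, 1)
            if (di, dj) != (0, 0))
    # a cell is alive next step with exactly 3 neighbors,
    # or with 2 neighbors if it is alive already
    return ((n == 3) | ((n == 2) & (grid == 1))).astype(int)

glider = np.zeros((8, 8), dtype=int)
glider[1, 2] = glider[2, 3] = glider[3, 1] = glider[3, 2] = glider[3, 3] = 1
for _ in range(4):    # after four steps the glider has moved one cell diagonally
    glider = life_step(glider)
```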

All of the three principles vanish, and thus the story about computation changes completely, if we allow for a signal horizon inside the machine process. Immediately, we would have myriads of read/write devices all working on the same tape. Note that this situation does not actualize parallel processing, where one would have lots of Turing machines, each of them working on its own tape. Such parallelism is equivalent to a single Turing machine, just working faster. Of course, exactly this is intended in standard parallel processing as it is implemented today.

Our shared-tape parallelism is strikingly different. Here, even as we still would have “analytic rules,” the effect of the signal horizon could be world-breaking. I guess exactly this was the basis for Turing’s interest in the question of the principles of morphogenesis [10]. Although we only have determinate rules, we find the emergence of properties that can’t be predicted on the basis of those rules, neither quantitatively nor, even more importantly, qualitatively. There is not even the possibility of a language on the lower level to express what has been emerging from it. Such an embedding renders our analytic rules into “mechanisms.”
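To give a feel for this, here is a minimal sketch of a reaction-diffusion process in Python/NumPy, using Gray-Scott kinetics in one dimension; all parameter values are illustrative assumptions. The micro-rules are strictly local and determinate, yet the resulting spatial pattern is mentioned by none of them:

```python
import numpy as np

n, Du, Dv, f, k = 200, 0.16, 0.08, 0.035, 0.060
u, v = np.ones(n), np.zeros(n)
seed = slice(n // 2 - 5, n // 2 + 5)
u[seed], v[seed] = 0.5, 0.25        # a small local perturbation as the seed

def lap(a):
    """Discrete Laplacian with periodic boundary (strictly local coupling)."""
    return np.roll(a, 1) + np.roll(a, -1) - 2 * a

for _ in range(10_000):             # plain, determinate micro-rules
    uvv = u * v * v
    u += Du * lap(u) - uvv + f * (1 - u)
    v += Dv * lap(v) + uvv - (f + k) * v
# u and v now hold a spatial pattern that no single micro-rule mentions
```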

Due to the determinateness of the rules we still may talk about computational processes. Yet, there are no calculations of functions any more. The solution space gets extended by performing the computation. It is an empirical question to what extent we can use such mechanisms, and systems built from such mechanisms, to find “solutions.” Note that such solutions are not intrinsically given by the task. Nevertheless, they may help us, from the perspective of the usage, to proceed.

A lot of debates about deterministic chaos, self-organization, and complexity are invoked by such a turn. At least the topic of complexity we will discuss in detail elsewhere. Notwithstanding, we may call any process that is based on mechanisms and that extends the solution space by its own activity Proper Non-Turing Computation.

Non-Turing Computation

We now have to discuss the concept of Non-Turing Computation (NTC) more explicitly. We will not yet talk about Non-deterministic Turing Machines (NTM), and also not about exotic relativistic computers, i.e. Turing machines running in a black hole or its vicinity. Note also that as long as we perform an activity that finally is going to be interpreted as a solution for a function, we are still in the area defined by Turing’s theory, whether such an activity is based on so-called analog computers, DNA, or quantum dots. A good example of such a misunderstanding is given in [11]. MacLennan [12] emphasizes that Turing’s theory is based on a particular model (or class of models) and its accompanying axiomatics. Based on a different model we achieve a different way of computation. Although MacLennan provides a set of definitions of “computation” before the background of what he labels “natural computation,” his contribution remains too superficial for our purposes (he also does not distinguish between mechanistic and mechanismic).

First of all, we distinguish between “calculation” and “computation.” Calculating is completely within the domain of the axiomatic use of graphemes (again, we avoid using “symbol” here). An example is 71+52. How do we know that the result is 123? Simply by following the determinate rules that are all based on mathematical axioms. Such calculations do not add anything new, even if a particular one has been performed for the first time ever. Their solution space is axiomatically confined. Thus, the UTM and the λ-calculus are equivalent, as are mathematical calculation and calculations performed by a UTM or by humans. As such, calculation is equivalent to following the defined deterministic rules. We achieve the results by combining a mathematical model and some “input” parameters. Note that this “superposition” destroys information. Remarkably, neither the UTM nor its physical realization as a package of digital electronics and a particular kind of software can be conceived as a body, not even metaphorically.

In contrast to that, by introducing a signal horizon we get processes that provoke a basic duality. On the one hand they are based on rules which can be written down explicitly; they even may be “analytic.” Nevertheless, if we run these rules under the condition of a signal horizon, we get (strongly) emergent patterns and structures. The description of those patterns or structures can not be reduced to the descriptions of the rules (or the rules themselves) in principle. This is valid even for those cases where the rules on the micro-level would indeed be algorithms, i.e. rules delivering definite results in finite time and space.

Still, we have a lot of elementary calculations, but the result is not given by the axioms according to which we perform these calculations. Notably, introducing a signal horizon is equivalent to introducing the abstract body. So how should we call calculations that extend their own axiomatic basis?

We suggest that this kind of process could be called Non-Turing Computation, despite the fact that Turing was definitely aware of the constraints of the UTM, and despite the fact that it was Turing who invented the reaction-diffusion system as a non-UTM mechanism.

The label Non-Turing Computation just indicates that

  • – there is a strong difference between calculations under conditions of functional logics (λ-calculus) and calculations in an abstract and, of course, also in a concrete body, implied by the signal horizon and the related symmetry breaking; the first may be called (determinate) calculation, the latter (indeterminate) computation
  • – the calculations on the micro-level extend the axiomatic basis on the macro-level, leading to the fact that “local algorithmicity” no longer coincides with its “global algorithmicity”;
  • – nevertheless all calculations on the micro-level may be given explicitly as (though “local”) algorithms.

Three notes are indicated here. Firstly, it does not matter for our argument whether in a real body there are actually strict rules “implemented” as in a digital computer. The assumption that there are such rules plays the role of a worst-case assumption. If it is possible to get a non-deterministic result despite the determinacy of calculations on the micro-level, then we can proceed with our intention that a machine-based epistemology is possible. At the same time this argument does not necessarily support either the perspective of functionalism (claiming statefulness of entities) or that of computationalism (grounding on an “algorithmic framework”).

Secondly, despite the simplicity and even analyticity of local algorithms, a UTM is not able to calculate a physical actualization of a system that performs Non-Turing computations. The reason is that it is not defined in a way that it could. One of the consequences of embedding trivial calculations into a factual signal horizon is that the whole “system” no longer has a defined state. Of course we can interpret the appearance of such a system and classify it. Yet, we can not claim any more that the “system” has a state that could be analytically defined or recognized as such. Such a “system” (like the reaction-diffusion systems) can not be described within a framework that allows only unique states, such as the UTM, nor can a UTM represent such a system. Here, many aspects come to the fore that are closely related to complexity. We will discuss them there.

The third note finally concerns the label itself. Non-Turing computation could be any computation based on a customizable engine where there is no symbolic encoding, or no identifiable states while the machine is running. Besides complex systems, there are other architectures, such as so-called analog computers. In some quite justifiable way, we could indeed conceive the simulation of a complex self-organizing system as an analog computer. Another possibility is given by evolvable hardware, like FPGAs, even if the actual programming is still based on symbolic encoding. Finally, it has been suggested that any mapping of real-world data (e.g. sensory input), which are representable only by real numbers, to a finite set of intensions is also non-Turing computation [13].
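For the last variant, a minimal sketch may help. The nearest-centroid scheme below is my assumption for illustration, not the self-organised clustering method of [13]: a continuum of real-valued observations gets mapped onto a finite, growing set of “intensions.”

```python
# A minimal sketch of the idea in [13]: mapping real-valued input onto a
# finite set of "intensions". The nearest-centroid scheme is an assumed
# stand-in, not Ott's actual method.
import numpy as np

rng = np.random.default_rng(0)

def online_intensions(stream, threshold=1.0):
    """Assign each real-valued observation to a finite intension,
    creating a new one when nothing lies within the threshold."""
    centroids, labels = [], []
    for x in stream:
        if centroids:
            d = [np.linalg.norm(x - c) for c in centroids]
            i = int(np.argmin(d))
            if d[i] < threshold:
                centroids[i] = 0.9 * centroids[i] + 0.1 * x  # adapt slowly
                labels.append(i)
                continue
        centroids.append(np.array(x, dtype=float))
        labels.append(len(centroids) - 1)
    return labels, centroids

data = rng.normal(0, 0.2, (50, 2)) + rng.choice([[0, 0], [3, 3]], 50)
labels, cents = online_intensions(data)
print(len(cents), "intensions for a continuum of inputs")
```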

What is the result of an indeterminate computation, or, to use the redefined term, Non-Turing computation? We are not allowed to expect “unique” results any more. Sometimes there might be several results at the same time. A solution might even lie outside of the initial solution space, causing a particular blindness of the instance performing Non-Turing computations with regard to the results of its own activities. Dealing with such issues can not be regarded as a matter of a theory of calculability, or of any formal theory of computation. Formal theories can not deal with the self-induced extension of solution spaces.

The Role of Symbols

Before drawing a conclusion, we have to discuss the role of symbols. Here, of course, we have to refer to semiotics. (…)

keywords: C.S. Peirce; symbolism; (pseudo-)sub-symbolism; data type in NTC as actualization of associativity (which could be quite different); network theory (there: randolation)

Conclusion

Our investigation of computation and Non-Turing Computation yields a distinction between different ways of actualizing Non-Turing computation. Yet, there is one particular structure that is so different from Turing’s theory that it can not even be compared to it. Naturally, this addresses the penultimate precondition of Turing machines: axiomatics. If we perform a computation in the sense of strong rule-following, which could be based even on predefined symbols, we nevertheless may end up with a machine that extends its own axiomatic basis. For us, this seems to be the core property of Non-Turing Computation.

Yet, such a machine has not been built so far. We have provided just the necessary conditions for it. It is clear that mainly the software is missing for an actualization of such a machine. If such a machine were to exist in some near future, however, this would also have consequences concerning the status of the human mind, though rather undramatic ones.

Our contribution to the debate on the relation of “computers” and “minds” spans three aspects. Firstly, it should be clear that the traditional frame of “computationalism,” mainly based on the equivalence to the UTM, can be recognized as an inappropriate hypothesis. For instance, questions like “Is the human brain a computer?” can be identified as inadequate, since it is not a priori clear what a computer should be (besides falling thereby into the anti-linguistic trap). David King even asked the still more garbage-laden question “Is the human mind a Turing machine?” [14] King concludes that:

“So if we believe that we are more than Turing machines, a belief in a kind of Cartesian dualist gulf between the mental and the physical seems to be concomitant.”

He arrives at that (wrong) conclusion by some (deeply non-Wittgensteinian) reflections about the actual infinite and Cantor’s (nonsensical) ideas about it. It is simply an ill-posed question whether the human mind can solve problems a UTM can’t. Most of the problems we as humans deal with all day long can not be “solved” (within the same day), and many can not even be represented to a UTM, since this would require a definite encoding into a string of graphemes. Indeed, we can deal with those problems without solving them “analytically.” King is not aware of the poison of analyticity imported through the direct comparison with the UTM.

This brings us to the second aspect, the status of mechanisms. The denial of the superiority, or even of the equality, of brains and UTMs does not amount to the acceptance of some top-down principle, as King suggests in the passage cited above. UTMs, like any other algorithmic machine, are finite state automata (FSA). FSA, and even probabilistic or non-deterministic FSA, totalize the mechanics such that they become equivalent to a function, as Turing himself clearly stated. Yet, the brain and the mind could be recognized as something that indeed rests on very simple (material) mechanisms, while these mechanisms (say, algorithms) are definitely not sufficient to explain anything about the brain or the mind. From that perspective we could even conclude that we can only build such a machine if we fully embrace the transcendental role of so-called “natural” languages, as it has been recognized by Wittgenstein and others.

The third and final aspect of our results concerns the effect of these mechanisms on theory. Since the elementary operations are still mechanical and maybe even finite and fully determined, it is fully justified to call such a process a calculation. Molecular operations are indeed highly determinate, yet only within the boundaries of quantum phenomena, not to forget the thermal noise on the level of the conditions of the possible. Organisms invest a lot to improve the signal-to-noise ratios up to a digital level. Yet, this calculation is not a standard computation, for two reasons: First, these processes are not programmable. They are as they are, as a matter of fact and by means of the factual matter. Second, the whole process is no longer a well-defined calculation. There is not even a state. At the borderlines between matter, its formation (within processes of interpretation, themselves part of that borderline zone), and information, something new is appearing (emerging?) that can’t be covered by the presuppositions of the lower levels.

As a model then—and we always have to model in each single “event” anyway (we will return to that elsewhere)—we could refer to axiomatics. It is an undeniable fact that we as persons can think more, and in more generality, than amoebas or neurons. Yet, in the case of reptiles, dogs, cats or dolphins, we could not say “more” any more; it is rather a “different” than a “more” that we have to apply in describing the relationship between our thinking and theirs. Still, dogs or chimpanzees did not develop insight into the limitations of the λ-calculus.

As a conclusion, we could characterize “Non-Turing computation” with regard to the stability of its own axiomatic basis: Non-Turing computation extends its own axiomatic basis. From the perspective of the integrated entity, however, we can call it differentiation, or abstract growth. We already appreciated Turing’s contribution on that topic above. Just imagine imagining images like those before actually having seen them…

There are some topics that directly emerge from these results, forming a kind of (friendly) conceptual neighborhood.

  • – What is the relation between abstract growth / differentiation and (probabilistic) networks?
  • – Part of the answer to this first issue is likely given by the phenomenon of a particular transition from the probabilistic to the propositional, which also plays a role concerning the symbolic.
  • – We have to clarify the notion “extending an axiomatic basis”. This relates us further to evolution, and particularly to the evolution of symbolic spaces, which in turn is related to category theory and some basic notions about the concepts of comparison, relation, and abstraction.
  • – The relationship of Non-Turing Computation to the concepts of “model” and “theory.”
  • – Is there an ultimate boundary for that extension, some kind of conditional system that can’t be surpassed, and how could we speak about that?
  • [1] Alan M. Turing (1936), On Computable Numbers, with an Application to the Entscheidungsproblem. Proceedings of the London Mathematical Society, Series 2, Vol. 42, p. 230-265.
  • [2] Andrew Hodges, Alan Turing.
  • [3] B. Jack Copeland (1996), What is Computation? Synthese 108: 335-359.
  • [4] Nir Fresco (2008), An Analysis of the Criteria for Evaluating Adequate Theories of Computation. Minds & Machines 18: 379-401.
  • [5] Wilfried Sieg (2000), Calculations by Man and Machine: Conceptual Analysis. Department of Philosophy, Paper 178. http://repository.cmu.edu/philosophy/178
  • [6] Wilfried Sieg (2005), Church Without Dogma: Axioms for Computability. Department of Philosophy, Paper 119. http://repository.cmu.edu/philosophy/119
  • [7] Wilhelm Vossenkuhl, Ludwig Wittgenstein. 2003.
  • [8] Ludwig Wittgenstein, Philosophical Investigations §201; see also the Internet Encyclopedia of Philosophy.
  • [9] Goldin and Wegner.
  • [10] Alan M. Turing (1952), The Chemical Basis of Morphogenesis. Phil. Trans. Royal Soc. London, Series B, Biological Sciences, Vol. 237, No. 641 (Aug. 14, 1952), p. 37-72.
  • [11] Ed Blakey (2011), Computational Complexity in Non-Turing Models of Computation: The What, the Why and the How. Electronic Notes Theor. Comp. Sci. 270: 17-28.
  • [12] Bruce J. MacLennan (2009), Super-Turing or Non-Turing? Extending the Concept of Computation. Int. J. Unconvent. Comp., Vol. 5 (3-4), p. 369-387.
  • [13] Thomas M. Ott (2007), Self-organised Clustering as a Basis for Cognition and Machine Intelligence. Thesis, ETH Zurich.
  • [14] David King (1996), Is the Human Mind a Turing Machine? Synthese 108: 379-389.

۞

Machine-based Episteme/Epistemology

October 24, 2011 § Leave a comment

It is pretty clear that even if we just think of machines being able to understand, this immediately triggers epistemological issues. If such a machine were able to build previously non-existent representations of the outer world in a non-deterministic manner, a wide range of epistemological implications would be invoked for that machine-being, and these epistemological implications are largely the same as for humans or other cognitively well-developed organic life.

Epistemology investigates the conditions for “knowledge” and the philosophical consequences of knowing. “Knowledge” is notoriously difficult to define, and there are many misunderstandings around, including the “soft stone” of so-called “tacit knowledge”; yet for us it simply denotes a bundle consisting of the following (a rough structural sketch follows the list):

  • – a dynamic memory
  • – capacity for associative modeling, i.e. adaptively deriving rules about the world
  • – ability to act upon achieved models and memory
  • – self-oriented activity regarding the available knowledge
  • – the ability to use already present information to extend the capabilities above
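Purely as an illustration of the bundle just listed (all names are hypothetical, a sketch rather than an actual design), the components can be stated as an abstract interface:

```python
# A minimal structural sketch of the "knowledge bundle" described above;
# every name here is hypothetical, chosen only to mirror the list.
from abc import ABC, abstractmethod

class KnowingEntity(ABC):
    @abstractmethod
    def remember(self, experience): ...   # dynamic memory

    @abstractmethod
    def model(self): ...                  # associative modeling: adaptively
                                          # derive rules about the world

    @abstractmethod
    def act(self, situation): ...         # act upon achieved models and memory

    @abstractmethod
    def reflect(self): ...                # self-oriented activity regarding
                                          # the available knowledge

    @abstractmethod
    def bootstrap(self): ...              # use present information to extend
                                          # the capabilities above
```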

Note that we do not demand communication about knowledge. For several reasons, and based on Wittgenstein’s theory of meaning, we think that knowledge can not be transmitted or communicated. Recently, Linda Zagzebski [1] reached the same result, starting from a different perspective. She writes that “[…] knowledge is not an output of an agent, instead it is a feature of the agent.” In agreement with at least some reasonably justified philosophical positions, we thus propose that it is also reasonable to conceive of such a machine as mentioned before as being capable of knowledge. Accordingly, we are justified in assigning the capability of knowing to the machine. The knowledge comprised or constituted by the machine is not accessible to us as the “creators” of the machine, for the very same reason of the difference between the Lebenswelten.

Yet, the knowledge acquired by the machine is also not “directly” accessible to the machine itself. In contrast to rationalist positions, knowledge can’t be separated from the whole of a cognitive entity. The only thing that is possible is to translate it into publicly available media like language, to negotiate a common usage of words and their associated link structures, and to debate about the mutually private experiences.

Resorting to the software running on the machine and checking the content of the machine will not be possible either. A software that enables knowing can’t be decomposed in order to serve as an explanation of that knowledge. The only thing one will find is a distant analog to our neurons. Reductionism will work for the machine as little as it works for the human mind.

Yet, such machine-knowledge is not comparable to human knowledge. The reason for that is not an issue of type or extent. The reason is given by the fact that the Lebenswelt of the machine, that is, the totality of all relations to the outer world and of all transformations of the perceiving and acting entity, the machine, would be completely different from ours. It will not make any sense to try to simulate any kind of human-like knowledge in that machine. It always will be drastically different.

The only possibility to speak about the knowing and the knowledge of the machine is through epistemological concepts. For us it doesn’t seem promising to engage in fields like “Cognitive Informatics,” since informatics (computer science) can not deal with cognition for rather fundamental reasons: Cognition is not Turing-computable.

The bridging bracket between the brains and minds of machine and human being is the theory of knowing. Consequently, we have to apply epistemology to deal with machines that possibly know. The conditions for that knowledge could turn out to be strange; hence we should try to develop the theory of machine-based knowledge from the perspective of the machine. It is important to understand that attempts like the Turing-Test [2] are inappropriate, for several reasons: (i) they follow the behavioristic paradigm, (ii) they do not offer the possibility to derive scales for comparison, (iii) no fruitful questions can be derived from them.

Additionally, there are some arguments pointing to the implicit instantiation of a theory as soon as something is going to be modeled. In other words, a machine which is able to know already has a—probably implicit—theory about what it models, and this also means about itself. That theory would originate in the machine (despite the fact that it can’t be a private theory). Hence, we open a branch and call it machine-based epistemology.

Some Historical Traces of ‘Contacts’

(between two strange disciplines)

Regarding the research about and the construction of “intelligent” machines, the relevance of thinking in epistemological terms was recognized quite early. In 1963, A. Wallace published a paper entitled “Epistemological Foundations of Machine Intelligence” [3] that, quite unfortunately, is not available except for the already remarkable abstract:

Abstract : A conceptual formulation of the Epistemological Foundations of Machine Intelligence is presented which is synthesized from the principles of physical and biological interaction theory on the one hand and the principles of mathematical group theory on the other. This synthesis, representing a fusion of classical ontology and epistemology, is generally called Scientific Epistemology to distinguish it from Classical General Systems theory. The resulting view of knowledge and intelligence is therefore hierarchical, evolutionary, ecological, and structural in character, and consequently exhibits substantial agreement with the latest developments in quantum physics, foundations of mathematics, general systems theory, bio-ecology, psychology, and bionics. The conceptual formulation is implemented by means of a nested sequence of structural Epistemological-Ontological Diagrams which approximate a strong global interaction description. The mathematico-physical structure is generalized from principles of duality and impotence, and the techniques of Lie Algebra and Lie Continuous Group theory.

As far as it is possible to get an impression of the actual but lost full paper, Wallace’s approach is formal and mathematical. Biological interaction theory at that time was a fork from mathematical information theory, at least in the U.S., where this paper originates. Another small weakness could be indicated by the notion of “hierarchical knowledge and intelligence,” pointing to some rest of positivism. Anyway, the proposed approach was unfortunately never followed up on. Yet we will see in our considerations about modeling that the reference to structures like Lie group theory could not have worked out in a satisfying manner.

Another early instance of bringing epistemology into the research about “artificial intelligence” is McCarthy [4,5], who coined the term “Artificial Intelligence.” Yet, his perspective appears by far too limited. He starts with the reduction of epistemology to first-order logic:

“We have found first order logic to provide suitable languages for expressing facts about the world for epistemological research.” […]

“Philosophers emphasize what is potentially knowable with maximal opportunities to observe and compute, whereas AI must take into account what is knowable with available observational and computational facilities.”

Astonishingly, he does not mention any philosophical argument in the rest of the paper, except in the last paragraph:

“More generally, we can imagine a metaphilosophy that has the same relation to philosophy that metamathematics has to mathematics. Metaphilosophy would study mathematical systems consisting of an ‘epistemologist’ seeking knowledge in accordance with the epistemology to be tested and interacting with a ‘world’. […] AI could benefit from building some very simple systems of this kind, and so might philosophy.”

McCarthy’s stance towards philosophy is typical for the whole field. Besides the presumptuous suggestion of a “metaphilosophy” and subsuming it rather nonchalantly under mathematics, he misses the point of epistemology, even as he refers to the machine as an “observer”: a theory of knowledge is about the conditions of the possibility of knowledge. McCarthy does not care about the implications of his moves for that possibility, or vice versa.

Important progress on the issue of the state of machines was contributed not by the machine technologists themselves, but by philosophers, namely Putnam, Fodor, Searle and Dennett in the English-speaking world, and also by French philosophers like Serres (in his “Hermes” series) and Guattari. The German systems theorists like von Foerster and Luhmann and their fellows never went beyond cybernetics, so we can omit them here. In 1998, Wellner [6] provided a proposal for epistemology in the field of “Artificial Life” (what a terrible wording…). Yet, his attempt to contribute to the epistemological discussion turns out to be inspired by Luhmann’s perspective, and the “first step” he proposes is simply to stuff robots with sensors; in the end, it is not really a valuable attempt to deal with epistemology in the affairs of epistemic machines.

In 1978, Daniel Dennett [7] reframed the so-called “Frame Problem” of AI, of which McCarthy and Hayes [4] had already become aware 10 years earlier. Dennett asks how

“a cognitive creature … with many beliefs about the world” can update those beliefs when it performs an act so that they remain “roughly faithful to the world”? (cited acc. to [8])

Recently, Dreyfus [9] and Wheeler [10] (who nevertheless disagrees with Dreyfus about the reasoning) called the Frame Problem an illusory pseudo-problem, created by the adherence to Cartesian assumptions. Wheeler described it as:

“The frame problem is the difficulty of explaining how non-magical systems think and act in ways that are adaptively sensitive to context-dependent relevance.”

Wheeler as well as Dreyfus recognize the basic problem(s) in the architecture of mainstream AI, and they identify Cartesianism as the underlying principle of these difficulties, i.e. the claim of analyticity, reducibility and identifiability. Yet, neither of the two has so far proposed a stable solution. Heideggerian philosophy with its situationistic appeal does not help to clarify the epistemological affairs, neither of machines nor of humans.

Our suggestion is the following: Firstly, a general solution should be found for how to conceive the (semi-)empirical relationship between beings that have some kind of empirical coating. Secondly, this general solution should serve as a basis for investigating the differences, if there are any, between machines and humans regarding their epistemological affairs with the “external” world. This endeavor we label “machine-based epistemology.”

Machine-based Epistemology

If a machine, or better, a synthetic body that was established as a machine in the moment of its instantiation, were able to act freely, it would face the same epistemological problems as we humans, starting with basic sensory perception and not ending with linking a multi-modal integration of sensory input to adequate actions. Therefore machine-based epistemology (MBE) is the appropriate label for the research program that is dedicated to learning processes implemented on machines. We avoid invoking the concept of agents here, since this already brings in a lot of assumptions.

Note that MBE should not be confused with so-called “Computer Epistemology,” which is concerned just with the design of so-called man-machine interfaces [11]. We are not concerned here with epistemological issues arising through the usage of computers, of course.

It is clear that the term machine learning misses the point; it is a purely technical term. Machine learning is about algorithms and programmable procedures, not about reflecting on the conditions of that. Thus, it does not recognize the context into which learning machines are embedded, and in turn it also misses the consequences. In some way, machine learning is not about learning about machines. It remains a pure engineering discipline.

As a consequence, one can find a lot of nonsense in the field of machine learning, especially concerning so-called ontologies and meta-data, but also about the topic of “learning” itself. There is the nonsensical term of “reinforcement learning”… what kind of learning could not be about (differential) reinforcement?

The other label Machine-based Epistemology competes with is “Artificial Intelligence.” Check out the editorial text “Where is the Limit” for arguments against the label “AI.” The conclusion was that AI is too close to cybernetics and mathematical information theory, that it is infected by romanticism and difficult to operationalize, and that it does not appropriately account for cultural effects on the “learning subject.” Since AI is not natively connected to philosophy, there is no adequate treatment of language: AI never took the “Linguistic Turn.” Instead, the so-called philosophy of AI poses silly questions about “mental states.”

MBE is concerned with the theory of machines that possibly start to develop autonomous cognitive activity; you may call this “thinking.” You may also conceive of it as a part of a “philosophy of mind.” Both notions, thinking and mind, may work in the pragmatics of everyday social situations; for a stricter investigation I think they are counter-productive: We should pay attention to language in order not to get vexed by it. If there is no “philosophy of unicorns,” then probably there also should not be a “philosophy of mind.” Both labels, thinking and mind, pretend to define a real and identifiable entity, although exactly this should be one of the targets of clarification. Those labels can easily cause the misunderstanding of separable subjects. Instead, we could call it “philosophy of generalized mindfulness,” in order to avoid anthropomorphic chauvinism.

As a theory, MBE is not driven by engineering, as is the case for AI; just the other way round, MBE itself drives engineering. It brings philosophical epistemology into the domain of engineering computer systems that are able to learn. Thus it is natively linked, in an already well-established manner, to other fields in philosophy. This, finally, helps to avoid posing silly questions or following silly routes.

  • [1] Linda Zagzebski, contribution to: Jonathan Dancy, Ernest Sosa, Matthias Steup (eds.), A Companion to Epistemology, Vol. 4, p. 210; here p. 212.
  • [2] Alan Turing (1950), Computing Machinery and Intelligence. Mind 59(236): 433-460.
  • [3] Wallace, A. (1963), Epistemological Foundations of Machine Intelligence. Information for the Defense Community (U.S.A.), Accession Number AD0681147.
  • [4] McCarthy, J. and Hayes, P.J. (1969), Some Philosophical Problems from the Standpoint of Artificial Intelligence. Machine Intelligence 4, p. 463-502 (eds. Meltzer, B. and Michie, D.). Edinburgh University Press.
  • [5] McCarthy, J. (1977), Epistemological Problems of Artificial Intelligence. In IJCAI, 1038-1044.
  • [6] Jörg Wellner (1998), Machine Epistemology for Artificial Life. In: Third German Workshop on Artificial Life, edited by C. Wilke, S. Altmeyer, and T. Martinetz, p. 225-238, Verlag Harri Deutsch.
  • [7] Dennett, D. (1978), Brainstorms. MIT Press, p. 128.
  • [8] Murray Shanahan (2004, rev. 2009), The Frame Problem. Stanford Encyclopedia of Philosophy, available online.
  • [9] H.L. Dreyfus (2008), “Why Heideggerian AI Failed and How Fixing It Would Require Making It More Heideggerian”, in The Mechanical Mind in History, eds. P. Husbands, O. Holland & M. Wheeler, MIT Press, p. 331-371.
  • [10] Michael Wheeler (2008), Cognition in Context: Phenomenology, Situated Robotics and the Frame Problem. Int. J. Phil. Stud. 16(3): 323-349.
  • [11] Tibor Vamos, Computer Epistemology: A Treatise in the Feasibility of the Unfeasible or Old Ideas Brewed New. World Scientific, 1991.

۞

Mental States

October 23, 2011 § Leave a comment

The issue we are dealing with here is the question whether we are justified to assign “mental states” to other people on the basis of our experience, that is, based on weakly valid predictions and the use of some language upon them.

Hilary Putnam, in an early writing (at least before 1975), used the notion of mental states, and today almost everybody does so. In the following passage he tries to justify the reasonableness of the inference of mental states (italics by H. Putnam, colored emphasis by me); I think this passage is no longer compatible with his results in “Representation and Reality,” although most people, particularly from the computer sciences, cite him as a representative of a (rather crude) machine-state functionalism:

“These facts show that our reasons for accepting it that others have mental states are not an ordinary induction, any more than our reasons for accepting it that material objects exist are an ordinary induction. Yet, what can be said in the case of material objects can also be said here: our acceptance of the proposition that others have mental states is both analogous and disanalogous to the acceptance of ordinary empirical theories on the basis of explanatory induction. It is disanalogous insofar as ‘other people have mental states’ is, in the first instance, not an empirical theory at all, but rather a consequence of a host of specific hypotheses, theories, laws, and garden variety empirical statements that we accept. […] It is analogous, however, in that part of the justification for the assertion that other people have mental states is that to give up the proposition would require giving up all of the theories, statements, etc., that we accept implying that proposition; […] But if I say that other people do not have minds, that is, if I say that other people do not have mental states, that is, if I say that other people are never angry, suspicious, lustful, sad, etc., I am giving up propositions that are implied by the explanations that I give on specific occasions of the behavior of other people. So I would have to give up all of these explanations.”

Suppose we observe someone for a few minutes while he or she is getting increasingly stressed or relaxed, and suddenly the person starts to shout and to cry, or to smile. More professionally, if we use a coding system like the one proposed by Scherer and Ekman, the famous “Facial Action Coding System,” recently popularized by the TV series “Lie to me”: are we allowed to assign to that person a “mental state”?

Of course, we intuitively and instinctively start trying to guess what’s going on with the person, in order to make some prediction or diagnosis (which essentially is the same thing), for instance because we feel inclined to help, to care, to console the person, to flee, or to chum up with her. Yet is such a diagnosis, probably taking place in the course of a mutual interpretation of largely non-verbal behavior, the same as assigning “mental states”?

We are deeply convinced, that the correct answer is ‘NO’.

The answer to this question is somewhat important for an appropriate handling of machines that start to be able to open their own epistemology, which is the correct phrase for the flawed notion of “intelligent” machines. Our answer rests on two different pillars: we invoke complexity theory, and a philosophical argument as well. Complexity theory forbids states for empirical reasons; the philosophical argument forbids the usage of the concept with regard to the mind, due to the fact that empirical observations can never be linked to statefulness, neither by language nor by mathematics. Statefulness is then identified as a concept from the area of (machine) design.

Yet, things are a bit tricky. Hence, we have to extend the analysis a bit. We also have to refer to what we said (or will say) about theory and modeling.

Reductionism, Complexity, and the Mental

Since the concept of “mental state” involves the concept of state, our investigation has to follow two branches. Besides the concept of “state” we have the concept of the “mental,” which is still a very blurry one. The compound concept of “mental state” just does not seem to be blurry, because of the state-part. But what if the assignment of states to the personal inner life of the conscious vis-à-vis is not justified? We think indeed that we are not allowed to assign states to other persons, at least when it comes to philosophy or science about the mind (if you would like to call psychology a “science”). In this case, the concept of the mental remains blurry, of course. One could suspect that the saying of “mental state” just arose to create the illusion of a well-defined topic when talking about the mind or mindfulness.

“State” denotes a context of empirical activity. It assumes that there have been preceding measurements yielding a range of different values, which we classify and interpret a posteriori. As a result of these empirical activities we distinguish several levels of rather similar values, give them a label and call them a “state.” This labeling always remains partially arbitrary, in principle. Looking backward, we can see that the concept of “state” invokes measurability, interpretation and, above all, identifiability. The language game of “state” excludes basic non-identifiability. Though we may speak about a “mixed state,” which still assumes identifiability in principle, there are well-known cases of empirical subjects to which we can not assign any distinct value in principle. Prigogine [2] gave many examples, and even an analytic one, based on number theory. In short, we can take it for granted that complex systems may traverse regions in their parameter space where it is not possible to assign anything identifiable. In some sense, the object does not exist as a particular thing; it just exists as a trajectory, or more precisely, a compound made from history and pure potential. A slightly more graspable example of those regions are the bifurcation “points” (which are not really points in real systems).
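A minimal sketch of such regions (the logistic map is my stand-in illustration, not one of Prigogine’s examples): sweeping a single control parameter moves the system from regimes with one or a few identifiable long-run values to a regime where no distinct value, hence no “state,” can be assigned.

```python
# A minimal sketch: the logistic map passes from assignable "states"
# (fixed points, small cycles) to a region where no distinct value
# can be identified at all.
import numpy as np

def attractor_sample(r, n_burn=500, n_keep=50):
    """Iterate x -> r*x*(1-x), discard transients, return the tail."""
    x = 0.5
    for _ in range(n_burn):
        x = r * x * (1 - x)
    tail = []
    for _ in range(n_keep):
        x = r * x * (1 - x)
        tail.append(x)
    return tail

for r in (2.8, 3.2, 3.5, 3.9):
    values = np.round(attractor_sample(r), 3)
    print(f"r={r}: {len(set(values))} distinguishable value(s)")
# r=2.8 -> 1 (a fixed point), r=3.2 -> 2, r=3.5 -> 4,
# r=3.9 -> ~50: effectively no assignable "state"
```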

An experimentally well-visible example is represented by arrangements like so-called Reaction-Diffusion-Systems [3]. How should we describe such a system? An atomic description is not possible if we try to refer to any kind of rules. The reason is that the description of a point in the parameter system around the indeterminate area of bifurcation is the description of the whole system itself, including its trajectory through phase space. Now, who would deny that the brain, and the mind springing from it, by far exceeds the complexity of those “simple” complex systems used as “model systems” in the laboratory, in Petri dishes, or even in computer simulations?

So, we conclude that brains can not “have” states in the analytic sense. But what about meta-stability? After all, it seems that the trajectories of psychological or behavioral parameters are somehow predictable. The point is that the concept of meta-stability does not help very much. That concept directly refers to complexity, and thus it refers to the whole “system,” including a large part of its history. As a realist, or a scientist believing in empiricism, we would not gain anything. We may summarize that there is no possible reduction of the brain to a perspective that would justify the usage of the notion of “state.”

But what about the mind? Let the brain be chaotic; the mind, probably, need not be. Nobody knows. Yet, an optimistic reductionist could argue for that possibility. Is it then allowed to assign states to the mind, that is, to separate the brain from the mind with respect to stability and “statefulness”? Firstly, again, the reductionist would lose all his points, since in this case the mind and its states would turn into something metaphysical, if not from “another reality.” Secondly, measurability would fall apart, since the mind is nothing you could measure as an explanans. It is not possible to split off the mind of a person from that very person, at least not for anybody who would try to justify the assignment of states to minds, brains or “mental matter.” The reason is a logical one: such an attempt would commit a petitio principii.

Obviously, we have to resort to the perspective of language games. Of course, everything is a language game; we knew that even before refuting the state as an appropriate concept to describe the brain. Yet, we have demonstrated that even an enlightened reductionist, in the best case a contemporary psychologist, or probably also William James, must acknowledge that it is not possible to speak scientifically (or philosophically) about states concerning mental issues. Before treating the state as a language game, I would first like to visit the concept of automata in its relation to language.

Automata, Mechanism, and Language

Automata are positive definite, meaning that they consist of a finite set of well-defined states. At any point in time they are exactly defined, even if the particular automaton is a probabilistic one. Well, complexity theory tells us that this is not possible for real objects. Yet, “we” (i.e. computer hardware engineers) learned to suppress deviations far enough to build machines which come close to what is called the “Universal Turing Machine,” i.e. today’s physical computers. A logical machine, or a “logics machine,” if you like, is then an automaton. Therefore, standard computer programs are perfectly predictable. They can be stopped, hibernated, restarted etc., and weeks later you can proceed at the last point of your work, because the computer did not change a single one of more than 8'000'000'000 dual-valued bits. All of the software running on computers is completely defined at any point in time. Hence, logical machines exist outside of time, at least from their own perspective. It is perfectly reasonable to assign “states” to them, and the sequence of these states is fully reversible, in the sense that either the totality of a state can be stored and mapped back onto the machine, or it can be identically reproduced.
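A minimal sketch of this positive definiteness (a toy automaton of my own, purely for illustration): the complete state can be snapshotted, the machine can keep running, and the snapshot can later be restored bit-identically.

```python
# A minimal sketch: a deterministic automaton whose complete state can
# be snapshotted, discarded, and later restored identically -- something
# no complex system permits.
import copy

class Automaton:
    TRANSITIONS = {("idle", "a"): "busy", ("busy", "b"): "idle"}

    def __init__(self):
        self.state = "idle"
        self.tape = []

    def step(self, symbol):
        self.state = self.TRANSITIONS.get((self.state, symbol), self.state)
        self.tape.append(symbol)

m = Automaton()
m.step("a")
snapshot = copy.deepcopy(m.__dict__)   # "hibernate": totality of the state

m.step("b")                            # keep computing...
m.__dict__ = copy.deepcopy(snapshot)   # ...then restore, weeks later

assert m.state == "busy" and m.tape == ["a"]  # identical, fully reversible
```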

For a long period of time, people thought that such a thing would be an ideal machine. Since it was supposed to be ideal, it was also a matter of God, and in turn, since God could not do nonsense (as it was believed), the world had to be a machine. In essence, this was the reasoning in the start-up phase of the Renaissance; remember Descartes’s or Leibniz’s ideas about machines. Later, Laplace claimed perfect predictability for the universe, if only he could measure everything, as he said. Not quite randomly, Leibniz also thought about the possibility of creating any thought by combination from a rather limited set of primitives, and in that vein he also proposed binary encoding. Elsewhere we will discuss whether real computers, as simulators of logical machines, can only ever behave deterministically. (They do not…)

Note that we are not just talking about the rather trivial case of Finite State Automata. We explicitly include the so-called Universal Turing Machine (UTM) in our considerations, as well as Cellular Automata, for which some interesting rules are known that produce unpredictable though not random behavior. The common property of all these entities is positive definiteness. It is important to understand that physical computers must not be conceived as UTMs. The UTM is a logical machine, while the computer is a physical instance of it. At the same time it is more, but also less, than a UTM. The UTM consists of operations virtually without a body and without matter, and thus also without the challenge of a time viz. signal horizon: things which usually cause trouble when it comes to exactness. The particular quality of the unfolding self-organization in Reaction-Diffusion-Systems is, besides other design principles, dependent on effective signal horizons.

Complex systems are different, and so are living systems (see the posts about complexity). Their travel through parameter space is not reversible. Even “simple” chemical processes are not reversible. So neither the brain nor the mind can be described as reversible entities. Even if we could measure a complex system at a given point in time “perfectly,” i.e. far beyond quantum-mechanical thresholds (if such a statement makes any sense at all), even then the complex system would return to increasing unpredictability, because such systems are able to generate information [4]. Besides the issue of stability, they are also deeply nested, where each level of integration can’t be reduced to the available descriptions of the next lower level. Standard computer programs are thus an inappropriate metaphor for the brain as well as for the mind. Again, there is the strategic problem for the reductionist trying to defend the usage of the concept of states to describe mental issues, as reversibility would a priori assume complete measurability, which first would have to be demonstrated before we could talk about “states” in the brain or “in” the mind.

So, we drop the possibility that the brain or the mind is an automaton. A philosophically inspired biological reductionist will then probably resort to the concept of mechanism. Mechanisms are habits of matter. They are micrological and more local with respect to the more global explanandum. Mechanisms do not claim a deterministic causality for all the parts of a system, as the naive mechanists of earlier days did. Yet, referring to mechanisms imports the claim that there is a linkage between probabilistic micrological (often material) components and a reproducible overall behavior of the “system.” The micro-component can be modeled deterministically or probabilistically, following very strict rules; the overall system then shows some behavior which can not be described in the terms appropriate for the micro-level. Applied to our case of mental states, that would lead us to the assumption that there are mechanisms. We could not say that these mechanisms lead to states, because the reductionist first has to prove that mechanisms lead to stability. However, mechanisms do not provide any means to argue on the more integrated level. Thus we conclude that—funnily enough—resorting to the concept of probabilistic mechanism includes the assumption that it is not appropriate to talk about states. Again a bad card for the reductionist heading for states in the mind.

Instead, systems theory uses concepts like open systems, dynamic equilibrium (which actually is not an equilibrium), etc. The result of the story is that we can not separate a “something” in the mental processes that we could call a state. We have to speak about processes. But that is a completely different game, as Whitehead was the first to demonstrate.

The assignment of a “mental state” itself is empty. The reason is that there is nothing we could compare it with. We can only compare behavior and language across subjects, since any other comparison of two minds always includes behavior and language. This difficulty is nicely demonstrated by the so-called Turing test, as well as by Searle’s example of the Chinese Room. Both examples describe situations where it is impossible to separate something in the “inner being” (of computers, people or rooms with Chinese dictionaries); it is impossible because that “inner being” has no neighbor, as Wittgenstein would have said. As already said, there is nothing with which we could compare it. Indeed, Wittgenstein said so about the “I” and refuted its reasonability, ultimately arriving at a position of “realistic solipsism.” Here we have to oppose the misunderstanding that an attitude like ours denies the existence of mental affairs of other people. It is totally o.k. to believe, and to act according to this belief, that other people have mental affairs in their own experience; but it is not o.k. to call that a state, because we can not know anything about the inner experience of the private realities of other people which would justify the assignment of the quality of a “state.” We could also refer to Wittgenstein’s example of pain: it is nonsense to deny that other people have pain, but it is also nonsense to try to speak about the pain of others in a way that claims private knowledge. It is even nonsense to speak about one’s own pain in a way that would claim private knowledge—not because it is private, but because it is not a kind of knowledge. Though we are used to thinking that we “know” the pain, we do not. If we did, we could speak exactly about it, and for others it would not be unclear in any sense, much like: I know that 5>3, or things like that. But it is not possible to speak in this way about pain. There is a subtle translation or transformation process between the physiological process of releasing prostaglandin at the cellular level and the final utterance of the sentence “I have a certain pain.” The sentence is public, and mandatorily so. Before that sentence, the pain has no face and no location, even for the person feeling the pain.

You might say: o.k., there are physics and biology and molecules and all the things we have no direct access to either. Yet, again, these systems behave deterministically, or at least some of them we can force to behave regularly. Electrons, atoms and molecules do not have individuality beyond their materiality; they can not be distinguished, they have no memory, and they do not act in their own symbolic space. If they did, we would have the same problem as with the mental affairs of our conspecifics (and chimpanzees, whales, etc.).

Some philosophers, particularly those calling themselves analytic, claim that not only feelings like happiness, anger etc. require states, but that intentions do so as well. This, however, would aggravate the attempt to justify the assignment of states to mental affairs, since intentions are the result of activities and processes in the brain and the mind. Yet, from that perspective one could try to claim that mental states are the result of calculations or deterministic processes. As for mathematical calculations, there could be many ways leading to the same result. (The identity theory of physical and mental affairs was first refuted by Putnam in 1967 [5].) From the level of the result we unfortunately can not tell anything about the way it was achieved: 127, for instance, does not disclose whether it was reached as 73+54 or as 130-3. This asymmetry holds even for simple mathematics.

Mental states are often conceived of as “dispositions”; we just talked about anger and happiness, not to mention more “theoretical” concepts. Regarding this usage of “state,” I suppose it is circular, or empty. We can not talk about the other’s psychic affairs except through the linkage we derive from experience. This experience links certain types of histories or developments with certain outcomes. Yet, there is no fixation of any kind, and especially not in the sense of a finite state automaton. That means that we are mapping probability densities to each other. It may be natural to label those, but we can not claim that these labels denote “states.” Those labels are just that: labels. Perhaps negotiated into some convention, but still, just labels. Not to be aware of this means to forget about language, which really is a pity in the case of “philosophers.” The concept of “state” is basically a concept that applies to the design of (logical) machines. For these reasons it is thus not possible to use “state” as a concept where we attempt to compare (hence to explain) different entities, one of which is not the result of design. Thus, it is also not possible to use “states” as a kind of “explaining principle” for any further description.

One way to express the reason for the failure of the supervenience claim is that it mixes matter with information. A physical state (if that were meaningful at all) can not be equated with a mental state, in any of its possible ways. If the physical parameters of a brain change, the mental affairs may or may not be affected in a measurable manner. If the physical state remains the same, the mental affairs may remain the same; yet, this does not matter: since any sensory perception alters the physical makeup of the brain, a constant brain would simply be dead.

Were we to accept the computationalist hypothesis about the brain/mind, we would have to call the “result” a state, or the “state” a result. Both alternatives feel weird, at least with respect to a dynamic entity like the brain, though they feel weird even with respect to arithmetic. There is no such thing in the brain as a finite algorithm that stops when finished. There are no “results” in the brain, something even hard-core reductionist neurobiologists would admit. Yet, again, exactly this determinability would have to be demonstrated in order to justify the reductionist’s usage of “state”; he can not refer to it as an assumption.

The misunderstanding is quite likely caused by the private experience of stability in thinking. We can calculate 73+54 with stable results. Yet, this does not tell us anything about the relation between matter and mind. The same is true for language. Again, the hypothesis underlying the claim of supervenience denies the difference between matter and information.

Besides the fact that the reductionist here runs into the same serious tactical difficulties as before, this is now a very interesting point, since it is related to the relation of brain and mind on the one side and actions and language on the other. Where do the words we utter come from? How is it possible to express thoughts such that they are meaningful?

Of course, we do not run a database with a dictionary inside our head. Not only do we not do so; with such a database it would not be possible to produce and to understand language at all, even to the slightest extent. Secondly, we learn language; it is not innate. Even the capability to learn language is not innate, contrary to a popular guess. Just think of Kaspar Hauser, who never mastered it better than a 6-year-old child. We need an appropriately trained brain to become able to learn a language. Were the capability for language innate, we would have no difficulties learning any language. We all know that the opposite is true, many people having severe difficulties learning even a single one.

Now, the questions of (1) how we become able to learn a language and (2) how to program a computer so that it becomes able to understand language are closely related. The programmer can NOT put the words into the machine a priori, as that would be self-delusory. Moreover, the meaning of something can not be determined a priori without referring to the whole Lebenswelt. That is the result of Wittgenstein’s philosophy as well as Putnam’s final conclusion. Meaning is not a mental category, given that it always requires several brains to create something we call “meaning” (emphasis on several). The words are somewhere in between, between the matter and the culture. In other words, there must be some kind of process that includes modeling, binding, symbolization and habituation, directed both to its substrate, the brain matter, and to its supply, the cultural life.

We will discuss this aspect elsewhere in more detail. Yet, for the reductionist trying to defend the usage of the concept of states for the description of mental affairs, this special dynamics between the outer world and the cognitively established reality, which embeds our private use of language, is the final defeat for state-oriented reductionisms.

Nevertheless, we humans often feel inclined to use that strange concept. The question is why we do so, and what the potential role of that linguistic behavior is. If we take the habit of assigning a state to the mental affairs of other people as a language game, a bunch of interesting questions come to the fore. These are by far too complex and too rich to be discussed here. Language games are embedded in social situations, and after all, we always have to infer the intentions of our partners in discourse, we have to establish meaning throughout the discourse, etc. Assigning a mental state to another being probably just means: “Hey, look, I am trying to understand you! Would you like to play the mutual interpretation game?” That is ok, of course, for the pragmatics of a social situation, like any invitation to mutual inferentialism [6], and like any inferentialism it is even necessary—from the perspective of the pragmatics of a given social situation. Yet, this designation of understanding should not mistake the flag for the message. Demonstrating such an interest need not even be a valid hypothesis within the real-world situation. Ascribing states in this way, as an invitation to infer my own utterances, is even unavoidable, since any modeling requires categorization. We just have to resist assigning these activities any kind of objectivity that would refer to the inner mental affairs of our partner in discourse. In real life, doing so is inevitably and always a sign of deep disrespect for the other.

In philosophy, Deleuze and Guattari, in their “Thousand Plateaus” (p.48), were among the first to recognize the important abstract contribution of Darwin by means of his theory. He opened the possibility of replacing types and species by populations, and degrees by differential relations. Darwin himself, however, was not able to complete this move. It took another 100 years until Manfred Eigen coined the term quasi-species as an increased density in a probability distribution. Talking about mental states is nothing but a fallback into Linnean times, when science was the endeavor of organizing lists according to an uncritical use of concepts.

Some Consequences

The conclusion is that we can not use the concept of state for dealing with mental or cognitive affairs in any imaginable way without stumbling into serious difficulties. We should definitely drop it from our vocabulary about the mind (and the brain as well). Assuming mental states in other people turns those other people into deterministic machines. Thus, doing so would even have serious ethical consequences. Unfortunately, many works by many philosophers are rendered mere garbage by mistakenly referring to this bad concept of “mental states.”

Well, what are the consequences for our endeavor of machine-based epistemology?

The most salient one is that we can not use digital computers to produce language understanding as long as we use these computers as deterministic machines. If we still want to try (and we do), then we need mechanisms that introduce aspects that

  • – are (at least) non-deterministic;
  • – produce manifolds with respect to representations, both on the structural level and “content-wise”;
  • – start with probabilized concepts instead of compound symbolic “whole-sale” items (see also the chapter about representation);
  • – acknowledge the impossibility of analyzing a kind of causality or—equivalently—states inside the machine in order to “understand” the process of language at a microscopic level: claiming “mental states” is a garbage statement, whether it is assigned to people or to machines (a rough sketch follows below).
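As a rough sketch of the last two points (all names are hypothetical, chosen only for illustration), a “concept” can be kept as a probability density over observed contexts rather than as a single symbolic item, with non-deterministic retrieval, so that no inspectable inner “state” corresponds to the concept:

```python
# A minimal sketch: a probabilized concept instead of a compound symbolic
# "whole-sale" item. Learning shifts a density; retrieval samples it, so
# repeated evocations yield a manifold, not a fixed answer.
import random
from collections import Counter

class ProbabilizedConcept:
    def __init__(self):
        self.contexts = Counter()          # observed usage contexts

    def observe(self, context):
        self.contexts[context] += 1        # learning shifts the density

    def evoke(self):
        # Retrieval samples the density; repeated calls need not agree.
        items, weights = zip(*self.contexts.items())
        return random.choices(items, weights=weights)[0]

swan = ProbabilizedConcept()
for ctx in ["white", "white", "bird", "lake", "white", "black"]:
    swan.observe(ctx)

print([swan.evoke() for _ in range(5)])   # a manifold, not a fixed answer
```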

Fortunately enough, we found further important constraints for our implementation of a machine that is able to understand language. Of course, we need further ingredients, but for now these results are seminal. You may wonder about such mechanisms and the possibility to implement them on a computer. Be sure, they are there!

  • [1] Hilary Putnam, Mind, Language, and Reality. Cambridge University Press, 1979, p. 346.
  • [2] Ilya Prigogine.
  • [3] Reaction-Diffusion-Systems: Gray-Scott systems, Turing systems.
  • [4] Grassberger (1988), Physica A.
  • [5] Hilary Putnam (1967), ‘The Nature of Mental States’, in Mind, Language and Reality, Cambridge University Press, 1975.
  • [6] Robert Brandom, Making It Explicit. 1994.

۞
