Descartes, updated.

December 27, 2012

Yes, I am a Cartesian. Well, at least abstractly and partially.

Why Descartes? And why update him? And why here, in this series about Urban Reason?

Well, there are roughly three reasons for that. Firstly, because he was the first to express the notion of method concisely. And that is certainly of some relevance to our collateral target, planning in the context of urban affairs. Secondly, because the still prevailing modernist thinking is soaked in Descartes’ rationalist ideas. Doing one thing after another, the strategy of divide and conquer, is essentially Cartesian. Thus, Descartes is still the secret hero among the functionalists and software programmers of our days. And the third reason, finally, for revisiting Descartes is that regarding the issues raised by planning and method we have to get clear about the problematics of rationalism1, quite beyond the more naturalist approach that we put forward earlier, aligning planning to the embryonic mode of differentiation. We again meet the “binding problem,” for on the one side Descartes’ “Methode” considers epistemic issues, but on the other neither planning nor method can be considered just a matter of internal epistemic stances. To put it in a more rhetorical manner: could we (i) plan thinking2, and could we (ii) expect to think a plan through completely?

Descartes, living in a transitional time between two great ages, between Renaissance and Enlightenment, expressed for the first time a strong rational “system”, thereby renewing and updating Plato’s philosophy. Dozing in the Portuguese sun, my ears filled with some deep house, I can imagine that today we are going to experience a kind of reverse passage, a trajectory through Descartes: back from a rationalist, logicist, mechanist way of thinking, full of abstract ideas detached from life (such as independence), towards the classic praise of vortices, broiling, emergence, creativity and the dignity of human practices, that is, of relating to each other in the first place. Among the first we will meet Leonardo, the timeless genius.

Figure 1. A vortex, in Leonardo’s imagination.


In short, it seems, in such daydreaming, that we are going to leave the (Roman) module, returning to Athenian figures.3 Of course, on this course we carry a backpack, and not a small one, filled with more recent philosophical achievements.4

Here in this essay, I will try to outline a possible update of Cartesian thinking. I tend to propose that modernism, and thus still large parts of contemporary culture, is strongly shaped by his legacy. Obviously, this also applies to the thinking of most people, at least in Western cultures.

Descartes brought us the awareness of method. Yet, his initializing version came with tremendous costs. Cartesian thinking implanted the metaphysical belief in independence into the further history of Western societies to come.5 For our investigation, it is the general question about method, mainly with regard to planning, that serves us as a motivational base. We will see whether it is possible to develop the Cartesian concept of method without sticking to his metaphysical beliefs and the resulting overt rationalism.

Still serving the same purpose as Descartes intended, namely to provide some update on the notion of method, in the end this update will turn out to be more like a major release, to borrow a notion from software production. While the general intention may still resemble Descartes’ layout, the actual mechanisms will be quite different, and probably the whole thing won’t be regarded as Cartesian any more by the respective experts.

But why should one, regarding plans and their implementation, bother with philosophy and other abstract stuff of similar kinds at all, particularly in architecture and urbanism? Isn’t architecture just about pretty forms and optimal functions, the optimal fulfillment of a program (whether regarding land-use or the list of rooms in a building), mingled with a more or less artful attitude? Isn’t urbanism just about properly building networks of streets and other infrastructure, including immaterial ones such as safety (police, fire, health) and legislative prescriptions for guiding development?

Let us listen to the voice of Vanessa Watson [3], University of Cape Town, South Africa, writing in an article published in 2006 (my emphasis):

The purpose of this article has been to question the appropriateness of much of the thinking in planning that relates to values and judgement. I argue that two main aspects of this thinking are problematic: a focus on process and a neglect of outcomes, together with the assumption that such processes can be guided by a universal set of deontological values, shaped by the liberal tradition. These aspects become particularly problematic in a world which is characterized by deepening social and economic differences and inequalities and by the aggressive promotion of neoliberal values by particular dominant nation-states. (p. 46)

Obviously, she is asking about the conditions of such implementation. In particular, she argues that one should be aware of values.

The notion of introducing values into deliberative processes is explored.  (p.31)

In fact, the area of planning6 is a hot spot for all issues around the question of what humans would like to “be”, to achieve. Not primarily as individuals (though this could not be neglected), but rather as a “group” in these ages of globalization.7 And many believe not only that human affairs are based on values, but also that this is necessarily so. Watson’s article is just one example of that.

Quite obviously, planning is about the future, and more precisely, about decision-making regarding this future. Equally obviously, it would be ridiculous to confine planning just to that. Yet, stating that ex-post is something very different from ex-ante, as Moroni [4] does in his review of [5], is not only insufficient, it is also struck by several blind spots, e.g. regarding the possibility of predictive modeling. Actually, bringing the ex-post and the ex-ante perspective into agreement is the only way to enable oneself for proper anticipation, as is well known in the financial industry and in empirical risk analysis. This is not only admissible in economic contexts; it has been demonstrated as a valuable tool in the digital humanities as well. Beyond that, it should be clear that a reduction to either the process or the outcome must be regarded as seriously myopic. What then is planning? (If there is a possible viable definition of it at all.)
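To make the point about matching ex-ante and ex-post a bit more tangible, here is a minimal sketch, entirely my own illustration and not part of the planning literature cited above, of a rolling backtest: each forecast made before the fact is later scored against the outcome observed after the fact. The naive forecasting rule and all numbers are hypothetical.

```python
# Minimal sketch of bringing ex-ante and ex-post "to a match": forecast each
# value from its past, then compare with the realized outcome. Hypothetical data.

def forecast(history):
    """Ex-ante: predict the next value from what is known so far (naive mean)."""
    return sum(history) / len(history)

def backtest(series, window=4):
    """Roll through the series, forecasting each point from its past window,
    then score the forecast against the ex-post outcome."""
    errors = []
    for t in range(window, len(series)):
        predicted = forecast(series[t - window:t])   # ex-ante view
        realized = series[t]                          # ex-post view
        errors.append(abs(predicted - realized))
    return sum(errors) / len(errors)                  # mean absolute error

# hypothetical yearly indicator, e.g. dwellings completed in a district
observed = [110, 118, 121, 135, 128, 140, 152, 149]
print("mean absolute forecast error:", backtest(observed))
```

The point is not the particular model; it is that anticipation only becomes assessable when the ex-ante and the ex-post perspective are systematically confronted with each other.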

Actually, looking into the literature there seem to be as many different definitions of planning as there are people calling themselves planners. In the community of those people there is a fierce discussion about it, even after more than a century of town planning offices. Different schools can be observed, such as rationalists (cf. [5]) or “radical hands-on practitioners,” the former believing in the possibility of pervasive comprehension, the latter denying the feasibility of theory and just insisting on manuals as collections of mystical hands-on recipes [6]. Others, searching for a kind of salvation, are trying to adopt theories from other domains, which poses at least a double-sided problem if neither the source, such as complexity or evolutionary theory, is properly understood (cf. [7], [8], [9]) nor the process of adopting it, as Angelique Chettiparamb has pointed out [10]. As a matter of fact, urban and regional planning still fails much too often, particularly in proportion to the size and the scope of the project, and a peculiar structure shows up in this failure: the lack of a common structure across planning projects. One of the surface reasons complicating the subject matter is certainly the extended time horizon affected by the larger plans. Of course, there is also the matter of scale. Small projects often succeed: they are completed within budget and within time, they look as designed, and clients are lastingly satisfied. Yet, this establishes swarms of independent planning and building, which, according to Koolhaas, leads to Junkspace. And we should not overlook urban sprawl, which many call the largest failure of planning. Swarms of small projects, even if all of them were successful, can’t replace large-scale design, it seems.

In other words, the suspicion is that there is a problem with the foundations, with the concepts buried in the idea of planning, the way of speaking, i.e. the performed language games, and probably even with the positioning of the whole area, with the methods, or with all of those issues together. In agreement with Franco Archibugi [5] we may conclude that there are two main challenges: (i) the area of planning is largely devoid of a proper discourse about its foundations and (ii) it is seriously suffering from the binding problem as well.

The question about the foundations is “foundational” for the possibility of a planning science at large. Heidegger mentioned in “Sein und Zeit” ([11], p. 9):

Even as the significance of scientific research is always given in this positivity, its actual progress is accomplished not so much through the collection of results and their storage in “manuals” as through the asking for the basic constitution of the respective domain, an asking that will mostly be perceived as reactively driven out of the increasing technical expertise fixed in such manuals.

…and a few sentences later:

The level of a science is determined by its capability for a crisis of its foundational concepts.8

Nowadays, we can even understand that this crisis has to be an ongoing crisis. It has to be built into the structure of the respective science itself, such that the “crisis as event” is not possible any more. As an example we will not only cast a glance at biology, we will even assimilate its methodological structure.

I believe that all those methodological (meta-)issues can’t be addressed separately, and also not separately from so-called practical issues. Additionally, I think that in the case of an investigation that reaches out into the “social”, the question of method can’t be separated from the question about the relation between ethics and planning, or from its target, the Urban (cf. [12]). Such a separation would implicitly follow the structure of reductionist rationalism, which, of course, we have to avoid as a structural predetermination. Therefore I decided to articulate and to braid these issues, in a first round, all together into one single essay, even at the cost of its considerable length.9

The remainder of this essay revolves around method, plan and their vicinity, arranged into the following sections (active links):

1. Method a la Carte(sian)

Descartes meant to extend the foundations devised long before him by Aristotle. The conviction that some kind of foundation is necessary and possible is called foundationalism. In his essay about Descartes’ epistemology [13], Newman holds that

The central insight of foundationalism is to organize knowledge in the manner of a well-structured, architectural edifice. Such an edifice owes its structural integrity to two kinds of features: a firm foundation and a superstructure of support beams firmly anchored to the foundation. A system of justified beliefs might be organized by two analogous features: a foundation of unshakable first principles, and a superstructure of further propositions anchored to the foundation via unshakable inference.

In Descartes’ own words:

Throughout my writings I have made it clear that my method imitates that of the architect. When an architect wants to build a house which is stable on ground where there is a sandy topsoil over underlying rock, or clay, or some other firm base, he begins by digging out a set of trenches from which he removes the sand, and anything resting on or mixed in with the sand, so that he can lay his foundations on firm soil. In the same way, I began by taking everything that was doubtful and throwing it out, like sand … (Replies 7, AT 7:537)

Here the reference to architecture is a homage to Aristotle, who also used architecture as a kind of structural template. The big question is whether such a stable ground is possible in the realm of arguments. If not, a re-import of the expected stability won’t be possible, of course. The founder of mechanics, Archimedes, already mentioned that given a stable anchor point he could move the whole world. For him it was clear that such a stable point of reference is to be found only for local contexts.

In his “Discours de la Methode” Descartes distinguished four precepts, or rules, about how to achieve a proper way of thinking.

(1) The first was never to accept anything for true which I did not clearly know to be such; that is to say, carefully to avoid precipitancy and prejudice, and to comprise nothing more in my judgment than what was presented to my mind so clearly and distinctly as to exclude all ground of doubt.

(2) The second, to divide each of the difficulties under examination into as many parts as possible, and as might be necessary for its adequate solution.

(3) The third, to conduct my thoughts in such order that, by commencing with objects the simplest and easiest to know, I might ascend by little and little, and, as it were, step by step, to the knowledge of the more complex; assigning in thought a certain order even to those objects which in their own nature do not stand in a relation of antecedence and sequence.

(4) And the last, in every case to make enumerations so complete, and reviews so general, that I might be assured that nothing was omitted.

Put briefly, and in a modernized shape, he demands that we follow these principles:

  • (1) Stability: proceed only from stable grounds, i.e. after excluding all doubts;
  • (2) Additivity: practice the strategy of “divide & conquer”;
  • (3) Duality: not to mistake empirical causality for logical sequence;
  • (4) Transferability: try to generalize your insight, and apply the generalization to as many cases as possible.

Descartes proposes a certain “Image of Thought”, as Deleuze would call it much later, in the 1960s.10 There are some important objections to these precepts, of which Descartes, of course, could not have been aware. It took at least two radical turns (the Copernican by Kant, the Linguistic by Wittgenstein) to render those problems visible. In the following we will explicate these problems around Descartes’ four methodological precepts in a quite brief manner.

ad (1), Stability

There are two important assumptions here. First, that it is possible to exclude all doubts; second, that it is possible to use language in a way that would not be vulnerable to any kind of doubt. Meanwhile, both assumptions have been destroyed, the first by Gödel and his incompleteness theorem, the second by Wittgenstein with his insistence on the primacy of language. This primacy makes language, as languagability, a transcendent (not: transcendental!) entity, such that it is even a priori to any possible metaphysics. There are several implications of that, first regarding the meaning of “meaning” [14]. Surprisingly enough, at least for all rationalists and positivists, it is untenable to think that meaning is a mental entity, as this would lead to the claim that there is something like a private language. This has been excluded by Wittgenstein (see also [14][16]), and all the work of the later Putnam is about this issue [17]. Language is fundamentally a “communal thing,” both synchronically and diachronically. Frankly, it is a mistake to think that meaning could be assigned or that meaning would be attached to words. The combined rejections of Descartes’ first precept lead us to the primacy of interpretation. Before interpretation there is nothing. This holds even for what usually is called “pure” matter. A consequence of that is the inseparability of form and matter, or if you like, information and matter. It is impossible to talk about matter without also talking about information and form. For Aristotle, this was a cornerstone. Since Newton, many have lost their grip on that insight.

ad (2), Additivity

This inconspicuous rule is probably the most influential one. In some way it dominates even the first one. This rule would set out the framing for positivism. The claim is basically that it is generally possible, that is, for any kind of subject in thinking, to understand that subject by breaking it up into as many parts as possible. Nothing would be lost by breaking it up. In the end, we could recombine the “parts of understanding” into a combined version. If this property is assigned to an empirical whole11, it is usually called “additivity” or “linearity”.

By this rule, Descartes clearly sets himself apart from Aristotle, who would certainly have refused it. For Aristotle, most things could not be split into parts without losing their quality. The whole is different from the sum of its parts. (Metaphysics VII 17, 1041b) From the other direction this means that putting things together always creates something that hasn’t been there before. Today we call this emergence. Yet, we have to distinguish different kinds of emergence, as we have to distinguish different kinds of splitting. When talking about emergence and complexity, we are not interested in emergence by rearrangement (association) or by combination (water from hydrogen and oxygen), but rather in strong emergence, which opens a new organizational level.
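As a small, purely illustrative aside (my own toy example, neither Descartes’ nor Aristotle’s), the failure of the “divide & conquer” precept for non-additive wholes can be stated very compactly: as soon as the parts interact, evaluating them in isolation and re-adding the results misses exactly that interaction.

```python
# Toy illustration of non-additivity: a "whole" with an interaction term.
# Splitting it, evaluating the parts in isolation and re-adding the results
# loses the interaction between the parts.

def whole(a, b):
    # hypothetical system: the parts interact (the a * b term)
    return a + b + a * b

def sum_of_parts(a, b):
    # "divide & conquer": evaluate each part alone, then add the results
    return whole(a, 0) + whole(0, b)

a, b = 2.0, 3.0
print(whole(a, b))          # 11.0 - the whole
print(sum_of_parts(a, b))   # 5.0  - the recombined parts; the interaction is lost
```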

The additivity of things in thought as well as of things in the world is a direct consequence of the theological metaphysics of Descartes. For him, man had to be independent from God in order to be a being capable of, and responsible for, reason.

He [God]… agitated variously and confusedly the different parts of this matter, so that there resulted a chaos as disordered as the poets ever feigned, and after that did nothing more than lend his ordinary concurrence to nature, and allow her to act in accordance with the laws which he had established.

There are general laws effective in the background, as a general condition, but there is no direct action of the divine principle anymore. In other words: in his actions, man is independent from God. By means of this belief in metaphysical independence12, Descartes and Leibniz, who thought similarly (see his Theodicy), became the founders and grandfathers of modernism as it still prevails today.

ad (3), Duality

Simply great. The issue has been rediscovered, and of course extended and deepened, by Wittgenstein. Wittgenstein was the first ever to understand that logic is transcendent. There is neither a direct way from the world into logic, nor from logic into the world. It is impossible to claim truth values for worldly entities. Doing so results in the implicit claim that the world could be described analytically. This has been the position of idealist rationalists and positivists. Note that it is not a problem to behave rationally, but it is definitely a problem to claim this idealistically as a norm. For this would exclude any kind of creativity or inventiveness.

Descartes did not recognize that his third precept contradicts his second one at least partially. Neither did Aristotle with his conceptualization of the whole and the claim that the truth could be recognized within the world.

ad (4), Transferability

Also a great principle, which is still valid. It rejects what today is known as the case study (the most stupid thing positivism has brought along).

Yet, this also has to be extended. What exactly happens when we are generalizing from observations? What happens if we apply a generalization to a case? We already discussed this in detail in our contemplation about comparison.

One of the results that we found there is that even the simplest comparison needs something that is not empirical, something that cannot be found by just looking (staring?) at it. It not only implies a concept, it also requires at least one concept that is a priori to the comparison, or likewise to the observation. The next step is to regard the concept itself as a quasi-material empirical thing. Yet, we will find the same situation again, though this does not establish circularity or a regress!

In order to apply an already established generalization, or a concept, we need some rules. These could take the form of a model of some kind. The important thing then is to understand fully that concepts and generalizations cannot be analytical. Hence there are always many ways to apply a generalization. The habit of selecting a particular style for the instantiation of a concept I have called orthoregulation. In Kantian terms we could call these forms of construction, mirroring his forms of intuition (or schemata).

It is this inevitability of the manifold instantiation of abstractions, ideas or generalizations which idealist rationalism does not recognize, and by which it fails in the most serious way, its mistake being the claim that there is a single “correct” way to apply a concept.

2. Foundation, now

Descartes clearly expressed that the four parts of the method are suitable for following first principles, but not sufficient for finding the first principle. For that he devised his method of doubt. Yet, after all, this, as well as his whole foundationalist systematics, was in need of being anchored in God.

But what if we tried to follow the foundational path without referring to God?13 Setting something else as a first principle is not suitable outside of mathematics or logic. In the case of the former we call it an axiom, in the case of the latter a tautology. In a kind of vertigo both areas still struggle for a foundation, searching for a holy grail that can’t exist. Outside of mathematics, it is quite obvious that we can’t set an axiom as a first principle. How would we justify it?

Now we have met the really important question. If we can’t answer it, so it was thought, any knowledge would immediately become subject to the respective circumstances, implying a kind of tertiary chaos, deep relativity and arbitrariness. Yet, while the question is important, somewhat surprisingly the answer is irrelevant. For the question is ill-posed, and its very misguidedness constitutes its importance. There is no absolute justification, thus there is no justification at all, and in turn the question is based on a misbelief.

This does not mean, however, that there is no foundation in the sense that there is nothing beyond (or: behind) this foundation. In our essay “A Deleuzean Move” we presented a possibility for a self-referential conceptualization of the foundation that provides a foundation without being based on a first principle. Of course, there are still requirements. Yet, all required positive-definite items or proposals, such as symbols or logic, become part of the concept itself and are explained and dissolved by it. The remaining conditions are identified as transcendent: modelity, conceptuality, mediality and virtuality. Each of them can be translated or transposed into actual items, and in each “move” all of them are invoked to some, varying degree. These four transcendent and foundational conditions for thought, ideas and language establish a space whose topology is hyperbolic, embedding a second-order Deleuzean differential. Altogether we called it the choreostemic space, because different styles of human activity create more or less distinct attractors in this space.

Thus, the axiomatic nature of Descartes’ foundation, which we may conceive as a proposal based on constants, is changed into a procedural approach without any fixed point. Instead, the safety in the ocean of possible choreostemic forms derives solely from the habits of thought as they are practiced in a community. The second-order differential prevents this space from becoming representational, as it requires a double instantiation. It can’t be used to map or project anything into it, including intentions. Nevertheless it records the style of unfolding intentions, wishes, stories, informational activities etc., and renders different styles comparable. These styles can be described as distinct dynamics in the choreostemic space, between the transcendent entities of concept, model, mediality and virtuality.

This choreostemic space traces the immanence of thought and the relation between immanence (of creation), transcendence (of condition) and the transcendental (of the outside). This outside is beyond the border of language, but for the first time it appears as an imaginary. Note that the divine and the existential are both in this outside, yet in different virtual directions. Neither God nor existence is conceived as something we could point to, or about which we could speak by means of actual terms. And at least for the existential it does not make much sense to doubt it. Here we agree with Descartes as well as with Wittgenstein. Although we can’t say anything about it, we can traverse it. We always do so when we experience existential resistance, like an astronaut in a Space Shuttle visiting the incompatible interplanetary zone. Only limited trips are possible; we always have to return into an atmosphere.

Saying that the choreostemic space establishes a self-referential foundation implies that it is also critical (Kantian), and even meta-critical (Post-Kantian), yet without being doomed to idealism (Fichte, Frege) or totality (Hegel) and the logicistic functionalism implied by those.

Above we mentioned that the transcendent elements of the choreostemic space, namely model, concept, mediality and virtuality, can be transposed into actual items. This yields a tremendous advantage of the choreostemic space. It does not just dissolve the problem of ultimate justification without sacrificing epistemic stability, it also bridges the rather wide gap between transcendence and application. To put it into simple terms, the choreostemic space just reflects the necessity of the social embedding of modeling, the role of belief and potential in the actual moves we take in the world, and finally the importance of concepts, which can be conceived as ideas detached from the empiric constitution (or parts) of language. In discourses about planning as well as in actual planning projects this 4-fold vector describes nothing less than a proper communicational setup that is part of goal-directed organizational processes.

There are some interesting further topics that can be derived from this choreostemic space, which you can find in the main essay about it. The important message here is that a constant, a metaphysical axiom, gets completely dissolved in a procedure that links the informational of the individual with the informational of the communal.

3. Method, now

3.1. …Taken Abstract

Method is not primarily an epistemological issue, such as models or concepts, or modelity and conceptuality, respectively. It combines rules into a whole of procedures and actions such that this whole can be seen as the operational equivalent of a goal or purpose. As such, it refers to action, strategy and style, thus to aesthetic issues. Hence, also to creativity and its hidden companion, formalization. Although the aspect of reproducibility is usually strongly emphasized, there is also always an element of open experimentation in the “methodological,” allowing one to “harvest” the immanent potential, far beyond the encoding and its mechanistic implications. This holds even for thinking itself.

Descartes, of course, and similarly Kant later, clearly addressed the role of projected concepts as a means of “making sense,” where these projections do not correspond to some assumed necessity hosted by the object(s). As part of the third precept of performing method he writes (see above):

“…   assigning in thought a certain order even to those objects which in their own nature do not stand in a relation of antecedence and sequence.”

Objectively, logically confirmed stable grounds are not part of methodological arrangements any more. There is some kind of stability, of course, yet this comes just as a procedural regularity, which is dependent on the context. In turn, this allows one to evade analyticity in favor of adaptivity.

Any method thus comprises at least two different levels of rules, though usually there are quite a few more. The first will address the factual re-arrangement, while the second (let us call it the upper) level is concerned with the regularization of the application of the rules of the first level, as well as with the integration of the rather heterogeneous set on the lowest level. Just think about a laboratory, or the design and implementation of a plan in a project, to get a feeling for the very different kinds of subjects that have to be handled by and integrated into a method. The levels are tightly linked to each other; there is still a link to empiric issues on the second level. Thus there are not too many degrees of freedom for the rules on the upper level.

Saying this, we have already introduced, and actively built upon, a concept that was not available to Descartes: information. Although it could be traced in his 3rd and 4th precepts, information as a well-distinguished category was not available before the middle of the 20th century. Itself being dependent on the notions of the (Peircean) sign and of probability, information does not only allow for additional levels of abstraction, it also renders some important concepts accessible which otherwise would remain completely hidden. Among those are a clear image of measurement, the reflection about rules, the reflection about abstraction itself (think about the Deleuzean Differential), the proceduralization, accumulation, transformation and re-distribution of executive knowledge, associative networks, distributed causes, complexity, and the distinction between reversibility and irreversibility. All those conceptual categories are highly relevant for a theory of planning. None of them could be found explicitly and appropriately assimilated so far in the literature about planning (as of the end of 2012).

These categories provide us with a vantage point that opens the possibility for a proper formulation of “method”, where “proper” means that it could be appropriately operationalized and instantiated into practical contexts. We can say that…

Methods are structured collections of more or less strict rules that organize the transformational flow of items.

These items could be documents, data, objects in software, material objects, but also ideas and concepts. In short, and from a different angle, anything that could be symbolized. In the context of planning, any of those particular kinds may be involved, since planning is the task of effectively rearranging matter, stocks and flows embedded into a problematic field spanning from design [19] and project management to logistics and politics. There is little sense in wrangling about the question of whether design should be included in planning and planning theory or not [1]. Or whether one should follow a dedicated rationalist route or not [4].
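To make the definition above a bit more tangible, here is a minimal sketch, entirely my own illustration with hypothetical rule names and items, of a method as a structured collection of rules organizing the transformational flow of items; the two levels of rules mentioned earlier are marked in the comments.

```python
# A "method" as a structured collection of rules that organizes the
# transformational flow of items (here: planning documents). All names are
# hypothetical illustrations.

from typing import Callable, Iterable, List

Rule = Callable[[str], str]

# first level: rules that factually transform the items
def normalize(item: str) -> str:
    return item.strip().lower()

def annotate(item: str) -> str:
    return f"{item} [reviewed]"

def archive(item: str) -> str:
    return f"archived({item})"

def apply_method(items: Iterable[str], rules: List[Rule]) -> List[str]:
    """Upper level: regulates how the first-level rules are applied --
    here simply in a fixed order to every item; a richer method could
    make the order and the selection context-dependent."""
    results = []
    for item in items:
        for rule in rules:
            item = rule(item)
        results.append(item)
    return results

documents = ["  Draft Plan A  ", "  Zoning Note B "]
print(apply_method(documents, [normalize, annotate, archive]))
```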

Such questions derive mainly from two blind spots. Firstly, people are obviously caught in a configuration ruled by the duality of “context” and “definition”. It is not that the importance of context is not recognized. Fortunately, the completely inadequate and almost stupid response of leaning towards case-based reasoning, case studies or casuistics (cf. [20]) is quite rare.14 Secondly, planning seems to be conceived implicitly as something like an external object. Only objects can be defined. Yet, objects are created by performing a definition, and this “act of defining” is in itself strongly analytical. Conceptual work lies outside the work of the definition. Who, besides orthodox rationalists or logical positivists, would claim that planning is something analytical? As a further suspicion we could already add that there are quite strong hints that favor a grand cultural hypothesis for planning.

3.2. … from the Domain Perspective

In order to get clear about this we could look for an example from another domain where the future, as in planning, is also a major determinant. Hence, let us take the science of biology. Organisms are settled in a richly structured temporal space, always engaging with the future, on any scale. The reason is quite simple: those which didn’t do so sufficiently, be it as a species or as an individual, do not exist any more.

Biology is the science of all aspects of living entities. This definition is pretty simple, isn’t it? Yet, it is not a definition, it is a vague description, because it is by no means clear what “life” should mean. Recent textbooks on biology do not contain a definition of life anymore. So, how is biology structured as a science? Perhaps you know that physicists have claimed ever since Darwin that biology isn’t a “science” at all, because of its proclaimed lack of “laws” and of the respective abstract and formal generalizations. They always get puzzled by the huge amount of particularities, the historicity, the context-specificity, the individuality of the subjects of interest. So, we can clearly recognize that a planning science, whatever it will turn out to be, won’t be a science like physics.

It is not possible to describe here all the relevant structural aspects of biology as a science and the respective approaches and attitudes. Yet, there is a kind of initiation of biology as a modern science that is easy to grasp. The breakthrough in biology came with Niko Tinbergen’s distinction of the four central vectors of, or perspectives in, biological thought:

  • (1) ontogenesis (embryology, growing up, learning),
  • (2) physiology,
  • (3) behavior, and
  • (4) phylogenesis (evolution).

The basic motivation for such a distinction arose from the differences regarding the tools and approaches for observation. There are simply different structures and scales in space-time and concept-space, roughly along the lines Tinbergen carved out. From the perspective of the organism, these four perspectives could be conceived as “functional compartments”. Later, this concept of the functional compartment was applied with considerable success in cell biology. There, people called them genome, transcriptome, proteome, etc., in order to organize the discourse. Meanwhile it became obvious, however, that this distinction is not an analytic, i.e. “idealistic” one, since in cells and organisms we find all kinds of interaction across any number of integrative organizational “levels”.

Each of these areas started with some kind of collecting, followed by taxonomies in order to master the particularity. Since the 1970s, however, there has been an increasing trend towards mathematical modeling. Techniques (sometimes fuzzily also called methods) comprise probabilistic modeling, Markov models, analytic modeling such as the Marginal Value Theorem in behavioral ecology [21], any kind of statistics, graph-based methods, and data-based, i.e. empirical, classification by means of clustering, often in combination. These techniques are used for deriving concepts.
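Just to give a flavor of what “analytic modeling” means here, the Marginal Value Theorem in its standard textbook form (my own recapitulation, not a quotation from [21]) says that a forager should leave a patch at the residence time $t^{*}$ at which the instantaneous gain rate has dropped to the average gain rate over the whole cycle of travel and foraging:

$$ g'(t^{*}) \;=\; \frac{g(t^{*})}{\tau + t^{*}} $$

where $g(t)$ is the cumulative gain after residence time $t$ in the patch and $\tau$ is the average travel time between patches. The point relevant for planning is that such a compact condition is derived from a concept (optimal allocation of a “currency”, here time and energy), not from a case study.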

Interestingly, organisms and their populations are often described (i) in terms of a “currency”, which in biology is time and energy, and (ii) in terms of “strategies,” on the individual as well as on the collective level. A famous example is the concept of the evolutionarily stable strategy (ESS), put forward by Maynard Smith in the 1970s [22].

As a fifth part of biology we could nowadays add the particular concern about the integration of the four aspects introduced by Tinbergen. The formal study of this integration is certainly given by the concept of complexity.15

Whatever the final agreement about planning and method in Urban16 Affairs will comprise, it is pretty certain that there won’t be a closed definition of planning. Instead, almost certainly, we will also see agreement on some kind of “Big four/five” perspectives. In the next section we are going to check out the possibility of an extension of it. Note that taxonomy is not one of those! And although there are myriads of highly particular descriptive reports, biology never engaged in case studies.

3.3. The Specialty…

No question, the pragmatic approach of separating basic perspectives without sacrificing the idea of integration has been valuable for the development of biology. There are good chances that the adoption of these perspectives, carried out appropriately, that is, not in a representationalist manner, will be fruitful for the further development of the domain of planning and planning theory. There is at least a kind of homeomorphism: in both areas we find a strong alignment to the future, which in turn means that adaptivity and persistence (sustainability) also play an important role.

The advantage of such a methodological alignment would be that planning theory would not have to repeat all the discussions regarding the proper concepts of observation. Planning could even learn from the myriads of different strategies of natural systems. For instance, the need for compartmentalization. Or the fact that the immediate results of initial plans (read: genes and transcripts) are in need of heavy post-processing. Or the reliability of probabilistic processes. Or the fact that evolutionary processes are directed towards increased generality, despite their basic blindness.

Yet, there are at least two large differences to the domain of planning. Firstly, planning takes place as a symbolic act in a culture, and secondly, planning involves normative structures and acts, at which we will take a closer look below. Both aspects are fundamentally different from the perspectivism in biology insofar as they don’t allow for a complete conceptual externalization, as is the case with biological subjects. Quite to the contrary, symbols and norms introduce a significant self-referentiality into all methods regarding method and planning in the context of the Urban.

Thus, in addition to the 4+1 structure that we could adopt from biology for dealing with the externalizable aspects, we need two further perspectives that are suitable for dealing with the dynamics of symbols and the normative. For the first one, we have already proposed a suitable structure, the choreostemic space. Two notes about that. First, the choreostemic space could be turned into a methodological attitude. Second, the choreostemic explicitly comprises potential and mediality as major determinants of any “worldly” move, besides models and concepts. The further issue of normativity we will discuss in the next section.

Meanwhile, we can finally formulate what method could mean in the context of the Urban. First, our perspectives for dealing with the subject of “planning,” the subjects of planning, and the respective methods would be the following (read 1 through 4 in parallel to Tinbergen’s):

  • (1) genesis of the plan and genesis of the planned;
  • (2) mechanisms for implementation, mostly considering particular quasi-material aspects, and mechanisms in the implemented;
  • (3) behavior (of individuals, groups, and the whole) and social dynamics, during planning and in the implemented arrangement;
  • (4) adaptivity, persistence, sustainability and evolution of plans and the planned;
  • (5) Choreostemic of concepts and interaction, in planning and in the planned;
  • (6) Ethical and moral considerations;
  • (7) Integration of planning and the planned as a complex system (see also below).

Within these perspectives, particular methods and techniques will evolve. Yet, we could also bundle all of it into a single methodological attitude. In any case we could say that…

Methods are collections of more or less strict rules that organize the transformational flow of items, where these collections are structured along basic perspectives.

3.4. …and the (Notorious, Critical) Game

Last, but not least, “method” is a language game, of course, I would like to add. As usual, several implications immediately follow. First, it is embedded into a Form of Life. Methods are by no means restricted to rationalism or the famous “Western perspective”. Any society knows language, rules and norms, and thus also regularity. Of course, the shape of the method may differ considerably. Yet, from the concept as we propose it here, these differences are just parameters. In terms of the choreostemic space, methods result in different attractors in a non-representative metaphysical space of immanence.

This brings us to the second implication: the language game “method” is a “strongly singular term”. We can’t do anything without it, not even thinking in the most reduced manner, let alone a combined action-thinking. “Method” is one of those pervasive constructs in the basement of culture. Moreover, as a strongly singular term it introduces self-referentiality, and hence an immanent creativity. Thus the third implication: whenever we use a method, we have to apply it critically. This basically means that there is no method without a clear indication of its conditions.

Regarding our concept of Generic Differentiation and its trinitary way of actualizing change, we thus have to expect that we will find the “method aspect” everywhere, no matter whether we take the perspective of the planning process or that of the planned. In order to illustrate this aspect using a metaphor, let me refer to the structure of atoms and molecules, particularly to the concept of the electron orbital. Orbital electrons are responsible for the electromagnetic binding forces between atoms in molecules. It is through these electrons that molecules (and also metals and crystals) can exist at all.

Figure 2: the so-called orbitals of the outer electrons of the atoms in a molecule of CO2, showing their importance in building molecules from atoms. The lobes (yellow, blue, green) should not be taken as well-defined 3-dimensional material volumes. They rather indicate fuzzy regions of increased probability of meeting an electron if a measurement were taken.


Similarly, methods, as elements of choreostemic moves, may be conceived as the mediators of binding forces between the aspects involved in thinking about differentiation.

Our concept of Generic Differentiation allows us to overcome the wrong distinction between theory and practice. While the true dualism consists of theory or practice on the one side and performance on the other, it is still necessary to clarify the relation between theory, model and operation. We have already derived that theories may beneficially be conceived as orthoregulating milieus for assembling models. But still, this is only a condition. I think that the relation between theory and structural models on the one side, and predictive/operational models on the other, concerns a question that points right to the heart of actualization: how to organize interpretation? Again we meet a question that is invisible to rationalists and modernists17 alike, since both are blind to the necessity of forms of construction and the implied freedom, or manifoldness of choice, respectively. This issue of how to organize interpretation concerns, of course, all phases and aspects of planning, from creating the plan to living in the implemented plan.

4. Grand Cultural Perspective

Franco Archibugi is completely right in emphasizing that planning is pervasively relevant [5]. Planning of xyz is not just relevant for the subject xyz, where xyz could be something like land-use, city-layout, street planning, organizational planning, etc.

In other words, it [m: planning] is a system that concerns the entire social life and includes all the possible decision-makers that act within it. It is a holistic system. 18

So far, so good. He is also right in criticizing the positivistic approach to planning, which, according to him, has been prevalent in planning until recently. Yet, although in his book he describes a lot of reasonable means and potential practices for an improved choreography of planning, from institutions down to consulting, it is not really an advance to replace the positivist attitude with a functionalist one, claiming that planning has to follow the paradigm of “programming”.

Among other weaknesses, such as a weird concept of theory and theoricity, leading to rather empty distinctions like theory on, of and in planning and to the mistake of mixing case studies with story-telling, Archibugi is almost completely unaware of the ethical dimension and/or its challenges, apparently hoping to cover the aspect of difference and divergence by means of institutions. Since he believes in penetrating comprehensibility, complexity and self-referentiality didn’t make it into his treatise either, even if we consider it only in the limited way the mainstream uses it. Although he wants to depart from the positivist approach in his outline of “the first routes of the new discipline,” he proposes an “operational logical framework” which integrates and unifies all types, forms, and procedures of planning.19

Therein, Archibugi surely counts as an arch-rationalist, a close relative of the otherworldly stories published by Luhmann and Habermas. Yet, we certainly can’t apply pervasive rationalism to the design of this “system”. Social life can’t be planned and, more importantly, it should not be planned, as the inherent externalizing perspective introduced by plans implies treating human beings as means.20

Our support of the grand cultural attitude is rooted quite differently. In this series of essays about the Urban (with a capital “U”, see footnote 16) we have been trying to find support for the concept of Urban Reason. Basically, this concept claims that human reason is strongly shaped or even determined by the embedding culture, which today, as a matter of fact, is urban culture. In short, human reason is itself a cultural phenomenon. One could indeed argue that this follows quite directly from Wittgenstein’s philosophy and the extensions provided by the late Putnam: any rule-following is deeply anchored in the respective Form of Life; any human thinking, which is largely based on language, hence has the communal as one of its main components. As a consequence of the increasing weight of urban culture, which has meanwhile turned into a dominance even over the nation state, human reason is strongly shaped by the Form of Life of urban citizens. This holds for every tiny bit of the surface of planet earth, of course, even for a tribal community that has never been in contact with modern forms of human social organization.

The quality of the Urban can’t be separated any more from human reason, thus from human culture at large. Everything we do around the Urban and within the Urban contributes to culture. This we call the Grand Cultural Hypothesis. In Deleuzean terms we could say that the Urban could be conceived as a distributed, process- and population-based, probabilistic plane of immanence. Regarding our extension of this Deleuzean concept, the Choreostemic Space, we could also say that the Urban establishes a particular attractor in it.

We could even extend this Grand Cultural Hypothesis by stating that all the institutions we nowadays rate as cultural emanations have always been urban. Things like writing, numbers, newspapers, books, astronomy, guilds, printing, operas, stadiums, open source, bureaucracy, police, power or governmentality could have emerged only in those arrangements we call the city. We have already discussed this elsewhere and won’t repeat it.

The argument here is that the Urban is a particular form of dealing with differentiation. In turn, designing, or at least establishing, a particular way of dealing with differentiation and of inducing differentiating processes circumscribes what could be labeled a particular culture. Urban differentiation processes rarely engage with physical constraints, for the Urban introduces an emancipation from them, and people immersed in the Urban invent things like money and insurance. In other words, the Urban provides a stable platform for safe-guarded experimentation with cultural goods, inventing also methods and conditions for experimenting. Thus, even the very notion of method, as opposed to tradition, has been shaped by the Urban.

All this is not really surprising. It is well known that cities are breeding grounds for symbolization and sign processes. The Urban creates its own mediality. The Urban puts differentiation onto its stage; it invokes an almost cinematographic mise-en-scène of differentiation21. This result strongly contradicts the Cartesian and rationalist expectation that it would be possible to plan (aspects of) the city. Planning must be considered as just one of the three modes of differentiation, besides evolution and learning. Believing in the possibility and sufficiency of an a priori determinability just means mistaking the embryo for the fully fledged animal.

Obviously, the weighting of the three forms of actualization of differentiation is an act of will, albeit so far this could be observed only in very rare cases22. This irreducible trinity in differentiation should, however, not be assigned just to individuals. It is a matter of politics and the collective as well, though this introduces a completely new level of negotiation into politics for most countries (except Switzerland, perhaps). Yet, it is probably the only form of politics that will remain in a truly and stably enlightened society. Each particular configuration of the above-mentioned trinity will exert rather specific constraints and even consequences. A first benefit of our extended concept of Generic Differentiation concerns the possibility and the mode of communicating the qualitative consequences of implementing certain designs.

The great advantage of talking at this level of abstraction is that the problematic field can be relieved of the collision of “values” and facts. It is accessible through the Differential23, that is, a vertical speciation (in contrast to Descartes’ method and also to deconstructivism, both of which apply horizontal differencing only). Values and facts are not disregarded completely by rigorous linguistic hygiene, as Latour suggests. They are just not taken as a starting point. One should acknowledge that values and facts are nothing but a kind of shortcut in thinking, taken when thinking becomes a bit lazy.

Another advantage is that there is no longer any possibility of playing outcome (by any means) off against process (towards an open end). They are now deeply integrated into Generic Differentiation. This does not exclude indicative measures for the quality of a city or its neighborhoods, whether regarding more general issues like adaptivity, or more concrete ones like the development or relative level of attractiveness as measured by the monetary value of the cells in a district. It should be clear, however, that it is impossible to define short-term outcomes, e.g. as the “result” of the implementation of a plan. We could even say that measuring the city could be done in almost arbitrary ways, as long as there are many measures, the measures address various organizational levels, and the measures are stable across a long period of time.

All this allows us to rethink planning. It will have a profound effect on the self-perception of planners and the profession of planning at large. Calls like that put forward by Vanessa Watson, demanding “respect for cultural differences” [1], become dispensable, at least. We can see that they even lead to a false emphasis on identity, revitalizing the separation into process and outcome against their own intentions.

Starting with the primacy of difference, in contrast, allows us to bring in evolutionary aspects in a completely self-conscious manner. Difference is nothing that must be respected or created. It must be deeply braided into the method, not into the corporeality of people as a representationalist concept. More exactly, as deeply as possible, that is, as a transcendent principle. It is more or less cant to proclaim “be different” or “rescue difference”, as this implies the belief in transcendental identity and logicism.

But now it is urgent to discuss the issue of ethics regarding planning and methods.

5. Values, Ethics, and Plans

No doubt, our attitudes towards our own future(s) are not shaped by contextual utility alone, and some overarching (idealistic) rationality may play only a partial role as well. From the background, or if you prefer: subliminally, a rich and blurry structure determines our preferences, hopes and intentions. Usually, this sphere of personal opacity is also thought to comprise what often is called values. Not surprisingly, values also appear in the literature about planning (cf. [24]24).

Undeniably, planning is in need of ethics25 and moral standards [25]. Yet, the area is a rather difficult one, to say the least. Rather well-known approaches like that proposed by Rawls (based on the abstract idea of justice), rationalism, or utilitarianism are known to be either defective, not suitable for contemporary challenges, or both. Furthermore, it is difficult to derive moral standards from the known philosophical theories. Fortunately, there is an alternative. Yet, before we start we have to shed some light on the rhetoric implied by the notion of “plan”.

5.1. Language Games

In the context of the concept of Generic Differentiation we have already identified the “plan” and the respective notion of “development” as just one of the three modes of differentiation (development, evolution and learning), which can neither be separated from each other nor be reduced to each other. It is just a matter of relative weight.

Thus we can ask about the language game of “plan”. Language games are more or less organized and more or less stable arrangements of rules about the actualization of concepts into speech. I won’t go into details here; you can find the discussion of the relevant aspects in earlier essays.26 Yet, some points should be made explicit here as well.

The first is that the notion of the language game, as devised by Wittgenstein in his Philosophical Investigations, implies the “paradox of rule-following”27, which can be resolved only through the reference to the Form of Life, which in simplified terms concerns the entirety of culture. Second, as a practice in language, the language game, e.g. that of talking about a “plan”, implies a particular pragmatics, or different kinds of aspects in such a speech act. Austin originally distinguished the locutionary, illocutionary and perlocutionary aspects. Austin maintains that these aspects are always present; they are not a matter of psychology or consciousness, but rather of language. With Deleuze (in Cinema 2) we can add the aspect of story-telling, which we called the delocutionary aspect of speech acts. Third, any actualization of a “bag of concepts” which then lets us invoke the term “plan” is just one out of a manifold, for the actualization of concepts requires forms of construction, or orthoregulation, as we called it. Usually, we apply rather stable habits in this “way down” from concepts to words and acts, but we should always keep in mind that there are many different ways of doing this.

Underneath all of that is an acknowledgment of the primacy of interpretation, which includes a strong rejection of the claim of analyticity. Note that we reject analyticity here not as a consequence of some property of our subject, that is, the property of “complexity,” in our case the complexity of the city. I think it is much stronger to reject it as a consequence of (human) culture and the fact of language itself.

Thus, we can ask about three things regarding the notions of “plan” or “planning”, even though the aspects certainly overlap. First, which concepts are going to be invoked? Second, which story is to be told? Third, how is the story to be told?

The dimension of concepts could be covered by the notion of the “image of the city”. The “image of the city” is quite a bit more than just a model or a theory, albeit these make up a large part of it. A preferable way of dealing with images of the city, albeit just as a starting point, is David Shane’s way of theorizing the city. He manages to combine morphological, historical, political, technological and socio-dynamical aspects in a neat manner. Another, quite different mode of story-telling is provided by Rem Koolhaas, as we have discussed before.

The latter two questions are, of course, the more important ones. Just think about the idea of the “ideal city,” the “garden city,” the “city of mobility,” or the “complex city”. Or the different stances such as rationalism, neo-liberalism, or utilitarianism. Or the issue of participation versus automation. Or who is going to tell the story? Let us start by returning to said “values”.

5.2. Values

Values are constants, singularities, quite literally so. As such, they destroy any possibility of comparison or mediatedness. Just as numbers as mere values don’t have any meaning. To build a mathematics you need a systematicity of operations as well. The complete story is always made from procedures and variables, where the former always dominate the latter. A value itself is like a statue showing a passer-by. Yet, values are fixed, devoid of any possibility to move around, “pure” territorialization.

Thus, a secondary symbolization, mediatization and distribution of values (cf. [26]) does not really help in mitigating these difficulties. Claiming and insisting on values means just to claim “I am not interested in exchange at all”. Values are existential terms: either they are, or they are not. They are strictly dichotomous. Thus they are also logical terms. Not really surprisingly, we find utilitarian folks making abundant use of positively formulated values.

Yet, values fail even with regard to their pretension of existentiality. Heidegger [11] writes (p.100) that

[…] the recourse to “value-laden” characteristics [cannot] even bring Being as readiness-to-hand into view, let alone let it become an ontological theme.
( […] die Zuflucht zu »wertlichen« Beschaffenheiten [kann] das Sein als Zuhandenheit auch nur in den Blick bringen, geschweige denn ontologisch zum Thema werden lassen.)

Consequently, it is nothing but a formal mistake to think that values could be even near the foundation of decision-making. Their existential incommensurability is the reason for a truly disastrous effect: values are the cause of wars, small ones and large ones. (And there is hardly any other cause for them.) Values implement a particular mechanics of costs, which can only be measured in existential terms, too. What would be needed instead is a scale, not necessarily smooth, but at least useful for establishing some more advanced space of expressibility. Only such a double-articulating space, which is abstract and practical at the same time, allows for the possibility of translation at first, followed by mutual transformation.

This triple move of enabling expression, translation and transformation has nothing to do with tolerance. Tolerance, similar to values, is a language game that indicates that there is no willingness for translation, not even for transformation of one’s own “position”. In order to establish a true multiplicity, the contributing instances have to interpenetrate each other; otherwise, one just ends up with modernist piles of dust, “social dust particles” in this case, without any structure.

In this context it is interesting to take a look at Bergson’s conceptualization of temporality. For Bergson, free will, the basic human tendency for empathy, and temporality are closely linked through the notion of multiplicity. In his contribution to the Stanford Encyclopedia of Philosophy, Lawlor writes [27]:

The genius of Bergson’s description is that there is a heterogeneity of feelings here, and yet no one would be able to juxtapose them or say that one negates the other. There is no negation in the duration. […] In any case, the feelings are continuous with one another; they interpenetrate one another, and there is even an opposition between inferior needs and superior needs. A qualitative multiplicity is therefore heterogeneous (or singularized), continuous (or interpenetrating), oppositional (or dualistic) at the extremes, and progressive (or temporal, an irreversible flow, which is not given all at once).

Bergson’s qualitative multiplicity, which he devises as a foundation for the possibility of empathy, is, now in our terms, nothing else than the temporal unfolding of a particular and abstract space of expressibility. The concept of values makes this space vanish into a caricature of isolated points. There is a remarkable consistency here, for we can conclude with Bergson that values also abolish temporality itself. Yet, without temporality, how should there be any exchange, progress, or planning?

Some time ago, Bruno Latour argued in his “Politics of Nature” [28], although he has meanwhile refreshed and extended those first investigations, that the distinction between facts and values is rarely useful and usually counterproductive:

We must avoid two types of fraud: one in which values are used in secret, to interrupt discussions of facts; and one in which matters of fact are surreptitiously used to impose values. But the point is not to maintain the dichotomy between moral judgments and scientific judgments. (p.100)

As a way to overcome this dual and mutually assuring fraudulent arrangement, Latour proposes three major moves. The first is to stop talking about nature (facts), which results in abolishing the concept of nature completely. This amounts to a Wittgensteinian move, and aligns with Deleuze as well in his critique of common sense. Already the talk about nature insinuates the fact and produces values as its complementary and incommensurable counterpart. “Nature” is an empty determination, since for a considerable time now everything on this globe relates to mankind and the human, as Merleau-Ponty pointed out from a different perspective.

The second step in Latour’s strategy amounts to the application of Actor-Network Theory, ANT. As a consequence, everything becomes political, even if the “thing” is not human, but for instance a device, an animal, or any other non-human element.28 Within the network of actors, he locates two different kinds of powers: the two powers to take into account (perplexity and consultation), traditionally called science, and the two powers to put in order (hierarchy and institution), usually called politics. The third step, finally, consists in gluing everything together by a process model29, according to which actors mutually “translate” one another in a purely political process, a “due process”. In other words, Latour applies a constitutional model, yet not a two-chamber model, but rather one of continuous assimilation and transformation. This process finally turns into a kind of “collective experimentation”.

Latour’s model is one that settles in the domain of socio-politics. As such, it is a normative model. Latour explicates the four principles, assigned to two kinds of power, by respective moral demands about what one “shall” do or not do. Not being rooted in a proper theory of morality, the Latourean morality appears arbitrary. It is simply puzzling to read about the “requirement of closure”, meaning that once the discussion is closed, it should not be re-opened, or about the “requirement of the institution” (p.111).

What Latour tries to explain is just the way in which groups can find a common base, a common sense that stabilizes into a persistent organizational form; in other words, to align this thought with our concept of complexity, the transition from order—patterns in the widest sense—to organization.

Yet, Latour fails in his endeavor as it is presented in the “Politics of Nature”.

As Fraser remarked from a Deleuzean perspective [29],

Latour’s concept of exteriority obliges him to pursue a politics of reality which is the special providence of ‘moralists’, rather than a politics of virtual reality in which all entities, human and non-human, are engaged.

In order to construct his argument, Latour just replaces the old values with some new ones, while his main (and mistaken) “enemy” is Platon’s idealism. His attempts are inconsistent and incomplete.

Latour’s concept is too flat, without vertical contours, despite its rugged rhetoric. We must go “deeper,” and much closer to the famous wall where one could get a “bloody nose” (Wittgenstein). Yet, Latour also builds on the move of proceduralization, rejecting a single totalizing principle [28].

[…] to redifferentiate the collective using procedures taken either from scientific assemblies or from political assemblies. (p.31)

This move away from positive fixation yet towards procedures that are supposed to spur the emergence of a certain goal or even purpose may well be considered as one of the most important ones in the history of thought. The underlying insight is that any such positive fixation inevitably results in some kind of naïve metaphysics or politically practiced totalitarianism.

5.3. Ethics: Theories of Morality

Contrary to a widely held belief, ethics itself can’t say anything about the suitability of a social rule. As a theory30 about morality, ethics helps to derive an appropriate set of moral rules, but there can’t be “content” in ethics. It is extremely important to distinguish properly between ethics and morality. Sue Hendler, for instance, a rather influential scholar in planning ethics, never stopped conflating ethics and morality [30].

As a branch of philosophy, ethics is the study of moral behaviour and judgements. A key concept from the field of ethics is that it is possible to evaluate a given behaviour and give coherent reasons why it is ‘good or bad’. […] What criteria can be used to decide whether a given action is ethical?

Philosophy never “studies behavior”. Actions “are” not ethical; they can’t be, for grammatical reasons. Hendler equates types with tokens, a common fault committed by positivists. The fashion of initiating any kind of ethics, such as environmental ethics or said planning ethics, a terminology that appears frequently in the respective journals about planning, is plain nonsense, based on the same conflation of ethics and morality, that is, of theory and model. There can be only one level of theoretical argumentation that could be called ethics. There could be different such theories, of course, but none of them would consider practical cases directly. Behavior is the subject of morality, while morality is the subject of ethics.

5.4. Proceduralizing Theory

Some years ago, Wilhelm Vossenkuhl [31]31 published a viable alternative, or more precisely, a viable embedding for the concept of value, one which would then ultimately lead to its dissolution. By means of a myriad of examples, Vossenkuhl first demonstrates that in the field of morals and ethics there are no “solutions”. Moral affairs remain problematic even after perfect agreements. Yet, he also rejects, on good grounds, the usual trail of abstract principles, such as “justice”, as proposed by Rawls in 1971. As Kant remarked in 1796 [32], any such singular principle can’t be realized except by a miracle. The reason is that any actualization of a singular principle corrupts the principle and its moral status itself.32 What we can see here is the detrimental effect of the philosophy of identity. If identity is preferred over difference33, you end up with a self-contradiction. Additionally, a singularity can’t be generative, which implies that an external institution is needed to actualize the principle formulated by the singularity. This leads to a self-contradiction as well.

Vossenkuhl’s proposal is radically different. In great detail he formulates a procedural approach to ethics and moral action. He refuses a positive formulation of moral content. Ethics, as a theory of morality, is necessarily empty. Instead, he formulates three precepts that together can be followed as individual and communal mechanisms in order to establish a moral procedurality. This allows commonly acceptable factual configurations (as goals) to be achieved without the necessity of defining a priori the content of a principle, or even a preference order regarding the implied values, or profiles of values. These three precepts Vossenkuhl calls the maxims about scarcity (affecting the distribution of goods), norms (ensuring their viability) and integration (of goods and norms). All precepts regard the individual as well as the collective. The threefold mechanisms unfold in a field of tensions between the individual and the communal.

Such, ethics becomes the theory of the proceduralization of morality. Values—as constants of morality—are dissolved into procedures. This is the new Image of Ethics. Instead of talking about values, whether in planning, politics or elsewhere, one should simply care about the conditions for the possibility that such a proceduralization can take place. It should be noted that this proceduralization is closely related to Wittgenstein’s notion of rule-following.

There is nothing wrong with conceiving this as an implementation, because this ethics, as well as the morality, is free of content. Only if this is the case can people engaging in a discourse that affects moral positions (values) talk to each other, find a new position by negotiation, thereby transforming themselves, and finally settle on a proper agreement. Note that this is completely different from a tradeoff or from “tolerance”.

The precepts should not be imagined as a kind of objects or entities with a clear border, or even with a border at all. After all, they are practiced by people, and usually by many of them. It is thus an idealistic delusion to think that the scarcity of goods or the safety of norms could be determined objectively, i.e. by a generally accepted scale. Instead, we deal with a population, and the precepts are best conceived as quasi-species, more or less separated subsets in the distribution of intensities. For these reasons, we can find a two-fold source of opposition: (i) the random variation of all implied parameters in the population, and (ii) the factual or anticipated contradiction of expected outcomes for small variations of the relative intensities of the precepts. In other words, the precepts introduce genuine complexity, and hence creativity through emergence and a self-generated ability for grouping.

The precepts are not only formulated as maxims to be followed, which means that they demand dynamic behavior from individuals. Together, they also have the potential to set a genuine dynamic creativity into motion, yet now on the level of the collective. The precepts are dynamic and create dynamics.

So, what about the relation between planning and ethics, between a plan and moral action? Let us briefly recapitulate. First, the modern version of ethics combines generative bottom-up mechanisms with the potential for mutual opposition and top-down constraints into a dynamic process. Particularly this dynamics dissolves the mere possibility of identifiable borders between good and bad. The categories of good and bad are unmasked as a misguided application of logic to the realm of the social. Second, we found that plans inherently demand their literal implementation. As far as plans represent factual goals instead of probabilistic structural ones, e.g. as possibility or constraint, plans must be conceived as representational, hence simplistic, models about the world. In extremis we could even say that plans represent their own world. Plans are devices for actualizing the principle of the embryonic.

The consequence is quite clear. As long as plans address factual affairs they are not compatible with an appropriate ethics. Hence, in order to allow for a role of ethics in planning, plans have to retreat from concrete factual goals. This in turn has, of course, massive consequences for the way of controlling the implementation of plans. One possibility is again to follow an appropriate operationalization through some currency, where for instance the adaptive potential of the implemented plan is reflected.

This result may sound rather shocking at first sight. Yet, it is perfectly compatible with the perspective made possible through an applicable conceptualization of complexity, which we will meet again in a later section about the challenge of dealing with future(s).

6. Dealing with Future(s)

Differentiation is a process, pretty trivially so. Yet, this means that we could observe a series of braided events, in short, an unfolding in time and a generation of time. We have to acknowledge that the events neither unfold with the same speed, nor on the same thread, nor linearly, albeit at large the entirety of braided braids proceeds. The generation of time refers to the fact that the very possibility for, as well as the possible form of, further differentiation is created by the process itself.

We already mentioned that planning, as one of the possible forms of differentiation, represents only the deterministic, embryonic part of it. It is inherently analytic and representationalist, since the embryonic game demands a strict decoding and implementation of a plan, once the plan exists as some kind of encoded document. In other words, planning praises causality.

6.1. Informational Tools

Here we meet just a further blind spot of planning as far as it is understood today. Elsewhere we have argued that we can’t speak about causality in any meaningful manner without also talking about information. It is simply a rather dirty reductionism, which does not even apply in physics any more, except perhaps in the case of Newton’s balls (apples?).

This blind spot concerning information comes with dramatic costs. I mean, it is really a serious blindness, affecting the unlocking of a whole methodological universe. Its consequence has been called the “dark side of planning” by Bent Flyvbjerg [34]. He coined that notion in order to distinguish ideal planning from actual planning. It is pretty clear that a misconceived structure opens plenty of opportunities to exploit the resulting frictions. It is certainly a common reaction among politicians to switch to strong directives in cases where the promised causality does not appear. Hence, failing planning is always mirrored in an open—and anti-democratic—demonstration of political power, which in turn affects future planning negatively. Like any deep structure, the philosophy of identity is more or less a self-fulfilling prophecy… unfortunately with all the costs, usually burdened onto the “small” people.

The argument is pretty simple. First, everybody will agree that planning is about the future. Second, as we have shown, the restriction of differentiation to planning imposes the constraint that everything around a plan is pressed into the scheme of identifiable causality, which excludes all forms that can be described only in terms of information. It is not really surprising that planners have certain difficulties with the primacy of interpretation, that is, the primacy of difference. Hence they are so much in favor of cybernetic philosophers like Habermas and Hegel. Thinking in direct causes strictly requires that a planner is pervasively present. Since this is not possible in reality, most plans fail, often in a double fashion: they fail despite huge violations of budgets. There is a funny parallel to the field of IT projects and their management, of which it is well known that 80% of all projects fail, doubly. Planning induces open demonstrations of power, i.e. strictness, due to its structural strictness.

Without a “living” concept of information as a structural element a number of things, concepts and tools are neither visible nor accessible:

  • – risk, simulation, serious gaming, and approaches like Frederic Vester’s methodology,
  • – market,
  • – insurance,
  • – participatory evolutionary forms of organization, such as open source.

Let us just focus on the aspects of risk and market. Looking at recent self-critical articles from the field of planning (cf. [4],[35]), but also at a quick Google™ search (first 300 entries), not a single notion of risk can be found where it would be taken as a tool, not just as a parlance. Hence, tools and concepts for risk management are completely unknown in planning theory, for instance value-at-risk methods for evaluating alternatives or the current “state” of the implementation, or scenario games34. Even conservative approaches such as “key performance indicators” from controlling are obviously unknown.
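Just to make tangible what even the most rudimentary of such tools would look like, here is a minimal sketch, not taken from any planning source, of a Monte-Carlo comparison of two plan alternatives by a value-at-risk style figure. The alternative names, the cost model and all parameters are hypothetical illustrations only.

import random

def simulate_cost(base, volatility, n=10_000):
    # draw n possible total costs for an alternative (crude multiplicative toy model)
    return [base * (1 + abs(random.gauss(0, volatility))) for _ in range(n)]

def value_at_risk(samples, quantile=0.95):
    # cost level that is not exceeded in `quantile` of the simulated futures
    return sorted(samples)[int(quantile * len(samples)) - 1]

alternatives = {
    "compact redevelopment": simulate_cost(base=80, volatility=0.15),
    "greenfield extension": simulate_cost(base=60, volatility=0.45),
}
for name, samples in alternatives.items():
    mean = sum(samples) / len(samples)
    print(f"{name}: mean cost {mean:.1f}, 95%-VaR {value_at_risk(samples):.1f}")

The point is not the (trivial) code, but that the nominally cheaper alternative may carry the larger tail risk, a distinction that remains invisible as long as risk is used merely as a parlance.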

We already indicated that planning theory suffers from a lack of abstract concepts. One of those concerns the way of mediating incommensurable and indivisible goals. In an information-based perspective it is easy to find ways to organize a goal-finding process. Essentially, there are two possibilities: the concept of willingness-to-pay and the Delphi method (from so-called “soft operations research”).

Willingness-to-pay employs a market perspective. It should not be mistaken for a “capitalist” or even “neo-liberal” strategy, of course. Quite in contrast, it introduces a currency as a basis for abstraction, and thereby the possibility of constructing comparability. This currency is not necessarily represented by money. Moreover, it serves in both possible directions, regarding costs as well as benefits. Without that abstraction it is simply impossible to find any common aspects in those affairs that appear incommensurable at first sight. Unfortunately, almost every aspect of human society is incommensurable at first sight.
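A minimal sketch of that abstraction, with invented figures: the “currency” here is just abstract points, not money, and the options, aspects and willingness-to-pay weights are hypothetical. The only point is that heterogeneous costs and benefits become comparable once they are translated into the same currency.

options = {
    "pedestrian zone": {"noise reduction": +30, "retail access": -10, "maintenance": -5},
    "through road":    {"noise reduction": -20, "retail access": +25, "maintenance": -15},
}
# willingness-to-pay per aspect and stakeholder group, expressed in the same abstract currency
willingness = {
    "residents":   {"noise reduction": 1.5, "retail access": 0.5, "maintenance": 1.0},
    "shop owners": {"noise reduction": 0.3, "retail access": 2.0, "maintenance": 1.0},
}
for option, effects in options.items():
    total = sum(weight * effects[aspect]
                for group in willingness.values()
                for aspect, weight in group.items())
    print(option, "aggregate value:", total)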

The second example is the Delphi method. This can be used, for instance, even for the very first step when incommensurabilities in goals and expectations have to be mediated: finding a common vocabulary, operationalized as a list of qualitative but quantifiable properties, finding “weights” for those, and making holistic profiles transparent for every involved person.
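Again only as a hedged toy illustration, assuming a hypothetical panel and invented weights: the sketch shows the mechanical core of such a Delphi-like step, a shared vocabulary of quantifiable properties whose weights are restated round by round, so that the disagreement becomes transparent (and, ideally, shrinks).

properties = ["accessibility", "green space", "affordability"]

# each round, every panellist restates weights (summing to 1) after seeing the group mean
rounds = [
    {"A": [0.5, 0.3, 0.2], "B": [0.2, 0.3, 0.5], "C": [0.4, 0.4, 0.2]},
    {"A": [0.4, 0.3, 0.3], "B": [0.3, 0.3, 0.4], "C": [0.4, 0.35, 0.25]},
]

def mean_weights(panel):
    # average weight per property across the panel
    return [sum(w[i] for w in panel.values()) / len(panel) for i in range(len(properties))]

for k, panel in enumerate(rounds, start=1):
    means = mean_weights(panel)
    spread = max(max(w[i] for w in panel.values()) - min(w[i] for w in panel.values())
                 for i in range(len(properties)))
    print(f"round {k}: mean weights {[round(m, 2) for m in means]}, max disagreement {spread:.2f}")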

It is quite clear that a metaphysical belief in identity, independence and determinability renders the accessibility of such approaches completely impossible. Poor guys…

6.2. Complexity

Not only in planning theory is it widely held that, as Manson puts it [36],

[…] there is no single identifiable complexity theory, but instead an array of concepts applicable to complex systems.

Furthermore, he also states that

[…] we have identified an urgent need to address the question of appropriate levels of generalization and specificity in complexity-based research.

Research about complexity is strongly flavored by the respective domain of its invocation, such as physics, biology or sociology. As an imported general concept, complexity is often more or less directly equated with concepts like self-organization, fractals, chaos or even the edge of it, emergence, strange attractors, dissipativity and the like (also Haken etc.).

A lot of myths appeared around these labels. For instance, it has been claimed that chaos is necessary for emergence, which is utterly wrong. Even more catastrophic is the habit of mixing cybernetics and cybernetic systems theory with complexity. Luhmannian and Habermasian talking represent the conceptual opposite of an understanding of complexity. Nothing could be more different! Yet, there are even researchers [37] who (quite nonsensically) explain emergence by the Law of Large Numbers… indeed a rather disappointing approach. Moreover, it must be clear that self-organization and fractals are only weakly linked to chaos, if at all. On the other hand, concepts like self-organization or emergence are just aspects of complexity, and even more importantly, they are macro-theoretical descriptive terms which cannot be transferred across domains.

The major problem in the contemporary discourse about complexity is that this discourse is not critical enough. Instead, people always first asked “what is complexity?” before they then despaired of their subject. Finally, the research about “complexity” made its way into the realm of the symbolic, expressing now more a habit than a concept that could be utilized in a reasonable manner. The 354th demonstration of a semi-logarithmic scaling is simply boring and has nothing to do with “complexity”. Note that a multiplicative junction of two purely random processes creates the same numerical effect…
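For whoever doubts that remark, here is a minimal numerical sketch (the number of factors and the distribution are arbitrary, nothing specific to cities): a chain of multiplicative junctions of purely random factors yields an approximately lognormal distribution whose rank-size listing resembles the usual “scaling” demonstrations, without any complexity being involved.

import math, random

values = []
for _ in range(5000):
    x = 1.0
    for _ in range(20):                       # multiplicative junction of random factors
        x *= random.uniform(0.5, 1.5)
    values.append(x)

values.sort(reverse=True)
for rank in (1, 10, 100, 1000, 5000):
    size = values[rank - 1]
    print(f"rank {rank:5d}: size {size:9.3f}, log-size {math.log(size):6.2f}")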

Despite those difficulties, complexity entered various domains, yet, always just as an attitude. Usually, this leads either to a tremendous fuzziness of the respective research or writing, or to perfected emptiness. Franco Archibugi, who proposes a rationalist approach to planning, recently wrote ([5], p.64):

The planning system is a complex system (footnote 24).

… and in the respective footnote 24:

Truly this seems a tautology; any system is complex by definition.

Here, the property “complex” gets both inflated and logified, and neither is appropriate.

What has been missing so far is an appropriate elementarization on the level of mechanisms. In order to adapt the concept of complexity to any particular domain, these mechanisms then have to be formulated in a probabilistic manner, or strictly with regard to information. The five elements of complexity, as we devised them previously in a dedicated essay, are

  • (1) dissipation, i.e. deliberate creation of additional entropy by the system at hand;
  • (2) an antagonistic setting of distributed opposing “forces” similar to the morphogenetic reaction-diffusion-system described first by Alan Turing;
  • (3) standardization;
  • (4) active compartmentalization as a means of modulating the signal horizon as signal intensity length;
  • (5) systemic knots.

Arranging the talk about complexity in this way has several advantages. First, these five elements are abstract principles that together form a dynamic setup resulting in the concept of “complexity”. This way, it is a proceduralization of the concept, which allows one to avoid the burden of a definition without slipping into fuzzy areas. Second, these elements can be matched rather directly to empirical observations across a tremendous range of domains. No metaphorical work is necessary, as there is no transfer of a model from one domain to another.
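To indicate how directly element (2) can be matched to a mechanism, here is a minimal sketch of a one-dimensional Gray-Scott-type reaction-diffusion toy in the spirit of Turing’s morphogenetic antagonism. Grid size, rates and the feed/kill parameters are arbitrary illustrative choices, not part of the original argument.

import random

N, steps = 200, 5000
Du, Dv, F, k = 0.16, 0.08, 0.035, 0.060       # diffusion, feed and kill rates (toy values)
u = [1.0] * N                                  # "substrate" field
v = [0.0] * N                                  # antagonist field
for i in range(N // 2 - 5, N // 2 + 5):        # local seed of the antagonist
    v[i] = 0.5 + 0.1 * random.random()

def lap(a, i):
    # discrete Laplacian on a ring (periodic boundary)
    return a[(i - 1) % N] + a[(i + 1) % N] - 2 * a[i]

for _ in range(steps):
    u_new, v_new = u[:], v[:]
    for i in range(N):
        uvv = u[i] * v[i] * v[i]
        u_new[i] = u[i] + Du * lap(u, i) - uvv + F * (1 - u[i])
        v_new[i] = v[i] + Dv * lap(v, i) + uvv - (F + k) * v[i]
    u, v = u_new, v_new

# crude text rendering of the emerging pattern of the two opposing fields
print("".join("#" if x > 0.2 else "." for x in v))

Nothing here is “planned” into the pattern; the distributed antagonism of the two fields plus diffusion suffices to let structure emerge, which is exactly what makes element (2) an abstract principle rather than a metaphor.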

Note that, for instance, “emergence” is not part of our setup. Emergence is itself a highly integrated concept with a considerable degree of internal heterogeneity. We would have to discern weak from strong emergence; at least, we would have to clarify what we understand by “novelty”, and so on, that is, questions that could neither be clarified nor be used on the descriptive, empirical level.

There is yet a third significant methodological aspect of this elementarization. It is possible to think about a system that is missing one of those elements, that is, where one of these elements is set to zero in its intensity. The five elements thus span a space that transcends the quality of a particular system. These five elements create two spaces, one conceptual and one empirical, which however are homeomorphic. The elements are first necessary and sufficient to talk about complexity, but they are also necessary and sufficient for any corporeal arrangement to develop “complexity”. Thus, it is easy and straightforward to apply our concept of complexity.

The first step is always to ask for the respective instantiation of the elements: Which antagonism could we detect? What is its material carrier? How many parts could we distinguish in space and time? Which kind of process is embedding this antagonism? How is compartmentalization going to be established, materially or immaterially? How stable is it? Is it a morphological or a functional compartmentalization? What is the mechanism for establishing the transition from order to organization? Which levels of integration do we observe? Is there any instance of self-contradictory top-down regulation? Are there measures to avoid such (as for instance in the military)?

These questions can be “turned around,” of course, then being used as design principles. In other words, using this elementarization it is perfectly possible to scale the degree of volatility shown by the “complex system”.

The only approach transparently providing such an elementarization and the respective possibility for utilizing the concept of complexity in a meaningful way is ours (still, and as far as we are aware of recent publications35… feedback about that is welcome here!)36.

Of those, elements 2 and 4 are certainly the most important ones when it comes to the utilization of the concept of complexity. First, one has to understand that adaptivity requires a preceding act of creativity. Next, only complex systems can create emergent patterns, which in turn can be established as a persistent form only in either of two ways: either by partially dying, creating a left-over, or by evolution. The first of these is internal to the process at hand, the second external. Consequently, only complex systems can create adaptivity, which in turn is mandatory for a sustainable regenerativity.

So element (2), the distributed antagonism, denies the reasonability of identity and of consensus-finding as a homogenizing procedure, if the implemented arrangement (“system”) is thought to be adaptive (and enabled for sustainability). Element (4) emphasizes the importance of the transition from order (mere volatile pattern) to persistent or even morphological structures, called organization. Yet, living systems provide plenty of demonstrations that persistence does not mean “eternal”. In most cases structures are temporary, despite their stability. In other words, turnover and destruction are active processes in complex systems.

Complexity needs to be embraced by planning, regarding its self-design as well as the plan and its implementation. Our elementarization opens the route to plan complexity. Even a smooth scaling regarding the space between complexity and determination could now be addressed.

It is quite obvious that an appropriate theory of complexity is highly relevant for any planning in any domain. There are of course some gifted designers and architects as well as a few authors that have been following this route, some even long ago, as for instance Koolhaas in his Euro-Lille. Others like Michael Batty [42][43] or Angelique Chettiparamb (cf. [44][45][46]) investigate and utilize the concept of complexity in the fields of urbanism or planning almost as I propose it. Yet, just almost, for they did not conceptualize the notion of complexity in an operationalizable manner so far.

There is a final remark on complexity to put here, concerning its influence on the dynamics of theory work. Clearly, the concept of complexity transcends ideas such as rationalism or pragmatism. It may be conceived as a generic proceduralization that reaches from thought (“theory”) to action. It is its logic of genesis, as Deleuze called it, that precedes any particular “ism” as well as the separation of theory and practice in the space of the Urban. It is once again precisely here in this space of ever surprising novelty that ethics becomes important, notably an ethics that is structurally homeomorphic through its own proceduralization, where the procedures are at least partially antagonistic to each other.

6.3. Vision

Finally, let me formulate a kind of vision, referring to just one of the more salient examples. In developing countries there is a large amount of informal settlements, more often than not tending towards slum conditions. More than 30% of urban citizens across the world live in slum conditions. At some point in time, the city administration usually decides to eradicate the whole area. Yet, this comes at the cost of destroying a more or less working social fabric. The question obviously is one of differentiation. How to improve means how to differentiate, which in turn means how to accumulate potential. The answer is quite easy: by supporting enlightened liberalism through an anti-directionist politics (cf. [48]). Instead of bulldozing and forcing people to leave, and even instead of implanting a “solution” of whatsoever kind in a top-down manner, simply provide them with two things: (i) basic education about materials and organization in an accessibly compiled form, and (ii) the basic materials. The rest will be arranged by the people, as this introduces the opportunity for arbitrage profits. It will not only create a sufficiently diversified market, which of course can be supported in its evolution. It will also create a common good of increased value for the whole area. Such an approach will work for the water problem, whether fresh water or waste water. My vision is that this kind of thinking would be understood, at least (much) more frequently…

7. Perplexion

The history of the human, the history of conceptual thinking and—above all—its transmission by the manifold ways and manners this conceptual thinking has been devising, all of this, up to the contemporary urban society, is a wonderful (quite literally) and almost infinite braid. Our attempts here are nothing more than an attempt to secure this braiding by pointing to some old, almost forgotten embroidery patterns and by showing some new ones.

I have always been clear about another issue, but I would like to emphasize it again: starting with the idea of being, which equals that of existence or identity, demolishes any possibility of thinking the different, the growing, the novel, in short, life. This holds even for Whitehead’s process philosophy. Throughout this blog, as it stands so far, I have been trying to build something, not a system, not a box, but something like an Urban Thought. The ideas, concepts and ways in which that something has been actualizing are stuffed (at least in my hopes) with an inherent openness. Nevertheless I have to admit that it feels like approaching a certain limit, as thoughts and words tend increasingly to enter the “eternal return”. Yet, don’t take this as a resignation or even the beginning of a nihilistic phase. It is said as an out and out positive thought. But still…

Maybe these thoughts have been triggered by a friend’s hint towards a small, quite (highly?) exceptional book or booklet of unknown origin: the “Liber viginti quattuor philosophorum”, the Book of the 24 Philosophers.37 Written presumably somewhere between 800 and 1200 AD38, it consists of just 24 philosophical theses about our relation to God. The main message is that we can’t know, despite the fact that this seems to be implied.

7.1. Method, Generic Differentiation and Urban Reason.

Anyway. In this essay we explored the notion of method. Beginning with Descartes’ achievements, we then tried to develop a critique of it. Next we embedded the issue of planning and method into the context of Urban Reason, including the concept of Generic Differentiation [henceforth GD], which we explicated in the previous essay where we devised it for organizing theory works. Let us reproduce it here again, just as a little reminder.

Figure 3: The structural pragmatic module of Generic Differentiation for binding theory works, modeling and operations (for details see here). This module is part of a fluid moebioid fractal that grows and forms throughout thinking and acting, which thereby are folded into each other. The trinity of modes of actualization (planning, adapting, learning) passes through this fractal figure.

urban reason 4t

All of the four concepts of growth, networks, associativity and complexity can be conceptualized in a proceduralized form as well. Additionally, they all could be taken as perspectives onto abstract, randolated and thus virtual yet probabilistic networks.

Interestingly, this notion opens a route into mathematics through the notions of computability and non-Turing computing (also see [52]). Here, we may take this just as a further indication of the fundamental perspective of information as a distinct element of construction whenever we talk about the city, the Urban and the design regarding it.

7.2. “Failing” Plans

Thinking of planning without the aspects of evolution and learning would equal, as we have repeatedly emphasized, the claim of the analyticity of the world. Such a planning would follow positivist or rationalist schemes and could be called “closed planning”. Only under the presupposition of the world’s analyticity could such planning be considered reasonable.

Since the presupposition is obviously wrong, closed planning schemes such as positivist or rationalist ones are doomed to fail. Yet, this failing is a failure only from the perspective of the plan or the planner. From the outside, we can’t criticize plans as failing, since in this case we would confine ourselves to the rationalist scheme. For the diagnosis of failure in a cultural artifice such as a city, or a settlement in the widest sense, always requires presuppositions itself. Of course, in some contexts like that of financial planning within an organization these presuppositions can be operationalized straightforwardly into amounts of money, since the whole context is dominated by it. Financial planning is almost exclusively closed planning.

In the context of town planning, however, even the result of bad planning will always be inhabitable in some way, for in reality the plan is actualized into an open, non-analytical world. The argument is the same one Koolhaas applied to the question of the quality of buildings. In China, architects on average build hundreds if not thousands of times more space than in Europe. There is no particular awareness of what Western people call the quality of architecture. The material arrangements into which plans actualize will always be used in some way. But it is equally true that there will always be a considerable part of this usage that imposes ways of using the result that have not been planned.

This way, plans never fail, but at the same time they always fail, as they always have to be corrected. The only thing that becomes clear by this is that the reduction of the planner’s perspective to the plan sensu stricto is the actual failure. A planning theory that does not consider evolution and learning isn’t worth the paper onto which it is written.

Both aspects, evolution and learning, need to be expressed, of course, in a proper form before one could assimilate them to the domain of arranging future elements (and elements of the future). Particularly important to understand is that “learning” does not refer to human cognition. Here it refers to the whole, that is the respectively active segment of the city itself, much in the sense of an Actor-Network (following Bruno Latour [53]), but also the concept of the city as an associative corporeality in itself,  as I have been pointing out some time ago [54].

7.3. Eternal Folds

Generic Differentiation is deeply doubly-articulated, as Deleuze would perhaps have said39. GD may serve as a kind of scaffold to organize thoughts (and hence actions) around the challenge of how to effectuate ideas and concepts. Remember that concepts are transcendent and not to be mistaken for definitions! Here in this piece we tried to outline what an update of the notion of “method” could look like. Perhaps you have been missing references to the more recent discourses, in which, among others, you could find Michel Serres, or Isabelle Stengers, but also Foucault, to name just a few. The reason to dismiss them is just given by our focus on planning and the Urban, about which those authors did not talk too much (I mean with respect to the problematics of method).

Another route I didn’t follow was to develop and provide a recipe for planning of whatsoever sort, particularly not one that could be part of a cookbook for mindless robots. It would simply contradict the achieved insights about Differentiation. Yet, I think that something rather close to a manual could be possible, perhaps a meta-manual targeting the task of creating a manual, one that would help to write down a methodology. A “methodology“ which deserves the label is a kind of open didactic talking about methods, and as such necessarily comprises some reflection (which is missing in recipes). Such, it is clear that the presented concepts about method around Generic Differentiation should not be perceived as such a methodology. Take it more as a pre-specific scaffold for externalizing and effectuating thought, to confront it with the existential resistance. Thus, the second joint of said double-articulation of Generic Differentiation, besides such scaffolding of thought, connects towards the scaffolding of action.

The double-articulated rooting of method (as we developed it as a concept here) in the dynamics of physical arrangements and the realm of thoughts and ideas enables us to pose three now rather urgent questions in a clear manner:

  • (1) How to find new ways into regenerative urban arrangements? (cf. [51]);
  • (2) How to operate the “Image of Urban”?40
  • (3) The question for a philosophy of the urban […] is how the energetic flow of undifferentiated potentiality in/of urban arrangement might be encoded and symbolically integrated, such that through its transposition into differentiable capacity ability, proficiency and artifice may emerge. (after [52], p.149)

Bühlmann (in [55] p.144/145) points out that

The difficulty, in philosophically cogitating the city or the urban, lies […] with the capacity of dealing in an open and open-ended, yet systematic manner with the determinability of initial and final states. It is precisely the determination of such “initial” and “final” states that needs to be proceduralized.

I guess that those three questions could be answered only together. It is in the corpus (and corporeality) of the virtual and actualized answers that we will meet the Urban Reason. Here, in concluding this essay, we can only indicate the directions, and this only in rather broad strokes.

Regenerative cities in the sense of “sustainable sustainability” can be achieved only through a persistent and self-sustained, yet modulated complexity of the city. A respective process model is easy to achieve once it is understood how complexity and ethics are mutually supportive. This implies also a significant political aspect which has often been neglected in the literature about planning. We also referred to Latour’s suggestion of a “Politics of Nature,” which however does not contribute to solving the problem that he pretends to address.

We have shown here that, and how, our notions of method and complexity can be matched with a respective contemporary ethics, which is a mandatory part of the planning game. Planning as such, i.e. in the traditional meaning of mechanistic implementation, ceases to exist. Instead, planning has to address the condition of the possible.

Such, any kind of planning of any kind of arrangement first undergoes a Kantian turn through which it inevitably changes into “planning of the potential”. Planning the potential, in turn, may be regarded as a direct neighbor of design, its foundation [56] and methodology.41 This reflects the awareness of the primacy of the conditions for the possibility of complexity. These conditions can only be actualized if planning is understood as one of the aspects of the trinity of Generic Differentiation, which comprises, besides planning, also evolution and learning, invoking in turn the concepts of population/probabilism and associativity. All parts of the “differentiation game” have to be practiced, of course, in their proceduralized form. No fixed goals on the level of facts any more, no directive policies, no territorialism, no romanticism hugging the idea of identity any more, please… It is the practice of proceduralization, based on a proper elementarization and bridging from ethics to complexity, that we can identify as the method of choice.

The philosophical basis for such a layout must necessarily deny the idea of identity as a secure starting point. Instead, all the achievements presented here may appear only on the foundation provided by transcendent difference [57]. I am deeply convinced that any “Science of the City” or “Methodology of Planning” (the latter probably as a section of the former) must adhere to appropriate structural and philosophical foundations, for instance those that we presented here and which are part of Urban Reason. Otherwise it will quite likely give rise to the surge of a kind of political absolutism quite similar to that which succeeded Descartes’ consideration of the “Methode”.

8. Summary

We explored the notion of “method” and its foundations with regard to planning. Starting from its original form as created by Descartes in his “Discours de la Méthode” we found four basic vectors that span the conceptual space of planning.

Ethics and complexity are not only regarded as particular focal points, but rather as common and indispensable elements of any planning activity. The proposed four-fold determination of planning should be suitable to overcome rationalist, neo-liberal, typically modernist or positivist approaches. In other words, without those four elements it is impossible to express planning as an activity or to talk reasonably about it. In its revised form, both the concept and the field of planning allow for the integration of deep domain-specific knowledge from the contributing specialized domains, without stalling the operational aspects of planning. Particularly, however, the new, or renewed, image of planning offers the important possibility of joining human reason to the Urban activities of designing and planning our urban neighborhood, and above all, living it.

9. Outlook

In most cases I didn’t give an outlook to the next essay, due to the spontaneous character of this bloggy journey as well as the inevitable autonomy of the segregated text, which increases more and more as time passes.

This time, however, the topic of the follow-up is pretty clear. Once started with the précis of Koolhaas’ “Generic City”, said journey led us first to the concept of “Urban Reason” and the Urban as its unique, if not solitary cultural condition. The second step then consisted in bundling several abstract perspectives into the concept of Generic Differentiation. Both steps have been linked through the precept of “Nothing regarding the Urban Makes Sense Except in the Light of the Orchestration of Change.” The third step, as elaborated here, was then a brief (very brief indeed) investigation of the subject and the field of planning. Today, this field is still characterized by rather misty methodological conditions.

The runway towards the point of take-off for the topic of the next essay, then, could be easily commented by a quote from Sigfried Giedion’s “Space, Time and Architecture” (p.7):

For planning of any sort our knowledge must go beyond the state of affairs that actually prevails. To plan we must know what has gone on in the past and feel what is coming in the future.

Giedion was an interesting person, if not to say composition, to borrow a notion from Bruno Latour. Being a historian, engineer and entrepreneur, among several other roles, he was in many ways modernist as well as a-modern. Not completely emancipated from the underlying modernist credo of metaphysical independence, he also demanded an integration of the aspect of time as well as that of relationability, which assigns him the attitude of a-modernism, if we utilize Aldo Rossi’s verdict on modernism’s attempt to expunge time from architecture.

Heidegger put it very clearly (only marginally translated into my own words): without understanding the role of time and temporality for the sphere of the human we can’t expect to understand the Being of man-made artifacts and human culture. Our challenge regarding Heidegger will be that we have to learn from his analysis without partaking in his enterprise to give a critique of fundamental ontology.

More recently, Yeonkyung Lee and Sungwoo Kim [58] pointed to the remarkable fact, based on Giedion’s work, that there is only little theoretical work about time in the field of architecture and urbanism. We regard this as a consequence of the prevailing physicalist reductionism. They also hold that

further critical and analytical approaches to time in architecture should be followed for more concrete development of this critical concept in architecture. (p.15)

Hence, our next topic will be just a subsection of Giedion’s work: Time and Architecture. The aspect of space can’t be split off, of course, yet we won’t discuss it in any depth, because it deserves a dedicated treatment in itself, mainly due to the tons of materialist nonsense that has been floating around since Lefebvre’s (ideological) speculations (“Production of Space”). Concerning the foundations, that is the concept of time, we will meet mainly Deleuze and Heidegger, Bergson and his enemy Einstein, and, of course, also Wittgenstein. As a result, I will hopefully enrich and differentiate the concept of Generic Differentiation even more, and thus also the possible space of the Urban.

Notes 

1. Descartes’ popularity is based, of course, on his condensed and almost proverbial “Cogito, ergo sum”, by which he sought to gain secure grounds for knowledge. Descartes’ Cogito raises difficult issues, and I can only guess that there are lots of misunderstandings about it. Critique of the Cogito started already with Leibniz, and included, among almost everybody else, Kant, Hume, Nietzsche and Russell. The critique targets either the logic (“ergo”), the implications regarding existence (“sum”), or the “I” in the premise. I will neither add to this criticism nor comment on it; yet, I would just like to point to another possibility of approaching it, opened by refraining from logic and existentialism: self-referentiality. The “I am thinking” may be taken as a simple, still unconscious observation that there is something going on that uses language. In other words, a language-pragmatic approach paired with self-referentiality opens a quite fresh perspective onto the Cogito. Yet, this already would have to count as an update of the original notion. To my knowledge this has never been explored by any of the philosophical scholars. In my opinion, most of the critiques of the Cogito are wrong, because they stick to rationalism themselves. The foundation of which, however, can’t be rational itself in its beginning, only through its end (not: “ends”!) and its finalization. Anyway, neither the Cogito nor the sum nor the “I” is the subject of our considerations here. Actually, there is not much to say, as such “traditional” metaphysics misunderstands “grammatical sentences” as metaphysical sentences (Ludwig Wittgenstein, in “On Certainty”).

Concerning the wider topic of rationalism as a problematic field in philosophy, I suggest resolving its position and (at least partial) incommensurability with other “-ism” modes by means of the choreostemic space, where it just forms a particular attractor.

2. Wittgenstein and mainstream cognitive science hold that this should not be possible. Yet, things are not as simple as they may appear at first sight. We could not expect that there is a “nature” of thinking, somehow buried beneath the corporeality of the brain. We certainly can take a particular attitude towards our own thinking, just as we can (learn to) apply certain tools and even methodologies in the thought that is directed at our thought. The (Deleuzean) Differential is just one early example.

3. Just to mention here as a more recent example the “failure” of Microsoft’s strategy of recombinable software modules as opposed to the success of the unique app as it has been inaugurated by Apple.

4. Most of the items and boxes in this backpack did not influence the wider public in the same way as Descartes did. One of the most influential among the available items, Hegel, we already removed; it is just dead freight. The group of less known but highly important items comprises the Kantian invention of critique, the transparent description of the sign by Peirce, the insight into the importance of the Form of Life and the particular role and relation of language (Wittgenstein, Foucault), and the detrimental effects of founding thought on logicism, also known as the belief in necessity, truth values, and the primacy of identity. These achievements are not recognized among the wider public, whether we consider the sciences, the design area or politics. All these achievements are clearly beyond Descartes’, but we should not forget two things. Firstly, he was just a pioneer. Secondly, we should not forget that the whole era favored a mechanic cosmology. The lemma of large numbers in the context of probabilism as a perspective had not been invented yet in his time.

5. The belief in this independence may well count as the most dominant of the influences that brought us the schizophrenias that culminated in the 19th and 20th centuries. Please don’t misunderstand this as a claim of “causality” as understood in the common sense! Of course, there have been great achievements, but the costs of those have always been externalized, first to the biological environment, and second to future generations of mankind.

6. By “planning” I don’t refer just to the “planning of land-use” or other “physical planning” of course. In our general context of Urban Reason and the particular context of the question about method here in this essay I would like to include any aspect around the planning within the Urban, particularly organizational planning.

7. Meant here without any kind of political, ethical or sociological reading, just as the fact of the mere physical and informational possibility.

8. Original in German language (my translation): ” Ob das Gewicht der Forschung gleich immer in dieser Positivität liegt, ihr eigentlicher Fortschritt vollzieht sich nicht so sehr in der Aufsammlung der Resultate und Bergung derselben in »Handbüchern«, als in dem aus solcher anwachsenden Kenntnis der Sachen meist reaktiv hervorgetriebenen Fragen nach den Grundverfassungen des jeweiligen Gebietes. […] Das Niveau einer Wissenschaft bestimmt sich daraus, wie weit sie einer Krisis ihrer Grundbegriffe fähig ist.”

9. As we mentioned elsewhere, the habitus of this site about practical aspects of Hilary Putnam’s philosophical stance is more that of a blook than that of a blog.

10. Descartes and Deleuze are of course not the only guys interested in the principles or methods of and in thought. For instance, Dedekind proposed “Laws of Thought” which shall include things like creative abstraction. It would be a misunderstanding, however, to look to psychology here. Even so-called cognitive psychology can’t contribute to the search for such principles, precisely because it is in need of schemata to investigate. Science can always investigate only what “there is”.

11. Nowadays often called a system, and by that referring to “systems science”, often also to Niklas Luhmann’s extension of cybernetics into the realm of the social. Yet, it is extremely important to distinguish the whole from a system. The whole is neither an empiric nor an analytic entity; it couldn’t be described completely as an observation, a set of formula(s), a diagram or any combination thereof, which for instance is possible for a cybernetic system. Complex “systems” must not be conceived as systems in the mode of systems theory, since openness and creativity belong to their basic characteristics. For complex systems, the crude distinction of “inside” and “outside” does not make much sense.

12. Thinking “items” as independent becomes highly problematic if this belief is applied to culture itself in a self-referential manner. Consequently, man has been thought to be independent of nature. “Precisely, what is at stake is to show how the misguided constitution of modernity finds its roots in the myth of emancipation common to the Moderns. […] Social emancipation should not be condemned to be associated with an avulsion from nature, […]. The error of the modern constitution lies in the way it describes the world as two distinct entities separated from each other.” [18]. It is quite clear that the metaphysical belief in independence lies beneath the dualisms of nature/culture, nature/nurture, and body/mind. This does not mean that we could not use in our talking the differences expressed in those dichotomies; yet, the differences need not be placed into a strictly dichotomic scheme. See the section about “values” and Bruno Latour’s proposal.

13. This does not imply a denial of God. Yet, I think that any explicit reference to the principle of divinity implicitly corroborates that idea.

14. It is inadequate because by definition you can’t learn from a case study. It is a mistaken belief, if not a mystical particularism, to think that case studies could somehow “speak for themselves.” The role of a case study must be that it is taken as an opportunity to challenge classifications, models and theories. As such, case studies have to be used as a means and a target for transformative processes. Yet, such is rarely done with them.

15. Subsequent to Niko Tinbergen’s distinction, Dobzhansky introduced a particular weight onto those four perspectives, emphasizing the evolutionary aspect: Nothing in biology makes sense except in the light of evolution. For him, evolution served as a kind of integrative perspective.

16. As in the preceding essays, we use the capital “U” if we refer to the urban as a particular quality and as a concept in the vicinity of Urban Reason, in order to distinguish it from the ordinary adjective that refers to common sense understanding.

17. Difference between architecture and arts, particularly painting.

18. Yet, he continues: “As such, it must be designed according to a model which takes into account all the possible fields of decision-making and all decision-makers who play a role in social life. It has a territorial dimension which is “global” in the literal sense: it extends to the planetary scale.” (p.64) So, since he proposes a design of planning he obviously invokes a planning of planning. Yet, Archibugi does not recognize this twist. Instead, he claims that this design can be performed in a rationalist manner on a global scale, which—as an instance of extended control phantasm—is definitely overdone.

19. In more detail, Archibugi claims that his approach is able to integrate traditional fields of planning in a transdisciplinary methodological move, based on a “programming” approach (as opposed to the still dominant positivistic approach). The individual parts of this approach are
+ a procedural scheme for the selection of plans;
+ clarification of the interrelationship between different “levels” of planning;
+ describing institutional procedures of plan bargaining;
+ devising a consulting system on preference, information, monitoring, and plan evaluation.

Yet, such a scheme, particularly if conducted as a rationalist program, is doomed to fail for several reasons. In monitoring, for instance, he applies an almost neo-liberal scheme (cf. p.81), being unaware of the necessity of the apriori of theoretical attitudes as well as of the limitation of reasoning that is grounded solely on empirical observations.

20. Of course, we are not going to claim that “society” does not need the activity of and the will to design itself. Yet, while any externalization needs a continuous legitimization—and by this I don’t refer to one election every four years—the design of the social should target exclusively the conditions for its open unfolding. There is a dark line from totalitarian Nazi Germany via the exiled Jewish sociologists, to the Macy Conferences and their attempt to apply cybernetics directly to the realm of the social, followed finally by the rationalist Frankfurt School with its late proponent Habermas and his functionalism. All of these show the same totalitarian grammar.

21. Deleuze’s books about cinema and the image of time [33].

22. Rem Koolhaas, Euralille, see this.

23. Just to recall: the Differential is the major concept in Deleuze’s philosophy of transcendental empiricism, which sets difference, not identity, as primal: primacy of interpretation, rejection of identity and analyticity, a separation-integration.

24. Sue Hendler dismisses philosophical foundations of ethics for the area of planning as “formalistic”. Instead she continues to draw on values, interestingly backed by a strong contractual element. While this may sound pragmatic at first, it is nothing but utilitarian. Contracts in this case are just acts of ad-hoc institutionalization, which in turn build on the legislative milieu. Thus I reject this approach, because in this case ethics would just turn into a matter of the size of the monetary investment into lawyers.

25. Note that ethics is the theory of morality, while morality is the way we deal with rules about social organization.

26. here and here or here;

27. It is a paradox only from a rationalist perspective, of course.

28. “Thing” is an originally Nordic concept that refers to the fixation of a mode of interpretation through negotiation. The “Althing” is the name of the Icelandic parliament, which has existed in an uninterrupted period roughly since 930 AD. A thing thus exists as an objectified/objectifiable entity only subsequent to the communal negotiation, which may or may not include institutions.

29. Inspired by Alfred N. Whitehead and Isabelle Stengers.

30. See this about the concept of theory.

31. Unfortunately available in German language only.

32. This just demonstrates that it is not unproblematic to jump on the bandwagon of a received view, e.g. on the widely discussed and academically well-introduced Theory of Justice by John Rawls, as for instance exemplified by [23].

33. What is needed instead for a proper foundation is a practicable philosophy of Difference, for instance in the form proposed by Deleuze. Note that Derrida’s proclaimed “method” of deconstruction neither can serve as a philosophical foundation in general nor as an applicable one. Deconstruction establishes the ideal of negativity, from which nothing could be generated.

34. With one (1) [41], or probably two (2) [40] notable and somewhat similar exceptions which however did not find much (if any) resonance so far…

35. Jensen contributed also to a monstrous encyclopedia about “Complexity and Systems Science” [39], comprising more than 10’000 pages (!), which however does not contain one single useable operationalization of the notion of “complexity”.

36. One of the more advanced formulations of complexity has been provided by the mathematician Henrik Jeldtoft Jensen (cf. [38]). Yet, it is still quite incomplete, because he neither recognizes nor refers to the importance of the distributed antagonism, nor does he respond to the necessity that complex systems have to be persistently complex. He is also wrong about the conjecture that there must be a “large number of interacting components”.

37. See the review by the German newspaper FAZ, a book in German, an unofficial translation into English, and one into French. Purportedly, there are translations into Spanish as well, yet I can’t provide a link.

38. Hudry [49] attributes it to Aristotle.

39. Deleuze & Guattari developed and applied this concept first in their Mille Plateaux [50].

40. The notion of an “Image of Urban” is not a linguistic mistake, of course. It parallels Deleuze’s “Image of Thought”, where thought refers to a habit, or a habitus, a gestalt if you prefer, that comprises the conditions for the possibility of its actualization.

41. At first sight it seems as if such an extended view on design, particularly if understood as the design of pre-specifics, could reduce or realign planning to the engineering part of it. Yet, planning in the context of the Urban always has to consider immaterial, i.e. informational, aspects, which in turn introduces the fact of interpretation. We see that no “analytic” domain politics is possible.

References

  • [1] Ian Hacking, The Emergence of Probability: A Philosophical Study of Early Ideas About Probability, Induction and Statistical Inference. Cambridge University Press, Cambridge 1984.
  • [2] Geoffrey Binder (2012). Theory(izing)/practice: The model of recursive cultural adaptation. Planning Theory 11(3): 221–241.
  • [3] Vanessa Watson (2006). Deep Difference: Diversity, Planning and Ethics. Planning Theory, 5(1): 31–50.
  • [4] Stefano Moroni (2006). Book Review: Teoria della pianificazione. Planning Theory 5: 92–96.
  • [5] Franco Archibugi, Planning Theory: From the Political Debate to the Methodological Reconstruction, 2007.
  • [6] Bernd Streich, Stadtplanung in der Wissensgesellschaft: Ein Handbuch. VS Verlag für Sozialwissenschaften, Wiesbaden 2011.
  • [7] Innes, J.E. and Booher,D.E. (1999). Consensus Building and Complex Adaptive Systems – A Framework for Evaluating Collaborative Planning. Journal of the American Planning Association 65(4): 412–23.
  • [8] Juval Portugali, Complexity, Cognition and the City. (Understanding Complex Systems). Springer, Berlin 2011.
  • [9] Hermann Haken, Complexity and Complexity Theories: Do these Concepts Make Sense? in: Juval Portugali, Hans Meyer, Egbert Stolk, Ekim Tan (eds.), Complexity Theories of Cities Have Come of Age: An Overview with Implications to Urban Planning and Design. 2012. p.7-20.
  • [10] Angelique Chettiparamb (2006). Metaphors in Complexity Theory and Planning. Planning Theory 5: 71.
  • [11] Martin Heidegger, Sein und Zeit. Niemeyer, Tübingen 1967.
  • [12] Susan S. Fainstein (2005). Planning Theory and the City. Journal of Planning Education and Research 25:121-130.
  • [13] Newman, Lex, “Descartes’ Epistemology”, The Stanford Encyclopedia of Philosophy (Fall 2010 Edition), Edward N. Zalta (ed.), available online.
  • [14] Hilary Putnam, The meaning of “meaning”. University of Minnesota 1975.
  • [15] Ludwig Wittgenstein, Philosophical Investigations.
  • [16] Wilhelm Vossenkuhl, Ludwig Wittgenstein. Beck’sche Reihe, München 2003.
  • [17] Hilary Putnam, Representation and Reality. MIT Press, Cambridge (MA.) 1988.
  • [18] Florence Rudolf and Claire Grino (2012), “The Nature-Society Controversy in France: Epistemological and Political Implications”, in: Dennis Erasga (ed.), Sociological Landscape – Theories, Realities and Trends. InTech. available online.
  • [19] John V. Punter, Matthew Carmona, The Design Dimension of Planning: Theory, Content, and Best Practice for design policies. Chapman & Hall, London 1997.
  • [20] David Thacher (2004). The Casuistical Turn in Planning Ethics. Lessons from Law and Medicine. Journal of Planning Education and Research 23(3): 269–285.
  • [21] E. L. Charnov (1976). Optimal foraging: the marginal value theorem. Theoretical Population Biology 9:129–136.
  • [22] John Maynard Smith, G.R. Price (1973). The logic of animal conflict. Nature 246: 15–18.
  • [23] Stanley M. Stein and Thomas L. Harper (2005). Rawls’s ‘Justice as Fairness’: A Moral Basis for Contemporary Planning Theory. Planning Theory 4(2): 147–172.
  • [24] Sue Hendler, “On the Use of Models in Planning Ethics”. In S. Mandelbaum, L. Mazza and R. Burchell (eds.), Explorations in Planning Theory. Center for Urban Policy Research. New Brunswick (NJ) 1996. pp. 400–413.
  • [25] Heather Campbell (2012). ‘Planning ethics’ and rediscovering the idea of planning. Planning Theory 11(4): 379–399.
  • [26] Vera Bühlmann, “Primary Abundance, Urban Philosophy. Information and the Form of Actuality”, in: Vera Bühlmann, Ludger Hovestadt (Eds.), Printed Physics. Applied Virtuality Series Vol. I, Birkhäuser Basel 2013, pp. 114–154 (forthcoming). available online.
  • [27] Leonard Lawlor and Valentine Moulard, “Henri Bergson”, The Stanford Encyclopedia of Philosophy (Fall 2012 Edition), Edward N. Zalta (ed.), available online: http://plato.stanford.edu/archives/fall2012/entries/bergson/.
  • [28] Bruno Latour, Politics of Nature. 2004.
  • [29] Mariam Fraser (2006). The ethics of reality and virtual reality: Latour, facts and values. History of the Human Sciences 19(2): 45–72.
  • [30] Sue Hendler and Reg Lang (1986). Planning and Ethics: Making the Link. Ontario Planning Journal September-October, 1986, p.14–15.
  • [31] Wilhelm Vossenkuhl, Die Möglichkeit des Guten. Beck, München 2006.
  • [32] Immanuel Kant, Zum ewigen Frieden.
  • [33] Gilles Deleuze, Cinema II – The Time-Image.
  • [34] Bent Flyvbjerg, “The Dark Side of Planning: Rationality and Realrationalität,” in: Seymour Mandelbaum, Luigi Mazza, and Robert Burchell (eds.), Explorations in Planning Theory. Center for Urban Policy Research Press, New Brunswick (NJ) 1996. pp. 383–394.
  • [35] Henry Mintzberg, Rise and Fall of Strategic Planning. Free Press, New York 1994.
  • [36] S. M. Manson, D. O’Sullivan (2006). Complexity theory in the study of space and place. Environment and Planning A 38(4): 677–692.
  • [37] John H. Miller, Scott E. Page, Complex Adaptive Systems: An Introduction to Computational Models of Social Life. 2007.
  • [38] Henrik Jeldtoft Jensen, Elsa Arcaute (2010). Complexity, collective effects, and modeling of ecosystems: formation, function, and stability. Ann. N.Y. Acad. Sci. 1195 (2010) E19–E26.
  • [39] Robert A. Meyers (ed.), Encyclopedia of Complexity and Systems Science. Springer 2009.
  • [40] Frederic Vester.
  • [41] Juval Portugali (2002). The Seven Basic Propositions of SIRN (Synergetic InterRepresentation Networks). Nonlinear Phenomena in Complex Systems 5(4):428-444.
  • [42] Michael Batty (2010). Visualizing Space–Time Dynamics in Scaling Systems. Complexity 16(2): 51–63.
  • [43] Michael Batty, Cities and Complexity: Understanding Cities with Cellular Automata, Agent-Based Models, and Fractals. MIT Press, Boston 2007.
  • [44] Angelique Chettiparamb (2005). Fractal spaces in planning and governance, Town Planning Review, 76(3): 317–340.
  • [45] Angelique Chettiparamb (2014, forthcoming). Complexity Theory and Planning: Examining ‘fractals’ for organising policy domains in planning practice. Planning Theory, 13(1).
  • [46] Angelique Chettiparamb (2013, forthcoming). Fractal Spatialities. Environment and Planning D: Society and Space.
  • [47] Gilles Deleuze, The Logic of Sense. 1968.
  • [48] Fred Moten and Stefano Harney (2011). Politics Surrounded. The South Atlantic Quarterly 110:4.
  • [49] ‘Liber viginti quattuor philosophorum’ (CCM, 143A [Hermes Latinus, t.3, p.1]), edited by: F. Hudry, Turnhout, Brepols, 1997.
  • [50] Gilles Deleuze, Félix Guattari. A Thousand Plateaus. [1980].
  • [51] Anna Leidreiter (2012). Circular metabolism: turning regenerative cities into reality. The Global Urbanist – Environment, 24. April 2012. available online.
  • [52] Marius Buliga (2011). Computing with space: a tangle formalism for chora and difference. arXiv:1103.6007v2 21 Apr 2011. available online.
  • [53] Bruno Latour (2010). “Networks, Societies, Spheres: Reflections of an Actor-network Theorist,” International Journal of Communication, Manuel Castells (ed.), special issue Vol.5, pp.796–810. available online.
  • [54] Klaus Wassermann (2010). SOMcity: Networks, Probability, the City, and its Context. eCAADe 2010, Zürich. September 15-18, 2010. available online.
  • [55] Vera Bühlmann, “Primary Abundance, Urban Philosophy. Information and the Form of Actuality”. in: Vera Bühlmann, Ludger Hovestadt (Eds.), Printed Physics (Applied Virtuality Series Vol. 1), Springer, Wien 2012. pp. 114–154.
  • [56] Vera Bühlmann, “The Integrity of Objects: Design, Information, and the Form of Actuality” to appear in ADD METAPHYSICS, ed. by Jenna Sutela et.al. Aalto University Digital Design Laboratory, ADDLAB (forthcoming 2013).
  • [57] Gilles Deleuze, Difference and Repetition. 1967.
  • [58] Yeonkyung Lee and Sungwoo Kim (2008). Reinterpretation of S. Giedion’s Conception of Time in Modern Architecture – Based on his book, Space, Time and Architecture. Journal of Asian Architecture and Building Engineering 7(1):15–22.

۞

Behavior

September 7, 2012 § Leave a comment

Animals behave. Of course, one could say.

Yet, why do we feel a certain naturalness here, in this relation between the cat as an observed and classified animal on the one side and the language game “behavior” on the other? Why don’t we say, for instance, that the animal happens? Or, likewise, that it is moved by its atoms? To which conditions does the language game “behavior” respond?

As strange as this might look, it is actually astonishing that physicists easily attribute the quality of “behavior” to their dog or their cat, although they will rarely attribute ideas to them (for journeys or the like). For physicists usually claim that the whole world can be explained in terms of the physical laws that govern the movement of atoms (e.g. [1]). Even physicists, it seems, exhibit some dualism in their concepts when it comes to animals. Yet, physicists claimed for a long period of time, actually well into the mid-1980s, that the behavioral sciences could not count as a “science” at all, despite the fact that Lorenz and Tinbergen won the Nobel prize for physiology or medicine in 1973.

The difficulties physicists obviously suffer from are induced by a single entity: complexity. Here we refer to the notion of complexity that we developed earlier, which essentially is built from the following 5 elements.

  • – Flux of entropy, responsible for dissipation;
  • – Antagonistic forces, leading to emergent patterns;
  • – Standardization, mandatory for temporal persistence on the level of basic mechanisms as well as for selection processes;
  • – Compartmentalization, together with left-overs leading to spatio-temporal persistence as selection;
  • – Self-referential hypercycles, leading to sustained 2nd order complexity with regard to the relation of the whole to its parts.

Any setup for which we can identify this set of elements leads to probabilistic patterns that are organized on several levels. In other words, these conditioning elements are necessary and sufficient to “explain” complexity. In behavior, the sequence of patterns and the sequence of simpler elements within patterns are far from randomly arranged; yet it becomes more and more difficult to predict a particular pattern the higher its position in the stack of nested patterns, that is, the higher its level of integration. Almost the same could be said about the observable changes in complex systems.

Dealing with behavior is thus a non-trivial task. There are no “laws” that would somehow be mapped into the animal such that an apriori defined mathematical form would suffice for a description of the pattern, or of the animal as a whole. In the behavioral sciences, one first has to fix a catalog of behavioral elements, and only by reference to this catalog can we start to observe in a way that allows for comparisons with other observations. I deliberately avoid the concept of “reproducibility” here. How to know about that catalog, often called a behavioral taxonomy? The answer is that we can’t know it in the beginning. To reduce observation completely to the physical level is not a viable alternative either. Observing a particular species, and often even a particular social group or individual, improves over time, yet we can’t speak about that improvement. There is a certain notion of “individual” culture here that develops between the “human” observer and the behaving system, the animal. The written part of this culture precipitates in the said catalog, but there remains a large part, the habit of observing, that can’t be described without performing it. Observations on animals are never reproducible in the same sense as is possible with physical entities, the ultimate reason being that the latter are devoid of individuality.

A behavioral scientist may work on quite different levels. She could investigate some characteristics of behavior in relation to the level of energy consumption, or to differential reproductive success. On this level, one would hardly go into the details of the form of behavior. Quite different from this are those investigations that address the level of the form of the behavior. The form becomes an important target of the investigation if the scientist is interested in the differential social dynamics of animals belonging to different groups, populations or species. In physics, there is no form other than the mathematical. Electrons are (treated in) the same (way) by physicists all over the world, even across the whole universe. Try this with cats… You will lose the cat-ness.

It is quite clear that the social dynamics can’t be addressed by means of mere frequencies of certain simple behavioral elements, such as scratching, running or even sniffing at other animals. There might be differences, but we won’t understand much of the animal that way, of course, particularly not with regard to the flow of information in which the animal engages.

The big question that arose during the 1970s and 1980s was how to address behavior, its structure and its patterning, while avoiding a physicalist reduction.

Some intriguing answers have been given in the respective discourse since the beginning of the 1950s, though only a few people recognized the importance of the form. For instance, to understand wolves, Moran and Fentress [2] used the concept of choreography to get a descriptive grip on the quite complicated patterns. Colmenares, in his work about baboons, most interestingly introduced the notion of the play to describe the behavior in a group of baboons. He distinguished more than 80 types of social games as arrangements of “moves” that span across space and time in a complicated way; this behavioral wealth rendered it somewhat impossible to analyze the data at that time. The notion of the social game is so interesting because it is quite close to the concept of the language game.

Doing science means translating observations into numbers. Unfortunately, in the behavioral sciences this translation is rather difficult and in itself only weakly standardized (so far), despite many attempts, precisely for the reason that behavior is the observable output of a deeply integrated complex system, for instance the animal. Whenever we are going to investigate behavior we carefully have to select the appropriate level of investigation. Yet, in order to understand the animal, we cannot even reduce the animal to a certain level of integration. We should map the fact of integration itself.

There is a dominant methodological aspect in the description of behavior that differs from those in sciences closer to physics. In the behavioral sciences one can invent new methods by inventing new purposes, something that is not possible in classical physics or engineering, at least as long as matter is not taken as something that behaves. Anyway, any method for creating formal descriptions invokes mathematics.

Here it becomes difficult, because mathematics does not provide us any means to deal with emergence. We can’t, of course, blame mathematics for that. It is not possible in principle to map emergence onto an apriori defined set of symbols and operations.

The only way to approximate an appropriate approach is by a probabilistic methodology that also provides the means to distinguish various levels of integration. The first half of this program is easy to accomplish, the second less so. For emergence is a creative process; it induces the necessity of interpretation as a constructive principle. Precisely this has been digested by behavioral science into the practice of the behavioral catalog.

1. This Essay

Well, here in this essay I am not mainly interested in the behavior of animals or the sciences dealing with the behavior of animals. Our intention was just to give an illustration of the problematic field that is provoked by the “fact” of animals and their “behavior”. The most salient issue in this problematic field is the irreducibility, in turn caused by the complexity and the patterning resulting from it. The second important part of this field is given by the methodological answers to these concerns, namely the structured probabilistic approach, which responds appropriately to the serial characteristics of the patterns, that is, to the transitional consistency of the observed entity as well as of the observational recordings.

The first of these issues—irreducibility—we need not discuss in detail here. We did this before, in a previous essay and in several other locations. We just have to remember that empiricist reduction means attempting a sufficient description by dissecting the entity into its parts, thereby neglecting the circumstances, the dependency on the context and the embedding into the fabric of relations that is established by other instances. In physics there is no such fabric, there are just anonymous fields; there is no dependency on the context, hence form is not a topic in physics. As soon as form becomes an issue, we leave physics, entering either chemistry or biology. As said, we won’t go into further details about that. Here, we will deal mainly with the second part, yet with regard to two quite different use cases.

We will approach these cases, the empirical treatment of “observations” in computational linguistics and in urbanism, first from the methodological perspective, as both share certain conditions with the “analysis” of animal behavior. In chapter 8 we will give more pronounced reasons about this alignment, which at first sight may seem to be, well, a bit adventurous. The comparative approach, through its methodological arguments, will lead us to the emphasis of what we call “behavioral turn”. The text and the city are regarded as behaving entities, rather than the humans dealing with them.


2. The Inversion

Given the two main conceptual landmarks mentioned above—irreducibility and the structured probabilistic approach—that establish the problematic field of behavior, we now can do something exciting. We take the concept and its conditions, detach it from its biological origins and apply it to other entities where we meet the same or rather similar conditions. In other words, we practice a differential as Deleuze understood it [3]. So, we have to spend a few moments dealing with these conditions.

Slightly re-arranged and a bit more abstract than is the case in the behavioral sciences, these conditions are:

  • – There are patterns that appear in various forms, even though they are made from the same elements.
  • – The elements that contribute to the patterns are structurally different.
  • – The elements are not all plainly visible; some, most, or even the most important ones are only implied.
  • – Patterns are arranged in patterns, implying that patterns are also elements, despite the fact that there is no fixed form for them.
  • – The arrangement of elements and patterns into other patterns is dependent on the context, which in turn can be described only in probabilistic terms.
  • – Patterns can be classified into types or families; the classification, however, is itself non-trivial, that is, it is not supported.
  • – The context is given by variable internal and external influences, which imply a certain persistence of the embedding of the observed entity into its spatial, temporal and relational neighborhood.
  • – There is a significant symbolic “dimension” in the observation, meaning that the patterns we observe occur in a sequence space upon an alphabet of primitives, not just in numerical space. This symbolistic account is invoked by the complexity of the entity itself. Actually, the difference between symbolic and numerical sequences and patterns is much less than categorical, as we will see. Yet, it makes a large difference whether we include or exclude the methodological possibility of symbolic elements in the observation.

Whenever we meet these conditions, we can infer the presence of the above mentioned problematic field, which is mainly given by irreducibility and—as its match in the methodological domain—the practice of a structured probabilistic approach. This list provides us with an extensional circumscription of abstract behavior.

A slightly different route into this problematic field draws on the concept of complexity. Complexity, as we understand it by means of the 5 elements provided above (for details see the full essay on this subject), can itself be inferred by checking for the presence of the constitutive elements. Once we see antagonisms, compartments and standardization, we can expect emergence and sustained complexity, which in turn means that the entity is not reducible and that a particular methodological approach must be chosen.

We also can clearly state what should not be regarded as a member of this field. The most salient item is the neglect of individuality. The second one, now in the methodological domain, is the destruction of relationality, as it is most easily accomplished by referring to raw frequency statistics. It should be obvious that destroying the serial context in an early step of the methodological mapping from observation to number also destroys any possibility to understand the particularity of the observed entity. The resulting picture will not only be coarse, most probably it will also be utterly wrong, and even worse, there will be no chance to recognize this departure into an area that is free from any sense.

3. The Targets

At the time of writing this essay, there are three domains that suffer most from the reductionist approach. Well, two and a half, maybe, as the third, genetics, is on its way to overcoming the naïve physicalism of former days.

This does not hold for the other two areas, urbanism and computational linguistics, at least as far as it is relevant for text mining and information retrieval1. The dynamics in the respective communities are of course quite complicated, actually too complicated to achieve a well-balanced point of view here in this short essay. Hence, I ask to be excused for the inevitable coarseness of treating those domains as if they were homogeneous. Yet, I think that in both areas the mainstream is seriously suffering from a misunderstood scientism. In some way, people there, strangely enough, behave more positivist than researchers in the natural sciences.

In other words, we follow the question of how to improve the methodology in those two fields of urbanism and the computerized treatment of textual data. It is clear that the question about methodology implies a particular theoretical shift. This shift we would like to call the “behavioral turn”. Among other changes, the “behavioral turn” as we construct it allows for overcoming the positivist separation between observer and observed without sacrificing the possibility of reasonable empiric modeling.2

Before we argue in a more elaborate manner about this proposed turn in relation to textual data and urbanism, we first would like to accomplish two things. First, we briefly introduce two methodological concepts that deliberately try to cover the context of events, where those events are conceived as part of a series that always also develops into a kind of network of relations. Thus, we avoid conceiving of events as a series of separated points.

Secondly, we will discuss current mainstream methodology in the two fields that we are going to focus here. I think that the investigation of the assumptions of these approaches, often remaining hidden, sheds some light onto the arguments that support the reasonability of the “behavioral turn”.

4. Methodology

The big question remaining to deal with is thus: how to deal with the observations that we can make in and about our targets, the text or the city?

There is a clear starting point for the selection of any method that could be considered appropriate: the method should inherently respond to the seriality of the basic signal. A well-known method of choice for symbolic sequences are Markov chains; other important ones are random contexts and random graphs. In the domain of numerical sequences, wavelets are the most powerful way to represent various aspects of a signal at once.

Markov Processes

A Markov chain is the outcome of applying the theory of Markov processes onto a symbolic sequence. A Markov process is a neat description of the transitional order in a sequence. We also may say that it describes the conditional probabilities for the transitions between any subset of elements. Well, in this generality it is difficult to apply. Let us thus start with the most simple form, the Markov process of 1st order.

A 1st order Markov process describes just and only all pairwise transitions that are possible for a given “alphabet” of discrete entries (symbols). These transitions can be arranged in a so-called transition matrix if we follow the convention of using the preceding part of the transitional pair as row header and the succeeding part as column header. If a certain transition occurs, we enter a tick into the respective cell, given by the address row x column, which derives from the pair prec -> succ. That’s all. At least for the moment.
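To make the bookkeeping concrete, here is a minimal sketch in Python (standard library only); the symbolic sequence is a made-up stand-in for a behavioral record, not data from any of the studies cited here.

```python
def transition_matrix(sequence):
    """Tally all 1st order transitions: rows = preceding symbol, columns = succeeding symbol."""
    alphabet = sorted(set(sequence))
    index = {s: i for i, s in enumerate(alphabet)}
    counts = [[0] * len(alphabet) for _ in alphabet]
    for prec, succ in zip(sequence, sequence[1:]):
        counts[index[prec]][index[succ]] += 1
    return alphabet, counts

# made-up record; each letter stands for one element of a behavioral catalog
observed = list("abacabdacbabda")
alphabet, counts = transition_matrix(observed)
for symbol, row in zip(alphabet, counts):
    print(symbol, row)
```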

Such a table captures in some sense the transitional structure of the observed sequence. Of course, it captures only a simple aspect, since the next pair does not know anything about the previous pair. A 1st order Markov process is thus said to have no memory. Yet, it would be a drastic misunderstanding to generalize the absence of memory to any kind of Markov process. Actually, Markov processes can precisely be used to investigate the “memories” in a sequence, as we will see in a moment.

Anyway, on any such transition table we can do smart statistics, for instance to identify transitions that are salient for their exceptionally high or low frequency. Such reasoning takes into account the marginal frequencies of the table and is akin to correspondence analysis. Van Hooff developed this “adjusted residual method” and applied it with great success in the analysis of observational data on chimpanzees [4][5].

These residuals are residuals against a null-model, which in this case is the plain distribution. In other words, the reasoning is simply the same as always in statistics: establish a suitable ratio of observed/expected, and then determine the reliability of a certain selection that is based on that ratio. In the case of transition matrices the null-model states that all transitions occur with the same frequency. This is, of course, simplifying, but it is also simple to calculate. There are some assumptions in that whole procedure that are worth mentioning.
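As a rough illustration of this observed/expected reasoning, the following sketch (assuming numpy is available) computes adjusted residuals in the form usually attributed to Haberman; whether this matches van Hooff’s procedure in every detail is not claimed here, and the small table is invented.

```python
import numpy as np

def adjusted_residuals(counts):
    """Adjusted residuals of a transition table against the independence null-model."""
    counts = np.asarray(counts, dtype=float)
    n = counts.sum()
    row = counts.sum(axis=1, keepdims=True)          # marginal frequencies of the preceding elements
    col = counts.sum(axis=0, keepdims=True)          # marginal frequencies of the succeeding elements
    expected = row @ col / n                         # expected counts under the null-model
    variance = expected * (1 - row / n) * (1 - col / n)
    return (counts - expected) / np.sqrt(variance)

table = [[12, 3, 1],
         [ 2, 9, 4],
         [ 1, 2, 8]]                                 # invented transition counts
print(np.round(adjusted_residuals(table), 2))        # values beyond roughly |2| flag salient transitions
```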

The most important assumption of the null-model is that all elements used to set up the transition matrix are independent of each other, except for their 1st order dependency, of course. This also means that the null-model assumes equal weights for the elements of the sequence. It is quite obvious that we should assume so only in the beginning of the analysis. The third important assumption is that the process is stationary, meaning that the kind and the strength of the 1st order dependencies do not change over the entire observed sequence.

Yet, nothing forces us to stick to just the 1st order Markov process, or to apply it globally. A 2nd order Markov process could be formulated which would map all transitions x(i)->x(i+2). We may also formulate a dense process across several orders, just by overlaying all orders from 1 to n into a single transition matrix.

Proceeding this way, we end up with an ensemble of transitional models. Such an ensemble is suitable for the comparative probabilistic investigation of the memory structure of a symbolic sequence that is being produced by a complex system. Matrices can be compared (“differenced”) regarding their density structure, revealing even faint ties between elements across several steps in the sequence. Provided the observed sequence is long enough, single transition matrices as well as ensembles thereof can be resampled on parts of sequences in order to partition the global sequence, that is, to identify locally stable parts of the overall process.
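One way to read this ensemble idea, sketched here under the assumption that the higher orders are taken as lagged transitions x(i)->x(i+k); the comparison step is reduced to printing the row-normalized matrices, and the sequence is again a made-up placeholder.

```python
import numpy as np

def lag_matrix(sequence, alphabet, k):
    """Count, for every pair of symbols, how often the first is followed k steps later by the second."""
    index = {s: i for i, s in enumerate(alphabet)}
    m = np.zeros((len(alphabet), len(alphabet)))
    for a, b in zip(sequence, sequence[k:]):
        m[index[a], index[b]] += 1
    return m

sequence = list("abacabdacbabdaabac")                 # made-up symbolic record
alphabet = sorted(set(sequence))
ensemble = {k: lag_matrix(sequence, alphabet, k) for k in (1, 2, 3)}
# "differencing" the row-normalized matrices across lags hints at memory over several steps
for k, m in ensemble.items():
    print("lag", k)
    print(np.round(m / m.sum(axis=1, keepdims=True), 2))
```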

Here you may well think that this sounds like a complicated “work-around” for a Hidden Markov Model (HMM). Yet, although an HMM is more general than the transition matrix perspective in some respects, it is also less rich. In an HMM, the multiplicity is—well—hidden. It reduces the potential complexity of sequential data to a single model, again with the claim of global validity. Thus, HMMs are somewhat more suitable the closer we are to physics, e.g. in speech recognition. But even there their limitation is quite obvious.

From the domain of ecology we can import another trick for dealing with the transitional structure. In ecosystems we can observe so-called succession: certain arrangements of species and their abundances follow each other rather regularly, yet probabilistically, often heading towards some stable final “state”. Given a limited observation of such transitions, how can we know about the final state? Using the transition matrix, the answer can be found simply by a two-fold operation of multiplying the matrix with itself and intermittent filtering by renormalization. This procedure acts as a frequency-independent filter. It helps to avoid type-II errors when applying the adjusted residuals method, that is, transitions with a weak probability will be less likely dismissed as irrelevant.
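A hedged sketch of that two-fold operation, reading it as repeated multiplication of the row-normalized transition matrix with itself, with intermittent renormalization: the rows converge towards a single limiting profile, i.e. the hypothetical final “state” of the succession. The numbers are invented.

```python
import numpy as np

def limiting_profile(counts, steps=10):
    """Multiply the transition matrix with itself repeatedly, renormalizing the rows in between."""
    p = np.asarray(counts, dtype=float)
    p = p / p.sum(axis=1, keepdims=True)              # row-normalized transition probabilities
    for _ in range(steps):
        p = p @ p
        p = p / p.sum(axis=1, keepdims=True)          # intermittent renormalization
    return p

succession = [[5, 3, 1],
              [2, 6, 3],
              [1, 2, 9]]                              # invented succession counts
print(np.round(limiting_profile(succession), 3))      # all rows approach the same stationary profile
```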

Contexts

The method of Markov processes is powerful, but it suffers from a serious problem. This problem is introduced by the necessity to symbolize certain qualities of the signal in advance of its use in modeling.

We can’t use Markov processes directly on the raw textual data. Doing so instead would trap us in the symbolistic fallacy. We would either ascribe the symbol itself a meaning—which would result in a violation of the primacy of interpretation—or it would conflate the appearance of a symbol with its relevance, which would constitute a methodological mistake.

The way out of this situation is provided by a consequent probabilization. Generally, we may well say that probabilization takes the same role for the quantitative sciences as the linguistic turn did for philosophy. Yet, it is still an attitude that is largely neglected as a dedicated technique almost everywhere in science. (For an example application of probabilization with regard to evolutionary theory see this.)

Instead of taking symbols as they are pretended to be found “out there”, we treat them as the outcome of an abstract experiment, that is, as a random variable. Random variables do not establish themselves as dual concepts, as 1 or 0, to be or not to be; they establish themselves as a probability distribution. Such a distribution contains potentially an infinite number of discretizations. Hence, probabilistic methods are always more general than those which rely on “given” symbols.

Kohonen et al. proposed a simple way to establish a random context [6]. The step from symbolic crispness to a numerical representation is not trivial, though. We need a double-articulated entity that is “at home” in both domains. This entity is a high-dimensional random fingerprint. Such a fingerprint consists simply of a large number, well above 100, of random values from the interval [0..1]. According to the lemma of Hecht-Nielsen [7], any two such vectors are approximately orthogonal to each other. In other words, it is a name expressed by numbers.
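A minimal sketch of such fingerprints (again assuming numpy). Note that the near-orthogonality is demonstrated here on the centered components, which is one common way to read the claim for vectors with values from [0..1]; this is not Kohonen’s original implementation, and the vocabulary is invented.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 300                                                  # "well above 100"
vocabulary = ["city", "street", "cat", "behavior", "plan"]
fingerprints = {w: rng.random(dim) for w in vocabulary}    # random values from the interval [0..1]

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# after centering, fingerprints of different words are approximately orthogonal: names made of numbers
centered = {w: v - v.mean() for w, v in fingerprints.items()}
print(round(cosine(centered["city"], centered["street"]), 3))   # close to 0
print(round(cosine(centered["city"], centered["city"]), 3))     # exactly 1
```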

After recoding all symbols in a text into their random fingerprints it is easy to establish probabilistic distributions of the neighborhood of any word. The result is a random context, also called a random graph. The basic trick to accomplish such a distribution is to select a certain fixed size for the neighborhood—say five or seven positions in total—and then always place the word of interest at a certain position, for instance in the middle.

This procedure is applied to all words in a text, or to any symbolic series. Doing so, we get a collection of overlapping random contexts. The final step then is a clustering of the vectors according to their similarity.
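A sketch of the windowing step, under the simplifying assumption that a context vector is just the concatenation of the fingerprints within a fixed-size window, with the word of interest in the middle; the final clustering (e.g. by a SOM) is omitted, and text and fingerprints are made up.

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 100
text = "the city behaves like a text and the text behaves like a city".split()
fingerprints = {w: rng.random(dim) for w in set(text)}

def random_contexts(words, half_window=2):
    """Collect, for each word, the concatenated fingerprints of its fixed-size neighborhoods."""
    contexts = {}
    for i in range(half_window, len(words) - half_window):
        window = words[i - half_window: i + half_window + 1]       # five positions in total
        contexts.setdefault(words[i], []).append(
            np.concatenate([fingerprints[w] for w in window]))
    return contexts

contexts = random_contexts(text)
print(len(contexts["text"]), "context vectors collected for 'text'")
```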

It is quite obvious that this procedure, as proposed by Kohonen, sticks to strong assumptions, despite its turn to probabilization. The problem is the fixed order, that is, in his implementation the order is independent of the context. Thus his approach is still limited in the same way as the n-gram approach (see chp. 5.3 below). Yet, sometimes we meet strong inversions and extensions of relevant dependencies between words. Linguists speak of injected islands with regard to wh*-phrases. Anaphors are another example. Chomsky criticized the approach of fixed-size contexts very early on.

Yet, there is no necessity to limit the methodology to fixed-size contexts, or to symmetrical instances of probabilistic contexts. Yes, of course this will result in a situation where we corrupt the tabularity of the data representation: many rows differ in their length, and there is (absolutely) no justification for enforcing a proper table by filling “missing values” into the “missing” cells of the table.

Fortunately, there is another (probabilistic) technique that can be used to arrive at a proper table without distorting the content by adding missing values. This technique is random projection, first identified by Johnson & Lindenstrauss (1984), which in the case of free-sized contexts has to be applied in an adaptive manner (see [8] or [9] for a more recent overview). Usually, a source (n*p) matrix (n=rows, p=columns=dimensions) is multiplied with a (p*k) random matrix whose entries follow a Gaussian distribution, resulting in a target matrix of only k dimensions and n rows. This way a matrix of 10000+ columns can be projected into one made of only 100 columns without losing much information. Yet, using the lemma of Hecht-Nielsen we can compress any of the rows of a matrix individually. Since the random vectors are approximately orthogonal to each other we won’t introduce any information across all the data vectors that are going to be fed into the SOM. This stepwise operation becomes quite important for large amounts of documents, since in this case we have to adopt incremental learning.
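A hedged sketch of the basic, non-adaptive variant of random projection: a Gaussian (p*k) matrix compresses an (n*p) data matrix to k columns while roughly preserving pairwise distances. The adaptive handling of free-sized contexts mentioned above is not shown, and the data are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p, k = 50, 10000, 100
data = rng.random((n, p))                                 # source (n*p) matrix: n rows, p dimensions

projection = rng.standard_normal((p, k)) / np.sqrt(k)     # Gaussian random (p*k) matrix
compressed = data @ projection                            # target matrix with only k dimensions

# pairwise distances are approximately preserved (Johnson & Lindenstrauss)
print(round(float(np.linalg.norm(data[0] - data[1])), 2),
      round(float(np.linalg.norm(compressed[0] - compressed[1])), 2))
```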

Such, we approach slowly but steadily the generalized probabilistic context that we described earlier. The proposal is simply that in dealing with texts by means of computers we have to apply precisely the most general notion of context, which is devoid of structural preoccupations such as we meet them e.g. in the case of n-grams or Markov processes.

5. Computers Dealing with Text

Currently, so-called “text mining” is a hot topic. More and more of human communication is supported by digitally based media and technologies, hence more and more texts are accessible to computers without much effort. People try to use textual data from digital environments, for instance, to do sentiment analysis about companies, stocks, or persons, mainly in the context of marketing. The craziness there is that they pretend to classify a text’s sentiment without understanding it, more or less on the basis of the frequency of scattered symbols.

The label “text mining” is reminiscent of “data mining”; yet, the structure of the two endeavors is drastically different. In data mining one is always interested in the relevant variables in order to build a sparse model that could even be understood by human clients. The model in turn is then used to optimize some kind of process from which the data for modeling have been extracted.

In the following we will describe some techniques, methods and attitudes that are highly unsuitable for the treatment of textual “data”, despite the fact that they are widely used.

Fault 1 : Objectivation

The most important difference between the two flavors of “digital mining” concerns, however, the status of the “data”. In data mining, one deals with measurements that are arranged in a table. This tabular form is only possible on the basis of a preceding symbolization, which additionally is strictly standardized in advance of the measurement.

In text mining this is not possible. There are no “explanatory” variables that could be weighted. Text mining thus just means finding a reasonable selection of texts in response to a “query”. For textual data it is not possible to give any criterion for how to look at a text, how to select a suitable reference corpus for determining any property of the text, or simply how to compare it to other texts before its interpretation. There are no symbols, no criteria that could be filled into a table. And most significantly, there is no target that could be found “in the data”.

It is devoid of any sense to try to optimize a selection procedure by means of a precision/recall ratio. This would mean that the meaning of a text could be determined objectively before any interpretation, or, likewise, that the interpretation of a text is standardizable up to a formula. Neither is possible; claiming otherwise is ridiculous.

People responded to these facts with a fierce endeavor, which ironically is called “ontology”, or even “semantic web”. Yet, neither will the web ever become “semantic” nor is database-based “ontology” a reasonable strategy (except for extremely standardized tasks). The idea in both cases is to determine the meaning of an entity before its actual interpretation. This of course is utter nonsense, and the fact that it is nonsense is also the reason why the so-called “semantic web” never started to work. These guys should really do more philosophy.

Fault 2 : Thinking in Frequencies

A popular measure for describing the difference between texts are the variants of the so-called tf-idf measure. “tf” means “term frequency” and describes the normalized frequency of a term within a document. “idf” means “inverse document frequency” and is based on the frequency of a term across all documents in a corpus: the fewer documents contain the term, the higher its idf.
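For reference, a minimal sketch of one common variant of the measure (raw term frequency times a log-scaled inverse document frequency); actual implementations differ in the details, and the tiny “corpus” is of course invented.

```python
import math
from collections import Counter

corpus = ["the brown dog sleeps",
          "the lilac cow likes chocolate",
          "the black and white cow grazes"]
documents = [doc.split() for doc in corpus]

def tf_idf(term, doc, documents):
    tf = Counter(doc)[term] / len(doc)                                    # normalized term frequency
    containing = sum(1 for d in documents if term in d)
    idf = math.log(len(documents) / containing) if containing else 0.0    # inverse document frequency
    return tf * idf

print(round(tf_idf("cow", documents[1], documents), 3))   # "cow" occurs in 2 of 3 documents
print(round(tf_idf("the", documents[1], documents), 3))   # "the" occurs everywhere, so its idf is 0
```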

The frequency of a term, even its howsoever differentialized frequency, can hardly be taken as the relevance of that term given a particular query. To cite the example from the respective entry in Wikipedia: what is “relevant” when selecting a document by means of the query “the brown cow”? Sticking to terms makes sense if and only if we accept an apriori contract about the strict limitation to the level of the terms. Yet, this has nothing to do with meaning. Absolutely nothing. It is comparing pure graphemes, not even symbols.

Even if it were related to meaning it would be the wrong method. Simply think of a text that contains three chapters: chapter one about brown dogs, chapter two about the relation of (lilac) cows and chocolate, chapter three about black & white cows. There is no phrase about a brown cow in the whole document, yet it would certainly be selected as highly significant by the search engine.

This example nicely highlights another issue. The above mentioned hypothetical text could nevertheless be highly relevant, yet only at the moment the user sees it, triggering some idea that before was not even on the radar. Quite obviously, even though the search would then probably have been phrased differently, the fact remains that the meaning is neither in the ontology nor in the frequency, and also not in the text as such, before the actual interpretation by the user. The issue becomes more serious if we consider slightly different colors that could still count as “brown”, yet with a completely different spelling. And even more so if we take anaphoric arrangements into account.

The above mentioned method of Markov processes helps a bit, but not completely of course.

Astonishingly, even the inventors of the WebSom [6], probably the best model for dealing with textual data so far, commit the frequency fallacy. As input for the second-level SOM they propose a frequency histogram. Completely unnecessarily, I have to add, since the text “within” the primary SOM can easily be mapped to a Markov process, or to probabilistic contexts, of course. Interestingly, any such processing that brings us from the first to the second layer is somewhat more reminiscent of image analysis than of text analysis. We mentioned that already earlier in the essay “Waves, Words and Images”.

Fault 3 : The Symbolistic Fallacy (n-grams & co.)

Another really popular methodology for dealing with texts is n-grams. N-grams are related to Markov processes, as they also take the sequential order into account. Take for instance (again the example from Wikipedia) the sequence “to be or not to be”. The transformation into 2-grams (or bi-grams) looks like this: “to be, be or, or not, not to, to be” (items separated by commas), while the 3-gram transformation produces “to be or, be or not, or not to, not to be”. In this way, the n-gram can be conceived as a small extract from a transition table of order (n-1). N-grams share a particular weakness with simple Markov models, which is the failure to capture long-range dependencies in language. These can be addressed only by means of deep grammatical structures. We will return to this point below in the discussion of fault No. 4 (Structure as Meaning).
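The transformation itself is trivial; a short sketch reproducing the example above:

```python
def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) of a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = "to be or not to be".split()
print(ngrams(tokens, 2))   # [('to', 'be'), ('be', 'or'), ('or', 'not'), ('not', 'to'), ('to', 'be')]
print(ngrams(tokens, 3))   # [('to', 'be', 'or'), ('be', 'or', 'not'), ('or', 'not', 'to'), ('not', 'to', 'be')]
```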

The strange thing is that people drop the tabular representation, thus destroying the possibility of calculating things like adjusted residuals. Actually, n-grams are mostly just counted, which is to commit the fault of thinking in frequencies, as described above.

N-grams help to build queries against databases that are robust against extensions of words, that is, prefixes, suffixes, or verb forms due to inflection. All this has, however, nothing to do with meaning. It is a basic and primitive means of making symbolic queries upon symbolic storages more robust. Nothing more.

The real problem is the starting point: taking the term as such. N-grams start with individual words that are taken blindly as symbols. Within the software doing n-grams, they are even replaced by some arbitrary hash code, i.e. the software does not see a “word”, it deals just with a chunk of bits.

This way, using n-grams for text search commits the symbolistic fallacy, similar to ontologies, but even on a more basic level. In turn this means that the symbols are taken as “meaningful” for themselves. This results in a hefty collision with the private-language-argument put forward by Wittgenstein a long time ago.

N-grams are certainly more advanced than the nonsense based on tf-idf. Their underlying intention is to reflect contexts. Nevertheless, they fail as well. The ultimate reason for the failure is the symbolistic starting point. N-grams are only a first, though far too trivial and simplistic step into probabilization.

There is already a generalization of n-grams available, as described in published papers by Kohonen & Kaski: random graphs, based on random contexts, as we described above. Random graphs overcome the symbolistic fallacy, especially if used together with a SOM. Well, honestly I have to say that random graphs imply the necessity of a classification device like the SOM. This should not be considered a drawback, since n-grams are anyway often used together with Bayesian inference. Bayesian methods are, however, not able to distill types from observations as SOMs are able to do. That is indeed a drawback, since in language learning the probabilistic approach necessarily must be accompanied by the concept of (linguistic) types.

Fault 4 : Structure as Meaning

The deep grammatical structure is an indispensable part of human languages. It is present from the sub-word level up to the level of rhetoric. And it gets really complicated: there is a wealth of rules, most of them to be followed rather strictly, but some of them applied only in a loose manner. Yet, all of them are rules, not laws.

Two issues come up here that are related to each other. The first one concerns the learning of a language. How do we learn a language? Wittgenstein proposed: simply by being shown how to use it.

The second issue concerns the status of the models about language. Wittgenstein repeatedly mentioned that there is no possibility for a meta-language, and after all we know that Carnap’s program of a scientific language failed (completely). Thus we should be careful when applying a formalism to language, whether it is some kind of grammar, or any of the advanced linguistic “rules” that we know of today (see the lexicon of linguistics for that). We have to be aware that these symbolistic models are only projective lists of observations, arranged according to some standard of a community of experts.

Linguistic models are drastically different from models in physics or any other natural science, because in linguistics there is no outer reference. (Computational) Linguistics is mostly on the stage of a Babylonian list science [10], doing more tokenizing than providing useful models, comparable to biology in the 18th century.

Language is a practice. Language is a practice of human beings, equipped with a brain and embedded in a culture. In turn, language itself contributes to cultural structures and is embedded in them. There are many spatial, temporal and relational layers and compartments to distinguish. Within such arrangements, meaning happens in the course of an ongoing interpretation, which in turn is always a social situation. See Robert Brandom’s Making it Explicit as an example of an investigation of this aspect.

What we definitely have to be aware of is that projecting language onto a formalism, or subordinating language to an apriori defined or standardized symbolism (as in formal semantics), loses essentially everything language is made from and referring to. Any kind of model of a language implicitly also claims that language can be detached from its practice and from its embedding without losing its main “characteristics”, its potential and its power. In short, it is the claim that structure conveys meaning.

This brings us to the question of the role of structure in language. It is a fact that humans understand sentences full of grammatical “mistakes”, and quite well so; moreover, in spoken language we almost always produce sentences that are full of grammatical mistakes. In fact, “mistakes” are so abundant that it becomes questionable to take them as mistakes at all. Methodologically, linguistics is thus falling back into a control science, forgetting about the role and the nature of symbolic rules such as those established by grammar. Their nature is an externalization; their role is to provide a standardization, a common basis, for performing the interpretation of sentences and utterances in a reasonable time (almost immediately) and in a more or less stable manner. The empirical “given” of a sentence alone, even of a whole text alone, cannot provide enough evidence for starting an interpretation, nor even for finishing it. (Note that a sentence is never a “given”.)

Texts as well as spoken language are nothing that could be controlled. There is no outside of language that would justify such a perspective. And finally, a model should allow for suitable prediction, that is, it should enable us to perform a decision. Here we meet Chomsky’s call for competence. In the case of language, a linguistic model should be able to produce language as a proof of concept. Yet, any attempt so far has failed drastically, which actually is not really a surprise. At the latest here it should become clear that the formal models of linguistics, and of course all the statistical approaches to “language processing” (another crap term from computational linguistics), are flawed in a fundamental way.

From the perspective of our interests here on the “Putnam Program” we conceive of formal properties as Putnam did in his “Meaning of ‘Meaning’”. Formal properties are just that: properties among other properties. In our modeling essay we proposed to replace the concept of properties by the concept of the assignate, in order to emphasize the active role of the modeling instance in constructing and selecting the factors. Sometimes we use formal properties of terms and phrases, sometimes not, dependent on context, purpose or capability. There is neither a strict tie of formal assignates to the entity “word” or “sentence”, nor could we detach them as part of a formal approach.

Fault 5 : Grouping, Modeling and Selection

Analytic formal models are a strange thing, because such a model essentially claims that there is no necessity for a decision any more. Once the formula is there, it claims global validity. The formula denies the necessity of taking the context into account as a structural element. It claims a perfect separation between observer and observed. The global validity also means that the weights of the input factors are constant, or even that there are no such weights. Note that the weights translate directly into the implied costs of a choice; hence formulas also claim that the costs are globally constant, or at least arranged in a smooth, differentiable space. This is of course far from any reality for almost any interesting context, and certainly for the contexts of language and urbanism, both deeply related to the category of the “social”.

This basic characteristic hence limits the formal symbolic approach to physical, if not just to celestial and atomic contexts. Trivial contexts, so to speak. Everywhere else something rather different is necessary. This different thing is classification as we introduced it first in our essay about modeling.

Searching for a text and considering a particular one as a “match” to the interests expressed by the search is a selection, much like any other “decision”. It introduces a notion of irreversibility. Searching itself is a difficult operation, even so difficult that it is questionable whether we should follow this pattern at all. As soon as we start to search we enter the grammatological domain of “searching”. This means that we claim the expressibility of our interests in the search statement.

This difficulty is nicely illustrated by an episode with Gary Kasparov in the context of his first battle against "Deep Blue". Given the billions of operations the super computer performed, a journalist came up with the question "How do you find the correct move so fast?" Obviously, the journalist was not aware of the mechanics of that comparison. Kasparov answered: "I do not search, I just find it." His answer is not perfectly correct, though, as he should have said "I just do it". In a conversation we mostly "just do language". We practice it, but we very rarely search for a word, an expression, or the like. Usually, our concerns are on the strategic level, or in terms of speech act theory, on the illocutionary level.

Thus we arrive at the intermediary result that we have some kind of non-analytical models on the one hand, and the performance of their application on the other. Our suggestion is that most of these models are situated on an abstract, orthoregulative level, and almost never on the representational level of the "arrangement" of words.

A model has a purpose, even if it is an abstract one. There are no models without purpose. The purpose is synonymous with the selection. Often, we do not explicitly formulate a purpose, we just perform selections in a consistent manner. It is this consistency in the selections that implies a purpose. The really important thing to understand is that the abstract notion of purpose is also synonymous with what we call "perspective", or point of view.

One could mention here the analytical "models", but those "models" are not models because they are devoid of a purpose. Given any interesting empirical situation, everybody knows that things may look quite different, depending on the "perspective" we take. Or in our words, on which abstract purpose we impose on the situation. The analytic approach denies such "perspectivism".

The strange thing now is that many people mistake the mere clustering of observations on the basis of all contributing or distinguishable factors for a kind of model. Of course, that grouping will change radically if we withdraw some of the factors, keeping only a subset of all available ones. Not only the grouping changes; the achievable typology and any further generalization will also be very different. In fact, any purpose, and even the tuning of the attitude towards the risk (costs) of unsuitable decisions, changes the set of suitable factors. Nothing could highlight more clearly the nonsense of calling naïve take-it-all clustering "unsupervised modeling". First, it is not a model. Second, any clustering algorithm or grouping procedure follows some optimality criterion, that is, it is supervised by that criterion despite the claim to the contrary. "Unsupervised modeling" implicitly claims that it is possible to build a suitable model by purely analytic means, without any reference to the outside at all. This is, of course, not possible. It is this claim that introduces a contradiction into the practice itself, because clustering usually means classification, which is not an analytic move at all. Due to this self-contradiction the term "unsupervised modeling" is utter nonsense. It is not only nonsense, it is even deceiving, as people get vexed by the term itself: they indeed believe that they are modeling in a suitable manner.
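The following small sketch is not part of the original argument; it simply assumes that numpy and scikit-learn are available, and the data are invented. It illustrates both points: the grouping depends on which factors we keep, and the procedure is guided by an optimality criterion (here the within-cluster variance) even though no labels are given.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

# 200 synthetic observations described by 5 factors (assignates)
rng = np.random.default_rng(0)
observations = rng.normal(size=(200, 5))

# Cluster once on all factors, once on an arbitrary subset of two factors
km_all = KMeans(n_clusters=3, n_init=10, random_state=0).fit(observations)
km_sub = KMeans(n_clusters=3, n_init=10, random_state=0).fit(observations[:, :2])

# The two groupings generally disagree; deciding which one is "suitable"
# requires a purpose imposed from outside the data.
print("agreement of the two groupings (ARI):",
      round(adjusted_rand_score(km_all.labels_, km_sub.labels_), 2))

# k-means minimizes an explicit optimality criterion (inertia), i.e. the
# procedure is anything but "unsupervised" in a strict sense.
print("criterion value, all factors:", round(km_all.inertia_, 1))
print("criterion value, subset:     ", round(km_sub.inertia_, 1))
```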

Now back to the treatment of texts. One of the most advanced procedures—it is a non-analytical one—is the WebSom. We described it in more detail in previous essays (here and here). Yet, as the second step Kohonen proposes clustering as a suitable means to decide about the similarity of texts. Here he commits exactly the same mistake as described before. The trick, of course, is to introduce (targeted) modeling into the comparison of texts, despite the fact that there are no possible criteria a priori. What seems to be irresolvable disappears as a problem, however, if we take into account the self-referential relations of discourses, which necessarily engrave themselves into the interpreter as self-modifying structural learning and historical individuality.

6. The Statistics of Urban Environments

The Importance of Conceptual Backgrounds

There is no investigation without an implied purpose, simply because any investigation has to perform many selections, usually far more than just a few. One of the more influential selections that has to be performed concerns the scope of the investigation. We already met this issue above when we discussed the affairs as we meet them in the behavioral sciences.

Considering investigations about social entities like urban environments, architecture or language, "scope" largely refers to the status of the individual, and in turn, to the status of time that we instantiate in our investigation. Both together establish the dimension of form as an element of the space of expressibility that we choose for the investigation.

Is the individual visible at all? I mean, in the question, in the method and after applying a methodology? For instance, as soon as we ask about matters of energy, individuals disappear. They also disappear if we apply statistics to raw observations, even if at first hand we would indeed observe individuals as individuals. To retain the visibility of individuals as individuals in a set of relations we have to apply proper means first. It is clear that any cumulative measure, like those from socio-economics, also causes the disappearance of the context and the individual.

If we keep the individuals alive in our method, the next question we have to ask concerns the relations between the individuals. Do we keep them or do we drop them? Finally, regarding the unfolding of the processes that result from the temporal dynamics of those relations, we have to select whether we want to keep aspects of form or not. If you think that the way a text unfolds, or the way things happen in the urban environment, is at least as important as their presence, well, in this case you would have to care about patterns.

It is rather crucial to understand that these basic selections determine the outcome of an investigation, as well as of any modeling or even theory building, as grammatological constraints. Once we have taken a decision on the scope, the problematics of that choice becomes invisible, completely transparent. This is the actual reason why choosing a reductionist approach as the first step is so questionable.

In our earlier essay about the belief system of modernism we emphasized the inevitability of selecting a particular metaphysical stance, way before we even think about the scope of an investigation in a particular domain. In the case of modernistic thinking, from positivism to existentialism, including any shape of materialism, the core of the belief system is metaphysical independence, shaping everything all the way down to politics, methods, tools, attitudes and strategies. If you wonder whether there is an alternative to modernistic thinking, take a look at our article where we introduce the concept of the choreostemic space.

Space Syntax

In the case of "Space Syntax" the name is the program. The approach is situated in urbanism; it has been developed and is still being advocated by Bill Hillier. Originally, Hillier was a geo-scientist, which is of some importance for following his methodology.

Put in a nutshell, the concept of space syntax claims that the description of the arrangement of free space in a built environment is necessary and sufficient for describing the quality of a city. The method of choice to describe that arrangement is statistics, either through the concept of the probabilistic density of people or through the concept of regression, relating physical characteristics of free space to the density of people. Density in turn is used to capture the effect of collective velocity vectors. If people start to slow down, walking around in different directions, density increases. Density of course also increases as a consequence of narrow passages. Yet, in this case the vectors are strongly aligned.
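To make the methodological core tangible, here is a minimal, purely hypothetical sketch of the kind of regression such an approach relies on. It is not Hillier's actual tooling; the measure, the numbers and the variable names are invented, and only numpy is assumed.

```python
import numpy as np

# Hypothetical data: a spatial measure of free space per street segment
# (call it "integration") and observed pedestrian densities.
rng = np.random.default_rng(1)
integration = rng.uniform(0.2, 2.0, size=50)              # spatial characteristic
density = 30.0 * integration + rng.normal(0.0, 8.0, 50)   # pedestrians per 100 m

# Ordinary least squares: density ~ slope * integration + intercept
slope, intercept = np.polyfit(integration, density, deg=1)
predicted = slope * integration + intercept
r2 = 1.0 - np.sum((density - predicted) ** 2) / np.sum((density - density.mean()) ** 2)
print(f"slope={slope:.1f}, intercept={intercept:.1f}, R^2={r2:.2f}")

# Note that the fitted formula is global: it assigns the same weight to
# "integration" everywhere, regardless of context -- exactly the kind of
# claim criticized throughout this essay.
```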

The spatial behavior of individuals is a result and a means of social behavior in many animal species. Yet it makes a difference whether we consider the spatial behavior of individuals or the arrangement of free space in a city as a constraint on individual spatial behavior. Hillier's claim that "The Space is the Machine" mistakes the one for the other.

In his writings, Hillier commits the figure of the petitio principii over and over again. He starts with a strong belief in analytics, and upon that he tries to justify the use of analytical techniques. His claim of "The need for an analytic theory of architecture" ([11], p.40) is just one example. He writes

The answer proposed in this chapter is that once we accept that the object of architectural theory is the non-discursive — that is, the configurational — content of space and form in buildings and built environments, then theories can only be developed by learning to study buildings and built environments as non-discursive objects.

Excluding discourse as a constitutive element, only the analytic remains. He drops any relational account, focusing just on the physical matter and postulating meaning of physical things, i.e. meaning as an a priori in the physical things. His problem is simply his inability to distinguish different horizons of time, of temporal development. Dismissing time means dismissing memory, and of course also culture. For a physicalist or ultra-modernist like him this blindness is constitutive. He will never understand the structure of his failure.

His dismissal of social issues as part of a theory serves eo ipso as his justification of the whole methodology. This is only possible due to another, albeit consistent, mistake, the conflation of theory and models. Hillier shows us over and over again only models, yet not even a small contribution to an architectural theory. Applying statistics expresses a particular theoretical stance, but it must not be taken as a theory itself! Statistics instantiates those models, that is, his "architectural theory" largely follows statistical theory. We have repeatedly pointed to the problems that appear if we apply statistics to raw observations.

The high self-esteem Hillier expresses in his nevertheless quite limited writings is topped by treating space as syntax, in other words as a trivial machine. Undeniably, human beings have a material body, and buildings take space as material arrangements. Undeniably, matter arranges space and constitutes space. There is considerable discussion in philosophy about how we could approach the problematic field of space. We won't go into details here, but Hillier simply drops the whole issue.

Matter arranges in space. This quickly becomes a non-trivial insight if we change perspective from abstract matter and the correlated claim of the possibility of reductionism to spatio-temporal processes, where the relations are taken as the starting point. We directly enter the domain of self-organization.

By means of "Space Syntax" Hillier claimed to provide a tool for planning districts of a city, or certain urban environments. If he restricted his proposals to certain aspects of the anonymized flow of people and vehicles, it would be acceptable as a method. But it is certainly not a proper tool to describe the quality of urban environments, or even to plan them.

Recently, he delivered a keynote speech [12] in which he apparently departed from his former Space Syntax approach, which reaches back to 1984. There he starts with the following remark.

On the face of it, cities as complex systems are made of (at least) two sub-systems: a physical sub-system, made up of buildings linked by streets, roads and infrastructure; and a human sub-system made up of movement, interaction and activity. As such, cities can be thought of as socio-technical systems. Any reasonable theory of urban complexity would need to link the social and technical sub-systems to each other.

This clearly is much less reductionist, at first sight at least, than "Space Syntax". Yet, Hillier remains aligned to hard-core positivism. Firstly, in the whole speech he fails to provide a useful operationalization of complexity. Secondly, his Space Syntax simply appears wrapped in new paper. Agency for him is still just spatial agency; the relevant urban network for him is just the network of streets. Thirdly, it is sheer nonsense to separate a physical and a human subsystem, and then to claim that lumping the two together yields a socio-technical system. He obviously is unaware of more advanced and much more appropriate ways of thinking about culture, such as ANT, the Actor-Network-Theory (Bruno Latour), which precisely drops the categorical separation of the physical and the human. This separation was first criticized by Merleau-Ponty in the 1940s!

Hillier served us just as an example, but you may have got the point. Occasionally, one can meet attempts that at least try to integrate a more appropriate concept of culture and of the human being in urban environments. Think about Koolhaas and his AMO/OMA, for instance, despite the fact that Koolhaas himself also struggles with the modernist mindset (see our introductions to "JunkSpace" and "The Generic City"). Yet, he at least recognized that something is fundamentally problematic with it.

7. The Toolbox Perspective

Most of the interesting and relevant systems are complex. It is simply a methodological fault to use frequencies of observational elements to describe such systems, whether we are dealing with animals, texts, urban environments or people (dogs, cats) moving around in urban environments.

Tools provide filters; they respond to certain issues, both of the signal and of the embedding. Tools are artifacts for transformation. As such they establish the relationality between actors, things and processes. Tools produce and establish Heidegger's "Gestell", just as they constitute the world as a fabric of relations, of facts and acts, as Wittgenstein emphasized so often and already at the beginning of the Tractatus.

What we would like to propose here is a more playful attitude towards the usage of tools, including formal methods. By "playful" we refer to Wittgenstein's rule following, but also to a certain kind of experimentation, not induced by theory, but rather triggered by the know-how of some techniques that are going to be arranged. Tools as techniques, or techniques as tools, are used to distil symbols from the available signals. Their relevance is determined only by the subsequent step of classification, which in turn is (ortho-)regulated by strategic goals or cultural habits. Never, however, should we take a particular method as representative of the means to access meaning from a process, be it a text or an urban environment.

8. Behavior

In this concluding chapter we try to provide more details about our move to apply the concept of behavior to urbanism and computational linguistics.

Text

Since Friedrich Schleiermacher in the 1830s, hermeneutics has been emphasizing a certain kind of autonomy of the text. Of course, the text itself is not a living thing in the sense that we consider animals to be. Before it "awakes" it has to be entered into mind matter, or more generally, it has to be interpreted. Nevertheless, an autonomy of the text remains, largely due to the fact that there is no Private Language. The language is not owned by the interpreting mind. Vilem Flusser proposed to radically turn the perspective around and to conceive of the interpreter as a medium for texts and other "information", rather than the other way round.

Additionally, the working of the brain is complex, to say the least. Our relation to our own brain and our own mind is more that of an observer than that of a user or even a controller. We experience them. Both together, the externality of language and the (partial) autonomy of the brain-mind, lead to an arrangement in which the text becomes autonomous. It inherits complementary parts of independence from both parts of the world, from the internal and the external.

Furthermore, human languages are unlimited in their productivity. They are not only unlimited, they are also extensible. This pairs with their already mentioned deep structure, not only concerning the grammatical structure. Using language, or better, mastering language, means to play with the inevitable inner contradictions that appear across the various layers, levels, aspects and processes of applied language. Within practiced language, there are many time horizons, instantiated by structural and semantic pointers. These aspects render the original series of symbols into an associative network of active components, which contributes further to the autonomy of texts. Roland Barthes notes (in [19]) that

The Plural of the Text depends … not on the ambiguity of its contents but on what might be called the stereographic plurality of its weave of signifiers (etymologically, the text is a tissue, a woven fabric). The reader of the Text may be compared to someone at a loose end.

Barthes implicitly emphasizes that the text does not convey a meaning; the meaning is not in the text, it can't be conceived as something externalizable. In this essay he also holds that a text can't be taken as just a single object. It is a text only in the context of other texts, and so the meaning that it develops upon interpretation is also dependent on the corpus into which it is embedded.

Methodologically, this (again) highlights the problematics that Alan Hájek called the reference class problem [13]. It is impossible for an interpreter to develop the meaning of a text outside of a previously chosen corpus. This dependency is inherited by any phrase, any sentence and any word within the text. Even a label like "IBM", which seems to be bijectively unique regarding the mapping of the grapheme to its implied meaning, is dependent on that. Of course, it will always refer somehow to the company. Yet, without the larger context it is not clear in any sense to which aspect of that company and its history the label refers in a particular case. In literary theory this is called intertextuality. Furthermore, it is almost palpable in this example that signs refer only to signs (the cornerstone of Peircean semiotics), and that concepts are nothing that could be defined (as we argued earlier in more detail).

We may settle here that a text, as well as any part of it, is established only through the selection of the embedding corpus, or likewise, a social practice, a life-form. Without such an embedding the text simply does not exist as a text. We would just find a series of graphemes. It is a hopeless exaggeration, if not self-deception, to call the statistical treatment of texts "text mining". Read in another way, it may even be considered a cynical term.

It is this dependence on local and global contexts, synchronically and diachronically, that renders the interpretation of a text similar to the interpretation of animal behavior.

Taken together, conceiving of texts as behaving systems is probably less of a metaphor than it appears at first sight. Considering the way we make sense of a text, approaching a text is in many ways comparable to approaching an animal of a familiar species. We won't know exactly what is going to happen; the course of events and actions depends significantly on ourselves. The categories and ascribed properties necessary to establish an interaction are quite undefined in the beginning, and available only as types of rules, not as readily parameterized rules themselves. And as with animals, the next approach will never be a simple repetition of the former one, even if one knows the text quite well.

From the methodological perspective the significance of such a "behavioral turn"3 can't be overestimated. For instance, nobody would interpret an animal on the basis of a rather short series of photographs, and keep the conclusions drawn from it once and for all. Interacting with a text as if it behaved demands a completely different set of procedures. After all, one would be dealing with an open interaction. Such openness must be met with an appropriate willingness for open structural learning. This holds not only for human interpreters, but for any interpreter, even if it were software. In other words, software dealing with text must itself be active in a non-analytical manner in order to constitute what we call a "text". Any kind of algorithm (in the definition of Knuth) does not deal with text, but just and blindly with a series of dead graphemes.

The Urban

For completely different material reasons, cities can also be considered as autonomous entities. Their patterns of growth and differentiation look much more like those of ensembles of biological entities than those of minerals. Of course, this doesn't justify the more or less naïve assignment of the "city as organism". Urban arrangements are complex in the sense we've defined it; they are semiogenic and associative. There is a continuous contest between structure as regulation and automation on the one side and liquification as participation and symbolization on the other, albeit symbols may play for both parties.

Despite this autonomy, it remains a fact that without human activity cities are as little alive as texts are. This raises the particular question of the relationships between a city and its inhabitants, between the people as citizens and the city that they constitute. This topic has been the subject of innumerable essays, novels, and investigations. Recently, a fresh perspective on it has been opened by Vera Bühlmann's notion of the "Quantum City" [14].

We can neither detach the citizens from their city, nor vice versa. Nevertheless, the standardized and externalized collective contribution across space and time creates an arrangement that produces dissipative flows and shows a strong meta-stability that transcends the activities of the individuals. This stability should not be mistaken for a "state", though. As for any other complex system, including texts, we should avoid trying to assign a "state" to a particular city, or even to a part of it. Everything is a process within a complex system, even if it appears to be rather stable. Yet, this stability depends on the perspective of the observer. In turn, the seeming stability does not mean that a city-process could not be destroyed by human activity, be it by individuals (Nero), by a collective, or by socio-economic processes. Yet, again as in the case of complex systems, the question of causality would be the wrong starting point for addressing the issue of change, just as a statistical description would be.

Cities and urban environments are fabrics of relations between a wide range of heterogeneous and heterotopic (see Foucault or David Shane [15]) entities and processes across a likewise large range of temporal scales, meeting any shade between the material and the immaterial. There is the activity of single individuals, of collectives of individuals, of legislative and other norms, the materiality of the buildings and their changing usage and roles, different kinds of flows and streams as well as stores and memories.

Elsewhere we argued that this fabric may be conceived as a dynamic ensemble of associative networks [16]. Those should be clearly distinguished from logistic networks, whose purpose is given by organizing any kind of physical transfer. Associative networks re-arrange, sort, classify and learn. Thus, they are also the abstract location of the transposition of the material into the immaterial. Quite naturally, issues of form and their temporal structure arise, in other words, behavior.

Our suggestion thus is to conceive of a city as an entity that behaves. This proposal has (almost) nothing to do with the metaphor of the "city as organism", a transfer that is by far too naïve. Changes in urban environments are best conceived as "outcomes" of probabilistic processes that are organized as overlapping series, both contingent and consistent. The method of choice to describe those changes is based on the notion of the generalized context.

Urban Text, Text and Urbanity, Textuality and Performance

Urban environments establish or even produce a particular kind of mediality. We need not invoke the recent surge of large screens in many cities for that. Any arrangement of facades encodes a rich semantics that is best described employing a semiotic perspective, just as Venturi proposed it. Recently, we investigated the relationship between facades, whether made from stone or from screens, and the space that they constitute [17].

There is yet another important dimension between the text and the city. For many hundreds of years now, if not millennia, cities have not been imaginable without text in one form or another. At the latest since the early 19th century, text and city have become deeply linked to one another through the rise of newspapers and publishing houses, but also through the intricate linkage between the city and the theater. Urban culture is text culture, far more than it could be conceived as an image culture. This tendency is only intensified through the web, albeit urbanity now gets significantly transformed by and into the web-based aspects of culture. At least we may propose that there is a strong co-evolution between the urban (as entity and as concept) and mediality, whether it expresses itself as text, as movie or as webbing.

The relationship between the urban and the text has been explored many times. It probably started with Walter Benjamin's "flâneur" (for an overview see [18]). Nowadays, urbanists often refer to the concept of the "readability" of a city layout, a methodological habit originated by Kevin Lynch. Yet, if we consider the relation between the urban and the textual, we certainly have to adopt an abstract concept of text; we definitely have to avoid the idea that there are items like characters or words out there in the city. I think we should at least follow something like the abstract notion of textuality, as it has been devised by Roland Barthes in his "From Work to Text" [19] as a "methodological field". Yet, this probably is still not abstract enough, as urban geographers like Henri Lefebvre mistook the concept of textuality for one of intelligibility [20]. Lefebvre obviously didn't understand the working of a text. How should he, one might say, as a modernist (and marxist) geographer. All the criticism that was directed against the junction between the urban and textuality conceived—as far as we know—text as something object-like, something that is out there as such, awaiting passively to be read and remaining passive as it is being read, finally maybe even as an objective representation beyond the need for (and the freedom of) interpretation. This, of course, represents a rather limited view on textuality.

Above we introduced the concept of "behaving texts", that is, texts as active entities. These entities become active as soon as they are mediatized with interpreters. Again: not the text is conceived as the medium or in a media-format, but rather the interpreter, whether it is a human brain-mind or a suitable software that indeed is capable of interpreting, not just of pre-programmed and blind re-coding. This "behavioral turn" renders "reading" a text, but also "writing" it, into a performance. Performances, on the other hand, always and inevitably comprise a considerable openness, precisely because they let the immaterial and the material collide from the side of the immaterial. Thus, performances are the counterpart of abstract associativity, yet also settling at the surface that sheds matter from ideas.

In the introduction to their nicely edited book "Performance and the City", Kim Solga, D. Hopkins and Shelley Orr [18] write, citing the urban geographer Nigel Thrift:

Although de Certeau conceives of ‘walking in the city’ not just as a textual experience but as a ‘series’ of embodied, creative’ practices’ (Lavery: 152), a ‘spatial acting-out of place’ (de Certeau: 98, our emphasis), Thrift argues that de Certeau: “never really leaves behind the operations of reading and speech and the sometimes explicit, sometimes implicit claim that these operations can be extended to other practices. In turn, this claim [ … ] sets up another obvious tension, between a practice-based model of often illicit ‘behaviour’ founded on enunciative speech-acts and a text-based model of ‘representation’ which fuels functional social systems.” (Thrift 2004: 43)

Quite obviously, Thrift didn't manage to get the right grip on de Certeau's proposal that textual experience may be conceived—I just repeat it—as a series of embodied, creative practices. It is his own particular blindness that lets Thrift denounce texts as being mostly representational.

Solga and colleagues indeed emphasize the importance of performance, not just in their introduction, but also through their editing of the book. Yet, they explicitly link textuality and performance as codependent cultural practices. They write:

While we challenge the notion that the city is a ‘text’ to be read and (re)written, we also argue that textuality and performativity must be understood as linked cultural practices that work together to shape the body of phenomenal, intellectual, psychic, and social encounters that frame a subject’s experience of the city. We suggest that the conflict, collision, and contestation between texts and acts provoke embodied struggles that lead to change and renewal over time. (p.6)

Thus, we find a justification for our "behavioral turn" and its application to texts as well as to the urban from a rather different corner. Even more significantly, Solga et al. seem to agree that performativity and textuality cannot be detached from the urban at all. Apparently, the urban as a particular quality of human culture develops more and more into the main representative of human culture.

Yet, neither text nor performance, nor their combination, counts as a full account of the mediality of the urban. As we already indicated above, the movie, as a kind of cross-medium between text, image, and performance, is equally important.

The relations between film and the urban, between architecture and film, are also quite widespread. The cinema, somehow the successor of the theatre, could be situated only within the city. From the opposite direction, many would consider a city without cinemas as somehow incomplete. The co-evolutionary story between the two is still under vivid development, I think.

There is one architect/urbanist in particular who is able to blend film and building into each other. You may know him quite well; I refer to Rem Koolhaas. Everybody knows that he was an experimental moviemaker in his youth. It is much less known that he deliberately organized at least one of his buildings as a kind of movie: the Embassy of the Netherlands in Berlin (cf. [21]).

Here, Koolhaas arranged the rooms along a dedicated script. Some of the views out of the window he even trademarked to protect them!

Figure 1: Rem Koolhaas, Dutch Embassy, Berlin. The figure shows the script of pathways as a collage (taken from [21]).

9. The Behavioral Turn

So far we have shown how the behavioral turn could be supported, and what some of the first methodological consequences are if we embrace it. Yet, the picture developed so far is not complete, of course.

If we accept the almost trivial concept that autonomous entities are best conceived as behaving entities—remember that autonomy implies complexity—then we can further ask about the structure of the relationship between the behaving subject and its counterpart, whether this is also a behaving subject or whether it is conceived more like a passive object. For Bruno Latour, for instance, both together form a network, thereby blurring the categorical distinction between the two.

Most descriptions of the process of getting into touch with something are nowadays dominated by the algorithmic perspective of computer software. Even designers started to speak about interfaces. The German term for the same thing—"Schnittstelle"—is even more pronounced and clearly depicts the modernist prejudice in dealing with interaction. "Schnittstelle" implies that something, here the relation, is cut into two parts. A complete separation between interacting entities is assumed a priori. Such a separation is deeply inappropriate, since it would work only in strictly standardized environments, up to being programmed algorithmically. Precisely this has been told to us over and over again by designers of software "user interfaces". Perhaps here we can find the reason for so many bad designs, not only concerning software. Fortunately, though just through a slow evolutionary process, things improve more and more. So-called "user-centric" design, or "experience-oriented" design, has become more abundant in recent years, but its conceptual foundation is still rather weak, or a wild mixture of fashionable habits and strange adaptations of cognitive science.

Yet, if we take the primacy of interpretation seriously, and combine it with the "behavioral turn", we can see a much more detailed structure than just two parts cut apart.

The consequence of such a combination is that we would drop the idea of a clear-cut surface even for passive objects. Rather, we could conceive of objects as being wrapped in a surrounding field that becomes stronger the closer we approach the object. By means of that field we distinguish the "pure" physicality from the semiotically and behaviorally active aspects.

This field is a simple one for stone-like matter, but even there it is still present. The field becomes much richer, deeper and more vibrant if the entity is not a more or less passive object, but rather an active and autonomous subject, such as an animal, a text, or a city. The reason is that there are no a priori and globally definable representative criteria that we could use to approach such autonomous entities. We can only know about more or less suitable procedures for deriving such criteria in the particular case, approaching a particular individual {text, city}. The missing of such criteria is a direct correlate of their semantic productivity, or, likewise, of their unboundedness.

Approaching a semantically productive entity—such entities are also always able to induce new signs, they are semiosic entities—is reminiscent of approaching a gravitational field. Yet it is also very different from a gravitational field, since our semio-behavioral field shows increasing structural richness the closer the entities approach each other. It is quite obvious that only by means of such a semio-behavioral field can we close the gap between the subject and the world that has been opened, or at least deepened, by the modernist contributions from the times of Descartes until late computer science. Only upon a concept like the semio-behavioral field, which in turn is a consequence of the behavioral turn, can we overcome the existential fallacy as it has been purported and renewed over and over again by the dual pair of the material and the immaterial. The language game that separates the material and the immaterial inevitably leads into the nonsensical abyss of existentialism. Dual concepts always come with tremendous costs, as they prevent any differentiated way of speaking about the matter. For instance, they prevent us from recognizing the materiality of symbols, or more precisely, the double-articulation of symbols between the more material and the more immaterial aspects of the world.

The following series of images may be taken as a metaphorical illustration of that semio-behavioral field. We call it the zona extima of the behavioral coating of entities.

Figure 2a: The semio-behavioral field around an entity.

Figure 2b: The situation as another entity approaches perceptively.

Figure 2c: Mutual penetration of semio-behavioral fields.

Taken together we may say that whenever {sb, sth} gets into contact with {sb, sth}, this happens through the behavioral coating. This zone of contact is not intimate (as Peter Sloterdijk describes it), it is rather extimate, though there is a smooth and graded change of quality from extimacy to intimacy as the distance decreases. The zona extima is a borderless (topological) field, driven by purposes (due to modeling); it is medial, behaviorally choreographed as negotiation, exposure, call & request.

The concept of extimation, or also the process of extimating, is much more suitable than "interaction" to describe what's going on when we act, behave, engage, actively perceive, or encounter the other. The interesting thing with web-based media is that some aspects of the zona extima can be transferred.

10. Conclusion

In this essay we have tried to argue in favor of a behavioral turn as a general attitude when it comes to conceiving the interaction of any two entities. The behavioral turn is a consequence of three major and interrelated assumptions:

  • – primacy of interpretation in the relation to the world;
  • – primacy of process and relation against matter and point;
  • – complexity and associativity in strongly mediatized environments.

All three assumptions are strictly outside of anything that phenomenological, positivist or modernist approaches can talk about or even practice.

It particularly allows us to overcome the traditional and strict separation between the material and the immaterial, as well as the separation between the active and the passive. These shifts can't be overestimated; they have far-reaching consequences for the way we practice and conceive our world.

The behavioral turn is the consequence of a particular attitude that respects the bi-valency of the world as a dynamic system of populations of relations. It is less the divide between the material and the immaterial, which anyway is somewhat of an illusion deriving from the metaphysical claim of the possibility of essences. For instance, the jump that occurs between the realms of the informational and the causal establishes itself as a pair of two complementary but strictly and mutually exclusive modes of speaking about the orderliness in the world. In some way, it is also the orderliness in the behavior of the observer—as repetition—that creates the informational that the observer then may perceive. The separation is thus a highly artificial one, in either direction. It is simply silly to discuss the issue of causality without referring to the informational aspects (for a full discussion of the issue see this essay). In any real-world case we always find both aspects together, and we find them as behavior.

Actually, the bi-valent aspect that I mentioned before refers to something quite different, in fact so different that we can't even speak properly about it. It refers to those aspects that are a priori to modeling or any other comprehension, that are even outside the performance of the individual itself. What I mean is the resistance of existential arrangements, including the body that the comprehending entity is partially built from. This existential resistance introduces something like an outer space for the cultural sphere. Needless to say, we can exist only within this cultural sphere. Yet, any action upon the world forces us to take a short trip into the vacuum, and if we are lucky the re-entrance is even productive. We may well expect an intensification of the aspect of the virtual, as we argued here. Far from being suitable to serve as a primacy (as existentialism misunderstood the issue), the existential resistance, the absolute outside, forces us to embark on the concept of behavior. Only "behavior" as a perceptional and performative attitude allows us to extract coherence from the world without neglecting the fact of that resistance or contumacy.

The behavioral turn also triggers a change in the methodology for empirical investigations. The standard set of methods for empirical descriptions changes, taking the relation and the coherent series always as the starting point, best in its probabilized form, that is, as generalized probabilistic context. This also prevents the application of statistical methods directly to raw data. There should always be some kind of grouping or selection preceding the statistical reasoning. Otherwise we would try to follow the route that Wittgenstein blocked as a "wrong usage of symbols" (in his rejection of the reasonability of Russell/Whitehead's Principia Mathematica). The concept of abstract behavior, including the advanced methodology that avoids starting with representational symbolification, is clearly a sound way out of this deep problem from which any positivist empirical investigation suffers.
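A small, purely illustrative sketch of why grouping has to precede the statistics (it is not taken from the essay; the numbers are invented and only numpy is assumed): applying a statistic to pooled raw data can even reverse the relation that holds within every single context, a pattern known as Simpson's paradox.

```python
import numpy as np

# Three hypothetical contexts ("groups"); within each one the relation
# between x and y is negative, but the groups are shifted against each other.
rng = np.random.default_rng(2)
groups = []
for offset in (0.0, 5.0, 10.0):
    x = rng.uniform(0, 4, 60) + offset
    y = -1.0 * (x - offset) + 2.0 * offset + rng.normal(0, 0.5, 60)
    groups.append((x, y))

x_all = np.concatenate([g[0] for g in groups])
y_all = np.concatenate([g[1] for g in groups])

# Statistics on the raw, ungrouped data suggest a positive relation ...
print("pooled correlation:   ", round(np.corrcoef(x_all, y_all)[0, 1], 2))

# ... while within each context the relation is clearly negative.
for i, (x, y) in enumerate(groups):
    print(f"within-group {i} corr.:", round(np.corrcoef(x, y)[0, 1], 2))
```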

Interaction, including any action upon some other entity, when understood within the paradigm of behavior, becomes a recurrent, though not repetitive, self-adjusting process. During this process means and symbols may change and be replaced all the way down until a successful handshake. There is no objectivity in this process other than the mutual possibility for anticipation. Despite the existential resistance and contumacy that is attached to any re-shaping of the world, and even more so if we accomplish it by means of tools, this anticipation is, of course, greatly improved by alignment to cultural standards, contributing to the life-world as a shared space of immanence.

This finally provides us with a sufficiently abstract, but also a sufficiently rich and manifold perspective on the issue of the roles of symbols regarding the text, the urban and the anime, the animal-like. None of those could be comprehended without first creating a catalog or a system of symbols. These symbols, both material and immaterial and thus a kind of hinge, a double-articulation, are rooted both in the embedding culture (as a de-empirifying selective force) and in the individual, which constitutes another double-articulation. The concept of abstract behavior, given as a set of particular conditions and attitudes, allows us to respond appropriately to the symbolic.

The really big question concerning our choreostemic capabilities—and those of the alleged machinic—therefore is: How to achieve fluency in dealing with the symbolic without presuming it as a primary entity? Probably by exercising observing. I hope that the suggestions expressed so far in these essays provide some robust starting points. …we will see.

Notes

1. Here we simply cite the term "information retrieval"; we certainly do not agree that the term is a reasonable one, since it is deeply infected by positivist prejudices. "Information" can't be retrieved, because it is not "out there". Downloading a digitally encoded text is neither a hunting nor a gathering of information, because information can't be considered an object. Information is only present during the act of interpretation (more details about the status of information can be found here). Actually, what we are doing is simply "informationing".

2. The notion of a "behavioral turn" is known from geography since the late 1960s [22][23], and also from economics. In both fields, however, the behavioral aspect is related to the individual human being, and any level of abstraction with regard to the concept of behavior is missing. Quite in contrast to those movements, we do not focus on the neglect of the behavioral domain when it comes to human society, but rather on the transfer of the abstract notion of behavior to non-living entities.

Another reference to "behavioral sciences" can be found in the social sciences. Yet, in the social sciences "behavioral" is often reduced to "behaviorist", which of course is nonsense. A similar misunderstanding is abundant in political science.

3. Note that the proposed "behavioral turn" should not be mistaken for a "behavioristic" move, as a sort of behaviorism. We strictly reject the stimulus-response scheme of behaviorism. Actually, behaviorism as it has been developed by Watson and Pavlov has only little to do with behavior at all. It is nothing other than an overt reductionist program, rendering any living being into a trivial machine. Unfortunately, the primitive scheme of behaviorism is experiencing a kind of comeback in so-called "Behavioral Design", where people talk about "triggers" much in the same way as Pavlov did (cf. BJ Fogg's Behavior Model).

References

  • [1] Michael Epperson (2009). Quantum Mechanics and Relational Realism: Logical Causality and Wave Function Collapse. Process Studies, 38(2): 339-366.
  • [2] G. Moran, J.C. Fentress (1979). A Search for Order in Wolf Social Behavior. pp.245-283. in: E. Klinghammer (ed.), The Behavior and Ecology of Wolves. Symp. held on 23-24.5.1975 in Wilmington N.C., Garland STPM Press, New York.
  • [3] Gilles Deleuze, Difference and Repetition.
  • [4] J.A.R.A.M. Van Hooff (1982). Categories and sequences of behaviour: methods of description and analysis. in: Handbook of methods in nonverbal behavior research (K.R. Scherer & P. Ekman, eds.). Cambridge University Press, Cambridge.
  • [5] P.G.M. van der Heijden, H. de Vries, J.A.R.A.M. van Hooff (1990). Correspondence analysis of transition matrices, with special attention to missing entries and asymmetry. Anim.Behav. 40: 49-64.
  • [6] Teuvo Kohonen, Samuel Kaski, K. Lagus and J. Honkela (1996). Very Large Two-Level SOM for the Browsing of Newsgroups. In: C. von der Malsburg, W. von Seelen, J. C. Vorbrüggen and B. Sendhoff, Proceedings of ICANN96, International Conference on Artificial Neural Networks, Bochum, Germany, July 16-19, 1996, Lecture Notes in Computer Science, Vol. 1112, pp.269-274. Springer, Berlin.
  • [7] Hecht-Nielsen (1994).
  • [8] Javier Rojo, Tuan S. Nguyen (2010). Improving the Johnson-Lindenstrauss Lemma. available online.
  • [9] Sanjoy Dasgupta, Presentation given about: Samuel Kaski (1998), Dimensionality Reduction by Random Mapping: Fast Similarity Computation for Clustering, Helsinki University of Technology 1998. available online.
  • [10] Michel Serres, Nayla Farouki. Le trésor. Dictionnaire des sciences. Flammarion, Paris 1998. p.394.
  • [11] Bill Hillier, Space Syntax. E-edition, 2005.
  • [12] Bill Hillier (2009). The City as a Socio-technical System: a spatial reformulation in the light of the levels problem and the parallel problem. Keynote paper to the Conference on Spatial Information Theory, September 2009.
  • [13] Alan Hájek (2007). The Reference Class Problem is Your Problem Too. Synthese 156 (3):563-585.
  • [14] Vera Bühlmann (2012). In the Quantum City – design, and the polynomial grammaticality of artifacts. forthcoming.
  • [15] David G. Shane. Recombinant Urbanism. 2005.
  • [16] Klaus Wassermann (2010). SOMcity: Networks, Probability, the City, and its Context. eCAADe 2010, Zürich. September 15-18, 2010. available online.
  • [17] Klaus Wassermann, Vera Bühlmann, Streaming Spaces – A short expedition into the space of media-active façades. in: Christoph Kronhagel (ed.), Mediatecture, Springer, Wien 2010. pp.334-345. available here.
  • [18] D.J. Hopkins, Shelley Orr and Kim Solga (eds.), Performance and the City. Palgrave Macmillan, Basingstoke 2009.
  • [19] Roland Barthes, From Work to Text. in: Image, Music, Text. Essays selected and translated by Stephen Heath. Hill & Wang, New York 1977. also available online @ Google Books, p.56.
  • [20] Henri Lefebvre, The Production of Space. 1979.
  • [21] Vera Bühlmann. Inhabiting media. Thesis, University of Basel (CH) 2009.
  • [22] Kevin R Cox, Jennifer Wolch and Julian Wolpert (2008). Classics in human geography revisited. “Wolpert, J. 1970: Departures from the usual environment in locational analysis. Annals of the Association of American Geographers 50, 220–29.” Progress in Human Geography (2008) pp.1–5.
  • [23] Dennis Grammenos. Urban Geography. Encyclopedia of Geography. 2010. SAGE Publications. 1 Oct. 2010. available online.

۞

Junkspace, extracted.

July 16, 2012 § Leave a comment

Some years after "The Generic City", Koolhaas published a further essay on the problematic field of identity: "Junkspace" (JS).[1] I think it is a good idea to introduce both of them and to relate them to each other before discussing the issues of this field ourselves.

Unlike "The Generic City" (TGC), which was constructed as a kind of report about a film script, JS is more like a "documentary manifesto", certainly provocative (for thought?), but also not a theory. "Junkspace" throws a concept in/out, according to its message, one could say. As in TGC, Koolhaas tries to densify and to enhance contrasts in order to render the invisible visible. Its language thus should not be misunderstood as "apocalyptic" or the like, or as a reference to actual "facts". We must also consider that even documentations are inevitably equipped with theories and models, intentions and expectations. The biggest difference between the two essays is probably the fact that in JS Koolhaas does not try to keep his distance through the formal construction of the writing. Hence, it may be legitimate to read this essay indeed as a kind of seriously taken diagnosis.

In many ways, JS reads as a critique of modernism and of post-modernism, not just as attitudes in architecture, but concerning the whole culture, ending in a state where the "cosmetic is the new cosmic." Albeit the critique is not made (too) explicit, trying to avoid explicit value statements, the tone of JS appears negative. Yet, it does so only upon the reader's interpretation. "Junkspace is a low-grade purgatory." In Christian mythology, everybody had to pass it, the good ones and the evil ones, except perhaps the bravest saints. Failure is expressed, but by referring to a certain otherworldliness: "We do not leave pyramids."

The style of JS is itself ambiguous, presumably intentionally so. On the one hand, it is reminiscent of mathematical, formal series of sentences. Sections often start with existential proposals: "Junkspace is …". Together, as a series, or a hive, these imply unspoken axioms. On the other hand it seems as if Koolhaas hesitates to use the figure of logic, or accordingly of cause and effect, with regard to Junkspace itself. Thus, Koolhaas performatively exhibits a clear-cut non-modern, or should we say "meta-modern", attitude. By no means should this be taken as some kind of irrationality, though. We just find lines of historical developments, often even only historizing contrasts. This formal structure is anything but a self-righteous rhetorical game; it is more like a necessary means to maintain some distance from modernism. The style of JS could be considered as (empty) rhetoric only from within a modernist attitude.

Before we deal further with modernism (below, and more extensively here), I first want to list my selection of core passages. The sections in Koolhaas’ text are neither enumerated nor divided by headlines (no hierarchies! many “…”! a Junkspace…), so I provide the page numbers in order to facilitate reference. Additionally, I enumerated the pieces for referencing them from within our own writing.

Here is the extract from Junkspace; it is of course hard to make such a selection—even if we allow for a total of 59 passages—, as JS is rather densely written. Koolhaas begins with some definitions before turning to its properties, readings and implications:

Précis of “Junkspace”

(p.175)

1. “Identity” is the new junk food for the dispossessed, globalization’s fodder for the disenfranchised … […] Junk-Space is the residue mankind leaves on the planet. The built […] product of modernization is not modern architecture but Junkspace. Junkspace is what remains after modernization has run its course, or, more precisely, what coagulates while modernization is in progress, its fallout. Modernization had a rational program: to share the blessings of science, universally. Junkspace is its apotheosis, or meltdown.

2. Junkspace is the sum total of our current achievement;

3. It was a mistake to invent modern architecture for the twentieth century. Architecture disappeared in the twentieth century; we have been reading a footnote under a microscope hoping it would turn into a novel;

4. […] our concern for the masses has blinded us to People’s Architecture. Junkspace seems an aberration, but it is the essence, the main thing … the product of an encounter between escalator and air-conditioning.

5. Continuity is the essence of Junkspace.

(p.176)

6. Junkspace is sealed, held together not by structure but by skin, like a bubble.

7. Junkspace is a Bermuda Triangle of concepts, an abandoned petri dish: it cancels distinctions, undermines resolve, confuses intention with realization. It replaces hierarchy with accumulation, composition with addition. […] A fuzzy empire of blur, it […] offer[s] a seamless patchwork of the permanently disjointed. […] Junkspace is additive, layered, and lightweight, not articulated in different parts but subdivided, […].

8. Junkspace’s iconography is 13 percent Roman, 8 percent Bauhaus and 7 percent Disney (neck and neck), 3 percent Art Nouveau, followed closely by Mayan.

(p.177)

9. Junkspace is beyond measure, beyond code … Because it cannot be grasped, Junkspace cannot be remembered. It is flamboyant yet unmemorable, like a screen saver;

10. Junkspace’s modules are dimensioned to carry brands;

11. Junkspace performs the same role as black holes in the universe: they are essences through which meaning disappears.

12. Junkspace is best enjoyed in a state of post-revolutionary gawking. Polarities have merged.

13. Modern architecture […] exposes what previous generations kept under wraps: structures emerge like springs from a mattress.

14. Junkspace thrives on design, but design dies in Junkspace […] Regurgitation is the new creativity.

15. Superstrings of graphics, […] LEDs, and video describe an authorless world beyond anyone’s claim, always unique, utterly unpredictable, yet intensely familiar.

(p.178)

16. Junkspace sheds architectures like a reptile sheds skins, is reborn every Monday morning.

17. Architects thought of Junkspace first and named it Megastructure, the final solution to transcend their huge impasse.

18. In Junkspace, the tables are turned: it is subsystem only, without superstructure, orphaned particles in search of a framework or pattern.

19. Each element performs its task in negotiated isolation.

20. Instead of development, it offers entropy.

21. Change has been divorced from the idea of improvement. There is no progress; like a crab on LSD, culture staggers endlessly sideways …

22. Everywhere in Junkspace there are seating arrangements, ranges of modular chairs, even couches, as if the experience Junkspace offers its consumers is significantly more exhausting than any previous spatial sensation;

(p.179)

23. Junkspace is fanatically maintained, the night shift undoing the damage of the day shift in an endless Sisyphean replay. As you recover from Junkspace, Junkspace recovers from you.

24. Traditionally, typology implies demarcation, the definition of a singular model that excludes other arrangements. Junkspace represents a reverse typology of cumulative, approximative identity, less about kind than about quantity. But formlessness is still form, the formless also a typology.

25. Junkspace can either be absolutely chaotic or frighteningly aseptic – like a best-seller – overdetermined and indeterminate at the same time.

26. Junkspace is often described as a space of flows, but that is a misnomer; flows depend on disciplined movement, bodies that cohere. Junkspace is a web without a spider; […] It is a space of collision, a container of atoms, busy, not dense …

(p.180)

27. Junkspace features the tyranny of the oblivious: sometimes an entire Junkspace comes unstuck through the nonconformity of one of its members; a single citizen of another culture – a refugee, a mother – can destabilize an entire Junkspace, […]

28. Flows in Junkspace lead to disaster: department stores at the beginning of sales; the stampedes triggered by warring compartments of soccer fans;

29. Traffic is Junkspace, from airspace to the subway; the entire highway system is Junkspace […]

30. Aging in Junkspace is nonexistent or catastrophic; sometimes an entire Junkspace—a department store, a nightclub, a bachelor pad – turns into a slum overnight without warning.

(p.181)

31. Corridors no longer simply link A to B, but have become “destinations.” Their tenant life tends to be short: the most stagnant windows, the most perfunctory dresses, the most implausible flowers. All perspective is gone, as in a rainforest (itself disappearing, they keep saying … ).

32. Trajectories are launched as ramp, turn horizontal without any warning, intersect, fold down, suddenly emerge on a vertiginous balcony above a large void. Fascism minus dictator.

(p.182)

33. There is zero loyalty—and zero tolerance—toward configuration, no “original” condition; architecture has turned into a time-lapse sequence to reveal a “permanent evolution.” … The only certainty is conversion-continuous-followed, in rare cases, by “restoration,” the process that claims ever new sections of history as extensions of Junkspace.

34. History corrupts, absolute history corrupts absolutely. Color and matter are eliminated from these bloodless grafts.

35. Sometimes not overload but its opposite, an absolute absence of detail, generates Junkspace. A voided condition of frightening sparseness, shocking proof that so much can be organized by so little.

36. The curse of public space: latent fascism safely smothered in signage, stools, sympathy … Junkspace is postexistential; it makes you uncertain where you are, obscures where you go, undoes where you were. Who do you think you are? Who do you want to be? (Note to architects: You thought that you could ignore Junkspace, visit it surreptitiously, treat it with condescending contempt or enjoy it vicariously … because you could not understand it, you’ve thrown away the keys … But now your own architecture is infected, has become equally smooth, all-inclusive, continuous, warped, busy, atrium-ridden …)

(p.183)

37. Restore, rearrange, reassemble, revamp, renovate, revise, recover, redesign, return-the Parthenon marbles-redo, respect, rent: verbs that start with re-produce Junkspace …

38. Junkspace will be our tomb.

39. Junkspace is political: It depends on the central removal of the critical faculty in the name of comfort and pleasure.

40. Not exactly “anything goes”; in fact, the secret of Junkspace is that it is both promiscuous and repressive: as the formless proliferates, the formal withers, and with it all rules, regulations, recourse …

41. Junkspace […] is the interior of Big Brother’s belly. It preempts people’s sensations. […] it blatantly proclaims how it wants to be read. Junkspace pretends to unite, but it actually splinters. It creates communities not out of shared interest or free association, but out of identical statistics and unavoidable demographics, an opportunistic weave of vested interests.

(p.184)

42. God is dead, the author is dead, history is dead, only the architect is left standing … an insulting evolutionary joke … A shortage of masters has not stopped a proliferation of masterpieces. “Masterpiece” has become a definitive sanction, a semantic space that saves the object from criticism, leaves its qualities unproven, its performance untested, its motives unquestioned.

43. Junkspace reduces what is urban to urbanity. Instead of public life, Public SpaceTM: what remains of the city once the unpredictable has been removed …

44. Inevitably, the death of God (and the author) has spawned orphaned space; Junkspace is authorless, yet surprisingly authoritarian … At the moment of its greatest emancipation, humankind is subjected to the most dictatorial scripts: […] The chosen theater of megalomania—the dictatorial—is no longer politics, but entertainment.

45. Why can’t we tolerate stronger sensations? Dissonance? Awkwardness? Genius? Anarchy? … Junkspace heals, or at least that is the assumption of many hospitals.

(p.185)

46. Often heroic in size, planned with the last adrenaline of modernism’s grand inspiration, we have made them (too) human;

47. Junkspace is space as vacation;

(p.186)

48. Junkspace features the office as the urban home, a meeting-boudoir. […] Espace becomes E-space.

49. Globalization turns language into Junkspace. […] Through the retrofitting of language, there are too few plausible words left; our most creative hypotheses will never be formulated, discoveries will remain unmade, concepts unlaunched, philosophies muffled, nuances miscarried … We inhabit sumptuous Potemkin suburbs of weasel terminologies. Aberrant linguistic ecologies sustain virtual subjects in their claim to legitimacy, help them survive … Language is no longer used to explore, define, express, or to confront but to fudge, blur, obfuscate, apologize, and comfort … it stakes claims, assigns victimhood, preempts debate, admits guilt, fosters consensus. […] a Satanic orchestration of the meaningless …

50. Intended for the interior, Junkspace can easily engulf a whole city.

(p.187)

51. Seemingly at the opposite end of Junkspace, the golf course is, in fact, its conceptual double: empty, serene, free of commercial debris. The relative evacuation of the golf course is achieved by the further charging of Junkspace. The methods of their design and realization are similar: erasure, tabula rasa, reconfiguration. Junkspace turns into biojunk; ecology turns into ecospace. Ecology and economy have bonded in Junkspace as ecolomy.

52. Junkspace can be airborne, bring malaria to Sussex;

(p.188)

53. Deprivation can be caused by overdose or shortage; both conditions happen in Junkspace (often at the same time). Minimum is the ultimate ornament, a self-righteous crime, the contemporary Baroque.

54. It does not signify beauty, but guilt.

55. Outside, in the real world, the “art planner” spreads Junkspace’s fundamental incoherence by assigning defunct mythologies to residual surfaces and plotting three-dimensional works in leftover emptiness. Scouting for authenticity, his or her touch seals the fate of what was real, taps it for incorporation in Junkspace.

56. The only legitimate discourse is loss; art replenishes Junkspace in direct proportion to its own morbidity.

(p.189)

57. […] maybe the origins of Junkspace go back to the kindergarten …

58. Will Junkspace invade the body? Through the vibes of the cell phone? Has it already? Through Botox injections? […] Is each of us a mini-construction site? […]

(p.190)

59. Is it [m: mankind] a repertoire of reconfiguration that facilitates the intromission of a new species into its self-made Junksphere? The cosmetic is the new cosmic… ◊

Modernism

JS is about the consequences of modernism for architecture and for urbanism. Koolhaas does not hesitate to make it explicit: modernization, modernism, ends in a “meltdown”. As an alternative he offers the “apotheosis”, a particular quality as the Golden Calf of modernization. Within the context of urban life and architectural activities, this outcome shows up as “Junkspace”. Its essence is emptiness, isolation, splintering, arbitrariness. Its “victory” is named by what it offers, entropy, and its essence is continuity. Probably it is meant as a kind of tertiary chaos that erases any condition for the possibility of discernibility, unfortunately as the final point attractor. We will see.

Koolhaas describes Junkspace as an unintended outcome of a global collective activity. Obviously, Koolhaas is struggling with that, or with the unintendedness of the effect, in other words with emergence and self-organization. Emergence and self-organization can be understood only within the wider context of complexity as we have outlined it previously (see this piece). The concept of complexity as we have constructed it is by no means anti-scientific in a fundamental sense. Yet, it is a severe challenge to scientism as it is practiced today, as our concept explicitly refers to a reflected conceptual embedding, something that is still excluded from natural science today. Anyway, complexity as an explicated concept must be considered a necessary part of architectural theory, if we take Koolhaas and writings such as “Junkspace” seriously. Without it, we could not make sense of the difference between standardization and homogenization, between uniqueness and singularity, between history and identity, between development and evolution, or between randomness and heterotopia.

Modernism and its effects are the not-so-hidden agenda of JS. We have to be clear about this concept, at least concerning its foundations (there is not enough space here to discuss or even just list its branches, which reach not only to Marcuse’s office in Frankfurt), if we want to understand neo-leftist interpretations of JS such as Jameson’s (“Future City” [2]), and the not-so-hidden irony expressed by the resonating label “Future Cities Lab”, which denotes the urbanism project of the Department of Architecture (one of the biggest in Europe) of the Swiss Federal Institute of Technology (ETHZ). It is also the name of a joint venture between the National University of Singapore (NUS) and ETHZ. Yes, they indeed call it a Lab(oratory), a place usually producing hives of “petri dishes”, either abandoned (see 7. above) or “containing” the city itself (see section 8.1. of “The Generic City”), and at the same time, partially in contradiction to its practices, an oratory of modernism. Perhaps. (More about that later.)

Here, at the latest, we have to address the question:
What is the problem with modernism?

This will be the topic of the next post.

References
  • [1] Rem Koolhaas (2002). Junkspace. October, Vol. 100, “Obsolescence”, pp. 175-190. MIT Press. available here
  • [2] Fredric Jameson, Future City, New Left Review NLR 21, May-June 2003, pp. 65-79. available here

۞

The Text Machine

July 10, 2012 § Leave a comment

What is the role of texts? How do we use them (as humans)?

How do we access them (as reading humans)? The answers to such questions seem to be pretty obvious. Almost everybody can read. Well, today. Noteworthy, reading itself, as a performance and regarding its use, changed dramatically at least two times in history: first, after the invention of the vocal alphabet in ancient Greece, and a second time after book printing became abundant during the 16th century. Maybe the issue around reading isn’t as simple as it seems in everyday life.

Beyond such accounts of historical issues and basic experiences, we have a lot of more theoretical results concerning texts. It begins with Friedrich Schleiermacher, who was the first to identify hermeneutics as a subject around 1830 and who formulated it in a way that has been considered more complete and powerful than the version proposed by Gadamer in the 1950s. It proceeds of course with Wittgenstein (language games, rule following), Austin (speech act theory) or Quine (criticizing empiricism). Philosophers like John Searle, Hilary Putnam and Robert Brandom then explicated and extended the work of these former heroes. And they have been accompanied by many others. If you wonder why linguistics is missing here, well, that is because linguistics does not provide theories about language. Today, the domain is largely caught by positivism and the corresponding analytic approach.

Here in this little piece we pose these questions in the context of certain relations between machines and texts. There are a lot of such relations, and even quite sophisticated or surprising ones. For instance, texts can be considered as a kind of machine. Yet, they bear a certain note of (virtual) agency as well, resulting in a considerable non-triviality of this machine aspect of texts. Here we will not deal with this perspective. Instead, we will just take a look at the possibilities and the respective practices to handle or to “treat” texts with machines. Or, if you prefer, the treating of texts by machines, insofar as a certain autonomy of machines could be considered necessary to deal with texts at all.

Today, we can find a fast growing community of computer programmers that are dealing with texts as a kind of unstructured information. One of the buzz-words is the so-called “semantic web”, another one is “sentiment analysis”. We won’t comment in any detail on those movements, because they are deeply flawed. The first one tries to formalize semantics and meaning apriori, trying to render the world into a trivial machine. We repeatedly criticized this and we agree herein with Douglas Hofstadter (see this discussion of his “Fluid Analogy”). The second tries to identify the sentiment of a text or a “tweet”, e.g. about a stock or an organization, on the basis of statistical measures about keywords and their utterly naive “n-grammed” versions, without paying any notice to the problem of “understanding”. Such nonsense would not be as widespread if programmers read only a few fundamental philosophical texts about language. In fact, they don’t, and thus they are condemned to revisit any of the underdeveloped positions that arose centuries ago.

If we neglect the social role of texts for a moment, we might identify a single major role of texts, albeit we have to describe it then in rather general terms. We may say that the role of a text, as a specimen of many other texts from a large population, is its functioning as a medium for the externalization of mental content in order to serve the ultimate purpose, which consists of the possibility for a (re)construction of resembling mental content on the side of the interpreting person.

This interpretation has primacy. It is not possible to assign meaning to a text like a sticky note, then putting the text, including the yellow sticky note, directly into the recipient’s brain. That may sound silly, but unfortunately it’s the “theory” followed by many people working in the computer sciences. Interpretation can’t be controlled completely, though, not even by the mind performing it, not even by the same mind that seconds before externalized the text through writing or speaking.

Now, the notion of mental content may seem both quite vague and hopelessly general. Yet, in the previous chapter we introduced a structure, the choreostemic space, which allows us to speak fairly precisely about mental content. Note that we don’t need to talk about semantics, meaning or references to “objects” here. Mental content is not a “state” either. Thinking “state” and the mental together is much on the same level as seriously considering the existence of sea monsters at the end of the 18th century, when the list science of Linnaeus had not yet been reshaped by the upcoming historical turn in the philosophy of nature. Nowadays we must consider it silly-minded to think about a complex story like the brain and its mind by means of “state”. Doing so, one confounds the stability of the graphical representation of a word in a language with the complexity of a multi-layered dynamic process, spanned between deliberate randomness, self-organized rhythmicity and temporary, thus preliminary, meta-stability.

The notion of mental content does not refer to the representation of referenced “objects”. We do not have maps, lists or libraries in our heads. Everything which we experience as inner life builds up from an enormous randomness through deep stacks of complex emergent processes, where each emergent level is also shaped from the top down, implicitly and, except for the last one usually called “consciousness”, also explicitly. The stability of memory and words, of feelings and faculties is deceptive; they are not so stable at all. Only their externalized symbolic representations are more or less stable, and even their stability as words etc. can be shattered easily. The point we would like to emphasize here is that everything that happens in the mind is constructed on the fly, while the construction is completed only with the ultimate step of externalization, that is, speaking or writing. The notion of “mental content” is thus a bit misleading.

The mental may be conceived most appropriately as a manifold of stacked and intertwined processes. This holds for the naturalist perspective as well as for the abstract perspective, as we have argued in the previous chapter. It is simply impossible to find a single stable point within the (abstract) dynamics between model, concept, mediality and virtuality, which could be thought of as spanning a space. We called it the choreostemic space.

For the following remarks about the relation between text and machines and the practitioners engaged in building machines to handle texts we have to keep in mind just those two things: (i) there is a primacy of interpretation, (ii) the mental is a non-representative dynamic process that can’t be formalized (in the sense of “being represented” by a formula).

In turn this means that we should avoid referring to formulas when going to build a “text machine”. Text machines will be helpful only if their understanding of texts, even if it is a rudimentary understanding, follows the same abstract principles as our human understanding of texts does. Machines pretending to deal with texts, but actually only moving dead formal symbols back and forth, as is the case in statistical text mining, n-gram based methods and similar, are not helpful at all. The only thing that happens is that these machines introduce a formalistic structure into our human life. We may say that these techniques render humans helpful to machines.

Nowadays we can find a whole techno-scientific community that is engaged in the field of machine learning devoted to “textual data”. The computers are programmed in such a way that they can be used to classify texts. The idea is to provide some keywords, or anti-words, or even a small set of sample texts, which then are taken by the software as a kind of template that is used to build a selection model. This model then is used to select resembling texts from a large set of texts. We have to be very clear about the purpose of these software programs: they classify texts.

The input data for doing so is taken from the texts themselves. More precisely, they are preprocessed according to specialized methods. Each of the texts gets described by a possibly large set of “features” that have been extracted by these methods. The obvious point is that the procedure is purely empirical in the strong sense. Only the available observations (the texts) are taken to infer the “similarity” between texts. Usually, not even linguistic properties are used to form the empirical observations, albeit there are exceptions. People use the so-called n-gram approach, which is only little more than counting letters. It is a zero-knowledge model about the series of symbols, which humans interpret as text. Additionally, the frequency or relative positions of keywords and anti-words are usually measured and expressed by mostly quite simple statistical methods.
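
To make the zero-knowledge character of such features palpable, here is a minimal sketch in Python (not taken from any existing tool; the toy texts are hypothetical) of how a character n-gram “description” of a text is typically produced and compared. Nothing in it refers to meaning, context or interpretation; it is indeed little more than counting letters.

```python
from collections import Counter

def char_ngrams(text, n=3):
    """Slide a window of n characters over the text and count the fragments.
    No linguistic knowledge is involved: 'chair' and 'h&e%43' are treated alike."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def cosine(a, b):
    """Crude similarity between two n-gram count vectors."""
    keys = set(a) | set(b)
    dot = sum(a[k] * b[k] for k in keys)
    norm = (sum(v * v for v in a.values()) ** 0.5) * (sum(v * v for v in b.values()) ** 0.5)
    return dot / norm if norm else 0.0

# hypothetical toy "corpus"
t1 = "The chair stands in the corner of the room."
t2 = "A chair and a table stand in the room."
t3 = "Stock prices fell sharply after the announcement."

print(cosine(char_ngrams(t1), char_ngrams(t2)))  # relatively high
print(cosine(char_ngrams(t1), char_ngrams(t3)))  # relatively low
```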

Well, classifying texts is something quite different from understanding texts. Of course. Yet, said community tries to reproduce the “classification” achieved or produced by humans. Thus, any of the engineers of the field of machine learning directed to texts implicitly claims a kind of understanding. They even organize competitions.

The problems with the statistical approach are quite obvious. Quine called it the dogma of empiricism and illustrated it with the Gavagai anecdote, a situation that even provides much more information than the text alone. In order to understand a text we need references to many things outside the particular text(s) at hand. Two of those are especially salient: concepts and the social dimension. Directly opposite to the belief of positivists, concepts can’t be defined in advance of a particular interpretation. Using catalogs of references does not help much, if these catalogs are used just as lists of references. The software does not understand “chair” by the “definition” stored in a database, or even by the set of such references. It simply does not care whether there are encoded ASCII codes that yield the symbol “chair” or the symbol “h&e%43”. Douglas Hofstadter has been stressing this point over and over again, and we fully agree with that.

From this necessity of a particular and rather wide “background” (a notion by Searle) derives the second problem, which is much more serious, even devastating to the soundness of the whole empirico-statistical approach. The problem is simple: even we humans have to read a text before being able to understand it. Only upon understanding can we classify it. Of course, the brain of many people is trained sufficiently to work on the relations of a text and any of its components while reading it. The basic setup of the problem, however, remains the same.

Actually, what is happening is a constantly repeated re-reading of the text, taking into account all available insights regarding the text and the relations of it to the author and the reader, while this re-reading often takes place in the memory. To perform this demanding task in parallel, based on the “cache” available from memory, requires a lot of experience and training, though. Less experienced people indeed re-read the text physically.

The consequence of all of that is that we could not determine the best empirical discriminators for a particular text while still reading it, in order to select it as if we were using a model. Actually, we can’t determine the set of discriminators before we have read it all, at least not before the first pass. Let us call this the completeness issue.

The very first insight is thus that a one-shot approach in text classification is based on a misconception. The software and the human would have to align to each other in some kind of conversation. Otherwise it can’t be specified in principle what the task is, that is, which texts should actually be selected. Any approach to text classification not following the “conversation scheme” is necessarily bare nonsense. Yet, that’s not really a surprise (except for some of the engineers).

There is a further consequence of the completeness issue. We can’t set up a table to learn from at all. This too is not a surprise, since setting up a table means to set up a particular symbolization. Any symbolization apriori to understanding must count as a hypothesis. As simple as that. Whether it matches our purpose or not, we can’t know before we have understood the text.

However, in order to make the software learn something we need assignates (traditionally called “properties”) and some criteria to distinguish better models from less performant ones. In other words, we need a recurrent scheme on the technical level as well.

That’s why it is not perfectly correct to call texts “unstructured data”. (Besides the fact that data are not “out there”: we always need a measurement device, which in turn implies some kind of model AND some kind of theory.) In the case of texts, imposing a structure onto a text simply means to understand it. We even could say that a text as text is not structurable at all, since the interpretation of a text can never be regarded as finished.

All together, we may summarize the issue of complexity of texts as deriving from the following properties in the following way:

  • – there are different levels of context, which additionally stretch across surrounds of very different sizes;
  • – there are rich organizational constraints, e.g. grammars;
  • – there is a large corpus of words, while any of them bears meaning only upon interpretation;
  • – there is a large number of relations that not only form a network, but which also change dynamically in the course of reading and of interpretation;
  • – texts are symbolic: spatial neighborhood does not translate into reference, in neither way;
  • – understanding of texts requires a wealth of external, and often quite abstract, concepts that appear as significant only upon interpretation, as well as a social embedding of mutual interpretation.

This list should at least exclude any attempt to defend the empirico-statistical approach as a reasonable one, except for the fact that it conveys a better-than-nothing attitude. This brings us to the question of utility.

Engineers build machines that are supposedly useful; more exactly, they are intended to fulfill a particular purpose. Mostly, however, machines, and any technology in general, are useful only upon processes of subjective appropriation. The most striking example for this is the car. Likewise, computers have evolved not for reasons of utility, but rather for gaming. Video did not become popular for artistic reasons or for commercial ones, but due to the possibilities the medium offered for the sex industry. The lesson here is that an intended purpose is difficult to enforce in the actual usage of a technology. On the other hand, every technology may exert some gravitational force to develop a then unintended symbolic purpose, and considerable value regarding that purpose. So, could we agree that the classification of texts as it is performed by contemporary technology is useful?

Not quite. We can’t regard the classification of texts as it is possible with the empirico-statistical approach as a reasonable technology. For the classification of texts can’t be separated from their understanding. All we can accomplish by this approach is to filter out those texts that do not match our interests with a sufficiently high probability. Yet, for this task we do not need text classification.

Architectures like the 3L-SOM could also be expected to play an important role in translation, as translation requires an even deeper understanding of texts than is needed for sorting texts according to a template.

Besides the necessity for this doubly recurrent scheme we haven’t said much so far about how then actually to treat the text. Texts should not be mistaken as empiric data. That means that we have to take a modified stance regarding measurement itself. In several essays we already mentioned the conceptual advantages of the two-layered (TL) approach based on self-organizing maps (TL-SOM). We already described in detail how the TL-SOM works, including the basic preparation of the random graph as it has been described by Kohonen.

The important thing about the TL-SOM is that it is not a device for modeling the similarity of texts. It is just a representation, even if a very powerful one, because it is based on probabilistic contexts (random graphs). More precisely, it is just one of many possible representations, even if it is much more appropriate than n-grams and other jokes. We should NOT even consider the TL-SOM as so-called “unsupervised modeling”, as the distinction between unsupervised and supervised is just another myth (i.e. nonsense when it comes to quantitative models). The TL-SOM is nothing else than an instance of associative storage.

The trick of using a random graph (see the link above) is that the surrounds of words are differentially represented as well. The Kohonen model is quite scarce in this respect, since it applies a completely neutral model. In fact, words in a text are represented as if they would be all the same: of the same kind, of the same weight, etc. That’s clearly not reasonable. Instead, we should represent a word in several, different manners into the same SOM.

Yet, the random graph approach should not be considered just as a “trick”. We repeatedly argued (for instance here) that we have to “dissolve” empirical observations into a probabilistic (re)presentation in order to evade and to avoid the pseudo-problem of “symbol grounding”. Note that even by the practice of setting up a table in order to organize “data” we are already crossing the Rubicon into the realm of the symbolic!

The real trick of the TL-SOM, however, is something completely different. The first layer represents the random graph of all words; the actual pre-specific sorting of texts, however, is performed by the second layer on the output of the first layer. In other words, the text is “renormalized”, and the SOM itself is used as a measurement device. This renormalization allows us to organize data in a standardized manner while avoiding the symbolic fallacy. To our knowledge, this possible usage of the renormalization principle has not been recognized so far. It is indeed a very important principle that puts many things in order. We will deal with this issue again later in a separate contribution.
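
The two-layer idea can be sketched in a few dozen lines of Python. This is emphatically not the TL-SOM as described by Kohonen or as used in SomFluid; it is a toy with a 1-D map, crude co-occurrence windows instead of proper random graphs, and hypothetical data. It only illustrates the structure: word contexts are mapped by a first SOM, and each text is then re-described (“renormalized”) as a histogram of hits on that first map, which serves as input to the second layer.

```python
import numpy as np

class MiniSOM:
    """A deliberately minimal self-organizing map (1-D grid for brevity)."""
    def __init__(self, n_nodes, dim, seed=0):
        rng = np.random.default_rng(seed)
        self.w = rng.random((n_nodes, dim))

    def bmu(self, x):
        """Index of the best-matching node for one input vector."""
        return int(np.argmin(((self.w - x) ** 2).sum(axis=1)))

    def train(self, data, epochs=20, lr=0.3, radius=2.0):
        for e in range(epochs):
            a = lr * (1 - e / epochs)
            r = max(radius * (1 - e / epochs), 0.5)
            for x in data:
                b = self.bmu(x)
                d = np.abs(np.arange(len(self.w)) - b)
                h = np.exp(-(d ** 2) / (2 * r ** 2))[:, None]
                self.w += a * h * (x - self.w)

def word_context_vectors(tokens, vocab, win=2):
    """Crude stand-in for the random-graph contexts: each word occurrence is
    described by the words surrounding it."""
    idx = {w: i for i, w in enumerate(vocab)}
    vecs = []
    for i, t in enumerate(tokens):
        v = np.zeros(len(vocab))
        for j in range(max(0, i - win), min(len(tokens), i + win + 1)):
            if j != i:
                v[idx[tokens[j]]] += 1.0
        vecs.append(v)
    return np.array(vecs)

def text_fingerprint(tokens, vocab, som1):
    """'Renormalization': a text is re-described as the histogram of
    first-layer nodes its word contexts activate."""
    hist = np.zeros(len(som1.w))
    for v in word_context_vectors(tokens, vocab):
        hist[som1.bmu(v)] += 1.0
    return hist / max(hist.sum(), 1.0)

# hypothetical toy corpus
texts = ["the cat sat on the mat".split(),
         "the dog sat on the rug".split(),
         "markets fell on monday morning".split()]
vocab = sorted({w for t in texts for w in t})

som1 = MiniSOM(n_nodes=12, dim=len(vocab))                 # first layer: word contexts
som1.train(np.vstack([word_context_vectors(t, vocab) for t in texts]))

fingerprints = np.array([text_fingerprint(t, vocab, som1) for t in texts])
som2 = MiniSOM(n_nodes=6, dim=fingerprints.shape[1])       # second layer: texts
som2.train(fingerprints)
print([som2.bmu(f) for f in fingerprints])                 # similar texts land on nearby nodes
```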

Only on the basis of the associative storage, taken as an entirety, does appropriate modeling become possible for textual data. The tremendous advantage of that is that the structure for any subsequent consideration now remains constant. We may indeed set up a table. The content of this table, the data, however, is not derived directly from the text. Instead we first apply renormalization (a technique known from quantum physics, cf. [1]).

The input is some description of the text completely in terms of the TL-SOM. More explicitly, we have to “observe” the text as it behaves in the TL-SOM. Here, we are indeed legitimized to treat the text as an empirical observation, albeit we can, of course, observe the text in many different ways. Yet, observing means to conceive the text as a moving target, as a series of multitudes.

One of the available tools is Markov modeling, either as Markov chains, or by means of Hidden Markov Models. But there are many others. Most significantly, probabilistic grammars, even probabilistic phrase structure grammars can be mapped onto Markov models. Yet, again we meet the problem of apriori classification. Both models, Markovian as well as grammarian, need an assignment of grammatical type to a phrase, which often first requires understanding.

Given the autonomy of texts, their temporal structure and the impossibility to apply apriori schematism, our proposal is that we just have to conceive of a text like we do of (higher) animals. Like an animal in its habitat, we may think of the text as inhabiting the TL-SOM, our associative storage. We can observe paths, their length and form, preferred neighborhoods, velocities, the size and form of the habitat.
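
Such “behavioural” descriptors are easy to compute once the text has been turned into a sequence of best-matching nodes on the map. The following sketch is hypothetical (the node sequences are made up and would, under the assumptions of the previous sketch, come from something like MiniSOM.bmu() per word context); it derives habitat size, step lengths and a crude transition matrix, i.e. a Markov chain over the map.

```python
import numpy as np

def path_descriptors(node_seq, n_nodes):
    """Describe a text by how it 'moves' across the map: habitat size,
    mean step length, and a transition (Markov) matrix over nodes."""
    seq = np.asarray(node_seq)
    steps = np.abs(np.diff(seq))                      # step lengths on a 1-D grid
    trans = np.zeros((n_nodes, n_nodes))
    for a, b in zip(seq[:-1], seq[1:]):
        trans[a, b] += 1.0
    row = trans.sum(axis=1, keepdims=True)
    trans = np.divide(trans, row, out=np.zeros_like(trans), where=row > 0)
    return {
        "habitat": len(set(seq.tolist())) / n_nodes,  # fraction of the map visited
        "mean_step": float(steps.mean()) if len(steps) else 0.0,
        "transitions": trans,
    }

def path_similarity(d1, d2):
    """Compare two texts by the overlap of their movement patterns."""
    t = d1["transitions"] - d2["transitions"]
    return 1.0 / (1.0 + np.abs(t).sum()
                  + abs(d1["habitat"] - d2["habitat"])
                  + abs(d1["mean_step"] - d2["mean_step"]))

# hypothetical node sequences for three texts
text_a = [2, 3, 3, 4, 2, 3, 5, 4]
text_b = [2, 3, 4, 3, 2, 4, 5, 3]
text_c = [9, 10, 11, 9, 10, 11, 10]

da, db, dc = (path_descriptors(s, n_nodes=12) for s in (text_a, text_b, text_c))
print(path_similarity(da, db), path_similarity(da, dc))
```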

Similar texts will behave in a similar manner. Such similarity is far beyond (better: as if from another planet) the statistical approach. We also can see now that the statistical approach is being trapped by the representationalist fallacy. This similarity is of course a relative one. The important point here is that we can describe texts in a standardized manner strictly WITHOUT reducing their content to statistical measures. It is also quite simple to determine the similarity of texts, whether as a whole, or whether regarding any part of it. We need not determine the range of our source at all apriori to the results of modeling. That modeling introduces a third logical layer. We may apply standard modeling, using a flexible tool for transformation and a further instance of a SOM, as we provide it as SomFluid in the downloads. The important thing is that this last step of modeling has to run automatically.

The proposed structure keeps any kind of reference completely intact. It also draws on its collected experience, that is, all the texts it has been digesting before. It is not necessary to determine stopwords and similar gimmicks. Of course, we could, but that’s part of the conversation. Just provide an example of any size, just as it is available. Everything from two words, to a sentence, to a paragraph, to the content of a directory will work.

Such a 3L-SOM is very close to what we reasonably could call “understanding texts”. But does it really “understand”?

As such, not really. First, images should be stored in the same manner (!!), that is, preprocessed as random graphs over local contexts of various size, into the same (networked population of) SOM(s). Second, a language production module would be needed. But once we have those parts working together, then there will be full understanding of texts.

(I take any reasonable offer to implement this within the next 12 months, seriously!)

Conclusion

Understanding is a faculty to move around in a world of symbols. That’s not meant as a trivial issue. First, the world consists of facts, where facts comprise a universe of dynamic relations. Symbols are just not like traffic signs or pictograms, as these belong to the simpler kinds of symbols. Symbolizing is a complex, social, mediatized diachronic process.

Classifying, understood as “performing modeling and applying models”, consists basically of two parts. One of them could be automated completely, while the other one could not be treated by a finite or apriori definable set of rules at all: setting the purpose. In the case of texts, classifying can’t be separated from understanding, because the purpose of a text emerges only upon interpretation, which in turn requires a manifold of modeling raids. Modeling a (quasi-)physical system is completely different from that; it is almost trivial. Yet, the structure of a 3L-SOM could well evolve into an arrangement that is capable of understanding in a similar way as we humans do. More precisely, and a bit more abstractly, we also could say that a “system” based on a population of 3L-SOMs will once be able to navigate in the choreostemic space.

References
  • [1] Bertrand Delamotte (2004). A hint of renormalization. Am. J. Phys. 72, 170-184. Available online: arXiv:hep-th/0212049v3.

۞

Dealing with a Large World

June 10, 2012 § Leave a comment

The world as an imaginary totality of all actual and virtual relationships between assumed entities can be described in innumerable ways. Even what we call a “characteristic” forms only in a co-dependent manner together with the formation processes of entities and relationships. This fact is particularly disturbing if we encounter something for the first time, without the guidance provided by more or less applicable models, traditions, beliefs or quasi-material constraints. Without those means any selection out of all possible or constructible properties is doomed to be fully contingent, subject to pure randomness.

Yet, this does not lead to results that are similarly random. Given the equipment with tools and methods for the task or situation at hand, modeling is for the major part the task of reducing the infiniteness of possible selections in such a way that the resulting representation can be expected to be helpful. Of course, this “utility” is not a hard measure in itself. It is not only dependent on the subjective attitude to risk, mainly the model risk and the prediction risk; utility is also relative to the scale of the scope, in other words, whether one is interested in motor or other purely physical aspects, tactical aspects or strategic aspects, whether one is interested in more local or global aspects, both in time and space, or whether one is interested in any kind of balanced mixture of those aspects. Establishing such a mixture is a modeling task in itself, of course, albeit one that is often accomplished only implicitly.

The randomness mentioned above is a direct corollary of the empirical underdetermination1. From a slightly different perspective, we also may say that it is an inevitable consequence of the primacy of interpretation. And we also should not forget that language and particularly metaphors in language—and any kind of analogical thinking as well—are means to deal constructively with that randomness, turning physical randomness into contingency. Even within the penultimate guidance of predictivity—it is only a soft guidance though—large parts of what we reasonably could conceive as facts (as temporarily fixed arrangement of relations) is mere collaborative construction, an ever undulating play between the individual and the general.
Even if analogical thinking indeed is the cornerstone, if not the Acropolis, of human mindedness, it is always preceded by and always rests upon modeling. Only a model allows us to pick some aspect out of the otherwise unsorted impressions taken up from the “world”. In previous chapters we already discussed quite extensively the various general as well as some technical aspects of modeling, from an abstract as well as from a practical perspective.2 Here we focus on a particular challenge, the selection task regarding the basic descriptors used to set up a particular model.

Well, given a particular modeling task we have the practical challenge to reduce a large set of pre-specific properties into a small set of “assignates” that together represent in some useful way the structure of the dynamics of the system that we have observed. How to reduce a set of properties created by observation that comprises several hundreds of them?
The particular challenge arises even in the case of linear systems if we try to avoid subjective “cut-off” points that are buried deep in the method we use. Such heuristic means are widespread in statistically based methods. The bad thing about that is that you can’t control their influence on the results. Since the task comprises the selection of properties for the description of the entities (prototypes) to be formed, such arbitrary thresholds, often justified or even enforced just by the method itself, will exert a profound influence on the semantic level. In other words, the method corroborates its own assumption of neutrality.

Yet, we also never should assume linearity of a system, because most of the interesting real systems are non-linear, even in the case of trivial machines. Brute force approaches are not possible, because the number of possible models is 2^n, with n the number of properties or variables. Non-linear models can’t be extrapolated from known ones, of course. The Laplacean demon3 became completely wrapped by Thomean folds4, being even quite worried by things like Turing’s formal creativity5.

When dealing with observations from “non-linear entities”, we are faced with the necessity to calculate and evaluate any selection of variables explicitly. Assuming a somewhat phantastic figure of 0.0000001 seconds (1e-7) needed to calculate a single model, we still would need roughly 4·10^15 years to visit all models if we had to deal with just 100 variables. To make it more palpable: it would take almost a million times longer than the age of the Earth, which is roughly 4.5 billion years…
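
The arithmetic can be checked in a few lines; the cost per model is of course just an assumption.

```python
n_vars = 100
models = 2 ** n_vars                 # all possible subsets of variables
secs_per_model = 1e-7                # assumed (optimistic) cost per model
years = models * secs_per_model / (3600 * 24 * 365.25)
print(f"{years:.2e} years")          # ~4e15 years, roughly a million Earth ages
```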

Obviously, we have to drop the idea that we can “prove” the optimality of a particular model. The only thing we can do is to minimize the probability that within a given time T we could find a better model. On the other hand, the data are not of unbounded complexity, since real systems are not either. There are regularities, islands of stability, so to speak. There is always some structure, otherwise the system would not persist as an observable entity. As a consequence, we can organize the optimization of this “failure time probability”; we may even consider this as a second-order optimization. We may briefly note that the actual task thus is not only to select a proper set of variables; we also should identify the relations between the observed and constructed variables. Of course, there are always several if not many sets of variables that we could consider as “proper”, precisely for the reason that they form a network of relations, even if this network is probabilistic in nature and itself kind of a model.

So, how to organize this optimization? Basically, everything has to be organized as nested, recurrent processes. The overall game we could call learning. Yet, it should be clear that every “move” and every fixation of some parameter and its value is nothing else than a hypothesis. There is no “one-shot-approach”, and no linear progression either.
If we want to avoid naive assumptions—and any assumption that remains untested is de facto a naive assumption—we have to test them. Everything is trial and error, or expressed in a more educated manner, everything has to be conceived as a hypothesis. Consequently we can reduce the number of variables only by a recurrent mechanism. As a lemma we conclude that any approach that reduces the number of variables not in a recurrent fashion can’t be conceived as a sound approach.

Contingent Collinearities

It is the structuredness of the observed entity that causes the similarity of any two observations across all available or apriori chosen properties. We also may expect that any two variables could be quite “similar”6 across all available observations. This provides the first two opportunities for reducing the size of the problem. Note that such reduction by “black-listing” applies only to the first steps in a recurrent process. Once we have evidence that certain variables do not contribute to the predictivity of our model, we may loosen the intensity of any of the reductions! Instead of removing them from the space of expressibility we may preferably maintain a weighted preference list in later stages of modeling.
So, if we find n observations or variables being sufficiently collinear, we could remove a portion p(n) from this set, or we could compress them by averaging.
R1: reduction by removing or compressing collinear records.
R2: reduction by removing or compressing collinear variables.
A feasible criterion for assessing the collinearity is the monotonicity in the relationship between two variables as it is reflected by Spearman’s correlation. We also could apply K-means clustering using all variables, then averaging all observations that are “sufficiently close” to the center of the clusters.
Albeit the respective thresholding is only a preliminary tactical move, we should be aware of the problematics we introduce by such a reduction. Firstly, it is the size of the problem that brings in a notion of irreversibility, even if we are fully aware of the preliminarity. Secondly, R1 is indeed critical because it is in some quite obvious way a petitio principii. Even tiny differences in some variables could be masked by larger differences in variables that ultimately are recognized as irrelevant. Hence, very tight constraints should be applied when performing R1.
When removing collinear records we also have to take care of the outcome indicator. Often, the focused outcome is much less frequent than its “opposite”. Preferably, we should remove records that are marked as negative outcome, up to a ratio of 1:1 between positive and negative outcome in the reduced data. Such “adaptive” sampling is similar to so-called “biased sampling”.
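
Picking up the Spearman criterion and the adaptive sampling from above, here is a minimal sketch of R1/R2 in Python with pandas and numpy. The data, column names and thresholds are hypothetical and, as said, only preliminary; the dropped variables should be kept as candidates for later re-entry.

```python
import numpy as np
import pandas as pd

def reduce_collinear_variables(df, threshold=0.95):
    """R2: drop one variable of every pair whose absolute Spearman correlation
    (monotonicity) exceeds the threshold. Return the survivors and the dropped names."""
    corr = df.corr(method="spearman").abs()
    drop = set()
    cols = list(df.columns)
    for i, a in enumerate(cols):
        for b in cols[i + 1:]:
            if a not in drop and b not in drop and corr.loc[a, b] > threshold:
                drop.add(b)
    return df.drop(columns=sorted(drop)), sorted(drop)

def rebalance_records(df, target="outcome", ratio=1.0, seed=0):
    """R1 (adaptive sampling): thin out the negative class down to
    ratio * size of the positive class."""
    pos = df[df[target] == 1]
    neg = df[df[target] == 0]
    keep_n = min(len(neg), int(len(pos) * ratio))
    return pd.concat([pos, neg.sample(n=keep_n, random_state=seed)])

# hypothetical data: 200 observations, a few raw variables, binary outcome
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 3))
data = pd.DataFrame({
    "v1": x[:, 0],
    "v2": x[:, 0] * 2 + 0.01 * rng.normal(size=200),   # nearly collinear with v1
    "v3": x[:, 1],
    "v4": x[:, 2],
    "outcome": (x[:, 1] > 0).astype(int)})

reduced, dropped = reduce_collinear_variables(data.drop(columns="outcome"))
balanced = rebalance_records(data, "outcome")
print(dropped, len(balanced))
```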

Directed Collinearities

In addition to those two collinearities there is a third one, which is related to the purpose of the model. Variables that do not contribute to the predictive reconstruction of the outcome we could call “empirically empty”.

R3: reduction by removing empirically empty variables

Modeling without a purpose can’t be considered to be modeling at all7, so we always have a target variable available that reflects the operationalization of the focused outcome. We could argue that only those variables are interesting for a detailed inspection that are collinear to the target variable.

Yet, that’s a problematic argument, since we need some kind of model to make the decision whether to exclude a variable or not, based on some collinearity measure. Essentially, that model claims to predict the predictivity of the final model, which of course is not possible. Any such apriori “determination” of the contribution of a variable to the final predictivity of a model is nothing else than a very preliminary guess. Thus, we indeed should treat it just as a guess, i.e. we should consider it as a propensity weight for selecting the variable. In the first explorative steps, however, we could choose an aggressive threshold, causing the removal of many variables from the vector.
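
A sketch of R3 treated as a guess rather than a verdict might look as follows (Python with scipy; names and the floor value are hypothetical): the correlation with the target is turned into a selection propensity, so “empirically empty” variables become unlikely, but never strictly impossible, to enter a model.

```python
import numpy as np
import pandas as pd
from scipy.stats import spearmanr

def selection_propensities(df, target, floor=0.05):
    """R3 as propensity weights: |Spearman rho| against the target, normalized,
    with a floor so that no variable is excluded for good."""
    weights = {}
    for col in df.columns:
        if col == target:
            continue
        rho, _ = spearmanr(df[col], df[target])
        if np.isnan(rho):
            rho = 0.0
        weights[col] = max(abs(rho), floor)
    total = sum(weights.values())
    return {k: v / total for k, v in weights.items()}

# tiny hypothetical usage
rng = np.random.default_rng(0)
df = pd.DataFrame({"a": rng.normal(size=100), "b": rng.normal(size=100)})
df["y"] = (df["a"] + 0.1 * rng.normal(size=100) > 0).astype(int)
print(selection_propensities(df, target="y"))   # "a" receives most of the weight
```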

Splitting

R1 removes redundancy across observations. The same effect can be achieved by a technique called “bagging”, or similarly “foresting”. In both cases a comparatively small portion of the observations is taken to build a “small” model, and the “bag” or “forest” of all small models is then taken to build the final, compound model. Bagging as a technique of “split & reduce” can be applied in the variable domain as well.

R4: reduction of complexity by splitting

Confirming

Once an acceptable model or set of models has been built, we can check the postponed variables one after another. In the case of splitting, the confirmation is implicitly performed by weighting the individual small models.
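
A schematic sketch of R4 together with this implicit confirmation step, assuming a scikit-learn style estimator as the “small model” (everything else, including the weighting by out-of-bag accuracy, is a simplifying assumption, not a prescription):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def bagged_models(X, y, fit_small_model, n_bags=20, obs_frac=0.2, var_frac=0.3, seed=0):
    """R4: many 'small' models on random subsets of observations AND variables;
    each model is weighted by its accuracy on the left-out observations."""
    rng = np.random.default_rng(seed)
    n_obs, n_vars = X.shape
    bag = []
    for _ in range(n_bags):
        rows = rng.choice(n_obs, size=max(2, int(n_obs * obs_frac)), replace=False)
        cols = rng.choice(n_vars, size=max(1, int(n_vars * var_frac)), replace=False)
        model = fit_small_model(X[np.ix_(rows, cols)], y[rows])
        oob = np.setdiff1d(np.arange(n_obs), rows)          # implicit confirmation
        weight = model.score(X[np.ix_(oob, cols)], y[oob])
        bag.append((cols, model, weight))
    return bag

def bag_predict(bag, X):
    """Weighted vote of all small models."""
    num = sum(w * m.predict(X[:, cols]) for cols, m, w in bag)
    den = sum(w for _, _, w in bag)
    return num / den

# hypothetical usage
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 40))
y = (X[:, 3] + X[:, 7] > 0).astype(int)
bag = bagged_models(X, y, lambda Xs, ys: LogisticRegression(max_iter=200).fit(Xs, ys))
print((np.round(bag_predict(bag, X)) == y).mean())
```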

Compression and Redirection

Elsewhere we already discussed the necessity and the benefits of separating the transformation of data from the association of observations. If we separate them, we can see that everything we need is an improvement or a preservation of the potential distinguishability of observations. The associative mechanism need not “see” anything that even comes close to the raw data, as long as the resulting association of observations leads to a proper derivation of prototypes.8

This opens the possibility for a compression of the observations, e.g. by the technique of random projection. Random projection maps vector spaces onto each other. If the dimensionality of the resulting vector of reduced size remains large enough (100+), then the separability of the vectors is kept intact. The reason is that in a high-dimensional vector space almost all vectors are “orthogonal” to each other. In other words, random projection does not change the structure of the relations between vectors.

R5: reduction by compression

During the first explorative steps one could construct a vector space of d=50, which allows a rather efficient exploration without introducing too much noise. Noise in a normalized vector space essentially means changing the “direction” of the vectors; the effect of changing the length of vectors due to random projection is much less profound. Note also that introducing noise is not a bad thing at all: it helps to avoid overfitting, resulting in more robust models.
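
A minimal sketch of R5 (Python with numpy; dimensions and the scaling are assumptions): a fixed random matrix compresses long observation vectors, and the pairwise distance structure, i.e. the distinguishability of the observations, survives almost unchanged.

```python
import numpy as np

def make_projection(n_in, n_out=50, seed=0):
    """R5: a fixed random matrix; store it, since it is the parameter of the
    transformation and must be reused when classifying new observations."""
    rng = np.random.default_rng(seed)
    return rng.normal(0.0, 1.0 / np.sqrt(n_out), size=(n_in, n_out))

def project(X, R):
    return X @ R

def pairwise_distances(M):
    d = np.linalg.norm(M[:, None, :] - M[None, :, :], axis=-1)
    return d[np.triu_indices(len(M), 1)]

# sketch: 20 observations with 1000 raw assignates, compressed to 50 columns
rng = np.random.default_rng(1)
X = rng.normal(size=(20, 1000))
R = make_projection(1000, 50)
Z = project(X, R)
print(np.corrcoef(pairwise_distances(X), pairwise_distances(Z))[0, 1])  # close to 1
```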

If we conceive of this compression by means of random projection as a transformation, we could store the matrix of random numbers as the parameters of that transformation. We then could apply it in any subsequent classification task, i.e. when we apply the model to new observations. Yet, the transformation by random projection destroys the semantic link between observed variables and the predictivity of the model. Any of the columns after such a compression contains information from more than one of the input variables. In order to support understanding, we have to reconstruct the semantic link.
That’s fortunately not a difficult task, albeit it is only possible if we use an index that allows us to identify the observations even after the transformation. The result of building the model is a collection of groups of records, or indices, respectively. Based on these indices we simply identify those variables which minimize the ratio of the variance within the groups to the variance of the means per variable across the groups. This provides us the weights for the list of all variables, which can be used to drastically reduce the list of input variables for the final steps of modeling.
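
A sketch of this restoration of the semantic link (the grouping is simulated here; in practice it would come from the model built in the compressed space, and all names are hypothetical):

```python
import numpy as np

def variable_relevance(X_raw, group_ids):
    """For each original variable: variance within the groups divided by the
    variance of the group means. Small ratios mark variables that separate
    the groups found in the compressed space."""
    groups = np.unique(group_ids)
    ratios = np.empty(X_raw.shape[1])
    for j in range(X_raw.shape[1]):
        col = X_raw[:, j]
        means = np.array([col[group_ids == g].mean() for g in groups])
        within = np.mean([col[group_ids == g].var() for g in groups])
        between = means.var()
        ratios[j] = within / between if between > 0 else np.inf
    return ratios  # sort ascending to obtain a weighted preference list

rng = np.random.default_rng(0)
X = rng.normal(size=(120, 8))
gid = (X[:, 2] > 0).astype(int)          # pretend these groups came from the model
print(np.argsort(variable_relevance(X, gid))[:3])   # variable 2 should rank first
```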

The whole approach could be described as sort of a redirection procedure. We first neglect the linkage between semantics of individual variables and prediction in order to reduce the size of the task, then after having determined the predictivity we restore the neglected link.
This opens the road for an even more radical redirection path. We already mentioned that all we need to preserve through the transformation is the distinguishability of the observations, without distorting the vectors too much. This could be accomplished not only by random projection, though. If we interpret large vectors as a coherent “event”, we can represent them by the coefficients of wavelets, built from individual observations. The only requirement is that the observations consist of a sufficiently large number of variables, typically n>500.
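
The wavelet idea can be indicated with a hand-rolled single-level Haar step (a real application would of course use a proper wavelet library and several decomposition levels; the signal below is purely synthetic):

```python
import numpy as np

def haar_step(v):
    """One level of the Haar transform: pairwise averages (coarse shape of the
    'event') and pairwise differences (its detail)."""
    v = np.asarray(v, dtype=float)
    if len(v) % 2:
        v = np.append(v, v[-1])
    pairs = v.reshape(-1, 2)
    approx = pairs.mean(axis=1)
    detail = (pairs[:, 0] - pairs[:, 1]) / 2.0
    return approx, detail

def haar_compress(v, levels=3):
    """Keep only the coarse coefficients after a few levels: a short, fixed-size
    description of a long observation vector."""
    approx = np.asarray(v, dtype=float)
    for _ in range(levels):
        approx, _ = haar_step(approx)
    return approx

long_obs = np.sin(np.linspace(0, 12, 800)) + 0.1 * np.random.default_rng(0).normal(size=800)
print(len(haar_compress(long_obs)))   # 100 coefficients instead of 800
```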

Compression is particularly useful if the properties, i.e. the observed variables, do not bear much semantic value in themselves, as is the case in image analysis, the analysis of raw sensory data, or even in the modeling of textual information.

Conclusion

In this small essay we described five ways to reduce large sets of variables, or “assignates” (link) as they are called more appropriately. Since for pragmatic reasons a petitio principii can’t be avoided in attempting such a reduction, mainly due to the inevitable fact that we need a method for it, the reduction should be organized as a process that decreases the uncertainty in assigning a selection probability to the variables.

Regardless of the kind of mechanism used to associate observations into groups and thereby form the prototypes, a separation of transformation and association is mandatory for such a recurrent organization to be possible.

Notes

1. Quine  [1]

2. see: the abstract model, modeling and category theory, technical aspects of modeling, transforming data;

3. The “Laplacean Demon” refers to Laplace’s belief that if all parts of the universe could be measured, the future development of the universe could be calculated. As such it is the paradigmatic label for determinism. Today we know that even IF we could measure everything in the universe with arbitrary precision (which we could not, of course), we still could NOT pre-calculate the further development of the universe. The universe does not develop, it performs an open evolution.

4. René Thom [2] was the first to explicate the mathematical theory of folds in parameter space, which was dubbed “catastrophe theory” in order to reflect the subject’s experience of moving around in folded parameter spaces.

5. Alan Turing not only laid the foundations of deterministic machines for performing calculations; he was also the first to derive the formal structure of self-organization [3]. Based on these formal insights we can design the degree of creativity of a system.

The impossibility to know anything for sure is the first and basic reason for culture.

6. Note that determining similarity also requires apriori decisions about methods and scales, which need to be confirmed. In other words, we always have to start with a belief.

7. Modeling without a purpose can’t be considered to be modeling at all. Performing a clusterization by means of some algorithm does not create a model until we use it, e.g. in order to get some impression. Yet, as soon as we indeed take a look following some goal we imply a purpose. Unfortunately, in this case we would be enslaved by the hidden parameters built into the method. Things like unsupervised modeling, or “just clustering”, always imply hidden targets and implicit optimization criteria, determined by the method itself. Hence, such things can’t be regarded as a reasonable move in data analysis.

8. This sheds an interesting light on the issue of “representation”, which we could not follow here.

References
  • [1] W.V.O. Quine (1951). Two Dogmas of Empiricism.
  • [2] René Thom. Catastrophe Theory.
  • [3] Alan Turing (1952). The Chemical Basis of Morphogenesis.

۞

Prolegomena to a Morphology of Experience

May 2, 2012 § Leave a comment

Experience is a fundamental experience.

The very fact of this sentence demonstrates that experience differs from perception, much like phenomena are different from objects. It also demonstrates that there can’t be an analytic treatment or even solution of the question of experience. Experience is not only related to sensual impressions, but also to affects, activity, attention1 and associations. Above all, experience is deeply linked to the impossibility to know anything for sure or, likewise, apriori. This insight is etymologically woven into the word itself: in Greek, “peira” means “trial, attempt, experience”, influencing also the roots of “experiment” or “peril”.

In this essay we will focus on some technical aspects that are underlying the capability to experience. Before we go in medias res, I have to make clear the rationale for doing so, since, quite obviously so, experience could not be reduced to those said technical aspects, to which for instance modeling belongs. Experience is more than the techné of sorting things out [1] and even more than the techné of the genesis of discernability, but at the same time it plays a particular, if not foundational role in and for the epistemic process, its choreostemic embedding and their social practices.

Epistemic Modeling

As usual, we take the primacy of interpretation as one of the transcendental conditions, that is, a condition we can’t go beyond, even on the “purely” material level. As a suitable operationalization of this principle, still a quite abstract one and hence calling for situative instantiation, we chose the abstract model. In the epistemic practice, modeling does not, indeed never could, refer to data that is supposed to “reflect” an external reality. If we perform modeling as a pure technique, we are just modeling; yet creating a model for whatsoever purpose, so to speak “modeling as such”, or purposed modeling, is not sufficient to establish an epistemic act, which would also include the choice of the purpose and the choice of the risk attitude. Such a reduction is typical for functionalism, or for positions that claim a principle computability of epistemic autonomy, as for instance the computational theory of mind does.

Quite in contrast, purposed modeling in epistemic individuals already presupposes the transition from probabilistic impressions to propositional, or say, at least symbolic representation. Without performing this transition from potential signals, that is, mediated “raw” physical fluctuations in the density of probabilities, to the symbolic, it is impossible to create a structure, be it for instance a feature vector as a set of variably assigned properties, “assignates”, as we called them previously. Such a minimal structure, however, is mandatory for purposed modeling. Any (re)presentation of observations to a modeling method is thus already subsequent to prior interpretational steps.

Our abstract model that serves as an operationalization of the transcendental principle of the primacy of interpretation thus must also provide, or comprise, the transition from differences into proto-symbols. Proto-symbols are not just intensions or classes; they are, so to speak, non-empiric classes that have been derived from empiric ones by means of idealization. Proto-symbols are developed into symbols by means of the combination of naming and an associated practice, i.e. a repeating or reproducible performance, or still in other words, by rule-following. Only on the level of symbols may we then establish a logic, or claim absolute identity. Here we also meet the reason for the fact that in any real-world context a “pure” logic is not possible, as there are always semantic parts serving as a foundation of its application. Speaking about “truth-values” or “truth-functions” is, at the least, meaningless. Clearly, identity as a logical form is a secondary quality and thus quite irrelevant for the booting of the capability of experience. Such extended modeling is, of course, not just a single instance; it is itself a multi-leveled thing. It even starts with those properties of the material arrangement known as body that also allow an informational perspective. The most prominent candidate principle of such a structure is the probabilistic, associative network.

Epistemic modeling thus consists of at least two abstract layers: first, the associative storage of random contexts (see also the chapter “Context” for their generalization), where no purpose is imposed onto the materially pre-processed signals, and second, the purposed modeling. I am deeply convinced that such a structure is the only way to evade the fallacy of representationalism2. A working actualization of this abstract bi-layer structure may comprise many layers and modules.

Yet, once one accepts the primacy of interpretation, and there is little to say against it, if anything at all, then we are led directly to epistemic modeling as a mandatory constituent of any interpretive relationship to the world, for primitive operations as well as for the rather complex mental life we experience as humans, with regard to our relationships to the environment as well as with regard to our inner reality. Wittgenstein emphasized in his critical solipsism that the conception of reality as inner reality is the only reasonable one [3]. Epistemic modeling is the only way to keep meaningful contact with the external surrounds.

The Bridge

In its technical parts experience is based on an actualization of epistemic modeling. Later we will investigate the role and the usage of these technical parts in detail. Yet, the gap between modeling, even if conceived as an abstract, epistemic modeling, and experience is so large that we first have to shed some light on the bridge between these concepts. There are other issues with experience beyond the mere technical issues of modeling, and they are no less relevant for the technical issues, too.

Experience comprises both more active and more passive aspects, with regard to performance as well as to structure. Neither dichotomy must be taken as an ideally separated pair of categories, of course. Nor is the basic distinction into active and passive parts a new one. Kant distinguished receptivity and spontaneity as two complementary faculties that combine in order to bring about what we call cognition. Leibniz, in contrast, emphasized the necessity of activity even in basic perception; nowadays his view has been amply confirmed by research on sensing in organic (animals) as well as inorganic systems (robots). Obviously, the relation between activity and passivity is not a simple one, as soon as we leave the bright spheres of language.3

In the structural perspective, experience unfolds in a given space that we could call the space of experiencibility4. That space is spanned, shaped and structured by open and dynamic collections of theories, models, concepts and symbols of any kind, as well as by the mediality that is “embedding” them. Yet, experience also shapes this space itself. The situation is somewhat reminiscent of relativistic space in physics, or of social space in humans, where the embedding of one space into another affects both participants, the embedded as well as the embedding space. These aspects we should keep in mind for our investigation of the mechanisms that contribute to experience and to the experience of experience. As you can see, we again refute any kind of ontological stance, even to the smallest degree.5

Now, when asking about experience and its genesis, there are two characteristics of experience that force us to avoid the direct path. First, there is the deep linkage of experience to language. We must get rid of language for our investigation in order to avoid the experience of finding just language behind the language, or behind what we call, upfront, “experience”; yet, we should not forget about language either. Second, there is the self-referentiality of the concept of experience, which actually renders it into a strongly singular term. Once there are even only tiny traces of the capability for experience, the whole game changes, burying the initial roots and mechanisms that are necessary for the booting of that capability.

Thus, our first move consists in a reduction and linearization, which we have to catch up with later again, of course. We will achieve that by setting everything into motion, so to speak. The linearized question thus heads towards the underlying mechanisms6:

How do we come to believe that there are facts in the world? 7

What are—now viewed from the outside of language8—the abstract conditions and the practiced moves necessary and sufficient for the actualization of such statements?

Usually, the answer will refer to some kind of modeling. Modeling provides the possibility for the transition from the extensional epistemic level of particulars to the intensional epistemic level of classes, functions or categories. Yet, modeling does not provide sufficient reason for experience. Sure, modeling is necessary for it, but it is more closely related to perception, though not equivalent to it either. Experience as a kind of cognition thus can’t be conceived as a kind of “high-level perception”, quite contrary to the suggestion of Douglas Hofstadter [4]. Instead, we may conceive experience, in a first step, as the result of, and the activity around, the handling of the conditions of modeling.

Even in his earliest writings, Wittgenstein prominently emphasized that it is meaningless to conceive of the world as consisting from “objects”. The Tractatus starts with the proposition:

The world is everything that is the case.

Cases, in the Tractatus, are states of affairs that could be made explicit into a particular (logical) form by means of language. From this perspective one could derive the radical conclusion that without language there is no experience at all. Although we won’t agree with such a thesis, language is a major factor contributing to some often unrecognized puzzles regarding experience. Let us very briefly return to the issue of language.

Language establishes its own space of experiencibility, basically through its unlimited expressibility, which induces hermeneutic relationships. Probably mainly owing to this particular experiential sphere, language blurs, or even blocks, clear sight of the basic aspects of experience. Language can make us believe that there are phenomena as some kind of original stuff, existing “independently” out there, that is, outside human cognition.9 Yet, there is no such thing as a phenomenon or even an object that would “be” before experience, and for us humans not even before or outside of language. It is not even reasonable to speak about phenomena or objects as if they would exist before experience. De facto, it is almost nonsensical to do so.

Both objects as specified entities and phenomena at large are consequences of interpretation, in turn deeply shaped by cultural imprinting, and thus heavily dependent on language. Refuting that consequence would mean refuting the primacy of interpretation, which would fall into either naive realism or mysticism. Phenomenology as an ontological philosophical discipline is nothing but a misunderstanding (as ontology is henceforth); since phenomenology without ontological parts must turn into some kind of Wittgensteinian philosophy of language, it simply vanishes. Indeed, when already teaching in Cambridge, Wittgenstein once told a friend to report his position to the visiting Schlick, whom he refused to meet on this occasion, as “You could say of my work that it is phenomenology.” [5] Yet, what Wittgenstein called “phenomenology” is completely situated inside language and its practice, and although there might be a weak Kantian echo in his work, he never supported Husserl’s position of synthetic universals apriori. There is even some likelihood that Wittgenstein, strongly feeling constantly misunderstood by the members of the Vienna Circle, put this forward in order to annoy Schlick (a bit), at least to pay him back in kind.

Quite in contrast, in a Wittgensteinian perspective facts are a sort of collectively compressed belief about relations. If everybody believes in a certain model, of whatever reference and of almost arbitrary expectability, then there is a fact. This does not mean, however, that we get drowned in relativism. There are still the constraints implied by the (unmeasured and unmeasurable) utility of anticipation, both in its individual and in its collective flavor. On the other hand, yes, this indeed means that the (social) future is not determined.

More accurately, there is at least one fact, since the primacy of interpretation generates at least the collectivity as a further fact. Since facts take place in language, they do not just “consist” of content (please excuse such awful wording); there is also a pragmatics, and hence there are also at least two different grammars, and so forth.

How do we, then, individually construct concepts that we share as facts? Even if we need the mediation of a collective, a large deal of the associative work takes place in our minds. Facts are identifiable, thus distinguishable and enumerable. Facts are almost digitized entities: they are constructed from percepts through a process of intensionalization or even idealization, and they sit on the verge of the realm of symbols.

Facts are facts because they are considered to be valid, be it among a collective of people, across some period of time, or across a range of material conditions. This way they turn into a kind of apriori from the perspective of the individual, and there is only that perspective. Here we find the locus situs of several related misunderstandings, such as direct realism, Husserlean phenomenology, positivism, the thing as such, and so on. The fact is even synthetic, either by means of “individual”10 mental processes or by the working of a “collective reasoning”. But, of course, it is by no means universal, as Kant concluded on the basis of Newtonian science, or even as Schlick did in 1930 [6]. There is neither a universal real fact nor a particular one. It does not make sense to conceive of the world as consisting of independent objects.

As a consequence, when speaking about facts we usually studiously avoid the fact of risk. Participants in the “fact game” implicitly agree on abandoning the negotiation of affairs of risk. Despite the fact that empiric knowledge can never be considered “safe” or “secured”, during the fact game we always behave as if it could. Doing so is the more or less hidden work of language, which removes the risk (associated with predictive modeling) and replaces it by metaphorical expressibility. Interestingly, here we also meet the source field of logic. It is obvious (see Waves & Words) that language is neither an extension of logic, nor is it reasonable to consider it a vehicle for logic, i.e. for predicates. Quite to the contrary, the underlying hypothesis is that (practicing) language and (weaving) metaphors are the same thing.11 Such a language becomes a living language that (as Gier writes [5])

“[…] grows up as a natural extension of primitive behavior, and we can count on it most of the time, not for the univocal meanings that philosophers demand, but for ordinary certainty and communication.”

One might just modify Gier’s statement a bit by specifying „philosophers“ as idealistic, materialistic or analytic philosophers.

In “On Certainty” (OC, §359), Wittgenstein speaks of language as expressing primitive behavior and contends that ordinary certainty is “something animal”. This now we may take as a bridge that provides the possibility to extend our asking about concepts and facts towards the investigation of the role of models.

Related to this, there is a pragmatist aspect worth mentioning. Experience is a historicizing concept, much like knowledge. Both concepts are meaningful only in hindsight. As soon as we consider their application, we see that both of them refer only to one half of the story about the epistemic aspects of „life“. The other half of the epistemic story, directly implied by the inevitable need to anticipate, is predictive or, equivalently, diagnostic modeling. Abstract modeling in turn implies theory, interpretation and orthoregulated rule-following.

Epistemology thus should not be limited to „knowledge“, the knowable and its conditions. Epistemology has to explicitly include the investigation of the conditions of what can be anticipated.

In a still different way we thus may re-pose the question about experience as the transition from epistemic abstract modeling to the conditions of that modeling. This includes the instantiation of practicable models as well as the conditions for that instantiation, and also the conditions of the application of models. In technical terms this transition is represented by a problematic field: the model selection problem, or in more pragmatic terms, the model (selection) risk.

These two issues, the prediction task and the condition of modeling now form the second toehold of our bridge between the general concept of experience and some technical aspects of the use of models. There is another bridge necessary to establish the possibility of experience, and this one connects the concept of experience with languagability.

The following list provides an overview of the following chapters:

These topics are closely related to each other, indeed so closely that other sequences would be justifiable too. Their interdependencies also demand a bit of patience from you, the reader, as the picture will be complete only when we arrive at the results of modeling.

A last remark may be allowed before we start to delve into these topics. It should be clear by now that any kind of phenomenology is deeply incompatible with the view developed here. There are several related stances, e.g. the various shades of ontology, including the objectivist conception of substance. They are all rendered as irrelevant and inappropriate for any theory about episteme, whether in its machine-based form or regarding human culture, whether as practice or as reflecting exercise.

The Modeling Statement

As the very first step we have to clearly state the goal of modeling. From the outside that goal is pretty clear: given a set of observations and the respective outcomes, or targets, create a mapping function such that the observed data allow for a reconstruction of the outcome in an optimized manner. Finding such a function can be considered a simple form of learning if the function is „invented“. In most cases it is not learning but just the estimation of pre-defined parameters.12 In a more general manner we could also say that any learning algorithm is a map L from data sets to a ranked list of hypothesis functions. Note that accuracy is only one of the possible aspects of that optimization. Let us call this, for convenience, the „outer goal“ of modeling. Were such a mapping perfect within reasonable boundaries, we would have automatically found a possible transition from probabilistic presentation to propositional representation. We could consider the induction of a structural description from observations as completed. So far the secret dream of Hans Reichenbach, Carl Hempel, Wesley Salmon and many of their colleagues.
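A minimal sketch of this „outer goal“, with learning taken as a map from a data set to a ranked list of hypothesis functions; the candidate functions and the squared-error score are illustrative assumptions of mine, not taken from the text.

```python
# Minimal sketch: "learning" as a map L from a data set to a ranked list of
# hypothesis functions. Candidates and scoring are illustrative assumptions.
def rank_hypotheses(data, hypotheses):
    """data: list of (observation, outcome) pairs; hypotheses: dict name -> f(x)."""
    def avg_squared_error(f):
        return sum((f(x) - y) ** 2 for x, y in data) / len(data)
    return sorted((avg_squared_error(f), name) for name, f in hypotheses.items())

data = [(1, 2.1), (2, 3.9), (3, 6.2)]          # toy observations with outcomes
candidates = {
    "linear":    lambda x: 2.0 * x,            # pre-defined parameterizations
    "constant":  lambda x: 4.0,
    "quadratic": lambda x: 0.7 * x * x,
}
print(rank_hypotheses(data, candidates))       # lowest error (best hypothesis) first
```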

The said mapping function will never be perfect. The reasons for this comprise the complexity of the subject, noise in the measured data, unsuitable observables, or any combination of these. This induces a wealth of necessary steps and, of course, a lot of work. In other words, a considerable number of apriori and heuristic choices have to be made. Since a reliable, say analytic, mapping can’t be found, every single step in the value chain towards the model at once becomes questionable and has to be checked for its suitability and reliability. It is also clear that the model does not comprise just a formula. In real-world situations a differential modeling should be performed, much as in medicine a diagnosis is considered complete only if a differential diagnosis is included. This comprises the investigation of the influence of the method’s parameterization on the results. Let us call the whole bunch of respective goals the „inner goals“ of modeling.

So, being faced with the challenge of such an empirical mess, what does the statement about the goals of the „inner modeling“ look like? We could for instance demand to remove the effects of the shortfalls mentioned above, which cause the imperfect mapping: complexity of the subject, noise in the measured data, or unsuitable observables.

To make this more concrete we could say that the inner goals of modeling consist in a two-fold (and thus synchronous!) segmentation of the data, resulting in the selection of the proper variables and in the selection of the proper records, where this segmentation is performed under the condition of a preceding non-linear transformation of the embedding reference system. Ideally, the model identifies the data for which it is applicable. Only for those data is a classification then provided. It is pretty clear that this statement is an ambitious one. Yet, we regard it as crucial for any attempt to step across our epistemic bridge that brings us from particular data to the quality of experience. This transition includes something that is probably better known by the label „induction“. Thus, we finally arrive at a short statement about the inner goals of modeling:

How to conclude and what to conclude from measured data?

Obviously, if our data are noisy and include irrelevant values, any further conclusion will be unreliable. Yet, for any suitable segmentation of the data we need a model first. From this it directly follows that a suitable procedure for modeling can’t consist of just a single algorithm, or a „one-shot procedure“. Any single-step approach suffers from lots of hidden assumptions that influence the results and their properties in unforeseeable ways. Modeling that could be regarded as more than just an estimation of parameters by running an algorithm is necessarily a circular and, depending on the number of variables, possibly open-ended process.

Predictability and Predictivity

Let us assume a set of observations S obtained from an empirical process P. This process P should be called “predictable” if the results of the mapping function f(m), which serves as an instance of a hypothesis h from the space of hypotheses H, coincide with the outcomes of the process P in such a way that f(m) forms an expectation with a deviation d<ε for all f(m). In this case we may say that f(m) predicts P. This deviation is also called the “empirical risk”, and the purpose of modeling is often regarded as minimizing the empirical risk (ERM).
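Put a bit more formally (in my notation, not the author’s): given n recorded pairs (m_i, p_i) of observations and process outcomes and a loss function L chosen beforehand, the empirical risk of a hypothesis f is the average loss, and empirical risk minimization simply picks the hypothesis that minimizes it:

```latex
R_{\mathrm{emp}}(f) \,=\, \frac{1}{n}\sum_{i=1}^{n} L\bigl(f(m_i),\, p_i\bigr),
\qquad
f^{*} \,=\, \operatorname*{arg\,min}_{f \in H} \; R_{\mathrm{emp}}(f)
```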

There are then two important questions. Firstly, can we trust f(m), since f(m) has been built on a limited number of observations? Secondly, how can we make f(m) more trustworthy, given the limitation regarding the data? Usually, these questions are handled under the label of validation. Yet, validation procedures are not the only possible means to get an answer here. It would be a misunderstanding to think that it is the building or construction of a model that is problematic.

The first question can be answered only by considering different models. For obtaining a set of different models we could apply different methods. That would be o.k. if prediction were our sole interest. Yet, we also strive for structural insights, and from that perspective we should, of course, not use different methods to get different models. The second possibility for addressing the first question is to use different sub-samples, which turns simple validation into cross-validation. Cross-validation provides an expectation for the error (or the risk). Yet, in order to compare across methods one should actually describe the expected decrease in “predictive power”13 for different sample sizes (independent cross-validation per sample size). The third possibility for answering question (1) is related to the former and consists in adding noised, surrogated (or simulated) data. This prevents the learning mechanism from responding to empirically consistent, but nevertheless irrelevant, noisy fluctuations in the raw data set. The fourth possibility is to look for models of equivalent predictive power which are, however, based on a different set of predicting variables. This possibility is not accessible for most statistical approaches such as Principal Component Analysis (PCA). Whatever method is used to create different models, models may be combined into a “bag” of models (called “bagging”), or, following an even more radical approach, into an ensemble of small and simple models. The latter is employed for instance in the so-called Random Forest method.
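A hedged sketch of the second and third possibilities, here using scikit-learn for convenience; the particular classifier, its parameters and the toy data are arbitrary choices made only for illustration.

```python
# Sketch: cross-validation per sample size, plus purely random extra columns as
# a probe for overreaction to noise. Classifier and data are arbitrary choices.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(600, 8))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=600) > 0).astype(int)

for n in (100, 200, 400, 600):                 # expected error per sample size
    scores = cross_val_score(
        RandomForestClassifier(n_estimators=100, random_state=0), X[:n], y[:n], cv=5)
    print(f"n={n:4d}  mean accuracy={scores.mean():.3f}  sd={scores.std():.3f}")

X_probe = np.hstack([X, rng.normal(size=(600, 8))])   # add irrelevant noise columns
scores = cross_val_score(
    RandomForestClassifier(n_estimators=100, random_state=0), X_probe, y, cv=5)
print(f"with noise columns: mean accuracy={scores.mean():.3f}")
```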

Commonly, if a model passes cross-validation successfully, it is considered to be able to “generalize”. In contrast to the common practice, Poggio et al. [7] demonstrated that standard cross-validation has to be extended in order to provide a characterization of the capability of a model to generalize. They propose to augment

CVloo stability with stability of the expected error and stability of the empirical error to define a new notion of stability, CVEEEloo stability.

This makes clear that Poggio et al.’s approach addresses the learning machinery, no longer just the space of hypotheses. Yet, they do not take the free parameters of the method into account. We conclude that their proposal still remains an uncritical approach, and I would consider such a model as not completely trustworthy. Of course, Poggio et al. are definitely pointing in the right direction. We recognize a move away from naive realism and positivism, towards a critical methodology of the conditional. Maybe philosophy and the natural sciences will find common ground again by riding the information tiger.

Checking the stability of the learning procedure leads to a methodology that we called “data experiments” elsewhere. Data experiments do NOT explore the space of hypotheses, at least not directly. Instead they create a map for all possible models. In other words, instead of just asking about predictability we now ask about the differential predictivity in the space of models.

From the perspective of a learning theory, Poggio’s move can’t be overestimated. Statistical learning theory (SLT) [8] explicitly assumes that direct access to the world is possible (via an identity function, the perfectness of the model). Consequently, SLT focuses (only) on the reduction of the empirical risk. Any learning mechanism following SLT is hence uncritical about its own limitations. SLT is interested in the predictability of the system-as-such, thereby, not very surprisingly, committing the mistake of pre-19th-century idealism.

The Independence Assumption

The independence assumption [I.A.], or linearity assumption, acts mainly on three different targets. The first of them is the relationship between observer and observed; the second is the relationship between observables; the third regards the relation between individual observations. This last aspect of the I.A. is the least problematic one, and we will not discuss it any further.

Yet, the first and the second are the problematic ones. The I.A. is deeply buried in the framework of statistics, and from there it made its way into the field of explorative data analysis. There it is frequently met, for instance, in the geometrical operationalization of similarity, in the conceptualization of observables as Cartesian dimensions or as independent coefficients in systems of linear equations, or as statistical kernels in algorithms like the Support Vector Machine.

Of course, the I.A. is just one possible stance towards the treatment of observables. Yet, taking it as an assumption, we will not include any parameter into the model that reflects the dependency between observables. Hence, we will never detect the most suitable hypothesis about the dependency between observables. Instead of assuming the independence of variables throughout an analysis, it would be methodologically much more sound to address the degree of dependency as a target. Linearity should not be an assumption; it should be a result of an analysis.

The linearity or independence assumption carries another assumption with it under its hood: the assumption of the homogeneity of variables. Variables, or assignates, are conceived as black boxes with unknown influence on the predictive power of the model. Yet, usually they exert very different effects on the predictive power of a model.

Basically, it is very simple. The predictive power of a model depends on the positive predictive value AND the negative predictive value, of course; we may also use the closely related terms sensitivity and specificity. Accordingly, some variables contribute more to the positive predictive value, while others help to increase the negative predictive value. This becomes easily visible if we perform a detailed type-I/II error analysis. Thus, there is NO way to avoid testing those combinations explicitly, even if we assume the initial independence of variables.
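The quantities involved can be written down in a few lines; a minimal sketch (the function and variable names are mine), checked against the values of Table 1a further below:

```python
# Sketch: the four basic rates derived from a confusion matrix.
def confusion_metrics(tp, fp, fn, tn):
    return {
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv":         tp / (tp + fp),   # positive predictive value
        "npv":         tn / (tn + fn),   # negative predictive value
    }

# Values taken from Table 1a below:
print(confusion_metrics(tp=100, fp=3, fn=28, tn=1120))
# -> sensitivity 0.781, specificity 0.997, ppv 0.971, npv 0.976
```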

As we already mentioned above, the I.A. is just one possible stance towards the treatment of observables. Yet, its status as a methodological sine qua non that additionally is never reflected upon renders it into a metaphysical assumption. It is in fact an irrational assumption, which induces serious costs in terms of the structural richness of the results. Taken together, the independence assumption represents one of the most harmful habits in data analysis.

The Model Selection Problem

In the section “Predictability and Predictivity” above we already emphasized the importance of the switch from the space of hypotheses to the space of models. The model space unfolds as a condition of the available assignates, the size of the data set and the free parameters of the associative (“modeling”) method. The model space supports a fundamental change of attitude towards a model. Based on the denial of the apriori assumption of the independence of observables, we identified the idea of a singular best model as an ill-posed phantasm. We thus move onwards from the concept of a model as a mapping function towards ensembles of structurally heterogeneous models that together, as a distinguished population, form a habitat, a manifold in the sphere of the model space. With such a structure we no longer need to arrive at a single model.

Methods, Models, Variables

The model selection problem addresses two sets of parameters that are actually quite different from each other. Model selection should not be reduced to the treatment of the first set, of course, as happens at least implicitly for instance in [9]. The first set refers to the variables as known from the data, sometimes also called the „predictors“. The selection of the suitable variables is the first half of the model selection problem. The second set comprises all free parameters of the method. From the methodological point of view, this second set is much more interesting than the first one. The method’s parameters are apriori conditions of the performance of the method, and additionally they usually remain invisible in the results, in contrast to the selection of variables.

For associative methods like the SOM or other clustering methods, the effect of de-/selecting variables can be easily described. Just take all the objects in front of you, for instance on the table, or in your room. Now select an arbitrary purpose and assign this purpose as a degree of support to those objects. For now, we have constructed the target. Next we go “into” the objects, that is, we describe them by a range of attributes that are present in most of the objects. Dependent on the selection of a subset from these attributes we will arrive at very different groups. The groups then represent the target more or less well; that is the quality of the model. Obviously, this quality differs across the various selections of attributes. It is also clear that it does not help to just use all attributes, because some of the attributes simply destroy the intended order; they add noise to the model and decrease its quality.
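A small illustration of this effect, with k-means standing in for any associative method; the data, the assigned “purpose” and the agreement measure are invented for the example.

```python
# Sketch: the same objects grouped under different attribute subsets.
# k-means stands in for any associative method; data and target are invented.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(1)
target = rng.integers(0, 2, size=200)                       # the assigned "purpose"
informative = target[:, None] + rng.normal(scale=0.3, size=(200, 2))
noise = rng.normal(size=(200, 3))                           # attributes unrelated to it
X = np.hstack([informative, noise])

for cols, label in [([0, 1], "informative only"),
                    ([2, 3, 4], "noise only"),
                    (list(range(5)), "all attributes")]:
    groups = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X[:, cols])
    print(f"{label:16s} agreement with target: {adjusted_rand_score(target, groups):.2f}")
```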

As George observes [10], since its first formulation in the 1960s a considerable, if not large, number of proposals for dealing with the variable selection problem have been made. Although George himself seems to distinguish the two sets of parameters, throughout the discussion of the different approaches he refers only to the first set, the variables included in the data. This is not a failure of the said author, but a problem of the statistical approach. Usually, the parameters of statistical procedures are not accessible; like any analytic procedure, they work as they work. In contrast to Self-organizing Maps, and even to Artificial Neural Networks (ANN) or genetic procedures, analytic procedures can’t be modified in order to achieve a critical usage. In some way, with their mono-bloc design, they fit perfectly into the representationalist fallacy.

Thus, using statistical (or other analytic) procedures, the model selection problem consists of the variable selection problem and the method selection problem. The consequences are catastrophic: if statistical methods are used in the context of modeling, the whole statistical framework turns into a black box, because the selection of a particular method can’t be justified in any respect. In contrast to that quite unfavorable situation, methods like the Self-Organizing Map provide access to any of their parameters. Data experiments are only possible with methods like the SOM or ANN. It is not the SOM or the ANN that are „black boxes“; rather, the statistical framework must be regarded as such. Precisely this is also the reason for the still ongoing quarrels about the foundations of the statistical framework. There are two parties, the frequentists and the Bayesians. Yet, both are struck by the reference class problem [11]. From our perspective, the current dogma of empirical work in science needs to be changed.

The conclusion is that statistical methods should not be used at all to describe real-world data, i.e. for the modeling of real-world processes. They are suitable only within a fully controlled setting, that is, within a data experiment. The first step in any kind of empirical analysis thus must consist of a predictive modeling that includes the model selection task.14

The Perils of Universalism

Many people dealing with the model selection task are misled by a further irrational phantasm, caused by a mixture of idealism and positivism: the phantasm of the single best model for a given purpose.

Philosophers of science recognized long ago, starting with Hume and ultimately expressed by Quine, that empirical observations are underdetermined. The actual challenge posed by modeling is given by this fact of empirical underdetermination. Goodman felt obliged to construct a paradox from it. Yet, there is no paradox; there is only the phantasm of the single best model. This phantasm is a relic from the Newtonian period of science, when everybody thought the world was made by God as a miraculous machine, everything had to be well-defined, and persisting contradictions had to be rated as evil.

Secondarily, this moults into the affair of (semantic) indetermination. Plainly spoken, there are never enough data. Empirical underdetermination results in the actuality of strongly diverging models, which in turn gives rise to conflicting experiences. For a given set of data, in most cases it is possible to build very different models (ceteris paribus, choosing different sets of variables) that yield the same utility, or say predictive power, as far as this predictive power can be determined from the available data sample at all. Such a ceteris paribus difference will not only give rise to quite different tracks of unfolding interpretation; it is also certainly in the close vicinity of Derrida’s deconstruction.

Empirical underdetermination thus results in a second-order risk, the model selection risk. Actually, the model selection risk is the only relevant risk. We can’t change the available data, and data are always limited, sometimes just by their puniness, sometimes by the restrictions on dealing with them. Risk is not attached to objects or phenomena, because objects “are not there” before interpretation and modeling. Risk is attached only to models. Risk is a particular state of affairs, and indeed a rather fundamental one. Once a particular model tells us that there is an uncertainty regarding the outcome, we can take measures to deal with that uncertainty. For instance, we hedge it, or organize some other kind of insurance for it. But hedging has to rely on the estimation of the uncertainty, which depends on the expected predictive power of the model, not just the accuracy of the model given the available data from a limited sample.

Different, but equivalent, selections of variables can be used to create a group of models acting as „experts“ on a given task. Yet, the selection of such „experts“ is not determinable on the basis of the given data alone. Instead, further knowledge about the relation of the variables to other contexts or targets needs to be consulted.

Universalism is usually unjustifiable, and claiming it nevertheless usually comes at huge costs, caused by undetectable blindnesses once we accept it. In contemporary empiricism, universalism—and the respective blindness—is abundant also with regard to the role of the variables. What I am talking about here is context, mediality and individuality, which, from a more traditional formal perspective, is often approximated by conditionality. Yet, it becomes more and more clear that the Bayesian mechanisms are not sufficient to cover the complexity of the concept of variables. Just to mention the current developments in the field of probability theory, I would like to refer to Brian Weatherson, who favors and develops the so-called dynamic Keynesian models of uncertainty [10]. Yet, we regard this only as a transitional theory, despite the fact that it will have a strong impact on the way scientists will handle empiric data.

The mediating individuality of observables (as deliberately chosen assignates, of course) is easy to observe once we drop universalism qua independence of variables. Concerning variables, universalism manifests in an indistinguishability of the choices made to establish the assignates with regard to their effect on the system of preferences. Some criterion C will induce the putative objects as distinguished ones only if another assignate A has pre-sorted them. Yet, it would be a simplification to consider the situation in the Bayesian way as P(C|A). The problem with it is that we can’t say anything about the condition itself. Yet, we need to “play” with (actually, not “control”) the conditionability, the inner structure of these conditions. As with the “relation,” which we already generalized into randolations, making it thereby measurable, we also have to go into the condition itself in order to defeat idealism even on the structural level. An appropriate perspective on variables would hence treat them as a kind of medium. This mediality is not externalizable, though, since observables themselves precipitate from the mediality, then as assignates.

What we experience here is nothing less than the first advent of a real post-modernist world, an era in which we emancipate ourselves from the compulsive apriori of independence (this does not deny, of course, its important role in the modernist era since Descartes).

Optimization

Optimizing a model means selecting a combination of suitably valued parameters such that the preferences of the users in terms of risk and implied costs are served best. The model selection problem is thus the link between optimization problems, learning tasks and predictive modeling. There are indeed countless procedures for optimization. Yet, the optimization task in the context of model selection is faced with a particular challenge: its mere size. George begins his article in the following way:

A distinguishing feature of variable selection problems is their enormous size. Even with moderate values of p, computing characteristics for all 2^p models is prohibitively expensive and some reduction of the model space is needed.

Assume for instance a data set that comprises 50 variables. From that, about 1.13e15 models are possible. Assume further that we could test 10‘000 models per second; then we would still need more than 3‘500 years to check all models. Usually, however, building a classifier on a real-world problem takes more than 10 seconds, which would result in roughly 3.5e8 years in the case of 50 variables. And there are many instances where one is faced with far more variables, typically 100+, sometimes going even into the thousands. That’s what George means by „prohibitively“.
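The arithmetic can be checked in a few lines (numbers rounded):

```python
# Sketch: size of the variable selection space for p = 50 observables.
p = 50
models = 2 ** p                              # ~1.13e15 candidate variable subsets
seconds_per_year = 3600 * 24 * 365
print(models / 10_000 / seconds_per_year)    # at 10'000 models/s: roughly 3.6e3 years
print(models * 10 / seconds_per_year)        # at 10 s per model:  roughly 3.6e8 years
```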

There are many proposals to deal with that challenge. All of them fall into three classes: they use either (1) some information-theoretic measure (AIC, BIC, CIC etc. [11]), or (2) likelihood estimators, i.e. they conceive of the parameters themselves as random variables, or (3) probabilistic measures established upon validation procedures. Particularly the instances from the first two of those classes are hit by the linearity and/or the independence assumption, and also by unjustified universalism. Of course, linearity should not be an assumption; it should be a result, as we argued above. Hence, there is no way to avoid the explicit calculation of models.

Given the vast number of combinations of symbols it appears straightforward to conceive of the model selection problem from an evolutionary perspective. Evolution always creates appropriate and suitable solutions from the available „evolutionary model space“. That space is of size 2^30‘000 in the case of humans, which is a „much“ larger number than the number of species that ever existed on this planet. Not a single viable configuration could have been found by pure chance. Genetics-based alignment and navigation through the model space is much more effective than chance. Hence, the so-called genetic algorithms might appear on the radar as the method of choice.

Genetics, revisited

Unfortunately, genetic algorithms15 are not suitable for the variable selection problem. The main reason for this is, again, the expensive calculation of single models. In order to set up the genetic procedure, one needs at least 500 instances to form the initial population, whereas any solution for the variable selection problem should arrive at a useful result with fewer than 200 explicitly calculated models. The great advantage of genetic algorithms is their capability to deal with solution spaces that contain local extrema. They can handle even solution spaces that are inhomogeneously rugged, simply because recombination in the realm of the symbolic does not care about numerical gradients and criteria. Genetic procedures are based on combinations of symbolic encodings. The continuous switch between the symbolic (encoding) and the numerical (effect) is nothing else than the precursor of the separation between genotypes and phenotypes, without which there would not be even simple forms of biological life.

For that reason we developed a specialized instantiation of the evolutionary approach (implemented in SomFluid). Described very briefly we can say that we use evolutionary weights as efficient estimators of the maximum likelihood of parameters. The estimates are derived from explicitly calculated models that vary (mostly, but not necessarily ceteris paribus) with respect to the used variables. As such estimates, they influence the further course of the exploration of the model space in a probabilistic manner. From the perspective of the evolutionary process, these estimates represent the contribution of the respective parameter to the overall fitness of the model. They also form a kind of long-term memory within the process, something like a probabilistic genome. The short-term memory in this evolutionary process is represented by the intensional profiles of the nodes in the SOM.

For the first, initializing step, the evolutionary estimates can themselves be estimated by linear procedures like PCA, or by non-parametric procedures (Kruskal-Wallis, Mann-Whitney, etc.), and are available after only a few explicitly calculated models (model here means „ceteris paribus selection of variables“).

These evolutionary weights reflect the changes in the predictive power of the model when variables are added to or removed from the model. If the quality of the model improves, the evolutionary weight increases a bit, and vice versa. In other words, not the apriori parameters of the model are considered, but just the effect of the parameters. The procedure is an approximating repetition: fix the parameters of the model (method-specific settings, sampling, variables), calculate the model, record the change of the predictive power as compared to the previous model.
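The following is a much simplified sketch of such a procedure, not the SomFluid implementation; the update rule, the subset sampling and the interface of score_model() are assumptions made for the sake of illustration.

```python
# Much simplified sketch of "evolutionary weights" for variable selection.
# score_model(subset) is a stand-in for explicitly calculating a model (e.g. a
# SOM restricted to the chosen variables) and returning its predictive power.
import random

def explore(variables, score_model, n_rounds=200, step=0.05):
    weights = {v: 0.5 for v in variables}        # the probabilistic "genome"
    prev = score_model(list(variables))          # reference model on all variables
    for _ in range(n_rounds):
        # sample a variable subset, biased by the current evolutionary weights
        subset = [v for v in variables if random.random() < weights[v]]
        if not subset:
            subset = [random.choice(list(variables))]
        score = score_model(subset)              # explicit model calculation
        delta = score - prev                     # change of predictive power
        for v in subset:                         # nudge the used variables up or down
            weights[v] = min(1.0, max(0.0, weights[v] + step * delta))
        prev = score
    return weights
```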

Upon the probabilistic genome of evolutionary weights there are many different ways one could take to implement the “evo-devo” mechanisms, let it be the issue of how to handle the population (e.g. mixing genomes, aspects of virtual ecology, etc.), or the translational mechanisms, so to speak the “physiologies” that are used to proceed from the genome to an actual phenotype.

Since many different combinations are being calculated, the evolutionary weight represents the expectable contribution of a variable to the predictive power of the model, under whatever selection of variables that represents a model. Usually, a variable will not improve the quality of the model irrespective of the context. Yet, if a variable indeed did so, we would not only say that its evolutionary weight equals 1, we might also conclude that this variable is a so-called confounder. Including a confounder into a model means that we use information about the target that will not be available when applying the model for the classification of new data; hence the model will fail disastrously. Usually, and that is just a further benefit of dropping the independence-universalism assumption, it is not possible for a procedure to identify confounders by itself. It is also clear that the capability to do so is one of the cornerstones of autonomous learning, which includes the capability to set up the learning task.

Noise, and Noise

Optimization raises its own follow-up problems, of course. The most salient of these is so-called overfitting. This means that the model gets suitably fitted to the available observations by including a large number of parameters and variables, but it will return wrong predictions if it is used on data that are even only slightly different from the observations used for learning and estimating the parameters of the model. The model then represents noise: random variations without predictive value.

As we described above, Poggio believes that his criterion of stability overcomes the defects of the model as a generalization from observations. Poggio might be too optimistic, though, since his method still remains confined to the available observations.

In this situation, we apply a methodological trick. The trick consists in turning the problem into a target of investigation, which ultimately translates the problem into an appropriate rule. In this sense, we consider noise not as a problem, but as a tool.

Technically, we destroy the relevance of the differences between the observations by adding noise of a particular characteristic. If we add a small amount of normally distributed noise, probably nothing will change; but if we add a lot of noise, perhaps even of a secondarily changing distribution, it will become simply impossible to create a stable model at all. The scientific approach is to describe the dependency between those two unknowns, so to say, to set up a differential between noise (a model of the unknown) and the model (of the unknown). The rest is straightforward: creating various data sets that have been changed by imposing different amounts of noise of a known structure, and plotting the predictive power against the amount of noise. This technique can be combined with surrogating the actual observations via a Cholesky decomposition.
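A sketch of that differential between noise and model, again with scikit-learn for convenience; the classifier, the toy data and the noise levels are arbitrary illustrative choices.

```python
# Sketch: describe predictive power as a function of imposed noise of known scale.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
X = rng.normal(size=(500, 6))
y = (X[:, 0] - X[:, 1] > 0).astype(int)

for sigma in (0.0, 0.5, 1.0, 2.0, 4.0):          # increasing amounts of added noise
    X_noisy = X + rng.normal(scale=sigma, size=X.shape)
    acc = cross_val_score(LogisticRegression(max_iter=1000), X_noisy, y, cv=5).mean()
    print(f"noise sd={sigma:.1f}  predictive power ~ {acc:.3f}")
```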

From all available models, those are then preferred that combine a suitable predictive power with a suitable degree of stability against noise.

Résumé

In this section we have dealt with the problematics of selecting a suitable subset from all available observables (neglecting for the time being that model selection involves the method’s parameters, too). Since we mostly have more observables at our disposal than we actually presume to need, the task could simply be described as simplification, aka Occam’s Razor. Yet, it would be terribly naive to first assume linearity and then to select the “most parsimonious” model. It is even cruel to state [9, p.1]:

It is said that Einstein once said

Make things as simple as possible, but not simpler.

I hope that I succeeded in providing some valuable hints for accomplishing that task, which above all is not a quite simple one. (etc.etc. :)

Describing Classifiers

The gold standard for describing classifiers is believed to be the Receiver Operating Characteristic, or ROC for short. In particular, the area under the curve is compared across models (classifiers). The following Figure 1 demonstrates the mechanics of the ROC plot.

Figure 1: Basic characteristics of the ROC curve (reproduced from Wikipedia)

Figure 2. Realistic ROC curves, though these are typical for approaches that are NOT based on sub-group structures or ensembles (for instance ANN or logistic regression). Note that models should not be selected on the basis of the area under the curve. Instead, the true positive rate (sensitivity) at a false positive rate FPR=0 should be used for that. As a further criterion, one that would indicate the stability of the model, one could use the slope of the curve at FPR=0.
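The selection criterion mentioned in the caption can be computed directly from scored predictions; a minimal sketch (threshold handling deliberately simplified, data invented):

```python
# Sketch: true positive rate at a false positive rate of 0 as a selection criterion.
import numpy as np

def tpr_at_zero_fpr(y_true, scores):
    """Sensitivity achievable while no negative case scores above the threshold."""
    y_true, scores = np.asarray(y_true), np.asarray(scores)
    threshold = scores[y_true == 0].max()        # strictest cut: above all negatives
    return (scores[y_true == 1] > threshold).mean()

y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.75, 0.4, 0.7, 0.3, 0.2, 0.1]   # invented model outputs
print(tpr_at_zero_fpr(y_true, scores))               # -> 0.5 (two of four positives)
```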

Utilization of Information

There is still another harmful aspect of the universalistic stance in data analysis as compared to a pragmatic stance. This aspect concerns the „reach“ of the models we are going to build.

Let us assume that we would accept a sensitivity of approx. 80%, but expect a specificity of >99%. In other words, the costs for false positives (FP) are defined as very high, while the costs for false negatives (FN, unrecognized preferred outcomes) are relatively low. The ratio of error costs, or in short the error cost ratio err(FP)/err(FN), is high.

Table 1a: A Confusion matrix for a quite performant classifier.

Symbols: test=model; TP=true positives; FP=false positives; FN=false negatives; TN=true negatives; ppv=positive predictive value, npv=negative predictive value. FP corresponds to the type-I error (analogous to “rejecting the null hypothesis when it is true”), while FN corresponds to the type-II error (analogous to “accepting the null hypothesis when it is false”); FN/(TP+FN) is the type-II error rate, sometimes labeled β, where (1-β) is called the “power” of the test or model. (download XLS example)

                 condition Pos     condition Neg
test Pos         100 (TP)          3 (FP)            0.971  (ppv)
test Neg         28 (FN)           1120 (TN)         0.976  (npv)
                 0.781             0.997
                 (sensitivity)     (specificity)

Let us further assume that there are observations of our preferred outcome that we can‘t distinguish well from other cases of the opposite outcome that we try to avoid. They are too similar, and due to that similarity they form a separate group in our self-organizing map. Let us assume that the specificity of these clusters is at 86% only and the sensitivity is at 94%.

Table 1b: Confusion matrix describing a sub-group formed inside the SOM, for instance as it could be derived from the extension of a “node”.

                 condition Pos     condition Neg
test Pos         0 (50)            0 (39)            0.0 (0.56)  (ppv)
test Neg         50 (0)            39 (0)            0.44 (1.0)  (npv)
                 0.0 (1.0)         1.0 (0.0)
                 (sensitivity)     (specificity)

Yet, this cluster would not satisfy our risk attitude. If we used the SOM as a model for the classification of new observations, and a new observation fell into that group (by means of similarity considerations), the implied risk would violate our attitude. Hence, we have to exclude such clusters. In the ROC, this cluster represents a value further to the right on the X-axis (the false positive rate, i.e. 1-specificity).

Note that in the case of acceptance of the sub-group as a contributor to a positive prediction, the false negatives are always 0 aposteriori, while in the case of denial the true positives are always set to 0 (and accordingly for the figures of the condition negative).

There are now several important points to this, which are related to each other. Actually, we should be interested only in sub-groups with a specificity close to 1, such that our risk attitude is well served. [13] Likewise, we should not try to optimize the quality of the model across the whole range of the ROC, but only for the sub-groups with an acceptable error cost ratio. In other words, we use the available information in a very specific manner.
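A sketch of this specific use of information, assuming the sub-groups (e.g. SOM nodes) are available as confusion counts; the threshold logic and the node names are illustrative simplifications of fixing the risk attitude before modeling.

```python
# Sketch: accept only sub-groups whose specificity satisfies the risk attitude
# that was fixed beforehand; counts are invented, except for "node B", which
# uses the acceptance-case counts of Table 1b above.
def accept_subgroups(subgroups, min_specificity=0.99):
    accepted = []
    for name, (tp, fp, fn, tn) in subgroups.items():
        specificity = tn / (tn + fp) if (tn + fp) else 1.0
        if specificity >= min_specificity:
            accepted.append(name)
    return accepted

nodes = {
    "node A": (50, 0, 4, 300),     # highly specific: may contribute positive predictions
    "node B": (50, 39, 0, 0),      # the too-similar cluster of Table 1b: rejected
    "node C": (12, 1, 2, 410),
}
print(accept_subgroups(nodes))     # -> ['node A', 'node C']
```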

As a consequence, we have to set the ECR before calculating the model. Setting the ECR after the selection of a model results in a waste of information, time and money. For this reason it is strongly indicated to use methods that are based on building a representation by sub-groups. This again rules out statistical methods, as they always take into account all available data; Zytkow calls such methods empirically empty [14].

The possibility to build models of a high specificity is a huge benefit of sub-group based methods like the SOM.16 To understand this better let us assume we have a SOM-based model with the following overall confusion matrix.

                 condition Pos     condition Neg
test Pos         78                1                 0.9873  (ppv)
test Neg         145               498               0.7745  (npv)
                 0.350             0.998
                 (sensitivity)     (specificity)

That is, the model recognizes around 35% of all preferred outcomes. It does so on the basis of sub-groups that all satisfy the respective ECR criterion. Thus we know that the implied risk of any classification is very low too. In other words, such models recognize whether it is allowed to apply them. If we apply them and get a positive answer, we also know that it is justified to apply them. Once the model identifies a preferred outcome, it does so without risk. This lets us miss opportunities, but we won’t be trapped by false expectations. Such models we could call auto-consistent.

In a practical project aiming at an improvement of the post-surgery risk classification of patients (n>12’000) in a hospital, we have been able to demonstrate that the achievable validated rate of implied risk can be lower than 10^-4. [15] Such a low rate is not achievable by statistical methods, simply because there are far too few incidents of wrong classifications. The subjective cut-off points in logistic regression are not quite suitable for such tasks.

At the same time, and this is probably even more important, we get a suitable segmentation of the observations. All observations that can be identified as positive do not suffer from any risk. Thus, we can investigate the structure of the data for these observations, e.g. as particular relationships between variables, such as correlations etc. But, hey, that job is already done by the selection of the appropriate set of variables! In other words, we not only have a good model, we also have found the best possibility for a multi-variate reduction of noise, with full consideration of the dependencies between variables. Such models can be conceived as a reversed factorial experimental design.

The property of auto-consistency offers a further benefit, as it is scalable; that is, “auto-consistent” is not a categorical, or symbolic, assignment. It can easily be measured as the sensitivity under the condition of specificity > 1-ε, ε→0. Thus, we may use it as a random measure (it can be described by its density) or as a scale of reference in any selection task among sub-populations of models. Additionally, if the exploration of the model space does not succeed in finding a model of a suitable degree of auto-consistency, we may conclude that the quality of the data is not sufficient. Data quality is a function of properly selected variables (predictors) and reproducible measurement. We know of no other approach that would be able to inform about the quality of the data without referring to extensive contextual “knowledge”. Needless to say, such knowledge is never available and encodable.

There are only weak conditions that need to be satisfied. For instance, the same selection of variables needs to be used within a single model for all similarity considerations. This rules out all ensemble methods in so far as different selections of variables are used for each item in the ensemble, for instance decision tree methods (a SOM with its sub-groups is already “ensemble-like”, yet all sub-groups are affected by the same selection of variables). It is further required to use a method that performs the transition from extensions to intensions on a sub-group level, which rules out analytic methods and even Artificial Neural Networks (ANN). The way to establish auto-consistent models is not open to ANN. Furthermore, the error cost ratio must be set before calculating the model, and the models have to be calculated explicitly, which removes linear methods from the list, such as Support Vector Machines with linear kernels (and likewise regression, ANN, Bayesian methods). If we want to access the rich harvest of auto-consistent models, we have to drop the independence hypothesis and we have to refute any kind of universalism. But these costs are rather low, indeed.

Observations and Probabilities

Here we developed a particular perspective onto the transition from observations to intensional representations. There are of course some interesting relationships of our point of view to the various possibilities of “interpreting” probability (see [16] for a comprehensive list of “interpretations” and interesting references). We also provide a new answer to Hume’s problem of induction.

Hume posed the question of how often we should observe a fact until we could consider it as lawful. This question, called the “problem of induction”, points in the wrong direction and will trigger only irrelevant answers. Hume, still living in times of absolute monarchism, in a society deeply structured by religious beliefs, established a short-cut between the frequency of an observation and its propositional representation. The actual question, however, is how to achieve what we call an “observation”.

In very simple, almost artificial cases like the die there is nothing to interpret. The die and its values are already symbols. It is in some way inadequate to conceive of a die or of dicing as an empirical issue. In fact, we know before what could happen. The universe of the die consists of precisely 6 singular points.

Another extreme are so-called single-case observations of structurally rich events, or processes. An event, or a setting, should be called structurally rich if there are (1) many different outcomes and (2) many possible assignates to describe the event or the process. Such events or processes will not produce any outcome that could be expected by symbolic or formal considerations. Obviously, it is not possible to assign a relative frequency to a unique, singular, or non-repeatable event. Unfortunately, however, as Hájek points out [17], any actual sequence can be conceived of as a singular event.

The important point now is that single-case observations are also not sufficiently describable as an empirical issue. Ascribing propensities to objects-in-the-world demands a wealth of modeling activities and classifications, which have to be completed apriori to the observation under scrutiny. So-called single-case propensities are not a problem of probability theory, but one of the application of intensional classes and their usage as means for organizing one’s own expectations. As we said earlier, probability as it is used in probability theory is not a concept that could be applied meaningfully to observations, where observations are conceived of as primitive “givens”. Probabilities are meaningful only in the closed world of available, subjectively held concepts.

We thus have to distinguish between two areas of application for the concept of probability: the observational part, where we build up classes, and the anticipatory part, where we are interested in a match of expectations and actual outcomes. The problem obviously arises by mixing them through the notion of causality.17 Yet, there is absolutely no necessity between the two areas. The concept of risk probably allows for a resolution of the problems, since risk always implies a preceding choice of a cost function, which necessarily is subjective. Yet, the cost function and the risk implied by a classification model are also the angle point for any kind of negotiation, whether this takes place on a material, hence evolutionary, scale or within a societal context.

The interesting, if not salient, point is that the subjectively available intensional descriptions and classes depend on one’s risk attitude. We may observe the same thing only if we have acquired the same system of related classes and the same habits of using them. Only if we apply extreme risk aversion will we achieve a common understanding about facts (in the Wittgensteinian sense, see above). This, then, is called science, for instance. Yet, it still remains a misunderstanding to equate this common understanding with objects as objects-out-there.

The problem of induction thus must be considered a seriously ill-posed problem. It is a problem only for idealists (who then solve it in a weird way), or for realists who are naive about the epistemological conditions of acting in the world. Our proposal for the transition from observations to descriptions is based on probabilism on both sides, yet on either side there is a distinct flavor of probabilism.

Finally, a methodological remark shall be allowed, closely related to what we already described in the section about “noise” above. The perspective onto “making experience” that we have been proposing here involves a significant twist.

Above we already mentioned Alan Hájek's diagnosis that the frequentist and the Bayesian interpretations of probability suffer from the reference class problem. In this section we extended Hájek's concerns to the concept of propensity. Yet, if the problem shows such a high prevalence we should not conceive of it as a hurdle but should try to treat it dynamically, as a rule. The reference class is only a problem as long as (1) either the actual class is required as an external constant, or (2) the abstract concept of the class is treated as a fixed point. According to the rule of Lagrange-Deleuze, any constant can be rewritten into a procedure (read: rules) and less problematic constants. Constants, or fixed points, on a higher abstract level are less problematic, because the empirically grounded semantics vanishes.

Indeed, the problem of the reference class simply disappears if we take the concept of the class, together with all the related issues of modeling, as the embedding frame, the condition under which any notion of probability can make sense at all. The classes themselves are results of “rule-following”, which admittedly is blind, but whose parameters are also transparently accessible. In this way, probabilistic interpretation is always performed in a universe that is closed and in principle fully mapped. We need the probabilistic methods just because that universe is of a huge size. In other words, the space of models is a Laplacean Universe.

Since statistical methods and similar interpretations of probability are analytical techniques, our proposal for a re-positioning of statistics into such a Laplacean Universe is also well aligned with the general habit of Wittgenstein’s philosophy, which puts practiced logic (quasi-logic) second to performance.

The disappearance of the reference class problem should be expected if our relations to the world are always mediated through the activity of abstract, epistemic modeling. The usage of probability theory as a “conceptual game” aiming at sharing diverging attitudes towards risk appears as nothing other than a particular style of modeling, though admittedly one that offers a reasonable rate of success.

The Result of Modeling

It should be clear by now that the result of modeling is much more than just a single predictive model. Regardless of whether we take the scientific perspective or a philosophical vantage point, we need to include operationalizations of the conditions of the model that reach beyond the standard empirical risk expressed as “false classification”. Appropriate modeling provides not only a set of models of different structures with well-estimated stability; a further goal is to establish models that are auto-consistent.

If the modeling employs a method that exposes its parameters, we can even avoid the “method hell”, that is, the results are not only reliable, they are also valid.

It is clear that only auto-consistent models are useful for drawing conclusions and for building up experience. If variables are just weighted without actually being removed, as for instance in approaches like Support Vector Machines, the resulting models are not auto-consistent. Hence, there is no way towards a propositional description of the observed process.

Given the population of explicitly tested models it is also possible to describe the differential contribution of any variable to the predictive power of a model. The assumption of neutrality or symmetry of that contribution, as it is for instance applied in statistical learning, is a simplistic perspective on the variables and the system represented by them.
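
As a hedged illustration of this idea, the following sketch contrasts the predictive power of models that contain a variable with that of models that do not; the model population and scores are invented for illustration, nothing here is a measured result:

```python
# Estimate a variable's differential contribution from a population of
# explicitly tested models (illustrative data only).
from statistics import mean

models = [
    {"vars": {"a", "b"},      "power": 0.81},
    {"vars": {"a", "c"},      "power": 0.78},
    {"vars": {"b", "c"},      "power": 0.66},
    {"vars": {"a", "b", "c"}, "power": 0.80},
]

def contribution(variable, population):
    with_v    = [m["power"] for m in population if variable in m["vars"]]
    without_v = [m["power"] for m in population if variable not in m["vars"]]
    if not with_v or not without_v:
        return None  # contribution not estimable from this population
    return mean(with_v) - mean(without_v)

for v in sorted({v for m in models for v in m["vars"]}):
    print(v, contribution(v, models))
```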

Conclusion

In this essay we described some technical aspects of the capability to experience. These technical aspects link the possibility for experience to the primacy of interpretation, which gets actualized as the techné of anticipatory, i.e. predictive or diagnostic, modeling. This techné does not address the creation or derivation of a particular model by means of employing one or several methods. The process of building a model could be fully automated anyway. Quite differently, it focuses on the parametrization, validation, evaluation and application of models, particularly with respect to the task of extracting a rule from observational data. This extraction of rules must not be conceived as a “drawing of conclusions” guided by logic. It is a constructive activity.

The salient topics in this practice are the selection of models and the description of the classifiers. We emphasized that the goal of modeling should not be conceived as the task of finding a single best model.

Methods like the Self-organizing Map, which are based on sub-group segmentation of the data, can be used to create auto-consistent models, which also represent an optimally de-noised subset of the measured data. This data sample could be conceived as if it had been found by a factorial experimental design. Thus, auto-consistent models also provide quite valuable hints for the setup of the Taguchi method of quality assurance, which could be seen as a precipitation of organizational experience.
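
The following toy sketch, written in plain numpy and deliberately much simpler than the 2-layered SOM used elsewhere in this project, may illustrate the principle: observations are segmented into sub-groups via their best-matching node, and observations far from their prototype are dropped as noise:

```python
# Minimal SOM-style sub-group segmentation and de-noising (toy data, toy parameters).
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(200, 3))            # toy observations, 3 variables

n_nodes, dim = 9, data.shape[1]
weights = rng.normal(size=(n_nodes, dim))   # node prototypes on a 3x3 grid
grid = np.array([(i, j) for i in range(3) for j in range(3)], dtype=float)

for t in range(1000):                       # plain online SOM training
    x = data[rng.integers(len(data))]
    winner = np.argmin(((weights - x) ** 2).sum(axis=1))
    lr = 0.5 * (1 - t / 1000)               # decaying learning rate
    sigma = 1.5 * (1 - t / 1000) + 0.3      # decaying neighborhood radius
    dist2 = ((grid - grid[winner]) ** 2).sum(axis=1)
    h = np.exp(-dist2 / (2 * sigma ** 2))   # neighborhood function
    weights += lr * h[:, None] * (x - weights)

# segmentation: each observation belongs to the sub-group of its winning node
winners = np.argmin(((data[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2), axis=1)
dists = np.linalg.norm(data - weights[winners], axis=1)

# "de-noising": keep only observations reasonably close to their prototype
keep = dists < np.percentile(dists, 80)
print("sub-group sizes:", np.bincount(winners, minlength=n_nodes))
print("kept after de-noising:", keep.sum(), "of", len(data))
```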

In the context of exploratory investigation of observational data one first has to determine the suitable observables (variables, predictors) and, by means of the same model(s), the suitable segment of observations before drawing domain-specific conclusions. Such conclusions are often expressed as contrasts in location or variation. In the context of designed experiments, as e.g. in pharmaceutical research, one first has to check the quality of the data, then to de-noise the data by removing outliers by means of the same data segmentation technique, before null hypotheses about expected contrasts can be tested.

As such, auto-consistent models provide a perfect basis for learning and for extending the “experience” of an epistemic individual. According to our proposals this experience does not suffer from the various problems of traditional Humean empiricism (the induction problem), or of contemporary (defective) theories of probabilism (mainly the problem of reference classes). Nevertheless, our approach remains fully empirico-epistemological.

Notes

1. Like many other philosophers, Lyotard emphasized the indisputability of an attention to the incidental, not as a perception-as, but as an aisthesis, as a forming impression. See: Dieter Mersch, ›Geschieht es?‹ Ereignisdenken bei Derrida und Lyotard. Available online, last accessed May 1st, 2012. Another recent source arguing in the same direction is John McDowell's “Mind and World” (1996).

2. The label “representationalism” has been used by Dreyfus in his critique of symbolic AI, the thesis of the “computational mind” and any similar approach that assumes (1) that the meaning of symbols is given by their reference to objects, and (2) that this meaning is independent of actual thoughts, see also [2].

3. It would be inadequate to represent such a two-fold “almost” dichotomy as a 2-axis coordinate system, even if such a representation were only a metaphorical one; rather, it should be conceived as a tetrahedral space, given by two vectors passing nearby without intersecting each other. Additionally, the structure of that space must not be expected to be flat; it looks much more like an inhomogeneous hyperbolic space.

4. “Experiencibility” here not understood as an individual capability to witness or receptivity, but as the abstract possibility to experience.

5. In the same way we reject Husserl's phenomenology. Phenomena, much like the objects of positivism or the thing-as-such of idealism, are not “out there”; they are a result of our experiencibility. Of course, we do not deny that there is a materiality that is independent from our epistemic acts, but that does not explain or describe anything. In other words, we propose to go subjective (see also [3]).

6. Again, mechanism here should not be misunderstood as a single deterministic process as it could be represented by a (trivial) machine.

7. This question refers to the famous passage in the Tractatus, that “The world is everything that is the case.” Cases, in the terminology of the Tractatus, are facts as the existence of states of affairs. We may say, there are certain relations. In the Tractatus, Wittgenstein excluded relations that could not be explicated by the use of symbols, as expressed by the 7th proposition: “Whereof one cannot speak, thereof one must be silent.”

8. We must step outside of language in order to see the working of language.

9. We just have to repeat it again, since many people develop misunderstandings here. We do not deny the material aspects of the world.

10. “Individual” is quite misleading here, since our brain and even our mind is not in-divisible in the atomistic sense.

11. Thus, it is also not reasonable to claim the existence of a somehow dualistic language, one part being without ambiguities and vagueness, the other one establishing ambiguity deliberately by means of metaphors. Lakoff & Johnson started from a similar idea, yet they developed it into a direction that is fundamentally incompatible with our views in many ways.

12. Of course, the borders are not well defined here.

13. “predictive power” could be operationalized in quite different ways, of course….

14. Correlational analysis is not a candidate to resolve this problem, since it can’t be used to segment the data or to identify groups in the data. Correlational analysis should be performed only subsequent to a segmentation of the data.

15. The so-called genetic algorithms are not algorithms in the narrow sense, since there is no well-defined stopping rule.

16. It is important to recognize that Artificial Neural Networks do NOT belong to the family of sub-group based methods.

17. Here another circle closes: the concept of causality can’t be used in a meaningful way without considering its close amalgamation with the concept of information, as we argued here. For this reason, Judea Pearl’s approach towards causality [16] is seriously defective, because he completely neglects the epistemic issue of information.

References
  • [1] Geoffrey C. Bowker, Susan Leigh Star. Sorting Things Out: Classification and Its Consequences. MIT Press, Cambridge (MA) 1999.
  • [2] William Croft, Esther J. Wood, Construal operations in linguistics and artificial intelligence. in: Liliana Albertazzi (ed.), Meaning and Cognition. Benjamins Publ., Amsterdam 2000.
  • [3] Wilhelm Vossenkuhl. Solipsismus und Sprachkritik. Beiträge zu Wittgenstein. Parerga, Berlin 2009.
  • [4] Douglas Hofstadter, Fluid Concepts And Creative Analogies: Computer Models Of The Fundamental Mechanisms Of Thought. Basic Books, New York 1996.
  • [5] Nicholas F. Gier, Wittgenstein and Deconstruction, Review of Contemporary Philosophy 6 (2007); first publ. in Nov 1989. Online available.
  • [6] Henk L. Mulder, B.F.B. van de Velde-Schlick (eds.), Moritz Schlick, Philosophical Papers, Volume II: (1925-1936), Series: Vienna Circle Collection, Vol. 11b, Springer, Berlin New York 1979. with Google Books
  • [7] Tomaso Poggio, Ryan Rifkin, Sayan Mukherjee & Partha Niyogi (2004). General conditions for predictivity in learning theory. Nature 428, 419-422.
  • [8]  Vladimir Vapnik, The Nature of Statistical Learning Theory (Information Science and Statistics). Springer 2000.
  • [9] Herman J. Bierens (2006). Information Criteria and Model Selection. Lecture notes, mimeo, Pennsylvania State University. available online.
  • [10] Brian Weatherson (2007). The Bayesian and the Dogmatist. Aristotelian Society Vol. 107, Issue 1pt2, 169–185. draft available online.
  • [11] Edward I. George (2000). The Variable Selection Problem. J Am Stat Assoc, Vol. 95 (452), pp. 1304-1308. available online, as research paper.
  • [12] Alan Hájek (2007). The Reference Class Problem is Your Problem Too. Synthese 156(3): 563-585. draft available online.
  • [13] Lori E. Dodd, Margaret S. Pepe (2003). Partial AUC Estimation and Regression. Biometrics 59( 3), 614–623.
  • [14] Zytkov J. (1997). Knowledge=concepts: a harmful equation. 3rd Conference on Knowledge Discovery in Databases, Proceedings of KDD-97, p. 104-109. AAAI Press.
  • [15] Thomas Kaufmann, Klaus Wassermann, Guido Schüpfer (2007).  Beta error free risk identification based on SPELA, a neuro-evolution method. presented at ESA 2007.
  • [16] Judea Pearl, Causality – Models, Reasoning, and Inference. 2nd ed. Cambridge University Press, Cambridge 2008 [2000].
  • [17] Alan Hájek, “Interpretations of Probability”, The Stanford Encyclopedia of Philosophy (Summer 2012 Edition), Edward N. Zalta (ed.), available online.

۞

Waves, Words and Images

April 7, 2012 § 1 Comment

The big question of philosophy, and probably its sole question, concerns the status of the human as a concept.1 Does language play a salient role in this concept, either as a major constituent, or as sort of a tool? Which other capabilities and which potential beyond language, if it is reasonable at all to take that perspective, could be regarded as similarly constitutive?

These questions may appear far off such topics as the technical challenges of programming a population of self-organizing maps, the limits of Turing machines, or the generalization of models and their conditions. Yet, in times where lots of people are summoning the so-called singularity, the question about the status of the human is definitely not exotic at all. Notably, “singularity” is often and largely defined as “overwhelming intelligence”, seemingly coming up inevitably due to ever increasing computational power, and which we could not “understand” any more. From an evolutionary perspective it makes very little sense to talk about singularities. Natural evolution, and cultural evolution alike, is full of singularities and void of singularities at the same time. The idea of “singularity” is not a fruitful way to approach the question of qualitative changes.

As you already may have read in another chapter, we prefer the concept of machine-based episteme as our Ariadnean guide. In popular terms, machine-based episteme concerns the possibility for an actualization of a particular “machine” that would understand its own conditions when claiming “I know.” (Such an entity could not be regarded as a machine anymore, I guess.) Of course, in following this thread we meet a lot of already much-debated issues. Yet, moving the question about the episteme into the sphere of the machinic provides particular perspectives onto these issues.

In earlier times it has been tried, and some people are still trying today, to determine the status of the “human” as a sort of recipe. Do this and do that, but not that and this, and a particular quality will be established in your body, as your person, visible to others as virtues, labeled and conceived henceforth as the “quality of being human”. Accordingly, natural language with all its ambiguities need not be regarded as an essential pillar. Quite to the opposite: if the “human” could be defined as a recipe, then our everyday language has to be cleaned up, brought closer to crisp logic in order to avoid misunderstandings as far as possible; you may recognize this as the program of contemporary analytical philosophy. In methodological terms it was thought possible to determine the status of the human in positively given terms, or in short, in a positive definite manner.

Such positions are, quite fortunately so, now recognized more and more as highly problematic. The main reason is that it is not possible to justify any kind of determination in an absolute manner. Any justification requires assumptions, while unjustified assumptions are counter-pragmatic to the intended justification. The problematics of knowledge is linked in here, as it could not be regarded as “justified, true belief” any more2. It was Charles S. Peirce who first concluded that the application of logic (as the grammar of reason) and of ethics (as the theory of morality) are not independent from each other. In political terms, any positive definite determination that would be imposed on communities of other people must be regarded as an instance of violence. Hence, philosophy is no longer concerned about the status of the human as a fact; quite differently, the central question is how to speak about the status of the human, thereby not neglecting that speaking, using language, is not a private affair. This looking for the “how” has, of course, itself to obey the rule not to determine rules in a positive definite manner. As a consequence, the only philosophical work we can do is exploring the conditions, where the concept of “condition” refers to an open, though not recursive, chain. Actually, already Aristotle dubbed this “metaphysics” and regarded it as the core interest of philosophy. This “metaphysics” can't be taken over by any “natural” discipline, whether a kind of science or engineering. There is a clear downstream relation: science as well as engineering should be affected by it in emphasizing the conditions for their work more intensely.

Practicing, turning the conditions and conditionability into facts and constraints, is the job of design, whether this design manifests as “design”, as architecture, as machine-creating technology, as politics, as education, as writing and art, etc. Philosophy not only can never explain, as Wittgenstein mentioned, it also can't describe things “as such”. Descriptions and explanations are only possible within a socially negotiated system of normative choices. This holds true even for the natural sciences. As a consequence, we should start with philosophical questions even in the natural sciences, and definitely always in engineering. And engaging in fields like machine learning, so-called artificial intelligence or robotics without constantly referring to philosophy will almost inevitably result in nonsense. The history of these fields is full of examples for that, just remember the infamous “General Problem Solver” of Simon and Newell.

Yet, the issue is not only one of ethics, morality and politics. It was Foucault who, in a sort of follow-up to Merleau-Ponty, first claimed a third region between the empiricism of affections and the tradition of reflecting on pure reason or consciousness.3 This third region, or even dimension (we would say “aspection”), being based on the compound consisting of perception and the body, comprises the historical evolution of systems of thinking. Foucault, together with Deleuze, once opened the possibility for a transcendental empiricism, the former mostly with regard to historical and structural issues of political power, the latter mostly with regard to the micronics of individual thought, where the “individual” is not bound to a single human person, of course. In our project as represented by this collection of essays we are following a similar path, starting with the transition from the material to the immaterial by means of association, and then investigating the dynamics of thinking in the aspectional space of transcendental conditions (forthcoming chapter), which builds an abstract bridge between Deleuze and Foucault as it covers both the individual and the societal aspects of thinking.

This Essay

This essay deals with the relation of words and a rather important aspect in thinking, representation. We will address some aspects of its problematics, before we approach the role of words in language. Since the representation is something symbolic in the widest sense and that representation has to be achieved autonomously by a mainly material arrangement, e.g. called “the machine”4, we also will deal (again) with the conditions for the transformation of (mainly) physical matter into (mainly) symbolic matter. Particularly, however, we will explore the role of words in language. The outline comprises the following sections:

From Matter to Mind

Given the conditioning mentioned above, the anthropological history of the genus Homo5 poses a puzzle. Our anatomical foundations6 have been stable for at least 60,000 years, but contemporary human beings at the age of, let me say, 20 or 30 years are surely much more “intelligent”7. Given the measurement scale established as I.Q. at the beginning of the 20th century, a significant increase can be observed for the surveyed populations even throughout the last 60 years.

So, what makes the difference then, between the earliest ancient cultures and the contemporary ones? This question is highly relevant for our considerations here that focus on the possibility of a machine-based episteme, or in more standard, yet seriously misplaced terms, machine learning, machine intelligence or even artificial intelligence. In any of those fields, one could argue, researchers and engineers somehow start with mere matter, then imprinting some rules and symbols to that matter, only to expect then the matter becoming “intelligent” in the end. The structure of the problematics remains the same, whether we take the transition that started from paleo-cultures or that rooted in the field of advanced computer science. Both instances concern the role of culture in the transformation of physical matter into symbolic matter.

While philosophy has tackled that issue for at least two and a half millennia, resulting in a rich landscape of arguments, including the reflection on the many styles of developing those arguments, computer science is still almost completely blind to the whole topic. Since computer scientists and computer engineers inevitably get into contact with the realm of the symbolic, they usually and naively repeat past positions, committing naive, i.e. non-reflective, idealism or materialism that is not even on a pre-Socratic level. David Blair [6] correctly identifies the picture of language on which contemporary information retrieval systems are based as that of Augustine: he believed that every word has a meaning. Notably, Augustine lived in the late 4th till early 5th century A.D. This story simply demonstrates that in order to understand the work of a field one also has, as always, to understand its history. In the case of computer science it is the history of reflective thought itself.

Precisely this is also the reason why philosophy is much more than just a possibly interesting source for computer scientists. More directly expressed, it is probably one of the major structural faults of computer science that it is regarded as just a kind of engineering. Countless projects and pieces of software failed for the reason of such applied methodological reductionism. Everything that gets into contact with computers developed from within such an attitude then also becomes infected by the limited perspective of engineering.

One of the missing aspects is the philosophy of techno-science, which, not just by chance, seriously started with Heidegger8 as its first major proponent. Merleau-Ponty, inspired by Heidegger, then emphasized that everything concerning the human is artificial and natural at the same time. It does not make sense to set up that distinction for humans or man-made artifacts, as if such a difference would itself be “natural”. Any such distinction refers more directly than not to Descartes as well as to Hegel, that is, it follows either simplistic materialism or overdone idealism, so to speak idealism in its machinic, Cartesian form. Indeed, many misunderstandings about the role of computers in contemporary science and engineering, but also in the philosophy of science and the philosophy of information, can be deciphered as a massive Cartesio-Hegelian heritage, with all its drawbacks. And there are many.

The most salient perhaps is the foundational element9 of Descartes' as well as Hegel's thought: independence. Of course, for both of them independence was a major incentive, goal and demand, for political reasons (absolutism in the European 17th century), but also for general reasons imposed by the level of techno-scientific insight, which remained quite low until the middle of the 20th century. People before the scientific age had been exposed to all sorts of threatening issues, concerning health, finances, religious or political freedom, collective or individual violence, all together often termed “fate”. Being independent meant a basic condition for living more or less safely at all, physically and/or mentally. Yet, Descartes and Hegel definitely exaggerated it.

Yet, the element of independence made its way into the core of the scientific method itself. Here it blossomed as reductionism, positivism and physicalism, all of which can be subsumed under the label of naive realism. It took decades until people developed some confidence not to prejudge complexity as esotericism.

With regard to computer science there is an important consequence. First, we can safely drop the label of “artificial intelligence” or “machine learning”, along with the respective narrow and limited concepts. Concerning machine learning we can state that only very few of the approaches that exist so far achieve even a rudimentary learning in the sense of structural self-transformation. The vast majority of approaches that are dubbed “machine learning” represent just some sort of advanced parameter estimation, where the parameters to be estimated are all defined (i) apriori, and (ii) by the programmer(s). And regarding intelligence we can recognize that we never can assign concepts like artificial or natural to it, since there is always a strong dependence on culture in it. Michel Serres once called written language the first artificial intelligence, pointing to the central issue of any technology: the externalization of symbol-based systems of references.

This brings us back to our core issue here, the conditions for the transformation of (mainly) physical matter into (mainly) symbolic matter. In some important way we even can state that there is no matter without symbolic aspects. Two pieces of matter can interact only if they are not completely transparent to each other. If there is an effective transfer of energy between them, then the form of the energy becomes important, think of it for instance as the wavelength of some electromagnetic radiation, or its rhythmicity, which becomes distinctive in the case of a LASER [9,10]. Sure, in a LASER there are no symbols to be found; yet, the system as a whole establishes a well-defined and self-focusing classification, i.e. it performs the transition from a white-noised, real-valued randomness to a discrete intensional dynamics. The LASER has thus to be regarded as a particular kind of associative system, which is able to produce proto-symbols.

Of course, we may not restrict our considerations to such basic instances of pan-semiotics. When talking about machine-based episteme we talk about the ability of an entity to think about the conditions for its own informational dynamics (avoiding the term knowledge here…). Obviously, this requires some kind of language. The question for any attempt to make machines “intelligent” thus concerns in turn the question of how to think about the individual acquisition of language, and, with regard to our interests here, how to implement the conditions for it. Note that Homo erectus, who lived 1 million years ago, must have had a clear picture not only about causality, and not only individually, but they also must have had the ability to talk about that, since they were able to keep fire burning and to utilize it for cooking meals and bones. Logic had not been invented as a field at that time, but it seems absolutely mandatory that they were using a language.10 Even animals like cats, pigs or parrots are able to develop and to perform plans, i.e. to handle causality, albeit probably not in a conscious manner. Yet, neither wild pigs nor cats are capable of symbol-based culture, that is a culture which spreads on the basis of symbols that are independent from a particular body or biological individual. The research programs of machine learning, robotics or artificial intelligence thus appear utterly naive, since they all neglect the cultural dimension.

The central set of questions thus considers the conditions that must be met in order to become able to deal with language, to learn it and to practice it.

These conditions are not only “private”; that is, they can't be reduced to individual brains, or machines, that would “process” information. Leaving the simplistic perspective on information as it is usually practiced in computer science aside for the moment, we have to accept that learning language is a deeply social activity, even if the label of the material description of the entity is “computer”. We also have to think about the mediality of symbolic matter, the transition from nature to culture, that is from contexts of low symbolic intensity to those of high symbolic intensity. Handling language is not an affair that could be thought to be performed privately; there is no such thing as a “private language”. Of course, we have brains, for which the matter could still be regarded as dominant, and the processes running there are running only there11.

Note that implementing the handling of words as apriori existing symbols is not what we are talking about here. As Hofstadter pointed out [12], calling the computing processes on apriori defined strings “language understanding” is nothing but silly. We are not allowed to call the shuffling back and forth of predefined encoded symbols “understanding”. But what could we call “understanding” then? Again, we have to postpone this question for the time being. Meanwhile we may reshape the question about learning language a bit:

How do we come to be able to assign names to things, classes, types, species, animals and other humans? What is the role of such naming, and what is the role of words?

The Unresolved Challenge

The big danger when addressing these issues is to start too late, provoked by an ontological stance that is applied to language. The most famous example is probably provided by Heidegger and his attempt at a “fundamental ontology”, which failed spectacularly. It is all too easy to get bewitched by language itself and to regard it as something natural, as something like stones: well-defined, stable, and potentially serving as a tool. Language itself makes us believe that words exist as such, independent from us.

Yet, language is a practice, as Wittgenstein said, and this practice is neither a single homogenous one nor does it remain constant throughout life, nor are the instances identical and exchangeable. The practice of language develops, unfolds, gains quasi-materiality, turns from an end to a means and back. Indeed, language may be characterized just by the capability to provide that variability in the domain of the symbolic. Take as a contrast for instance the symbolon, or take the use of signs in animals, in both cases there is exactly one single “game” you can play. Only in such trivial cases the meaning of a name could be said to be close to its referent. Yet, language games are not trivial.

I already mentioned the implicit popularity of Augustine among computer scientists and information systems engineers. Let me cite the passage that Wittgenstein chose in his opening remarks to the famous Philosophical Investigations (PI)12. Augustine writes:

When they (my elders) named some object, and accordingly moved towards something, I saw this and I grasped that the thing was called by the sound they uttered when they meant to point it out. Their intention was shewn by their bodily movements, as it were the natural language of all peoples: the expression of the face, the play of the eyes, the movement of other parts of the body, and the tone of voice which expresses our state of mind in seeking, having, rejecting, or avoiding something. Thus, as I heard words repeatedly used in their proper places in various sentences, I gradually learnt to understand what objects they signified; and after I had trained my mouth to form these signs, I used them to express my own desires.

Wittgenstein gave two replies, one directly in the PI, the other one in the collection entitled “Philosophical Grammar” (PG).

These words, it seems to me, give us a particular picture of the essence of human language. It is this: the individual words in language name objects—sentences are combinations of such names.—In this picture of language we find the roots of the following idea: Every word has a meaning. This meaning is correlated with the word. It is the object for which the word stands.

Augustine does not speak of there being any difference between kinds of word. If you describe the learning of language in this way you are, I believe, thinking primarily of nouns like “table,” “chair,” “bread,” and of people’s names, and only secondarily of the names of certain actions and properties; and of the remaining kind of words as something that will take care of itself. (PI §1)

And in the Philosophical Grammar:

When Augustine talks about the learning of language he talks about how we attach names to things or understand the names of things. Naming here appears as the foundation, the be all and end all of language. (PG 56)

Before we take the step to drop and to drown the ontological stance once and for all, we would like to provide two things. First, we will briefly cite a summarizing table from Blair [1]13. Blair's book is indeed a quite nice work about the peculiarities of language as far as it concerns “information retrieval” and how Wittgenstein's philosophy could be helpful in resolving the misunderstandings. Second, we will (also very briefly) make our perspective on names and naming explicit.

David Blair dedicates quite some effort to rendering the issue of the indeterminacy of language as clear as possible. In alignment with Wittgenstein he emphasizes that indeterminacy in language is not the result of sloppy or irrational usage. Language is neither a medium of logic nor something like a projection screen of logic. There are good arguments, represented by the works of Ludwig Wittgenstein, the later Hilary Putnam and Robert Brandom, to believe that language is not an inferior way to express a logical predicate (see the previous chapter about language). Language can't be “cleared” or made less ambiguous; its vagueness is a constitutive necessity for its use and utility in social intercourse. Many people in linguistics (e.g. Rooij [13]) and large parts of the cognitive sciences (e.g. Alvin Goldman [14]14), but also philosophers like Saul Kripke [16] or Scott Soames [17], take the opposite position.

Of course, in some contexts it is reasonable to try to limit the vagueness of natural language, e.g. in law and contracts. Yet, it is also clear that positivism in jurisprudence is a rather bad thing, especially if it shows up paired with idealism.

Blair then contrasts two areas in so-called “information retrieval”15, distinguished by the type of data that is addressed: on the one hand structured data that could be arranged in tables, which Blair calls determinate data, and on the other hand such “data” that can't be structured apriori, like language. We already met this fundamental difference in other chapters (about analogies, language). The result of his investigation is summarized in the following table. It is more than obvious that the characteristics of the two fields are drastically different, which equally obviously has to be reflected in the methods going to be applied. For instance, the infamous n-gram method is definitely a no-go.

For the same reasons, semantic disambiguation is not possible by a set of rules that could be applied by an individual, whether this individual is a human or a machine. Quite likely it is even completely devoid of sense to try to remove ambiguity from language. One of the reasons is given by the fact that concepts are transcendental entities. We will return to the issue of “ambiguity” later.

In the quote from the PG shown above Wittgenstein rejects Augustine's perspective that naming is central to language. Nevertheless, there is a renewed discussion in philosophy about names and so-called “natural kind terms”, brought up by Kripke's “Naming and Necessity” [16]. Recently, Scott Soames explicitly referred to Kripke's work. Yet, like so many others, Soames commits the drastic mistake, introduced along the line formed by Frege, Russell and Carnap, of ascribing to language the property of predicativity (cf. [18] p. 646).

These claims are developed within a broader theory which, details aside, identifies the meaning of a non-indexical sentence S with a proposition asserted by utterances of S in all normal contexts.

We won't delve in any detail into the discussion of “proper names”16, because it is largely a misguided and unnecessary one. Let me just briefly mention three main (and popular) alternative approaches to address the meaning of names: the descriptivist theories, the referential theory originally advanced by John Stuart Mill, and the causal-historical theory. They are all untenable because they implicitly violate the primacy of interpretation, though not in an obvious manner.

Why can't we say that a name is a description? A description needs assignates17, or aspects, if you like, at least one scale. Assuming that there is the possibility for a description that is apriori justified and hence objective invokes divinity as a hidden parameter, or any other kind of Fregean hyper-idealism. Assignates are chosen according to and in dependence on the context. Of course, one could try to expel any variability of any expectable context, e.g. by literally programming society, or by some kind of philosophical dictatorship. In any other case, descriptions are variant. The actual choice for any kind of description is the rather volatile result of negotiation processes in the embedding society. The rejection of names as descriptions results from the contradictory pragmatic stances: first, names are taken as indivisible, atomic entities, but second, descriptions are context-dependent subatomic properties, which, by virtue of the implied pragmatics, corroborates the primary claim. Remember that the context-dependency results from the empirical underdetermination. In standard situations it is neither important that water is a compound of hydrogen and oxygen, nor is this what we want to say in everyday situations. We do not carry the full description of the named entity along into any instance of its use, although there are some situations where we indeed are interested in the description, e.g. as a scientist, or as a supporter of the “hydrogen economy”. The important point is that we never can determine the status of the name before we have interpreted the whole sentence, while we also can't interpret the sentence without determining the status of the named entity. Both entities co-emerge. Hence we also can't give an explicit rule for such a decision other than just using the name or uttering the sentence. Wittgenstein thus denies the view that assumes a meaning behind the words that is different from their usage.

The claim that the meaning of a proper name is its referent meets similar problems, because it just introduces the ontological stance through the backdoor. Identifying the meaning of a label with its referent implies that the meaning is taken as something objective, as something that is independent from context, and even beyond that, as something that could be packaged and transferred *as such*. In other words, it deliberately denies the primacy of interpretation. We need not say anything further, except perhaps that Kripke (and Soames as well, in taking it seriously) commits a third mistake in using “truth-values” as factual qualities.18 We may propose that the whole theory of proper names follows a pseudo-problem, induced by overgeneralized idealism or materialism.

Names, proper: Performing the turn completely

Yet, what would be an appropriate perspective on the problem of names? What I would like to propose is a consequent application of the concept of the “language game”. The “game” perspective could be applied not only to the complete stream of exchanged utterances, but also to the parts of the sentences, e.g. names and single words. As a result, new questions become visible. Wittgenstein himself did not explore this possibility (he took Augustine as a point of departure), and it cannot be found in contemporary discourse either19. As so often, philosophers influenced by positivism simply forget about the fact that they are speaking. Our proposal is markedly different from and also much more powerful than the causal-historical or the descriptivist approach, and it also avoids the difficulties of Kripke's externalist version.

After all, naming, to give a name and to use names, is a “language game”. Names are close to observable things, and as a matter of fact, observable things are also demonstrable. Using a name refers to the possibility of a speaker to provide a description to his partner in discourse such that this listener would be able to agree on the individuality of the referenced thing. The use of the name “water” for this particular liquid thing does not refer to an apriori fixed catalog of properties. Speaker and listener need not even agree on the identity of the set of properties ascribed to the referred physical thing. The chemist may always associate the physico-chemical properties of the molecule, even when he reads about the submersed sailors in Shakespeare's *Tempest*, but nevertheless he easily could talk about that liquid matter with a 9-year-old boy who knows neither about Shakespeare nor about the molecule.

It is thus neither possible nor reasonable to try to achieve a match regarding the properties, since a rich body of methods would necessarily be invoked to determine that set. Establishing the identity of representations of physical, external things, or even of the physical things themselves, inevitably invokes a normative act (which is rather incommensurable with the empiricists' claims).

For instance, when just saying “London”, out of the blue, it is not necessary that we envisage the same aspects of the grand urban area. Since cities are inevitably heterotopic entities (in the sense of Foucault [19, 20], acc. to David Graham Shane [21]), this agreement is actually impossible. Even for the undeniably more simple-minded cartographers the same problem exists: “where” is that London, in terms of spherical coordinates? Despite these unavoidable difficulties both the speaker and the listener easily agree on the individuality of the imaginary entity “London”. The name “London” does not point to a physical thing but just to an imaginative pole. In contrast to concepts, however, names take a different grammatical role, as they not only allow for a negotiation of rather primitive assignates in order to take action, they even demonstrate the possibility of such negotiation. The actual negotiations could be quite hard, though.

We conclude that we are not allowed to take any of the words as something that would “exist” as, or like, a physical “thing”. Of course, we get used to certain words; they gain a quasi-materiality because a constancy appears that may be much stronger than the initial contingency. But this “getting used” is a different topic; it just refers to how we speak about words. Naming remains a game, and like any other game this one also does not have an identifiable border.

Despite this manifold that is mediated through language, or as language, it is also clear that language remains rooted in activity or the possibility of it. I demonstrate the usage of a glass and accompany that by uttering “glass”. Of course, there is the Gavagai problematics20 as it has been devised by Quine [22]. Yet, this problematics is not a real problem, since we usually interact repeatedly. On the one hand this provides us with the possibility to improve our capability to differentiate single concepts in a certain manner, but on the other hand the extended experience introduces a secondary indeterminacy.

In some way, all words are names. All words may be taken as indicators that there is the potential to say more about them, yet in a different, orthogonal story. This holds even for the abstract concepts denoted by the word “transcendental” or for verbs.

The usage of names, i.e. their application in the stream of sentences, gets more and more rich, but also more and more indeterminate. All languages developed some kind of grammar, which is a more or less strict body of rules about how to arrange words for certain language games. Yet, grammar is not a necessity for language at all; it is just a tool to render language-based communication easier, faster and more precise. Beyond the grammars, it is experience which enables us to use metaphors in a dedicated way. Yet, language is not a thing that sometimes contains metaphors and sometimes not. In a very basic sense all of language is metaphorical all the time.

So, we first conclude that there is nothing enigmatic in learning a language. Secondly, we can say that extending the “gameness” down to words provides the perspective of the mechanism, notably without reducing language to names or propositions.

Instead, we now can clearly see how these mechanisms mediate between the language game as a whole, the metaphorical characteristics of any language and simple rule-based mechanisms.

Representing Words

There is a drastic consequence of the completed gaming perspective. Words can't be “represented” as symbols or as symbolic strings in the brain, and words can't be appropriately represented as symbols in the computer either. In any programming language, strings in a computer program are nothing else than particularly formatted series of values. Usually, this series is represented as an array of values, which is part of an object. In other words, the word is represented as a property of an object, where such objects are instances of their respective classes. Thus, the representation of words in ANY computer program created so far for the purpose of handling texts, documents, or textual information in general is deeply inappropriate.
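
To make that critique concrete, here is the kind of representation at stake, reduced to a few lines (class and field names are invented for illustration): the word appears as nothing but a formatted series of values attached to an object:

```python
# The typical representation criticized above: the "word" is just an array of
# characters, stored as a property of an object.
class Token:
    def __init__(self, surface: str):
        self.surface = surface            # the word as a series of values

    def __repr__(self):
        return f"Token({self.surface!r})"

t = Token("water")
print(list(t.surface))   # ['w', 'a', 't', 'e', 'r'] -- nothing but values, no roots
```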

Instead, the representation of the word has to carry along its roots, its path of derivation, or in still other words, its traces of precipitation of the “showing”. This rooting includes, so we may say, a demonstrativum, an abstract image. This does not mean that we have to set up an object in the computer program that contains a string and an abstract image. This would be just the positivistic approach, leaving all problems untouched, the string and the image still being independent. The question of how to link them would just be delegated to the next analytic homunculus.

What we propose are non-representational abstract compounds that are irrevocably multi-modal since they are built from the assignates of abstract “things” (Gegenstände). These compounds are nothing else than combined sets of assignates. The “things” represented in this way are actually always more or less “abstract”. Through the sets of assignates we may actually combine even things which appear incommensurable on the level of their wholeness, at least at first sight. An action is an action, not a word, and vice versa; an image is neither a word nor an action, isn't it? Well, it depends; we already mentioned that we should not take words as ontological instances. Any of those entities can be described using the same formal structure, the probabilistic context, which is further translated into a set of assignates. The probabilistic context creates a space of expressibility, where the incommensurability disappears, notably without reducing the comprised parts (image, text, …) to the slightest extent.
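
A purely speculative sketch of such a compound (all names and weights are invented, and the structure is deliberately simplistic) might represent a word and an image interpretation in the very same form, as weighted sets of assignates, so that they become comparable without any apriori relation between them:

```python
# Hypothetical "compound of assignates": items from different modalities share
# one probabilistic form and can be related without predefined references.
from math import sqrt

def compound(assignates):
    """Normalize a dict of {assignate: weight} into a unit-length compound."""
    norm = sqrt(sum(w * w for w in assignates.values())) or 1.0
    return {k: w / norm for k, w in assignates.items()}

word_water  = compound({"liquid": 0.9, "drinkable": 0.7, "transparent": 0.5})
image_glass = compound({"transparent": 0.8, "container": 0.9, "liquid": 0.3})

def overlap(a, b):
    """A crude similarity computed only over shared assignates."""
    return sum(a[k] * b[k] for k in a.keys() & b.keys())

print(overlap(word_water, image_glass))
```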

The situation is slightly reminiscent of synesthetic experiences. Yet, I would like to avoid calling it synesthetic, since synesthesia is experienced on a highly symbolic level. Like other phenomenological concepts, it also does not provide any hint about the underlying mechanisms. In contrast, we are talking about a much lower level of integration. Probably we could call this multi-modal compound a “syn-presentational” compound, or short, a “synpresentation”.21

Words, images and actions are represented together as a quite particular compound, which is an inextricable multi-modal compound. We also may say that these compounds are derived qualia. The exciting point is that the described way of probabilistic multi-modal representation obviates the need for explicit references and relations between words and images. These relations would even have to be defined apriori (strongly: before programming; weakly: before usage). In our approach, and quite in contrast to the model of external control, relations and references *can be* subject to context-dependent alignments, either to the discourse, or to the task (of preparing a deliverable from memory).

The demonstrativum may not only refer to an “image”. First note that the image does not exist outside of its interpretation. We need to refer to that interpretation, not to an index in a database or a file system. Interpretation thus means that we apply a lot of various processing and extraction methods to it, each of them providing a few assignates. The image is dissolved into probabilistic contexts, as we do it for words (we have described this elsewhere). The dissolving of an image is of course not the endpoint of a communicable interpretation; it is just the starting point. Yet, this does not matter, since the demonstrativum may also refer to any derived intension and even to any derived concept.22

The probabilistic multi-modal representation exhibits three highly interesting properties, concerning abstractness, relations and the issue of foundations. First, the abstractness of represented items becomes scalable in an almost smooth manner. In our approach, “abstractness” is not a quality any more. Secondly, relations and references of both words and the “content” of images are transformed into their pre-specific versions. Both relations and references need not be implemented apriori or observed as an apriori. Initially, they appear only as randolations23. Thirdly, some derived and already quite abstract entities on an intermediate level of “processing” are more basic than the so-called raw observations24.

Words, Classes, Models, Waves

It is somewhat tempting to arrange these four concepts to form a hierarchical series. Yet, things are not that simple. Actually, any of the concepts that appear more as a symbolistic entity also may re-turn into a quasi-materiality, into a wave-like phenomenon that itself serves as a basis for potential differences. This re-turn is a direct consequence of the inextricable mediality of the world, mediality understood here thus as a transcendental category. Needless to say that mediality is just another blind spot in contemporary computer sciences. Cybernetics as well as engineering straightaway exclude the possibility to recognize the mediatedness of worldly events.

In this section we will try to explicate the relations between the headlined concepts to some extent, at least as far as it concerns the mapping of those into an implementable system of (non-Turing) “computer programs”. The computational model that we presuppose here is the extended version of the 2-layered SOM, as we have introduced it previously.

Let us start with first things first. Given a physical signal, here in the literal sense, that is as a potentially perceivable difference in a stream of energy, we find embodied modeling, and nothing else. The embodiment of the initial modeling is actualized in sensory organs, or more generally, in any instance that is able to discretize the waves and differences at least “a bit more”. In more technical terms, the process of discretization is a process that increases the signal-noise ratio. In biological systems we often find a frequency encoding of the intensity of a difference. Though the embodiment of that modeling is indeed a filtering and encoding, hence already some kind of a modeling representation, it is not a modeling in the more narrow sense. It points out of the individual entity into the phylogenesis, the historical contingency of the production of that very individual entity. We also can’t say that the initial embodied processing by the sensory organs is a kind of encoding. There is no code consisting of well-identified symbols at the proximate end of the sensory cell. It is still a rather probabilistic affair.
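
As a toy illustration of such frequency encoding (assumed parameters, not a claim about actual sensory organs), the intensity of a difference can be passed on as the rate of discrete events rather than as a continuous value, which already lifts the signal out of the noise to some extent:

```python
# Toy rate coding: intensity is encoded as the frequency of discrete "spikes".
import numpy as np

rng = np.random.default_rng(1)

def rate_code(intensity, duration_ms=1000, max_rate_hz=100):
    """Emit spikes as a Bernoulli process whose rate grows with intensity."""
    rate = max_rate_hz * min(max(intensity, 0.0), 1.0)
    return rng.random(duration_ms) < rate / 1000.0   # one bin per millisecond

weak, strong = rate_code(0.1), rate_code(0.8)
print(weak.sum(), "spikes vs", strong.sum(), "spikes")
```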

This basic encoding is not yet symbolic, albeit we also can't call it a wave any more. In biological entities this slightly discretized wave is then subject to an intense modeling sensu stricto. The processing of the signals is performed by associative mechanisms that are arranged in cascades. This “cascading” is highly interesting and probably one of the major mandatory ingredients that are neglected by computer science so far. The reason is quite clear: it is not an analytic process, hence it is excluded from computer science almost by definition.

Throughout that cascade signals turn more and more into information as interpreted difference. It is clear that there is not a single or identifiable point in this cascade to which one could assign the turn from “data” to “information”. The process of interpretation is, quite in contrast to idealistic pictures of the process of thinking, not a single step. The discretized waves that flow into the processing cascade are subject to many instances and very different kinds of modeling, throughout which discrete pieces get separated and related to other pieces. The processing cascade thus repeats a modular principle consisting of association and distribution.

This level we still could not label as “thinking”, albeit it is clearly some kind of mental process. Yet, we could still regard it as something “mechanical”, even as we also find already class-like representations, intensions and proto-concepts. Thinking in its meaningful dimension, however, appears only through assigning sharable symbols. Thinking of something implicitly means that one could tell about the respective thoughts. Whether these symbols are shared between different regions in the brain or between different bodily entities does not matter much. Hence, thinking and mental processes need to be clearly distinguished. Yet, assigning symbols, that is assigning a word, a specific sound first, and later, as a further step of externalization, a specific grapheme that reflects the specific sound, which in turn represents an abstract symbol, this process of assigning symbols is only possible through cultural means. Cats may recognize situations very well and react accordingly, they may even have a feeling that they have encountered that situation before, but cats can't share their symbols, they can't communicate the relational structure of a situation. Yet, cats and dogs already may take part in “behavior games”, and such games have clearly been found in baboons by Fernando Colmenares [24]. Colmenares adopted the concept of “games” precisely because of the co-occurrence of obvious rules, high variability, and predictive values of the actions and reactions of the individual animals. Such games unfold synchronically as well as diachronically, and across dynamically changing assignments of social roles. All of this is accompanied by specific sounds. Other instances of language-like externalization of symbols can presumably be found in grey parrots [25], green vervet monkeys [26], bonobos, dolphins and orcas.

But still… in animals those already rather specific symbols are not externalized by imprinting them into matter different from their own bodies. One of the most desirable capabilities for our endeavor here about machine-based episteme thus consists in just such externalization processes embedded in social contexts.

Now the important thing to understand is that this whole process from waves to words is not simply a one-way track. First, words do not exist as such; they just appear as discrete entities through usage. It is the usage of X that introduces irreversibility. In other words, the discreteness of words is a quality that is completely on the aposteriori side of thinking. Before their actual usage, their arrangement into sentences, words “are” nothing else than probabilistic relations. It needs a purpose, a target-oriented selection (call it “goal-directed modeling”) to let them appear as crisp entities.

The second issue is that a sentence is an empirical phenomenon, remarkably even to the authoring brain itself. The sentence needs interpretation, because it is never fully determinate. Interpretation of such indeterminate instances as sentences, however, renders the apparently crisp phenomenon of words back into waves. A further effect of the interpretation of sentences as series of symbols is the construction of a virtual network. Texts, and in a very similar way pieces of music, should not be conceived as series, as computational linguistics treats them. Much more appropriately, texts are conceived as networks, which may even exert their own (again virtual) associative power, to some extent independent of the hosting interpreter, as I have argued here [28].
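To illustrate the contrast between treating a text as a series and as a network, here is a small sketch using only the Python standard library; the toy text, the window size and the function name are of course arbitrary. Words become nodes, and co-occurrence within a sliding window becomes a weighted edge; the resulting graph, not the linear sequence, carries the associative texture.

```python
# A small illustration (standard library only) of reading a text as a network
# rather than as a series: words are nodes, co-occurrence within a sliding
# window is a weighted edge. Window size and the toy text are arbitrary.
from collections import defaultdict
from itertools import combinations
import re

def cooccurrence_network(text, window=5):
    words = re.findall(r"[a-z]+", text.lower())
    edges = defaultdict(int)
    for i in range(len(words)):
        for a, b in combinations(sorted(set(words[i:i + window])), 2):
            edges[(a, b)] += 1
    return edges

text = ("Texts and pieces of music should not be conceived as series; "
        "much more appropriately texts are conceived as networks that "
        "exert their own associative power.")

net = cooccurrence_network(text)
# the strongest associations form the backbone of the virtual network
for (a, b), w in sorted(net.items(), key=lambda kv: -kv[1])[:5]:
    print(f"{a} -- {b}: {w}")
```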

Role of Words

All these characteristics of words, their purely aposteriori crispness, their indeterminacy as sub-sentential indicators of randolational networks, their quality as signs by which they only point to other signs, never to “objects”, their double quality as constituent and result of the “naming game”, all these “properties” make it appear highly unlikely and questionable whether language is about reference at all. Additionally, we know that the concept of “direct” access to the mind or the brain is simply absurd. Everything we know about the world as individuals is due to modeling and interpretation. That of course also concerns the interpretation of cultural artifacts or culturally enabled externalizations of symbols, for instance into the graphemes that we use to represent words.

It is of utmost importance to understand that the written or drawn grapheme is not the “word” itself. The concept of a “word-as-such” is highly inappropriate, if not plain nonsense.

So, if words, sentences and language at large are not about “direct” referencing of (quasi-)material objects, how then should we conceive of the process we call “language game”, or “naming game”? Note that we can now identify van Fraassen’s question about “how do words and concepts acquire their reference?” as a misunderstanding, deeply informed by positivism itself. It does not make sense to pose the question in this way at all. There is not first a word which then, in a secondary process, gets some reference or meaning attached. Such a concept is almost absurd. Similarly, the distinction between syntax and semantics, once introduced by the positivist Morris in the late 1940s, is to be regarded as much the same kind of pseudo-problem, established just by the fundamental and elemental assumptions of positivism itself: linear additivity, metaphysical independence and lossless separability of parts from wholes. If you scatter everything into single pieces of empirical dust, you will never again be able to make any proposition about the relations you destroyed before. That is the actual reason for the problem of positivistic science and its failure.

In contrast to that we tend to propose a radically different picture of language, one that of course has existed in many preformed flavors. Since we can’t transfer anything directly into another’s mind, the only thing we can do is to invite or trigger processes of interpretation. In the chapter about vagueness we called words “processual indicatives” for slightly different reasons. Language is a highly structured, institutionalized and symbolized “demonstrating”, an invitation to interpret. Richard Brandom investigated in great detail [29] the processes and the roles of speakers and listeners in that process of mutual invitation for interpretation. The mutuality allows a synchronization, a resonance and a more or less strong resemblance between pairs of speaker-listeners and listener-speakers.

The “naming game” and its derivative, the “word game”, are embedded into a context of “language games”. Actually, word games and language games are not as closely related as it might appear prima facie, at least beyond the common characteristics that we may label “game”. This becomes apparent if we ask what happens with the “physical” representative of a single word that we throw into our mechanisms. If there is no sentential context, or likewise no social context like a chat, then a lot of quite different variants of possible continuations are triggered. If we call out “London”, our partner in chatting may continue with “Jack London” (the writer), “Jack the Ripper”, Chelsea, the London Tower, Buckingham, London Heathrow, London Soho, the London Stock Exchange, etc., but also Paris, Vienna, Berlin, etc., the choices depending slightly on our mood, the thoughts we had before, and so on. In other words, the word that we bring to the foreground as a crisp entity behaves like a seedling: it is the starting point of a potential garden or forest, it functions as the root of the unfolding of a potential story (as a co-weaving of a network of abstract relations). Just to bring in another metaphorical representation: words are like the initial traces of firework rockets, or the traces of elementary particles in statu nascendi as they can be observed in a bubble chamber: they promise a rich texture of upcoming events.
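The seedling behaviour can also be sketched in a toy manner: a single word does not retrieve a referent, it opens a weighted field of possible continuations, which a current “mood” merely re-weights. The association table and the bias below are invented purely for illustration.

```python
# A toy sketch of the "seedling" behaviour of a word: calling out a word does
# not retrieve a referent, it opens a weighted field of possible continuations.
# The association table and the "mood" bias are invented for illustration.
import random

associations = {
    "london": {"jack london": 3, "jack the ripper": 2, "chelsea": 2,
               "tower": 2, "heathrow": 1, "soho": 1,
               "paris": 1, "vienna": 1, "berlin": 1},
}

def continuations(word, mood_bias=None, k=3, seed=None):
    """Sample k continuations; the bias re-weights the field, it never
    fixes a single 'meaning' in advance."""
    field = dict(associations.get(word.lower(), {}))
    for key, factor in (mood_bias or {}).items():
        if key in field:
            field[key] *= factor
    rng = random.Random(seed)
    items, weights = zip(*field.items())
    return rng.choices(items, weights=weights, k=k)

print(continuations("London", seed=1))
print(continuations("London", mood_bias={"paris": 5, "vienna": 5}, seed=1))
```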

Understanding (Images, Words, …)

We have seen that “words” gain shape only as a result of a particular game, the “naming game”, which is embedded into a “language game”. Before those games are played, “words” do not exist as discrete, crisp entities, say as a symbol, or a string of letters. If they did, we could not think. Even more than the “language game”, the “naming game” works mainly as an invitation or as an acknowledged trigger for more or less constrained interpretation.

Now there are those enlightened language games of “understanding” and “explaining”. Both of them work just as any other part of speech does: they promise something. The claim to understand something refers to the ability to prepare a potential series of triggers, which one additionally claims to be able to arrange in such a way as to support the chat partner in gaining the respective insight. Slightly derived from that, understanding could also mean transferring the structure of the underlying or overarching problematics to other contexts. This ability for adaptive reframing of a problematic setting is thus always accompanied by a demonstrativum, that is, by some abstract image, either by actual pictorial information or its imagination, or by its activity. Such a demonstrativum could of course be located completely within language itself, which however is probably quite rare.

Ambiguity

It is clear that language does not work as a way to express logical predicates. Trying to use it so requires careful preparations. Language can’t be “cured” and “cleaned” of ambiguities; trying to do so would constitute a categorical misunderstanding. Any “disambiguation” happens as a resonating resemblance of at least two participants in language-word-gaming, mutually interpreting each other until both believe that their interests and their feelings match. An actual, so to speak objective, match is neither necessary nor possible. In other words, language does not exist in two different forms, one without ambiguity and without metaphors, and the other full of them. A language without metaphorical dynamics is not a language at all.

The interpretation of empirical phenomena, whether outside of language or concerning language itself, is never fully determinable. Quine called the idea of the possibility of such a complete determination a myth, one of the “dogmas of empiricism” [30]. Given this underdetermination, it does not make any sense to expect that language should be isomorphic to logical predicates or propositions. Language is basically an instance of impredicativity. Elsewhere we already met the self-referentiality of language (its strong singularity) as another reason for this. Instead, we should expect that this fundamental empirical underdetermination is reflected appropriately in the structure of language, namely as analogical thinking, or, quite related to that, as metaphorical thinking.

Ambiguity is not a property of language or words, it is a result, or better, a property of the process of interpretation at some arbitrarily chosen point in time. And that process takes place synchronously within a single brain/mind as well as between two brains/minds. Language is just the mediating instance of that intercourse.

“Intelligence”

It is now possible to clarify the ominous concept of “intelligence”. We find the concept in the name of a whole discipline (“Artificial Intelligence”), and it is at work behind the scenes in areas dubbed “machine learning”. Additionally, there is the hype about so-called “collective intelligence”. These observations, and of course our own intentions, make it necessary to deal with it briefly, albeit we think that it is a misleading and inappropriate idea.

First of all one has to understand that “intelligence” is an operationalization of a research question, allowing for a measurement, hence for a quantitative comparison. It is questionable whether mental qualities can be made quantitatively measurable without seriously reducing them. For instance, the capacity for I/O operations related to a particular task surely can’t be equated with “intelligence”, even if it may be a necessary condition.

It is just silly to search for “intelligence” in machines or beings, or to assign more or less intelligence to any kind of entity. Intelligence as such does not “exist” independently of a cultural setup; we can’t find it “out there”. Ontology is, as always, not only a bad trail, it leads directly into the abyss of nonsense. The research question, by the way, was induced by the intention to prove that black people and women are less intelligent than white males.

Yet, even if we take “intelligence” in an adapted and updated form as the capability for autonomous generalization, it remains a bad concept, simply because it does not allow us to pose further reasonable questions. This follows directly from its character as an operationalization. Investigating the operationalization hardly brings anything useful to light about the purported subject of interest.

The concept of intelligence arose in a strongly positivistic climate, where positivism was practiced in a completely unreflected manner. Hence its inventors were not aware of the effects of their operationalization. The concept of intelligence implies a strong functional embedding of the respective, measured entity. Yet, dealing with language undeniably has something to do with higher mental abilities, and language is a strictly non-functional phenomenon. It does not matter here that positivists still claim the opposite. And who would stand up claiming that a particular move, e.g. in planning a city, or in dealing with the earth’s climate, is smarter than another? In other words, the other strong assumption of positivism, measurability and identifiability, also fails dramatically when it comes to human affairs. And everything on this earth is a human affair.

Intelligence is only determinable relative to a particular Lebensform. It is thus not possible to “compare the intelligence” across individuals living in different contexts. This finally renders the concept completely useless.

Conclusions

The hypothesis I have been arguing for in this essay claims that the trinity of waves, words and images plays a significant role in the ability to deal with language and in the emergence of higher mental abilities. I proposed, first, that this trinity is irreducible and, second, that it is responsible for this ability in the sense of a necessary and sufficient condition. In order to describe the practicing of that trinity, for instance with regard to possible implementations, I introduced the term “synpresentation”. This concept draws the future track for dealing with words and images as far as machine-based episteme is concerned.

In more direct terms, we conclude that without the capability to deal with “names”, “words” and language, the attempt to map higher mental capacities onto machines will not make any progress. Once a machine has arrived at such a level, it will find itself in exactly the same position as we humans do. This capability is definitely not sufficiently defined by “calculation power”; indeed, such an idea is ridiculous. Without embedding into appropriate social intercourse, without solving the question of representation (contemporary computer science and its technology do NOT solve it, of course), even a combined 10^20’000 flops will not render the respective machine or network of machines25 “intelligent” in any way.

Words and proper names are re-formulated as a particular form of “game”, though not as “language games”, but on a more elementary level as the “naming game”. I have tried to argue how, on the basis of such a reformulation, the problematics of reference can be thought to disappear as a pseudo-problem.

Finally, we found important relationships to earlier discussions of concepts like the making of analogies or vagueness. We basically agree on the stance that language can’t be clarified and that it is inappropriate (“free of sense”) to assign any kind of predicativity to language. Bluntly spoken, the application of logic takes place in the mind, and nowhere else. Communicating about this application is not based on a language any more, and similarly, projecting logic onto language destroys language. The idea of a scientific language is as empty as the idea of a generally applicable and understandable language. A language that is not inventive could not be called a language.

Notes

1. If you read other articles in this blog you might think that there is a certain redundancy in the arguments and the targeted issues. This is not the case, of course. The perspectives are always a bit different; thus I hope that by the repeated attempt “to draw the face” (Ludwig Wittgenstein) the problematics is rendered more accurately. “How can one learn the truth by thinking? As one learns to see a face better if one draws it.” (Zettel §255, [1])

2. In one of the shortest articles ever published in the field of philosophy, Edmund Gettier [2] demonstrated that it is deeply inappropriate to conceive of knowledge as “justified true belief”. Yet, in the field of machine learning, so-called “belief revision” still follows precisely this untenable position. See also our chapter about the role of logic.

3. Michel Foucault “Dits et Ecrits” I 846 (dt.1075)  [3] cited after Bernhard Waldenfels [4] p.125

4. We will see that the distinction or even separation of the “symbolic” and the “material” is neither that clear nor simple. From the side of the machine, Felix Guattari argued in favor of a particular quality [5], the machinic, which is roughly something like a mechanism in human affairs. From the side of the symbolic there is clearly the work of Edwina Taborsky to cite, who extended and deepened the work of Charles S. Peirce in the field of semiotics.

5. Particularly Homo erectus and Homo sapiens spec.

6. Humans of the species Homo sapiens sapiens.

7. For the time being we leave this ominous term “intelligence” untouched, but I also want to warn you about its highly problematic status. We will resolve this issue by the end of this essay.

8. Heidegger developed the figure of the “Gestell” (cf. [7]), which serves multiple purposes. It provides a storage capacity, it is a tool for a sort of well-ordered, organized hiding and unhiding (“entbergen”), it provides a scaffold for sorting things in and out, and thus it works as a complex constraint on technological progress. See also Peter Sloterdijk on this topic [8].

9. elementarization regarding Descartes

10. Homo floresiensis, also called “hobbit man”, lived on Flores, Indonesia, from about 600’000 years ago until approx. 3’000 years ago. Homo floresiensis derived from Homo erectus. 600’000 years ago they obviously built a boat to cross over to the island across a sea gate with strong currents. The interesting point is that this endeavor requires a stable social structure, division of labor, and thus also language. Homo floresiensis had a particular forebrain anatomy which is believed to have provided the “intelligence”, while the overall brain was relatively small compared to ours.

11. Concerning the “enigma of brain-mind interaction”, Eccles was an avowed dualist [11]. Consequently he searched for the “interface” between the mind and the brain, in which he was deeply inspired by Karl Popper’s 3-world concept. The “dualist” position held that the mind exists at least partially independently from, and somehow outside, the brain. Irrespective of his contributions to neuroscience on the cellular level, these ideas (of Eccles and Popper) are just wild nonsense.

12. The Philosophical Investigations are probably the most important contribution to philosophy in the 20th century. They are often mistaken for a foundational document of the analytic philosophy of language. Nothing could be more wrong, however, than to take Wittgenstein for a founding father of analytic philosophy. Many of the positions that refer to Wittgenstein (e.g. Kripke’s) are just low-quality caricatures of his work.

13. Blair’s book is a must read for any computer scientist, despite some problems in its conceptualization of information.

14. Goldman [14] provides a paradigmatic example of how psychologists constantly miss the point of philosophy, up to today. In an almost arrogant tone he claims: “First, let me clarify my treatment of justificational rules, logic, and psychology. The concept of justified or rational belief is a core item on the agenda of philosophical epistemology. It is often discussed in terms of “rules” or “principles” of justification, but these have normally been thought of as derivable from deductive and inductive logic, probability theory, or purely autonomous, armchair epistemology.”

Markie [15] demonstrated that everything in these claims is wrong or mistaken. Our point is that something like “justification” is not possible in principle, and particularly not from an empirical perspective. Goldman’s secretions on the foundations of his own work are utter nonsense (till today).

15. It is one of the rare (but important) flaws in Blair’s work that he assimilates the concept of “information retrieval” in an unreflected manner. Neither is it reasonable to assign an ontological quality to information (we cannot say that information “exists”, as this would deny the primacy of interpretation), nor can we then say that information can be “retrieved”. See also our chapter about this issue. Despite his largely successful attempt to argue in favor of the importance of Wittgenstein’s philosophy for computer science, Blair fails to recognize that ontology is not tenable at large, but particularly not for issues around “information”. It is a language game, after all.

16. See the Stanford Encyclopedia of Philosophy for a discussion of various positions.

17. In our investigation of models and their generalized form, we stressed the point that there are no apriori fixed “properties” of a measured (perceived) thing; instead we have to assign the criteria for measurement actively, hence we call these criteria assignates instead of “properties”, “features”, or “attributes”.

18. See our essay about logic.

20. See the entry in the Stanford Encyclopedia of Philosophy about Quine. Quine in “Word and Object” gives the following example (abridged version here). Imagine, you discovered a formerly unknown tribe of friendly people. Nobody knows their language. You accompany one of them hunting. Suddenly a hare rushes along, crossing your way. The hunter immediately points to the hare, shouting “Gavagai!” What did he mean? Funny enough, this story happened in reality. British settlers in Australia wondered about those large animals hopping around. They asked the aborigines about the animal and its name. The answer was “cangaroo” – which means “I do not understand you” in their language.

21. This, of course, resembles Bergson who, in Matter and Memory [23], argued that all thinking and understanding takes place by means of primary image-like “representations”. As Leonard Lawlor (Henri Bergson@Stanford) summarizes, Bergson conceives of knowledge as “knowledge of things, in its pure state, takes place within the things it represents.” We would not describe our principle of associativity, as it can be realized by SOMs, very differently…

22. the main difference between “intension” and “concept” is that the former still maintains a set of indices to raw observations of external entities, while the latter is completely devoid of such indices.

23. We conceived of randolations as pre-specific relations; one may also think of them as probabilistic quasi-species that eventually may become discrete upon some measurement. The intention for conceiving of randolations is given by the central drawback of relations: their double-binary nature presumes a priori measurability and identifiability, something that is not appropriate when dealing with language.

24. “raw” is indeed very relative, especially if we take culturally transformed or culturally enabled percepts into account;

25. There are mainly two aspects to that: (1) large parts of the internet are organized as a hierarchical network, not as an associative network; nowadays everybody should know that telephone networks did not, do not and will not develop “intelligence”; (2) so-called grid computing is always organized as a linear, additive division of labor; as such, it allows processes to run faster, but no qualitative change is achieved, as can be observed, for instance, in the purely size-related contrast between a mouse and an elephant. Thus, taking (1) and (2) together, we may safely conclude that doing the wrong things (= counting Cantoric dust) at high speed will not produce anything capable of developing a capacity to understand anything.

References

  • [1] Ludwig Wittgenstein, Zettel. Oxford, Basil Blackwell, 1967. Edited by G.E.M. Anscombe and G.H. von Wright, translated by G.E.M. Anscombe.
  • [2] Edmund Gettier (1963), Is Justified True Belief Knowledge? Analysis 23: 121-123.
  • [3] Michel Foucault “Dits et Ecrits”, Vol I.
  • [4] Bernhard Waldenfels, Idiome des Denkens. Suhrkamp, Frankfurt 2005.
  • [5] Henning Schmidgen (ed.), Aesthetik und Maschinismus, Texte zu und von Felix Guattari. Merve, Berlin 1995.
  • [6] David Blair, Wittgenstein, Language and Information – Back to the Rough Ground! Springer Series on Information Science and Knowledge Management, Vol.10, New York 2006.
  • [7] Martin Heidegger, The Question Concerning Technology and Other Essays. Harper, New York 1977.
  • [8] Peter Sloterdijk, Nicht-gerettet, Versuche nach Heidegger. Suhrkamp, Frankfurt 2001.
  • [9] Hermann Haken, Synergetik. Springer, Berlin New York 1982.
  • [10] R. Graham, A. Wunderlin (eds.): Lasers and Synergetics. Springer, Berlin New York 1987.
  • [11] John Eccles, The Understanding of the Brain. 1973.
  • [12] Douglas Hofstadter, Fluid Concepts And Creative Analogies: Computer Models Of The Fundamental Mechanisms Of Thought. Basic Books, New York 1996.
  • [13] Robert van Rooij, Vagueness, Tolerance and Non-Transitive Entailment. p.205-221 in: Petr Cintula, Christian G. Fermüller, Lluis Godo, Petr Hajek (eds.) Understanding Vagueness. Logical, Philosophical and Linguistic Perspectives. Vol.36 of Studies in Logic, College Publications, London 2011. Book available online.
  • [14] Alvin I. Goldman (1988), On Epistemology and Cognition, a response to the review by S.W. Smoliar. Artificial Intelligence 34: 265-267.
  • [15] Peter J. Markie (1996). Goldman’s New Reliabilism. Philosophy and Phenomenological Research Vol.56, No.4, pp. 799-817
  • [16] Saul Kripke, Naming and Necessity. 1972.
  • [17] Scott Soames, Beyond Rigidity: The Unfinished Semantic Agenda of Naming and Necessity. Oxford University Press, Oxford 2002.
  • [18] Scott Soames (2006), Précis of Beyond Rigidity. Philosophical Studies 128: 645–654.
  • [19] Michel Foucault, Les Hétérotopies – [Radio Feature 1966]. Youtube.
  • [20] Michel Foucault, Die Heterotopien. Der utopische Körper. Aus dem Französischen von Michael Bischoff, Suhrkamp, Frankfurt 2005.
  • [21] David Grahame Shane, Recombinant Urbanism – Conceptual Modeling in Architecture, Urban Design and City Theory. Wiley Academy Press, Chichester 2005.
  • [22] Willard van Orman Quine, Word and Object. M.I.T. Press, Cambridge (Mass.) 1960.
  • [23] Henri Louis Bergson, Matter and Memory. transl. Nancy M. Paul  & W. Scott Palmer, Martino Fine Books, Eastford  (CT) 2011 [1911].
  • [24] Fernando  Colmenares, Helena Rivero (1986).  A conceptual Model for Analysing Interactions in Baboons: A Preliminary Report. pp.63-80. in: Colgan PW, Zayan R (eds.), Quantitative models in ethology. Privat I.E, Toulouse.
  • [25] Irene Pepperberg (1998). Talking with Alex: Logic and speech in parrots. Scientific American. Available online. See also the Wikipedia entry about Alex.
  • [26] a. Robert Seyfarth, Dorothy Cheney, Peter Marler (1980). Monkey Responses to Three Different Alarm Calls: Evidence of Predator Classification and Semantic Communication. Science, Vol.210: 801-803. b. Dorothy L. Cheney, Robert M. Seyfarth (1982). How vervet monkeys perceive their grunts: Field playback experiments. Animal Behaviour 30(3): 739–751.
  • [27] Robert Seyfarth, Dorothy Cheney (1990). The assessment by vervet monkeys of their own and another species’ alarm calls. Animal Behaviour 40(4): 754–764.
  • [28] Klaus Wassermann (2010). Nodes, Streams and Symbionts: Working with the Associativity of Virtual Textures. The 6th European Meeting of the Society for Literature, Science, and the Arts, Riga, 15-19 June, 2010. available online.
  • [29] Richard Brandom, Making it Explicit. Harvard University Press, Cambridge (Mass.) 1998.
  • [30] Willard van Orman Quine (1951), Two Dogmas of Empiricism. Philosophical Review, 60: 20–43. available here

۞
