Behavior

September 7, 2012

Animals behave. Of course, one could say.

Yet, why do we feel a certain naturalness here, in this relation between the cat as an observed and classified animal on the one side and the language game “behavior” on the other? Why don’t we say, for instance, that the animal happens? Or, likewise, that it is moved by its atoms? To which conditions does the language game “behavior” respond?

Strange as it may seem, it is actually astonishing that physicists readily attribute the quality of “behavior” to their dog or their cat, although they will rarely attribute ideas to them (for journeys or the like). For physicists usually claim that the whole world can be explained in terms of the physical laws that govern the movement of atoms (e.g. [1]). Even physicists, it seems, exhibit some dualism in their concepts when it comes to animals. Yet, for a long period of time, actually into the mid-1980s, physicists claimed that the behavioral sciences could not count as a “science” at all, despite the fact that Lorenz and Tinbergen (together with von Frisch) won the Nobel Prize in Physiology or Medicine in 1973.

The difficulties physicists obviously suffer from are induced by a single entity: complexity. Here we refer to the notion of complexity that we developed earlier, which essentially is built from the following 5 elements.

  • Flux of entropy, responsible for dissipation;
  • Antagonistic forces, leading to emergent patterns;
  • Standardization, mandatory for temporal persistence on the level of basic mechanisms as well as for selection processes;
  • Compartmentalization, together with left-overs leading to spatio-temporal persistence as selection;
  • Self-referential hypercycles, leading to sustained 2nd order complexity with regard to the relation of the whole to its parts.

Any setup for which we can identify this set of elements leads to probabilistic patterns that are organized on several levels. In other words, these conditioning elements are necessary and sufficient to “explain” complexity. In behavior, the sequence of patterns and the sequence of simpler elements within patterns are far from randomly arranged; yet it becomes more and more difficult to predict a particular pattern the higher its position in the stack of nested patterns, that is, its level of integration. Almost the same could be said about the observable changes in complex systems.

Dealing with behavior is thus a non-trivial task. There are no “laws” that would somehow be mapped into the animal such that an a priori defined mathematical form would suffice for a description of the pattern, or of the animal as a whole. In the behavioral sciences, one first has to fix a catalog of behavioral elements, and only by reference to this catalog can we start to observe in a way that allows for comparisons with other observations. I deliberately avoid the concept of “reproducibility” here. How can we know about that catalog, often called a behavioral taxonomy? The answer is that we can’t know in the beginning. Reducing observation completely to the physical level is not a viable alternative either. The observation of a particular species, and often even of a particular social group or individual, improves over time, yet we can’t speak about that improvement. There is a certain notion of “individual” culture here that develops between the “human” observer and the behaving system, the animal. The written part of this culture precipitates in the said catalog, but there remains a large part of the habit of observing that can’t be described without performing it. Observations of animals are never reproducible in the same sense as observations of physical entities are, the ultimate reason being that the latter are devoid of individuality.

A behavioral scientist may work on quite different levels. She could investigate some characteristics of behavior in relation to the level of energy consumption, or to differential reproductive success. On this level, one would hardly go into the details of the form of behavior. Quite different from this case are those investigations that address the level of the form of the behavior. The form becomes an important target of the investigation if the scientist is interested in the differential social dynamics of animals belonging to different groups, populations or species. In physics, there is no form other than the mathematical. Electrons are (treated in) the same (way) by physicists all over the world, even across the whole universe. Try this with cats… You will lose the cat-ness.

It is quite clear that social dynamics can’t be addressed by means of mere frequencies of certain simple behavioral elements, such as scratching, running or even sniffing at other animals. There might be differences, but we won’t understand much of the animal that way, particularly not with regard to the flow of information in which the animal engages.

The big question that arose during the 1970s and 1980s was how to address behavior, its structure and its patterning, while avoiding a physicalist reduction.

Some intriguing answers have been given in the respective discourse since the beginning of the 1950s, though only a few people recognized the importance of the form. For instance, to understand wolves, Moran and Fentress [2] used the concept of choreography to get a descriptive grip on the quite complicated patterns. Colmenares, in his work on baboons, most interestingly introduced the notion of play to describe the behavior in a group of baboons. He distinguished more than 80 types of social games as arrangements of “moves” that span space and time in a complicated way; this behavioral wealth made it all but impossible to analyze the data at that time. The notion of the social game is so interesting because it is quite close to the concept of the language game.

Doing science means translating observations into numbers. Unfortunately, in the behavioral sciences this translation is rather difficult and in itself only weakly standardized (so far) despite many attempts, precisely because behavior is the observable output of a deeply integrated complex system, for instance the animal. Whenever we investigate behavior, we have to select carefully the appropriate level to investigate. Yet, in order to understand the animal, we cannot even reduce the animal to a certain level of integration. We should map the fact of integration itself.

There is a dominant methodological aspect in the description of behavior that differs from those in sciences closer to physics. In the behavioral sciences one can invent new methods by inventing new purposes, something that is not possible in classical physics or engineering, at least if matter is not taken as something that behaves. Anyway, any method for creating formal descriptions invokes mathematics.

Here it becomes difficult, because mathematics does not provide us with any means to deal with emergence. We can’t, of course, blame mathematics for that. It is not possible in principle to map emergence onto an a priori defined set of symbols and operations.

The only way to approximate an appropriate approach is by a probabilistic methodology that also provides the means to distinguish various levels of integration. The first half of this program is easy to accomplish, the second less so. Since emergence is a creative process, it induces the necessity for interpretation as a constructive principle. Precisely this has been digested by behavioral science into the practice of the behavioral catalog.

1. This Essay

Well, in this essay I am not mainly interested in the behavior of animals or in the sciences dealing with it. The intention so far was just to give an illustration of the problematic field that is provoked by the “fact” of the animals and their “behavior”. The most salient issue in this problematic field is irreducibility, which in turn is caused by complexity and the patterning resulting from it. The second important part of this field is given by the methodological answers to these concerns, namely the structured probabilistic approach, which responds appropriately to the serial characteristics of the patterns, that is, to the transitional consistency of the observed entity as well as of the observational recordings.

The first of these issues, irreducibility, we need not discuss in detail here. We did this before, in a previous essay and in several other places. We just have to remember that empiricist reduction means attempting a sufficient description by dissecting the entity into its parts, thereby neglecting the circumstances, the dependency on the context and the embedding into the fabric of relations that is established by other instances. In physics, there is no such fabric; there are just anonymous fields. In physics, there is no dependency on the context, hence form is not a topic in physics. As soon as form becomes an issue, we leave physics, entering either chemistry or biology. As said, we won’t go into further details about that. Here, we will deal mainly with the second part, though with regard to two quite different use cases.

We will approach these cases, the empirical treatment of “observations” in computational linguistics and in urbanism, first from the methodological perspective, as both share certain conditions with the “analysis” of animal behavior. In chapter 8 we will give more detailed reasons for this alignment, which at first sight may seem, well, a bit adventurous. The comparative approach, through its methodological arguments, will lead us to emphasize what we call the “behavioral turn”. The text and the city are regarded as behaving entities, rather than the humans dealing with them.

2. The Inversion

Given the two main conceptual landmarks mentioned above, irreducibility and the structured probabilistic approach, that establish the problematic field of behavior, we can now do something exciting. We take the concept and its conditions, detach it from its biological origins and apply it to other entities where we meet the same or rather similar conditions. In other words, we practice a differential as Deleuze understood it [3]. So, we have to spend a few moments on these conditions.

Slightly re-arranged and a bit more abstract than it is the case in behavioral sciences, these conditions are:

  • There are patterns that appear in various forms, although they are made from the same elements.
  • The elements that contribute to the patterns are structurally different.
  • The elements are not all plainly visible; some, most or even the most important are only implied.
  • Patterns are arranged in patterns, implying that patterns are also elements, despite the fact that there is no fixed form for them.
  • The arrangement of elements and patterns into other patterns is dependent on the context, which in turn can be described only in probabilistic terms.
  • Patterns can be classified into types or families; the classification, however, is itself non-trivial, that is, it is not supported.
  • The context is given by variable internal and external influences, which imply a certain persistence of the embedding of the observed entity into its spatial, temporal and relational neighborhood.
  • There is a significant symbolic “dimension” in the observation, meaning that the patterns we observe occur in sequence space upon an alphabet of primitives, not just in numerical space. This symbolic account is invoked by the complexity of the entity itself. Actually, the difference between symbolic and numerical sequences and patterns is much less than categorical, as we will see. Yet, it makes a large difference whether we include or exclude the methodological possibility of symbolic elements in the observation.

Whenever we meet these conditions, we can infer the presence of the above-mentioned problematic field, which is mainly given by irreducibility and, as its match in the methodological domain, the practice of a structured probabilistic approach. This list provides us with an extensional circumscription of abstract behavior.

A slightly different route into this problematic field draws on the concept of complexity. Complexity, as we understand it by means of the 5 elements provided above (for details see the full essay on this subject), can itself be inferred by checking for the presence of the constitutive elements. Once we see antagonisms, compartments, standardization we can expect emergence and sustained complexity, which in turn means that the entity is not reducible and in turn, that a particular methodological approach must be chosen.

We can also clearly state what should not be regarded as a member of this field. The most salient one is the neglect of individuality. The second one, now in the methodological domain, is the destruction of relationality, which is most easily accomplished by referring to raw frequency statistics. It should be obvious that destroying the serial context in an early step of the methodological mapping from observation to number also destroys any possibility of understanding the particularity of the observed entity. The resulting picture will not only be coarse; most probably it will also be utterly wrong, and even worse, there is no chance to recognize this departure into an area that is free from any sense.

3. The Targets

At the time of writing this essay, there are three domains that suffer most from the reductionist approach. Well, two and a half, maybe, as the third, genetics, is on its way to overcoming the naïve physicalism of former days.

This does not hold for the other two areas, urbanism and computational linguistics, at least as far as the latter is relevant for text mining and information retrieval¹. The dynamics in the respective communities are of course quite complicated, actually too complicated to achieve a well-balanced point of view here in this short essay. Hence, I ask to be excused for the inevitable coarseness of treating those domains as if they were homogeneous. Yet, I think that in both areas the mainstream is seriously suffering from a misunderstood scientism. In some way, people there strangely enough behave in a more positivist manner than researchers in the natural sciences.

In other words, we pursue the question of how to improve the methodology in those two fields, urbanism and the computerized treatment of textual data. It is clear that the question about methodology implies a particular theoretical shift. This shift we would like to call the “behavioral turn”. Among other changes, the “behavioral turn” as we construct it allows for overcoming the positivist separation between the observer and the observed without sacrificing the possibility of reasonable empirical modeling.²

Before we argue in a more elaborate manner about this proposed turn in relation to textual data and urbanism, we would first like to accomplish two things. First, we briefly introduce two methodological concepts that deliberately try to cover the context of events, where those events are conceived as part of a series that always also develops into a kind of network of relations. Thus, we avoid conceiving of events as a series of separated points.

Secondly, we will discuss the current mainstream methodology in the two fields that we are going to focus on here. I think that investigating the assumptions of these approaches, which often remain hidden, sheds some light on the arguments that support the reasonability of the “behavioral turn”.

4. Methodology

The big question remaining to deal with is thus: how to deal with the observations that we can make in and about our targets, the text or the city?

There is a clear starting point for selecting any method that could be considered appropriate. The method should inherently respond to the seriality of the basic signal. A well-known method of choice for symbolic sequences is the Markov chain; other important ones are random contexts and random graphs. In the domain of numerical sequences, wavelets are the most powerful way to represent various aspects of a signal at once.

Markov Processes

A Markov chain is the outcome of applying the theory of Markov processes to a symbolic sequence. A Markov process is a neat description of the transitional order in a sequence. We may also say that it describes the conditional probabilities for the transitions between any subset of elements. Well, in this generality it is difficult to apply. Let us thus start with the simplest form, the Markov process of 1st order.

A 1st order Markov process describes all, and only, the pairwise transitions that are possible for a given “alphabet” of discrete entries (symbols). These transitions can be arranged in a so-called transition matrix if we follow the convention of using the preceding part of the transitional pair as the row header and the succeeding part as the column header. If a certain transition occurs, we enter a tick into the respective cell, given by the address row x column, which derives from the pair prec -> succ. That’s all. At least for the moment.
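The construction of such a table can be sketched in a few lines. This is only a minimal illustration: the function name `transition_matrix` and the toy record of behavioral elements (s = sniff, r = run, g = groom) are made up for this example, and instead of a mere tick the sketch counts how often each transition occurs.

```python
def transition_matrix(sequence):
    """Count all pairwise transitions prec -> succ in a symbolic sequence."""
    alphabet = sorted(set(sequence))
    index = {symbol: i for i, symbol in enumerate(alphabet)}
    # rows = preceding symbol, columns = succeeding symbol
    matrix = [[0] * len(alphabet) for _ in alphabet]
    for prec, succ in zip(sequence, sequence[1:]):
        matrix[index[prec]][index[succ]] += 1
    return alphabet, matrix


# hypothetical observational record: s = sniff, r = run, g = groom
alphabet, counts = transition_matrix(list("srsgsrrsg"))
for symbol, row in zip(alphabet, counts):
    print(symbol, row)
```

Reading the row for `s` tells us what follows sniffing in this toy record; the order of rows and columns is just the sorted alphabet.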

Such a table captures in some sense the transitional structure of the observed sequence. Of course, it captures only a simple aspect, since the next pair does not know anything about the previous pair. A 1st order Markov process is thus said to have no memory. Yet, it would be a drastic misunderstanding to generalize the absence of memory to any kind of Markov process. Actually, Markov processes can precisely be used to investigate the “memories” in a sequence, as we will see in a moment.

Anyway, on any such transition table we can do smart statistics, for instance to identify transitions that are salient for their “exceptionally” high or low frequency. Such reasoning takes into account the marginal frequencies of the table and is akin to correspondence analysis. Van Hooff developed this “adjusted residual method” and has applied it with great success in the analysis of observational data on chimpanzees [4][5].

These residuals are residuals against a null-model, which in this case is the plain distribution. In other words, the reasoning is the same as always in statistics: it aims at establishing a suitable ratio of observed/expected, and then at determining the reliability of a certain selection that is based on that ratio. In the case of transition matrices the null-model states that all transitions occur with the same frequency. This is, of course, a simplification, but it is also simple to calculate. There are some assumptions in that whole procedure that are worth mentioning.

The most important assumption of the null-model is that all elements that are being used to set up the transitional matrix are independent from each other, except their 1st order dependency, of course. This also means that the null-model assumes equal weights for the elements of the sequence. It is quite obvious that we should assume so only in the beginning of the analysis. The third important assumption is that the process is stationary, meaning the kind and the strength of the 1st order dependencies do not change for the entire observed sequence.
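To make the observed/expected reasoning tangible, here is a hedged sketch of adjusted residuals on a table of transition counts. It assumes the usual textbook form of the adjusted residual, in which the expected counts are derived from the row and column totals (the marginal frequencies); for the plain null-model of equal frequencies one would instead set every expected count to the total divided by the number of cells. The function name and the toy 2x2 table are illustrative only.

```python
import numpy as np


def adjusted_residuals(counts):
    """Adjusted residuals of a count table against expectations from its marginals."""
    counts = np.asarray(counts, dtype=float)
    total = counts.sum()
    row = counts.sum(axis=1, keepdims=True)
    col = counts.sum(axis=0, keepdims=True)
    expected = row * col / total
    # variance term that corrects for the size of the marginals
    variance = expected * (1 - row / total) * (1 - col / total)
    with np.errstate(divide="ignore", invalid="ignore"):
        residuals = (counts - expected) / np.sqrt(variance)
    # cells without any variance carry no evidence; report them as 0
    return np.nan_to_num(residuals)


# toy transition counts for an alphabet of two elements
example = adjusted_residuals([[10, 2], [3, 9]])
print(np.round(example, 2))
```

Values far beyond roughly ±2 mark transitions that occur more (or less) often than the null-model would suggest; where to put the threshold remains a matter of judgment.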

Yet, nothing enforces us to stick to just the 1st order Markov processes, or to apply it globally. A 2nd order Markov process could be formulated which would map all transitions x(i)->x(i+2). We may also formulate a dense process for all orders >1, just by overlaying all orders from 1 to n into a single transitional matrix.

Proceeding this way, we end up with an ensemble of transitional models. Such an ensemble is suitable for the comparatist probabilistic investigation of the memory structure of a symbolic sequence that is being produced by a complex system. Matrices can be compared (“differenced”) regarding their density structure, revealing even spurious ties between elements across several steps in the sequence. Provided the observed sequence is long enough, single transition matrices as well as ensembles thereof can be resampled on parts of sequences in order to partition the global sequence, that is, to identify locally stable parts of the overall process.
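A small sketch may illustrate such an ensemble. It assumes, as in the text, that the model of order k simply maps the transitions x(i)->x(i+k); the names `lagged_transitions` and `ensemble`, and the toy sequence, are hypothetical and only serve the illustration.

```python
import numpy as np


def lagged_transitions(sequence, lag, alphabet):
    """Count transitions x(i) -> x(i+lag) for a fixed alphabet."""
    index = {symbol: i for i, symbol in enumerate(alphabet)}
    matrix = np.zeros((len(alphabet), len(alphabet)))
    for prec, succ in zip(sequence, sequence[lag:]):
        matrix[index[prec], index[succ]] += 1
    return matrix


def ensemble(sequence, max_lag):
    """One transition matrix per lag from 1 to max_lag."""
    alphabet = sorted(set(sequence))
    models = {lag: lagged_transitions(sequence, lag, alphabet)
              for lag in range(1, max_lag + 1)}
    return alphabet, models


alphabet, models = ensemble("abcabcabc", max_lag=3)
# the "dense" process: all orders overlaid into a single matrix
overlay = sum(models.values())
# "differencing" two members of the ensemble after normalizing them
difference = models[1] / models[1].sum() - models[3] / models[3].sum()
```

In this toy sequence the lag-3 matrix is purely diagonal, which is exactly the kind of “memory” that the 1st order matrix alone cannot show.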

Here you may well think that this sounds like a complicated “work-around” for a Hidden Markov Model (HMM). Yet, although an HMM is more general than the transition matrix perspective in some respects, it is also less rich. In an HMM, the multiplicity is, well, hidden. It reduces the potential complexity of sequential data to a single model, again with the claim of global validity. Thus, HMMs are somehow more suitable the closer we are to physics, e.g. in speech recognition. But even there their limitation is quite obvious.

From the domain of ecology we can import another trick for dealing with the transitional structure. In ecosystems we can observe so-called succession. Certain arrangements of species and their abundances follow each other rather regularly, yet probabilistically, often heading towards some stable final “state”. Given a limited observation of such transitions, how can we know about the final state? Using the transitional matrix, the answer can be found simply by a two-fold operation: multiplying the matrix with itself, with intermittent filtering by renormalization. This procedure acts as a frequency-independent filter. It helps to avoid type-II errors when applying the adjusted residuals method, that is, transitions with a weak probability are less likely to be dismissed as irrelevant.
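The two-fold operation can be sketched as follows. This is one possible reading of the procedure, not a definitive implementation: rows are renormalized to sum to 1, the matrix is repeatedly multiplied with itself, and the iteration stops once nothing changes any more. It assumes that every state can eventually be reached from every other one; the successional counts are invented.

```python
import numpy as np


def row_normalize(matrix):
    """Rescale every row to sum to 1 (rows without any count stay zero)."""
    matrix = np.asarray(matrix, dtype=float)
    sums = matrix.sum(axis=1, keepdims=True)
    sums[sums == 0] = 1.0
    return matrix / sums


def final_state(counts, tol=1e-10, max_iter=100):
    """Square the transition matrix repeatedly, renormalizing after each step."""
    p = row_normalize(counts)
    for _ in range(max_iter):
        nxt = row_normalize(p @ p)
        if np.abs(nxt - p).max() < tol:
            return nxt
        p = nxt
    return p


# invented counts of transitions between three successional stages
limit = final_state([[8, 2, 0], [1, 6, 3], [0, 1, 9]])
print(np.round(limit, 3))
```

Once the iteration has settled, all rows are identical: whatever the starting stage, the process heads towards the same distribution over stages.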

Contexts

The method of Markov processes is powerful, but it suffers from a serious problem. This problem is introduced by the necessity to symbolize certain qualities of the signal before it is used in modeling.

We can’t use Markov processes directly on the raw textual data. Doing so would trap us in the symbolistic fallacy. We would either ascribe a meaning to the symbol itself, which would result in a violation of the primacy of interpretation, or we would conflate the appearance of a symbol with its relevance, which would constitute a methodological mistake.

The way out of this situation is provided by a consistent probabilization. Generally, we may well say that probabilization takes the same role for the quantitative sciences as the linguistic turn did for philosophy. Yet, it is still an attitude that is largely neglected as a dedicated technique in almost every science. (For an example application of probabilization with regard to evolutionary theory, see this.)

Instead of taking symbols as if they were simply found “out there”, we treat them as the outcome of an abstract experiment, that is, as random variables. Random variables do not establish themselves as binary concepts, as 1 or 0, to be or not to be; they establish themselves as a probability distribution. Such a distribution potentially contains an infinite number of discretizations. Hence, probabilistic methods are always more general than those which rely on “given” symbols.

Kohonen et al. proposed a simple way to establish a random context [6]. The step from symbolic crispness to a numerical representation is not trivial, though. We need a double-articulated entity that is “at home” in both domains. This entity is a high-dimensional random fingerprint. Such a fingerprint consists simply of a large number, well above 100, of random values from the interval [0..1]. According to the lemma of Hecht-Nielsen [7], any two such vectors are approximately orthogonal to each other, provided the values are centered around zero (uncentered values from [0..1] all share a common positive component). In other words, a fingerprint is a name expressed by numbers.
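The near-orthogonality is easy to check numerically. The following sketch draws fingerprints of 1000 values and compares the cosine between two raw fingerprints with the cosine between two centered ones; the dimension, the seed and the function names are arbitrary choices for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)


def fingerprint(dim=1000, centered=True):
    """A random fingerprint: uniform values, optionally shifted to be centered on 0."""
    values = rng.random(dim)  # uniform in [0, 1)
    return values - 0.5 if centered else values


def cosine(a, b):
    """Cosine of the angle between two vectors (0 means orthogonal)."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


raw = cosine(fingerprint(centered=False), fingerprint(centered=False))
centered = cosine(fingerprint(), fingerprint())
print(round(raw, 2), round(centered, 2))
```

Raw fingerprints share their common mean and are therefore far from orthogonal, while the centered ones come out close to a cosine of zero; the higher the dimension, the closer.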

After recoding all symbols in a text into their random fingerprints, it is easy to establish probabilistic distributions of the neighborhood of any word. The result is a random context, also called a random graph. The basic trick to accomplish such a distribution is to select a certain, fixed size for the neighborhood, say five or seven positions in total, and then always to put the word of interest at a certain position, for instance the middle position.

We apply this procedure to all words in a text, or in any symbolic series. Doing so, we get a collection of overlapping random contexts. The final step then is a clustering of the vectors according to their similarity.
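A minimal sketch of such fixed-size random contexts might look as follows. It is a simplification under stated assumptions, not the WebSom implementation: the window has three positions with the word of interest in the middle, the fingerprints of the three positions are concatenated, and all contexts of the same word are averaged. The final clustering step (a SOM in Kohonen's case) is left out; the function names and the toy text are invented.

```python
import numpy as np
from collections import defaultdict


def random_contexts(tokens, dim=500, half_window=1, seed=0):
    """Average context vector per word; the word of interest sits in the middle."""
    rng = np.random.default_rng(seed)
    # one centered random fingerprint per distinct symbol
    prints = {word: rng.random(dim) - 0.5 for word in sorted(set(tokens))}
    width = dim * (2 * half_window + 1)
    sums = defaultdict(lambda: np.zeros(width))
    counts = defaultdict(int)
    for i in range(half_window, len(tokens) - half_window):
        window = tokens[i - half_window:i + half_window + 1]
        sums[tokens[i]] += np.concatenate([prints[word] for word in window])
        counts[tokens[i]] += 1
    return {word: sums[word] / counts[word] for word in sums}


def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


tokens = "the cat sleeps the dog sleeps the cat eats the dog eats".split()
contexts = random_contexts(tokens)
similar = cosine(contexts["cat"], contexts["dog"])
dissimilar = cosine(contexts["cat"], contexts["sleeps"])
```

Words that occur in the same neighborhoods (here “cat” and “dog”) end up with similar context vectors, even though their own fingerprints are unrelated.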

It is quite obvious that this procedure, as it has been proposed by Kohonen, sticks to strong assumptions, despite its turn to probabilization. The problem is the fixed order, that is, the order is independent of the context in his implementation. Thus his approach is still limited in the same way as the n-gram approach (see chp. 5.3 below). Yet, sometimes we meet strong inversions and extensions of relevant dependencies between words. Linguists speak of injected islands with regard to wh*-phrases. Anaphors are another example. Chomsky criticized the approach of fixed-size contexts very early.

Yet, there is no necessity to limit the methodology to fixed-size contexts, or to symmetrical instances of probabilistic contexts. Yes, of course this will result in a situation where we corrupt the tabularity of the data representation. Many rows will differ in length, and there is (absolutely) no justification for enforcing a proper table by filling “missing values” into the “missing” cells of the table.

Fortunately, there is another (probabilistic) technique that could be used to arrive at a proper table without distorting the content by adding missing values. This technique is random projection, first identified by Johnson & Lindenstrauss (1984), which in the case of free-sized contexts has to be applied in an adaptive manner (see [8] or [9] for a more recent overview). Usually, a source (n*p) matrix (n=rows, p=columns=dimensions) is multiplied by a (p*k) random matrix whose entries follow a Gaussian distribution, resulting in a target matrix of only k dimensions and n rows. This way a matrix of 10000+ columns can be projected into one made of only 100 columns without losing much information. Yet, using the lemma of Hecht-Nielsen we can compress any of the rows of a matrix individually. Since the random vectors are approximately orthogonal to each other, we won’t introduce any artificial information across the data vectors that are going to be fed into the SOM. This stepwise operation becomes quite important for large amounts of documents, since in this case we have to adopt incremental learning.
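The standard (non-adaptive) case can be sketched directly with a matrix product. The sizes follow the figures mentioned above (10000 columns projected onto 100); the scaling of the Gaussian entries by 1/sqrt(k), the function name and the random test data are assumptions made for this illustration.

```python
import numpy as np


def random_projection(data, k, seed=0):
    """Project an (n*p) matrix onto k dimensions with a Gaussian random matrix."""
    rng = np.random.default_rng(seed)
    n, p = data.shape
    # scaling by 1/sqrt(k) keeps distances at roughly their original size
    projector = rng.normal(0.0, 1.0 / np.sqrt(k), size=(p, k))
    return data @ projector


rng = np.random.default_rng(1)
source = rng.random((50, 10000))  # n = 50 rows, p = 10000 columns
target = random_projection(source, k=100)

before = np.linalg.norm(source[0] - source[1])
after = np.linalg.norm(target[0] - target[1])
print(target.shape, round(after / before, 2))
```

The ratio of a pairwise distance after and before the projection stays close to 1, which is the sense in which not much information is lost.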

In this way, we slowly but steadily approach the generalized probabilistic context that we described earlier. The proposal is simply that in dealing with texts by means of computers we have to apply precisely the most general notion of context, one that is devoid of structural preoccupations as we meet them e.g. in the case of n-grams or Markov processes.

5. Computers Dealing with Text

Currently, so-called “text mining” is a hot topic. More and more of human communication is supported by digital media and technologies, hence more and more texts are accessible to computers without much effort. People try to use textual data from digital environments, for instance, to do sentiment analysis about companies, stocks, or persons, mainly in the context of marketing. The craziness there is that they pretend to classify a text’s sentiment without understanding it, more or less on the basis of the frequency of scattered symbols.

The label “text mining” is reminiscent of “data mining”; yet, the structures of the two endeavors are drastically different. In data mining one is always interested in the relevant variables in order to build a sparse model that could even be understood by human clients. The model in turn is then used to optimize some kind of process from which the data for modeling has been extracted.

In the following we will describe some techniques, methods and attitudes that are highly unsuitable for the treatment of textual “data”, despite the fact that they are widely used.

Fault 1 : Objectivation

The most important difference between the two flavors of “digital mining”, however, concerns the status of the “data”. In data mining, one deals with measurements that are arranged in a table. This tabular form is only possible on the basis of a preceding symbolization, which additionally is strictly standardized in advance of the measurement.

In text mining this is not possible. There are no “explanatory” variables that could be weighted. Text mining thus just means to find a reasonable selection of text in response to a “query”. For textual data it is not possible to give any criterion how to look at a text, how to select a suitable reference corpus for determining any property of the text, or simply to compare it to other texts before its interpretation. There are no symbols, no criteria that could be filled into a table. And most significant, there is no target that could be found “in the data”.

It is devoid of any sense to try to optimize a selection procedure by means of a precision/recall ratio. This would mean that the meaning of a text could be determined objectively before any interpretation, or, likewise, that the interpretation of a text could be standardized up to a formula. Neither is possible; claiming otherwise is ridiculous.

People responded to these facts with a fierce endeavor, which ironically is called “ontology”, or even “semantic web”. Yet, neither will the web ever become “semantic”, nor is database-based “ontology” a reasonable strategy (except for extremely standardized tasks). The idea in both cases is to determine the meaning of an entity before its actual interpretation. This of course is utter nonsense, and the fact that it is nonsense is also the reason why the so-called “semantic web” never started to work. These guys should really do more philosophy.

Fault 2 : Thinking in Frequencies

Popular measures for describing the difference between texts are variants of the so-called tf-idf measure. “tf” means “term frequency” and describes the normalized frequency of a term within a document. “idf” means “inverse document frequency”, which refers, inversely, to the share of documents in a corpus that contain the word: the more documents a word occurs in, the lower its idf.

The frequency of a term, however differentiated that frequency may be, can hardly be taken as the relevance of that term given a particular query. To cite the example from the respective entry in Wikipedia, what is “relevant” for selecting a document by means of the query “the brown cow”? Sticking to terms makes sense if and only if we accept an a priori contract about the strict limitation to the level of the terms. Yet, this has nothing to do with meaning. Absolutely nothing. It is comparing pure graphemes, not even symbols.

Even if it would be related to meaning it would be the wrong method. Simply think about a text that contains three chapters: chapter one about brown dogs, chapter two about the relation of (lilac) cows and chocolate, chapter three about black & white cows. There is no phrase about a brown cow in the whole document, yet, it would certainly be selected as highly significant by the search engine.
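The effect is easy to reproduce with a bare-bones tf-idf scoring. The sketch below assumes the simplest textbook variant (tf as relative frequency, idf as the logarithm of the number of documents divided by the number of documents containing the term); real search engines use more elaborate variants. The three toy documents are invented, and the first one condenses the three hypothetical chapters into one line.

```python
import math
from collections import Counter


def tf_idf_scores(query, documents):
    """Sum of tf * idf over the query terms, for every document."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            df = sum(1 for other in tokenized if term in other)
            if df == 0:
                continue  # term unknown to the corpus
            idf = math.log(n_docs / df)
            score += (tf[term] / len(tokens)) * idf
        scores.append(score)
    return scores


documents = [
    "the brown dog barks the lilac cow likes chocolate the black and white cow grazes",
    "the weather is mild and the sky is clear",
    "the train to the city leaves at noon",
]
scores = tf_idf_scores("the brown cow", documents)
print([round(s, 3) for s in scores])
```

The first document comes out on top for the query “the brown cow”, although it does not contain a single brown cow; the scoring only sees that the graphemes “brown” and “cow” occur somewhere in it.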

This example nicely highlights another issue. The above-mentioned hypothetical text could nevertheless be highly relevant, yet only in the moment the user sees it, when it triggers some idea that was not even on the radar before. Quite obviously, even though the search would probably have been a different one, the fact remains that the meaning is neither in the ontology nor in the frequency, and also not in the text as such, before the actual interpretation by the user. The issue becomes more serious if we consider slightly different colors that could still count as “brown”, yet with a completely different spelling. And even more so if we take anaphoric arrangements into account.

The above mentioned method of Markov processes helps a bit, but not completely of course.

Astonishingly, even the inventors of the WebSom [6], probably the best model for dealing with textual data so far, commit the frequency fallacy. As input for the second level SOM they propose a frequency histogram. Completely unnecessary, I have to add, since the text “within” the primary SOM can be mapped easily to a Markov process, or to probabilistic contexts, of course. Interestingly, any such processing that brings us from the first to the second layer is somewhat more reminiscent of image analysis than of text analysis. We mentioned that already earlier in the essay “Waves, Words and Images”.

Fault 3 : The Symbolistic Fallacy (n-grams & co.)

Another really popular methodology to deal with texts is n-grams. N-grams are related to Markov processes, as they also take the sequential order into account. Take for instance (again the example from Wiki) the sequence “to be or not to be”. The transformation into a 2-gram (or bi-gram) looks like this: “to be, be or, or not, not to, to be” (items are separated by commas), while the 3-gram transformation produces “to be or, be or not, or not to, not to be”. In this way, the n-gram can be conceived as a small extract from a transition table of order (n-1). N-grams share a particular weakness with simple Markov models, which is the failure to capture long-range dependencies in language. These can be addressed only by means of deep grammatical structures. We will return to this point later in the discussion of the next fault No.4 (Structure as Meaning).
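As a minimal sketch of this transformation, the following lines produce the 2-grams and 3-grams of the example sentence and the corresponding first-order transition table. The function names `ngrams` and `transition_table` are chosen here for illustration only; they do not refer to any particular library.

```python
from collections import Counter, defaultdict

def ngrams(tokens, n):
    """Return all contiguous n-token windows, in order of appearance."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def transition_table(tokens, order=1):
    """Count which token follows each context of `order` preceding tokens."""
    table = defaultdict(Counter)
    for gram in ngrams(tokens, order + 1):
        context, successor = gram[:-1], gram[-1]
        table[context][successor] += 1
    return table

tokens = "to be or not to be".split()
bigrams = ngrams(tokens, 2)
trigrams = ngrams(tokens, 3)
table = transition_table(tokens, order=1)
print(bigrams)
print(trigrams)
print(dict(table))
```

The table makes the relation to Markov processes explicit: the 2-grams are nothing but the non-empty cells of a first-order transition table, here with “to” followed twice by “be”.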

The strange thing is that people drop the tabular representation, thus destroying the possibility of calculating things like adjusted residuals. Actually, n-grams are mostly just counted, which is committing the first fault of thinking in frequencies, as described above.
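What is lost by merely counting can be illustrated with a small sketch. It assumes the standard definition of adjusted residuals for contingency tables, (observed − expected) / sqrt(expected · (1 − row share) · (1 − column share)), applied to the table of 2-gram transitions; the function name is hypothetical, and the six-word sample is far too small for any serious inference.

```python
import math
from collections import Counter

def adjusted_residuals(pairs):
    """Adjusted residual for each (first, second) cell of the transition table."""
    observed = Counter(pairs)
    row = Counter(a for a, _ in pairs)   # how often each token opens a pair
    col = Counter(b for _, b in pairs)   # how often each token closes a pair
    n = len(pairs)
    result = {}
    for a in row:
        for b in col:
            expected = row[a] * col[b] / n
            variance = expected * (1 - row[a] / n) * (1 - col[b] / n)
            if variance > 0:  # skip degenerate cells (a row or column holding everything)
                result[(a, b)] = (observed[(a, b)] - expected) / math.sqrt(variance)
    return result

tokens = "to be or not to be".split()
pairs = list(zip(tokens, tokens[1:]))
res = adjusted_residuals(pairs)
for cell, value in sorted(res.items(), key=lambda item: -item[1]):
    print(cell, round(value, 2))
```

Unlike a raw count, the residual relates each observed transition to what the marginal frequencies alone would lead us to expect, and it also assigns a (negative) value to transitions that never occurred. Both pieces of information vanish once the tabular representation is dropped.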

N-grams help to build queries against databases that are robust against extensions of words, that is, prefixes, suffixes, or forms of verbs due to inflection. All this has, however, nothing to do with meaning. It is a basic and primitive means to make symbolic queries upon symbolic storages more robust. Nothing more.

The real problem is the starting point: taking the term as such. N-grams start with individual words that are taken blindly as symbols. Within the software doing n-grams, they are even replaced by some arbitrary hash code, i.e. the software does not see a “word”, it deals just with a chunk of bits.

This way, using n-grams for text search commits the symbolistic fallacy, similar to ontologies, but even on a more basic level. In turn this means that the symbols are taken as “meaningful” for themselves. This results in a hefty collision with the private-language-argument put forward by Wittgenstein a long time ago.

N-grams are certainly more advanced than the nonsense based on tf-idf. Their underlying intention is to reflect contexts. Nevertheless, they fail as well. The ultimate reason for the failure is the symbolistic starting point. N-grams are only a first, though far too trivial and simplistic step into probabilization.

There is already a generalisation of n-grams available as described in published papers by Kohonen & Kaski: random graphs, based on random contexts, as we described it above. Random graphs overcome the symbolistic fallacy, especially if used together with SOM. Well, honestly I have to say that random graphs imply the necessity of a classification device like the SOM. This should not be considered a drawback, since n-grams are often used together with Bayesian inference anyway. Bayesian methods are, however, not able to distil types from observations as SOM are able to do. That now is indeed a drawback, since in language learning the probabilistic approach necessarily must be accompanied by the concept of (linguistic) types.

Fault 4 : Structure as Meaning

The deep grammatical structure is an indispensable part of human languages. It is present from the sub-word level up to the level of rhetoric. And it’s gonna get really complicated. There is a wealth of rules, most of them to be followed rather strictly, but some of them are applied only in a loose manner. Yet, all of them are rules, not laws.

Two issues are coming up here that are related to each other. The first one concerns the learning of a language. How do we learn a language? Wittgenstein proposed, simply by getting shown how to use it.

The second issue concerns the status of the models about language. Wittgenstein repeatedly mentioned that there is no possibility for a meta-language, and after all we know that Carnap’s program of a scientific language failed (completely). Thus we should be careful when applying a formalism to language, whether it is some kind of grammar, or any of the advanced linguistic “rules” that we know of today (see the lexicon of linguistics for that). We have to be aware that these symbolistic models are only projective lists of observations, arranged according to some standard of a community of experts.

Linguistic models are drastically different from models in physics or any other natural science, because in linguistics there is no outer reference. (Computational) Linguistics is mostly at the stage of a Babylonian list science [10], doing more tokenizing than providing useful models, comparable to biology in the 18th century.

Language is a practice. Language is a practice of human beings, equipped with a brain and embedded in a culture. In turn language itself is contributing to cultural structures and is embedded into it. There are many spatial, temporal and relational layers and compartments to distinguish. Within such arrangements, meaning happens in the course of an ongoing interpretation, which in turn is always a social situation. See Robert Brandom’s Making it Explicit as an example for an investigation of this aspect.

What we definitely have to be aware of is that projecting language onto a formalism, or subordinating language to an a priori defined or standardized symbolism (like in formal semantics), loses essentially everything language is made from and referring to. Any kind of model of a language is implicitly also claiming that language can be detached from its practice and from its embedding without losing its main “characteristics”, its potential and its power. In short, it is the claim that structure conveys meaning.

This brings us to the question about the role of structure in language. It is a fact that humans not only understand sentences full of grammatical “mistakes”, and quite well so, but that in spoken language we also almost always produce sentences that are full of grammatical mistakes. In fact, “mistakes” are so abundant that it becomes questionable to take them as mistakes at all. Methodologically, linguistics is thus falling back into a control science, forgetting about the role and the nature of symbolic rules such as those established by grammar. The nature is an externalization, the role is to provide a standardization, a common basis, for performing interpretation of sentences and utterances in a reasonable time (almost immediately) and in a more or less stable manner. The empirical “given” of a sentence alone, even of a whole text alone, cannot provide enough evidence for starting with interpretation, let alone for finishing it. (Note that a sentence is never a “given”.)

Texts as well as spoken language are nothing that could be controlled. There is no outside of language that would justify that perspective. And finally, a model should allow for suitable prediction, that is, it should enable us to perform a decision. Here we meet Chomsky’s call for competence. In the case of language, a linguistic model should be able to produce language as a proof of concept. Yet, any attempt so far failed drastically, which actually is not really a surprise. Here at the latest it should become clear that the formal models of linguistics, and of course all the statistical approaches to “language processing” (another crap term from computational linguistics), are flawed in a fundamental way.

From the perspective of our interests here on the “Putnam Program” we conceive of formal properties as Putnam did in his “Meaning of “Meaning””. Formal properties are just that: properties among other properties. In our modeling essay we proposed to replace the concept of properties by the concept of the assignate, in order to emphasize the active role of the modeling instance in constructing and selecting the factors. Sometimes we use formal properties of terms and phrases, sometimes not, dependent on context, purpose or capability. There is neither a strict tie of formal assignates to the entity “word” or “sentence”, nor could we detach them as part of a formal approach.

Fault 5 : Grouping, Modeling and Selection

Analytic formal models are a strange thing, because such a model essentially claims that there is no necessity for a decision any more. Once the formula is there, it claims a global validity. The formula denies the necessity for taking the context as a structural element into account. It claims a perfect separation between observer and the observed. The global validity also means that the weights of the input factors are constant, or even that there are no such weights. Note that the weights translate directly into the implied costs of a choice, hence formulas also claim that the costs are globally constant, or at least arranged in a smooth differentiable space. This is of course far from any reality for almost any interesting context, and of course for the contexts of language and urbanism, both deeply related to the category of the “social”.

This basic characteristic hence limits the formal symbolic approach to physical, if not just to celestial and atomic contexts. Trivial contexts, so to speak. Everywhere else something rather different is necessary. This different thing is classification as we introduced it first in our essay about modeling.

Searching for a text and considering a particular one as a “match” to the interests expressed by the search is a selection, much like any other “decision”. It introduces a notion of irreversibility. Searching itself is a difficult operation, so difficult even that it is questionable whether we should follow this pattern at all. As soon as we start to search we enter the grammatological domain of “searching”. This means that we claim the expressibility of our interests in the search statement.

This difficulty is nicely illustrated by an episode with Garry Kasparov in the context of his first battle against “Deep Blue”. Given the billions of operations the super computer performed, a journalist came up with the question “How do you find the correct move so fast?” Obviously, the journalist was not aware of the mechanics of that comparison. Kasparov answered: “I do not search, I just find it.” His answer is not perfectly correct, though, as he should have said “I just do it”. In a conversation we mostly “just do language”. We practice it, but we very rarely search for a word, an expression, or the like. Usually, our concerns are on the strategic level, or in terms of speech act theory, on the illocutionary level.

Thus we arrive now at the intermediary result that we have some kind of non-analytical models on the one hand, and the performance of their application on the other. Our suggestion is that most of these models are situated on an abstract, orthoregulative level, and almost never on the representational level of the “arrangement” of words.

A model has a purpose, even if it is an abstract one. There are no models without purpose. The purpose is synonymous with the selection. Often, we do not explicitly formulate a purpose, we just perform selections in a consistent manner. It is this consistency in the selections that implies a purpose. The really important thing to understand is that the abstract notion of purpose is also synonymous with what we call “perspective”, or point of view.

One could mention here the analytical “models”, but those “models” are not models because they are devoid of a purpose. Given any interesting empirical situation, everybody knows that things may look quite different, just dependent on the “perspective” we take. Or in our words, on which abstract purpose we impose on the situation. The analytic approach denies such a “perspectivism”.

The strange thing now is that many people mistake the mere clustering of observations on the basis of all contributing or distinguished factors for a kind of model. Of course, that grouping will change radically if we withdraw some of the factors, keeping only a subset of all available ones. Not only does the grouping change, the achievable typology and any further generalization will also be very different. In fact, any purpose, and even the tuning of the attitude towards the risk (costs) of unsuitable decisions, changes the set of suitable factors. Nothing could highlight more clearly the nonsense of calling naïve take-it-all clustering “unsupervised modeling”. First, it is not a model. Second, any clustering algorithm or grouping procedure follows some optimality criterion, that is, it is supervised by that criterion despite claiming the opposite. “Unsupervised modeling” claims implicitly that it is possible to build a suitable model by pure analytic means, without any reference to the outside at all. This is, of course, not possible. It is this claim that introduces a contradiction to the practice itself, because clustering usually means classification, which is not an analytic move at all. Due to this self-contradiction the term “unsupervised modeling” is utter nonsense. It is not only nonsense, it is even deceiving, as people get vexed by the term itself: they indeed believe that they are modeling in a suitable manner.
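The dependence of any grouping on the selected factors can be shown with a deliberately tiny sketch. It is not a full clustering algorithm: it merely pairs each observation with its nearest neighbour under Euclidean distance, which already is an optimality criterion in the sense just described. The four observations and the factor names “x” and “y” are invented for illustration.

```python
def nearest_neighbour(observations, factors):
    """Map each observation to its closest other observation, using only `factors`."""
    def distance(p, q):
        # Euclidean distance restricted to the selected factors.
        return sum((p[f] - q[f]) ** 2 for f in factors) ** 0.5

    result = {}
    for name, values in observations.items():
        others = [o for o in observations if o != name]
        result[name] = min(others, key=lambda o: distance(values, observations[o]))
    return result

# Hypothetical observations described by two factors.
observations = {
    "A": {"x": 0, "y": 0},
    "B": {"x": 1, "y": 10},
    "C": {"x": 10, "y": 1},
    "D": {"x": 11, "y": 11},
}
by_x = nearest_neighbour(observations, ["x"])
by_y = nearest_neighbour(observations, ["y"])
print(by_x)
print(by_y)
```

Selecting only “x” groups A with B and C with D; selecting only “y” groups A with C and B with D. The data are identical in both runs. What differs is the selection of factors, and the procedure itself has no means of deciding which selection is the suitable one.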

Now back to the treatment of texts. One of the most advanced procedures—it is a non-analytical one—is the WebSom. We described it in more detail in previous essays (here and here). Yet, as the second step Kohonen proposes clustering as a suitable means to decide about the similarity of texts. He is committing exactly the same mistake as described before. The trick, of course, is to introduce (targeted) modeling to the comparison of texts, despite the fact that there are no possible criteria apriori. What seems to be irresolvable disappears, however, as a problem if we take into account the self-referential relations of discourses, which necessarily engrave into the interpreter as self-modifying structural learning and historical individuality.

6. The Statistics of Urban Environments

The Importance of Conceptual Backgrounds

There is no investigation without implied purpose, simply because any investigation has to perform many selections rather than just a few. One of the more influential selections that has to be performed concerns the scope of the investigation. We already met this issue above when we discussed the situation as we meet it in the behavioral sciences.

Considering investigations about social entities like urban environments, architecture or language, “scope” largely refers to the status of the individual, and in turn, to the status of time that we instantiate in our investigation. Both together establish the dimension of form as an element of the space of expressibility that we choose for the investigation.

Is the individual visible at all? I mean, in the question, in the method and after applying a methodology? For instance, as soon as we ask about matters of energy, individuals disappear. They also disappear if we apply statistics to raw observations, even if at first hand we would indeed observe individuals as individuals. To retain the visibility of individuals as individuals in a set of relations we have to apply proper means first. It is clear that any cumulative measure like those from socio-economics also causes the disappearance of the context and the individual.

If we keep the individuals alive in our method, the next question we have to ask concerns the relations between the individuals. Do we keep them or do we drop them? Finally, regarding the unfolding of the processes that result from the temporal dynamics of those relations, we have to select whether we want to keep aspects of form or not. If you think that the way a text unfolds or the way things are happening in the urban environment is at least as important as their presence, well, in this case you would have to care about patterns.

It is rather crucial to understand that these basic selections determine the outcome of an investigation as well as of any modeling or even theory building as grammatological constraints. Once we have taken a decision on the scope, the problematics of that choice become invisible, completely transparent. This is the actual reason for the fact that choosing a reductionist approach as the first step is so questionable.

In our earlier essay about the belief system in modernism we emphasized the inevitability of the selection of a particular metaphysical stance, way before we even think about the scope of an investigation in a particular domain. In the case of modernistic thinking, from positivism to existentialism, including any shape of materialism, the core of the belief system is metaphysical independence, shaping all the way down towards politics, methods, tools, attitudes and strategies. If you wonder whether there is an alternative to modernistic thinking, take a look at our article where we introduce the concept of the choreostemic space.

Space Syntax

In the case of “Space Syntax” the name is program. The approach is situated in urbanism; it has been developed and is still being advocated by Bill Hillier. Originally, Hillier was a geo-scientist, which is of some importance for following his methodology.

Put into a nutshell, the concept of space syntax claims that the description of the arrangement of free space in a built environment is necessary and sufficient for describing the quality of a city. The method of choice to describe that arrangement is statistics, either through the concept of probabilistic density of people or through the concept of regression, relating physical characteristics of free space with the density of people. Density in turn is used to capture the effect of collective velocity vectors. If people start to slow down, walking around in different directions, density increases. Density of course also increases as a consequence of narrow passages. Yet, in this case the vectors are strongly aligned.

The spatial behavior of individuals is a result and a means of social behavior in many animal species. Yet it makes a difference whether we consider the spatial behavior of individuals or the arrangement of free space in a city as a constraint of the individual spatial behavior. Hillier’s claim of “The Space is the Machine” is mistaking the one for the other.

In his writings, Hillier over and over again commits the figure of the petitio principii. He starts with the strong belief in analytics and upon that he tries to justify the use of analytical techniques. His claim of “The need for an analytic theory of architecture” ([11], p.40) is just one example. He writes

The answer proposed in this chapter is that once we accept that the object of architectural theory is the non-discursive — that is, the configurational — content of space and form in buildings and built environments, then theories can only be developed by learning to study buildings and built environments as non-discursive objects.

Excluding the discourse as a constitutional element, only the analytic remains. He drops any relational account, focusing just on the physical matter and postulating meaning of physical things, i.e. meaning as an a priori in the physical things. His problem is just his inability to distinguish different horizons of time, of temporal development. Dismissing time means to dismiss memory, and of course also culture. For a physicalist or ultra-modernist like him this blindness is constitutive. He never will understand the structure of his failure.

His dismissal of social issues as part of a theory serves eo ipso as his justification of the whole methodology. This is only possible due to another, albeit consistent, mistake, the conflation of theory and models. Hillier is showing us over and over again only models, yet not any small contribution to an architectural theory. Applying statistics shows us a particular theoretical stance, but is not to be taken as such! Statistics instantiates those models, that is his architectural theory is following largely the statistical theory. We repeatedly pointed to the problems that appear if we apply statistics to raw observations.

The high self-esteem Hillier expresses in his nevertheless quite limited writings is topped by treating space as syntax, in other words as a trivial machine. Undeniably, human beings have a material body, and buildings take space as material arrangements. Undeniably matter arranges space and constitutes space. There is a considerable discussion in philosophy about how we could approach the problematic field of space. We won’t go into details here, but Hillier simply drops the whole stuff.

Matter arranges in space. This quickly becomes a non-trivial insight if we change perspective from abstract matter and the correlated claim of the possibility of reductionism to spatio-temporal processes, where the relations are taken as a starting point. We directly enter the domain of self-organization.

By means of “Space Syntax” Hillier claimed to provide a tool for planning districts of a city, or certain urban environments. If he restricted his proposals to certain aspects of the anonymized flow of people and vehicles, it would be acceptable as a method. But it is certainly not a proper tool to describe the quality of urban environments, or even to plan them.

Recently, he delivered a keynote speech [12] where he apparently departed from his former Space Syntax approach, which reaches back to 1984. There he starts with the following remark.

On the face of it, cities as complex systems are made of (at least) two sub-systems: a physical sub-system, made up of buildings linked by streets, roads and infrastructure; and a human sub-system made up of movement, interaction and activity. As such, cities can be thought of as socio-technical systems. Any reasonable theory of urban complexity would need to link the social and technical sub-systems to each other.

This clearly is much less reductionist, at first sight at least, than “Space Syntax”. Yet, Hillier remains aligned to hard-core positivism. Firstly, in the whole speech he fails to provide a useful operationalization of complexity. Secondly, his Space Syntax simply appears wrapped in new paper. Agency for him is still just spatial agency. The relevant urban network for him is just the network of streets. Thirdly, it is bare nonsense to separate a physical and a human subsystem, and then to claim the lumping together of those as a socio-technical system. He obviously is unaware of more advanced and much more appropriate ways of thinking about culture, such as ANT, the Actor-Network-Theory (Bruno Latour), which precisely drops the categorical separation of physical and human. This separation was first criticized by Merleau-Ponty in the 1940s!

Hillier served us just as an example, but you may have got the point. Occasionally, one can meet attempts that at least try to integrate a more appropriate concept of culture and human being in urban environments. Think about Koolhaas and his AMO/OMA, for instance, despite the fact that Koolhaas himself also struggles with the modernist mindset (see our introductions into “JunkSpace” or “The Generic City”). Yet, he at least recognized that something is fundamentally problematic with that.

7. The Toolbox Perspective

Most of the interesting and relevant systems are complex. It is simply a methodological fault to use frequencies of observational elements to describe these systems, whether we are dealing with animals, texts, urban environments or people (dogs, cats) moving around in urban environments.

Tools provide filters, they respond to certain issues, both of the signal and of the embedding. Tools are artifacts for transformation. As such they establish the relationality between actors, things and processes. Tools produce and establish Heidegger’s “Gestell” as well as they constitute the world as a fabric of relations as facts and acts, as Wittgenstein emphasized so often and already in the beginning of the Tractatus.

What we would like to propose here is a more playful attitude towards the usage of tools, including formal methods. By “playful” we refer to Wittgenstein’s rule following, but also to a certain kind of experimentation, not induced by theory, but rather triggered by the know-how of some techniques that are going to be arranged. Tools as techniques, or techniques as tools, are used to distil symbols from the available signals. Their relevancy is determined only by the subsequent step of classification, which in turn is (ortho-)regulated by strategic goals or cultural habits. Never, however, should we take a particular method as a representative for the means to access meaning from a process, be it a text or an urban environment.

8. Behavior

In this concluding chapter we are going to try to provide more details about our move to apply the concept of behavior to urbanism and computational linguistics.

Text

Since Friedrich Schleiermacher in the 1830s, hermeneutics has emphasized a certain kind of autonomy of the text. Of course, the text itself is not a living thing as we consider it for animals. Before it “awakes” it has to be entered into mind matter, or more generally, it has to be interpreted. Nevertheless, an autonomy of the text remains, largely due to the fact that there is no Private Language. The language is not owned by the interpreting mind. Vilém Flusser proposed to radically turn the perspective and to conceive the interpreter as medium for texts and other “information”, rather than the other way round.

Additionally, the working of the brain is complex, to say the least. Our relation to our own brain and our own mind is more that of an observer than that of a user or even controller. We experience them. Both together, the externality of language and the (partial) autonomy of the brain-mind, lead to an arrangement where the text becomes autonomous. It inherits complementary parts of independence from both parts of the world, from the internal and the external.

Furthermore, human language is unlimited in its productivity. It is not only unlimited, it also is extensible. This pairs with its already mentioned deep structure, not only concerning the grammatical structure. Using language, or better, mastering language means to play with the inevitable inner contradictions that appear across the various layers, levels, aspects and processes of applied language. Within practiced language, there are many time horizons, instantiated by structural and semantic pointers. These aspects render the original series of symbols into an associative network of active components, which contributes further to the autonomy of texts. Roland Barthes notes (in [17]) that

The Plural of the Text depends … not on the ambiguity of its contents but on what might be called the stereographic plurality of its weave of signifiers (etymologically, the text is a tissue, a woven fabric). The reader of the Text may be compared to someone at a loose end.

Barthes implicitly emphasizes that the text does not convey a meaning, the meaning is not in the text, it can’t be conceived as something externalizable. In this essay he also holds that a text can’t be taken as just a single object. It is a text only in the context of other texts, and so the meaning that it develops upon interpretation is also dependent on the corpus into which it is embedded.

Methodologically, this (again) highlights the problematics that Alan Hájek called the reference class problem [13]. It is impossible for an interpreter to develop the meaning of a text outside of a previously chosen corpus. This dependency is inherited by any phrase, any sentence and any word within the text. Even a label like “IBM” that seems to be bijectively unique regarding the mapping of the grapheme to its implied meaning is dependent on that. Of course, it will always refer somehow to the company. Yet, without the larger context it is not clear in any sense to which aspect of that company and its history the label refers in a particular case. In literary theory this is called intertextuality. Furthermore, it is almost palpable here in this example that signs refer only to signs (the cornerstone of Peircean semiotics), and that concepts are nothing that could be defined (as we argued earlier in more detail).

We may settle here that a text as well as any part of it is established only through the selection of the embedding corpus, or likewise, a social practice, a life-form. Without such an embedding the text simply does not exist as a text. We just would find a series of graphemes. It is a hopeless exaggeration, if not self-deception, if people call the statistical treatment of texts “text mining”. Reading it in another way, it may even be considered a cynical term.

It is this dependence on local and global contexts, synchronically and diachronically, that renders the interpretation of a text similar to the interpretation of animal behavior.

Taken together, conceiving of texts as behaving systems is probably less a metaphor than it appears at first sight. Considering the way we make sense of a text, approaching a text is in many ways comparable with approaching an animal of a familiar species. We won’t know exactly what is going to happen; the course of events and action depends significantly on ourselves. The categories and ascribed properties necessary to establish an interaction are quite undefined in the beginning, and available only as types of rules, not as readily parameterized rules themselves. And like in animals, the next approach will never be a simple repetition of the former one, even if one knows the text quite well.

From the methodological perspective the significance of such a “behavioral turn”3 can’t be overestimated. For instance, nobody would interpret an animal on the basis of a rather short series of photographs and keep the conclusion thereof once and for all. Interacting with a text as if it behaved demands a completely different set of procedures. After all, one would deal with an open interaction. Such openness must be responded to with an appropriate attitude of willingness for open structural learning. This holds not only for human interpreters but for any interpreter, even if it is software. In other words, the software dealing with text must itself be active in a non-analytical manner in order to constitute what we call a “text”. Any kind of algorithm (in the definition of Knuth) does not deal with text, but just and blindly with a series of dead graphemes.

The Urban

For completely different material reasons cities can also be considered as autonomous entities. Their patterns of growth and differentiation look much more like those of ensembles of biological entities than those of minerals. Of course, this doesn’t justify the more or less naïve assignment of the “city as organism”. Urban arrangements are complex in the sense we’ve defined it, they are semiogenic and associative. There is a continuous contest between structure as regulation and automation on the one side and liquification as participation and symbolization on the other, albeit symbols may play for both parties.

Despite this autonomy, it remains a fact that without human activity cities are as little alive as texts are. This raises the particular question of the relationships between a city and its inhabitants, between the people as citizens and the city that they constitute. This topic has been the subject of innumerable essays, novels, and investigations. Recently, a fresh perspective onto that has been opened by Vera Bühlmann’s notion of the “Quantum City”.[14]

We can neither detach the citizens from their city, nor vice versa. Nevertheless, the standardized and externalized collective contribution across space and time creates an arrangement that produces dissipative flows and shows a strong meta-stability that transcends the activities of the individuals. This stability should not be mistaken for a “state”, though. As for any other complex system, including texts, we should avoid trying to assign a “state” to a particular city, or even a part of it. Everything is a process within a complex system, even if it appears to be rather stable. Yet, this stability depends on the perspective of the observer. In turn, the seeming stability does not mean that a city-process could not be destroyed by human activity, be it by individuals (Nero), by a collective, or by socio-economic processes. Yet, again as in the case of complex systems in general, the question of causality would be the wrong starting point for addressing the issue of change, as would be a statistical description.

Cities and urban environments are fabrics of relations between a wide range of heterogenic and heterotopic (see Foucault or David Shane [15]) entities and processes across a likewise large range of temporal scales, covering any shade between the material and the immaterial. There is the activity of single individuals, of collectives of individuals, of legislative and other norms, the materiality of the buildings and their changing usage and roles, different kinds of flows and streams as well as stores and memories.

Elsewhere we argued that this fabric may be conceived as a dynamic ensemble of associative networks [16]. Those should be clearly distinguished from logistic networks, whose purpose is to organize any kind of physical transfer. Associative networks re-arrange, sort, classify and learn. As such, they are also the abstract location of the transposition of the material into the immaterial. Quite naturally, issues of form and their temporal structure arise, in other words, behavior.

Our suggestion thus is to conceive of a city as an entity that behaves. This proposal has (almost) nothing to do with the metaphor of the “city as organism”, a transfer that is by far too naïve. Changes in urban environments are best conceived as “outcomes” of probabilistic processes that are organized as overlapping series, both contingent and consistent. The method of choice to describe those changes is based on the notion of the generalized context.

Urban Text, Text and Urbanity, Textuality and Performance

Urban environments establish or even produce a particular kind of mediality. We need not invoke the recent surge of large screens in many cities for that. Any arrangement of facades encodes a rich semantics that is best described employing a semiotic perspective, just as Venturi proposed it. Recently, we investigated the relationship between facades, whether made from stone or from screens, and the space that they constitute [17].

There is yet another important dimension between the text and the city. For many hundreds of years now, if not even millennia, cities have not been imaginable without text in one form or another. Since the early 19th century at the latest, text and city have become deeply linked to one another with the surge of newspapers and publishing houses, but also through the intricate linkage between the city and the theater. Urban culture is text culture, far more than it could be conceived as an image culture. This tendency is only intensified through the web, albeit urbanity now gets significantly transformed by and into the web-based aspects of culture. At least we may propose that there is a strong co-evolution between the urban (as entity and as concept) and mediality, whether it expresses itself as text, as movie or as webbing.

The relationship between the urban and the text has been explored many times. It probably started with Walter Benjamin’s “flâneur” (for an overview see [18]). Nowadays, urbanists often refer to the concept of the “readability” of a city layout, a methodological habit originated by Kevin Lynch. Yet, if we consider the relation between the urban and the textual, we certainly have to adopt an abstract concept of text; we definitely have to avoid the idea that there are items like characters or words out there in the city. I think we should at least follow something like the abstract notion of textuality as it has been devised by Roland Barthes in his “From Work to Text” [19], that is, as a “methodological field”. Yet, this probably is still not abstract enough, as urban geographers like Henri Lefebvre mistook the concept of textuality for one of intelligibility [20]. Lefebvre obviously didn’t understand the working of a text. How should he, one might say, as a modernist (and marxist) geographer. All the criticism that was directed against the junction between the urban and textuality conceived of text, as far as we know, as something object-like, something that is out there as such, passively waiting to be read and still being passive as it is being read, finally maybe even as an objective representation beyond the need for (and the freedom of) interpretation. This, of course, represents a rather limited view of textuality.

Above we introduced the concept of “behaving texts”, that is, texts as active entities. These entities become active as soon as they are mediatized with interpreters. Again: it is not the text that is conceived as the medium or in a media format, but rather the interpreter, whether it is a human brain-mind or suitable software that is indeed capable of interpreting, not just of pre-programmed and blind re-coding. This “behavioral turn” renders “reading” a text, but also “writing” it, into a performance. Performances, on the other hand, always and inevitably comprise a considerable openness, precisely because they let the immaterial and the material collide from the side of the immaterial. As such, performances are the counterpart of abstract associativity, yet they also settle at the surface that sheds matter from ideas.

In the introduction to their nicely edited book “Performance and the City”, Kim Solga, D.J. Hopkins and Shelley Orr [18] write, citing the urban geographer Nigel Thrift:

Although de Certeau conceives of ‘walking in the city’ not just as a textual experience but as a ‘series of embodied, creative practices’ (Lavery: 152), a ‘spatial acting-out of place’ (de Certeau: 98, our emphasis), Thrift argues that de Certeau: “never really leaves behind the operations of reading and speech and the sometimes explicit, sometimes implicit claim that these operations can be extended to other practices. In turn, this claim [ … ] sets up another obvious tension, between a practice-based model of often illicit ‘behaviour’ founded on enunciative speech-acts and a text-based model of ‘representation’ which fuels functional social systems.” (Thrift 2004: 43)

Quite obviously, Thrift didn’t manage to get a proper grip on de Certeau’s proposal that textual experience may be conceived (I just repeat it) as a series of embodied, creative practices. It is his own particular blindness that lets Thrift denounce texts as being mostly representational.

Solga and colleagues indeed emphasize the importance of performance, not just in their introduction, but also through their editing of the book. Yet, they explicitly link textuality and performance as codependent cultural practices. They write:

While we challenge the notion that the city is a ‘text’ to be read and (re)written, we also argue that textuality and performativity must be understood as linked cultural practices that work together to shape the body of phenomenal, intellectual, psychic, and social encounters that frame a subject’s experience of the city. We suggest that the conflict, collision, and contestation between texts and acts provoke embodied struggles that lead to change and renewal over time. (p.6)

Thus, we find a justification for our “behavioral turn” and its application to texts as well as to the urban from a rather different corner. Even more significantly, Solga et al. seem to agree that performativity and textuality cannot be detached from the urban at all. Apparently, the urban as a particular quality of human culture more and more develops into the main representative of human culture.

Yet, neither text nor performance, nor their combination, amounts to a full account of the mediality of the urban. As we already indicated above, the movie as a kind of cross-medium of text, image, and performance is equally important.

The relations between film and the urban, between architecture and film, are also quite widespread. The cinema, in some sense the successor of the theatre, could be situated only within the city. From the opposite direction, many would consider a city without cinemas as being somehow incomplete. The co-evolutionary story between both is still very much under development, I think.

There is one architect/urbanist in particular who is able to blend the film and the building into each other. You may know him quite well: I refer to Rem Koolhaas. Everybody knows that he was an experimental moviemaker in his youth. It is much less known that he deliberately organized at least one of his buildings as a kind of movie: the Embassy of the Netherlands in Berlin (cf. [21]).

Here, Koolhaas arranged the rooms along a dedicated script. Some of the views out of the window he even trademarked to protect them!

Figure 1: Rem Koolhaas, Dutch Embassy, Berlin. The figure shows the script of pathways as a collage (taken from [21]).

9. The Behavioral Turn

So far we have shown how the behavioral turn could be supported and which are some of the first methodological consequences, if we embrace it. Yet, the picture developed so far is not complete, of course.

If we accept the almost trivial concept that autonomous entities are best conceived as behaving entities (remember that autonomy implies complexity), then we can further ask about the structure of the relationship between the behaving subject and its counterpart, whether this is also a behaving subject or whether it is conceived more like a passive object. For Bruno Latour, for instance, both together form a network, thereby blurring the categorical distinction between both.

Most descriptions of the process of getting in touch with something are nowadays dominated by the algorithmic perspective of computer software. Even designers have started to speak about interfaces. The German term for the same thing, “Schnittstelle”, is even more pronounced and clearly depicts the modernist prejudice in dealing with interaction. “Schnittstelle” implies that something, here the relation, is cut into two parts. A complete separation between interacting entities is assumed a priori. Such a separation is deeply inappropriate, since it would work only in strictly standardized environments, up to being programmed algorithmically. Precisely this we have been told over and over again by designers of software “user interfaces”. Perhaps here we can find the reason for so many bad designs, not only concerning software. Fortunately, though just through a slow evolutionary process, things are improving more and more. So-called “user-centric” or “experience-oriented” design has become more abundant in recent years, but its conceptual foundation is still rather weak, or a wild mixture of fashionable habits and strange adaptations of cognitive science.

Yet, if we take the primacy of interpretation seriously and combine it with the “behavioral turn”, we can see a much more detailed structure than just two parts cut apart.

The consequence of such a combination is that we would drop the idea of a clear-cut surface even for passive objects. Rather, we could conceive objects as being stuffed with a surrounding field that becomes stronger the closer we approach the object. By means of that field we distinguish the “pure” physicality from the semiotically and behaviorally active aspects.

This field is a simple one for stone-like matter, but even there it is still present. The field becomes much richer, deeper and more vibrant if the entity is not a more or less passive object, but rather an active and autonomous subject, such as an animal, a text, or a city. The reason is that there are no a priori and globally definable representative criteria that we could use to approach such autonomous entities. We can only know about more or less suitable procedures for deriving such criteria in the particular case, approaching a particular individual {text, city}. The absence of such criteria is a direct correlate of their semantic productivity, or, likewise, of their unboundedness.

Approaching a semantically productive entity (such entities are also always able to induce new signs; they are semiosic entities) is reminiscent of approaching a gravitational field. Yet it is also very different from a gravitational field, since our semio-behavioral field shows increasing structural richness the closer the entities approach each other. It is quite obvious that only by means of such a semio-behavioral field can we close the gap between the subject and the world that has been opened, or at least deepened, by the modernist contributions from the times of Descartes until late computer science. Only with a concept like the semio-behavioral field, which in turn is a consequence of the behavioral turn, can we overcome the existential fallacy as it has been purported and renewed over and over again by the dual pair of material and immaterial. The language game that separates the material and the immaterial inevitably leads into the nonsensical abyss of existentialism. Dual concepts always come with tremendous costs, as they prevent any differentiated way of speaking about the matter. For instance, they prevent us from recognizing the materiality of symbols, or more precisely, the double-articulation of symbols between the more material and the more immaterial aspects of the world.

The following series of images may be taken as a metaphorical illustration of that semio-behavioral field. We call it the zona extima of the behavioral coating of entities.

Figure 2a: The semio-behavioral field around an entity.

Figure 2b: The situation as another entity approaches perceptively.

Figure 2c: Mutual penetration of semio-behavioral fields.

Taken together, we may say that whenever {sb, sth} gets into contact with {sb, sth}, this happens through the behavioral coating. This zone of contact is not intimate (as Peter Sloterdijk describes it); it is rather extimate, though there is a smooth and graded change of quality from extimacy to intimacy as the distance decreases. The zona extima is a borderless (topological) field, driven by purposes (due to modelling); it is medial, behaviorally choreographed as negotiation, exposure, call & request.

The concept of extimation, or also the process of extimating, is much more suitable than “interaction” to describe what’s going on when we act, behave, engage, actively perceive, or encounter with or towards the other. The interesting thing about web-based media is that some aspects of the zona extima can be transferred.

10. Conclusion

In this essay we have tried to argue in favor of a behavioral turn as a general attitude when it comes to conceiving of the interaction of any two entities. The behavioral turn is a consequence of three major and interrelated assumptions:

  • – primacy of interpretation in the relation to the world;
  • – primacy of process and relation against matter and point;
  • – complexity and associativity in strongly mediatized environments.

All three assumptions are strictly outside of anything that phenomenological, positivist or modernist approaches can talk about or even practice.

The behavioral turn particularly allows us to overcome the traditional and strict separation between the material and the immaterial, as well as the separation between the active and the passive. These shifts can hardly be overestimated; they have far-reaching consequences for the way we practice and conceive our world.

The behavioral turn is the consequence of a particular attitude that respects the bi-valency of the world as a dynamic system of populations of relations. This bi-valency concerns less the divide between the material and the immaterial, which is in any case something of an illusion deriving from the metaphysical claim of the possibility of essences. For instance, the jump that occurs between the realms of the informational and the causal establishes itself as a pair of two complementary but strictly and mutually exclusive modes of speaking about the orderliness in the world. In some way, it is also the orderliness in the behavior of the observer (as repetition) that creates the informational that the observer then may perceive. The separation is thus a highly artificial one, in either direction. It is simply silly to discuss the issue of causality without referring to the informational aspects (for a full discussion of the issue see this essay). In any real-world case we always find both aspects together, and we find them as behavior.

Actually, the bi-valent aspect that I mentioned before refers to something quite different, in fact so different that we can’t even speak properly about it. It refers to those aspects that are a priori to modeling or any other comprehension, that are even outside the performance of the individual itself. What I mean is the resistance of existential arrangements, including the body that the comprehending entity is partially built from. This existential resistance introduces something like outer space for the cultural sphere. Needless to say, we can exist only within this cultural sphere. Yet, any action upon the world forces us to take a short trip into the vacuum, and if we are lucky the re-entrance is even productive. We may well expect an intensification of the aspect of the virtual, as we argued here. Far from being suitable to serve as a primacy (as existentialism misunderstood the issue), the existential resistance, the absolute outside, forces us to embark on the concept of behavior. Only “behavior” as a perceptional and performative attitude allows us to extract coherence from the world without neglecting the fact of that resistance or contumacy.

The behavioral turn triggers a change in the methodology for empirical investigations as well. The standard set of methods for empirical descriptions changes, always using the relation and the coherent series as the starting point, best in its probabilized form, that is, as generalized probabilistic context. This also rules out the application of statistical methods directly to raw data. There should always be some kind of grouping or selection preceding the statistical reasoning. Otherwise we would try to follow the route that Wittgenstein blocked as a “wrong usage of symbols” (in his rejection of the reasonability of Russell and Whitehead’s Principia Mathematica). The concept of abstract behavior, including the advanced methodology that avoids starting with representational symbolification, is clearly a sound way out of this deep problem from which any positivist empirical investigation suffers.

Interaction, including any action upon some other entity, when understood within the paradigm of behavior, becomes a recurrent, though not repetitive, self-adjusting process. During this process means and symbols may change and be replaced all the way down until a successful handshake. There is no objectivity in this process other than the mutual possibility for anticipation. Despite the existential resistance and contumacy that is attached to any re-shaping of the world, and even more so if we accomplish it by means of tools, this anticipation is, of course, greatly improved upon the alignment to cultural standards, contributing to the life-world as a shared space of immanence.

This finally provides us with a sufficiently abstract, but also a sufficiently rich or manifold perspective on the issue of the roles of symbols regarding the text, the urban and the anime, the animal-like. None of those could be comprehended without first creating a catalog or a system of symbols. These symbols, both material and immaterial and thus a kind of hinge, a double-articulation, are rooted both in the embedding culture (as a de-empirifying selective force) and in the individual, which constitutes another double-articulation. The concept of abstract behavior, given as a set of particular conditions and attitudes, allows us to respond appropriately to the symbolic.

The really big question concerning our choreostemic capabilities, and those of the alleged machinic, therefore is: How to achieve fluency in dealing with the symbolic without presuming it as a primary entity? Probably by exercising observing. I hope that the suggestions expressed so far in this essay provide some robust starting points. …we will see.

Notes

1. Here we simply cite the term “information retrieval”; we certainly do not agree that the term is a reasonable one, since it is deeply infected by positivist prejudices. “Information” can’t be retrieved, because it is not “out there”. Downloading a digitally encoded text is neither a hunting nor a gathering of information, because information can’t be considered to be an object. Information is only present during the act of interpretation (more details about the status of information can be found here). Actually, what we are doing is simply “informationing”.

2. The notion of a “behavioral turn” has been known in geography since the late 1960s [22][23], and also in economics. In both fields, however, the behavioral aspect is related to the individual human being. In both areas, any level of abstraction with regard to the concept of behavior is missing. Quite in contrast to those movements, we do not focus on the neglect of the behavioral domain when it comes to human society, but rather on the transfer of the abstract notion of behavior to non-living entities.

Another reference to “behavioral sciences” can be found in social sciences. Yet, in social sciences “behavioral” is often reduced to “behaviorist”, which of course is nonsense. A similar misunderstanding is abundant in political sciences.

3. Note that the proposed “behavioral turn” should not be mistaken for a “behavioristic” move, a sort of behaviorism. We strictly reject the stimulus-response scheme of behaviorism. Actually, behaviorism as it was developed by Watson and Pavlov has only little to do with behavior at all. It is nothing other than an overt reductionist program, rendering any living being into a trivial machine. Unfortunately, the primitive scheme of behaviorism is experiencing a kind of comeback in so-called “Behavioral Design”, where people talk about “triggers” much in the same way as Pavlov did (cf. BJ Fogg’s Behavior Model).

References

  • [1] Michael Epperson (2009). Quantum Mechanics and Relational Realism: Logical Causality and Wave Function Collapse. Process Studies, 38(2): 339-366.
  • [2] G. Moran, J.C. Fentress (1979). A Search for Order in Wolf Social Behavior. pp.245-283. in: E. Klinghammer (ed.), The Behavior and Ecology of Wolves (Symp. held on 23-24.5.1975 in Wilmington, N.C.). Garland STPM Press, New York.
  • [3] Gilles Deleuze, Difference and Repetition.
  • [4] J.A.R.A.M. Van Hooff (1982). Categories and sequences of behaviour: methods of description and analysis. in: Handbook of methods in nonverbal behavior research (K.R. Scherer& P. Ekman, eds). Cambridge University Press, Cambridge.
  • [5] P.G.M. van der Heijden, H. de Vries, J.A.R.A.M. van Hooff (1990). Correspondence analysis of transition matrices, with special attention to missing entries and asymmetry. Anim.Behav. 40: 49-64.
  • [6] Teuvo Kohonen, Samuel Kaski, K. Lagus and J. Honkela (1996). Very Large Two-Level SOM for the Browsing of Newsgroups. In: C. von der Malsburg, W. von Seelen, J. C. Vorbrüggen and B. Sendhoff, Proceedings of ICANN96, International Conference on Artificial Neural Networks, Bochum, Germany, July 16-19, 1996, Lecture Notes in Computer Science, Vol. 1112, pp.269-274. Springer, Berlin.
  • [7] Hecht-Nielsen (1994).
  • [8] Javier Rojo, Tuan S. Nguyen (2010). Improving the Johnson-Lindenstrauss Lemma. available online.
  • [9] Sanjoy Dasgupta, Presentation given about: Samuel Kaski (1998), Dimensionality Reduction by Random Mapping: Fast Similarity Computation for Clustering, Helsinki University of Technology 1998. available online.
  • [10] Michel Serres, Nayla Farouki. Le trésor. Dictionnaire des sciences. Flammarion, Paris 1998. p.394.
  • [11] Bill Hillier, Space Syntax. E-edition, 2005.
  • [12] Bill Hillier (2009). The City as a Socio-technical System: a spatial reformulation in the light of the levels problem and the parallel problem. Keynote paper to the Conference on Spatial Information Theory, September 2009.
  • [13] Alan Hájek (2007). The Reference Class Problem is Your Problem Too. Synthese 156 (3):563-585.
  • [14] Vera Bühlmann (2012). In the Quantum City – design, and the polynomial grammaticality of artifacts. forthcoming.
  • [15] David G. Shane. Recombinant Urbanism. 2005.
  • [16] Klaus Wassermann (2010). SOMcity: Networks, Probability, the City, and its Context. eCAADe 2010, Zürich. September 15-18, 2010. available online.
  • [17] Klaus Wassermann, Vera Bühlmann, Streaming Spaces – A short expedition into the space of media-active façades. in: Christoph Kronhagel (ed.), Mediatecture, Springer, Wien 2010. pp.334-345. available here.
  • [18] D.J. Hopkins, Shelley Orr and Kim Solga (eds.), Performance and the City. Palgrave Macmillan, Basingstoke 2009.
  • [19] Roland Barthes, From Work to Text. in: Image, Music, Text: Essays selected and translated by Stephen Heath. Hill & Wang, New York 1977. also available online @ google books p.56.
  • [20] Henri Lefebvre, The Production of Space. 1979.
  • [21] Vera Bühlmann. Inhabiting media. Thesis, University of Basel (CH) 2009.
  • [22] Kevin R Cox, Jennifer Wolch and Julian Wolpert (2008). Classics in human geography revisited. “Wolpert, J. 1970: Departures from the usual environment in locational analysis. Annals of the Association of American Geographers 50, 220–29.” Progress in Human Geography (2008) pp.1–5.
  • [23] Dennis Grammenos. Urban Geography. Encyclopedia of Geography. 2010. SAGE Publications. 1 Oct. 2010. available online.

You are currently reading Behavior at The "Putnam Program".