July 10, 2012
What is the role of texts? How do we use them (as humans)?
How do we access them (as reading humans)? The answers to such questions seem to be pretty obvious. Almost everybody can read. Well, today. Notably, reading itself, as a performance and regarding its use, changed dramatically at least twice in history: first, after the invention of the vocal alphabet in ancient Greece, and a second time after book printing became abundant during the 16th century. Maybe the issue of reading isn't as simple as it seems in everyday life.
Beyond such historical accounts and basic experiences, we have a lot of more theoretical results concerning texts. Beginning with Friedrich Schleiermacher, who was the first to identify hermeneutics as a subject around 1830 and formulated it in a way that has been considered more complete and powerful than the version proposed by Gadamer in the 1950s. Proceeding of course with Wittgenstein (language games, rule following), Austin (speech act theory) or Quine (criticizing empiricism). Philosophers like John Searle, Hilary Putnam and Robert Brandom then explicated and extended the work of these forerunners. And they have been accompanied by many others. If you wonder why linguistics is missing here: it is because linguistics does not provide theories about language. Today the domain is largely dominated by positivism and the corresponding analytic approach.
Here in this little piece we pose these questions in the context of certain relations between machines and texts. There are a lot of such relations, even quite sophisticated or surprising ones. For instance, texts can be considered as a kind of machine. Yet they bear a certain note of (virtual) agency as well, which makes this machine aspect of texts considerably non-trivial. Here we will not deal with that perspective. Instead, we will just take a look at the possibilities and the respective practices of handling, or "treating," texts with machines. Or, if you prefer, the treating of texts by machines, insofar as a certain autonomy of machines could be considered necessary to deal with texts at all.
Today we find a fast-growing community of computer programmers who deal with texts as a kind of unstructured information. One of the buzzwords is the so-called "semantic web," another one is "sentiment analysis." We won't comment in any detail on those movements, because they are deeply flawed. The first tries to formalize semantics and meaning a priori, attempting to render the world into a trivial machine. We have repeatedly criticized this, and we agree herein with Douglas Hofstadter (see this discussion of his "Fluid Analogy"). The second tries to identify the sentiment of a text or a "tweet," e.g. about a stock or an organization, on the basis of statistical measures of keywords and their utterly naive "n-grammed" versions, without paying any notice to the problem of "understanding." Such nonsense would not be as widespread if programmers read even a few fundamental philosophical texts about language. In fact they don't, and thus they are condemned to revisit the underdeveloped positions that arose centuries ago.
If we neglect the social role of texts for a moment, we might identify a single major role of texts, albeit we then have to describe it in rather general terms. We may say that the role of a text, as a specimen among many other texts from a large population, is to function as a medium for the externalization of mental content, serving the ultimate purpose of enabling the (re)construction of resembling mental content on the side of the interpreting person.
Interpretation thus enjoys primacy. It is not possible to assign meaning to a text like a sticky note, then putting the text, including the yellow sticky note, directly into the recipient's brain. That may sound silly, but unfortunately it is the "theory" followed by many people working in the computer sciences. Interpretation can't be controlled completely, though, not even by the mind performing it, not even by the same mind that seconds before externalized the text through writing or speaking.
Now, the notion of mental content may seem both quite vague and hopelessly general. Yet in the previous chapter we introduced a structure, the choreostemic space, which allows us to speak fairly precisely about mental content. Note that we don't need to talk about semantics, meaning or references to "objects" here. Mental content is not a "state" either. Thinking "state" and the mental together is on much the same level as seriously considering the existence of sea monsters at the end of the 18th century, when the list-science of Linnaeus had not yet been reshaped by the upcoming historical turn in the philosophy of nature. Nowadays we must consider it silly-minded to think about something as complex as the brain and its mind by means of "state." Doing so, one confounds the stability of the graphical representation of a word in a language with the complexity of a multi-layered dynamic process, spanned between deliberate randomness, self-organized rhythmicity and temporary, thus preliminary, meta-stability.
The notion of mental content does not refer to the representation of referenced "objects." We do not have maps, lists or libraries in our heads. Everything we experience as inner life builds up from an enormous randomness through deep stacks of complex emergent processes, where each emergent level is also shaped top-down, implicitly and, except for the last one usually called "consciousness," also explicitly. The stability of memory and words, of feelings and faculties, is deceptive; they are not so stable at all. Only their externalized symbolic representations are more or less stable, and even their stability as words etc. can be shattered easily. The point we would like to emphasize here is that everything that happens in the mind is constructed on the fly, while the construction is completed only with the ultimate step of externalization, that is, speaking or writing. The notion of "mental content" is thus a bit misleading.
The mental may be conceived most appropriately as a manifold of stacked and intertwined processes. This holds for the naturalist perspective as well as for the abstract perspective, as we have argued in the previous chapter. It is simply impossible to find a single stable point within the (abstract) dynamics between model, concept, mediality and virtuality, which could be thought of as spanning a space. We called it the choreostemic space.
For the following remarks about the relation between texts and machines, and the practitioners engaged in building machines to handle texts, we have to keep in mind just these two things: (i) there is a primacy of interpretation, and (ii) the mental is a non-representational dynamic process that can't be formalized (in the sense of "being represented" by a formula).
In turn this means that we should avoid referring to formulas when setting out to build a "text machine." Text machines will be helpful only if their understanding of texts, even if it is a rudimentary understanding, follows the same abstract principles as our human understanding of texts does. Machines pretending to deal with texts, but actually only moving dead formal symbols back and forth, as is the case in statistical text mining, n-gram based methods and the like, are not helpful at all. The only thing that happens is that these machines introduce a formalistic structure into our human life. We may say that these techniques render humans helpful to machines.
Nowadays we find a whole techno-scientific community engaged in the field of machine learning devoted to "textual data." Computers are programmed in such a way that they can be used to classify texts. The idea is to provide some keywords, or anti-words, or even a small set of sample texts, which are then taken by the software as a kind of template used to build a selection model. This model is then used to select resembling texts from a large set of texts. We have to be very clear about the purpose of these software programs: they classify texts.
The input data for doing so are taken from the texts themselves. More precisely, the texts are preprocessed according to specialized methods. Each text gets described by a possibly large set of "features" that have been extracted by these methods. The obvious point is that the procedure is purely empirical in the strong sense. Only the available observations (the texts) are taken to infer the "similarity" between texts. Usually not even linguistic properties are used to form the empirical observations, albeit there are exceptions. People use the so-called n-gram approach, which is little more than counting letters. It is a zero-knowledge model of the series of symbols which humans interpret as text. Additionally, the frequency or relative positions of keywords and anti-words are usually measured and expressed by mostly quite simple statistical methods.
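To make the poverty of this representation concrete, here is a minimal, purely illustrative Python sketch (not taken from any of the software discussed here) of character n-gram extraction and the kind of similarity score typically built on it:

```python
from collections import Counter

def char_ngrams(text, n=3):
    """Slide a window of n characters over the text and count each n-gram.
    This is the 'zero-knowledge' representation criticized above: it knows
    nothing about words, grammar, or meaning."""
    s = text.lower()
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    shared = set(a) & set(b)
    dot = sum(a[k] * b[k] for k in shared)
    norm = lambda c: sum(v * v for v in c.values()) ** 0.5
    return dot / (norm(a) * norm(b)) if a and b else 0.0

# Two texts about the same topic score higher than unrelated ones,
# purely by overlapping letter sequences; no "understanding" is involved.
t1 = char_ngrams("the machine classifies texts")
t2 = char_ngrams("machines classify the text")
t3 = char_ngrams("quantum renormalization group")
assert cosine(t1, t2) > cosine(t1, t3)
```

The assertion at the end only shows that overlapping letter sequences yield a higher score; no interpretation whatsoever has taken place, which is precisely the point made here.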
Well, classifying texts is quite different from understanding texts. Of course. Yet said community tries to reproduce the "classification" achieved or produced by humans. Thus any of the engineers in the field of machine learning directed at texts implicitly claims a kind of understanding. They even organize competitions.
The problems with the statistical approach are quite obvious. Quine called it the dogma of empiricism and coined the Gavagai anecdote about it, a scenario that provides even much more information than a text alone, yet still leaves reference indeterminate. In order to understand a text we need references to many things outside the particular text(s) at hand. Two of those are especially salient: concepts and the social dimension. In straight opposition to the belief of positivists, concepts can't be defined in advance of a particular interpretation. Using catalogs of references does not help much if these catalogs are used merely as lists of references. The software does not understand "chair" by the "definition" stored in a database, or even by the set of such references. It simply does not care whether the encoded ASCII codes yield the symbol "chair" or the symbol "h&e%43". Douglas Hofstadter has been stressing this point over and over again, and we fully agree with him.
From that necessity of a particular and rather wide "background" (a notion of Searle's) derives the second problem, which is much more serious, even devastating to the soundness of the whole empirico-statistical approach. The problem is simple: even we humans have to read a text before being able to understand it. Only upon understanding can we classify it. Of course, the brains of many people are trained sufficiently to work out the relations of a text and any of its components while reading it. The basic setup of the problem, however, remains the same.
Actually, what is happening is a constantly repeated re-reading of the text, taking into account all available insights regarding the text and the relations of it to the author and the reader, while this re-reading often takes place in the memory. To perform this demanding task in parallel, based on the “cache” available from memory, requires a lot of experience and training, though. Less experienced people indeed re-read the text physically.
The consequence of all of this is that we cannot determine the best empirical discriminators for a particular text while still reading it, in order to select it as if we were using a model. Actually, we can't determine the set of discriminators before we have read it all, at least not before the first pass. Let us call this the completeness issue.
The very first insight is thus that a one-shot approach to text classification is based on a misconception. The software and the human would have to align to each other in some kind of conversation. Otherwise it can't be specified in principle what the task is, that is, which texts should actually be selected. Any approach to text classification not following the "conversation scheme" is necessarily plain nonsense. Yet that's not really a surprise (except for some of the engineers).
There is a further consequence of the completeness issue: we can't set up a table to learn from at all. This too is no surprise, since setting up a table means setting up a particular symbolization. Any symbolization prior to understanding must count as a hypothesis. It is as simple as that. Whether it matches our purpose or not, we can't know before we have understood the text.
However, in order to make the software learn something we need assignates (traditionally called "properties") and some criteria to distinguish better models from less performant ones. In other words, we need a recurrent scheme on the technical level as well.
That's why it is not perfectly correct to call texts "unstructured data." (Besides the fact that data are not "out there": we always need a measurement device, which in turn implies some kind of model AND some kind of theory.) In the case of texts, imposing a structure onto a text simply means understanding it. We could even say that a text as text is not structurable at all, since the interpretation of a text can never be regarded as finished.
All together, we may summarize the complexity of texts as deriving from the following properties:
- there are different levels of context, which additionally stretch across surrounds of very different sizes;
- there are rich organizational constraints, e.g. grammars;
- there is a large corpus of words, while any of them bears meaning only upon interpretation;
- there is a large number of relations that not only form a network, but which also change dynamically in the course of reading and of interpretation;
- texts are symbolic: spatial neighborhood does not translate into reference, in either direction;
- understanding of texts requires a wealth of external and quite abstract concepts, which appear as significant only upon interpretation, as well as a social embedding of mutual interpretation.
This list should at least rule out any attempt to defend the empirico-statistical approach as a reasonable one, except for the fact that it conveys a better-than-nothing attitude. This brings us to the question of utility.
Engineers build machines that are supposedly useful; more exactly, they are intended to fulfill a particular purpose. Mostly, however, machines, and indeed technology in general, are useful only upon processes of subjective appropriation. The most striking example for this is the car. Likewise, computers have evolved not for reasons of utility, but rather for gaming. Video did not become popular for artistic or commercial reasons, but due to the possibilities the medium offered to the sex industry. The lesson here is that an intended purpose is difficult to enforce in the actual usage of a technology. On the other hand, every technology may exert some gravitational force toward developing an unintended symbolic purpose, and may acquire considerable value in that regard. So, could we agree that the classification of texts as it is performed by contemporary technology is useful?
Not quite. We can’t regard the classification of texts as it is possible with the empirico-statistical approach as a reasonable technology. For the classification of texts can’t be separated from their understanding. All we can accomplish by this approach is to filter out those texts that do not match our interests with a sufficiently high probability. Yet, for this task we do not need text classification.
Architectures like the 3L-SOM (introduced below) could also be expected to play an important role in translation, as translation requires an even deeper understanding of texts than is needed for sorting texts according to a template.
Besides the necessity of this doubly recurrent scheme, we haven't said much so far about how to actually treat the text. Texts should not be mistaken for empirical data. That means we have to take a modified stance regarding measurement itself. In several essays we have already mentioned the conceptual advantages of the two-layered (TL) approach based on self-organizing maps (TL-SOM). We have already described in detail how the TL-SOM works, including the basic preparation of the random graph as it has been described by Kohonen.
The important thing about the TL-SOM is that it is not a device for modeling the similarity of texts. It is just a representation, even if a very powerful one, because it is based on probabilistic contexts (random graphs). More precisely, it is just one of many possible representations, even if it is much more appropriate than n-grams and other jokes. We should NOT even consider the TL-SOM as so-called "unsupervised modeling," as the distinction between unsupervised and supervised is just another myth (= nonsense when it comes to quantitative models). The TL-SOM is nothing else than an instance of associative storage.
The trick of using a random graph (see the link above) is that the surrounds of words are differentially represented as well. The Kohonen model is quite spare in this respect, since it applies a completely neutral model. In fact, words in a text are represented as if they were all the same: of the same kind, of the same weight, etc. That's clearly not reasonable. Instead, we should represent a word in several different manners within the same SOM.
Yet the random graph approach should not be considered just a "trick." We have repeatedly argued (for instance here) that we have to "dissolve" empirical observations into a probabilistic (re)presentation in order to evade and avoid the pseudo-problem of "symbol grounding." Note that even by the practice of setting up a table in order to organize "data" we are already crossing the Rubicon into the realm of the symbolic!
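As a rough illustration of what a probabilistic (re)presentation of words can look like, the following sketch represents each word by a frequency distribution over its surrounds. This is a crude, hypothetical stand-in for the random-graph encoding discussed here, not Kohonen's actual procedure:

```python
from collections import Counter, defaultdict

def context_profiles(tokens, window=2):
    """For each word, collect a normalized frequency distribution over the
    words occurring within +/- window positions. A word is thus represented
    probabilistically by its surrounds, not by its letters."""
    profiles = defaultdict(Counter)
    for i, w in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                profiles[w][tokens[j]] += 1
    # normalize the raw counts into probabilities
    return {w: {c: n / sum(ctr.values()) for c, n in ctr.items()}
            for w, ctr in profiles.items()}

tokens = "the cat sat on the mat the dog sat on the rug".split()
p = context_profiles(tokens)
# "cat" and "dog" end up with overlapping profiles (both co-occur with
# "sat"), although the words themselves share no letters.
assert "sat" in p["cat"] and "sat" in p["dog"]
```

The point of the sketch is only that similarity now arises from the probabilistic surrounds of a word, not from its symbolic identity.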
The real trick of the TL-SOM, however, is something completely different. The first layer represents the random graph of all words; the actual pre-specific sorting of texts, however, is performed by the second layer on the output of the first layer. In other words, the text is "renormalized": the SOM itself is used as a measurement device. This renormalization allows us to organize data in a standardized manner while avoiding the symbolic fallacy. To our knowledge, this possible usage of the renormalization principle has not been recognized so far. It is indeed a very important principle that puts many things in order. We will deal with this issue again later in a separate contribution.
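The renormalization step can be hinted at with a toy sketch: each word vector is mapped to its best-matching unit (BMU) on a first-layer codebook, and the text is then described only by the normalized histogram of node activations, which is what a second layer would receive. The codebook and word vectors below are invented purely for illustration; in a real TL-SOM the first layer would of course be trained:

```python
import random

random.seed(0)

def bmu(vec, codebook):
    """Best-matching unit: index of the nearest prototype (squared Euclidean)."""
    return min(range(len(codebook)),
               key=lambda k: sum((v - c) ** 2 for v, c in zip(vec, codebook[k])))

def fingerprint(word_vecs, codebook):
    """'Renormalize' a text: map each word vector to its BMU on the first
    layer and return the normalized histogram of node activations. This
    histogram, not the raw words, is what the second layer sees."""
    hist = [0.0] * len(codebook)
    for v in word_vecs:
        hist[bmu(v, codebook)] += 1.0
    total = sum(hist)
    return [h / total for h in hist]

# a toy 'trained' first layer: 4 prototype vectors in a 3-d feature space
codebook = [[random.random() for _ in range(3)] for _ in range(4)]
# two toy texts, each a bag of 3-d word vectors (assumed to come from some
# word-level encoding, e.g. of probabilistic contexts)
text_a = [[0.1, 0.2, 0.9], [0.15, 0.25, 0.85], [0.9, 0.1, 0.1]]
text_b = [[0.12, 0.22, 0.88], [0.88, 0.12, 0.08], [0.9, 0.15, 0.05]]
fa, fb = fingerprint(text_a, codebook), fingerprint(text_b, codebook)
assert len(fa) == len(fb) == 4 and abs(sum(fa) - 1.0) < 1e-9
```

Whatever the length of the text, its fingerprint has the fixed dimensionality of the first layer, which is exactly the standardization claimed above.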
Only on the basis of the associative storage, taken as an entirety, does appropriate modeling of textual data become possible. The tremendous advantage is that the structure for any subsequent consideration now remains constant. We may indeed set up a table. The content of this table, the data, however, is not derived directly from the text. Instead we first apply renormalization (a technique known from quantum physics, cf. Delamotte, referenced below).
The input is a description of the text completely in terms of the TL-SOM. More explicitly, we have to "observe" the text as it behaves in the TL-SOM. Here we are indeed legitimized to treat the text as an empirical observation, albeit we can, of course, observe the text in many different ways. Yet observing means conceiving the text as a moving target, as a series of multitudes.
One of the available tools is Markov modeling, either as Markov chains or by means of Hidden Markov Models. But there are many others. Most significantly, probabilistic grammars, even probabilistic phrase structure grammars, can be mapped onto Markov models. Yet here again we meet the problem of a priori classification. Both kinds of models, Markovian as well as grammarian, need an assignment of a grammatical type to a phrase, which often first requires understanding.
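A first-order Markov chain over words, as the simplest member of this toolset, can be sketched as follows; the toy corpus is invented and serves only to show the mechanics:

```python
from collections import Counter, defaultdict

def transitions(tokens):
    """First-order Markov chain: estimate P(next word | current word)
    from observed bigram counts."""
    counts = defaultdict(Counter)
    for a, b in zip(tokens, tokens[1:]):
        counts[a][b] += 1
    return {w: {nxt: n / sum(c.values()) for nxt, n in c.items()}
            for w, c in counts.items()}

def sequence_prob(tokens, model):
    """Probability of a word sequence under the chain (0 for unseen moves)."""
    p = 1.0
    for a, b in zip(tokens, tokens[1:]):
        p *= model.get(a, {}).get(b, 0.0)
    return p

model = transitions("the cat sat on the mat the cat lay on the mat".split())
# "the" is followed by "cat" in 2 of its 4 occurrences with a successor
assert sequence_prob("the cat".split(), model) == 0.5
```

Note that the chain operates on observed surface transitions only; the a priori classification problem mentioned above arises as soon as the states are meant to be grammatical types rather than raw words.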
Given the autonomy of texts, their temporal structure and the impossibility of applying an a priori schematism, our proposal is that we should conceive of the text as we do of (higher) animals. Like an animal in its habitat, we may think of the text as inhabiting the TL-SOM, our associative storage. We can observe paths, their length and form, preferred neighborhoods, velocities, the size and form of the habitat.
Similar texts will behave in a similar manner. Such similarity is far beyond (better: as if from another planet than) the statistical approach. We can also see now that the statistical approach is trapped by the representationalist fallacy. This similarity is of course a relative one. The important point here is that we can describe texts in a standardized manner strictly WITHOUT reducing their content to statistical measures. It is also quite simple to determine the similarity of texts, whether as a whole or regarding any part of them. We need not determine the range of our source at all prior to the results of modeling. That modeling introduces a third logical layer. We may apply standard modeling, using a flexible tool for transformation and a further instance of a SOM, as we provide it as SomFluid in the downloads. The important thing is that this last step of modeling has to run automatically.
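The idea of observing a text as an inhabitant of the map can be hinted at by describing its trajectory of best-matching units, e.g. by path length and the centre of its "habitat." The coordinates below are hypothetical; in practice they would come from the TL-SOM:

```python
def path_descriptors(path):
    """Describe a text's trajectory across 2-d map coordinates: total path
    length and the centre of the visited region (its 'habitat')."""
    length = sum(((x2 - x1) ** 2 + (y2 - y1) ** 2) ** 0.5
                 for (x1, y1), (x2, y2) in zip(path, path[1:]))
    cx = sum(x for x, _ in path) / len(path)
    cy = sum(y for _, y in path) / len(path)
    return length, (cx, cy)

# hypothetical BMU trajectories of three texts on a 2-d map
walk_a = [(0, 0), (1, 0), (1, 1), (2, 1)]
walk_b = [(0, 1), (1, 1), (1, 2), (2, 2)]   # similar shape, nearby habitat
walk_c = [(9, 9), (0, 0), (9, 0), (0, 9)]   # erratic, large habitat
la, ca = path_descriptors(walk_a)
lb, cb = path_descriptors(walk_b)
lc, cc = path_descriptors(walk_c)
assert abs(la - lb) < 1e-9 and lc > la
```

Richer descriptors (velocities, preferred neighborhoods, habitat shape) would follow the same pattern: they characterize the text's behavior in the map without reducing its content to keyword statistics.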
The proposed structure keeps any kind of reference completely intact. It also draws on its collected experience, that is, all texts it has digested before. It is not necessary to determine stopwords and similar gimmicks. Of course we could, but that's part of the conversation. Just provide an example of any size, just as it is available. Everything from two words to a sentence, a paragraph, or the content of a directory will work.
Such a 3L-SOM is very close to what we reasonably could call “understanding texts”. But does it really “understand”?
As such, not really. First, images should be stored in the same manner (!!), that is, preprocessed as random graphs over local contexts of various sizes, into the same (networked population of) SOM(s). Second, a language production module would be needed. But once we have those parts working together, there will be full understanding of texts.
(I take any reasonable offer to implement this within the next 12 months, seriously!)
Understanding is the faculty of moving around in a world of symbols. That's not meant as a trivial issue. First, the world consists of facts, where facts comprise a universe of dynamic relations. Symbols are not simply like traffic signs or pictograms, for these belong to the simpler kind of symbols. Symbolizing is a complex, social, mediatized, diachronic process.
Classifying, understood as "performing modeling and applying models," consists basically of two parts. One of them could be automated completely, while the other could not be treated by a finite or a priori definable set of rules at all: setting the purpose. In the case of texts, classifying can't be separated from understanding, because the purpose of a text emerges only upon interpretation, which in turn requires a manifold of modeling raids. Modeling a (quasi-)physical system is completely different from that; it is almost trivial. Yet the structure of a 3L-SOM could well evolve into an arrangement that is capable of understanding in a way similar to us humans. More precisely, and a bit more abstractly, we could also say that a "system" based on a population of 3L-SOMs will one day be able to navigate in the choreostemic space.
- B. Delamotte (2004). A hint of renormalization. Am. J. Phys. 72, 170-184. Available online: arXiv:hep-th/0212049v3.
April 7, 2012
The big question of philosophy, and probably its sole question,
concerns the status of the human as a concept.1 Does language play a salient role in this concept, either as a major constituent, or as sort of a tool? Which other capabilities and which potential beyond language, if it is reasonable at all to take that perspective, could be regarded as similarly constitutive?
These questions may appear far removed from topics like the technical challenges of programming a population of self-organizing maps, the limits of Turing machines, or the generalization of models and their conditions. Yet in times where lots of people are summoning the so-called singularity, the question about the status of the human is definitely not exotic at all. Notably, the "singularity" is often and largely defined as an "overwhelming intelligence," seemingly arising inevitably from ever-increasing calculation power, and one which we supposedly could not "understand" any more. From an evolutionary perspective it makes very little sense to talk about singularities. Natural evolution, and cultural evolution alike, is full of singularities and void of singularities at the same time. The idea of the "singularity" is not a fruitful way to approach the question of qualitative changes.
As you already may have read in another chapter, we prefer the concept of machine-based episteme as our ariadnic guide. In popular terms, machine-based episteme concerns the possibility for an actualization of a particular “machine” that would understand the conditions of its own when claiming “I know.” (Such an entity could not be regarded as a machine anymore, I guess.) Of course, in following this thread we meet a lot of already much-debated issues. Yet, moving the question about the episteme into the sphere of the machinic provides particular perspectives onto these issues.
In earlier times it was attempted, and some people are still trying today, to determine the status of the "human" as a sort of recipe. Do this and do that, but not that and this, and then a particular quality will be established in your body, as your person, visible to others as virtues, labeled and conceived henceforth as the "quality of being human." Accordingly, natural language with all its ambiguities need not be regarded as an essential pillar. Quite to the opposite: if the "human" could be defined as a recipe, then our everyday language has to be cleaned up, brought closer to crisp logic in order to avoid misunderstandings as far as possible; you may recognize this as the program of contemporary analytical philosophy. In methodological terms it was thought possible to determine the status of the human in positively given terms, or in short, in a positive definite manner.
Such positions are, quite fortunately, now recognized more and more as highly problematic. The main reason is that it is not possible to justify any kind of determination in an absolute manner. Any justification requires assumptions, while unjustified assumptions are counter-pragmatic to the intended justification. The problematics of knowledge is linked in here, as knowledge can no longer be regarded as "justified, true belief"2. It was Charles S. Peirce who first concluded that the application of logic (as the grammar of reason) and of ethics (as the theory of morality) are not independent of each other. In political terms, any positive definite determination imposed on communities of other people must be regarded as an instance of violence. Hence philosophy is no longer concerned with the status of the human as a fact; quite differently, the central question is how to speak about the status of the human, not neglecting that speaking, that using language, is not a private affair. This looking for the "how" has itself to obey, of course, the rule not to determine rules in a positive definite manner. As a consequence, the only philosophical work we can do is exploring the conditions, where the concept of "condition" refers to an open, though not recursive, chain. Actually, Aristotle already dubbed this "metaphysics" and regarded it as the core interest of philosophy. This "metaphysics" can't be taken over by any "natural" discipline, whether a kind of science or engineering. There is a clear downstream relation: science as well as engineering should be affected by it, in emphasizing the conditions for their work more intensely.
Practicing, turning the conditions and conditionability into facts and constraints, is the job of design, whether that design manifests as "design," as architecture, as machine-creating technology, as politics, as education, as writing and art, etc. etc. Philosophy not only can never explain, as Wittgenstein mentioned; it also can't describe things "as such." Descriptions and explanations are only possible within a socially negotiated system of normative choices. This holds true even for the natural sciences. As a consequence, we should start with philosophical questions even in the natural sciences, and definitely always in engineering. And engaging in fields like machine learning, so-called artificial intelligence or robotics without constantly referring to philosophy will almost inevitably result in nonsense. The history of these fields is full of examples of that; just remember the infamous "General Problem Solver" of Simon and Newell.
Yet the issue is not only one of ethics, morality and politics. It was Foucault who, in a sort of follow-up to Merleau-Ponty, first claimed a third region between the empiricism of affections and the tradition of reflecting on pure reason or consciousness.3 This third region, or even dimension (we would say "aspection"), based on the compound of perception and the body, comprises the historical evolution of systems of thinking. Foucault, together with Deleuze, once opened the possibility of a transcendental empiricism, the former mostly with regard to historical and structural issues of political power, the latter mostly with regard to the micronics of individual thought, where the "individual" is not bound to a single human person, of course. In our project, as represented by this collection of essays, we are following a similar path, starting with the transition from the material to the immaterial by means of association, and then investigating the dynamics of thinking in the aspectional space of transcendental conditions (forthcoming chapter), which builds an abstract bridge between Deleuze and Foucault as it covers both the individual and the societal aspects of thinking.
This essay deals with the relation of words to a rather important aspect of thinking: representation. We will address some aspects of its problematics before we approach the role of words in language. Since representation is something symbolic in the widest sense, and since that representation has to be achieved autonomously by a mainly material arrangement, e.g. called "the machine"4, we will also deal (again) with the conditions for the transformation of (mainly) physical matter into (mainly) symbolic matter. Particularly, however, we will explore the role of words in language. The outline comprises the following sections:
- From Matter to Mind
- The Unresolved Challenge
- Names, proper: Performing the turn completely
- Representing Words
- Words, Classes, Models, Waves
- Role of Words
- Understanding (Images, Words, …)
From Matter to Mind
Given the conditioning mentioned above, the anthropological history of the genus Homo5 poses a puzzle. Our anatomical foundations6 have been stable for at least 60,000 years, but contemporary human beings at the age of, let us say, 20 or 30 years are surely much more "intelligent"7. Given the measurement scale established as the I.Q. at the beginning of the 20th century, a significant increase can be observed in the surveyed populations even over the last 60 years.
So, what makes the difference, then, between the earliest ancient cultures and the contemporary ones? This question is highly relevant for our considerations here, which focus on the possibility of a machine-based episteme, or in more standard, yet seriously misplaced terms, machine learning, machine intelligence or even artificial intelligence. In any of those fields, one could argue, researchers and engineers somehow start with mere matter, then imprint some rules and symbols onto that matter, only to expect the matter to become "intelligent" in the end. The structure of the problematics remains the same, whether we take the transition that started from paleo-cultures or the one rooted in the field of advanced computer science. Both instances concern the role of culture in the transformation of physical matter into symbolic matter.
While philosophy has tackled that issue for at least two and a half millennia, resulting in a rich landscape of arguments, including the reflection of the many styles of developing those arguments, computer science is still almost completely blind to the whole topic. When computer scientists and computer engineers inevitably get into contact with the realm of the symbolic, they usually and naively repeat past positions, committing a naïve, i.e. non-reflective idealism or materialism that is not even on a pre-Socratic level. David Blair  correctly identifies the picture of language on which contemporary information retrieval systems are based as that of Augustine: he believed that every word has a meaning. Notably, Augustine lived in the late 4th till early 5th century A.D. This story simply demonstrates that in order to understand the work of a field one also has, as always, to understand its history. In the case of computer science it is the history of reflective thought itself.
This is precisely why philosophy is much more than just a possibly interesting source for computer scientists. More directly expressed, it is probably one of the major structural faults of computer science that it is regarded as just a kind of engineering. Countless projects and pieces of software have failed because of such methodological reductionism. Everything that gets into contact with computers developed from within such an attitude then also becomes infected by the limited perspective of engineering.
One of the missing aspects is the philosophy of techno-science, which not just by chance seriously started with Heidegger8 as its first major proponent. Merleau-Ponty, inspired by Heidegger, then emphasized that everything concerning the human is artificial and natural at the same time. It does not make sense to set up that distinction for humans or man-made artifacts, as if such a difference were itself “natural”. Any such distinction refers more directly than not to Descartes as well as to Hegel, that is, it follows either simplistic materialism or overdone idealism, so to speak idealism in its machinic, Cartesian form. Indeed, many misunderstandings about the role of computers in contemporary science and engineering, but also in the philosophy of science and the philosophy of information, can be deciphered as a massive Cartesio-Hegelian heritage, with all its drawbacks. And there are many.
The most salient perhaps is the foundational element9 of Descartes’ as well as Hegel’s thought: independence. Of course, for both of them independence was a major incentive, goal and demand, for political reasons (absolutism in the European 17th century), but also for general reasons imposed by the level of techno-scientific insight, which remained quite low until the middle of the 20th century. People before the scientific age had been exposed to all sorts of threatening issues, concerning health, finances, religious or political freedom, collective or individual violence, all together often termed “fate”. Being independent was a basic condition for living more or less safely at all, physically and/or mentally. Yet, Descartes and Hegel definitely exaggerated it.
Yet, the element of independence made its way into the cores of the scientific method itself. Here it blossomed as reductionism, positivism and physicalism, all of which can be subsumed under the label of naive realism. It took decades until people developed some confidence not to prejudge complexity as esotericism.
With regard to computer science there is an important consequence. We first and safely can drop the label of “artificial intelligence” or “machine learning”, along with the respective narrow and limited concepts. Concerning machine learning we can state that only very few of the approaches that exist so far achieve even a rudimentary form of learning in the sense of structural self-transformation. The vast majority of approaches that are dubbed “machine learning” represent just some sort of advanced parameter estimation, where the parameters to be estimated are all defined (i) apriori, and (ii) by the programmer(s). And regarding intelligence we can recognize that we never can assign concepts like artificial or natural to it, since there is always a strong dependence on culture in it. Michel Serres once called written language the first artificial intelligence, pointing to the central issue of any technology: the externalization of symbol-based systems of references.
This brings us back to our core issue here, the conditions for the transformation of (mainly) physical matter into (mainly) symbolic matter. In some important way we even can state that there is no matter without symbolic aspects. Two pieces of matter can interact only if they are not completely transparent to each other. If there is an effective transfer of energy between them, then the form of the energy becomes important; think of it for instance as the wavelength of some electromagnetic radiation, or its rhythmicity, which becomes distinctive in the case of a LASER [9,10]. Sure, in a LASER there are no symbols to be found; yet, the system as a whole establishes a well-defined and self-focusing classification, i.e. it performs the transition from a white-noised, real-valued randomness to a discrete intensional dynamics. The LASER thus has to be regarded as a particular kind of associative system, which is able to produce proto-symbols.
Of course, we may not restrict our considerations to such basic instances of pan-semiotics. When talking about machine-based episteme we talk about the ability of an entity to think about the conditions for its own informational dynamics (avoiding the term knowledge here…). Obviously, this requires some kind of language. The question for any attempt to make machines “intelligent” thus concerns in turn the question about how to think about the individual acquisition of language, and, of course, with regard to our interests here, how to implement the conditions for it. Note that homo erectus, who lived 1 million years ago, must have had a clear picture not only of causality, and not only individually, but they must also have had the ability to talk about that, since they were able to keep fire burning and to utilize it for cooking meat and bones. Logic had not been invented as a field at that time, but it seems absolutely mandatory that they were using a language.10 Even animals like cats, pigs or parrots are able to develop and to perform plans, i.e. to handle causality, albeit probably not in a conscious manner. Yet, neither wild pigs nor cats are capable of symbol-based culture, that is a culture which spreads on the basis of symbols that are independent from a particular body or biological individual. The research programs of machine learning, robotics or artificial intelligence thus appear utterly naive, since they all neglect the cultural dimension.
The central set of questions thus considers the conditions that must be met in order to become able to deal with language, to learn it and to practice it.
These conditions are not only “private”, that is, they can’t be reduced to individual brains, or machines, that would “process” information. Leaving aside for the moment the simplistic perspective on information as it is usually practiced in computer science, we have to accept that learning language is a deeply social activity, even if the label of the material description of the entity is “computer”. We also have to think about the mediality of symbolic matter, the transition from nature to culture, that is from contexts of low symbolic intensity to those of high symbolic intensity. Handling language is not an affair that could be thought to be performed privately; there is no such thing as a “private language”. Of course, we have brains, for which the matter could still be regarded as dominant, and the processes running there are running only there11.
Note that implementing the handling of words as apriori existing symbols is not what we are talking about here. As Hofstadter pointed out , calling the computing processes on apriori defined strings “language understanding” is nothing but silly. We are not allowed to call the shuffling of predefined encoded symbols forth and back “understanding”. But what could we call “understanding” then? Again, we have to postpone this question for the time being. Meanwhile we may reshape the question about learning language a bit:
The Unresolved Challenge
The big danger when addressing these issues is to start too late, provoked by an ontological stance that is applied to language. The most famous example is probably provided by Heidegger and his attempt at a “fundamental ontology”, which failed spectacularly. It is all too easy to get bewitched by language itself and to regard it as something natural, as something like stones: well-defined, stable, and potentially serving as a tool. Language itself makes us believe that words exist as such, independent from us.
Yet, language is a practice, as Wittgenstein said, and this practice is neither a single homogeneous one nor does it remain constant throughout life, nor are its instances identical and exchangeable. The practice of language develops, unfolds, gains quasi-materiality, turns from an end to a means and back. Indeed, language may be characterized just by the capability to provide that variability in the domain of the symbolic. Take as a contrast for instance the symbolon, or take the use of signs in animals: in both cases there is exactly one single “game” you can play. Only in such trivial cases could the meaning of a name be said to be close to its referent. Yet, language games are not trivial.
I already mentioned the implicit popularity of Augustine among computer scientists and information systems engineers. Let me cite the passage that Wittgenstein chose in his opening remarks to the famous Philosophical Investigations (PI)12. Augustine writes:
When they (my elders) named some object, and accordingly moved towards something, I saw this and I grasped that the thing was called by the sound they uttered when they meant to point it out. Their intention was shewn by their bodily movements, as it were the natural language of all peoples: the expression of the face, the play of the eyes, the movement of other parts of the body, and the tone of voice which expresses our state of mind in seeking, having, rejecting, or avoiding something. Thus, as I heard words repeatedly used in their proper places in various sentences, I gradually learnt to understand what objects they signified; and after I had trained my mouth to form these signs, I used them to express my own desires.
Wittgenstein gave two replies, one directly in the PI, the other one in the collection entitled “Philosophical Grammar” (PG).
These words, it seems to me, give us a particular picture of the essence of human language. It is this: the individual words in language name objects—sentences are combinations of such names.—In this picture of language we find the roots of the following idea: Every word has a meaning. This meaning is correlated with the word. It is the object for which the word stands.
Augustine does not speak of there being any difference between kinds of word. If you describe the learning of language in this way you are, I believe, thinking primarily of nouns like “table,” “chair,” “bread,” and of people’s names, and only secondarily of the names of certain actions and properties; and of the remaining kind of words as something that will take care of itself. (PI §1)
And in the Philosophical Grammar:
When Augustine talks about the learning of language he talks about how we attach names to things or understand the names of things. Naming here appears as the foundation, the be all and end all of language. (PG 56)
Before we take the step to drop and to drown the ontological stance once and for all we would like to provide two things. First, we will briefly cite a summarizing table from Blair 13. Blair’s book is indeed a quite nice work about the peculiarities of language as far as it concerns “information retrieval” and how Wittgenstein’s philosophy could be helpful in resolving the misunderstandings. Second, we will (also very briefly) make our perspective on names and naming explicit.
David Blair dedicates quite some effort to rendering the issue of indeterminacy of language as clear as possible. In alignment with Wittgenstein he emphasizes that indeterminacy in language is not the result of sloppy or irrational usage. Language is neither a medium of logics nor something like a projection screen of logics. There are good arguments, represented by the works of Ludwig Wittgenstein, the later Hilary Putnam and Robert Brandom, to believe that language is not an inferior way to express a logical predicate (see the previous chapter about language). Language can’t be “cleared” or made less ambiguous; its vagueness is a constitutive necessity for its use and utility in social intercourse. Many people in linguistics (e.g. Rooij ) and large parts of cognitive science (e.g. Alvin Goldman 14), but also philosophers like Saul Kripke  or Scott Soames , take the opposite position.
Of course, in some contexts it is reasonable to try to limit the vagueness of natural language, e.g. in law and contracts. Yet, it is also clear that positivism in jurisprudence is a rather bad thing, especially if it shows up paired with idealism.
Blair then contrasts two areas in so-called “information retrieval”15, distinguished by the type of data that is addressed: structured data that could be arranged in tables on the one hand, which Blair calls determinate data, and such “data” that can’t be structured apriori, like language, on the other. We already met this fundamental difference in other chapters (about analogies, language). The result of his investigation is summarized in the following table. It is more than obvious that the characteristics of the two fields are drastically different, which equally obviously has to be reflected in the methods to be applied. For instance, the infamous n-gram method is definitely a no-go.
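To make the criticized method concrete, here is a minimal sketch of what an n-gram model actually does (the sample sentence and all names are illustrative). It reduces text to literal windows of adjacent tokens, which is exactly the kind of apriori structuring that, according to the argument above, cannot do justice to indeterminate language data.

```python
# A minimal, illustrative sketch of the n-gram method criticized above.
# It treats text as nothing but fixed windows of adjacent tokens.
from collections import Counter

def ngrams(tokens, n):
    """Return all contiguous n-token windows of a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = "time flies like an arrow".split()
bigram_counts = Counter(ngrams(tokens, 2))
# Every bigram is a literal pair of strings; nothing in this
# representation reflects usage, context, or interpretation.
print(bigram_counts[("time", "flies")])  # 1
```

Whatever "time flies" means in a given utterance, the model sees only the identity of the character strings; that is the determinacy assumption Blair's table rules out for language.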
For the same reasons, semantic disambiguation is not possible by a set of rules that could be applied by an individual, whether this individual is a human or a machine. Quite likely it is even completely devoid of sense to try to remove ambiguity from language. One of the reasons is given by the fact that concepts are transcendental entities. We will return to the issue of “ambiguity” later.
In the quote from the PG shown above Wittgenstein rejects Augustine’s perspective that naming is central to language. Nevertheless, there is a renewed discussion in philosophy about names and so-called “natural kind terms”, brought up by Kripke’s “Naming and Necessity” . Recently, Scott Soames explicitly referred to Kripke’s work. Yet, like so many others, Soames commits the drastic mistake introduced along the line formed by Frege, Russell and Carnap of ascribing to language the property of predicativity (cf.  p.646).
These claims are developed within a broader theory which, details aside, identifies the meaning of a non-indexical sentence S with a proposition asserted by utterances of S in all normal contexts.
We won’t delve into the discussion of “proper names”16 in any detail, because it is largely a misguided and unnecessary one. Let me just briefly mention three main (and popular) alternative approaches to the meaning of names: the descriptivist theories, the referential theory originally advanced by John Stuart Mill, and the causal-historical theory. None of them is tenable, because they all implicitly violate the primacy of interpretation, though not in an obvious manner.
Why can’t we say that a name is a description? A description needs assignates17, or aspects, if you like, at least one scale. Assuming that there is the possibility of a description that is apriori justified and hence objective invokes divinity as a hidden parameter, or any other kind of Fregean hyper-idealism. Assignates are chosen according to and in dependence on the context. Of course, one could try to expel any variability of any expectable context, e.g. by literally programming society, or by some kind of philosophical dictatorship. In any other case, descriptions are variant. The actual choice of any kind of description is the rather volatile result of negotiation processes in the embedding society. The rejection of names as descriptions results from two contradictory pragmatic stances: first, names are taken as indivisible, atomic entities, while second, descriptions are context-dependent, subatomic properties, which, by virtue of the implied pragmatics, corroborates the primacy of interpretation. Remember that the context-dependency results from the empirical underdetermination. In standard situations it is neither important that water is a compound of hydrogen and oxygen, nor is this what we want to say in everyday situations. We do not carry the full description of the named entity along into every instance of its use, although there are some situations where we indeed are interested in the description, e.g. as a scientist, or as a supporter of the “hydrogen economy”. The important point is that we never can determine the status of the name before we have interpreted the whole sentence, while we also can’t interpret the sentence without determining the status of the named entity. Both entities co-emerge. Hence we also can’t give an explicit rule for such a decision other than just using the name or uttering the sentence. Wittgenstein thus denies the view that assumes a meaning behind the words that is different from their usage.
The claim that the meaning of a proper name is its referent runs into similar problems, because it just introduces the ontological stance through the backdoor. Identifying the meaning of a label with its referent implies that the meaning is taken as something objective, as something that is independent from context, and even beyond that, as something that could be packaged and transferred *as such*. In other words, it deliberately denies the primacy of interpretation. We need not say anything further, except perhaps that Kripke (and Soames as well, in taking him seriously) commits a third mistake in using “truth-values” as factual qualities.18 We may propose that the whole theory of proper names pursues a pseudo-problem, induced by overgeneralized idealism or materialism.
Names, proper: Performing the turn completely
Yet, what would be an appropriate perspective from which to deal with the problem of names? What I would like to propose is a consequent application of the concept of the “language game”. The “game” perspective can be applied not only to the complete stream of exchanged utterances, but also to the parts of sentences, e.g. names and single words. As a result, new questions become visible. Wittgenstein himself did not explore this possibility (he took Augustine as a point of departure), and it cannot be found in contemporary discourse either19. As so often, philosophers influenced by positivism simply forget about the fact that they are speaking. Our proposal is markedly different from and also much more powerful than the causal-historical or the descriptivist approach, and it also avoids the difficulties of Kripke’s externalist version.
After all, naming, to give a name and to use names, is a “language game”. Names are close to observable things, and as a matter of fact, observable things are also demonstrable. Using a name refers to the possibility of a speaker to provide a description to his partner in discourse such that this listener would be able to agree on the individuality of the referenced thing. The use of the name “water” for this particular liquid thing does not refer to an apriori fixed catalog of properties. Speaker and listener need not even agree on the identity of the set of properties ascribed to the referred physical thing. The chemist may always associate the physico-chemical properties of the molecule even when he reads about the submersed sailors in Shakespeare’s *Tempest*, but nevertheless he easily could talk about that liquid matter with a 9-year-old boy who knows neither about Shakespeare nor about the molecule.
It is thus neither possible nor reasonable to try to achieve a match regarding the properties, since a rich body of methods would necessarily be invoked to determine that set. Establishing the identity of representations of physical, external things, or even of the physical things themselves, inevitably invokes a normative act (which is rather incommensurable with the empiricists’ claims).
For instance, when just saying “London”, out of the blue, it is not necessary that we envisage the same aspects of the grand urban area. Since cities are inevitably heterotopic entities (in the sense of Foucault [19, 20], acc. to David Graham Shane ), this agreement is actually impossible. Even for the undeniably more simple-minded cartographers the same problem exists: “where” is that London, in terms of spherical coordinates? Despite these unavoidable difficulties both the speaker and the listener easily agree on the individuality of the imaginary entity “London”. The name “London” does not point to a physical thing but just to an imaginative pole. In contrast to concepts, however, names take a different grammatical role, as they not only allow for a negotiation of rather primitive assignates in order to take action, they even demonstrate the possibility of such negotiation. The actual negotiations could be quite hard, though.
We conclude that we are not allowed to take any of the words as something that would “exist” as a, or like a, physical “thing”. Of course, we get used to certain words; they gain a quasi-materiality because a constancy appears that may be much stronger than the initial contingency. But this “getting used” is a different topic; it just refers to how we speak about words. Naming remains a game, and like any other game this one also does not have an identifiable border.
Despite this manifold that is mediated through language, or as language, it is also clear that language remains rooted in activity or the possibility of it. I demonstrate the usage of a glass and accompany that by uttering “glass”. Of course, there is the Gavagai problematics20 as it has been devised by Quine . Yet, this problematics is not a real problem, since we usually interact repeatedly. On the one hand this provides us the possibility to improve our capability to differentiate single concepts in a certain manner, but on the other hand the extended experience introduces a secondary indeterminacy.
In some way, all words are names. All words may be taken as indicators that there is the potential to say more about them, yet in a different, orthogonal story. This holds even for the abstract concepts denoted by the word “transcendental” or for verbs.
The usage of names, i.e. their application in the stream of sentences, gets richer and richer, but also more and more indeterminate. All languages developed some kind of grammar, which is a more or less strict body of rules about how to arrange words for certain language games. Yet, grammar is not a necessity for language at all; it is just a tool to render language-based communication easier, faster and more precise. Beyond the grammars, it is experience which enables us to use metaphors in a dedicated way. Yet, language is not a thing that sometimes contains metaphors and sometimes not. In a very basic sense all language is metaphorical all the time.
So, we first conclude that there is nothing enigmatic in learning a language. Secondly, we can say that extending the “gameness” down to words provides the perspective of the mechanism, notably without reducing language to names or propositions.
There is a drastic consequence of the completed gaming perspective. Words can’t be “represented” as symbols or as symbolic strings in the brain, and words can’t be appropriately represented as symbols in the computer either. Given any programming language, strings in a computer program are nothing other than particularly formatted series of values. Usually, this series is represented as an array of values, which is part of an object. In other words, the word is represented as a property of an object, where such objects are instances of their respective classes. Thus, the representation of words in ANY computer program created so far for the purpose of handling texts, documents, or textual information in general is deeply inappropriate.
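The point can be shown in a few lines. In a sketch like the following (purely illustrative), the “word” inside a program is reduced to a series of numeric values; the identity of two words is nothing but the identity of two value series:

```python
# Sketch illustrating the representation criticized above: in any
# ordinary programming language a "word" is merely a formatted
# series of values.
word = "water"
values = [ord(c) for c in word]   # the string as an array of code points
print(values)  # [119, 97, 116, 101, 114]
# Two tokens count as "the same word" only if their value series are
# identical; the program has no access to usage, context, or
# interpretation whatsoever.
print(values == [ord(c) for c in "water"])  # True
```

This is the sense in which the string and its "meaning" remain completely external to each other in a conventional program.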
Instead, the representation of the word has to carry along its roots, its path of derivation, or in still other words, its traces of precipitation of the “showing”. This rooting includes, so we may say, a demonstrativum, an abstract image. This does not mean that we have to set up an object in the computer program that contains a string and an abstract image. That would just be the positivistic approach, leaving all problems untouched, the string and the image still being independent. The question of how to link them would just be delegated to the next analytic homunculus.
What we propose are non-representational abstract compounds that are irrevocably multi-modal since they are built from the assignates of abstract “things” (Gegenstände). These compounds are nothing else than combined sets of assignates. The “things” represented in this way are actually always more or less “abstract”. Through the sets of assignates we actually may combine even things which appear incommensurable on the level of their wholeness, at least at first sight. An action is an action, not a word, and vice versa, an image is neither a word nor an action, isn’t it? Well, it depends; we already mentioned that we should not take words as ontological instances. Any of those entities can be described using the same formal structure, the probabilistic context that is further translated into a set of assignates. The probabilistic context creates a space of expressibility, where the incommensurability disappears, notably without reducing the comprised parts (image, text,…) to the slightest extent.
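As a purely hypothetical illustration of such a compound (all keys and weights below are invented for this sketch and are not taken from any implementation), the same formal structure, a combined set of weighted assignates, can hold entries derived from text, images and actions alike:

```python
# Hypothetical sketch of a multi-modal compound built from assignates.
# All entries and weights are illustrative; only the structure -- one
# combined set of weighted assignates drawn from different modalities --
# follows the argument in the text.
compound = {
    # (modality, assignate) -> probabilistic weight within the context
    ("text",   "co-occurs-with:liquid"): 0.8,
    ("image",  "edge-histogram:smooth"): 0.4,
    ("action", "pour"):                  0.6,
}
# Words, images and actions become commensurable because they are all
# expressed in the same formal structure: a set of weighted assignates.
modalities = {m for (m, _) in compound}
print(sorted(modalities))  # ['action', 'image', 'text']
```

The incommensurability between the modalities disappears at the level of the structure, while none of the comprised parts is reduced to another.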
The situation is somewhat reminiscent of synesthetic experiences. Yet, I would like to avoid calling it synesthetic, since synesthesia is experienced on a highly symbolic level. Like other phenomenological concepts, it also does not provide any hint about the underlying mechanisms. In contrast, we are talking about a much lower level of integration. Probably we could call this multi-modal compound a “syn-presentational” compound, or in short, a “synpresentation”.21
Words, images and actions are represented together as a quite particular compound, an inextricable multi-modal compound. We also may say that these compounds are derived qualia. The exciting point is that the described way of probabilistic multi-modal representation obviates the need for explicit references and relations between words and images. These relations would even have to be defined apriori (strongly: before programming, weakly: before usage). In our approach, and quite in contrast to the model of external control, relations and references *can be* subject to context-dependent alignments, either to the discourse, or to the task (of preparing a deliverable from memory).
The demonstrativum may not only refer to an “image”. First note that the image does not exist outside of its interpretation. We need to refer to that interpretation, not to an index in a database or a file system. Interpretation thus means that we apply a lot of various processing and extraction methods to it, each of them providing a few assignates. The image is dissolved into probabilistic contexts, as we do for words (we have described this elsewhere). The dissolving of an image is of course not the endpoint of a communicable interpretation; it is just the starting point. Yet, this does not matter, since the demonstrativum may also refer to any derived intension and even to any derived concept.22
The probabilistic multi-modal representation exhibits three highly interesting properties, concerning abstractness, relations and the issue of foundations. First, the abstractness of represented items becomes scalable in an almost smooth manner. In our approach, “abstractness” is not a quality any more. Secondly, relations and references of both words and the “content” of images are transformed into their pre-specific versions. Both relations and references need not be implemented apriori or observed as an apriori. Initially, they appear only as randolations23. Thirdly, some derived and already quite abstract entities on an intermediate level of “processing” are more basic than the so-called raw observations24.
Words, Classes, Models, Waves
It is somewhat tempting to arrange these four concepts into a hierarchical series. Yet, things are not that simple. Actually, any of the concepts that appear more as a symbolistic entity also may re-turn into a quasi-materiality, into a wave-like phenomenon that itself serves as a basis for potential differences. This re-turn is a direct consequence of the inextricable mediality of the world, mediality understood here as a transcendental category. Needless to say, mediality is just another blind spot of contemporary computer science. Cybernetics as well as engineering straightaway exclude the possibility of recognizing the mediatedness of worldly events.
In this section we will try to explicate the relations between the headlined concepts to some extent, at least as far as it concerns their mapping into an implementable system of (non-Turing) “computer programs”. The computational model that we presuppose here is the extended version of the 2-layered SOM, as we have introduced it previously.
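For readers unfamiliar with the underlying mechanism, a minimal sketch of a plain, single-layer SOM update step may help. This is explicitly not the extended 2-layered version presupposed in the text, only the associative core such an architecture builds upon; grid size, dimensionality and learning parameters are chosen merely for illustration.

```python
import numpy as np

# Minimal sketch of a basic Self-Organizing Map (SOM) training step.
# NOT the extended 2-layered variant referred to in the text -- only
# the associative core on which such an architecture could build.
rng = np.random.default_rng(0)
grid_w, grid_h, dim = 5, 5, 3
weights = rng.random((grid_w, grid_h, dim))   # one prototype per node

def train_step(x, t, lr0=0.5, sigma0=2.0, tau=50.0):
    """Move the best-matching unit (and its neighbours) towards input x."""
    lr = lr0 * np.exp(-t / tau)        # decaying learning rate
    sigma = sigma0 * np.exp(-t / tau)  # shrinking neighbourhood radius
    # best-matching unit: the node whose prototype is closest to x
    d = np.linalg.norm(weights - x, axis=2)
    bi, bj = np.unravel_index(np.argmin(d), d.shape)
    for i in range(grid_w):
        for j in range(grid_h):
            g = np.exp(-((i - bi) ** 2 + (j - bj) ** 2) / (2 * sigma ** 2))
            weights[i, j] += lr * g * (x - weights[i, j])

for t in range(100):
    train_step(rng.random(dim), t)
```

The crucial point for the argument here is that the map is purely associative: classes emerge from repeated exposure, they are not defined apriori by the programmer.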
Let us start with first things first. Given a physical signal, here in the literal sense, that is as a potentially perceivable difference in a stream of energy, we find embodied modeling, and nothing else. The embodiment of the initial modeling is actualized in sensory organs, or more generally, in any instance that is able to discretize the waves and differences at least “a bit more”. In more technical terms, the process of discretization is a process that increases the signal-noise ratio. In biological systems we often find a frequency encoding of the intensity of a difference. Though the embodiment of that modeling is indeed a filtering and encoding, hence already some kind of a modeling representation, it is not a modeling in the more narrow sense. It points out of the individual entity into the phylogenesis, the historical contingency of the production of that very individual entity. We also can’t say that the initial embodied processing by the sensory organs is a kind of encoding. There is no code consisting of well-identified symbols at the proximate end of the sensory cell. It is still a rather probabilistic affair.
This basic encoding is not yet symbolic, albeit we also can’t call it a wave any more. In biological entities this slightly discretized wave is then subject to an intense modeling sensu stricto. The processing of the signals is performed by associative mechanisms that are arranged in cascades. This “cascading” is highly interesting and probably one of the major mandatory ingredients that have been neglected by computer science so far. The reason is quite clear: it is not an analytic process, hence it is excluded from computer science almost by definition.
Throughout that cascade signals turn more and more into information as interpreted difference. It is clear that there is no single or identifiable point in this cascade to which one could assign the turn from “data” to “information”. The process of interpretation is, quite in contrast to idealistic pictures of the process of thinking, not a single step. The discretized waves that flow into the processing cascade are subject to many instances and very different kinds of modeling, throughout which discrete pieces get separated and related to other pieces. The processing cascade thus repeats a modular principle consisting of association and distribution.
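Such a cascade can be sketched in toy form (the function names and the crude rounding rule below are illustrative, not a model of neural processing): each stage associates similar signals into coarser classes and distributes the result to several downstream modules, and the modular pair repeats.

```python
# Toy sketch of the cascade described above: alternating association
# (grouping similar signals into coarser classes) and distribution
# (handing the classes on to several downstream modules).
def associate(signals):
    """Group raw values into coarse classes (a crude discretization)."""
    return sorted({round(s, 1) for s in signals})

def distribute(classes, n_targets=2):
    """Hand every class on to each of several downstream modules."""
    return [list(classes) for _ in range(n_targets)]

layer1 = associate([0.11, 0.12, 0.58, 0.61])
layer2 = [associate(c) for c in distribute(layer1)]
print(layer1)  # [0.1, 0.6]
```

Note that nowhere in such a cascade is there a designated point where "data" becomes "information"; the discretization is distributed across all stages.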
We still could not label this level “thinking”, albeit it is clearly some kind of a mental process. Yet, we could still regard it as something “mechanical”, even though we already find class-like representations, intensions and proto-concepts. Thinking in its meaningful dimension, however, appears only through assigning sharable symbols. Thinking of something implicitly means that one could tell about the respective thoughts. It does not matter much whether these symbols are shared between different regions in the brain or between different bodily entities. Hence, thinking and mental processes need to be clearly distinguished. Yet, assigning symbols, that is assigning a word, a specific sound first, and later, as a further step of externalization, a specific grapheme that reflects the specific sound, which in turn represents an abstract symbol, this process of assigning symbols is only possible through cultural means. Cats may recognize situations very well and react accordingly, they may even have a feeling that they have encountered that situation before, but cats can’t share their symbols, they can’t communicate the relational structure of a situation. Yet, cats and dogs already may take part in “behavior games”, and such games clearly have been found in baboons by Fernando Colmenares. Colmenares adopted the concept of “games” precisely because of the co-occurrence of obvious rules, high variability, and predictive values of actions and reactions of the individual animals. Such games unfold synchronically as well as diachronically, and across dynamically changing assignments of social roles. All of this is accompanied by specific sounds. Other instances of language-like externalization of symbols can presumably be found in grey parrots, vervet monkeys, bonobos, dolphins and orcas.
But still… in animals those already rather specific symbols are not externalized by imprinting them into matter different from their own bodies. One of the most desirable capabilities for our endeavor here about machine-based episteme thus consists in just that: externalization processes embedded in social contexts.
Now the important thing to understand is that this whole process from waves to words is not simply a one-way track. First, words do not exist as such, they just appear as discrete entities through usage. It is the usage of a word that introduces irreversibility. In other words, the discreteness of words is a quality that is completely on the a posteriori side of thinking. Before their actual usage, their arrangement into sentences, words “are” nothing else than probabilistic relations. It needs a purpose, a target-oriented selection (call it “goal-directed modeling”) to let them appear as crisp entities.
The second issue is that a sentence is an empirical phenomenon, remarkably even to the authoring brain itself. The sentence needs interpretation, because it is never ever fully determinate. Interpretation, however, of such indeterminate instances like sentences renders the apparently crisp phenomenon of words back into waves. A further effect of interpreting sentences as series of symbols is the construction of a virtual network. Texts, and in a very similar way, pieces of music, should not be conceived as series, as computational linguistics treats them. Much more appropriately, texts are conceived as networks, which even may exert their own (again virtual) associative power, which to some extent is independent from the hosting interpreter, as I have argued here.
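The contrast between series and network can be made concrete with a minimal sketch; this is my own toy construction, not a claim about any particular system in computational linguistics.

```python
# A toy sketch: re-reading a series of words as a weighted network.
from collections import defaultdict

def cooccurrence_network(text, window=3):
    """Count co-occurrences of word pairs within a sliding window.

    Returns a dict mapping sorted word pairs to their edge weight.
    The window size is an arbitrary assumption.
    """
    words = text.lower().split()
    edges = defaultdict(int)
    for i in range(len(words)):
        # relate each word to its near neighbors, not just its successor
        for j in range(i + 1, min(i + window, len(words))):
            a, b = sorted((words[i], words[j]))
            if a != b:
                edges[(a, b)] += 1
    return dict(edges)

# The same string, conceived as a network instead of a series:
net = cooccurrence_network("the cat sat on the mat near the cat")
```

Reading off the neighbors of a node then yields something like the virtual associative texture described above; the fixed window is, of course, a crude stand-in for interpretation.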
Role of Words
All these characteristics of words, their purely a posteriori crispness, their indeterminacy as sub-sentential indicators of randolational networks, their quality as signs by which they only point to other signs, but never to “objects”, their double quality as constituent and result of the “naming game”, all these “properties” make it appear highly unlikely and questionable whether language is about reference at all. Additionally, we know that the concept of “direct” access to the mind or the brain is simply absurd. Everything we know about the world as individuals is due to modeling and interpretation. That of course also concerns the interpretation of cultural artifacts or culturally enabled externalizations of symbols, for instance into the graphemes that we use to represent words.
It is of utmost importance to understand that the written or drawn grapheme is not the “word” itself. The concept of a “word-as-such” is highly inappropriate, if not bare nonsense.
So, if words, sentences and language at large are not about “direct” referencing of (quasi-)material objects, how then should we conceive of the process we call “language game”, or “naming game”? Note that we now can identify van Fraassen’s question about “how do words and concepts acquire their reference?” as a misunderstanding, deeply informed by positivism itself. It does not make sense to pose the question in this way at all. There is not first a word which then, in a secondary process, gets some reference or meaning attached. Such a concept is almost absurd. Similarly, the distinction between syntax and semantics, once introduced by the positivist Morris in the late 1940s, is to be regarded as much the same pseudo-problem, established just by the fundamental and elemental assumptions of positivism itself: linear additivity, metaphysical independence and lossless separability of parts of wholes. If you scatter everything into single pieces of empirical dust, you will never be able to make any proposition anymore about the relations you destroyed before. That’s the actual reason for the problem of positivistic science and its failure.
In contrast to that we tend to propose a radically different picture of language, one that of course has existed in many preformed flavors. Since we can’t transfer anything directly into another’s mind, the only thing we can do is to invite or trigger processes of interpretation. In the chapter about vagueness we called words “processual indicatives” for slightly different reasons. Language is a highly structured, institutionalized and symbolized “demonstrating”, an invitation to interpret. Robert Brandom investigated in great detail the processes and the roles of speakers and listeners in that process of mutual invitation for interpretation. The mutuality allows a synchronization, a resonance and a more or less strong resemblance between pairs of speaker-listeners and listener-speakers.
The “naming game” and its derivative, the “word game”, are embedded into a context of “language games”. Actually, word games and language games are not as closely related as it might appear prima facie, at least beyond their common characteristics that we may label “game”. This becomes apparent if we ask what happens with the “physical” representative of a single word that we throw into our mechanisms. If there is no sentential context, or likewise no social context like a chat, then a lot of quite different variants of possible continuations are triggered. Calling out “London”, our chat partner may continue with “Jack London” (the writer), “Jack the Ripper”, Chelsea, the Tower of London, Buckingham, London Heathrow, London Soho, the London Stock Exchange, etc., but also Paris, Vienna, Berlin, etc., the choices being slightly dependent on our mood, the thoughts we had before, and so on. In other words, the word that we bring to the foreground as a crisp entity behaves like a seedling: it is the starting point of a potential garden or forest, it functions as the root of the unfolding of a potential story (as a co-weaving of a network of abstract relations). Just to bring in another metaphorical representation: words are like the initial traces of firework rockets, or the traces of elementary particles in statu nascendi as they can be observed in a bubble chamber: they promise a rich texture of upcoming events.
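The “London” example can be caricatured in a few lines of Python; the association table and the seeded randomness are pure assumptions here, standing in for mood and the thoughts we had before.

```python
import random

# A hypothetical association table; the entries merely mirror the
# example continuations given in the text above.
ASSOCIATIONS = {
    "london": ["jack london", "jack the ripper", "chelsea",
               "tower of london", "buckingham", "london heathrow",
               "london soho", "london stock exchange",
               "paris", "vienna", "berlin"],
}

def continuations(word, k=3, seed=None):
    """Return k possible continuations for a word thrown into the game.

    Without sentential or social context many variants are triggered;
    the (seeded) randomness stands in for mood and preceding thoughts.
    """
    rng = random.Random(seed)
    pool = ASSOCIATIONS.get(word.lower(), [])
    return rng.sample(pool, min(k, len(pool)))
```

Calling `continuations("London")` repeatedly with different seeds yields different “gardens” grown from the same seedling, which is all the sketch is meant to show.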
Understanding (Images, Words, …)
We have seen that “words” gain shape only as a result of a particular game, the “naming game”, which is embedded into a “language game”. Before those games are played, “words” do not exist as discrete, crisp entities, say as a symbol, or a string of letters. If they did, we could not think. Even more than the “language game”, the “naming game” works mainly as an invitation or as an acknowledged trigger for more or less constrained interpretation.
Now there are those enlightened language games of “understanding” and “explaining”. Both of them work just as any other part of speech does: they promise something. The claim to understand something refers to the ability to potentially prepare a series of triggers that one additionally claims to be able to arrange in such a way as to support the gaining of the respective insight in one’s chat partner. Slightly derived from that, understanding also could mean to transfer the structure of the underlying or overarching problematics to other contexts. This ability for adaptive reframing of a problematic setting is thus always accompanied by a demonstrativum, that is, by some abstract image, either by actual pictorial information or its imagination, or by its activity. Such a demonstrativum could of course be located completely within language itself, which however is probably quite rare.
It is clear that language does not work as a way to express logical predicates. Trying to use it so requires careful preparations. Language can’t be “cured” and “cleaned” of ambiguities; trying to do so would establish a categorical misunderstanding. Any “disambiguation” happens as a resonating resemblance of at least two participants in language-word-gaming, mutually interpreting each other until both believe that their interests and their feelings match. An actual, so to speak objective, match is neither necessary nor possible. In other words, language does not exist in two different forms, one without ambiguity and without metaphors, and the other full of them. Language without metaphorical dynamics is not a language at all.
The interpretation of empirical phenomena, whether outside of language or concerning language itself, is never fully determinable. Quine called the idea of the possibility of such a complete determination a myth, a “dogma of empiricism”. Thus, given this underdetermination, it does not make any sense to expect that language should be isomorphic to logical predicates or propositions. Language is basically an instance of impredicativity. Elsewhere we already met the self-referentiality of language (its strong singularity) as another reason for this. Instead, we should expect that this fundamental empirical underdetermination is reflected appropriately in the structure of language, namely as analogical thinking, or, quite related to that, as metaphorical thinking.
Ambiguity is not a property of language or words, it is a result, or better, a property of the process of interpretation at some arbitrarily chosen point in time. And that process takes place synchronously within a single brain/mind as well as between two brains/minds. Language is just the mediating instance of that intercourse.
It is now possible to clarify the ominous concept of “intelligence”. We find the concept in the name of a whole discipline (“Artificial Intelligence”), and it is at work behind the scenes in areas dubbed “machine learning”. Then there is the hype about so-called “collective intelligence”. These observations, and of course our own intentions, make it necessary to deal briefly with the concept, albeit we think that it is a misleading and inappropriate idea.
First of all one has to understand that “intelligence” is an operationalization of a research question, allowing for a measurement, hence for a quantitative comparison. It is questionable whether mental qualities can be made quantitatively measurable without reducing them seriously. For instance, the capacity for I/O operations related to a particular task surely can’t be equated with “intelligence”, even if it could be a necessary condition.
It is just silly to search for “intelligence” in machines or beings, or to assign more or less intelligence to any kind of entity. Intelligence as such does not “exist” independently of a cultural setup; we can’t find it “out there”. Ontology is, as always, not only a bad trail, it leads directly into the abyss of nonsense. The research question, by the way, was induced by the intention to prove that black people and women are less intelligent than white males.
Yet, even if we take “intelligence” in an adapted and updated form as the capability for autonomous generalization, it is a bad concept, simply because it does not allow one to pose further reasonable questions. This follows directly from its character of being itself an operationalization. Investigating the operationalization hardly brings anything useful to light about the pretended subject of interest.
The concept of intelligence arose in a strongly positivistic climate, where positivism was practiced even in a completely unreflected manner. Hence, its inventors have not been aware of the effect of their operationalization. The concept of intelligence implies a strong functional embedding of the respective, measured entity. Yet, while dealing with language undeniably has something to do with higher mental abilities, language is a strictly non-functional phenomenon. It does not matter here that positivists still claim the opposite. And who would stand up claiming that a particular move, e.g. in planning a city, or in dealing with the earth’s climate, is smarter than another? In other words, the other strong assumption of positivism, measurability and identifiability, also fails dramatically when it comes to human affairs. And everything on this earth is a human affair.
Intelligence is only determinable relative to a particular Lebensform. It is thus not possible to “compare the intelligence” across individuals living in different contexts. This renders the concept completely useless, finally.
The hypothesis I have been arguing for in this essay claims that the trinity of waves, words and images plays a significant role in the ability to deal with language and in the emergence of higher mental abilities. I proposed first that this trinity is irreducible, and second that it is responsible for this ability in the sense of a necessary and sufficient condition. In order to describe the practicing of that trinity, for instance with regard to possible implementations, I introduced the term “synpresentation”. This concept draws the future track of how to deal with words and images as far as it concerns machine-based episteme.
In more direct terms, we conclude that without the capability to deal with “names”, “words” and language, the attempt to map higher mental capacities onto machines will not experience any progress. Once a machine has arrived at such a level, it will find itself in exactly the same position as we humans do. This capability is definitely not sufficiently defined by “calculation power”; indeed, such an idea is ridiculous. Without embedding into appropriate social intercourse, without solving the question of representation (contemporary computer science and its technology do NOT solve it, of course), even a combined 10^20’000 flops will not render the respective machine or network of machines “intelligent” in any way.
Words and proper names have been re-formulated as a particular form of “games”, though not as “language games”, but on a more elementary level as “naming games”. I have tried to argue how the problem of reference could be thought to disappear as a pseudo-problem on the basis of such a reformulation.
Finally, we found important relationships to earlier discussions of concepts like the making of analogies or vagueness. We basically agree on the stance that language can’t be clarified and that it is inappropriate (“free of sense”) to assign any kind of predicativity to language. Bluntly spoken, the application of logic happens in the mind, and nowhere else. Communicating about this application is not based on a language any more, and similarly, projecting logic onto language destroys language. The idea of a scientific language is as empty as the idea of a generally applicable and understandable language. A language that is not inventive could not be called a language at all.
1. If you read other articles in this blog you might think that there is a certain redundancy in the arguments and the targeted issues. This is not the case, of course. The perspectives are always a bit different; thus I hope that by the repeated attempt “to draw the face” the problematics is rendered more accurately. “How can one learn the truth by thinking? As one learns to see a face better if one draws it.” (Ludwig Wittgenstein, Zettel §255)
2. In one of the shortest articles ever published in the field of philosophy, Edmund Gettier demonstrated that it is deeply inappropriate to conceive of knowledge as “justified true belief”. Yet, in the field of machine learning, so-called “belief revision” still follows precisely this untenable position. See also our chapter about the role of logic.
4. We will see that the distinction or even separation of the “symbolic” and the “material” is neither that clear nor simple. From the side of the machine, Felix Guattari argued in favor of a particular quality, the machinic, which is roughly something like a mechanism in human affairs. From the side of the symbolic there is clearly the work of Edwina Taborsky to cite, who extended and deepened the work of Charles S. Peirce in the field of semiotics.
8. Heidegger developed the figure of the “Gestell”, which serves multiple purposes. It provides a storage capacity, it is a tool for a sort of well-ordered/organized hiding and unhiding (“entbergen”), it provides a scaffold for sorting things in and out, and thus it works as a complex constraint on technological progress. See also Peter Sloterdijk on this topic.
9. elementarization regarding Descartes
10. Homo floresiensis, also called “hobbit man”, lived on Flores, Indonesia, from about 600’000y until approx. 3’000y ago. Homo floresiensis derived from Homo erectus. 600’000 years ago they obviously built a boat to transfer to the island across a sea gate with strong currents. The interesting issue is that this endeavor requires a stable social structure, division of labor, and thus also language. Homo floresiensis had a particular forebrain anatomy which is believed to have provided the “intelligence”, while the overall brain was relatively small compared to ours.
11. Concerning “the enigma of brain-mind interaction”, Eccles was an avowed dualist. Consequently he searched for the “interface” between the mind and the brain, in which he was deeply inspired by the 3-world concept of Karl Popper. The “dualist” position holds that the mind exists at least partially independently from and somehow outside the brain. Irrespective of his contributions to neuroscience on the cellular level, these ideas (of Eccles and Popper) are just wild nonsense.
12. The Philosophical Investigations are probably the most important contribution to philosophy in the 20th century. They are often mistaken as a foundational document for analytic philosophy of language. Nothing is more wrong, however, than to take Wittgenstein as a founding father of analytic philosophy. Many of the positions that refer to Wittgenstein (e.g. Kripke) are just low-quality caricatures of his work.
14. Goldman provides a paradigmatic example of how psychologists constantly miss the point of philosophy, up to today. In an almost arrogant tone he claims: “First, let me clarify my treatment of justificational rules, logic, and psychology. The concept of justified or rational belief is a core item on the agenda of philosophical epistemology. It is often discussed in terms of “rules” or “principles” of justification, but these have normally been thought of as derivable from deductive and inductive logic, probability theory, or purely autonomous, armchair epistemology.”
Markie demonstrated that everything in these claims is wrong or mistaken. Our point about it is that something like “justification” is not possible in principle, and in particular it is not possible from an empirical perspective. Goldman’s secretions on the foundations of his own work are utter nonsense (till today).
15. It is one of the rare (but important) flaws in Blair’s work that he assimilates the concept of “information retrieval” in an unreflected manner. Neither is it reasonable to assign an ontological quality to information (we cannot say that information “exists”, as this would deny the primacy of interpretation), nor can we then say that information can be “retrieved”. See also our chapter about this issue. Despite his largely successful attempt to argue in favor of the importance of Wittgenstein’s philosophy for computer science, Blair fails to recognize that ontology is not tenable at large, but particularly for issues around “information”. It is a language game, after all.
16. See the Stanford Encyclopedia of Philosophy for a discussion of various positions.
17. In our investigation of models and their generalized form, we stressed the point that there are no apriori fixed “properties” of a measured (perceived) thing; instead we have to assign the criteria for measurement actively, hence we call these criteria assignates instead of “properties”, “features”, or “attributes”.
18. See our essay about logic.
20. See the entry in the Stanford Encyclopedia of Philosophy about Quine. Quine, in “Word and Object”, gives the following example (abridged version here). Imagine you discovered a formerly unknown tribe of friendly people. Nobody knows their language. You accompany one of them hunting. Suddenly a hare rushes along, crossing your way. The hunter immediately points to the hare, shouting “Gavagai!” What did he mean? Funnily enough, this story happened in reality. British settlers in Australia wondered about those large animals hopping around. They asked the aborigines about the animal and its name. The answer was “kangaroo”, which, so the story goes, means “I do not understand you” in their language.
21. This, of course, resembles Bergson, who, in Matter and Memory, argued that any thinking and understanding takes place by means of primary image-like “representations”. As Leonard Lawlor (Henri Bergson@Stanford) summarizes, Bergson conceives of knowledge as “knowledge of things, in its pure state, takes place within the things it represents.” We would not describe our principle of associativity, as it can be realized by SOMs, very differently…
22. The main difference between “intension” and “concept” is that the former still maintains a set of indices to raw observations of external entities, while the latter is completely devoid of such indices.
23. We conceived randolations as pre-specific relations; one may also think of them as probabilistic quasi-species that eventually may become discrete on behalf of some measurement. The intention for conceiving of randolations is given by the central drawback of relations: their double-binary nature presumes apriori measurability and identifiability, something that is not appropriate when dealing with language.
25. There are mainly two aspects to that: (1) large parts of the internet are organized as a hierarchical network, not as an associative network; nowadays everybody should know that telephone networks did not, do not and will not develop “intelligence”; (2) so-called grid computing is always organized as a linear, additive division of labor; thus it allows processes to run faster, but no qualitative change is achieved, as can be observed for instance in the purely size-related contrast between a mouse and an elephant. Taking (1) and (2) together, we may safely conclude that doing the wrong things (=counting Cantoric dust) at high speed will not produce anything capable of developing a capacity to understand anything.
-  Ludwig Wittgenstein, Zettel. Oxford, Basil Blackwell, 1967. Edited by G.E.M. Anscombe and G.H. von Wright, translated by G.E.M. Anscombe.
-  Edmund Gettier (1963), Is Justified True Belief Knowledge? Analysis 23: 121-123.
-  Michel Foucault “Dits et Ecrits”, Vol I.
-  Bernhard Waldenfels, Idiome des Denkens. Suhrkamp, Frankfurt 2005.
-  Henning Schmidgen (ed.), Aesthetik und Maschinismus, Texte zu und von Felix Guattari. Merve, Berlin 1995.
-  David Blair, Wittgenstein, Language and Information – Back to the Rough Ground! Springer Series on Information Science and Knowledge Management, Vol.10, New York 2006.
-  Martin Heidegger, The Question Concerning Technology and Other Essays. Harper, New York 1977.
-  Peter Sloterdijk, Nicht-gerettet, Versuche nach Heidegger. Suhrkamp, Frankfurt 2001.
-  Hermann Haken, Synergetik. Springer, Berlin New York 1982.
-  R. Graham, A. Wunderlin (eds.): Lasers and Synergetics. Springer, Berlin New York 1987.
-  John Eccles, The Understanding of the Brain. 1973.
-  Douglas Hofstadter, Fluid Concepts And Creative Analogies: Computer Models Of The Fundamental Mechanisms Of Thought. Basic Books, New York 1996.
-  Robert van Rooij, Vagueness, Tolerance and Non-Transitive Entailment. pp. 205-221 in: Petr Cintula, Christian G. Fermüller, Lluis Godo, Petr Hajek (eds.), Understanding Vagueness. Logical, Philosophical and Linguistic Perspectives. Vol.36 of Studies in Logic, College Publications, London 2011. Book available online.
-  Alvin I. Goldman (1988), On Epistemology and Cognition, a response to the review by S.W. Smoliar. Artificial Intelligence 34: 265-267.
-  Peter J. Markie (1996). Goldman’s New Reliabilism. Philosophy and Phenomenological Research Vol.56, No.4, pp. 799-817
-  Saul Kripke, Naming and Necessity. 1972.
-  Scott Soames, Beyond Rigidity: The Unfinished Semantic Agenda of Naming and Necessity. Oxford University Press, Oxford 2002.
-  Scott Soames (2006), Précis of Beyond Rigidity. Philosophical Studies 128: 645–654.
-  Michel Foucault, Les Hétérotopies – [Radio Feature 1966]. Youtube.
-  Michel Foucault, Die Heterotopien. Der utopische Körper. Aus dem Französischen von Michael Bischoff, Suhrkamp, Frankfurt 2005.
-  David Grahame Shane, Recombinant Urbanism – Conceptual Modeling in Architecture, Urban Design and City Theory. Wiley Academy Press, Chichester 2005.
-  Willard van Orman Quine, Word and Object. M.I.T. Press, Cambridge (Mass.) 1960.
-  Henri Louis Bergson, Matter and Memory. Transl. Nancy M. Paul & W. Scott Palmer, Martino Fine Books, Eastford (CT) 2011.
-  Fernando Colmenares, Helena Rivero (1986). A conceptual Model for Analysing Interactions in Baboons: A Preliminary Report. pp.63-80. in: Colgan PW, Zayan R (eds.), Quantitative models in ethology. Privat I.E, Toulouse.
-  Irene Pepperberg (1998). Talking with Alex: Logic and speech in parrots. Scientific American. Available online; see also the Wikipedia entry about Alex.
-  a. Robert Seyfarth, Dorothy Cheney, Peter Marler (1980). Monkey Responses to Three Different Alarm Calls: Evidence of Predator Classification and Semantic Communication. Science, Vol.210: 801-803.
-  b. Dorothy L. Cheney, Robert M. Seyfarth (1982). How vervet monkeys perceive their grunts: Field playback experiments. Animal Behaviour 30(3): 739–751.
-  Robert Seyfarth, Dorothy Cheney (1990). The assessment by vervet monkeys of their own and another species’ alarm calls. Animal Behaviour 40(4): 754–764.
-  Klaus Wassermann (2010). Nodes, Streams and Symbionts: Working with the Associativity of Virtual Textures. The 6th European Meeting of the Society for Literature, Science, and the Arts, Riga, 15-19 June, 2010. available online.
-  Robert Brandom, Making It Explicit. Harvard University Press, Cambridge (Mass.) 1998.
-  Willard van Orman Quine (1951), Two Dogmas of Empiricism. Philosophical Review, 60: 20–43. Available online.
February 17, 2012 § Leave a comment
The status of self-referential things is a very particular one.
They can be described only by referring to the concept of the “self.”
Of course, self-referential things are not without conditions, just like any other thing. It is, however, not possible to describe self-referential things completely just by means of those conditions, or dependencies. Logically, there is an explanatory gap regarding their inward-directed dependencies. The second peculiarity of self-referential things is that there are some families of configurations for which they become generative.
For strongly singular terms no possible justification exists. Nevertheless, they are there, and we even use them, which means that strong singularity does not imply isolation at all. The question then is how we can, and do, achieve such an embedding, and what the consequences of that are.
Despite the fact that there is no entry point which could be taken apriori as a justified or even salient one, we still have to make a choice about which one actually to take. We suppose that there is indeed such a choice. It is a particular one, though. We do not assume that the first choice is actually directed to an already identified entity, as this would mean that there already would have been a lot of other choices in advance. We would have to select methods and atoms to fix, i.e. select and choose the subject of a concrete choice, and so on.
The choice we propose to take is neither directed to an actual entity, nor is it itself an actual entity. We are talking about a virtual choice. Practically, we start with the assumption of choosability.
Actually, Zermelo performed the same move when trying to provide a sound basis for set theory after the idealistic foundation developed by Frege and others had failed so dramatically, leading into the foundational crisis of the formal sciences. Zermelo’s move was to introduce choosability as an axiom, called the axiom of choice.
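For reference, a standard modern formulation of Zermelo’s axiom (a textbook statement, not specific to this text) reads:

```latex
\forall X \,\Bigl[\, \varnothing \notin X \;\Longrightarrow\;
  \exists f \colon X \to \textstyle\bigcup X \;\;
  \forall A \in X \;\bigl( f(A) \in A \bigr) \Bigr]
```

that is, for every set X of nonempty sets there is a choice function f picking one element f(A) out of each member A of X, even if no rule for the picking can be stated.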
For Zermelo’s set theory the starting point, or if you prefer, the anchor point, lies completely outside the realm of the concept that is headed for. The same holds for our conceptualization of formalization. This outside is the structure of the pragmatic act of choice itself. This choice is a choice qua factum; it is not important that we choose from a set of identified entities.
The choice itself proposes by its mere performance that it is possible to think of relations and transformations; it is the unitary element of any further formalization. In Wittgenstein’s terms, it is part of the abstract life form. In accordance with Wittgenstein’s critique of Moore’s problems1, we can also say that it is not reasonable, or more precisely: it is without any sense, to doubt the act of choosing something, even if we did not think about anything particular. The mere executive aspect of any type of activity is sufficient for any a posteriori reasoning that a choice has been performed.
Notably, the axiom of choice implies the underlying assumption of an intensive relatedness between yet undetermined entities. In doing so, this position represents a fundamental opposition to the attitude of Frege, Russell and any modernist in general, who always start with the assumption of the isolated particle. For these reasons we regard the axiom of choice as one of the most interesting items in mathematics!
The choice thus is a Deleuzean double-articulation, closely related to his concept of the transcendental status of difference; we also could say that the choice has a transcendental dualistic characteristic. On the one hand there is nothing to justify. It is mere movement, or more abstractly, a pure mapping or transformation, just as a matter of fact. On the other hand, it provides us with the possibility of being enabled to conceive mere movement as such a mapping or transformation; it enables us to think the unit before any identification. Transformation comes first; Deleuze’s philosophy similarly puts the difference into the salient transcendental position. To put it still differently, it is the choice, or the selection, that is inevitably linked to actualization. Actualization and choice/selection are co-extensive.
Just another Game
So, let us briefly summarize the achievements. First, we may hold that, similarly to language, there is no justification for formalization. Second, as soon as we use language, we also use symbols. Symbols, on the other hand, take, as we have seen, a double-articulated position between language and form. We characterized formalization as a way to give a complicated thing a symbolic form that lives within a system of other forms. We can't conceive of forms without symbols. Language hence always implies, to some degree, formalization. It is only a matter of intensity, or likewise, a matter of formalizing the formalization, to proceed from language to mathematics. Third, both language and formalization belong to a particular class of terms that we characterized as strongly singular terms. These terms may well be put together with an abstract version of Kant's organon.
From those three points it follows that concepts that are denoted by strongly singular terms, such as formalization, creativity, or "I", have to be conceived, as we do with language, as particular types of games.
In short, all these games are being embedded in the life form of or as a particular (sub-)culture. As such, they are not themselves language games in the original sense as proposed by Wittgenstein.
These games are different from the language game, of course, mainly because the underlying mechanisms as well as the embedding landscape of purposes are different. These differences become clearly visible if we try to map those games into the choreostemic space. There, they will appear as different choreostemic styles. Despite the differences, we guess that the main properties of the language game also apply to the formalization game. This concerns the setup and performance of such games, their role, their evaluation etc., although the effective mechanisms might be slightly different; for instance, Brandom's principle of "making it explicit", which serves well in the case of language, is almost surely parameterized differently for the formalization or the creativity game. Of course, this guess has to be the subject of more detailed investigations.
Just as there are different natural languages that all share the same foundation of enabling or hosting the possibility of language games, we could infer—based on the shared membership in the family of strongly singular terms—that there are different forms of formalization. And of course, everybody knows at least two such different forms of formalization: music and mathematics. Yet, once we have found the glasses that allow us to see the multitude of games, we easily find others. Take for instance the notations in contemporary choreography that have been developed throughout the 20th century. Or the various formalizations that human cultures impose onto themselves as traditions.
Taken together, it is quite obvious that language games are not a singularity. There are other contexts, like formalization, modeling or "I-reflexivity", that exist for the same reason and are similarly structured, although their dynamics may be strikingly different. In order to characterize any possible such game we could abstract from the individual species by proceeding to the "-ability". Cultures then could be described precisely as the languagability of their members.
Based on the concept of strongly singular terms we first showed that we have to conceive of formalization (and symbol-based creativity) in a similar way as we do of language. Both are embedded in a life form (in the Wittgensteinian sense). Thus it makes sense to propose transferring the structure of the "game" from the domain of natural language to other areas that are arranged around strongly singular terms, such as formalization or creativity in the symbolic domain. As a nice side effect, this brought us to a proper generalization of the Wittgensteinian language games.
Yet, there is still more about creativity that we have to clarify before we can relate it to other "games" like formalization and prove the "beauty" of this particular combination. For instance, we have to become clear about the differences between systemic creativity, which can be observed in quasi-material arrangements (m-creativity), e.g. as self-organization, and the creativity that is at home in the realm of the symbolic (s-creativity).
The next step is thus to investigate the issue of expressibility.
1. In an objection to Wittgenstein, Moore raised the skeptic question about the status of certain doubts: Can I doubt that this hand belongs to me? Wittgenstein denied the reasonability of such kind of questions.
-  Zermelo, Set theory
-  Hahn, Grundlagenkrise
-  Deleuze & Guattari, Mille Plateaux
February 16, 2012
Formalization is based on the use of symbols.
In the last chapter we characterized formalization as a way to give a complicated thing a symbolic form that lives within a system of other forms.
Here, we will first discuss a special property of the concepts of formalization and creativity, one that they share for instance with language. We call this property strong singularity. Then, we will sketch some consequences of this state.
What does “Strongly Singular” mean?
Before discussing (briefly) the adjacent concept of "singular terms", I would like to add a note on the newly introduced term "strong singularity".
The ordinary Case
Let us take ordinary language, even if it may be a difficult thing to theorize about. At least, everybody is able to use it. We can do a lot of things with language; what these uses have in common, however, is that we employ them in social situations, mostly in order to elicit two "effects": First, we trigger some interpretation or even inference in our social companion; second, we indicate that we did just that. As a result, a common understanding emerges, formally taken, a homeomorphism, which in turn may then serve as the basis for the assignment of so-called "propositional content". Only then can we "talk about" something, that is, only then are we able to assign a reference to something that is external to the exchanged speech.
As said, this is the usual working of language. For instance, by saying "Right now I am hearing my neighbor exercising piano." I can refer to a common experience, or at least to a construction you would call an imagination (it is anyway always a construction). This way I refer to an external subject and its relations, a fact. We can build sentences about it, about which we could even say whether they correspond to reality or not. But, of course, this would already be a further interpretation. There is no direct access to the "external world".
In this way we can gain (fallaciously) the impression that we can refer to external objects by means of language. Yet, this is a fallacy, based on an illegitimate shortcut, as we have seen. Nevertheless, for most parts of our language(s) it is possible to refer to external or externalized objects by exchanging the mutual inferential/interpretational assignments as described above. I can say "music" and it is pretty clear what I mean by that, even if the status of the mere utterance of a single word is somewhat deficient: it is not determined whether I intended to refer to music in general, e.g. as the totality of all pieces or as a cultural phenomenon, or to a particular piece, to a possibility of its instantiation, or to the factual instance right now. Notwithstanding this divergent variety, it is possible to trigger interpretations and to start a talk between people about music, without having to play or listen to music at that moment.
The same holds for structural terms that regulate interpretation predominantly by their "structural" value. It is not that important for us here whether the externalization is directed at objects or at the speech itself. There is an external, even a physical justification for starting to engage in the language game about such entities.
Now, this externalization is not possible for some terms. The most obvious is "language". We can neither talk about language without language, nor can we even think "language" or have the "feeling" of language without practicing it. We also can't investigate language without using or practicing it. Any "measurement" of language inevitably uses language itself as the means of measurement, and this includes any interpretation of speech in language as well. This self-referentiality further leads to interesting phenomena, such as "n-isms" like the dualism in quantum physics, where we also find a conflation of scales. If we fail to take this self-referentiality into consideration we will inevitably create faults or pseudo-paradoxes.
The important issue about this is that there is no justification of language which could be expressed outside of language; hence there is no (foundational) justification for it at all. We find a quite unique setting, which corrodes any attempt at a "closed", i.e. formal, analysis of language.
The extension of the concept “language” is at the same time an instance of it.
It is absolutely not surprising that the attempt at a fully mechanic, i.e. a priori determined or algorithmic analysis of language must fail. Wittgenstein thus arrived at the conclusion that language is ultimately embedded as a practice in the life form (we would prefer the term "performance" instead). He insisted that justifications (of language games as rule-following) have to come to an end1; for him it was fallacious to think that a complete justification—or ultimate foundation—would be possible.
Just to emphasize it again: the particular uniqueness of terms like "language" is that they can not be justified outside of themselves. Analytically, they start with a structural singularity. Hence the term "strong singularity", which differs significantly from the concept of the so-called "singular term" as it is widely known. We will discuss the latter below.
The term “strong singularity” indicates the absence of any possibility for an external justification.
In §329 of the Philosophical Investigations, Wittgenstein notes:
When I think in language, there aren’t ”meanings” going through my mind in addition to the verbal expressions: the language is itself the vehicle of thought.
It is quite interesting to see that symbols do not own this particular property of strong singularity. Although they are a structural part of language, they do not share this property. Hence we may conceive of this as a remarkable instance of a Deleuzean double articulation in the midst of thinking itself. There would be a lot to say about it, but it would not fit here.
Language now shares the property of strong singularity with formalization. We can neither have the idea nor the feeling of formalization without formalization, and we can not even perform formalization without prior higher-order formalization. There is no justification of formalization which could be expressed outside of formalization, hence there is no (foundational) justification for it at all. The parallel is obvious: Would it then be necessary, for instance, to conclude that formalization is embedded in the life form in much the same way as is the case for language? That mere performance precedes logics? Precisely this could be concluded from the whole of Wittgenstein's philosophical theory, as Colin Johnston suggested.
Performative activity precedes any possibility of applying logics in the social world; formulated the other way round, we can say that transcendental logics is instantiated into an applicable quasi-logics. Against this background, the idea of truth functions determining a "pure" or ideal truth value is rendered an importunate misunderstanding. Yet, formalization and language are not only similar with regard to this self-referentiality, they are also strictly different. Nevertheless, so runs the hypothesis we try to strengthen here, formalization resembles language in that we can not have the slightest thought or even any mental operation without formalization. It is even the other way round, in that any mental operation invokes a formalizing step.
Formalization and language are not the only entities which exhibit self-referentiality and which can not be defined from any kind of outside stance. Theory, model and metaphor belong to the family too, not to forget, finally, thinking, and hence creativity, at large. A peculiar representative of these terms is the "I". Close relatives, though not as critical as the former ones, are concepts like causality or information. All these terms are not only self-referential, they are also cross-referential. Discussing any of them automatically involves the others. Many instances of deep confusion derive from the attempt to treat them separately, across many domains from the neurosciences, sociology, computer science and mathematics up to philosophy. Since digital technologies are deeply based on formalization and have been developing further into a significant deep structure of our contemporary life form, any area where software technology is pervasively used is endangered by the same misunderstandings. One of these areas is architecture and city planning, or more generally, any discipline where language or the social is involved as a target of the investigation.
There is a last point to note about self-referentiality. Self-referentiality may well lead to a situation that we have described as "complexity". From this perspective, self-referentiality is a basic condition for the potential of novelty. It is thus interesting to see that this potential is directly and natively implanted into some concepts.
Now we will briefly discuss the concept of "singular term" as it is usually referred to. There is not full agreement about the issue of singular terms, in my opinion mainly due to methodological issues. Many proponents of analytical philosophy simply "forget that they are speaking", in the sense mentioned above.
The analytical perspective
Anyway, according to the received view, names are singular terms. It is said that the referents of singular terms are singular things or objects, even if they are immaterial, like the unicorn. Yet, the complete list of singular terms would look like this:
- – proper names (“Barack Obama”);
- – labeling designation (“U.S. President”);
- – indexical expressions (“here”, “this dog”).
Such singular terms are distinguished from so-called general terms. Following Tugendhat, who refers in turn to Strawson, the significance of a general term F consists in the conditions to be fulfilled such that F matches one or several objects. In other words, the significance of a singular term is given by a rule for identification, while the significance of a general term is given by a rule for classification. As a consequence, singular terms require knowledge about general terms.
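Tugendhat's distinction can be caricatured in a few lines of code. The following sketch is purely illustrative and uses hypothetical names and a toy domain; it renders a singular term as a rule of identification (context → exactly one object) and a general term as a rule of classification (object → true/false):

```python
# Illustrative toy model of Tugendhat's distinction (all names hypothetical):
# a singular term carries a rule for identification, a general term a rule
# for classification.

domain = [
    {"name": "Barack Obama", "role": "U.S. President", "kind": "person"},
    {"name": "Angela Merkel", "role": "Chancellor", "kind": "person"},
    {"name": "Fido", "role": None, "kind": "dog"},
]

def singular_term(label):
    """Rule for identification: pick out exactly one object, or fail."""
    def identify(context):
        matches = [o for o in context if label in (o["name"], o["role"])]
        if len(matches) != 1:
            raise ValueError(f"'{label}' does not identify a unique object")
        return matches[0]
    return identify

def general_term(condition):
    """Rule for classification: the conditions an object has to fulfil."""
    return condition

president = singular_term("U.S. President")          # a labeling designation
is_person = general_term(lambda o: o["kind"] == "person")

obj = president(domain)
print(obj["name"])      # identification yields one object
print(is_person(obj))   # classification yields a truth value
```

Note how even this toy already exhibits the dependence stated above: the identification rule has to inspect properties of the objects, i.e. it presupposes classification by general terms.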
Such statements are typical for analytical philosophy.
There are serious problems with this view. Even the labeling itself is misleading: it is definitely NOT the term that is singular. Singular is at most a particular contextual event, which we decided to address by a name. Labelings and indexical expressions are not necessarily "singular," and quite frequently the same holds for names. Think about "John Smith" first as a name, then as a person… This mistake is quite frequent in analytic philosophy. We can trace it even into the philosophy of mathematics, when it comes to certain claims of set theory about infinity.
The relevance for the possibility of machine-based episteme
There can be little doubt, as we have already expressed elsewhere, that human cognition can't be separated from language. Even the use of the most primitive tools, let alone their production and distribution, requires the capability for at least a precursor of language, some first steps into languagability.
We know by experience that, in our mother tongue, we can understand sentences that we have never heard before. Hence, understanding of language (quite likely, like any understanding) is bottom-up, not top-down, at least in the beginning of the respective processes. Thus we have to ask about the sub-sentential components of a sentence.
Such components are singular terms. Imagine some perfectly working structure that comprises the capability for arbitrary classification as well as the capability for non-empirical analogical thinking based on dynamic symmetries. The machine would not only be able to perform the transition from extensions to intensions, it would even be able to abstract the intension into a system of meta-algebraic symmetry relations. Such a system, or better, its programmer, would then be faced with the problem of naming and labeling. Somehow the intensions have to be made addressable. A private index does not help, since such an index would be without any value for communication purposes.
The question is how to make the machine refer to proper names. We will see elsewhere (forthcoming: "Waves, Words, and Images"), that this question leads us to the necessity of multi-modality in processing linguistic input, e.g. integrating language and images into the same structure (which is just another reason to rely on self-organizing maps and our methodology of modeling).
Refutation of the analytical view
The analytical position about singular terms does not provide any help or insight into the particular differential quality of terms as words that denote a concept.2 Analytical statements such as those cited above are inconsistent, if not self-contradictory. The reason is simple. Words as placeholders for concepts can not have a particular meaning attached to them by principle. The meaning, even that of subsentential components, is an issue of interpretation, and the meaning of a sentence is given not only by its own totality; it is also dependent on the embedding of the sentence itself into the story or the social context where it is performed.
Since "analytic" singular terms require knowledge about general terms, and the general terms are only determined if the sentence is understood, it is impossible to identify or classify single terms, whether singular or general, before the propositional content of the sentence is clear to the participants. That propositional content, however, is, as Robert Brandom convincingly argues in chapter 6 of his Making it Explicit, only accessible through its role in the inferential relations between the participants of the talk as well as the relations between sentences. Thus we can easily see that the analytical concept of singular terms is empty, if not self-nullifying.
The required understanding of the sentence is missing in the analytical perspective; the object is dominant over the sentence, which contradicts any real-life experience. Hence, we would also say that the primacy of interpretation is not fully respected. What we need instead is a kind of bootstrapping procedure that works within a social situation of exchanged speech.
Robert Brandom moves this bootstrapping into the social situation itself, which starts with a medial symmetry between language and socialness. There is, coarsely spoken, a rather fixed choreography for accomplishing that. First, the participants have to be able to maintain what Brandom calls a de-ontic account. The sequence starts with a claim, which includes the assignment of a particular role. This role must be accepted and returned, which is established by signalling that the inference/interpretation will be done. Both the role and the acceptance are dependent on the claim, on the de-ontic status of the participants and on the intended meaning. (I have now summarized about 500 pages of Brandom's book… but, as said, it is a very coarse summary!)
Brandom (chp. 6) investigates the issue of singular terms. For him, the analytical perspective is not acceptable, since for him, as is the case for us, there is a primacy of interpretation.
Brandom refutes the claim of analytical philosophy that singular names designate single objects. Instead he strives to determine the necessity and the characteristics of singular terms by a scheme that distinguishes particular structural ("syntactical") and semantic conditions. These conditions further diverge between the two classes of possible subsentential structures, the singular terms (ST) and predicates (P). While syntactically ST take the role of substitution-of/substitution-by and P take the structural role of providing a frame for such substitutions, in the semantic perspective ST are characterized exclusively by so-called symmetric substitution-inferential commitments (SIC), whereas P also take asymmetric SICs. Those inferential commitments link the de-ontic, i.e. ultimately the socialness of linguistic exchange, to the linguistic domain of the social exchange. We hence may also characterize the whole situation described by Brandom as a cross-medial setting, where the social and the linguistic domain mutually provide each other a medial embedding.
Interestingly, this simultaneous cross-mediality also represents a "region", or a context, where the materiality (of the participants) and the immateriality (of information qua interpretation) overlap. We find, so to speak, an event-like situation just before the symmetry break that we may identify as meaning. To some respect, Brandom's scheme provides us with the pragmatic details of a Peircean sign situation.
The Peirce-Brandom Test
This has been a very coarse sketch of one aspect of Brandom's approach. Yet, we have seen that language understanding can not be understood if we neglect the described cross-mediality. We therefore propose to replace the so-called Turing test by a procedure that we call the Peirce-Brandom Test. That test would prove the capability to take part in semiosis, and the choreography of the interaction scheme would guarantee that references and inferences are indeed performed. In contrast to the Turing test, the Peirce-Brandom test can't be "faked", e.g. by a "Chinese Room" (Searle). Besides, to find out whether the interaction partner is a "machine" or a human we should not ask them anything, since the question as a grammatical form of social interaction corroborates the complexity of the situation. We should just talk to it/her/him. The Searlean homunculus inside the Chinese room would not be able to look up anything anymore. He would have to be able to think in Chinese and as Chinese, q.e.d.
Strongly Singular Terms and the Issue of Virtuality
The result of Brandom's analysis is that the label "singular term" is somewhat dispensable. These terms may be taken as if they point to a singular object, but there is no necessity for that, since their meaning is not attached to the reference to the object, but to their role in performing the discourse.
Strongly singular terms are strikingly different from those ("weakly") singular terms. Since they found themselves while being practiced, through their self-referential structure, it is not possible to find any "incoming" dependencies. They are seemingly isolated on their passive side; there are only outgoing dependencies towards other terms, i.e. other terms are dependent on them. Hence we could also call them "(purely) active terms".
What we can experience here in a quite immediate manner is pure potentiality, or virtuality (in the Deleuzean sense). Language imports potentiality into material arrangements, which is something that programming languages or any other finite state automaton can't accomplish. That's the reason why we persistently deny that it is reasonable to talk about states when it comes to the brain or the mind.
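To make the contrast concrete, consider what a finite state automaton actually is. The minimal sketch below (illustrative only; states and alphabet are hypothetical) shows that everything such a machine can ever do is fixed in advance by a finite transition table; its complete analytic description exists before any run:

```python
# A minimal deterministic finite automaton: a finite transition table
# exhaustively describes its behavior. Nothing can "happen" in it that
# is not already enumerated here.
transitions = {
    ("start", "a"): "middle",
    ("middle", "b"): "accept",
}
accepting = {"accept"}

def run(dfa, accepting, word, state="start"):
    """Feed the word symbol by symbol through the transition table."""
    for symbol in word:
        key = (state, symbol)
        if key not in dfa:
            return False  # undefined transition: reject
        state = dfa[key]
    return state in accepting

print(run(transitions, accepting, "ab"))  # accepted
print(run(transitions, accepting, "aa"))  # rejected
```

In the terminology of the essay: the table is the whole "passive side" of the automaton, closed and justified from outside; this is exactly the kind of closure that, on the argument above, strongly singular terms like language escape.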
Now, at this point it is perfectly clear why language can be conceived as ongoing creativity. Without ongoing creativity, the continuous actualization of the virtual, there wouldn’t be anything that would take place, there would not “be” language. For this reason, the term creativity belongs to the small group of strongly singular terms.
In this series of essays about the relation between formalization and creativity we have achieved an important methodological milestone. We have found a consistent structural basis for the terms language, formalization and creativity. The common denominator for all of those is self-referentiality. On the one hand this becomes manifest in the phenomenon of strong singularity, on the other hand this implies an immanent virtuality for certain terms. These terms (language, formalization, model, theory) may well be taken as the “hot spots” not only of the creative power of language, but also of thinking at large.
The aspect of immanent virtuality implicates a highly significant methodological move concerning the starting point for any reasoning about strongly singular terms. Yet, this we will check out in the next chapter.
1. Wittgenstein repeatedly expressed this from different perspectives. In the Philosophical Investigations, PI §219, he states: "When I obey the rule, I do not choose. I obey the rule blindly." In other words, there is usually no reason to give, although one always can think of some reasons. Yet, it is also true that (PI §10) "Rules cannot be made for every possible contingency, but then that isn't their point anyway." This leads us to §217: "If I have exhausted the justifications I have reached bedrock, and my spade is turned. Then I am inclined to say: 'This is simply what I do'." Rules are never intended to remove all possible doubt, thus PI §485: "Justification by experience comes to an end. If it did not it would not be justification." Later, Quine proved accordingly, from a different perspective, what today is known as the indeterminacy of empirical reason ("Word and Object").
2. There are, of course, other interesting positions, e.g. that elaborated by Wilfrid Sellars, who distinguished different kinds of singular terms: abstract singular terms ("triangularity") and distributive singular terms ("the red"), in addition to standard singular terms. Yet, the problem from which the analytical position suffers also hits the position of Sellars.
-  Ludwig Wittgenstein, Philosophical Investigations.
-  Gilles Deleuze, Felix Guattari, Mille Plateaux.
-  Colin Johnston (2009). Tractarian objects and logical categories. Synthese 167: 145-161.
-  Ernst Tugendhat, Traditional and Analytical Philosophy. 1976
-  Strawson 1974
-  Rodych, Victor, “Wittgenstein’s Philosophy of Mathematics”, The Stanford Encyclopedia of Philosophy (Summer 2011 Edition), Edward N. Zalta (ed.), http://plato.stanford.edu.
-  Robert Brandom, Making it Explicit. 1994
-  John Searle (1980). Minds, Brains and Programs. Behav Brain Sci 3 (3), 417–424.
-  Wilfrid Sellars, Science and Metaphysics. Variations on Kantian Themes, Ridgeview Publishing Company, Atascadero, California 1992.
February 15, 2012
If there is such a category as the antipodic at all, it certainly applies to the pair of the formal and the creative, at least as long as we consult the field of propositions1 that is labeled "Western Culture." As a consequence, in many cultures, and even among mathematicians, these qualities tend to be conceived as completely separated.
We think that this rating is based on a serious myopia, one that is quite common throughout rationalism, especially if that comes as a flavor of idealism. In a small series of essays—it is too much material for a single one—we will investigate the relation between these qualities, or concepts, of the formal and the creative. Today, we just will briefly introduce some markers.
The Basic Context
The relevance of this endeavor is pretty obvious. On the one hand we have the part of creativity. If machine-based episteme implies the necessity to create new models, new hypotheses and new theories, we should get clear not only about the necessary mechanisms but also about the sufficient conditions for their "appearance." In other chapters we already mentioned complexity and evolutionary processes as the primary, if not the only, candidates for such mechanisms. These domains are related to the transition from the material to the immaterial, and surely, as such, they are indispensable for any complete theory of creativity. Yet, we also have to take into consideration the space of the symbolic, i.e. of the immaterial, of information and knowledge, which we can't find in the domains of complexity and evolution, at least not without distorting them too much. There is a significant aspect of creativity that is situated completely in the realm of the symbolic (in which we propose to include diagrams as well). In other words, there is an aspect of creativity that is related to language, to story-telling, understood as weaving (combining) a web of symbols and concepts, that is often associative in its own respect, whether in literature, mathematics, reading and writing, or regarding the DNA.
On the other hand, we have the quality of the formal, or, when labelled as a domain of activity, formalization. The domain of the formal lies fully within the realm of the symbolic. And of course, the formal is frequently conceived as the cornerstone, if not the essence, of mathematics. Before the beginning of the 20th century, or around its onset, the formal was almost a synonym for mathematics. At that time, the general movement towards more and more abstract structures in mathematics, i.e. things like group theory or number theory, led to the enterprise of searching for the foundations of mathematics, often epitomized as the Hilbert Program. As a consequence, a kind of "war" broke out between two parties, the intuitionists and the formalists, and the famous foundational crisis started, which lasts till today. Gödel then proved that even in mathematics we can not know perfectly. Nevertheless, for most people mathematics is seen as the domain where reason and rationalism are most developed. Yet, although mathematicians are indeed ingenious (as are many other people), mathematics itself is conceived as safe, that is, static and non-creative. Mathematics is about symbols under analytic closure. Ideally, there are no "white territories" in mathematics, at least for the members of the formalist party.
Finally, the mostly digital machines pose a particular problem. The question is whether a deterministic machine, i.e. a machine for which a complete analytic description can exist, is able to develop creativity.
This question has been raised many times in the history of philosophy and thinking, albeit in different forms. Leibniz imagined a mathesis universalis and a characteristica universalis as well. In the 20th century, Carnap tried to prove the possibility of a formal language that could serve as the ideal language for science. Both failed, Carnap much more disastrously than Leibniz. Leibniz also thought about the transition from the realm of the mechanic to the realm of human thought, by means of his ars combinatoria, which he imagined could create any possible thought. We will definitely return to Leibniz and his ideas later.
A (summarizing) Glimpse
How will we proceed, and what will we find?
First we will introduce and discuss some of the methodological pillars of our reasoning about the (almost "dialectic") relation between creativity and formalization; among those the most important ones are the following:
- – the status of “elements” for theorizing;
- – the concept of dimensions and space;
- – relations;
- – the domain of comparison;
- – symmetries as a tool;
- – virtuality.
Secondly, we will ask about the structure of the terms "formal" and "creative" while they are in use; especially, however, we are interested in their foundational status. We will find that both formalization and creativity belong to a very particular class of language games. Notably, these terms turn out to be singular terms that are at the same time not names. They are singular because their foundation as well as the possibility to experience them are self-referential. (Ceterum censeo: a result that would not be possible if we stuck to the ontological style of asking "What is creativity…")
The experience of the concepts associated with them can't be externalized. We cannot talk about language without language, nor can we think "language" without practicing it. Thus, they also can't be justified by external references, which is quite a remarkable property.
In the end we will hopefully have made clear that creativity in the symbolic space is not achievable without formalization. The two are even co-generative.
Let us start with creativity. Creativity has always been considered something fateful. Until the beginnings of psychology as a science with William James, smart people were smart by the working of fate, or of some gods. Famous, and for centuries unchallenged, is the passage in Plato's Theaitetos where Socrates explains his role in maieutics by remarking that the creation of novel things is the task of the gods. The genius, as well as the concept of intuition, may be regarded as rather close relatives of that view. Only since the 1950s, probably not by pure chance, have people started to recognize creativity as a subject in its own right. Yet, somehow it is not really satisfying to explain creativity by calling it "divergent" or "lateral" thinking. Nothing is explained by replacing one term with another. Nowadays, and mostly in the domain of design research, conditions for creativity are often understood in terms of collaboration. People even resort to the infamous swarm intelligence, which comes close to a declaration of bankruptcy.
All of these approaches just replace some terms with other terms, trying to conjure some improvement in understanding. Most of the "explanations" indeed look more like rain dancing than valuable analysis. Recently, a large philosophical congress in Berlin, with more than 1200 registered participants, and two books comprising around 2000 pages, focused on the subject largely in the same vein and without much result. We are definitely neither interested in any kind of metaphysical base-jumping, referring directly or indirectly to intuition and the accompanying angels in the background, nor in phenomenological, sociological or superficial psychological approaches trying to get support from some funny anecdotes.
The question really is what we are talking about, and how, when referring to the concept of creativity. Only because this question is neither posed nor answered do we find so much esoterics around this topic. Creativity surely exceeds problem solving, although sometimes it occurs right in the course of solving a problem. It may be observed in calm storytelling, in cataclysmic performances of artists, or in language.
Actually, our impression is that creativity is not something that sometimes "is there" and sometimes not. In language it is present all the time, much as is the case for analogical thinking. The question is which of those phenomena we call "creative"; roughly speaking, which degree of intensity regarding novelty, and the usefulness of that novelty, we allow to be assigned a particular saliency. Somehow, constraints seem to play an important role, as does the capability to release or apply them at will. Then, however, creativity must be a technique, or at least based on tools which we could learn to use. It is, however, pretty clear that we have to distinguish between the assignment of the saliency ("this or that person has been creative") and the phenomenon and its underlying mechanisms. The assignment of the term introduces a discreteness that is not present on the level of the mechanism; hence we will never understand what we are talking about if we take parlance alone as the source and the measure.
The phenomenon of language provides a nice bridge to the realm of the formal. Today, probably mainly due to the influence of computer science, natural languages are distinguished from artificial languages, which are often also called formal languages. It is widely accepted that formalization either is based on formal languages or creates an instance of them. The concept of formal language is important in mathematics, computer science and science at large. Instantiated as programming languages, formal languages are of enormous relevance for human society; one could even say that these languages themselves establish some kind of medium.
Yet, the labeling of the discerned phenomena as "natural" and "formal" always strikes me. It is remarkable that human languages are so often called "natural" languages. Somehow, human language appears so outstanding to humans that they call their language, in a funny paradoxical move, a "natural" thing, as if this language-thing had originated outside human culture. Today, as we know about many instances of cultural phenomena in animals, the strong dichotomy between culture and nature has blurred considerably. A particular driver of this is the spreading insight that we as humans are also animals: our bodies contain a brain. Thus, we and our culture also build upon this amazing morphological structure, continuously so. We as humans are just the embodiment of the dichotomy between nature and culture, and nothing could express the confusion about this issue better than the notion of "natural language." A bit shamefacedly we call the expressions of whales and dolphins "singing," although we know that they communicate rather complicated matters. We are just unable to understand any of it, the main reason presumably being that we do not share anything regarding their Lebensform, and references other than the Lebensform are not relevant for languages.
Language, whether natural or formal, is supposed to be used to express things. Already here we arrive in deep trouble, as the previous sentence is anything but innocent. First, speaking about things is not a trivial affair. A thing is a difficult thing. Taking etymology into consideration, we see that things are the results of negotiations. As a "result," in turn, "things" are reductions, albeit in the realm of the abstract. The next difficulty is invoked by the idea that we can "express" things in a formal language. There has been a large debate on the expressive capabilities of formal languages, mainly induced by Carnap, and carried further by Quine, Sneed, Stegmüller, Spohn, and Moulines, among others, up to today.
In our opinion, the claim of the expressibility of formal languages, and hence the proposed usage of formal languages as a way to express scientific models and theories, rests on probably more than just one deep and drastic misunderstanding. We will elucidate this throughout this series; other arguments have been devised, for instance, by Putnam in his small treatise about the "meaning of meaning," where he famously argued that "analyticity is an inexplicable noise" without any possibility of meaningful usage. That is also a first hint that analyticity is not about the same thing(s) as formalization.
Robert Brandom puts the act of expressing within social contexts at the center of his philosophy, constructing a well-differentiated perspective on the relation between the principles in the usage of language and its structure. Following Brandom, we could say that formal languages cannot be expressive almost by their very definition: the mutual social act of requesting an interpretation is missing there, as is any propositional content. If there is no propositional content, nothing can be expressed. Yet, propositional content comes into existence only through a series of events in which the participants in a social situation mutually ascribe it to each other and are also willing to accept that assignment.
Formal languages consist of exactly defined, labeled sets, where each set and its label represent a rewriting rule. In other words, a formal language is a purely mechanical rule system; it can always be implemented, for instance, as the grammar behind a compiler for a programming language. Programming languages organize the arrangement of rewriting rules; they are, however, not entities capable of semantics. We could easily conclude that formal languages are not languages at all.
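To make the purely mechanical character of such rule systems concrete, here is a minimal sketch of a rewriting system in Python (the names `RULES` and `derive` are our own illustrative choices, not an established API). The machine only shuffles labels according to rules; no semantics is involved anywhere:

```python
# A toy rewriting system: each nonterminal symbol maps to the sequence
# of symbols it rewrites to. Applying the rules is purely mechanical;
# the "language" generated carries no propositional content.
RULES = {
    "S": ["a", "S", "b"],  # rule: S -> a S b
}

def derive(symbols, depth):
    """Mechanically expand all nonterminals `depth` times."""
    for _ in range(depth):
        expanded = []
        for sym in symbols:
            # nonterminals are rewritten, terminals are copied as-is
            expanded.extend(RULES.get(sym, [sym]))
        symbols = expanded
    return symbols

# Expanding three times and dropping the leftover nonterminal yields
# a string of the form a^n b^n:
sentence = "".join(s for s in derive(["S"], 3) if s not in RULES)
```

Every step here is a deterministic substitution; the sketch is meant only to illustrate why such a system, taken by itself, expresses nothing.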
A last remark about formalization as a technique. Formalization is undeniably based on the use, or better, the assignment of symbols to particular, deliberately chosen contexts, actions, recipes or processes. Think of proofs of certain results in mathematics, where the symbolized idea later refers to the idea and its proof. Thus, symbols may act as a kind of abbreviation, or they may denote abstractions. They may also support the visibility of the core of otherwise lengthy reasoning. Sometimes, as for instance in mathematics, formalization requires several components, e.g. the item or subject, the accompanying operators or transformations (take that as "usage"), and the reference to some axiomatics or an explicit description of the conditions and the affected items. The same style is applied in physics. Yet, this complete structure is not necessary for an action to count as a formalization. We propose to conceive of formalization as the selection of elements (to be introduced soon) that are consecutively symbolized. Actually, it is not necessary to write down a "formula" about something in order to formalize it. Nor is it necessary, so we are convinced, to apply a particular logic when establishing the formalization through abstraction. It is just the symbolic compression that allows one to achieve further results which would otherwise remain inaccessible. Or, briefly put: to give a complicated thing a symbolic form that lives within a system of other forms.
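The point about symbolic compression can be illustrated with a deliberately simple sketch (our own example, not drawn from the sources cited here): once a lengthy procedure is bound to a symbol, that symbol can enter further forms which would be unwieldy to state in full each time.

```python
def mean(xs):
    # compress "sum all values, then divide by their count" into one symbol
    return sum(xs) / len(xs)

def variance(xs):
    # the symbol `mean` now lives within a system of other forms:
    # variance is stated via `mean`, not via the raw arithmetic steps
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)
```

Note that no particular logic is applied here; the gain comes solely from giving a computation a symbolic form that composes with other forms.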
Finally, there is one thing we should always keep in mind. Using, introducing or referring to a formalization irrevocably implies an instantiation when we apply it, bringing it back to more "concrete" contexts. Thus, formalization is deeply linked to the Deleuzean figure of thought of the "Differential."
- Rudolf Carnap, Logische Syntax der Sprache, Wien 1934 [2nd ed. 1968].
- Plato, Theaitetos.
- J.P. Guilford, "Creativity," American Psychologist, 1950.
- Edward de Bono on lateral thinking.
- Günther Abel (ed.), Kreativität. Kolloquiumsband XX. Kongress der Deutschen Philosophie. Meiner Verlag, Hamburg 2007.
- W.V.O. Quine, "Two Dogmas of Empiricism," 1951.
- Joseph D. Sneed, The Logical Structure of Mathematical Physics, 1971.
- Wolfgang Stegmüller.
- Wolfgang Spohn.
- C. Ulises Moulines.
- Hilary Putnam, "The Meaning of 'Meaning'," 1975.
- Robert Brandom, Making It Explicit, 1994.
- Gilles Deleuze, Difference and Repetition.