Analogical Thinking, revisited. (II)

March 20, 2012

(II/II)

In this second part of the essay about a fresh perspective on analogical thinking—more precisely: on models about it—we will try to bring two concepts together that at first sight represent quite different approaches: Copycat and SOM.

Why engage in such an endeavor? Firstly, we are quite convinced that FARG's Copycat demonstrates an important and outstanding architecture. It provides a well-founded proposal about the way we humans apply ideas and abstract concepts to real situations. Secondly, however, it is also clear that Copycat suffers from a few serious flaws in its architecture, particularly the built-in idealism. This renders any adaptation to more realistic domains, or even to completely domain-independent conditions, very difficult, if not impossible, since this drawback also prohibits structural learning. So far, Copycat is just able to adapt some predefined internal parameters. In other words, the Copycat mechanism merely adapts a predefined structure, though a quite abstract one, to a given empirical situation.

Well, basically there seem to be two different, "opposite" strategies to merge these approaches. Either we integrate the SOM into Copycat, or we try to transfer the relevant, yet to be identified, parts from Copycat to a SOM-based environment. Yet, at the end of the day we will see that, and how, the two alternatives converge.

In order to accomplish our goal of establishing a fruitful combination of SOM and Copycat we have to take three main steps. First, we briefly recapitulate the basic elements of Copycat and the proper instance of a SOM-based system. Second, we describe the extended SOM system in some detail, although there will be a dedicated chapter on it. Finally, we have to transfer, and presumably adapt, those elements of the Copycat approach that are missing in the SOM paradigm.

Crossing over

The particular power of (natural) evolutionary processes derives from the fact that they are based on symbols. "Adaptation" or "optimization" are not processes that merely change the numerical values of parameters in formulas. Quite the opposite: in adaptational processes that span across generations, parts of the DNA-based story are being rewritten, with potential consequences for the whole of the story. This effect of recombination in the symbolic space is particularly evident in the so-called "crossing over" during the production of gamete cells in the context of sexual reproduction in eukaryotes. Crossing over is a "technique" to dramatically speed up the exploration of the space of potential changes. (In some way, this space is also greatly enlarged by symbolic recombination.)

What we will try here in our attempt to merge the two concepts of Copycat and SOM is exactly this: a symbolic recombination. The difference to its natural template is that in our case we do not transfer DNA snippets between homologous locations in chromosomes; we transfer whole "genes," which are represented by elements.

Elementarizations I: C.o.p.y.c.a.t.

In part 1 we identified two top-level (non-atomic) elements of Copycat: restricted generalized evolution and concrete instances of positional idealization.

Since the first element, covering evolutionary aspects such as randomness, population and a particular memory dynamics, is pretty clear, and a whole range of possible ways to implement it is available, any attempt at improving the Copycat approach has to target the static, strongly idealistic characteristics of the structure that the FARG calls the "Slipnet." The Slipnet has to be enabled for structural changes and autonomous adaptation of its parameters. This could be accomplished in many ways, e.g. by representing the items in the Slipnet as primitive artificial genes. Yet, we will take a different road here, since the SOM paradigm already provides the means to achieve idealizations.

At this point we have to elementarize Copycat's Slipnet in a way that renders it compatible with the SOM principles. Hofstadter emphasizes the following properties of the Slipnet and the items contained therein (pp. 212):

  • (1) Conceptual depth allows for a dynamic and continuous scaling of “abstractness” and resistance against “slipping” to another concept;
  • (2) Nodes and links between nodes both represent active abstract properties;
  • (3) Nodes acquire, spread and lose activation, which knows a switch-on threshold < 1;
  • (4) The length of links represents conceptual proximity or degree of association between the nodes.

As a whole, and viewed from the network perspective, the Slipnet behaves much like a spring system, or a network built from rubber bands, where the springs or the rubber bands are regulated in their strength. Note that our concept of SomFluid also exhibits the feature of local regulation of the bonds between nodes, a property that is not present in the idealized standard SOM paradigm.
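To make the interplay of properties (1) to (3) a bit more tangible, the following toy sketch in Python spreads activation through a small Slipnet-like graph in which deeper concepts decay more slowly and only nodes above a switch-on threshold spread at all. All names, depths and constants are invented for the illustration; this is not Copycat's actual parametrization.

```python
# Toy sketch of Slipnet-like spreading activation (illustrative values only).
# Nodes carry a "conceptual depth"; deeper nodes decay more slowly.
# Links carry a length; shorter links transmit more activation.

nodes = {
    # name: {"depth": conceptual depth in [0,1], "act": current activation}
    "successor":   {"depth": 0.6, "act": 0.0},
    "predecessor": {"depth": 0.6, "act": 0.0},
    "opposite":    {"depth": 0.9, "act": 0.0},
    "leftmost":    {"depth": 0.4, "act": 0.0},
    "rightmost":   {"depth": 0.4, "act": 0.0},
}

links = [
    # (node_a, node_b, length): shorter length = conceptually closer
    ("successor", "predecessor", 0.3),
    ("successor", "opposite",    0.8),
    ("predecessor", "opposite",  0.8),
    ("leftmost", "rightmost",    0.4),
    ("leftmost", "opposite",     0.9),
]

THRESHOLD = 0.5   # switch-on threshold < 1: only sufficiently active nodes spread

def step(nodes, links):
    """One update: active nodes spread activation, then all nodes decay."""
    received = {name: 0.0 for name in nodes}
    for a, b, length in links:
        for src, dst in ((a, b), (b, a)):
            if nodes[src]["act"] >= THRESHOLD:
                # shorter links (closer concepts) transmit more activation
                received[dst] += nodes[src]["act"] * (1.0 - length)
    for name, n in nodes.items():
        decay = 0.1 * (1.0 - n["depth"])        # deeper concepts decay more slowly
        n["act"] = min(1.0, n["act"] * (1.0 - decay) + received[name])

# Example: excite "successor" and watch activation leak into its neighborhood.
nodes["successor"]["act"] = 1.0
for _ in range(5):
    step(nodes, links)
print({k: round(v["act"], 2) for k, v in nodes.items()})
```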

Yet, the most interesting properties in the list above are (1) and (2), while (3) and (4) are known in the classic SOM paradigm as well. The first item is great because it represents an elegant instance of creating the possibility for measurability that goes far beyond the nominal scale. As a consequence, "abstractness" ceases to be a nominal all-or-none property, as it is in hierarchies of abstraction. Such hierarchies can now be recognized as mere projections or selections, both introducing a severe limitation of expressibility. The conceptual depth opens a new space.

The second item is also very interesting since it blurs the distinction between items and their relations to some extent. That distinction is also a consequence of relying too readily on the nominal scale of description. It introduces a certain moment of self-reference, though this is not fully developed in the Slipnet. Nevertheless, a result of this move is that concepts can't be thought without their embedding into a neighborhood of other concepts. Hofstadter clearly introduces a non-positivistic and non-idealistic notion here, as it establishes a non-totalizing meta-concept of wholeness.

Yet, the blurring between "concepts" and "relations" could and must be driven far beyond the level Hofstadter achieved, if the Slipnet is to become extensible. Namely, all the parts and processes of the Slipnet need to follow the paradigm of probabilization, since this offers the only way to evade the demons of cybernetic idealism and apriori control. Hofstadter himself relies much on probabilization concerning the other two architectural parts of Copycat. It is beyond me why he didn't apply it to the Slipnet too.

Taken together, we may derive (or: impose) the following important elements for an abstract description of the Slipnet.

  • (1) Smooth scaling of abstractness (“conceptual depth”);
  • (2) Items and links of a network of sub-conceptual abstract properties are instances of the same category of “abstract property”;
  • (3) Activation of abstract properties represents a non-linear flow of energy;
  • (4) The distance between abstract properties represents their conceptual proximity.

A note should be added regarding the last (fourth) point. In Copycat, this proximity is a static number. In Hofstadter's framework, it does not express something like similarity, since the abstract properties are not conceived as compounds. That is, the abstract properties are themselves on the nominal level. And indeed, it might appear rather difficult to conceive of concepts such as "right of," "left of," or "group" as compounds. Yet, I think that it is well possible by referring to mathematical group theory, the theory of algebra and the framework of mathematical categories. All of those may be subsumed into the same operationalization: symmetry operations. Of course, there are different ways to conceive of symmetries and to implement the respective operationalizations. We will discuss this issue in a forthcoming essay that is part of the series "The Formal and the Creative".

The next step is now to distill the elements of the SOM paradigm in a way that enables a common differential for the SOM and for Copycat.

Elementarizations II: S.O.M.

The self-organizing map is a structure that associates comparable items—usually records of values that represent observations—according to their similarity. Hence, it makes two strong and important assumptions.

  • (1) The basic assumption of the SOM paradigm is that items can be rendered comparable;
  • (2) The items are conceived as tokens that are created by repeated measurement;

The first assumption means that the structure of the items can be described (i) apriori to their comparison and (ii) independently from the final result of the SOM process. Of course, this assumption is not unique to SOMs; any algorithmic approach to the treatment of data is committed to it. The particular status of the SOM is given by the fact—and in stark contrast to almost any other method for the treatment of data—that this is the only strong assumption. All other parameters can be handled in a dynamic manner. In other words, there is no particular zone of the internal parametrization of a SOM that would be inaccessible apriori.

Compare this with ANN or statistical methods, and you feel the difference… Usually, methods are rather opaque with respect to their internal parameters. For instance, the similarity functional is usually not accessible, which renders all these nice-looking, so-called analytic methods into some kind of subjective gambling. In PCA and its relatives, for instance, the similarity is buried in the covariance matrix, which in turn is only defined within the assumption of normality of correlations. Unless a rank correlation is used, this assumption is extended even to the data itself. In both cases it is impossible to introduce a different notion of similarity. Also, as a consequence of that, it is impossible to investigate the particular dependency of the results proposed by the method on the structural properties and (opaque) assumptions. In contrast to such unfavorable epistemo-mythical practices, the particular transparency of the SOM paradigm allows for critical structural learning of the SOM instances. "Critical" here means that the influence of the method's internal parameters on the results or conclusions can be investigated, changed, and accordingly adapted.
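As a small illustration of what "accessibility of the similarity functional" means in practice, here is a minimal SOM training loop in Python/numpy in which the notion of similarity is an explicit, swappable parameter instead of being buried inside the method. The map size, the learning schedule and the two example distance functions are arbitrary choices for the sketch, not a reference implementation.

```python
import numpy as np

def euclidean(codebook, x):
    # codebook: (H, W, D), x: (D,) -> (H, W) distance surface
    return np.linalg.norm(codebook - x, axis=-1)

def manhattan(codebook, x):
    return np.abs(codebook - x).sum(axis=-1)

def train_som(data, shape=(10, 10), epochs=20, lr=0.5, sigma=2.0,
              similarity=euclidean, seed=0):
    """Plain SOM training loop; the similarity functional is an explicit parameter."""
    rng = np.random.default_rng(seed)
    h, w = shape
    codebook = rng.random((h, w, data.shape[1]))
    gy, gx = np.mgrid[0:h, 0:w]
    total, step = epochs * len(data), 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / total
            d = similarity(codebook, x)                        # swappable notion of similarity
            by, bx = np.unravel_index(np.argmin(d), d.shape)   # best-matching unit (BMU)
            dist2 = (gy - by) ** 2 + (gx - bx) ** 2
            nbh = np.exp(-dist2 / (2.0 * (sigma * (1 - frac) + 0.5) ** 2))
            codebook += lr * (1 - frac) * nbh[..., None] * (x - codebook)
            step += 1
    return codebook

data = np.random.default_rng(1).random((500, 3))
som_a = train_som(data, similarity=euclidean)
som_b = train_som(data, similarity=manhattan)   # same mechanism, different similarity functional
```

Nothing in the update loop needs to change when the distance measure changes; this is the kind of transparency meant above.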

The second assumption is implied by the SOM's purpose of being a learning mechanism. It simply needs some observations as results of the same type of measurement. The number of observations (the number of repeats) has to exceed a certain lower threshold, which, depending on the data and the purpose, is at least 8; typically, however, (much) more than 100 observations of the same kind are needed. Any result will be within the space delimited by the assignates (properties), and thus any result is a possibility (if we take just the SOM itself).

The particular accomplishment of a SOM process is the transition from the extensional to the intensional description, i.e. the SOM may be used as a tool to perform the step from tokens to types.

From this we may derive the following elements of the SOM:1

  • (1) a multitude of items that can be described within a common structure, though not necessarily an identical one;
  • (2) a dense network where the links between nodes are probabilistic relations;
  • (3) a bottom-up mechanism which results in the transition from an extensional to an intensional level of description;

As a consequence of this structure, the SOM process avoids the necessity to compare all items (N) to all other items (N-1). This property, together with the probabilistic neighborhoods, establishes the main difference to other clustering procedures.

It is quite important to understand that the SOM mechanism as such is not a modeling procedure. Several extensions have to be added and properly integrated, such as

  • – operationalization of the target into a target variable;
  • – validation by separate samples;
  • – feature selection, preferably by an instance of  a generalized evolutionary process (though not by a genetic algorithm);
  • – detecting strong functional and/or non-linear coupling between variables;
  • – description of the dependency of the results from internal parameters by means of data experiments.

We already described the generalized architecture of modeling as well as the elements of the generalized model in previous chapters.

Yet, as we explained in part 1 of this essay, analogy making is conceptually incompatible with any kind of modeling, as long as the target of the model points to some external entity. Thus, we have to choose a non-modeling instance of a SOM as the starting point. However, clustering is also an instance of those processes that provide the transition from extensions to intensions, whether this clustering is embedded into full modeling or not. In other words, neither the classic SOM nor the modeling SOM is suitable as a candidate for a merger with Copycat.

SOM-based Abstraction

Fortunately, there is already a proposal, and even a well-known one, that indeed may be taken as such a candidate: the two-layer SOM (TL-SOM), as it has been demonstrated as an essential part of the so-called WebSom [1,2].

Actually, the description as "two-layered" is a very minimalistic, if not inappropriate, description of what is going on in the WebSom. We already discussed many aspects of its architecture here and here.

Concerning our interests here, the multi-layered arrangement itself is not a significant feature. Any system doing complicated things needs a functional compartmentalization; we have met a multi-part, multi-compartment and multi-layered structure in the case of Copycat too. Besides, the SOM mechanism itself remains perfectly identical across the layers.

The really interesting features of the approach realized in the TL-SOM are

  • – the preparation of the observations into probabilistic contexts;
  • – the utilization of the primary SOM as a measurement device (the actual trick).

The domain of application of the TL-SOM is the comparison and classification of texts. Texts belong to unstructured data, and the comparison of texts is exposed to the same problematics as the making of analogies: there is no apriori structure that could serve as a basis for modeling. Also, like the analogies investigated by the FARG, a text is a locational phenomenon, i.e. it takes place in a space.

Let us briefly recapitulate the dynamics in a TL-SOM. In order to create a TL-SOM, the text is first dissolved into overlapping, probabilistic contexts. Note that the locational arrangement is captured by these random contexts. No explicit apriori rules are necessary to separate patterns. The resulting collection of contexts then gets "somified." Each node then contains similar random contexts that have been derived from various positions in different texts. Now the decisive step is taken, which consists in turning the perspective by "90 degrees": we can use the SOM as the basis for creating a histogram for each of the texts. The nodes are interpreted as properties of the texts, i.e. each node represents a bin of the histogram. The value of an individual bin measures how often the text is represented by the respective random contexts, i.e. how many of the text's contexts map onto that node. The secondary SOM then creates a clustering across these histograms, which represent the texts in an abstract manner.

This way the primary lattice of the TL-SOM is used to impose a structure on the unstructured entity “text.”
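The following self-contained sketch in Python/numpy condenses this mechanism to a toy scale: a few token lists stand in for texts, a simple bag-of-words window replaces WebSom's actual context encoding, and a very small SOM plays the role of the primary map. The histograms at the end are what the secondary SOM would receive as input. Everything about the encoding and the sizes is an assumption made only for the illustration.

```python
import numpy as np

# --- toy corpus: each "text" is just a token list -------------------------------
texts = {
    "doc_a": "the cat sat on the mat and the cat purred".split(),
    "doc_b": "the dog sat on the rug and the dog barked".split(),
    "doc_c": "stocks fell as markets reacted to the rate decision".split(),
}
vocab = sorted({w for t in texts.values() for w in t})
index = {w: i for i, w in enumerate(vocab)}

def contexts(tokens, size=3):
    """Overlapping sliding-window contexts, encoded as bag-of-words vectors."""
    for i in range(len(tokens) - size + 1):
        v = np.zeros(len(vocab))
        for w in tokens[i:i + size]:
            v[index[w]] += 1
        yield v / size

# --- primary SOM over all contexts (very small, plain implementation) -----------
def train_som(data, h=4, w=4, epochs=30, lr=0.5, sigma=1.5, seed=0):
    rng = np.random.default_rng(seed)
    cb = rng.random((h, w, data.shape[1]))
    gy, gx = np.mgrid[0:h, 0:w]
    total, step = epochs * len(data), 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / total
            by, bx = np.unravel_index(np.argmin(((cb - x) ** 2).sum(-1)), (h, w))
            nbh = np.exp(-((gy - by) ** 2 + (gx - bx) ** 2)
                         / (2 * (sigma * (1 - frac) + 0.3) ** 2))
            cb += lr * (1 - frac) * nbh[..., None] * (x - cb)
            step += 1
    return cb

all_ctx = np.array([c for t in texts.values() for c in contexts(t)])
primary = train_som(all_ctx)

# --- turn the perspective by 90 degrees: one histogram over nodes per text ------
def histogram(tokens, codebook):
    h, w, _ = codebook.shape
    hist = np.zeros(h * w)
    for c in contexts(tokens):
        bmu = np.argmin(((codebook - c) ** 2).sum(-1))
        hist[bmu] += 1                      # node = bin; count contexts landing there
    return hist / hist.sum()

histograms = np.array([histogram(t, primary) for t in texts.values()])
# 'histograms' is the input for the secondary SOM, which clusters the texts.
print(histograms.round(2))
```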

Figure 1: A schematic representation of a two-layered SOM with built-in self-referential abstraction. The input for the secondary SOM (foreground) is derived as a collection of histograms that are defined as a density across the nodes of the primary SOM (background). The input for the primary SOM are random contexts.

To put it clearly: the secondary SOM builds an intensional description of entities that results from the interaction of a SOM with a probabilistic description of the empirical observations. Quite obviously, intensions built this way about intensions are not only quite abstract; the mechanism could even be stacked. It could be described as "high-level perception" with the same justification as Hofstadter's use of the term for Copycat. The TL-SOM turns representational intensions into abstract, structural ones.

The two aspects from above thus interact; they are elements of the TL-SOM. Despite the fact that there are still transitions from extensions to intensions, we also can see that the targeted units of the analysis, the texts, get probabilistically distributed across an area, the lattice of the primary SOM. Since the SOM maps the high-dimensional input data onto its lattice in a way that preserves their topological properties, it is easy to recognize that the TL-SOM creates conceptual halos as an intermediate.

So let us summarize the possibilities provided by the SOM.

  • (1) SOMs are able to create non-empiric, or better: de-empirified idealizations of intensions that are based on “quasi-empiric” input data;
  • (2) TL-SOMs can be used to create conceptual halos.

In the next section we will focus on this spatial, or more precisely primarily spatial, effect.

The Extended SOM

Kohonen and co-workers [1,2] proposed to build histograms that reflect the probability density of a text across the SOM. Those histograms represent the original units (e.g. texts) in a quite static manner, using a kind of summary statistics.

Yet, texts are definitely not a static phenomenon. At first sight there is at least a series, while more appropriately texts are even described as dynamic networks with their own associative power [3]. Returning to the SOM, we see that in addition to the densities scattered across the nodes of the SOM we also can observe a sequence of invoked nodes, according to the sequence of random contexts in the text (or the serial observations).

The not so difficult question then is: how to deal with that sequence? Obviously, it is best conceived, again, as a random process (though with a strong structure), and random processes are best described using Markov models, either as hidden Markov models (HMM) or as transitional models. Note that the Markov model is not a model of the raw observational data; it describes the sequence of activation events of SOM nodes.

The Markov model can be used as a further means to produce conceptual halos, now in the sequence domain. The differential properties of a particular sequence as compared to the Markov model could then be used as further properties to describe the observational sequence.
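A minimal sketch of this idea, assuming we already have sequences of best-matching-unit indices from a primary SOM: estimate a first-order transition matrix across many such trajectories, and use the average surprise of a particular trajectory under that model as an additional descriptive property. The smoothing, the toy trajectories and the 16-node map are arbitrary choices for the illustration.

```python
import numpy as np

def transition_matrix(node_sequences, n_nodes, alpha=1.0):
    """First-order Markov model over SOM node activations (Laplace-smoothed)."""
    counts = np.full((n_nodes, n_nodes), alpha)
    for seq in node_sequences:
        for a, b in zip(seq[:-1], seq[1:]):
            counts[a, b] += 1
    return counts / counts.sum(axis=1, keepdims=True)

def surprise(seq, P):
    """Average negative log-likelihood of a sequence under the Markov model.
    High values mark trajectories that deviate from the norm; the number can
    serve as a further assignate describing the observed series."""
    ll = [np.log(P[a, b]) for a, b in zip(seq[:-1], seq[1:])]
    return -float(np.mean(ll))

# toy data: trajectories that mostly walk to a neighboring node on a 16-node map
def walk(rng, n_nodes=16, length=50):
    s = [int(rng.integers(n_nodes))]
    for _ in range(length - 1):
        s.append((s[-1] + int(rng.choice([-1, 0, 1]))) % n_nodes)
    return s

rng = np.random.default_rng(0)
sequences = [walk(rng) for _ in range(50)]
P = transition_matrix(sequences, n_nodes=16)

print(surprise(walk(rng), P))                           # typical trajectory: low surprise
print(surprise(list(rng.integers(0, 16, size=50)), P))  # erratic trajectory: higher surprise
```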

(The full version of the extended SOM comprises targeted modeling as a further level. Yet, this targeted modeling does not refer to raw data. Instead, its input is provided completely by the primary SOM, which is based on probabilistic contexts, while the target of such modeling is just internal consistency of a context-dependent degree.)

The Transfer

Just to avoid misunderstanding: it does not make sense to try to represent Copycat completely by a SOM-based system. The particular dynamics and phenomenological behavior depend a lot on Copycat's tripartite morphology, as represented by the Coderack (agents), the Workspace and the Slipnet. We are "just" in search of a possibility to remove the deep idealism from the Slipnet in order to enable it for structural learning.

Basically, there are two possible routes. Either we re-interpret the extended SOM in a way that allows us to represent the elements of the Slipnet as properties of the SOM, or we try to replace all the items in the Slipnet by SOM lattices.

So, let us take a look at which structures we have (Copycat) or could have (SOM) on both sides.

Table 1: Comparing elements from Copycat’s Slipnet to the (possible) mechanisms in a SOM-based system.

  • 1. Smoothly scaled abstraction. Copycat: conceptual depth (dynamic parameter). Extended SOM: distance of abstract intensions in an integrated lattice of an n-layered SOM.
  • 2. Links as concepts. Copycat: structure, by implementation. Extended SOM: reflecting conceptual proximity as an assignate property for a higher-level SOM.
  • 3. Activation featuring non-linear switching behavior. Copycat: structure, by implementation. Extended SOM: x.
  • 4. Conceptual proximity. Copycat: link length (dynamic parameter). Extended SOM: distance in map (dynamic parameter).
  • 5. Kind of concepts. Copycat: locational, positional. Extended SOM: symmetries, any.

From this comparison it is clear that the single most challenging part of this route is the possibility for the emergence of abstract intensions in the SOM based on empirical data. From the perspective of the SOM, relations between observational items such as "left-most," "group" or "right of," and even such as "sameness group" or "predecessor group," are just probabilities of a pattern. Such patterns are identified by functions or dynamic combinations thereof. Combinations of topological primitives remain mappable by analytic functions. Such concepts we could call "primitive concepts," and we can map these to the process of data transformation and the set of assignates as potential properties.2 It is then the job of the SOM to assign a relevancy to the assignates.
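As an illustration of such "primitive concepts," the following fragment operationalizes a few relations of the letter domain as simple detector functions that turn an observation into assignate values; it is then the SOM's job to weigh them. The function names and encodings are my own and are not taken from Copycat.

```python
# Illustrative detectors for some "primitive concepts" of the letter domain.
# They map an observation (a string) onto assignate values.

def leftmost(s):            # positional primitive
    return s[0]

def rightmost(s):
    return s[-1]

def successor_pairs(s):     # fraction of adjacent pairs that are alphabetic successors
    pairs = list(zip(s, s[1:]))
    return sum(ord(b) - ord(a) == 1 for a, b in pairs) / len(pairs)

def sameness_pairs(s):      # fraction of adjacent pairs that are identical letters
    pairs = list(zip(s, s[1:]))
    return sum(a == b for a, b in pairs) / len(pairs)

def assignates(s):
    """Bundle the detector outputs into a (partly numeric) property vector."""
    return {
        "leftmost": leftmost(s),
        "rightmost": rightmost(s),
        "successor_group": successor_pairs(s),   # ~1.0 for "abc", "ijk"
        "sameness_group": sameness_pairs(s),     # ~1.0 for "mmm"
        "length": len(s),
    }

for s in ("abc", "iijjkk", "mmm"):
    print(s, assignates(s))
```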

Yet, Copycat's Slipnet also comprises rather abstract concepts such as "opposite." Furthermore, the most abstract concepts often act as links between more primitive concepts, or, in Hofstadter's terms, conceptual items of lower "conceptual depth".

My feeling here is that it is a fundamental mistake to implement concepts like "opposite" directly. What is opposite of something else is a deeply semantic concept in itself, and thus strongly dependent on the domain. I think that most of the interesting concepts, i.e. the most abstract ones, are domain-specific. Concepts like "opposite" could be considered as something "simple" only in the case of geometric or spatial domains.

Yet, that's not a weakness. We should use this as a design feature. Take the rather simple case shown in the next figure as an example. Here we simply mapped triplets of uniformly distributed random values onto a SOM. The three values can be readily interpreted as the parts of an RGB value, which renders the interpretation more intuitive. The special thing here is that the map has been a really large one: we defined approximately 700,000 nodes and fed approx. 6 million observations into it.

Figure 2: A SOM-based color map showing emergence of abstract features. Note that the topology of the map is a borderless toroid: Left and right borders touch each other (distance=0), and the same applies to the upper and lower borders.

We can observe several interesting things. The SOM didn’t come up with just any arbitrary sorting of the colors. Instead, a very particular one emerged.

First, the map is not perfectly homogeneous anymore. Very large maps tend to develop "anisotropies," symmetry breaks if you like, simply due to the fact that the signal horizon becomes an important issue. This should not be regarded as a deficiency, though. Symmetry breaks are essential for the possibility of the emergence of symbols. Second, we can see that two "color models" emerged, the RGB model around the dark spot in the lower left, and the YMC model around the bright spot in the upper right. Third, the distance between the bright, almost white spot and the dark, almost black one is maximized.

In other words, and not quite surprisingly, the conceptual distance is reflected as a geometrical distance in the SOM. As is the case in the TL-SOM, we now could use the SOM as a measurement device that transforms an unknown structure into an internal property, simply by using the locational property in the SOM as an assignate for a secondary SOM. In this way we not only can represent "opposite," but we even have a model procedure for "generalized oppositeness" at our disposal.
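The following sketch replays the experiment at a much smaller scale and then uses the toroidal map distance between best-matching units as exactly such an internal measurement; the resulting number could serve as an assignate for a secondary SOM. Map size, sample size and the particular color pairs are arbitrary choices; the qualitative outcome (white far from black, red close to pink) is what the argument above predicts, not a guaranteed property of every single run.

```python
import numpy as np

def train_toroidal_som(data, h=30, w=30, epochs=5, lr=0.4, sigma=6.0, seed=0):
    """Small-scale stand-in for the experiment above: a SOM with a borderless
    (toroidal) lattice trained on uniformly random RGB triplets."""
    rng = np.random.default_rng(seed)
    cb = rng.random((h, w, 3))
    gy, gx = np.mgrid[0:h, 0:w]
    total, step = epochs * len(data), 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / total
            by, bx = np.unravel_index(np.argmin(((cb - x) ** 2).sum(-1)), (h, w))
            # toroidal neighborhood: distances wrap around both borders
            dy = np.minimum(np.abs(gy - by), h - np.abs(gy - by))
            dx = np.minimum(np.abs(gx - bx), w - np.abs(gx - bx))
            nbh = np.exp(-(dy ** 2 + dx ** 2) / (2 * (sigma * (1 - frac) + 0.5) ** 2))
            cb += lr * (1 - frac) * nbh[..., None] * (x - cb)
            step += 1
    return cb

def bmu(cb, color):
    return np.unravel_index(np.argmin(((cb - color) ** 2).sum(-1)), cb.shape[:2])

def toroidal_distance(p, q, shape):
    dy = min(abs(p[0] - q[0]), shape[0] - abs(p[0] - q[0]))
    dx = min(abs(p[1] - q[1]), shape[1] - abs(p[1] - q[1]))
    return (dy ** 2 + dx ** 2) ** 0.5

rng = np.random.default_rng(1)
som = train_toroidal_som(rng.random((10000, 3)))

# "Generalized oppositeness": the map distance between best-matching units,
# which can itself be fed into a secondary SOM as an assignate.
white, black = np.array([1.0, 1.0, 1.0]), np.array([0.0, 0.0, 0.0])
red, pink = np.array([1.0, 0.0, 0.0]), np.array([1.0, 0.6, 0.6])
print(toroidal_distance(bmu(som, white), bmu(som, black), som.shape[:2]))  # typically large
print(toroidal_distance(bmu(som, red), bmu(som, pink), som.shape[:2]))     # typically small
```

In a production setting one would of course use a far larger map, as described above; the sketch only shows the principle of measuring on the map itself.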

It is crucial to understand this step of "observing the SOM," thereby conceiving the SOM as a filter, or more precisely as a measurement device. Of course, at this point it becomes clear that a large variety of such transposing and internal-virtual measurement devices may be thought of. Methodologically, this opens an orthogonal dimension to the representation of data, strongly resembling the concept of orthoregulation.

The map shown above even allows one to create completely different color models, for instance one around yellow and another one around magenta. Our color psychology is strongly determined by the sun's radiated spectrum and hence it reflects a particular Lebenswelt; yet, there is no necessity about it. Some insects, like bees, are able to perceive ultraviolet radiation, i.e. their colors may have 4 components, yielding a completely different color psychology, while the capability to distinguish colors remains perfectly intact.3

"Oppositeness" is just a "simple" example of an abstract concept and its operationalization using a SOM. We already mentioned the "serial" coherence of texts (and thus of general arguments), which can be operationalized as a sort of virtual movement across a SOM of a particular level of integration.

It is crucial to understand that there is no other model besides the SOM that combines the ability to learn from empirical data and the possibility for emergent abstraction.

There is yet another lesson that we can take home from the simple example above. Well, the example doesn't remain that simple. High-level abstraction, items of considerable conceptual depth, so to speak, require rather short assignate vectors. In the process of learning qua abstraction it appears to be essential that the mass of possible assignates derived from, or imposed by, the measurement of raw data is reduced. On the one hand, empiric contexts from very different domains should be abstracted, i.e. quite literally "reduced," into the same perspective. On the other hand, any given empiric context should be abstracted into (much) more than just one abstract perspective. The consequence is that we need a lot of SOMs, all separated "sufficiently" from each other. In other words, we need a dynamic population of self-organizing maps in order to represent the capability of abstraction in real life. "Dynamic population" here means that there are developmental mechanisms that result in a proliferation, almost a breeding, of new SOM instances in a seamless manner. Of course, the SOM instances themselves have to be able to grow and to differentiate, as we have described it here and here.

In a population of SOMs, the conceptual depth of a concept may be represented by the effort needed to arrive at a particular abstract "intension." This not only comprises the ordinary SOM lattices, but also processes like Markov models, simulations, idealizations qua SOMs, targeted modeling, the transition into symbolic space, synchronous or potential activations of other SOM compartments, etc. This effort may finally be represented as a "number."

Conclusions

The structure of a multi-layered system of Self-organizing Maps, as it has been proposed by Kohonen and co-workers, is a powerful model to represent emerging abstraction in response to empiric impressions. The Copycat model demonstrates how abstraction could be brought back to the level of application in order to become able to make analogies and to deal with "first-time exposures".

Here we tried to outline a potential path to bring these models together. We regard this combination, in the way we proposed it (or a quite similar one), as crucial for any advance in the field of machine-based episteme at large, but also for the rather confined area of machine learning. Attempts like that of Blank [4] appear to suffer seriously from categorical mis-attributions. Analogical thinking does not take place on the level of single neurons.

We didn't discuss alternative models here (so far; a small extension is planned). The main reasons are, first, that it would be an almost endless job, and second, that Hofstadter already did it, and as a result of his investigation he dismissed all the alternative approaches (from authors like Gentner, Holyoak, Thagard). For an overview of recent models of creativity, analogical thinking, or problem solving, Runco [5] provides a good starting point. Of course, many authors point in roughly the same direction as we did here, but mostly the proposals are circular, not helpful because the problematic is just replaced by another one (e.g. the infamous and completely unusable "divergent thinking"), or can't be implemented for other reasons. Holyoak and Thagard [6], for instance, claim that a "parallel satisfaction of the constraints of similarity, structure and purpose" is key in analogical thinking. Given our analysis, such statements are nothing but a great mess, mixing modeling, theory, vagueness and fluidity.

For instance, in cognitive psychology, and in the field of artificial intelligence as well, the hypothesis of Structure Mapping (SMT) finds a lot of supporters [7]. Hofstadter discusses similar approaches in his book. The SMT hypothesis is highly implausible and obviously a left-over of the symbolic approach to Artificial Intelligence, just transposed into more structural regions. The SMT hypothesis not only has to be implemented as a whole, it also has to be implemented for each domain specifically. There is no emergence of that capability.

The combination of the extended SOM—interpreted as a dynamic population of growing SOM instances—with the Copycat mechanism indeed appears as a self-sustaining approach into proliferating abstraction and—quite significantly—back from it into application. It will be able to make analogies in any field already at its first encounter with it, even regarding itself, since both the extended SOM and the Copycat comprise several mechanisms that may count as precursors of high-level reflexivity.

After this proposal little remains to be said on the technical level. One of those issues which remain to be discussed is the conditions for the possibility of binding internal processes to external references. Here our favorite candidate principle is multi-modality, that is, the joint and inextricable "processing" (in the sense of "getting affected") of words, images and physical signals alike. In other words, I feel that we have come close to the fulfillment of the ariadnic question of this blog: "Where is the Limit?" …even in its multi-faceted aspects.

A lot of implementation work now has to be performed, perhaps accompanied by some philosophical musings about "cognition", or, more appropriately, the "epistemic condition." I just would like to invite you to stay tuned for the software publications to come (hopefully in the near future).

Notes

1. see also the other chapters about the SOM, SOM-based modeling, and generalized modeling.

2. It is somehow interesting that in the brain of many animals we can find very small groups of neurons, if not even single neurons, that respond to primitive features such as verticality of lines, or the direction of the movement of objects in the visual field.

3. Ludwig Wittgenstein insisted all the time that we can't know anything about the "inner" representation of "concepts." It is thus free of any sense and meaning to claim knowledge about one's own inner states as well as those of others. Wilhelm Vossenkuhl introduces and explains the Wittgensteinian "grammatical" solipsism carefully and in a very nice way [8]. The only thing we can know about inner states is that we use certain labels for them, and the only meaning of emotions is that we report them in certain ways. In other terms, the only thing that is important is the ability to distinguish one's feelings. This, however, is easy to accomplish for SOM-based systems, as we have demonstrated here and elsewhere in this collection of essays.

4. Don't miss Timo Honkela's webpage, where one can find a lot of gems related to SOMs! The only puzzling issue about all the work done in Helsinki is that the people there constantly and pervasively misunderstand the SOM per se as a modeling tool. Despite their ingenuity, they completely neglect the issues of data transformation, feature selection, validation and data experimentation, which all have to be integrated to achieve a model (see our discussion here); for a recent example see here, or the cited papers about the WebSom project.

  • [1] Timo Honkela, Samuel Kaski, Krista Lagus, Teuvo Kohonen (1997). WEBSOM – Self-Organizing Maps of Document Collections. Neurocomputing, 21: 101-117.4
  • [2] Krista Lagus, Samuel Kaski, Teuvo Kohonen (2004). Mining massive document collections by the WEBSOM method. Information Sciences, 163(1-3): 135-156. DOI: 10.1016/j.ins.2003.03.017
  • [3] Klaus Wassermann (2010). Nodes, Streams and Symbionts: Working with the Associativity of Virtual Textures. The 6th European Meeting of the Society for Literature, Science, and the Arts, Riga, 15-19 June, 2010. available online.
  • [4] Douglas S. Blank. Implicit Analogy-Making: A Connectionist Exploration. Indiana University Computer Science Department. Available online.
  • [5] Mark A. Runco (2007). Creativity: Research, Development, and Practice. Elsevier.
  • [6] Keith J. Holyoak and Paul Thagard, Mental Leaps: Analogy in Creative Thought.
    MIT Press, Cambridge 1995.
  • [7] John F. Sowa, Arun K. Majumdar (2003). Analogical Reasoning. In: A. Aldo, W. Lex, & B. Ganter (eds.), Conceptual Structures for Knowledge Creation and Communication, Proc. Intl. Conf. on Conceptual Structures, Dresden, Germany, July 2003. LNAI 2746, Springer, New York, pp. 16-36. Available online.
  • [8] Wilhelm Vossenkuhl. Solipsismus und Sprachkritik. Beiträge zu Wittgenstein. Parerga, Berlin 2009.



Analogical Thinking, revisited.

March 19, 2012

What is the New York of California?

(I/II)

Or even, what is the New York of New York? Almost everybody will come up with the same answer, despite the fact that the question is ill-defined, and not only that: both the question and its answer can be described only after the final appearance of the answer. In other words, it is not possible to make any proposal, apriori to its completion, about the relevance of those properties that aposteriori are easily tagged as relevant for the description of both the question and the answer. Both the question and the solution do not "exist" in the way that is pretended by their form before we have finished making sense of them. There is a wealth of philosophical issues around this phenomenon, which we all have to bypass here. Here we will focus just on the possibility for mechanisms that could be invoked in order to build a model that is capable of behaving phenomeno-logically "as if".

The credit for rendering such questions and the associated problematics salient in the area of computer models of thinking belongs to Douglas Hofstadter and his "Fluid Analogies Research Group" (FARG). In his book "Fluid Concepts and Creative Analogies," which we already mentioned here, he proposes a particular model of which he claims that it is a proper model for analogical thinking. In constructing this model, which took more than 10 years of research, he did not try to stick (to get stuck?) to the neuronal level. Accordingly, one can't describe the performance of a tennis player at the molecular level, he says. Remarkably, he also keeps the so-called cognitive sciences and their laboratory wisdom at a distance. Instead, his starting point is everyday language, and presumably a good deal of introspection as well. He sees his model located at an intermediate level between the neurons and consciousness (quite a large field, though).

His overarching claim is as simple as it is distant from the mainstream of AI and cognitive science. (Note that Hofstadter does not formulate it as "analogical reasoning.")

Thinking is largely equivalent to making analogies.

Hofstadter is not interested in producing just another model of analogy making. There are indeed quite a lot of such models, which he discusses in great detail. And he refutes them all; he proves that they are all ill-posed, since they all do not start with perception. Without exception they all assume that the "knowledge" is already in the computer, and based on this assumption some computer program is established. Of course, such approaches are nonsense, euphemistically called the "knowledge acquisition bottleneck" by people working in the field of AI / machine learning. Yet, knowledge is nothing that could be externalized and then acquired subsequently by some other party; it can't be found "in" the world, and of course it can't be separated as something that "exists" beside the processing mechanisms of the brain, making the whole thing "smart." As already mentioned, such ideas are utter nonsense.

Hofstadter's basic strategy is different. He proposes to create a software system that is capable of "concept slipping" as an emergent phenomenon, deeply based on perceptional mechanisms. He even coined the term "high-level perception."

That is, the […] project is not about simulating analogy-making per se, but about simulating the very crux of human cognition: fluid concepts. (p.208)

This essay will investigate his model. We will find that despite its appeal it is nevertheless seriously unrealistic, even by Hofstadter's own standards. Yet, despite its particular weaknesses, it also demonstrates very interesting mechanisms. After extracting the cornerstones of his model we will try to map his insights to the world of self-organizing maps. We also will discuss how to transfer the interesting parts of Hofstadter's model. Hofstadter himself clearly stated the deficiencies of "connectionist models" of "learning"; yet, my impression is that he was not aware of self-organizing maps at that time. By "connectionism" he obviously referred to artificial neural networks (ANN), and for those we completely agree with his critique.

Before we start I would like to provide some original sources, that is, copies of those parts that are most relevant for this essay. These parts are from chapter 5, chapter 7 and chapter 8 of the aforementioned book. There you will find many more details and lucid examples in Hofstadter's own words.

Is there an Alternative to Analogies?

In order to find an alternative we have to take a small bird's-eye view. Very coarsely spoken, thinking transforms some input into some output while being affected and transforming itself. In some sense, any transformation of input to output transforms the transforming instance, though in vastly different degrees. A trivial machine just wears off; a trivial computer—that is, any digital machine that fits into the scheme of Turing computability1—can be reset to meet exactly a previous state. As soon as historical contingency is involved, reproducibility vanishes and strictly non-technical entities appear: memory, value, and semantics (among others).

This transformation game applies to analogy making, and it also applies to traditional modeling. Is it possible to apply any kind of modeling to the problematics represented by the "transfer game," for which those little questions posed in the beginning are just an example?

In this context, Hofstadter calls the modeling approach the brute-force approach (p.327, chp.8). The outline of the modeling approach could look like this (p.337); a bare-bones sketch of it in code follows the list.

  • Step 1: Run down the apriori list of city-characterization criteria and characterize the “source town” A according to each of them.
  • Step 2: Retrieve an apriori list of “target towns” inside target region Y from the data base.
  • Step 3: For each retrieved target town X, run down the a priori list of city-characterization criteria again, calculating X’s numerical degree of match with A for every criterion in the list.
  • Step 4: For each target town X, sum up the points generated in Step 3, possibly using apriori weights, thus allowing some criteria to be counted more heavily than others.
  • Step 5: Locate the target town with the highest overall rating as calculated in Step 4, and propose it as “the A of Y”.
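Here is a bare-bones rendering of that scheme, just to make its apriori character visible; every ingredient (criteria, weights, town profiles, the target region) has to be supplied in advance and is invented for the illustration, which is exactly Hofstadter's point.

```python
# Hypothetical brute-force "A of Y" matcher; all data and numbers are made up.

CRITERIA = ["population_rank", "finance_hub", "arts_scene", "port_city"]
WEIGHTS  = {"population_rank": 2.0, "finance_hub": 1.5, "arts_scene": 1.0, "port_city": 0.5}

# hypothetical characterizations on a 0..1 scale (input to Step 1 / Step 3)
PROFILES = {
    "New York":      {"population_rank": 1.0, "finance_hub": 1.0, "arts_scene": 1.0, "port_city": 1.0},
    "Los Angeles":   {"population_rank": 1.0, "finance_hub": 0.5, "arts_scene": 0.9, "port_city": 0.8},
    "San Francisco": {"population_rank": 0.6, "finance_hub": 0.8, "arts_scene": 0.8, "port_city": 0.9},
    "Sacramento":    {"population_rank": 0.3, "finance_hub": 0.2, "arts_scene": 0.3, "port_city": 0.1},
}
TARGET_REGION = {"Los Angeles", "San Francisco", "Sacramento"}  # Step 2

def degree_of_match(a, x, criterion):
    """Step 3: a 'hard-edged' per-criterion comparison (itself a hidden analogy problem)."""
    return 1.0 - abs(PROFILES[a][criterion] - PROFILES[x][criterion])

def a_of_y(source, region):
    scores = {}
    for town in region:
        scores[town] = sum(WEIGHTS[c] * degree_of_match(source, town, c)   # Step 4
                           for c in CRITERIA)
    return max(scores, key=scores.get), scores                             # Step 5

print(a_of_y("New York", TARGET_REGION))
```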

Any plausible apriori list of city-characterization criteria would be long, very long indeed. Effectively, it can't be limited in advance, since any imposed limit would represent a model that would claim to be better suited to decide about the criteria than the model being built. We are crushed by an infinite regress, not just in theory. What we experience here is Wittgenstein's famous verdict that justifications have to come to an end. Rules are embedded in the form of life ("Lebensform"), and without knowing everything about a particular Lebensform, and taking into consideration anything comprised by such (impossible) knowledge, we can't start to model at all.

He identifies four characteristic difficulties for the modeling approach with regard to his little “transfer game” that plays around with cities.

  • – Difficulty 1: It is psychologically unrealistic to explicitly consider all the towns one knows in a given region in order to come up with a reasonable answer.
  • – Difficulty 2: Comparison of a target town and a source town according to a specific city-characterization criterion is not a hard-edged mechanical task, but rather, can itself constitute an analogy problem as complex as the original top-level puzzle.
  • – Difficulty 3: There will always be source towns A whose “essence”—that is, set of most salient characteristics—is not captured by a given fixed list of city-characterization criteria.
  • – Difficulty 4: What constitutes a “town in region Y” is not apriori evident.

Hofstadter underpins his point with the following question (p.347).

What possible set of apriori criteria would allow a computer to reply, perfectly self-confidently, that the country of Monaco is “the Atlantic City of France”?

Of course, the “computer” should come up with the answer in a way that is not pre-programmed explicitly.

Obviously, the problematics of making analogies can't be solved algorithmically. Not only is there no such thing as a single "solution"; even the criteria to describe the problem are missing. Thus we can conclude that modeling, even in its non-algorithmic form, is not a viable alternative to analogy making.

The FARG Model

In the following, we investigate the model as proposed by Hofstadter and his group, mainly Melanie Mitchell. This is separated into the parts

  • – precis of the model,
  • – its elements,
  • – its extension as proposed by Hofstadter,
  • – the main problems of the model, and finally,
  • – the main superior aspects of the model as compared to connectionist models (from Hofstadter’s perspective, of course).

Precis of the Model

Hofstadter's conclusion from the problems with the model-based approach, and thus also the starting point for his endeavor, is that the making of an analogy must appear as an emergent phenomenon. Analogy itself can't be "defined" in terms of criteria, beyond sort of rather opaque statements about "similarity." The point is that this similarity could be measured only aposteriori, so this concept does not help. The capability for making analogies can't be programmed explicitly. It would not be the "making" of analogies anymore; it would just be a look-up of dead graphemes (not even symbols!) in a database.

He proves his ideas by means of a small piece of software called "Copycat". This name derives from the internal processes of the software, as making "almost identical copies" is an important ingredient of it. Yet, it also refers to the problem that appears if you say: "I am doing this, now do the same thing…"

Copycat has three major parts, which he labels as (i) the Slipnet, (ii) the Workspace, (iii) the Coderack.

The Coderack is a rack that serves as a launching site for a population of agents of various kinds. Agents decease and are being created in various ways. They may be spawned by other agents, by the Coderack, or by any of the items in the Slipnet—as a top-down specialist bred just to engage in situations represented by the Slipnet item. Any freshly created agent will first be put into the Coderack, regardless of its originator or kind.

Any particular agent behaves as a specialist for recognizing a particular situation or for establishing a particular relation between parts of the input "data," the initial observation. This recognition requires an apriori model, of course. Since these models are rather abstract as compared to the observational data, Hofstadter calls them "concepts." After their set-up, agents are put into the Coderack, from where they start in random order, but also dependent on their "inner state," which Hofstadter calls "pressure."
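A toy sketch of such a launching scheme: codelet-like agents wait in a rack and are launched stochastically, biased by an urgency/"pressure" value, and a successful agent may post follow-up agents. The class names, the numbers and the little successor-scout are illustrative and do not reproduce Copycat's actual code.

```python
import random

class Codelet:
    def __init__(self, name, urgency, action):
        self.name = name
        self.urgency = urgency      # the agent's "inner state"
        self.action = action        # a callable working on a tiny aspect of the task

    def run(self, workspace, coderack):
        self.action(workspace, coderack)

class Coderack:
    def __init__(self):
        self.codelets = []

    def post(self, codelet):
        self.codelets.append(codelet)

    def step(self, workspace):
        if not self.codelets:
            return
        # launching order is random but weighted by urgency
        chosen = random.choices(self.codelets,
                                weights=[c.urgency for c in self.codelets])[0]
        self.codelets.remove(chosen)
        chosen.run(workspace, self)

# example: a bond-scout codelet that, on success, posts a follow-up codelet
def scout_successor_bond(workspace, coderack):
    s = workspace["string"]
    for i, (a, b) in enumerate(zip(s, s[1:])):
        if ord(b) - ord(a) == 1 and (i, "succ") not in workspace["bonds"]:
            workspace["bonds"].add((i, "succ"))
            coderack.post(Codelet("group-builder", urgency=5.0,
                                  action=lambda ws, cr: None))  # placeholder follow-up
            return

workspace = {"string": "aabc", "bonds": set()}
rack = Coderack()
rack.post(Codelet("successor-scout", urgency=3.0, action=scout_successor_bond))
for _ in range(5):
    rack.step(workspace)
print(workspace["bonds"])
```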

The Slipnet is a loose “network” of deep and/or abstract concepts. In case of Copycat these concepts comprise

a, b, c, … , z, letter, successor, predecessor, alphabetic-first, alphabetic-last, alphabetic position, left, right, direction, leftmost, rightmost, middle, string position, group, sameness group, successor group, predecessor group, group length, 1, 2, 3, sameness, and opposite,

In total there are more than 60 such concepts. These items are linked together, with the length of a link reflecting the "distance" between concepts. This distance changes while Copycat is working on a particular task. The change is induced by the agents in response to their "success." The Slipnet is not really a "network," since it is neither a logistic network (it doesn't transport anything) nor an associative network like a SOM. It is also not suitable to conceive of it as a kind of filter in the sense of a spider's web or a fisherman's net. It is thus more appropriate to consider it simply as a non-directed, dynamic graph, where discrete items are linked.
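Read as a non-directed dynamic graph, the Slipnet can be sketched as follows: links between concept names carry a length, agents that succeed while relating two concepts pull them closer together, and unused links relax back towards a resting length. The concept names are taken from the list above; the update rule and all constants are invented for the illustration.

```python
# Toy sketch of the Slipnet as a non-directed graph with dynamic link lengths.

links = {
    frozenset(("successor", "predecessor")): 0.4,
    frozenset(("leftmost", "rightmost")):    0.4,
    frozenset(("left", "right")):            0.3,
    frozenset(("sameness", "opposite")):     0.7,
}
REST_LENGTH = dict(links)   # lengths the links relax towards

def report_success(a, b, strength=0.5):
    """An agent succeeded while relating concepts a and b: shorten their link."""
    key = frozenset((a, b))
    if key in links:
        links[key] *= (1.0 - 0.3 * strength)

def relax(rate=0.05):
    """Without reinforcement, link lengths drift back to their resting values."""
    for key in links:
        links[key] += rate * (REST_LENGTH[key] - links[key])

report_success("left", "right")
report_success("left", "right")
relax()
print({tuple(sorted(k)): round(v, 3) for k, v in links.items()})
```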

Finally, the third aspect is the Workspace. Hofstadter describes it as a "busy construction site" and likens it to the cytoplasm (p.216). In the Workspace, the agents establish bonds between the atomic items of the observation. As said, each agent knows nothing about the posed problem; it is just capable of performing on a mini-aspect of the task. The whole population of agents, however, builds something larger. It looks much like the activity of ants or termites, building some morphological structure in the hive, or a macroscopic dynamic effect on the level of the hive population. The Workspace is the location of such intermediate structures of various degrees of stability, meaning that some agents also work to remove a particular structure.

So far we have described the morphology. The particular dynamics unfolding on this morphology is settled between competition and cooperation, with the result of a collective calming down of the activities. The decrease in activity is itself an emergent consequence of the many parallel processes inside Copycat.

A single run of Copycat yields one instance of the result. Yet, a single answer is not the result itself. Rather, as different runs of Copycat yield different singular answers, the result consists of a probability density over the different singular answers (a toy aggregation across runs is sketched below the figure). For the letter domain in which Copycat is working, the result looks like this:

Figure 1: Probability densities as result of a Copycat run.
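A toy aggregation of this kind, assuming we can call a stochastic one_run() function: the interesting object is the frequency distribution over many runs, not any single answer. The answer strings and weights below are made up for the illustration.

```python
from collections import Counter
import random

def one_run(rng):
    # stand-in for a full Copycat run returning one answer string
    return rng.choices(["ijl", "ijd", "ijk", "hjk"],
                       weights=[0.55, 0.25, 0.15, 0.05])[0]

rng = random.Random(42)
answers = Counter(one_run(rng) for _ in range(1000))
for answer, count in answers.most_common():
    print(f"{answer}: {count / 1000:.2f}")   # empirical probability density over answers
```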

The Elements of the FARG Model

Before we proceed, I should emphasize that  here “element” is used as we have introduced the term here.

Returning to the FARG model, it is important to understand that a particularly constrained randomness plays a crucial role in its setup. The population of agents does not search through all possibilities all the time. Rather, any existing intermediate result, say a structural hypothesis, serves as a constraint for the future search.

We also find different kinds of memories with different durations, we find dynamic historic constraints, which we also could call contingencies, and we have a population of different kinds of agents that cooperate and compete. In some almost obvious way, Copycat's mechanisms may be conceived as an instance of the generalized evolution that we proposed earlier. Hofstadter himself is not aware that he just proposed a mechanism for generalized evolutionary change. He calls the process "parallel terraced scan," thereby unnecessarily sticking to a functional perspective. Yet, we consider generalized evolution as one of the elements of Copycat. It could really be promising to develop Copycat as an alternative to so-called genetic algorithms.2

Despite a certain resemblance to natural evolution, the mechanisms built into Copycat do not comprise an equivalent to what is known from biology as "gene doubling." Gene doubling and the akin mechanism of gene deletion are probably the most important mechanisms in natural evolution. Copycat produces different kinds of agents, but the informational setup of these agents does not change, as it is given by the Slipnet. The equivalent to gene doubling would have to be implemented in the Slipnet. On the other hand, however, it is clear that the items in the Slipnet are too concrete, almost representational. In contrast, genes usually do not represent a particular function on the macro-level (which is one of the main structural faults of so-called genetic algorithms). So, we conclude that Copycat contains a restricted version of generalized evolution. Besides, we see a structural resemblance to the theories of Edelman and his neuronal Darwinism, which actually is a nice insight.

Conceiving large parts of the mechanism of Copycat as (restricted) generalized evolution covers both the Coderack as well as the Workspace, but not the Slipnet.

The Slipnet acts as a sort of "Platonic Heaven" (Hofstadter's term). It contains various kinds of abstract terms, where "abstract" simply means "not directly observable." It is hence not comparable to those abstractions that can be used to build tree-like hierarchies. Think of the series "fluffy"-dog-mammal-animal-living entity. Significantly, the abstract terms in Copycat's Slipnet also comprise concepts about relations, such as "right," "direction," "group," or "leftmost." Relations, however, are nothing else than even more abstract symmetries, that is, transformational models that may even build a mathematical group. Quite naturally, we could consider the items in the Slipnet as a mathematical category (of categories). Again, Hofstadter and Mitchell do not refer in any way to such structures, quite unfortunately so.

The Slipnet's items may well be conceived as instances of symmetry relations. Hofstadter treats them as idealizations of positional relations. Any of these items acts as a structural property. This is a huge advance as compared to other models of analogy.

To summarize, we find two main elements in Copycat.

  • (1) restricted generalized evolution, and
  • (2) concrete instances of positional idealization.

Actually, these elements are top-level elements that must be conceived as compounds. In part 2 we will check out the elements of the Slipnet in detail, while the evolutionary aspects we already discussed in a previous chapter. Yet, this level of abstraction is necessary to render Copycat’s principles conceptually more mobile. In some way, we have to apply the principles of Copycat to the attempt to understand it.

The Copycat, released to the wild

Any generalization of Copycat has to withdraw the implicit constraints of its elements. In more detail, this would include the following changes:

  • (1) The representation of the items in the Slipnet could be changed into compounds, and these compounds should be expressed as “gene-like” entities.
  • (2) Introducing a mechanism to extend the Slipnet. This could be achieved through gene doubling in response to external pressures; yet, these pressures are not to be conceived as “external” to the whole system, just external to the Copycat. The pressures could be issued by a SOM. Alternatively, a SOM environment might also deliver the idealizations themselves. In either case, the resulting behavior of the Copycat has to be shaped by selection, either through internal mechanisms, or through environmentally induced forces (changes in the fitness landscape).
  • (3) The focus to positional idealization would have to be removed by introducing the more abstract notion of “symmetries”, i.e. mathematical groups or categories. This would render positional idealization just into a possible instance of potential idealization.

The resulting improvement from these changes would be dramatic. Not only would it be much easier to establish a Slipnet for any kind of domain, it would also allow the system (a CopyTiger?) to evolve new traits and capabilities, and to parametrize them autonomously. But these changes also require a change in the architectural (and mental) setup.

From Copycat to Metacat

Hofstadter himself tried to describe possible improvements of Copycat. A significant part of these suggestions for improvement is represented by the capability for self-monitoring and proliferating abstraction, hence he calls it “Metacat”.

The list of improvements comprises mainly the following five points (pp.315, chp.7).

  • (1) Self-monitoring of pressures, actions, and crucial changes as an explicit registering into parts of the Workspace.
  • (2) Disassembling of a given solution into the path of required actions.
  • (3) Hofstadter writes that "Metacat should store a trace of its solution of a problem in an episodic memory."
  • (4) A clear “meta-analogical” sense as an ability to see analogies between analogies, that is a multi-leveled type of self-reflectiveness.
  • (5) The ability to create and to enjoy the creation of new puzzles. In this context he writes: "Indeed, I feel that responsiveness to beauty and its close cousin, simplicity, plays a central role in high-level cognition."

I am not really convinced by these suggestions, at least not if they were implemented in the way that is suggested by Hofstadter "between the lines." They look much more like a dream than a reasonable list of improvements, perhaps except for the first one. The topic of self-monitoring has been explored by James Marshall in his dissertation [1], but still his version of "Metacat" was not able to learn. This self-monitoring should not be conceived as a kind of Cartesian theater [2], perhaps even populated with homunculi on both sides of the stage.

The second point is completely incompatible with the architecture of Copycat, and notably Hofstadter does not provide even the tiniest comment on it. The third point violates the concept of “memory” as a re-constructive device. Hofstadter himself says elsewhere, while discussing alternative models of analogy, that the brain is not a database, which is quite correct. “Memory” is not a storage device. Yet, the consequence is that analogy making can’t be separated from memory itself (and vice versa).

The fourth suggestion, then, would require further platonic heavens, in case of Copycat/Metacat created by a programmer. This is highly implausible, and since it is a consequence of the architecture, the architecture of Copycat as such is not suitable to address real-world entities.

Finally, the fifth suggestion displays a certain naivety regarding evolutionary contexts, regarding philosophical aspects of reasoning that have been known since Immanuel Kant, and regarding the particular setup of human cognition, where emotions and propositional reasoning appear as deeply entangled issues.

The main Problem(s) of the FARG model

We already mentioned Copycat’s main problems, which are (i) the “Platonic heaven”, and (ii) the lack of the capability to learn as a kind of structural self-transformation.

Both problems are closely related. Actually, somehow there is only one single problem, and that's the issue that Hofstadter got trapped by idealism. A Platonic heaven that is filled by the designer with an x-cat (or a Copy-x) is hard to comprehend. Even for the really small letter domain there are more than 60 of such idealistic, top-down and externally imposed concepts. These concepts have to be linked and balanced in just the right way, otherwise the Copycat will not behave in any interesting way. Furthermore, the Slipnet is a structurally static entity. There are some parameters that change during its activity, but Copycat does not add new items to its Slipnet.

For these reasons it remains completely opaque how Mitchell and Hofstadter arrived at that particular instance of the Slipnet for the letter domain, and thus it also remains completely unclear how the "computer" itself could build or achieve something like a Slipnet. Albeit Linhares [3] was able to implement an analogous FARG model for the domain of chess3, his model suffers from the static Slipnet in the same way: it is extremely tedious to set up a Slipnet. Furthermore, the validation is even more laborious, if not impossible, due to the very nature of making analogies and the idealistic Slipnet.

The result is, well, a model that cannot serve as a template for any kind of application that is designed to be able to adapt and to learn, at least if we take it as it is, without abstracting from it.

From an architectural point of view the Slipnet is simply not compatible with the rest of Copycat, which is strongly based on randomness and probabilistic processes in populations. The architecture of the Slipnet and the way it is used do not offer something like a probabilistic pathway into it. But why should the “Slipnet” not be a probabilistic process as well?

Superior Aspects of the FARG model

Hofstadter clearly and correctly separates his project from connectionism (p.308):

Connectionist (neural-net) models are doing very interesting things these days, but they are not addressing questions at nearly as high a level of cognition as Copycat is, and it is my belief that ultimately, people will recognize that the neural level of description is a bit too low to capture the mechanisms of creative, fluid thinking. Trying to use connectionist language to describe creative thought strikes me as a bit like trying to describe the skill of a great tennis player in terms of molecular biology, which would be absurd.

A cornerstone in Hofstadter’s arguments and concepts around Copycat is conceptual slippage. This occurs in the Slipnet and is represented as a sudden change in the weights of the items such that the most active (or influential) “neighborhood” also changes. To describe these neighborhoods, he invokes the concept of a halo. The “halo” is a more or less circular region around one of the abstract items in the Slipnet, yet without a clear boundary. Items in the Slipnet change their relative position all the time, thus their co-excitation also changes dynamically.
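To make the notion of the “halo” and its sudden re-arrangement a bit more tangible, here is a purely illustrative toy sketch in Python. It is explicitly not FARG’s code; the node names are merely borrowed from the letter domain and the “conceptual distances” are invented for the sketch. The halo is read as the set of items most strongly co-excited with a focus concept, and “slippage” shows up as the re-arrangement of that set once a single conceptual distance shrinks.

    # Toy sketch only, not the FARG implementation: a "halo" as the set of nodes
    # closest (most co-excited) to a focus concept, and "slippage" as the change
    # of that set when one conceptual distance suddenly shrinks.
    import numpy as np

    nodes = ["successor", "predecessor", "opposite", "leftmost", "rightmost"]
    idx = {n: i for i, n in enumerate(nodes)}

    def halo(focus, distance, k=2):
        """Return the k nodes closest to the focus concept (its current halo)."""
        d = distance[idx[focus]].copy()
        d[idx[focus]] = np.inf                    # exclude the focus itself
        return [nodes[i] for i in np.argsort(d)[:k]]

    # hand-made, symmetric "conceptual distances" (smaller = more strongly related)
    D = np.array([
        [0.0, 2.0, 4.0, 3.0, 5.0],
        [2.0, 0.0, 4.0, 5.0, 3.0],
        [4.0, 4.0, 0.0, 6.0, 6.0],
        [3.0, 5.0, 6.0, 0.0, 7.0],
        [5.0, 3.0, 6.0, 7.0, 0.0],
    ])
    print(halo("successor", D))      # ['predecessor', 'leftmost']

    # "slippage": the link successor--opposite suddenly shortens, the halo changes
    D2 = D.copy()
    D2[idx["successor"], idx["opposite"]] = 1.0
    D2[idx["opposite"], idx["successor"]] = 1.0
    print(halo("successor", D2))     # ['opposite', 'predecessor']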

Hofstadter lists (p.215) the following issues as missing in connectionist network (CN) models with regard to cognition, particularly with regard to conceptual slippage and fluid analogies.

  • CN don’t develop a halo around the representatives of concepts in the case of localist networks, i.e. node-oriented networks, and thus no slippability emerges;
  • CN don’t develop a core region for a halo in the case of networks where a “concept” is distributed throughout the network, and thus no slippability emerges;
  • CN have no notion of normality arising from the learning that is instantiated in any encounter with data.

This critique appears to be both a bit overdone and misdirected. As we have seen above, Copycat can be interpreted as comprising a slightly restricted case of generalized evolution. Standard neuronal techniques do not know of evolutionary techniques, there are no “coopetitioning” agents, and there is no separation into different memories of different durations. The abstraction achieved by artificial neuronal networks (ANN) or even by standard SOMs is always exhausted by the transition from extensional (observed items) to intensional descriptions (classes, types). The abstract items in the Slipnet are not just intensional descriptions and could not be found or constructed by an ANN or a SOM that works just on the observations, especially if there is just a single observation at all!

Copycat is definitely working in a different space compared to network-based models.1 While the latter can provide the mechanisms to proceed from extensions to intensions in a “bottom-up” movement, the former applies those intensions in a “top-down” manner. Saying this, we may invoke the reference to the higher forms of comparison and the Deleuzean differential. As with many other things mentioned here, this would deserve a closer look from a philosophical perspective, which however we can’t provide here and now.

Nevertheless, Hofstadter’s critique of connectionist models seems to be closely related to the abandonment of modeling as a model for analogy making. Any of the three points above can be mitigated if we take a particular collection of SOMs as a counterpart for Copycat. In the next section (which will be found in part II of this essay) we will see how the two approaches can inform each other.

Notes

1. We would like to point you to our discussion of non-Turing computation and also make you aware of this conference: 11th International Conference on Unconventional Computation & Natural Computation 2012, University of Orléans, conference website.

2. Interestingly, Hofstadter’s PhD student, co-worker and co-author Melanie Mitchell started to publish in the field of genetic algorithms (GA), yet she never realized the kinship between GA and Copycat, at least she never said so publicly.

3. He calls his model implementation “Capyblanca”; it is available through Google Code.

4. The example provided by Blank [4], where he tried to implement analogy making in a simple ANN, is seriously deficient in many respects.

  • [1] James B. Marshall, Metacat: A Self-Watching Cognitive Architecture for Analogy-Making and High-Level Perception. PhD Thesis, Indiana University 1999. available online (last access 18/3/2012)
  • [2] Daniel Dennett, Consciousness Explained. 1992. p.107.
  • [3] Alexandre Linhares (2008). The emergence of choice: Decision-making and strategic thinking through analogies. available online.
  • [4] Douglas S. Blank, Implicit Analogy-Making: A Connectionist Exploration. Indiana University Computer Science Department. available online.

۞

Elementarization and Expressibility

March 12, 2012 § Leave a comment

Since the beginnings of the intellectual adventure

that we know as philosophy, elements have played a particular and prominent role. For us, living as “post-particularists,” the concept of the element seems to be not only a familiar one, but also a simple, almost primitive one. One may take this as an aftermath of the ontological dogma of the four (or five) elements and its early dismissal by Aristotle.

In fact, I think that the concept of the element is seriously undervalued and hence disregarded much too often, especially with respect to its role as a structural tool in the task of organizing thinking. The purpose of this chapter is thus to reconstruct the concept of “element” in an adequate manner (at least, to provide some first steps of such a reconstruction). To achieve that we have to take three steps.

First, we will try to shed some light on its relevance as a more complete concept. In order to achieve this we will briefly visit the “origins” of the concept in (pre-)classic Greek philosophy. After browsing quickly through some prominent examples, the second part will then deal with the concept of the element as a thinking technique. For that purpose we strip the ontological part from it (what else?), and turn it into an activity, a technique, and ultimately into a “game of languagability,” called straightforwardly “elementarization.”

This will then lead us to the third part, which will deal with the problematics of expression and expressibility, or more precisely, with the problematics of how to talk about expression and expressibility. Undeniably, creativity is breaking (into) new grounds, and this aspect of breaking pre-existing borders also implies new ways of expressing things. To get clear about creativity thus requires getting clear about expressibility in advance.

The remainder of this essay is arranged into the following sections (active links):

The Roots1

Like many other concepts, the concept of the “element” first appeared in classic Greek culture. As a concept, the element, Greek “stoicheion”, in Greek letters ΣΤΟΙΧΕΙΟΝ, is quite unique because it is a synthetic concept, without predecessors in common language. The context of its appearance is the popularization of the sundial by Anaximander around 590 B.C. Sundials had been known before, but it was quite laborious to create them since they required a so-called skaphe, a hollow sphere serving as the projection site of the gnomon’s shadow.

Figure 1a,b. Left (a): a sundial in its ancient (primary) form based on a skaphe, which allowed for equidistant segmentation. Right (b): the planar projection involves hyperbolas and a complicated segmentation.

The planar projection promised a much easier implementation, yet it involves the handling of hyperbolas, which even change relative to the earth’s seasonal inclination. Moreover, the hours can no longer be indicated by equidistant segments. Thus, the mathematical complexity was beyond the capabilities of that time. The idea (presumably Anaximander’s) then was to determine the points for the hours empirically, using “local” time (measured by water clocks) as a reference.

Anaximander also became aware of the particular status of a single point in such a non-trivial “series”. It can’t be thought without reference to the whole series, and additionally, there was no simple rule that would have allowed for its easy reconstruction. This particular status he called an “element”, a stoicheion. Anaximander’s element is best understood as a constitutive component, a building block for the purpose of building a series; note the instrumental twist in his conceptualization.

From this starting point, the concept was generalized in its further career, soon denoting something like “basics,” or “basic principles”. While Empedokles conceived the four elements, earth, wind, water and fire, almost as divine entities, it was Platon (Timaios 48B, Theaitet 201) who developed the more abstract perspective of “elements as basic principles.”

Yet, the road of abstraction does not know a well-defined destination. Platon himself introduced the notion of an “element of recognition and proofing” for stoicheia. Isokrates, a famous rhetorician and contemporary of Platon, then extended the reach of stoicheia from “basic component / principle” to “basic condition.” This turn is quite significant since, as a consequence, it inverts the structure of argumentation from idealistic, positively definite claims to the constraints of such claims; it even opens the perspective towards the “condition of possibility”, a concept that is one of the cornerstones of Kantian philosophy, more than 2000 years later. No wonder Isokrates is said to have opposed Platon’s arguments.

Nevertheless, all these philosophical uses of stoicheia, the elements, treat them as ontological principles in the context of the enigma of the absolute origin of all things and the search for it. This is all the more remarkable as the concept itself had been constructed some 150 years earlier in a purely instrumental manner.

Aristotle dramatically changed the ontological perspective. He dismissed the “analysis based on elements” completely and established what is now known as the “analysis of moments”, to which the concepts of “form” and “substance” are central. Since Aristotle, elemental analysis has been regarded as a perspective heading towards “particularization”, while the analysis of moments is believed to be directed towards generalization. Elemental analysis and its ontology are considered somewhat “primitive,” probably due to their (historic) neighborhood to the dogma of the four elements.

True, the dualism made from form and substance is more abstract and more general. Yet, as a concept it not only loses contact with the empiric world, being completely devoid of processual aspects; it is also quite difficult, if not impossible, to think “substance” in a non-ontological manner. It seems as if that dualism abolishes even the possibility to think in a manner different from ontology, hence implying a whole range of severe blind spots: the primacy of interpretation, the deeply processual, event-like character of the “world” (the primacy of “process” against “being”), the communal aspects of human lifeforms and their creational power, and the issue of localized transcendence are just the most salient issues that are rendered invisible in the perspective of ontology.

Much more could be said, of course, about the history of those concepts. Aristotle’s introduction of the concept of substance is definitely not without its own problems, paving the way for the (overly) pronounced materialism of our days. And there is, of course, the “Elements of Geometry” by Euclid, the most widely circulated mathematical textbook ever. Yet, I am neither a historian nor a philologist, so let us now proceed with some examples. I just would like to emphasize that the “element” can be conceived as a structural topos of thinking from the earliest witnesses of historical time onward.

2. Examples

Think about the chemical elements as they have been invented in the 19th century. Chemical compounds, so the parlance of chemists goes, are made from chemical elements, which were typified by Mendeleev according to their valence electrons and then arranged into the famous “periodic table.” Mendeleev not only constructed a quality according to which various elements could be distinguished. His “basic principle” allowed him to make qualitative and quantitative predictions of astonishing accuracy. He predicted the existence of chemical elements, “nature’s substance”, unknown at that time, along with their physico-chemical qualities. Since this happened in the context of natural science, he could also validate these predictions. Without the concept of those (chemical) elements the (chemical) compounds can’t be properly understood. Today a similar development can be observed within the standard theory of particle physics, where basic types of particles are conceived as elements analogous to the chemical elements, just that in particle physics the descriptive level is a different one.

Here we have to draw a quite important distinction. The element in Mendeleev’s thinking is not the same as the chemical element itself. Mendeleev’s elements are (i) the discrete number (an integer between 1 and 7, and 0/8 for the noble gases like argon) that describes the free electrons as representatives of electrostatic forces, and (ii) the concept of “completeness” of the set of electrons in the so-called outer shell (or “orbitals”): the numbers of valence electrons of two different chemical elements tend to sum up to eight. Actually, chemical elements can be sorted into groups (gases, different kinds of metals, carbon and silicon) according to the mechanism by which they achieve this magic number (or fail to do so). As a result, there is a certain kind of combinatorialism; the chemical universe is almost a Lullian-Leibnizian one. Anyway, the important point here is that the chemical elements are only a consequence of a completely different figure of thought.

Still within chemistry, there is another famous, albeit less well-known example of abstract “basic principles”: Kekulé’s delocalized valence electrons in carbon compounds (in today’s notion: delocalized 6-π-electrons). Actually, Kekulé added the “element” of indeterminateness to the element of the valence electron. He dropped the idea of a stable state that could be expressed by a numerical value, or even by an integer. His 6-π-orbital is a cloud that cannot be measured directly as such. Today, it is easy to see that the whole area of organic chemistry is based on, or even defined by, these conceptual elements.

Another example is provided by the “Elements of Geometry” by Euclid. He called it “elements” probably for two main reasons. First, it was supposed to be complete; secondly, because you could not remove any of the axioms, procedures, proofs or lines of argument, i.e. any of its elements, without compromising the compound concept of “geometry.”

A further example from the classics is the conceptual (re-)construction of causality by Aristotle. He obviously understood that it is not appropriate to take causality as an impartible entity. Aristotle designed his idea of causality as an irreducible combination of four distinct elements: causa materialis, causa formalis, causa efficiens and causa finalis. To render this a bit more palpable, think about inflaming a wooden stick and then being asked: what is the cause of the stick burning?

Even if I put (causa efficiens) a wooden (causa materialis) stick (causa formalis) above an open flame (also part of causa efficiens), it will not necessarily be set aflame until I decide that it should be (causa finalis). This is a quite interesting structure, since it could be conceived as a precursor of the Wittgensteinian perspective of a language game.

For Aristotle it made no sense to assume that any of the elements of causality as he conceived it would be independent of any of the others. For him it would have been nonsense to conceive of causality as any subset of his four elements. Nevertheless, exactly this is what physics has done since Newton. In our culture, causality is almost always debated as if it were identical to causa efficiens. In Newton’s words: Actioni contrariam semper et aequalem esse reactionem. [2] Later, this postulate of actio = reactio has been backed by further foundational work through larger physical theories postulating the homogeneity of physical space. Despite the success of physics, the reduction of causality to physical forces remains just that: a reduction. Applying this principle again to any event in the world generates specific deficits, which are well visible in large parts of contemporary philosophy of science when it comes to the debate about the relation of natural science and causality (cf. [3]).

Aristotle himself did not call the components of causality “elements.” Yet, the technique he applied is just that: an elementarization. This technique was quite popular and well known from another discourse, involving earth, water, air, and fire. Eventually this model had to be abandoned, but it is quite likely that the idea of the “element” was handed down all the way to Mendeleev.

Characterizing the Concept of “Element”

As announced before, we would like to strip any ontological flavor from the concept of the element. This marks the difference between conceiving elements as part of the world or, alternatively, as part of a tool-set used in the process of constructing a world. This means taking the concept purely instrumentally, or in other words, as a language game. As such, it is also one of many examples of the necessity to remove any content from philosophy (ontology always claims some kind of such content, which is highly problematic).

A major structural component of the language game “element” is that the entities denoted by it are used as anchors for a particular non-primitive compound quality, i.e. a quality that can’t be perceived by just the natural five (or six, or so) senses.

On the other hand, they are also strictly different from axioms. An axiom is a primitive proposition that serves as a starting point in a formal framework, such as mathematics. The intention behind the construction of axioms is to utilize common sense as a basis for more complicated reasoning. Axioms are considered facts that cannot be seriously disputed as such. Thus, they are indeed the main element in the attempt to secure mathematics as an unbroken chain of logic-based reasoning. Of course, the selection of a particular axiom for a particular purpose can always be discussed. But in itself, an axiom is a “primitive”: either a simple, more or less empiric fact, or a simple mathematical definition.

The difference to elements is profound. One can always remove a single axiom from an axiomatic system without compromising the sense of the latter. Take for instance the axiom of associativity: giving it up leads from associative algebras to non-associative structures such as Lie algebras, the infinitesimal counterparts of Lie groups, with Klein’s geometries then appearing as just special cases within this more general picture. Or, removing the “axiom” of parallel lines from the Euclidean axioms brings us to more general notions of geometry.

In contrast to that pattern, removing an element from an elemental system destroys the sense of the system. Elemental systems are primarily thought of as a whole, as a non-decomposable thing, and any of the elements used is synthetically effective. Their actual meaning is only given by being part of a composition with other elements. Axioms, in contrast, are parts of decomposable systems, where they act as constraints. Removing them usually leads to improved generality. The axioms that build an “axiomatic system” are not tied to each other; they are independent as such. Of course, their interaction will always create a particular conditionability, but that is a secondary effect.

The synthetic activity of elements simply mirrors the assumption that there is (i) a particular irreducible whole, and (ii) that the parts of that whole have a particular relationship to the embedding whole. In contrast to the prejudice that elemental analysis results in an unsuitable particularization of the subject matter, I think that elements are highly integrated, yet in themselves non-decomposable idealizations of compound structures. This is true for the quaternity of earth, wind, water and fire, but also for the valence electrons in chemistry or the elements of complexity, as we have introduced them here. Elements are made from concepts, while axioms are made from definitions.

In some way, elements can be conceived as the operationalization of beliefs. Take a belief, symbolize it, and you get an element. From this perspective it again becomes obvious (on a second route) that elements cannot be conceived as something natural or even ontological; they cannot be discovered as such in a pure or stable form. They can’t be used to prove propositions in a formal system, but they are indispensable for explaining or establishing the possibility of thinking a whole.

Mechanism and organism are just different terms that can be used to talk about the same issue, albeit in a less abstract manner. Yet, it is clear that integrated phenomena like “complexity,” or “culture,” or even “text” can’t be appropriately handled without the structural topos of the element, regardless which specific elements are actually chosen. In any of these cases it is a particular relation between the parts and the whole that is essential for the respective phenomenon as such.

If we accept the perspective that conceives of elements as stabilized beliefs, we may recognize that they can be used as building blocks for the construction of a consistent world. Indeed, we may well say that it is due to their properties as described above, their positioning between belief and axiom, that we can use them as an initial scaffold (Gestell), which in turn provides the possibility for targeted observation, and thus for consistency, understood both as substance and as logical quality.

Finally, we should say a few words about the relation between elements and ideas. Elsewhere, we distinguished ideas from concepts. Ideas can’t be equated with elements either. Just the other way round, elements may contain ideas, but also concepts, relations and systems thereof, empirical hypotheses or formal definitions. Elements are, however, always immaterial, even in the case of chemistry. For us, elements are immaterial synthetic compounds used as interdependent building blocks of other immaterial things like concepts, rules, or hypotheses.

Many, if not all, concepts are built from elements in a similar way. The important issue is that elements are synthetic compounds which are used to establish further compounds in a particular manner. In the beginning there need not be any kind of apriori justification for a particular choice or design. The only requirement is that the compound built from them allows for some kind of beneficial usage in creating higher integrated compounds which would not be achievable without them.

4. Expressibility

Elements may well be conceived as epistemological stepping stones, capsules of belief that we use to build up beliefs. Thus, the status of elements is somewhere between models and concepts, not as formal and restricted as models and not as transcendental as concepts, yet still with much stronger ties to empiric conditions than ideas have.

It is quite obvious that such a status reflects a prominent role for perception as well as for understanding. The element may well be conceived as an active zone of differentiation, a zone from which different kinds of branches emerge: ideas, models, concepts, words, beliefs. We could also say that elements are close to the effects and the emergence of immanence. The ΣΤΟΙΧΕΙΟΝ itself, its origins and transformations, may count as an epitome of this zone, where thinking creates its objects. It is “here” that expressibility finds its conditions.

At that point we should recall – and keep in mind – that elements should not be conceived as an ontological category. Elements unfold as (rather than “are”) a figure of thought, an idiom of thinking, a figure for thought. Of course, we can deliberately visit this area, and we may develop certain styles to navigate these (sometimes) misty areas. In other words, we may develop a culture of elementarization. Sadly enough, positivism, which emerged from the materialism of the 19th century on the line from Auguste Comte down to Frege, Husserl, Schlick, Carnap and van Fraassen (among others), destroyed much of that style. In my opinion, much of the inventiveness of the 19th century could be attributed to a certain, yet largely unconscious, attitude towards the topos of the “element.”

No question, elevating the topos of the element into consciousness, as a deliberate means of thinking, is quite promising. Hence, it is also of some importance to our question of machine-based episteme. We may just add a further twist to this overarching topic by asking about the mechanisms and conditions that are needed for the possibility of “elementarization”. In still other words, we could say that elements are the main element of creativity. And we may add that the issue of expression and expressibility is not about words and texts, even though texts and words have potentiated the dynamics and the density of expressibility.

Before we can step on to harvest the power of elementarization we have to spend some effort on the issue of the structure of expression. The first question is: what exactly happens if we invent an element and impose it in and on our thoughts? The second salient question is about the process forming the element itself. Is the “element” just a phenomenological, descriptive parlance, or is it possible to specify some mechanisms for it?

Spaces and Dimensions

As already demonstrated by Anaximander’s ΣΤΟΙΧΕΙΟΝ, elements put marks into the void. The “element game” introduces discernibility, and it is central to the topos of the element that it implies a whole, an irreducible set, of which it is a constitutive part. This way, elements don’t act just as signposts that indicate a direction in an already existing landscape. It is more appropriate to conceive of them as generators of landscape. Even before words, whether spoken or written, elements are the basic instance of externalization, abstract writing, so to speak.

It is the abstract topos of elements that introduces the complexities around territorialization and deterritorialization into thought, a dynamics that can never come to an end. Yet, let us focus here on the generative capacities of elements.

Elements transform existing spaces or create completely new ones; they represent the condition for the possibility of expressing anything. The implications are rather strong. Looking back from that conditioning to the topos itself, we may recognize that wherever there is some kind of expression, there is also a germination zone of ideas, concepts and models, and above all, belief.

The space implied by elements is a particular one, though, due to the fact that it inherits the aprioris of wholeness and non-decomposability. Non-decomposability means that the elemental space loses essential qualities if one of the constituting elements is removed.

This may be contrasted with the Cartesian space, the generalized Euclidean space, which is the prevailing concept of space today. A Cartesian space is spanned by dimensions that are set orthogonal to each other. This orthogonality of the dimensional setup allows one to change the position in just one dimension while keeping the position in all the other dimensions unchanged, constant. The dimensions are independent of each other. Additionally, the quality of the space itself does not change if we remove one of the dimensions of an n-dimensional Cartesian space (n>1). Thus, the Cartesian space is decomposable.

Spaces are inevitably implied as soon as entities are conceived as carriers of properties, in fact even if just a single (“1”!) property is assigned to them. These assigned properties, or in short: assignates, can then be mapped to different dimensions. A particular entity thus becomes visible as a particular arrangement in the implied space. In the case of Cartesian spaces, this arrangement consists of a sheaf of vectors, which is as specific for the mapped entity as could be desired.

Dimensions may refer to sensory modalities, to philosophical qualia, or to constructed properties of development in time, that is, concepts like frequency, density, or any kind of pattern. Dimensions may even be purely abstract, as in the case of random vectors or random graphs, which we discussed here, where the assignate refers to some arbitrary probability or to a structural, method-specific parameter.

Many phenomena remain completely mysterious if we do not succeed in setting up the (approximately) right number of dimensions or aspects. This has been famously demonstrated by Abbott and his Flatland [4], or by Ian Stewart and his Flatterland [5]. Other examples are the so-called embedding dimension in complex systems analysis, or the analysis of (mathematical) cusp catastrophes by Ian Stewart [6]. Dimensionality also plays an important role in the philosophy of science, where Ronald Giere uses it to develop a “scientific perspectivism.” [7]

Consider the example of a cloud of points in 3-dimensional space, which forms a spiral-like shape, with the main axis of the shape parallel to the z-axis. For points in the upper half of the cloudy spiral there shall be a high probability that they are blue; those in the lower half shall be mostly red. In other words, there is a clear pattern. If we now project the points onto the x-y-plane, i.e. if we reduce dimensionality, we lose the possibility of recognizing the pattern. Yet, the conclusion that there “is” no pattern is utterly wrong. The selection of a particular number of dimensions is a rather critical operation. Hence, taking action without reflecting on the dimensionality of the space of expressibility quite likely leads to severe misinterpretations. The cover of Douglas Hofstadter’s first book “Gödel, Escher, Bach” featured a demonstration of the effect of projection from higher to lower dimensionality, and another presentation can be found here on YouTube, featuring Carl Sagan on the topic of dimensionality.
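As a small, hedged illustration of this projection effect, the following sketch (invented data, numpy assumed) builds such a spiral-shaped cloud and compares a simple color statistic in the full 3d space and in the x-y projection: along z the two colors separate cleanly, whereas in the projection they overlap almost completely.

    # Minimal sketch with invented data: a spiral cloud whose color pattern is
    # obvious along the z-axis but disappears after projection onto the x-y plane.
    import numpy as np

    rng = np.random.default_rng(0)
    n = 2000
    t = rng.uniform(0, 4 * np.pi, n)            # angle along the spiral (two turns)
    z = t / (4 * np.pi)                          # main axis parallel to z, in [0, 1]
    x = np.cos(t) + rng.normal(0, 0.1, n)        # spiral-like shape in x and y
    y = np.sin(t) + rng.normal(0, 0.1, n)
    blue = (z > 0.5) ^ (rng.random(n) < 0.1)     # upper half mostly blue, lower mostly red

    # In 3d the pattern is easy to detect: z separates the two colors clearly ...
    print("mean z | blue:", z[blue].mean(), "  mean z | red:", z[~blue].mean())
    # ... whereas in the x-y projection the color classes are indistinguishable.
    print("mean x | blue:", x[blue].mean(), "  mean x | red:", x[~blue].mean())
    print("mean y | blue:", y[blue].mean(), "  mean y | red:", y[~blue].mean())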

In mathematics, the relation between two spaces of different dimensionality may itself form an abstract space, a so-called manifold. This exercise of checking out the consequences of removing or adding a dimension or aspect from the space of expressibility is a rewarding game even in everyday life. In the case of fractals in the development of time series, Mandelbrot even conceptualizes a changing dimensionality of the space which is used to embed the observations over time.

Undeniably, this decomposability contributed much to the rise and the success of what we call modern science. Any of the spaces of mathematics or statistics is a Cartesian space. Riemann spaces, Hilbert spaces, Banach spaces, topological spaces etc. are all Cartesian insofar as the dimensions are arranged orthogonally to each other, thus introducing independence of elements before any other definition. Though, the real revolutionary contribution of Descartes was not the setup of independent dimensions; it was the “Copernican” move of shifting the “origin” around and, with that, of mobilizing the reference system of a particular measurement.

But again: by performing this mapping, the wholeness of the entity is lost. Any interpretation of the entities requires a point outside of the Cartesian dimensional system. And precisely this externalized position is not possible for an entity that itself “performs cognitive processes.”2 It would be quite interesting to investigate the epistemic role of the externalization of mental affairs through cultural techniques like words, symbols, or computers, yet that task would be huge.

Despite the success of the Cartesian space as a methodological approach, it obviously also remains true that there is no free lunch in the realm of methods and mappings. In the case of the Cartesian space this cost is as huge as its benefit, as both are linked to its decomposability. In Cartesian space it is not possible to speak about a whole; whole entities are simply nonexistent. This is indeed as dramatic as it sounds. Yet, it is a direct consequence of the independence of the dimensions. There is nothing in the structure of the Cartesian space that could be utilized as a kind of medium to establish coherence. We already emphasized that the structure of the Cartesian space implies the necessity of an external observer. This, however, is not quite surprising for a construction devised by Descartes in the age of absolutist monarchies symbiotically tied to Catholicism, where the idea of the machine had been applied pervasively to anything and everything.

There are still further assumptions underlying the Cartesian conception of space. Probably the two most salient ones concern density and homogeneity. At first it might sound somewhat crazy to conceive of a space of inhomogeneous dimensionality. Such a space would have “holes” about which one could neither talk from within that space nor would they be recognizable. Yet, from theoretical physics we know about the concept of wormholes, which precisely represent such an inhomogeneity. Nevertheless, the “accessible” parts of such a space would remain Cartesian, so we could call the whole entity “weakly Cartesian”. A famous example is provided by Benoît Mandelbrot’s warping of dimensionality in the time domain of observations [8,9].

From an epistemological perspective, the Cartesian space is just a particular instance of the standardization, or even institutionalization, of the inevitable implication of spaces. Yet, epistemic spaces are not just 3-dimensional, as Kant assumed in his investigation; epistemic spaces may comprise a large and even variable number of dimensions. Nevertheless, Kant was right about the transcendental character of space, though the space we refer to here is not just the 3d- or (n)d-physical space.

Despite the success of the Cartesian space, which builds on the elements of separability, decomposability and the externalizable position of the interpreter, it is perfectly clear that it is nothing other than a particular way of dealing with spaces. There are many empirical, cognitive or mental contexts for which the assumptions underlying the Cartesian space are severely violated. Such contexts usually involve the wholeness of the investigated entity as a necessary apriori. Think of complexity, language, or the concept of life forms with representatives like urban cultures; for any of these domains the status of any part can’t be qualified in any reasonable manner without always referring to the embedding wholeness.

The Aspectional Space

What we need is a more general concept of space, one that does not start with any assumption about decomposability (or its refutation). Since it is always possible to test and then drop the assumption of dependence (non-decomposability), but never the assumption of independence (decomposability), we should start with a concept of space which keeps the wholeness intact.

Actually, it is not too difficult to start with a construction of such a space. The starting point is provided by a method to visualize data, the so-called ternary diagram. Particularly in metallurgy and geology, ternary diagrams are in abundant use for the purpose of expressing mixing proportions. The following Figure 2a shows a general diagram for three components A, B, C, and Figure 2b shows a concrete diagram for a three-component steel alloy at 900°C.

Figure 2a,b: Ternary diagrams in metallurgy and geology are precursors of aspectional spaces.

Such ternary diagrams are used to express the relation between different phases where the influential components all influence each other. Note that the area of the triangle in such a ternary diagram comprises the whole universe as it is implied by the components. However, in principle it is still possible (though not overly elegant) to map the ternary diagram as it is used in geology into Cartesian space, because there is a strongly standardized way of mapping the values. Any triple of values (a,b,c) is mapped to the axes A,B,C such that these axes are served counter-clockwise, beginning with A. Without that rule a unique mapping of single points from the ternary space to the Cartesian space would not be possible any more. Thus we can see that the ternary diagram does not introduce a fundamental difference compared to the Cartesian space defined by orthogonal axes.
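As a minimal sketch of such a standardized mapping: the concrete corner coordinates A=(0,0), B=(1,0), C=(0.5, √3/2) used below are an assumed convention for the illustration; the text only states that some fixed, counter-clockwise convention exists. Given that convention, a ternary triple is sent to a unique point of the plane as follows.

    # Minimal sketch: mapping a ternary (barycentric) triple (a, b, c) to a unique
    # point in the Cartesian plane. The corner placement is an assumed convention.
    from math import sqrt

    def ternary_to_cartesian(a, b, c):
        total = a + b + c
        if total <= 0:
            raise ValueError("proportions must sum to a positive value")
        a, b, c = a / total, b / total, c / total   # normalize to proportions
        # barycentric combination of the corners A=(0,0), B=(1,0), C=(0.5, sqrt(3)/2)
        x = b * 1.0 + c * 0.5
        y = c * (sqrt(3) / 2)
        return x, y

    # Example: a 60/30/10 mixture of the components A, B, C
    print(ternary_to_cartesian(60, 30, 10))   # one unique point inside the triangle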

Now let us drop this standard of the arrangement of axes. None of the axes should be primary relative to any other. Obviously, the resulting space is completely different from the spaces shown in Fig. 2. In this space we can keep at most one of n dimensions constant while changing position (by moving along an arc around one of the corners). Compare this to the Cartesian space, where it is possible to change just one dimension and keep all the others constant. For this reason we should no longer call the boundaries of such a space “axes” or “dimensions”. By convention, we may call the scaling entities “aspections“, derived from “aspect,” a concept that, similarly to the concept of the element, indicates the non-decomposability of the embedding context.

As said, the space that we are going to construct for a mapping of elements can’t be transformed into a Cartesian space any more. It is an “aspectional space”, not a dimensional space. Of course, the aspectional space, together with the introduction of “aspections” as a companion concept to “dimension”, is not just a Glass Bead Game. We urgently need it if we want to talk transparently and probably even quantitatively about the relation between parts and wholes in a way that keeps the dependency relations alive.

The requirement of keeping the dependency relations alive has an interesting consequence. It turns the corner points into singular points, or more precisely into poles, as the underlying apriori assumption is just the irreducibility of the space. In contrast to the ternary diagram (which is thus still Cartesian), the aspectional space is neither defined at the corner points nor along the borders (“edges”). In other words, the aspectional space has no border, despite the fact that its volume appears to be limited. Since it would be somewhat artificial to exclude the edges and corners by dedicated rules, we prefer to achieve the same effect (of exclusion) by choosing a particular structure of the space itself. For that purpose, it is quite straightforward to provide the aspectional space with a hyperbolic structure.

The artist M.C. Escher produced a small variety of confined hyperbolic disks that perfectly represent the structure of our aspectional space. Note that in such a disk there are no “aspects”; it is a zero-aspectional space. Remember that the 0-dimensional mathematical point represents a number in Cartesian space. This way we even invented a new class of numbers!3 A value in this class of numbers would (probably) represent the structure of the space, in other words the curvature of the hyperbola underlying the scaling of the space. Yet, the whole mathematics around this space and these numbers remains undiscovered!

Figure 3: M.C. Escher’s hyperbolic disk, capturing infinity on the table.

Above we said that this space appears to be limited. This impression of a limitation would hold only for external observers. Yet, our interest in aspectional spaces is precisely given by the apriori assumption of non-decomposability and the impossibility of such an external position for cognitive activities. Aspectional spaces are suitable just for those cases where such an external position is not available. From within such a hyperbolic space, the limitation would not be experienceable, at least not by simple means: the propagation of waves would be different compared to the Cartesian space.

Aspections, Dimensions

So, what is the status of the aspectional space, especially as compared to the dimensional Cartesian space? A first step of such a characterization would investigate the possibility of transforming those spaces into each other. A second part would not address the space itself, but its capability to do some things uniquely.

So, let us start with the first issue, the possibility of a transition between the two species of spaces. Think of a three-aspectional space. The space is given by the triangularized relation, where the corners represent the intensity or relevance of a certain aspect. Moving around on this plane changes the distance to at least two (n-1) of the corners, but most moves change the distance to all three of them. Now, if we reduce the conceptual difference and/or the possible difference of intensity between all three corners, we experience a sudden change of the quality of the aspectional space when we perform the limit transition into a state where all differential relevance has been expelled; the aspects would behave perfectly collinearly.

Of course, we then would drop the possibility of dependence, claiming independence as a universal property, resulting in a jump into Cartesian space. Notably, there is no way back from the dimensional Cartesian space into aspectional spaces: there is a transformation of the aspectional space which produces a Cartesian space, while the opposite is not possible.

This formal exercise sheds an interesting light on the life form of the 17th century and on Descartes. Indeed, even assuming the mere possibility of dependence would grant parts of the world autonomy, something that was categorically ruled out in those times. The idea of God as it was prevalent then implied the mechanical character of the world.

Anyway, we can conclude that aspectional spaces are more general than Cartesian spaces, as there is a transition in only one direction. Aspectional spaces are indeed formal spaces, just as Cartesian spaces are. It is possible to define negative numbers, and it is possible to provide them with different metrics or topologies.

Figure 4: From aspectional space to dimensional space in 5 steps. Descartes’ “origin” turns out to be nothing other than the abolishment or conflation of elements, which again could be interpreted as a strongly metaphysically influenced choice.

Now to the second aspect of the kinship between aspections and dimensions. One may wonder whether the kind of dependency that can be mapped to aspectional spaces could not be modeled in dimensional spaces as well, for instance by some functional rule acting on the relation between two dimensions. A simple example would be regression, or indeed any analytic function y=f(x).

At first sight it seems that this could result in similar effects. We could, for instance, replace two such dependent dimensions by a new dimension, which has been synthesized in a rule-based manner, e.g. by applying a classic analytical closed-form function. The dependency would disappear and all dimensions would again be orthogonal, i.e. independent of each other. Such an operation, however, would require that the dimensions are already abstract enough to be combined by closed analytical functions. This then reveals that we have put the claim of independence into the considerations before anything else. Claiming the perfect equivalence of a functional mapping of dependency into independence is thus a petitio principii. No wonder we find it possible to do so in a later phase of the analysis. It is thus obvious that the epistemological status of a dependency that is secondary to the independence of dimensions is completely different from that of primary dependence.
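A hedged sketch of this “secondary” treatment of dependency (invented data, numpy assumed): a dependent dimension is replaced by a synthesized one, namely the residual of a closed-form rule, so that the resulting coordinates are orthogonal again. Note how the move presupposes that the two abstract, independent axes were already in place.

    # Minimal sketch with invented data: dependency handled only "after the fact"
    # in a Cartesian setting, by synthesizing a new, rule-based dimension.
    import numpy as np

    rng = np.random.default_rng(1)
    x1 = rng.uniform(0, 10, 200)
    x2 = 2.5 * x1 + 1.0 + rng.normal(0, 0.2, 200)   # x2 depends on x1 (plus noise)

    slope, intercept = np.polyfit(x1, x2, 1)         # the analytic rule y = f(x)
    x2_new = x2 - (slope * x1 + intercept)           # synthesized replacement dimension

    print(np.corrcoef(x1, x2)[0, 1])      # close to 1: strong dependency in the raw axes
    print(np.corrcoef(x1, x2_new)[0, 1])  # close to 0: dependency "expelled" by construction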

A brief Example

A telling example4 of such an aspectional space is provided by the city theory of David Grahame Shane [10]. The space created by Shane, in order to fit his interest in a non-reductionist coverage of the complexity of cities, represents a powerful city theory, from which various models can be derived. The space is established through the three elements of armature, enclave and (Foucaultian) heterotopia. Armature is, of course, a rather general concept – designed to cover more or less straight zones of transmission or the guidance for such – which however nicely expresses the double role of “things” in a city. It points to things as part of the equipment of a city as well as to their role as anchor (points). Armatures, in Shane’s terminology, are things like gates, arcades, malls, boulevards, railways, highways, skyscrapers or particular forms of public media, that is, particular forms of passages. Heterotopias, on the other hand, are rather complicated “things”; at least the term invokes the whole philosophical stance of the late Foucault, to whom Shane explicitly refers. For any of these elements, Shane then provides extensions and phenomenological instances, as values if you like, from which he builds a metric for each of the three basic aspects. Throughout his book he demonstrates the usefulness of his approach, which is based on these three elements. This usefulness becomes tangible because Shane’s city theory is an aspectional space of expressibility which allows one to compare and to relate an extreme variety of phenomena regarding the city and urban organization. Of course, we must expect other such spaces in principle; this would not only be interesting, but also a large amount of work to complete. Quite likely, however, it would just be an extension of Shane’s concept.

5. Conclusion

Freeing the concept of “element” from its ontological burden turns it into a structural topos of thinking. The “element game” is a mandatory condition for the creation of spaces that we need in order to express anything. Hence, the “element game,” or briefly, the operation of “elementarization,” may be regarded as the prime instance of externalization and as such also as the hot spot of the germination of ideas, concepts and words, both abstract and factual. For our concerns here about machine-based episteme it is important that the notion of the element provides an additional (new?) possibility to ask about the mechanism in the formation of thinking.

Elementarization also represents the condition for “developing” ideas and for “settling” them. Yet, our strictly non-ontological approach helps to avoid premature and final territorialization in thought. Quite to the contrary, if understood as a technique, elementarization helps to open new perspectives.

Elementarization appears as a technique to create spaces of expressibility, even before words and texts. It is thus worthwhile to consider words as representatives of a certain dynamics around processes of elementarization, both as an active as well as a passive structure.

We have been arguing that the notion of space does not automatically determine the space to be a Cartesian space. Elements do not create Cartesian spaces. Their particular reference to the apriori acceptance of an embedding wholeness renders both the elements and the space implied by them incompatible with Cartesian space. We introduced the notion of “aspects” in order to reflect the particular quality of elements. Aspects are the result of a more or less volitional selection and construction.

Aspectional spaces are spaces of mutual dependency between aspects, while Cartesian spaces claim that dimensions are independent of each other. Concerning the handling and usage of spaces, parameters have to be sharply distinguished both from aspects and from dimensions. In mathematics and in the natural sciences, parameters are distinguished from variables. Variables are to be understood as containers for all allowed instances of values of a certain dimension. Parameters merely modify the operation of placing such a value into the coordinate system. In other words, they do not change the general structure of the space used for, or established by, performing a mapping, and they do not even change the dimensionality of the space itself. For designers as well as scientists, and more generally for any person acting with or upon things in the world, it is thus more than naive to play around with parameters without explicating or challenging the underlying space of expressibility, whether this is a Cartesian or an aspectional space. From that it also follows that the estimation of parameters can’t be regarded as an instance of learning.

Here we didn’t mention the mechanisms that could lead to the formation of elements. Yet, it is quite important to understand that we didn’t just shift the problematics of creativity to another descriptional layer without getting a better grip on it. The topos of the element allows us to develop and to apply a completely different perspective on the “creative act.”

The mechanisms that could be put in charge of generating elements will be the issue of the next chapter. There we will deal with relations and their precursors. We will also briefly return to the topos of comparison.

Part 3: A Pragmatic Start for a Beautiful Pair

Part 5: Relations and Symmetries (forthcoming)

Notes

1. Most of the classic items presented here I have taken from Wilhelm Schwabe’s superb work about the ΣΤΟΙΧΕΙΟΝ [1], in Latin letters “stoicheion.”

2. The external viewpoint was recognized as an unattainable desire already by Archimedes, long ago.

3. Just consider the imaginary numbers, which are basically 2-dimensional entities, where the imaginary unit i expresses a turn of 90 degrees in the plane.

4. Elsewhere [11] I dealt in more detail with Shane’s approach, a must read for anyone dealing with or interested in cities or urban culture.

  • [1] Wilhelm Schwabe. ‘Mischung’ und ‘Element’ im Griechischen bis Platon. Wort- u. begriffsgeschichtliche Untersuchungen, insbes. zur Bedeutungsentwicklung von ΣΤΟΙΧΕΙΟΝ. Bouvier, Bonn 1980.
  • [2] Isaac Newton: Philosophiae naturalis principia mathematica. Bd. 1 Tomus Primus. London 1726, S. 14 (http://gdz.sub.uni-goettingen.de/no_cache/dms/load/img/?IDDOC=294021)
  • [3] Wesley C. Salmon. Explanation and Causality. 2003.
  • [4] Abbott. Flatland.
  • [5] Ian Stewart Flatter Land.
  • [6] Ian Stewart & nn, Catastrophe Theory
  • [7] Ronald N. Giere, Scientific Perspectivism.
  • [8] Benoit B. Mandelbrot, Fractals: Form, Chance and Dimension. Freeman, New York 1977.
  • [9] Benoit B. Mandelbrot, Fractals and Scaling in Finance. Springer, New York 1997.
  • [10] David Grahame Shane, Recombinant Urbanism, Wiley, New York 2005.
  • [11] Klaus Wassermann (2011). Sema Città – Deriving Elements for an applicable City Theory. in: T. Zupančič-Strojan, M. Juvančič, S. Verovšek, A. Jutraž (eds.), Respecting fragile places, 29th Conference on Education in Computer Aided Architectural Design in Europe, eCAADe. available online.

۞

Ideas and Machinic Platonism

March 1, 2012 § Leave a comment

Once the cat had the idea to go on a journey…
You don’t believe me? Did not your cat have the same idea? Or is your doubt about my belief that cats can have ideas?

So, look at this individual here, who is climbing along the facade, outside the window…

(sorry for the spoken comment being available only in German language in the clip, but I am quite sure you got the point anyway…)

Cats definitely know about the height of their own position, and this one is climbing from flat to flat … outside, on the facade of the building, on the 6th floor. Crazy, or cool, respectively, in the full meaning of the word, this cat here, since it looks like she has been having a plan… (of course, anyone who has ever lived together with a cat knows very well that they can have plans… pride, like this one here, and also remorse…)

Yet, what would your doubts look like if I said “Once the machine got the idea…”? Probably you would stop talking or listening to me, turning away from this strange guy. Anyway, just that is the claim here, and hence I hope you keep reading.

We already discussed elsewhere1 that it is quite easy to derive a bunch of hypotheses about empirical data. Yet, deriving regularities or rules from empirical data does not make up an idea, or a concept. At most they could serve as a kind of qualified precursor for the latter. Once the subject of interest has been identified, deriving hypotheses about it is almost something mechanical. Ideas and concepts are much more closely related to the invention of a problematics, as Deleuze has been working out again and again, without being that invention or problematics themselves. To overlook (or to negate?) that difference between the problematic and the question is one of the main failures of logical empiricism, and probably even of today’s science.

The Topic

But what is it then that would make up an idea, or a concept? Douglas Hofstadter once wrote [1] that we are lacking a concept of concept. Since then, a discipline emerged that calls itself “formal concept analysis”. So, actually some people do indeed think that concepts could be analyzed formally. We will see that the issues about the relation between concepts and form are quite important. We already met some aspects of that relationship in the chapters about formalization and creativity. And we definitely think that formalization expels anything interesting from what probably had been a concept before that formalization. Of course, formalization is an important part of thinking, yet its importance is restricted to the phases before there are concepts or after we have reduced them to a fixed set of finite rules.

Ideas

Ideas are almost annoying, I mean, as a philosophical concept, and they have been so since the first clear expressions of philosophy. From the very beginning there was a quarrel not only about “where they come from,” but also about their role with respect to knowledge, today expressed as . Very early on in philosophy two seemingly juxtaposed positions emerged, represented by the philosophical approaches of Platon and Aristotle. The former claimed that ideas are before perception, while for the latter ideas clearly have been assigned the status of something derived, secondary. Yet, recent research emphasized the possibility that the contrast between them is not as strong as it has been proposed for more than 2000 years. There is an eminent empiric pillar in Platon’s philosophical building [2].

We certainly will not delve into this discussion here; it simply would take too much space and effort, and not least there are enough sources on the web displaying the traditional positions in great detail. Throughout history since Aristotle, many and rather divergent flavors of idealism emerged. Whatever the exact distinctive claim of any of those positions is, they all share the belief in the dominance of some top-down principle as an essential part of the conditions for the possibility of knowledge, or more generally the episteme. Some philosophers like Hegel or Frege, just as others nowadays perceived as members of German Idealism, took rather radical positions. Frege’s hyper-platonism, probably the most extreme idealistic position (though not exceeding Hegel’s “great spirit” that far), indeed claimed that something like a triangle exists, and quite literally so, albeit in a non-substantial manner, completely independent of any, e.g. human, thought.

Let us fix this main property, the claim of a top-down principle, as characteristic for any flavor of idealism. The decisive question then is how we could think the becoming of ideas. It is clearly one of the weaknesses of idealistic positions that they induce a salient vulnerability regarding the issue of justification. As a philosophical structure, idealism mixes content with value in the structural domain, consequently and quite directly leading to a certain kind of blind spot: political power is justified by the right idea. The factual consequences have been disastrous throughout history.

So, there are several alternatives for thinking about this becoming. But even before we consider any alternative, it should be clear that something like “becoming” and “idealism” are barely compatible. Maybe a very soft idealism, one that has already turned into pragmatism, much in the vein of Charles S. Peirce, could allow us to think process and ideas together. Hegel’s position, as well as Schelling’s, Fichte’s, Marx’s or Frege’s, definitely excludes any such rapprochement or convergence.

The becoming of ideas cannot be thought of as something that flows down from ever greater transcendental heights. Of course, anybody may choose to invoke some kind of divinity here, but obviously that does not help much. A solution according to Hegel’s great spirit, history itself, is not helpful either, even if this concept implied that there is something in and about the community that is indispensable when it comes to thinking. Much later, Wittgenstein took a related route and thereby initiated the momentum towards the linguistic turn. Yet, Hegel’s history is not useful for getting clear about the becoming of ideas with regard to the involved mechanisms. And without such mechanisms anything like machine-based episteme, or cats having ideas, is accepted as being impossible apriori.

One such mechanism is interpretation. For us the principle of the primacy of interpretation is definitely indisputable. This does not mean that we disregard the concept of the idea; yet, we clearly take an Aristotelian position. More à jour, we could say that we are quite fond of Deleuze’s position on relating empiric impressions, affects, and thought. There are, of course, many thinkers in the period that spans between Aristotle and Deleuze who are quite influential for our position.2
Yet, somehow it all culminated in the approach that has been labelled French philosophy, which for us comprises mainly Michel Serres, Gilles Deleuze and Michel Foucault, with some predecessors like Gilbert Simondon. They converged towards a position that allows one to think the embedding of ideas in the world as a process, or as an ongoing event [3,4], and this embedding is based on empiric affects.

So far, so good. Yet, we have only declared the kind of raft we will build to sail with; we didn’t mention anything about how to build this raft, or how to sail it. Before we can start to constructively discuss the relation between machines and ideas, we first have to visit the concept, both as an issue and as a concept.

Concepts

“Concept” is a very special concept. First, it is not externalizable, which is why we call it a strongly singular term. Whenever one thinks “concept,” there is already something like a concept. For most of the other terms in our languages, such as “idea,” that does not hold. In this respect, and regarding the structural dynamics of its usage, “concept” behaves similarly to “language” or “formalization.”

Additionally, however, “concept” is not a self-contained term like “language.” One needs not only symbols, one even needs a combination of categories and structured expressions; there are also Peircean signs involved; and, last but not least, concepts relate to models, even though models are also quite apart from them. Ideas do not relate to models in the same way as concepts do.

Let us, for instance, take the concept of time. There is this abundantly cited quote by Augustine [5], a passage where he tries to explain the status of God as the creator of time, and hence the fundamental incomprehensibility of God, and even of his creations (such as time) [my emphasis]:

For what is time? Who can easily and briefly explain it? Who even in thought can comprehend it, even to the pronouncing of a word concerning it? But what in speaking do we refer to more familiarly and knowingly than time? And certainly we understand when we speak of it; we understand also when we hear it spoken of by another. What, then, is time? If no one ask of me, I know; if I wish to explain to him who asks, I know not. Yet I say with confidence, that I know that if nothing passed away, there would not be past time; and if nothing were coming, there would not be future time; and if nothing were, there would not be present time.

I certainly don’t want to speculate about “time” (or God) here; instead I would like to focus on this peculiarity Augustine is talking about. Many, and probably even Augustine himself, confine this peculiarity to time (and space). I think, however, that this peculiarity applies to any concept.

By means of this example we can quite clearly experience the difference between ideas and concepts. Ideas are some kind of models—we will return to that in the next section—while concepts are both the condition for models and conditioned by models. The concept of time provides the condition for calendars, which in turn can be conceived as a possible condition for the operationalization of expectability.

“Concepts” as well as “models” do not exist as “pure” forms. We elicit a strange and eminently counter-intuitive force when trying to “think” pure concepts or models. The stronger we try, the more we imply their “opposite,” which in the case of concepts presumably is the embedding potentiality of mechanisms, and in the case of models we could say it is simply belief. We will discuss these relations in much more detail in the chapter about the choreosteme (forthcoming). Actually, we think that it is appropriate to conceive of terms like “concept” and “model” as choreostemic singular terms, or, in short, choreostemic singularities.

Even from an ontological perspective we could not claim that there “is” such a thing as a “concept”. Well, you may already know that we refute any ontological approach anyway. Yet, in the case of choreostemic singular terms like “concept” we can’t simply resort to our beloved language game. With respect to language, the choreosteme takes the role of an apriori, something like the sum of all conditions.

Since we would need a full discussion of the concept of the choreosteme, we can’t fully discuss the concept of “concept” here. Yet, as a kind of summary we may propose that the important point about concepts is that they are nothing that could exist. A concept does not exist as matter, as information, as substance, or as form.

The language game of “concept” simply points into the direction of that non-existence. Concepts are not a “thing” that we could analyze, and also nothing that we could relate to by means of an identifiable relation (as, e.g., in a graph). Concepts are best taken as a gradient field in a choreostemic space, yet one exhibiting a quite unusual structure and topology. So far, we have identified two (of a total of four) singularities that together spawn the choreostemic space. We also could say that the language game of “concept” is used to indicate a certain form of drift in the choreostemic space. (Later we will also discuss the topology of that space, among many other issues.)

For our concerns here in this chapter, the machine-based episteme, we can conclude that it would be a misguided approach to try to implement concepts (or their formal analysis). The issue of the conditions for the ability to move around in the choreostemic space we have to postpone. In other words, we have confined our task, or at least we have found a suitable entry point for it: the investigation of the relation between machines and ideas.

Machines and Ideas

When talking about machines and ideas we are, here and for the time being, not interested in the usage of machines to support “having” ideas. We are not interested in such tooling for now. The question is about the mechanism inside the machine that would lead to the emergence of ideas.

Think about the idea of a triangle. Certainly, triangles as we imagine them do not belong to the material world. Any possible factual representation is imperfect when compared with the idea. Yet, without the idea (of the triangle) we wouldn’t be able to proceed, for instance, towards land surveying. As already said, ideas serve as models; they do not necessarily involve formalization, yet they often live as a formalization (though not always a mathematical one) in the sense of an idealized model; in other words, they serve as ladder spokes for actions. Concepts, if we contrast them to ideas, that is, if we try to distinguish them, never could be formalized; they remain inaccessible as conditions. Nothing else could be expected from a transcendental singularity.

Back to our triangle. Even though we can’t represent them perfectly, seeing a lot of imperfect triangles gives rise to the idea of the triangle. Rephrased in this way, we may recognize that the first half of the task is to look for a process that provides an idealization (of a model), starting from empirical impressions. The second half of the task is to get the idea working as a kind of template, yet not literally as a template. Such an abstract pattern is detached from any direct empirical relation, despite the fact that we once started with empiric data.

Table 1: The two tasks in realizing “machinic idealism”

Task 1: process of idealization that starts with an intensional description
Task 2: applying the idealization for first-of-a-kind-encounters

Here we should note that culture is almost defined by the fact that it provides such ideas before any individual person could possibly collect enough experience to derive them on her own.

In order to approach these tasks, we first need model systems that exhibit the desired behavior but are also simple enough to comprehend. Let us begin with the first half of the task.

Task 1: The Process of Idealization

We already mentioned that we need to start from empirical impressions. These can be provided by the Self-Organizing Map (SOM), as it is able to abstract from the list of observations (the extensions), thereby building an intensional representation of the data. In other words, the SOM is able to create “representative” classes. Of course, these representations depend on some parameters, but that is not the important point here.

Once we have those intensions available, we may ask how to proceed in order to arrive at something that we could call an idea. Our proposal for an appropriate model system consists of the following parts:

  • (1) a small set (n=4) of profiles, each consisting of 3 properties; the form of the profiles is set apriori such that they overlap partially;
  • (2) a small SOM, here with 12×12=144 nodes; the SOM needs to be trainable and should also provide a classification service, i.e. act as a model;
  • (3) a simple Monte-Carlo simulation device that is able to create randomly varied profiles, deviating from the original ones without departing too much;
  • (4) a measurement process that records the (simulated) data flow.

The profiles are defined as shown in the following table (V denotes variables, C denotes categories, or classes):

      V1    V2    V3
C1   0.1   0.4   0.6
C2   0.8   0.4   0.6
C3   0.3   0.1   0.4
C4   0.2   0.2   0.8
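
To make parts (1) and (3) of this model system a bit more tangible, here is a minimal sketch in Python, assuming numpy. The noise level sigma and the number of records per profile are my assumptions; the text only requires that the simulated profiles deviate from the originals without departing too much.

```python
import numpy as np

rng = np.random.default_rng(0)

# Part (1): four partially overlapping profiles over the three properties V1..V3,
# taken directly from the table above.
PROFILES = {
    "C1": np.array([0.1, 0.4, 0.6]),
    "C2": np.array([0.8, 0.4, 0.6]),
    "C3": np.array([0.3, 0.1, 0.4]),
    "C4": np.array([0.2, 0.2, 0.8]),
}

# Part (3): a simple Monte-Carlo device that creates randomly varied records
# around the given profiles; sigma and n_per_profile are assumed values.
def simulate(profiles, n_per_profile=200, sigma=0.05):
    records, labels = [], []
    for label, center in profiles.items():
        noisy = center + rng.normal(0.0, sigma, size=(n_per_profile, center.size))
        records.append(np.clip(noisy, 0.0, 1.0))   # keep the values inside the unit interval
        labels.extend([label] * n_per_profile)
    return np.vstack(records), np.array(labels)

# Part (4), reduced to simply recording the simulated data flow.
data, labels = simulate(PROFILES)
```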

From these parts we then build a cyclic process, which comprises the following steps; a sketch of the full cycle follows after the list.

  • (0) Organize some empirical measurement for training the SOM; in our model system, however, we use the original profiles and create an artificial body of “original” data, in order to be able to detect the relevant phenomenon (we have perfect knowledge about the measurement);
  • (1) Train the SOM;
  • (2) Check the intensional descriptions for their implied risk (which should be minimal, i.e. below some threshold) and extract them as profiles;
  • (3) Use these profiles to create a bunch of simulated (artificial) data;
  • (4) Take the profile definitions and simulate enough records to train the SOM anew, then re-enter the cycle at step (1).
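
The following sketch, continuing the Python fragment above (it reuses PROFILES and simulate()), puts the whole cycle together. The SOM is a deliberately tiny, hand-rolled implementation; grid size, learning rate, number of cycles and the purity threshold are my assumptions rather than values given in the text, and node purity is used as a stand-in for the positive predictive value.

```python
import numpy as np

rng = np.random.default_rng(1)

class TinySOM:
    """A deliberately minimal SOM: a grid of codebook vectors ("intensions")."""
    def __init__(self, rows=12, cols=12, dim=3):
        self.rows, self.cols = rows, cols
        self.w = rng.random((rows * cols, dim))                  # codebook vectors
        yy, xx = np.mgrid[0:rows, 0:cols]
        self.grid = np.column_stack([yy.ravel(), xx.ravel()]).astype(float)

    def train(self, data, epochs=5, lr0=0.5, radius0=4.0):
        t, t_max = 0, epochs * len(data)
        for _ in range(epochs):
            for x in data[rng.permutation(len(data))]:
                frac = 1.0 - t / t_max                                   # decay factor
                bmu = np.argmin(((self.w - x) ** 2).sum(axis=1))         # best-matching node
                d2 = ((self.grid - self.grid[bmu]) ** 2).sum(axis=1)     # grid distance to BMU
                h = np.exp(-d2 / (2.0 * (1.0 + radius0 * frac) ** 2))    # neighbourhood kernel
                self.w += (lr0 * frac) * h[:, None] * (x - self.w)       # pull nodes towards x
                t += 1

    def winners(self, data):
        d = ((data[:, None, :] - self.w[None, :, :]) ** 2).sum(axis=2)
        return d.argmin(axis=1)

def extract_profiles(som, data, labels, min_purity=0.9):
    """Step (2): keep only those node intensions whose collected records are pure
    enough; purity here stands in for the positive predictive value, i.e. low risk."""
    win = som.winners(data)
    kept = {}
    for node in np.unique(win):
        node_labels = labels[win == node]
        codes = np.unique(node_labels, return_inverse=True)[1]
        purity = np.bincount(codes).max() / len(node_labels)
        if purity >= min_purity:
            kept[f"node{node}"] = som.w[node].copy()
    return kept

# Step (0): an artificial body of "original" data, created from the predefined profiles.
data, labels = simulate(PROFILES)
som = TinySOM()
for _ in range(9):                                   # nine snapshots, as in figure 1a-1i
    som.train(data)                                  # step (1)
    idealized = extract_profiles(som, data, labels)  # step (2)
    if idealized:                                    # steps (3) and (4): re-enter the cycle
        data, labels = simulate(idealized)
```

If the sketch behaves as intended, the codebook vectors should drift away from the mixed intensions of the first training round towards a smaller set of “purified” profiles, mirroring the effect discussed below.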

Thus, we have two counteracting forces: (1) a dispersion due to the randomizing simulation, and (2) a focusing of the SOM due to the filtering along the separability, in our case operationalized as risk (1/ppv, with ppv = positive predictive value) per node. Note that the SOM process is not a directly re-entrant process like, for instance, Elman networks [6,7,8].4

This process leads not only to a focusing, contrast-enhancing effect, but also to a (limited) invention of new intensional descriptions that have never been present in the empiric measurement, at least not saliently enough to show up as an intension.

The following figure 1a-1i shows 9 snapshots from the evolution of such a system. It starts at the top-left of the portfolio, then proceeds row-wise from left to right, down to the bottom-right item. Each of the 9 items displays a SOM, where the RGB color corresponds to the three variables V1, V2, V3. A particular color thus represents a particular profile on the level of the intension. Remember that the intensions are built from the field-wise average across all the extensions collected by a particular node.
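
As an aside, such a panel is easy to render from the sketch above (assuming matplotlib): each node’s three-dimensional codebook vector is simply displayed as one RGB pixel.

```python
import matplotlib.pyplot as plt

# Show the current codebook as an image: V1, V2, V3 are mapped to R, G, B.
plt.imshow(som.w.reshape(som.rows, som.cols, 3))
plt.axis("off")
plt.show()
```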

Well, let us now contemplate for a bit the sequence of these panels, which represents the evolution of the system. The first point is that there is no particular locational stability. Of course not, I am tempted to say, since a SOM is not an image that represents in the way an image does. A SOM contains intensions and abstractions; the only thing that counts is its predictive power.

Now, comparing the colors between the first and the second panel, we see that the green (top-right in 1a, middle-left in 1b) and the brownish one (top-left in 1a, middle-right in 1b) appear much clearer in 1b than in 1a. In 1a, the green obviously was “contaminated” by blue, and actually by all other values as well, leading to its brightness. This tendency prevails. In 1c and 1d yellowish colors are separated, etc.

Figure 1a thru 1i: A simple SOM in a re-entrant Markov process develops idealization. Time index proceeds from top-left to bottom-right.

The point now is that the intensions contained in the last SOM (1i, bottom-right of the portfolio) were not recognizable in the beginning; in some important respect they simply were not present. Our SOM steadily drifted away from its empirical roots. That’s not a big surprise, indeed, for we used a randomization process. The nice thing is something different: the intensions get “purified,” thereby changing their status from “intensions” to “ideas.”

Now imagine that the variables V1..Vn represent properties of geometric primitives. Our sensory apparatus is able to perceive and to encode them: horizontal lines, vertical lines, crossings, etc. In empiric data our visual apparatus may find any combination of those properties, especially in the case of a (platonic) school (say: the academia), where pupils and teachers draw triangle after triangle onto their wax tablets, or into the sand of the pathways in the garden…

By now, the message should be quite clear: there is nothing special about ideas. In abstract terms, what is needed is

  • (1) a SOM-like structure;
  • (2) a self-directed simulation process;
  • (3) re-entrant modeling

Notice that we need not specify a target variable. The associative process itself is just sufficient.

Given this model, it should no longer be a surprise that the first philosophers came up with idealism. It is almost built into the nature of the brain. We may summarize our achievements in the following characterization:

Ideas can be conceived as idealizations of intensional descriptions.

It is of course important to be aware of the status of such a “definition”. First, we tried to separate concepts and ideas; most of the literature about ideas conflates them. Yet, as long as they are conflated, any reasoning about mental affairs, cognition, thinking and knowledge necessarily remains inappropriate. For instance, the infamous discourse about universals and qualia seriously suffered from that conflation, or, more precisely, it only arose due to that mess.

Second, our lemma is just an operationalization, despite the fact that we are quite convinced about its reasonability. Yet, there might be different ones.

Our proposal has important benefits though, as it matches a lot of the aspects commonly associated with the term “idea.” In my opinion, what is especially striking about the proposed model is the observation that idealization implicitly also leads to the “invention” of “intensions” that were not present in the empiric data. Who would have expected that idealization is implicitly inventive?

Finally, two small notes should be added, concerning the type and the source of the data, both in their relation to the “idea” as a continually intermediate result of the re-entrant SOM process. One should be aware that the “normal” input to natural associative systems consists of time series. Our brain is dealing with a manifold of series of events, which is mapped onto the internal processes, that is, onto another time-based structure. Prima facie, our brain is not dealing with tables. Yet, (virtual) tabular structures are implied by the process of propertization, which is an inevitable component of any kind of modeling. It is well known that it is time-series data and their modeling that give rise to the impression of causality. In the light of ideas qua re-entrant associativity, we now can easily understand the transition from networks of potential causal influences to the claim of “causality” as some kind of pure concept. Even though the idea of causality (in the Newtonian sense) played an important role in the history of science, it is just that: a naive idealization.

The other note concerns the source of the data. If we consider re-entrant informational structures that are arranged across large “distances,” possibly with several intermediate transformative complexes (for which there are hints from neurobiology), we may understand that for a particular SOM (or SOM-like structure) the type of the source is completely opaque. To put it short, it does not matter for our proposed mechanism whether the data are sourced as empiric data from the external world, or as some kind of simulated, surrogate re-entrant data from within the system itself. In such wide-area, informationally re-entrant probabilistic networks we may expect a kind of runaway idealization. The question then is about the minimal size necessary for eliciting that effect. A nice corollary of this result is the insight that logistic networks, such as the internet or the telephone cabling, will NEVER start to think by themselves, as some still expect. Yet, since there are a lot of brains embedded in this deterministic cablework as intermediate transforming entities, we indeed may expect that the whole assembly is much more than could be achieved by a small group of humans living, say, around 1983. But that is not really a surprise.

Task 2: Ideas, applied

Ideas are an extremely important structural phenomenon, because they allow us to recognize things and to deal with tasks that we have never seen before. We may act adaptively before having encountered a situation that would directly resemble—as an equivalence class—any intensional description available so far.

Actually, it is not just one idea, it is a “system” of ideas that is needed for that. Some years ago, Douglas Hofstadter and his group3 devised a model system suitable for demonstrating exactly this: the application of ideas. They called the project (and the model system) Copycat.

We won’t discuss Copycat and its way of making analogies governed by top-down ideas here (we already introduced it elsewhere). We just want to note that the central “platonic” concept in Copycat is a dynamic relational system of symmetry relations. Such symmetry relations are, for instance, “before,” “after,” “builds a group,” “is a triple,” etc. These relations represent different levels of abstraction, but that is not the important point. Much more important is the fact that the relations between these symmetry relations are dynamic and adapt according to the situation at hand.

I think that these symmetry relations as conceived by the Fargonauts are on the same level as our ideas. The transition from ideas to symmetries is just a grammatological move.

The case of Biological Neural Systems

Re-entrance seems to be an important property of natural neural networks. Very early in the liaison of neurobiology and computer science, beginning with Hebb in the late 1940s and, much later, Hopfield, recurrent networks have been attractive for researchers. Just take a look at drawings like the following, created (!) by Ramon y Cajal [10] at the beginning of the 20th century.

Figure 2a-2c: Drawings by Ramon y Cajal, the Spanish neurobiologist (see also: History of Neuroscience). a: from a sparrow’s brain, b: motor area of the human brain, c: hypothalamus of the human brain.

Yet, Hebb, Hopfield and Elman got trapped by the (necessary) idealization inherent in Cajal’s drawings. Cajal’s interest was to establish and to prove the “neuron hypothesis”, i.e. that brains work on the basis of neurons. The step from Cajal’s drawings to the claim that biological neuronal structures could be represented by cybernetic systems or finite state machines is, honestly, a breakneck one, or, put differently, ideology.

Figure 3: Structure of an Elman Network; obviously, Elman was seriously affected by idealization (click for higher resolution).

Thus, we propose to distinguish between re-entrant and recurrent networks. While the latter are directly wired onto themselves in a deterministic manner, that is, their self-reference is modeled on the morphological level, the former are modeled on the informational level. Since it is simply impossible for a cybernetic structure to reflect neuromorphological plasticity and change, the informational approach is much more appropriate for modeling large assemblies of individual “neuronal” items (cf. [11]).

Nevertheless, the principle of re-entrance remains a very important one. It is a structure that is known to lead to contrast enhancement and to second-order memory effects. It is also a cornerstone in the theory (or theories) proposed by Gerald Edelman, who probably is much less affected by cybernetics (e.g. [12]) than the authors cited above. Edelman always conceived of the brain-mind as something like an abstract informational population; he even was the first to adopt evolutionary selection processes (Darwinian and others) for describing the dynamics in the brain-mind.

Conclusion: Machines and Choreostemic Drift

Our point of departure was to distinguish between ideas and concepts. Their difference becomes visible if we compare them, for instance, with regard to their relation to (abstract) models. It turns out that ideas can be conceived as a more or less stable immaterial entity (though not a “state”) of self-referential processes involving self-organizing maps and the simulated surrogates of intensional descriptions. Concepts, on the other hand, are described as a transcendental vector in choreostemic processes. Consequently, we may propose only for ideas that we can implement their conditions and mechanisms, while concepts cannot be implemented; it is beyond the expressibility of any technique to talk about the conditions for their actualization. Hence, the issue of “concept” has been postponed to a forthcoming chapter.

Ideas can be conceived as the effect of putting a SOM into a re-entrant context, through which the SOM develops a system of categories beyond simple intensions. These categories are not justified by empirical references any more, at least not in the strong sense. Hence, ideas can also be characterized as being clearly distinct from models or schemata. Both models and schemata involve classification, which—due to the dissolved bonds to empiric data—cannot be regarded as a sufficient component of ideas. We would like to suggest the mechanism outlined here as the candidate principle for the development of ideas. We think that the simulated data in the re-entrant SOM process should be distinguished from data in contexts that are characterized by the measurement of “external” objects, albeit their digestion by the SOM mechanism itself remains the same.

From what has been said it is also clear that the capability of deriving ideas alone still remains quite close to the material arrangement of a body, whether thought of as biological wetware or as software. Therefore, we still haven’t reached a state where we can talk about epistemic affairs. What we need is the possibility of expressing the abstract conditions of the episteme.

Of course, what we have compiled here by far exceeds any other approach, and additionally we think that it could serve as a natural complement to the work of Douglas Hofstadter. In his work, Hofstadter had to implement the platonic heavens of his machine manually, and even for the small domain he’d chosen this was tedious work. Here we proposed the possibility of a seamless transition from the world of associative mechanisms like the SOM to the world of platonic Copycats, where “seamless” refers to “implementable.”

Yet, what is really interesting is the form of choreostemic movement or drift, resulting from a particular configuration of the dynamics in systems of ideas. But this is another story, perhaps related to Felix Guattari’s principle of the “machinic”, and it definitely can’t be implemented any more.

Notes

1. we did so in the recent chapter about data and their transformation, but also see the section “Overall Organization” in Technical Aspects of Modeling.

2. You really should be aware that this trace we try to put forward here does not come close to even a coarse outline of all of the relevant issues.

3. they called themselves the “Fargonauts,” FARG being the acronym for “Fluid Analogies Research Group.”

4. Elman networks are an attempt to simulate neuronal networks on the level of neurons. Such approaches we rate as fundamentally misguided and deeply inspired by cybernetics [9], because they consider noise as a disturbance. Actually, they are equivalent to finite state machines. It is somewhat ridiculous to consider a finite state machine as a model for learning “networks.” SOMs, in contrast, especially if used in architectures like ours, are fundamentally probabilistic structures that could be regarded as “feeding on noise.” Elman networks, and their predecessor, the Hopfield network, are not quite useful, due to problems in scalability and, more importantly, in stability.

  • [1] Douglas R. Hofstadter, Fluid Concepts And Creative Analogies: Computer Models Of The Fundamental Mechanisms Of Thought. Basic Books, New York 1996. p.365
  • [2] Gernot Böhme, “Platon der Empiriker.” in: Gernot Böhme, Dieter Mersch, Gregor Schiemann (eds.), Platon im nachmetaphysischen Zeitalter. Wissenschaftliche Buchgesellschaft, Darmstadt 2006.
  • [3] Marc Rölli (ed.), Ereignis auf Französisch: Von Bergson bis Deleuze. Fin, Frankfurt 2004.
  • [4] Gilles Deleuze, Difference and Repetition (Différence et répétition, 1968).
  • [5] Augustine, Confessions, Book 11 CHAP. XIV.
  • [6] Mandic, D. & Chambers, J. (2001). Recurrent Neural Networks for Prediction: Learning Algorithms, Architectures and Stability. Wiley.
  • [7] J.L. Elman, (1990). Finding Structure in Time. Cognitive Science 14 (2): 179–211.
  • [8] Raul Rojas, Neural Networks: A Systematic Introduction. Springer, Berlin 1996. (@google books)
  • [9] Holk Cruse, Neural Networks As Cybernetic Systems: Science Briefings, 3rd edition. Thieme, Stuttgart 2007.
  • [10] Santiago Ramón y Cajal, Texture of the Nervous System of Man and the Vertebrates, Volume I. Springer, Wien 1999; edited and translated by Pedro Pasik & Tauba Pasik. (see Google Books)
  • [11] Florence Levy, Peter R. Krebs (2006), Cortical-Subcortical Re-Entrant Circuits and Recurrent Behaviour. Aust N Z J Psychiatry September 2006 vol. 40 no. 9 752-758.
    doi: 10.1080/j.1440-1614.2006.01879
  • [12] Gerald Edelman: “From Brain Dynamics to Consciousness: A Prelude to the Future of Brain-Based Devices“, Video, IBM Lecture on Cognitive Computing, June 2006.

۞
