Analogical Thinking, revisited.
March 19, 2012
What is the New York of California?
Or even, what is the New York of New York? Almost everybody will come up with the same answer, despite the fact that the question is not merely ill-defined: both the question and its answer can be described only after the answer has finally appeared. In other words, it is not possible to say a priori, before the answer is complete, which properties are relevant, although a posteriori these same properties are easily tagged as relevant for describing both the question and the answer. Neither the question nor the solution “exists” in the way their form pretends before we have finished making sense of them. There is a wealth of philosophical issues around this phenomenon, all of which we have to bypass here. Here we will focus just on the mechanisms that could be invoked in order to build a model that is capable of behaving phenomenologically “as if”.
The credit for rendering such questions and the associated problematics salient in the area of computer models of thinking belongs to Douglas Hofstadter and his “Fluid Analogies Research Group” (FARG). In his book “Fluid Concepts and Creative Analogies”, which we have already mentioned here, he proposes a particular model which he claims to be a proper model of analogical thinking. In constructing this model, which took more than 10 years of research, he did not try to stick (to get stuck?) to the neuronal level. After all, one can’t describe the performance of a tennis player at the molecular level, he says. Remarkably, he also keeps the so-called cognitive sciences and their laboratory wisdom at a distance. Instead, his starting point is everyday language, and presumably a good deal of introspection as well. He sees his model as located at an intermediate level between the neurons and consciousness (quite a large field, though).
His overarching claim is as simple as it is distant from the main stream of AI and cognitive science. (Note that Hofstadter does not formulate “analogical reasoning.”)
Thinking is largely equivalent with making analogies.
Hofstadter is not interested in producing just another model of analogy making. There are indeed quite a lot of such models, which he discusses in great detail. And he refutes them all; he shows that they are all ill-posed, since none of them starts with perception. Without exception they assume that the “knowledge” is already in the computer, and on this assumption some computer program is built. Of course, such approaches are nonsense; the resulting difficulty is euphemistically called the “knowledge acquisition bottleneck” by people working in the field of AI and machine learning. Yet knowledge is nothing that could be externalized and then acquired subsequently by some other party; it can’t be found “in” the world, and of course it can’t be separated as something that “exists” beside the processing mechanisms of the brain, making the whole thing “smart”. As already mentioned, such ideas are utter nonsense.
Hofstadter’s basic strategy is different. He proposes to create a software system that is capable of “concept slipping” as an emergent phenomenon, deeply based on perceptual mechanisms. He even coined the term “high-level perception.”
That is, the […] project is not about simulating analogy-making per se, but about simulating the very crux of human cognition: fluid concepts. (p.208)
This essay will investigate his model. We will find that despite its appeal it is nevertheless seriously unrealistic, even according to Hofstadter’s own standards. Yet, despite its particular weaknesses, it also demonstrates very interesting mechanisms. After extracting the cornerstones of his model we will try to map his insights to the world of self-organizing maps. We will also discuss how to transfer the interesting parts of Hofstadter’s model. Hofstadter himself clearly stated the deficiencies of “connectionist models” of “learning,” yet my impression is that he was not aware of self-organizing maps at that time. By “connectionism” he obviously referred to artificial neural networks (ANN), and for those we completely agree with his critique.
Before we start I would like to provide some original sources, that is, copies of those parts that are most relevant for this essay. These parts are from chapter 5, chapter 7 and chapter 8 of the aforementioned book. There you will find many more details and lucid examples in Hofstadter’s own words.
Is there an Alternative to Analogies?
In order to find an alternative we have to take a brief bird’s-eye view. Very coarsely speaking, thinking transforms some input into some output while being affected and transforming itself. In some sense, any transformation of input to output transforms the transforming instance, though to vastly different degrees. A trivial machine just wears out; a trivial computer (that is, any digital machine that fits into the scheme of Turing computing, see note 1) can be reset to exactly match a previous state. As soon as historical contingency is involved, reproducibility vanishes and strictly non-technical entities appear: memory, value, and semantics (among others).
This transformation game applies to analogy making, and it also applies to traditional modeling. Is it possible to apply any kind of modeling to the problematics represented by the “transfer game”, for which the little questions posed in the beginning are just an example?
In this context, Hofstadter calls the modeling approach the brute-force approach (p.327, chp.8). The outline of the modeling approach could look like this (p.337).
- Step 1: Run down the a priori list of city-characterization criteria and characterize the “source town” A according to each of them.
- Step 2: Retrieve an a priori list of “target towns” inside target region Y from the data base.
- Step 3: For each retrieved target town X, run down the a priori list of city-characterization criteria again, calculating X’s numerical degree of match with A for every criterion in the list.
- Step 4: For each target town X, sum up the points generated in Step 3, possibly using a priori weights, thus allowing some criteria to be counted more heavily than others.
- Step 5: Locate the target town with the highest overall rating as calculated in Step 4, and propose it as “the A of Y”.
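To make the shape of this brute-force scheme concrete, the five steps can be sketched in a few lines of Python. This is only an illustration: the criteria, their weights, the `Town` record and the scoring function are all invented here and are not taken from Hofstadter. The sketch mainly shows how much has to be fixed in advance before the procedure can run at all.

```python
from dataclasses import dataclass

# Hypothetical a priori criteria and weights (assumptions for illustration).
# The argument in the text is precisely that no such fixed list can be
# justified in advance.
CRITERIA_WEIGHTS = {"population": 1.0, "finance": 2.0, "theatre": 1.5}


@dataclass
class Town:
    name: str
    region: str
    scores: dict  # Step 1: criterion -> value in [0, 1], invented numbers


def match(a, x, criterion):
    """Step 3: degree of match on one criterion (1.0 means identical)."""
    return 1.0 - abs(a.scores.get(criterion, 0.0) - x.scores.get(criterion, 0.0))


def the_a_of_y(source, region, database):
    """Steps 2 to 5: retrieve candidates, rate them, return the best one."""
    candidates = [t for t in database if t.region == region]  # Step 2
    if not candidates:
        raise ValueError(f"no towns known in region {region!r}")

    def rating(x):  # Steps 3 and 4: weighted sum over all criteria
        return sum(w * match(source, x, c) for c, w in CRITERIA_WEIGHTS.items())

    return max(candidates, key=rating)  # Step 5
```

Everything that matters is decided outside the procedure: which criteria exist, how a town is scored on them, and which towns belong to the region. The four difficulties discussed next are aimed at exactly these decisions.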
Any plausible a priori list of city-characterization criteria would be long, very long indeed. Effectively, it can’t be limited in advance, since any imposed limit would represent a model that would claim to be better suited to decide about the criteria than the model being built. We are crushed by an infinite regress, not just in theory. What we experience here is Wittgenstein’s famous verdict that justifications have to come to an end. Rules are embedded in the form of life (“Lebensform”), and without knowing everything about a particular Lebensform, and without taking into consideration everything comprised by such (impossible) knowledge, we can’t start to model at all.
Hofstadter identifies four characteristic difficulties for the modeling approach with regard to his little “transfer game” that plays around with cities.
- Difficulty 1: It is psychologically unrealistic to explicitly consider all the towns one knows in a given region in order to come up with a reasonable answer.
- Difficulty 2: Comparison of a target town and a source town according to a specific city-characterization criterion is not a hard-edged mechanical task, but rather, can itself constitute an analogy problem as complex as the original top-level puzzle.
- Difficulty 3: There will always be source towns A whose “essence” (that is, set of most salient characteristics) is not captured by a given fixed list of city-characterization criteria.
- Difficulty 4: What constitutes a “town in region Y” is not a priori evident.
Hofstadter underpins his point with the following question (p.347).
What possible set of a priori criteria would allow a computer to reply, perfectly self-confidently, that the country of Monaco is “the Atlantic City of France”?
Of course, the “computer” should come up with the answer in a way that is not pre-programmed explicitly.
Obviously, the problematics of making analogies can’t be solved algorithmically. Not only is there no such thing as a single “solution”; even the criteria to describe the problem are missing. Thus we can conclude that modeling, even in its non-algorithmic form, is not a viable alternative to analogy making.
The FARG Model
In the following, we investigate the model as proposed by Hofstadter and his group, mainly Melanie Mitchell. The discussion is divided into the following parts:
- precis of the model,
- its elements,
- its extension as proposed by Hofstadter,
- the main problems of the model, and finally,
- the main superior aspects of the model as compared to connectionist models (from Hofstadter’s perspective, of course).
Precis of the Model
Hofstadter’s conclusion from the problems with the model-based approach, and thus also the starting point for his endeavor, is that the making of an analogy must appear as an emergent phenomenon. Analogy itself can’t be “defined” in terms of criteria, beyond some rather opaque statements about “similarity.” The point is that this similarity could be measured only a posteriori, so this concept does not help. The capability for making analogies can’t be programmed explicitly. It would not be the “making” of analogies anymore; it would just be a look-up of dead graphemes (not even symbols!) in a database.
He demonstrates his ideas by means of a small program called “Copycat”. This name derives from the internal processes of the software, as making “almost identical copies” is an important ingredient of it. Yet, it also refers to the problem that appears if you say: “I am doing this, now do the same thing…”
Copycat has three major parts, which he labels as (i) the Slipnet, (ii) the Workspace, (iii) the Coderack.
The Coderack is a rack that serves as a launching site for a population of agents of various kinds. Agents die and are created in various ways. They may be spawned by other agents, by the Coderack, or by any of the items in the Slipnet, in the latter case as a top-down specialist bred just to engage in situations represented by that Slipnet item. Any freshly created agent is first put into the Coderack, regardless of its originator or kind.
Any particular agent behaves as a specialist for recognizing a particular situation or for establishing a particular relation between parts of the input “data,” the initial observation. This recognition requires a model a priori, of course. Since these models are rather abstract as compared to the observational data, Hofstadter calls them “concepts.” After their set-up, agents are put into the Coderack, from where they start in random order, but also dependent on their “inner state,” which Hofstadter calls “pressure.”
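The launching mechanism just described can be sketched as a small weighted lottery. The following Python fragment is a minimal sketch under our own assumptions, not Copycat’s actual code: the class and function names are hypothetical, an agent is simply any callable, and “pressure” is reduced to a single positive number that biases the random choice.

```python
import random


class Coderack:
    """Holds waiting agents and releases them one at a time.

    Selection is random but biased by each agent's "pressure" (the term
    used in the text); the proportional weighting is an assumption.
    """

    def __init__(self, seed=None):
        self._waiting = []  # list of (pressure, agent) pairs
        self._rng = random.Random(seed)

    def post(self, agent, pressure):
        """Put a freshly created agent on the rack."""
        if pressure <= 0:
            raise ValueError("pressure must be positive")
        self._waiting.append((pressure, agent))

    def __len__(self):
        return len(self._waiting)

    def pop(self):
        """Remove and return one agent, chosen with probability ~ pressure."""
        if not self._waiting:
            raise IndexError("pop from empty Coderack")
        weights = [p for p, _ in self._waiting]
        positions = range(len(self._waiting))
        index = self._rng.choices(positions, weights=weights, k=1)[0]
        return self._waiting.pop(index)[1]


def run(rack, workspace, max_steps=1000):
    """Run agents until the rack is empty; agents may post follow-up agents."""
    steps = 0
    while len(rack) and steps < max_steps:
        agent = rack.pop()
        agent(workspace, rack)  # an agent is any callable(workspace, rack)
        steps += 1
    return steps
```

An agent that finds something promising would call `rack.post(...)` with a follow-up agent of higher pressure, so that urgent work tends to run early without ever being guaranteed to run first.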
The Slipnet is a loose “network” of deep and/or abstract concepts. In case of Copycat these concepts comprise
a, b, c, … , z, letter, successor, predecessor, alphabetic-first, alphabetic-last, alphabetic position, left, right, direction, leftmost, rightmost, middle, string position, group, sameness group, successor group, predecessor group, group length, 1, 2, 3, sameness, and opposite.
In total there are more than 60 such concepts. These items are linked together, and the length of a link reflects the “distance” between concepts. This distance changes while Copycat is working on a particular task. The change is induced by the agents in response to their “success.” The Slipnet is not really a “network,” since it is neither a logistic network (it doesn’t transport anything) nor an associative network like a SOM. Nor is it suitable to conceive of it as a kind of filter in the sense of a spider’s web or a fisherman’s net. It is thus more appropriate to consider it simply as a non-directed, dynamic graph in which discrete items are linked.
Finally, the third part is the Workspace. Hofstadter describes it as a “busy construction site” and likens it to the cytoplasm (p.216). In the Workspace, the agents establish bonds between the atomic items of the observation. As said, each agent knows nothing about the posed problem; it is just capable of performing a mini-aspect of the task. The whole population of agents, however, builds something larger. It looks much like the activity of ants or termites building some morphological structure in the hive, or like a macroscopic dynamic effect of the hive population. The Workspace is the location of such intermediate structures of various degrees of stability, meaning that some agents also work to remove a particular structure.
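Reading the Slipnet as a non-directed, dynamic graph suggests a very small data structure. The sketch below is our own illustration, not the original implementation: the class name, the method names and the rule that a link simply shrinks by a factor are assumptions. It captures only the two properties named above, namely that links have no direction and that their lengths change during a run.

```python
class Slipnet:
    """Undirected graph of concepts; each link carries a mutable length."""

    def __init__(self):
        self._length = {}  # frozenset({a, b}) -> current link length

    def link(self, a, b, length):
        """Create or overwrite the link between two distinct concepts."""
        if a == b:
            raise ValueError("a concept is not linked to itself")
        if length <= 0:
            raise ValueError("length must be positive")
        self._length[frozenset((a, b))] = float(length)

    def distance(self, a, b):
        """Length of the direct link (identical in both directions)."""
        return self._length[frozenset((a, b))]

    def shrink(self, a, b, factor=0.5):
        """Bring two concepts closer, e.g. after an agent reports success."""
        if not 0 < factor <= 1:
            raise ValueError("factor must be in (0, 1]")
        self._length[frozenset((a, b))] *= factor

    def neighbours(self, a, radius):
        """Concepts directly linked to a by a link no longer than radius."""
        found = []
        for pair, length in self._length.items():
            if a in pair and length <= radius:
                (other,) = pair - {a}
                found.append(other)
        return sorted(found)
```

Using a `frozenset` as the key is the design choice that makes the graph non-directed: the pair (successor, predecessor) and the pair (predecessor, successor) are one and the same link.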
So far we have described the morphology. The particular dynamics unfolding on this morphology is settled between competition and cooperation, with the result of a collective calming down of the activities. The decrease in activity is itself an emergent consequence of the many parallel processes inside Copycat.
A single run of Copycat yields one instance of the result. Yet, a single answer is not the result itself. Rather, as different runs of Copycat yield different singular answers, the result consists of a probability density over the different singular answers. For the letter domain in which Copycat is working the result looks like this:
Figure 1: Probability densities as result of a Copycat run.
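How such a density arises from repeated runs can be shown with a short sketch. The solver below is merely a stand-in for a single Copycat run: its three answers and their weights are invented for illustration and are not Copycat results. Only the aggregation step, counting answers over many runs and normalizing, corresponds to what is described above.

```python
import random
from collections import Counter


def answer_distribution(solve, runs, seed=None):
    """Run a stochastic solver repeatedly; return relative answer frequencies."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    rng = random.Random(seed)
    counts = Counter(solve(rng) for _ in range(runs))
    return {answer: n / runs for answer, n in counts.most_common()}


def toy_solver(rng):
    """Stand-in for one run; answers and weights are invented placeholders."""
    return rng.choices(["A", "B", "C"], weights=[8, 1, 1], k=1)[0]


if __name__ == "__main__":
    for answer, share in answer_distribution(toy_solver, runs=1000, seed=1).items():
        print(f"{answer}: {share:.3f}")
```

The point of the exercise is that no single call to the solver is “the” answer; only the frequencies over many calls characterize the system’s behavior.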
The Elements of the FARG Model
Before we proceed, I should emphasize that “element” is used here in the sense in which we have introduced the term here.
Returning to the FARG model, it is important to understand that a particular, constrained randomness plays a crucial role in its setup. The population of agents does not search through all possibilities all the time. Rather, any existing intermediate result, say a structural hypothesis, serves as a constraint for the future search.
We also find different kinds of memories with different durations, and we find dynamic historic constraints, which we could also call contingencies. We have a population of different kinds of agents that cooperate and compete. In some almost obvious way, Copycat’s mechanisms may be conceived as an instance of the generalized evolution that we proposed earlier. Hofstadter himself is not aware that he has just proposed a mechanism for generalized evolutionary changes. He calls the process “parallel terraced scan”, thereby unnecessarily sticking to a functional perspective. Yet, we consider generalized evolution to be one of the elements of Copycat. It could really be promising to develop Copycat as an alternative to so-called genetic algorithms (see note 2).
Despite a certain resemblance to natural evolution, the mechanisms built into Copycat do not comprise an equivalent to what is known from biology as “gene doubling”. Gene doubling and its counterpart, gene deletion, are probably the most important mechanisms in natural evolution. Copycat produces different kinds of agents, but the informational setup of these agents does not change, as it is given by the Slipnet. The equivalent to gene doubling would have to be implemented in the Slipnet. On the other hand, however, it is clear that the items in the Slipnet are too concrete, almost representational. In contrast, genes usually do not represent a particular function on the macro-level (which is one of the main structural faults of so-called genetic algorithms). So, we conclude that Copycat contains a restricted version of generalized evolution. Apart from that, we see a structural resemblance to the theories of Edelman and his neuronal Darwinism, which actually is a nice insight.
Conceiving large parts of the mechanism of Copycat as (restricted) generalized evolution covers both the Coderack as well as the Workspace, but not the Slipnet.
The Slipnet acts as a sort of “Platonic Heaven” (Hofstadter’s term). It contains various kinds of abstract terms, where “abstract” simply means “not directly observable.” It is hence not comparable to those abstractions that can be used to build tree-like hierarchies. Think of the series “fluffy”-dog-mammal-animal-living entity. Significantly, the abstract terms in Copycat’s Slipnet also comprise concepts about relations, such as “right,” “direction,” “group,” or “leftmost.” Relations, however, are nothing else than even more abstract symmetries, that is, transformational models that may even form a mathematical group. Quite naturally, we could consider the items in the Slipnet as a mathematical category (of categories). Again, Hofstadter and Mitchell do not refer in any way to such structures, quite unfortunately so.
The Slipnet’s items may well be conceived as instances of symmetry relations. Hofstadter treats them as idealizations of positional relations. Each of these items acts as a structural property. This is a huge advance as compared to other models of analogy.
To summarize, we find two main elements in Copycat.
- (1) restricted generalized evolution, and
- (2) concrete instances of positional idealization.
Actually, these elements are top-level elements that must be conceived as compounds. In part 2 we will check out the elements of the Slipnet in detail, while the evolutionary aspects we already discussed in a previous chapter. Yet, this level of abstraction is necessary to render Copycat’s principles conceptually more mobile. In some way, we have to apply the principles of Copycat to the attempt to understand it.
The Copycat, released to the wild
Any generalization of Copycat has to lift the implicit constraints of its elements. In more detail, this would include the following changes:
- (1) The representation of the items in the Slipnet could be changed into compounds, and these compounds should be expressed as “gene-like” entities.
- (2) Introducing a mechanism to extend the Slipnet. This could be achieved through gene doubling in response to external pressures; yet, these pressures are not to be conceived as “external” to the whole system, just external to the Copycat. The pressures could be issued by a SOM. Alternatively, a SOM environment might also deliver the idealizations themselves. In either case, the resulting behavior of the Copycat has to be shaped by selection, either through internal mechanisms, or through environmentally induced forces (changes in the fitness landscape).
- (3) The focus on positional idealization would have to be removed by introducing the more abstract notion of “symmetries”, i.e. mathematical groups or categories. This would turn positional idealization into just one possible instance of potential idealization.
The improvement resulting from these changes would be dramatic. Not only would it be much easier to establish a Slipnet for any kind of domain, it would also allow the system (a CopyTiger?) to evolve new traits and capabilities, and to parametrize them autonomously. But these changes also require a change in the architectural (and mental) setup.
From Copycat to Metacat
Hofstadter himself tried to describe possible improvements of Copycat. A significant part of these suggestions for improvement is represented by the capability for self-monitoring and proliferating abstraction, hence he calls it “Metacat”.
The list of improvements comprises mainly the following five points (pp.315, chp.7).
- (1) Self-monitoring of pressures, actions, and crucial changes as an explicit registering into parts of the Workspace.
- (2) Disassembling of a given solution into the path of required actions.
- (3) Hofstadter writes that “Metacat should store a trace of its solution of a problem in an episodic memory.”
- (4) A clear “meta-analogical” sense as an ability to see analogies between analogies, that is a multi-leveled type of self-reflectiveness.
- (5) The ability to create and to enjoy the creation of new puzzles. In this context he writes “Indeed, I feel that responsiveness to beauty and its close cousin, simplicity, plays a central role in high-level cognition.”
I am not really convinced by these suggestions, at least not if they were implemented in the way that Hofstadter suggests “between the lines”. They look much more like a dream than a reasonable list of improvements, perhaps with the exception of the first one. The topic of self-monitoring has been explored by James Marshall in his dissertation, but even his version of “Metacat” was not able to learn. This self-monitoring should not be conceived as a kind of Cartesian theater, perhaps even populated with homunculi on both sides of the stage.
The second point is completely incompatible with the architecture of Copycat, and notably Hofstadter does not provide even the tiniest comment on it. The third point violates the concept of “memory” as a re-constructive device. Hofstadter himself says elsewhere, while discussing alternative models of analogy, that the brain is not a database, which is quite correct. “Memory” is not a storage device. Yet, the consequence is that analogy making can’t be separated from memory itself (and vice versa).
The fourth suggestion, then, would require further Platonic heavens, which in the case of Copycat/Metacat are created by a programmer. This is highly implausible, and since it is a consequence of the architecture, the architecture of Copycat as such is not suitable for addressing real-world entities.
Finally, the fifth suggestion displays a certain naivety regarding either evolutionary contexts, the philosophical aspects of reasoning that have been known since Immanuel Kant, or the particular setup of human cognition, in which emotions and propositional reasoning appear as deeply entangled issues.
The main Problem(s) of the FARG model
We already mentioned Copycat’s main problems, which are (i) the “Platonic heaven”, and (ii) the lack of the capability to learn as a kind of structural self-transformation.
Both problems are closely related. Actually, there is somehow only one single problem, and that is that Hofstadter got trapped by idealism. A Platonic heaven that is filled by the designer with an x-cat (or a Copy-x) is hard to comprehend. Even for the really small letter domain there are more than 60 such idealistic, top-down and externally imposed concepts. These concepts have to be linked and balanced in just the right way, otherwise Copycat will not behave in any interesting way. Furthermore, the Slipnet is a structurally static entity. There are some parameters that change during its activity, but Copycat does not add new items to its Slipnet.
For these reasons it remains completely opaque how Mitchell and Hofstadter arrived at that particular instance of the Slipnet for the letter domain, and thus it also remains completely unclear how the “computer” itself could build or achieve something like a Slipnet. Although Linhares was able to implement an analogous FARG model for the domain of chess (see note 3), his model suffers from the static Slipnet in the same way: it is extremely tedious to set up a Slipnet. Furthermore, validation is even more laborious, if not impossible, due to the very nature of making analogies and the idealistic Slipnet.
The result is, well, a model that cannot serve as a template for any kind of application that is designed to be able to adapt and to learn, at least if we take it as it is, without abstracting from it.
From an architectural point of view the Slipnet is simply not compatible with the rest of Copycat, which is strongly based on randomness and probabilistic processes in populations. The architecture of the Slipnet and the way it is used do not offer anything like a probabilistic pathway into it. But why should the “Slipnet” not be a probabilistic process too?
Superior Aspects of the FARG model
Hofstadter clearly and correctly separates his project from connectionism (p.308):
Connectionist (neural-net) models are doing very interesting things these days, but they are not addressing questions at nearly as high a level of cognition as Copycat is, and it is my belief that ultimately, people will recognize that the neural level of description is a bit too low to capture the mechanisms of creative, fluid thinking. Trying to use connectionist language to describe creative thought strikes me as a bit like trying to describe the skill of a great tennis player in terms of molecular biology, which would be absurd.
A cornerstone of Hofstadter’s arguments and concepts around Copycat is conceptual slippage. This occurs in the Slipnet and is represented as a sudden change in the weights of the items such that the most active (or influential) “neighborhood” also changes. To describe these neighborhoods, he invokes the concept of a halo. The “halo” is a more or less circular region around one of the abstract items in the Slipnet, yet without a clear boundary. Items in the Slipnet change their relative position all the time, thus their co-excitation also changes dynamically.
Hofstadter lists (p.215) the following issues as missing in connectionist network (CN) models with regard to cognition, particularly with regard to concept slippage and fluid analogies.
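The notion of a halo without a clear boundary can be illustrated by a graded weight that fades with distance from the core item. The sketch below rests on our own assumptions: the exponential decay, the `scale` parameter and the function names are chosen for illustration and are not taken from Copycat. It shows how a change of distances alone shifts the most strongly co-excited neighbor, which is the sense of slippage described above.

```python
import math


def halo(distances, scale=1.0):
    """Map each concept's distance from a core concept to a weight in (0, 1].

    The exponential decay is an assumption chosen for illustration; it yields
    a region that fades out gradually instead of ending at a sharp boundary.
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    return {concept: math.exp(-d / scale) for concept, d in distances.items()}


def most_excited(distances, scale=1.0):
    """Concept with the largest halo weight, i.e. the currently nearest one."""
    weights = halo(distances, scale)
    return max(weights, key=weights.get)


# Hypothetical distances from the core item "successor" at two moments of a run.
before = {"predecessor": 3.0, "sameness": 1.0}
after = {"predecessor": 0.5, "sameness": 1.0}
```

With the `before` distances the item “sameness” dominates the halo; with the `after` distances “predecessor” takes over, although no item has been added or removed.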
- CN don’t develop a halo around the representatives of concepts in the case of localist networks, i.e. node-oriented networks, and thus no slippability emerges;
- CN don’t develop a core region for a halo in the case of networks where a “concept” is distributed throughout the network, and thus no slippability emerges;
- CN have no notion of normality, due to learning that is instantiated in any encounter with data.
This critique appears to be both a bit overdone and misdirected. As we have seen above, Copycat can be interpreted as comprising a slightly restricted case of generalized evolution. Standard neural techniques know nothing of evolutionary techniques: there are no “coopetitioning” agents, and there is no separation into different memories of different durations. The abstraction achieved by artificial neural networks (ANN), or even by standard SOMs, is always exhausted by the transition from extensional description (observed items) to intensional description (classes, types). The abstract items in the Slipnet are not just intensional descriptions, and they could not be found or constructed by an ANN or a SOM that works just on the observation, especially if there is just a single observation at all!
Copycat is definitely working in a different space as compared to network-based models.1 While the latter can provide the mechanisms to proceed from extensions to intensions in a “bottom-up” movement, the former is applying those intensions in a “top-down” manner. Saying this, we may invoke the reference to the higher forms of comparison and the Deleuzean differential. As many other things mentioned here, this would deserve a closer look from a philosophical perspective, which however we can’t provide here and now.
Nevertheless, Hofstadter’s critique of connectionist models seems to be closely related to the abandonment of modeling as a model for analogy making. Any of the three points above can be mitigated if we take a particular collection of SOM as a counterpart for Copycat. In the next section (which will be found in part II of this essay) we will see how the two approaches can inform each other.
1. We would like to point you to our discussion of non-Turing computation and also make you aware of this conference: 11th International Conference on Unconventional Computation & Natural Computation 2012, University of Orléans, conference website.
2. Interestingly, Hofstadter’s PhD-student, co-worker and co-author Melanie Mitchell started to publish in the field of genetic algorithms (GA), yet, she never realized the kinship between GA and Copycat, at least she never said anything like this publicly.
3. He calls his model implementation “Capyblanca”; it is available through Google Code.
- James B. Marshall, Metacat: A Self-Watching Cognitive Architecture for Analogy-Making and High-Level Perception. PhD Thesis, Indiana University, 1999. Available online (last access 18/3/2012).
- Daniel Dennett, Consciousness Explained. 1992, p.107.
- Alexandre Linhares, The emergence of choice: Decision-making and strategic thinking through analogies. 2008. Available online.
- Douglas S. Blank, Implicit Analogy-Making: A Connectionist Exploration. Indiana University Computer Science Department. Available online.