A Model for Analogical Thinking

February 13, 2012 § Leave a comment

Analogy is an ill-defined term, and quite naturally so.

If something is said to be analog to something else, one supposes not only just a distant resemblance. Such a similarity could be described indeed by a similarity mapping, based on a well identified function. Yet, analogy is more than that, much more than just similarity.

Analogical thinking is significantly different from determining similarity, or by “selecting” a similar thought. Actually, it is not a selection at all, and it is also not based on modeling. Despite it is based on experience, it is not based on identifiable models. We may even conclude that analogical thinking is (1) non-empiric, and (2) based on a constructive use of theories. In other words, and derived from our theory of theory, an analogy is itself a freshly derived model! Yet, that model is of a  particular kind, since it does not contain any data. In other words, it looks like the basic definition to be used as the primer for surrogating simulated data, which in turn can be used to create SOM-based expectancy. We may also say it in simple terms: analogical thinking is about producing an ongoing stream of ideas. Folding, inventing.

In 2006, on the occasion of the yearly Presidential Lecture at Stanford, Douglas Hofstadter gave some remarkable statements in an interview, which I’d like to quote here, since they express some issues we have argued for throughout our writings.

I knew some people wouldn’t like what I was going to say, since it was diametrically opposed to the standard cognitive-science party line that analogy is simply a part of “reasoning” in the service of some kind of “problem solving” (makes me think of doing problem sets in a physics course). For some bizarre reason, don’t ask me why, people in cognitive science only think of analogies in connection with what they call “analogical reasoning” — they have to elevate it, to connect it with fancy things like logic and reasoning and truth. They don’t seem to see that analogy is dirt-cheap, that it has nothing to do with logical thinking, that it simply pervades every tiny nook and cranny of cognition, it shapes our every thinking moment. Not seeing that is like fish not perceiving water, I guess.

[…]

The point is that thinking amounts to putting one’s finger on the essence of situations one is faced with, which amounts to categorization in a very deep and general sense, and the point is that in order to categorize, one has to compare situations “out there” with things already in one’s head, and such comparisons are analogies. Thus at the root of all acts of thought, every last one of them, is the making of analogies. Cognitive scientists talk a lot about categories, but unfortunately the categories that they study are far too limited. For them, categories are essentially always nouns, and not only that, they are objects that we can see, like “tree” or “car.” You get the impression that they think that categorization is very simple, no more complicated than looking at an object and identifying the standard “features”

The most salient issues here are, in preliminary terms for the time being

  • (1) Whenever thinking happens, this thinking is performed as “making analogy”; there are no “precise = perfectly defined” items in the brain or the mind.
  • (2) Thinking can not be equated with problem-solving and logic;
  • (3) Comparison and categorization are the most basic operations, while both of these take place in a fluid, open manner.

It is due to a fallacy to think that there is analogical reasoning and some other “kinds” (but the journals and libraries are filled with this kind of shortcomings, e.g. [1,2,3]), or vice versa, that there is logical reasoning and something other, such as analogical reasoning. Thinking so would mean to put logic into the world (of neurons, in this case). Yet, we know that logic that is free from interpretive parts can’t be part of the real world. We always and inevitably can deal just and only with quasi-logic, an semantically contaminated instance of transcendental logic. The issue is not just one about wording here. It is the self-referentiality that always is present in multiple respects when dealing with cognitive capacities that enforces us to be not to lazy regarding the wording. Dropping the claim posited by the term “analogical reasoning” we quickly arrive at the insight that all thinking is analogical, or even, that “making analogies” is the label for the visible parts of the phenomenon that we call thinking.

The next issue mentioned by Hofstadter is about categories. Categories in the mind are NOT about the objects, they are even not about any external reference. It is a major and widely spread misunderstanding, according to Hofstadter, among cognitive scientists. It is clear that such referentialism, call it materialism, or naive realism, is conceptually primitive and inappropriate. We also could refer to Charles Peirce, the great American philosopher, who repeatedly stated (and also as the first one) that signs always refer only to signs. Yet, signs are not objects, of course. Similarly, in §198 of the Philosophical Investigations Wittgenstein notes

[…] any interpretation still hangs in the air along with what it interprets, and cannot give it any support. The interpretations alone do not determine the meaning.

The materiality of road sign should not be taken as its meaning, the semiotic sign associated with it is only partially to be found in the matter standing there near the road. The only thing that ties “us” (i.e. our thinking) to the world is modeling, whether this world is taken as the external or the internal one, it is the creation of tools (the models) for anticipation given the expectation of weak repeatability. That means, we have to belief and to trust in the first instance, which yet can not really be taken as “ties”.

The kind of modeling we have in mind is neither covered by model theory, nor by the constructivist framework, nor by the empiricist account of it. Actually, despite we can at least go on believing into an ontology of models ( “a model is…”) as long as we can play the blind man’s buff, that modeling can nevertheless not be separated from entities like concepts, code, mediality or virtuality. We firmly believe in the impossibility of reductionism when it comes to human affairs. And, I guess, we could agree upon the proposal that thinking is such a human affair, even if we are going to “implement” it into a “machine.” In the end, we anyway just strive for understanding ourselves.

It is thus extremely important to investigate the issue of analogy in an appropriate model system. The only feasible model that is known up to these days is that which has been published by Douglas Hofstadter and Melanie Mitchell.

They describe their results on a non-technical level in the wonderful book “Fluid Concepts and Creative Analogies“, while Mitchell focuses on more technical aspects and the computer experiments in “Analogy-Making as Perception: A Computer Model

Though we do not completely agree on their theoretical foundation, particularly the role they ascribe the concept of perception, their approach is nevertheless brilliant.  Hofstadter discusses in a very detailed manner why other approaches are either fake of failure (see also [4]), while Mitchell provides a detailed account about the model system, which they call “CopyCat”.

CopyCat

CopyCat is the last and most advanced variant of a series of similar programs. It deals with a funny example from the letter domain. The task that the program should solve is the following:

Given the transformation of a short sequence of letters, say “abc”, into a similar one, say “abd”, what is the result of applying the very “same” transformation to a different sequence, say “ijk”?

Most people will answer “ijl”. The rule seems to be “obvious”. Yet, there are several solutions, though there is a strong propensity to “ijl”. One of the other solutions would be “ijd”. However, this solution is “intellectually” not appealing, for humans notably…

Astonishingly, CopyCat reproduces the probabilistic distribution of solution provided by a population of humans.

The following extracts show further examples.

Now the important question: How does CopyCat work?

First, the solutions are indeed produced, there is no stored collection of solutions. Thus, CopyCat derives its proposals not from experience that could be expressed by empiric models.

Second, the solutions are created by a multi-layered, multi-component random process. It reminds in some respect to the behavior of a developed ant state, where different functional roles are performed by different sub-populations. In other words, it is more than just swarming behavior, there is division of labor among different populations of agents.

The most important third component is a structure that represents a network of symmetry relations, i.e. a set of symmetry relations are arranged as a graph, where the weights of the relations are dynamically adapted to the given task.

Based on these architectural principals, Copycat produces—and it is indeed a creative production— its answers.

Conclusion

Of course, Copycat is a model system. The greatest challenge is to establish the “Platonic” sphere (Hofstadter’s term) that comprises the dynamical relational system of symmetry relations. In a real system, this sphere of relations has to be fed by other parts of system, most likely the modeling sub-system. This sub-system has to be able to extract abstract relations from data, which then could be assembled into the “Platonic” device. All of those functional parts can be covered or served by self-organizing maps, and all of them are planned. The theory of these parts you may find scattered throughout this blog.

Copycat has been created as a piece of software. Years ago, it had been publicly available from the ftp site at the Illinois University of Urbana-Champaign. Then it vanished. Fortunately I was able to grab it before.

Since I rate this piece as one of the most important contributions to machine-based episteme, I created a mirror for downloading it from Google code. But be aware, so far, you need to be a programmer to run it, it needs a development environment for the Java programming language. You can check it out from the source repository behind the link given above.  In the near future I will provide a version that runs more easily as a standalone program.

  • [1] David E Rumelhart, Adele A Abrahamson (1972), “A model for analogical reasoning,”
    Cognitive Psychology  5(1) 1-28. doi
  • [2] John F. Sowa and Arun K. Majumdar (2003). “Analogical Reasoning,” in: A. Aldo, W. Lex, & B. Ganter, eds. (2003) Conceptual Structures for Knowledge Creation and Communication, LNAI 2746, Springer-Verlag, pp. 16-36. Proc Intl Conf Conceptual Structures, Dresden, July 2003.
  • [3] Morrison RG, Krawczyk DC, Holyoak KJ, Hummel JE, Chow TW, Miller BL, Knowlton BJ. (2004). A neurocomputational model of analogical reasoning and its breakdown in frontotemporal lobar degeneration. J Cogn Neurosci. 16(2), 260-71.
  • [4] Chalmers, D. J., R. M. French, & D. R. Hofstadter (1992). High-level perception, representation, and analogy: A critique of artificial intelligence methodology. J Exp & Theor Artificial Intelligence 4, 185-211.

۞

FluidSOM (Software)

January 25, 2012 § 7 Comments

The FluidSOM is a modular component of a SOM population

that is suitable to follow the “Growth & Differentiate” paradigm.

Self-Organizing Maps (SOM) are usually established on fixed grids, using a 4n or 6n topology. Implementations as swarms or gas are quite rare and also are burdened with particular problems. After all, we don’t have “swarms” or “gases” in our heads (at least most of us for most of the time…). This remains true even if we would consider only the informational part of the brain.

The fixed grid prohibits a “natural” growth or differentiation of the SOM-layer. Actually, this impossibility to differentiate also renders structural learning impossible. If we consider “learning” as something that is different from mere adjustment of already available internal parameters, then we could say that the inability to differentiate morphologically also means that that there is no true learning at all.

These limitations, among others, are overcome by our FluidSOM. Instead of fixed grid, we use a quasi-crystalline fluid of particles. This makes it very easy to add or to remove, to merge or to split “nodes”. The quasi-grid will always take a state of minimized tensions (at least after shaking it a bit … )

Instead of fixed grid, we use a quasi-crystalline fluid of particles. This makes it very easy to add or to remove, to merge or to split “nodes”. The quasi-grid will always take a state of minimized tensions (at least after shaking it a bit … )

As said, the particles of the collection may move around “freely”, there is no grid to which they are bound apriori. Yet, the population will arrange in an almost hexagonal arrangement… if certain conditions hold:

  • – The number of particles fits the dimensions of the available surface area.
  • – The particles are fully symmetric across the population regarding their properties.
  • – The parameters for mobility and repellent forces are suitably chosen

Deviations from a perfect hexagonal arrangement are thus quite frequent. Sometimes hexagons enclose an empty position, or pentagons establish instead of hexagons, frequently so near the border or immediately after a change of collection (adding/removing a particle). This, however, is not a drawback at all, especially not in in case of SOM layers that are relatively large (starting with N>~500). In really large layers comprising >100’000 nodes, the effect is neglectable. The advantage of such symmetry breaks on the geometrical level, i.e. on the quasi-material level, is that it provides a starting point for natural pathway of differentiation.

There is yet another advantage: The fluid layer contains particles that not necessarily are identical to the nodes of the SOM, and also the relations between nodes are not bound to the hosting grid.

The RepulsionField class allows for a confined space or for a borderless topology (a torus), the second of which is often more suitable to run a SOM.

Given all the advantages, there is the question why are fixed grids so dramatically preferred against fluid layouts? The answer is simple: it is not simple at all to implement them in a way that allows for a fast and constant query time for neighborhoods. If it takes 100ms to determine the neighborhood for a particular location in a large SOM layer, it would not be possible to run such a construct as a SOM at all: the waiting time would be prohibitive. Our Repulsion Field addresses this problem with buffering, such it is almost as fast as the neighborhood query in fixed grids.

So far, only the RepulsionField class is available, but the completed FluidSOM should follow soon.

The Repulsion Field of the FluidSOM is available through the Google project hosting in noolabfluidsom.

The following four screenshot images show four different selection regimes for the dynamic hexagonal grid:

  • – single node selection, here as an arbitrary group
  • – minimal spanning tree on this disjoint set of nodes
  • – convex hull on the same set
  • – conventional patch selection as it occurs in the learning phase of a SOM

As I already said, those particles may move around such that the total energy of the field gets minimized. Splitting a node as a metaphor for natural growth leads to a different layout, yet in a very smooth manner.

Fig 1a-d: The Repulsion Field used in FluidSOM.
Four different modes of selection are demonstrated.

To summarize, the change to the fluidic architecture comprises

  • – possibility for a separation of physical particles and logical node components
  • – possibility for dynamic seamless growth or differentiation of the SOM lattice, including the mobility of the “particles” that act as node containers;

Besides that FluidSOM offers a second major advance as compared to the common SOM concept. It concerns the concept of the nodes. In FluidSOM, nodes are active entities, stuffed with a partial autonomy. Nodes are not just passive data structures, they won’t “get updated2 by a central mechanism. In a salient contrast they maintain certain states comprised by activity and connectivity as well as their particular selection of a similarity function. Only in the beginning all nodes are equal with respect to those structural parameters. As a consequence of these properties, nodes in FluidSOM are able to outgrow (pullulate) new additional instances of FluidSOM as kind of offspring.

These two advances removes many limitations of the common concept of SOM (for more details see here).

There is last small improvement to introduce. In the snapshots shown above you may detect some “defects,” often as either holes within a perfect hexagon, or sometimes also as a pentagon. But generally it looks quite regular. Yet, this regularity is again more similar to crystals than to living tissue. We should not take the irregularity of living tissue as a deficiency. In nature there are indeed highly regular morphological structures, e.g. in the retina of the eyes in vertebrates, or the faceted eyes of insects. In some parts (motor brains) of some brains (especially birds) we can find quite regular structures. There is no reason to assume that evolutionary processes could not lead to regular cellular structures. Yet, we never will find “crystals” in any kind of brain, not even in insects.

Taking this as an advice, we should introduce a random factor into the basic settings of the particles, such that the emerging pattern will not be regular anymore. The repulsion principle still will lead to a locally stable configuration, though. Yet, strong re-arrangement flows are not excluded either. The following figure show the resulting layout for a random variation (within certain limits) of the repellent force.

Figure 2: The Repulsion Field of FluidSOM, in which the particle are individually parameterized with regard to the repellent force. This leads to significant deviations  from the hexagonal symmetry.

This broken symmetry is based on a local individuality with regard to repellent force attached to it. Albeit this individuality is only local and of a rather weak character, together with the fact of the symmetry break it helps to induce it is nevertheless important as a seed for differentiation. It is easy to imagine that the repellent forces are some (random) function of the content-related role of the nodes that are transported by the particles. For instance, large particles, could decrease or increase this repellent force, leading to a particular morphological correlates to the semantic activity of the nodes in a FluidSOM.

A further important property for the determining the neighborhood of a particle is directionality. The RepulsionField supports this selection mode, too. It is, however, completely controlled on the level of the nodes. Hence we will discuss it there.

Here you may directly download a zip archive containing a runnable file demonstrating the repulsion field (sorry for the size (6 Mb), it is not optimized for the web). Please note that you have to install java first (on Windows). Else, I recommend to read the file “readme.txt” which explains the available commands.

How to Glue

November 9, 2011 § Leave a comment

Even software engineers know about their holy grail.

As descendants of the peak of modernism, I mean the historical phase between the “Wiener Kreis” and its founders of the positivism and the formalization of cybernetics by Norbert Wiener, this holy grail is given, of course, by the trinity of immediacy, transparency and independence.

It sounds somewhat strange that programmers are chasing this holy trinity. All the day long they are linking and fixing and defining, using finite state automata (paradoxically called “language”). Yet, programming is also about abstraction, which means nothing less as to decouple “things” from “ideas,” or more profane, the material  from the immaterial. Data are decoupled from programs, data are also decoupled from their representation (format), machines are decoupled, nowadays forming clouds (what a romantic label for distributed control), even inside a program, structures are decoupled from their actualization (into a programming language), objects are decoupled as much as possible and so on.

These are probably some of the reasons upon which Donald Knuth coined the issue of “The Art of Programming.” Again, it has of course nothing to do with the proclaimed subject, art in this case, even not metaphorically. Software engineers suffer from the illness of romanticism since the inception of software. Software always has a well-identified purpose, simply put, by definition. By definition, software is not art. Yet, there is of course something concerning software engineering, which can’t be defined formally (even as it is not art); maybe, that’s why Knuth felt to be inclined towards this misleading comparison (it not only hides the “essentials” of art, but also of formalization).

Due to the mere size of the issue of abstraction or even abstraction in programming we have to refrain from the discussion of this trinitarian grail filled with modernist phantasms about evanescence. We will perhaps do so elsewhere. Yet, what we will introduce here is a new approach to link different parts of a program, or different instances of some programs, where “program” means something like “executable” on a standard physical computer in 2011.

This approach will follow the constraint that the whole arrangement should be able to grow and to differentiate. In that we have to generalize and to transcend the principles as listed above. In a first, still coarse step we could say that we relate programs such that

  • – the linkage is an active medium, it is ought to be meta-transparent and almost immediate;
  • – independence is indeed complete, thus causing the necessity of extrinsic higher-order symbolism (non-behavioristic behavior)
  • – the linking instance is able to actualize very different, but first of all randomized communicological models

We created a small piece of software that is able (not yet completely, of course) to  represent these principles. Soon you will find it in the download area.

There are, of course, a lot of highly sophisticated softwares, both commercial and open sourced, dealing with the problem of linking programs, or users to programs. We discuss a sample of the more important ones here, along with the reasons why we did not take them as-is for our purpose of a (physically and abstractly) growing system. Of course, it was inspiring to check their specifications.

This brings us to the specification of or piece, which we call NooLabGlue:

  • – its main purpose is to link instances or parts of programs, irrespective of physical boundaries, such as networks or transport protocols (see OSI-stack);
  • – above all, its usage of the client-library API for the programmer has to be as simple as possible; actually, it has to be so simple, that its instantiation can be formalized.
  • – its functionality of “glueing” should not be dependent on a particular programming language;
  • – in its implementation (on the level of the computation procedure [1]) it has to follow the paradigm of natural neural systems: randomization and associativity; this means that it also should be able to learn;
  • – any kind of “data” should be transferable, even programs, or modules thereof;

In order to fulfill this requirements some basic practical decisions have to be made about the “architecture” and the tools to be employed:

  • – the communicative roles of a “glued” system are participants that are linking to a MessageBoard; participants have properties, e.g. regarding their type of activity (being source, receptor, or both), the type of data they are releasing or accepting, or as specified by more elaborate filters (concerning content), issued by the participants and hosted by the MessageBoard; there may be even “personal” relationships between participants, or between groups (1+:1+) of them;
  • – the exchange of messages is realized as a fully transactional system: participants as well as the MessageBoard (the message “server”) may shut down / restart at any time, and still “nothing” will be lost (nothing: none of the completely transferred messages);
  • – xml and only xml is transferred, no direct object linking like in ORB, RMI, or Akka; objects and binary stuff (like pdf, etc.) are taken as data elements inside the xml and they are transferred in encoded form (basically as a string of base64);
  • – contracting is NOT on the level of fields and data types, but instead on the level of document types and the respective behavioral level;
  • – MessageBoards are able to cascade is necessary, using different transport protocols or OSI-layers at the same time, e.g. for relaying messages between local and remote resources;
  • – the “style” to run remote MessageBoards is actualized following restful approach (we use the Java Restlet framework); that means, that there are no difficulties to connect applications across firewalls; note that the restful resource approach is taken as a style on top of HTTP, as a replacement of home-grown HTTP client; yet, the “semantics” will NOT be encoded into the URL as this would need cookies and similar stuff: the “semantics” remains fully within the xml (which even could be encrypted); in local networks, one may use transport through UDP (small messages) or TCP (any message size), neither the programmer nor the “vegetative” system need to be aware of the actual format/type of transport protocol.

Now, imagine that there are several hundred instances of growing and pullulating Self-Organizing Maps around, linked by this kind of “infrastructure”… the SOM may even be part of the MessageBoard (at least concerning the adaptive routing of “messages”).

Such an infrastructure seems to be perfect for a system that is able to develop the capability for Non-Turing-Computation.

(download of a basic version will be available soon)

  • [1]

Where Am I?

You are currently browsing entries tagged with download at The "Putnam Program".