>Date: Sun, 4 Jul 1993 00:41:16 -0700 (PDT)
>From: Melvin Rader <radermel at u.washington.edu>
>Subject: Re: Whence cybernetics
> By cybernetics, I take you to mean the study of neural networks
>and connectionist models of artificial intelligence. By no means is it
>dead, or even all that obscure. As an undergraduate at the Evergreen
>State College in Olympia, WA this year I took four credits of
>'Connectionism' and another four of programming of neural networks. I
> Minsky and Papert's book did effectively kill further research
>into neural networks for about two decades. The thrust of the book
>was that with the learning algorithms that had been developed then, neural
>networks could only learn linearly separable problems, which are always
>simple (this was proved mathematically). Networks existed which could
>solve more complicated problems, but they had to be "hard wired" - the
>person setting up the network had to set it up in such a way that the network
> etc.
You'd better give those credits back. The book (1) explained some
theory of which geometric problems are linearly separable (and the
results were not notably simple), and (2) derived lower bounds on how
the size of networks and coefficients must grow with the size of
certain problems. (3) These results have nothing whatever to do with
the learning algorithms involved, because they only discuss the
existence of suitable networks.
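
To make concrete what a pure existence result looks like, here is the smallest textbook case, chosen only for illustration (it is not one of the book's own theorems, which concern much harder predicates). Assume a single threshold unit that outputs 1 exactly when w_1 x_1 + w_2 x_2 + b > 0; the symbols w_1, w_2, b are notation introduced here. No choice of weights computes XOR, whatever learning procedure picks them:

```latex
\begin{align*}
(0,0) \mapsto 0 &: \quad b \le 0 \\
(1,0) \mapsto 1 &: \quad w_1 + b > 0 \\
(0,1) \mapsto 1 &: \quad w_2 + b > 0 \\
(1,1) \mapsto 0 &: \quad w_1 + w_2 + b \le 0 \\[4pt]
\text{Adding the two middle lines:} &\quad w_1 + w_2 + 2b > 0
  \;\Longrightarrow\; w_1 + w_2 + b > -b \ge 0,
\end{align*}
```

which contradicts the last constraint. The argument never mentions how the weights are found, which is the sense in which such results are independent of the learning algorithm.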
There was not much research in neural networks between 1969, when
the book was published, and around 1980 or so. This may have been
partly because we showed that feedforward nets are impractical for
various kinds of invariant recognition on large retinas, although they
are useful for many other kinds of problems. The problem was that too
many people propagated absurdly wrong summaries of what the book said
-- as in the above account. There were some gloomy remarks near the
end of the book about the unavailability of convergence guarantees for
multilayer nets (as compared with the simple perceptron procedure,
which always converges for separable patterns), and this might have
discouraged some theorists. There still are no such guarantees for
learning algorithms of practical size -- but for many practical
purposes, no one cares much about that.
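
As a rough illustration of the "simple perceptron procedure" mentioned above, here is a minimal sketch in Python. The function name `train_perceptron`, the 0/1 target encoding, and the `max_epochs` cutoff are assumptions made for this example, not anything taken from the book. On a separable pattern set (AND) the error-correction rule reaches a pass with no mistakes; on a non-separable one (XOR) it cannot, so the loop simply runs out of epochs.

```python
def train_perceptron(samples, max_epochs=100):
    """Error-correction rule for one threshold unit.

    samples: list of (inputs, target) pairs, targets are 0 or 1.
    Returns (weights, bias, converged).
    """
    n = len(samples[0][0])
    w = [0] * n
    b = 0
    for _ in range(max_epochs):
        errors = 0
        for x, t in samples:
            # Unit fires when the weighted sum exceeds zero.
            y = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            if y != t:
                delta = t - y  # +1 or -1
                w = [wi + delta * xi for wi, xi in zip(w, x)]
                b += delta
                errors += 1
        if errors == 0:
            # A full pass with no mistakes: the patterns are separated.
            return w, b, True
    # Cutoff reached; for non-separable data this always happens.
    return w, b, False


def predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0


AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

if __name__ == "__main__":
    for name, data in (("AND", AND), ("XOR", XOR)):
        w, b, ok = train_perceptron(data)
        print(name, "converged" if ok else "did not converge", w, b)
```

The epoch cutoff is only a practical guard for the sketch; the convergence guarantee itself applies to the separable case and says nothing about multilayer nets.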