Week 15 Reactions

Reading:
Alex Robinson

In the Large and Kolen paper they mention that recurrent networks haven't been able to successfully learn music sequences. I wonder why that is so. It may be that the complexity is too great, but it seems that music would be an ideal temporal sequence to train a recurrent network on. Perhaps the results Elman got with his network came about primarily because predicting words in a sentence relies on relatively little temporal information.

One inherent problem with recurrent networks is their inability to selectively save certain memories based on their importance to the current situation. A network may learn to strongly react to certain things in the context layer, but all of that information is uniformly painted over by the next set of output layer activations. I imagine the next generation of recurrent networks might have a selective short-term memory that holds vital memories chosen by the network. I can't, however, say now how it might be implemented. Government secrets and all.

After attending Hofstadter's lecture and reading these articles about music, I see a tendency to use a fair number of strong AI strategies to solve the problem. They are thinking about its structure instead of letting the computers do most of the work by crunching numbers. I wonder if music will be treated in cognitive science the way language was: first people will try very hard to model it with more symbolic, rule-based systems, but then slowly realize the merits of using simple computational frameworks that crunch numbers until they find more complex structures on their own.

At first the Port and van Gelder article was a little painful. I can only take so many irrefutable arguments for connectionism, and it seems like wherever I go someone is trying to convince me of its validity (I think it's the best option right now). The idea of time, I think, plays an important and sometimes unnoticed role, and their argument is an important aspect of the bigger picture. Time has a significant role because even if a neuron is highly activated, its final effect will depend entirely on at what point in a process it plays its role, since the activation will slowly decay.


Carl Wellington - Dynamical Systems

I really liked the general approach that this article proposed. Many of the author's complaints about other approaches in AI matched very closely with my own thoughts, and it was really cool to see an alternative way of looking at the problem. The material in chapter two is exactly what Roger and I have been studying in Math 30 for the second half of this semester, so it was fun seeing it from a slightly different angle. Jon picked up the book this material came from, so I got to at least glance at a few of the applications the author refers to in his introduction (ch. 1). Elman has an article in it about his recurrent nets. In general the book looked pretty interesting.

I have a few questions/complaints which came up during the reading. Early on, the author says he thinks it is "unlikely that it will be possible to reproduce the kind of intelligent capacities that are exhibited by natural cognitive systems without also reproducing their basic noncomputational architecture". I completely agree with this statement, but I'm not completely clear on what is truly noncomputational about these dynamical systems. The author makes the point that although dynamical systems are analyzed on computers, these are only simulations of the real systems. I wonder how a computer simulation of a dynamical system is truly different from the underlying equation.

Dynamical systems use continuous time instead of discrete time, and the author makes this seem like a really big difference between dynamical analysis and computationalist approaches. However, with a small enough time step, I would think that continuous time and discrete time are equivalent in terms of the end result. It is true that, given any time step size, it is possible to create a dynamical system with important phenomena that will be missed because of the large step size. But I believe it is also true that, given any dynamical system, it is possible to choose a step size small enough that all the important phenomena will be captured (see the sketch below). For instance, if a time step were chosen that was, say, ten times as fast as the firing of a neuron, any dynamical activity which neurons exhibited would be captured to a good enough approximation by a discrete system. Also, since computers use discrete time, we really can't use continuous time anyway for a dynamical system of any complexity.

The author also talks about how any temporal dependence built into a computationalist model is completely arbitrary, but the temporal behavior of a dynamical system is also arbitrary: the designer of a dynamical system has complete control over when things happen. With either approach, it is basically just a matter of finding the correct constants. Finally, I was curious how dynamical systems model discrete events (like an ecological example with only whole sheep and foxes). In all the problems we've done in Math 30 which required integer results, we've had to simply round down our answer. This doesn't seem very elegant.
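
To make the step-size point concrete, here is a minimal sketch in Python (my own toy example, not anything from the chapter) showing fixed-step Euler integration converging on the continuous solution as the step shrinks:

```python
import numpy as np

def euler(f, x0, dt, t_end):
    """Integrate dx/dt = f(x) with fixed-step Euler; returns the trajectory."""
    xs = [x0]
    for _ in range(int(round(t_end / dt))):
        xs.append(xs[-1] + dt * f(xs[-1]))
    return np.array(xs)

# Toy continuous system: exponential decay dx/dt = -x, exact solution e^(-t).
f = lambda x: -x
for dt in (0.5, 0.1, 0.01):
    x = euler(f, x0=1.0, dt=dt, t_end=5.0)
    print(f"dt={dt}: x(5) ~ {x[-1]:.5f}   (exact: {np.exp(-5.0):.5f})")
```

With dt = 0.5 the discrete trajectory is visibly off; by dt = 0.01 it agrees with the continuous answer to three decimal places, which is exactly the "sufficiently small pieces" argument.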

While I thought that the author occasionally overstated the difference between the dynamical approach and the computationalist approach, I liked the basic philosophy underlying the proposed dynamical systems approach a lot. I really like the idea of many interdependent, temporally-based relationships on different time scales all interacting in parallel to self-organize into the continually changing system which we call consciousness. It seems like this is basically what is going on in the brain. I would like to know how much has been done with this approach in terms of applications. I haven't had time to look at the rest of the book that Jon checked out to see the applications the authors included, but I'm very curious how complex they are. I know from Math 30 that a very simple system with just one or two nonlinear differential equations and a couple of variable parameters can be very difficult to analyze. A system with the amount of complexity needed to perform any meaningful task would be horribly hard to analyze and understand. At first I thought that the amount of complexity in these systems was a real problem, but now I think it is a plus. The idea of billions of interrelated nonlinear relationships all continually evolving over time makes me think of another incredibly complex and hard-to-analyze system: the brain.


Aaron Hoffman

Ed Large and temporal structure.

So basically we have a bunch of units that are each responsible for detecting simple oscillations in a specific frequency range.

Lo and behold, they learn to predict polyrhythms - each unit predicting one of the simple rhythms in the polyrhythm.

I applaud the authors for postponing analysis of a network of connected units until the units are better understood individually.

It is surprising how complex the units get. One would think that the receptive fields are narrow enough that they'd phase-lock to the mean signal within their window. One would also think that this mean signal would be a fair predictor of the actual beat, because "noise", i.e., off-beat signals, or signals from other simple rhythms within the polyrhythm, would be distributed with mean equal to the signal that the unit is trying to detect.
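
For intuition about what a single such unit does, here is a deliberately oversimplified sketch in Python. The update rule and gains are my own invention for illustration, not Large and Kolen's actual equations: the unit just nudges its phase and period toward each incoming pulse.

```python
import numpy as np

# A single adaptive oscillator nudged toward an incoming pulse train.
# On each input pulse, the unit shifts its predicted beat (phase) and
# its period toward the pulse. Gains are made up for illustration.
phase, period = 0.0, 1.3          # initial guesses (seconds)
g_phase, g_period = 0.5, 0.15     # adaptation gains (assumed)

for t in np.arange(0.0, 40.0, 1.0):                       # a steady 1.0 s beat
    predicted = phase + round((t - phase) / period) * period
    error = t - predicted                                  # how early/late the prediction was
    phase += g_phase * error
    period += g_period * error

print(f"learned period ~ {period:.2f} s (input beat: 1.00 s)")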

It will be really interesting to see how networks of such oscillators interact, and whether they can be made to behave similarly to models of networks of biophysical neurons.

Gelder and Port

As brevity is the soul of wit, and tediousness its limbs and outward flourishes, let me be brief:

this article is bad: bad call I it; for, to define true badness...

My goodness. The author clearly spent his academic career filling page limits instead of writing concisely and coherently.

At any rate, I suppose I'd better substantiate my complaints.

Firstly, the overarching dynamic paradigm is a little grandiose. Yeah, time happens, it's kinda weird, and a lot of people think the real numbers model time pretty well. BUT there is nothing wrong with chopping time up into little pieces as long as the pieces are sufficiently small. If we are worried about events that occur in the one-second range and we chop time up into milliseconds, then for our purposes time is continuous even though we are technically modelling it discretely.

Secondly, I completely missed whatever subtlety distinguishes Elman nets from processes which model time as discrete. The Elman net is all about time steps, and I still characterize it as dynamic, as do the authors.

Chapter 2

Diff eqs whoo hoo


Tom Kornack

Dynamic Systems

Look, I just saw the word "cognition" on the same page as pictures of masses on springs and was ready to tear it up on the spot. But I didn't. In fact, using classical mechanics is probably the worst and most laughable idea I've recently heard. The Dynamical Approach has a number of good uses, but it is by no means a decent language for describing cognition. Notice that I called it a language; I say this because it does not offer us any new tools that we weren't already working with (function spaces are equally extensive). It just puts the phenomena in a completely unwieldy form that makes it hard to understand anything: the form of differential equations. Why should we want to do this? Well, dynamics has a time variable, and nothing else in current cognitive models does. That's a good point. But their real reason? "Their usefulness in offering the best scientific explanations of phenomena throughout the natural world has been proved again and again." Yeah, and again and again is a vanishingly small number. In fact, classical mechanics utterly fails to explain most phenomena in physics. In a small regime it can be considered a good approximation, but in ALL cases there is absolutely nothing fundamental about its description! Also, there is nothing continuous about the natural world. Indeed, there is nothing continuous about the pulse trains that comprise our cognition. It would not surprise me if we are hopelessly lost here in the world of some poor old French physicist who knows nothing about neurobiology or modern physics, and who would happily apply his life's work on approximation methods for partial differential equations to something like cognitive phenomena. "Granted, we are a long way off..." Couldn't have said it better.

The powerful methods developed by physicists do not yield much that is useful here. The reading showed us how to find the most simple attractors. Now consider this: what are we supposed to make of the next simplest attractor map? Well, that would be the Mandelbrot set, given by the iterative equation Z' = Z^2 + c. That equation is pretty simple to yield such infinite complexity as the Mandelbrot set. Now, let's just try to imagine the attractor map of cognition... What other methods did we learn? A Poincaré diagram (the circle map) is also quite pretty. Unfortunately, all these visualization tools for chaotic behavior are lost on even the next simplest systems. And then, pray tell, what will we do? Physicists have never had the knack for complex systems. Physicists are taught over and over again to model a complex system by the most simple system that you do know the solution to, then modify the simple system until the results match the complex system. If we were to do this with consciousness - and let's just pretend we'd find some first-order approximations to behavior - would we understand? Would we say that cognition is really just a series of coupled masses on springs attached to double-jointed pendulums? No. The language of complex systems is statistical mechanics, and that field has been around for a little less than 30 years, so most physicists still don't fully understand it or know how to use it.
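
The iteration mentioned above is easy to play with; here is the standard escape-time test in Python (a textbook construction, not anything from the article):

```python
# Standard escape-time test for the Mandelbrot set: iterate z <- z^2 + c
# from z = 0 and check whether |z| stays bounded.
def escapes(c, max_iter=100):
    z = 0j
    for n in range(max_iter):
        z = z * z + c
        if abs(z) > 2.0:        # once |z| > 2 the orbit provably diverges
            return n            # iteration at which it escaped
    return None                 # bounded so far: in the set (to this depth)

for c in (0j, -1 + 0j, 1j, 0.5 + 0j, 1 + 0j):
    n = escapes(c)
    print(c, "bounded (in the set)" if n is None else f"escapes at step {n}")
```

Two lines of dynamics yield a boundary of infinite complexity, which is exactly the worry about trying to read cognition off an attractor map.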

So, I've been ranting about the way dynamics was introduced in this article. But now what gets me the most is that they turn around and say that Elman nets are definitely dynamical. This surprised me, since I would have said that they're almost completely on the connectionist side of things, with only a little extension of the architecture built in to let them repeat over longer sequences. They in fact do not have a sense of time, or a time variable. But I like Elman networks, so it is only part of their study, the pure dynamical systems, that I take issue with.

The musical beat thing: you know, we're all taking it for granted that the beat of music is something fundamental and easy for a human to figure out. I would argue that the beat is a very abstract and learned thing. There are a lot of humans who can't find the beat. Much of our music today is for the rhythm-impaired, because it uses high-amplitude, ultra-low-frequency pulses. Anyone can find the beat when it is marked so clearly, even a neural network with the cognitive power of a leech. (Sometimes the tone generators used around campus move the air mass significantly, and thus your body for you.) I think the beat of music is learned because we are told what the beat is and exactly where it falls for music that we become familiar with. I think we'd be hard pressed to find the beat in music of a completely unknown style. More importantly, in music that we're familiar with, the identifiable beat is almost always accompanied by an emphasis on the notes. If you played the notes as printed without any emphasis, your music teacher would tell you that you had no sense of the beat. And it's true: unless you are expressive with tonal and amplitude variations, I don't think a beat is uniquely determined. I'd like to see if the network could learn if the MIDI sequences had variations in amplitude. More importantly, I'd like to see an experiment showing that humans can do it. An ideal example would be this: take one of Beethoven's symphonies and ask the class (excluding the music majors!) to find the beat. It'd be interesting, and I predict that we'd be hard pressed to find the beat, let alone the downbeat. (No offense to anyone.)

And a note on abstractions: I've been thinking for a while now that the biggest problem with 'nets is simply the inability to abstract an indefinite number of times. This is made clear by the shape-recognition problem: we want to recognize a square whether it is rotated or scaled. This is tricky because the CCD inputs are in absolute coordinates. So, we could take the spatial Fourier transform and abstract over one dimension so that it doesn't matter where the square is translated. Then you take another spatial Fourier transform so it doesn't matter how big it is. Then you take an angular Fourier transform so it doesn't matter how it's rotated, and so on. Now, if we were to input the final result into a network, it would always recognize the square any which way. But I could imagine yet another level on which another Fourier transform could be taken. My point is that abstract relationships need abstract perception and inputs if there is no abstraction in the network itself. So I've been racking my brains to come up with some abstraction in a network (a "Fourier network" where the nodes are little integrators and comprise the coefficients of a Fourier series...). What if we made it a practice to input all levels of abstraction into a network on any input? I bet we could solve a lot more problems. In one of the backprop networks used to find the beat, they could train it on four different tempos, but it would never generalize. This is because it isn't abstracting at all; and there's no mathematical reason sums of sigmoids should be able to do this. Thus, we need to provide that service. If you took a time Fourier transform over segments of that piece, you would recognize it at any tempo!!!!
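
The translation half of this claim is just the Fourier shift theorem, and it can be checked in a few lines of Python (my own toy check, assuming the network would be fed the magnitude spectrum):

```python
import numpy as np

# Shift theorem: translating a signal only multiplies its Fourier
# coefficients by a phase factor, so the magnitudes are unchanged.
# A network fed |FFT(x)| therefore cannot distinguish a pattern
# from a translated copy of itself.
x = np.zeros(64)
x[10:14] = 1.0                    # a small square pulse
shifted = np.roll(x, 23)          # the same pulse, translated

print(np.allclose(np.abs(np.fft.fft(x)),
                  np.abs(np.fft.fft(shifted))))    # -> True
```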


Jonathon Shlens

MIND AS MOTION
- "natural cognitive systems, such as people, aren't computers" - leading up this part, they destroy the symbolic paradigm of AI.
- "cognitive processes and their context unfold continuously and simultaneously in real time" - the authors here seem to agree with Hofstadter's idea of "fluid" thought and concepts.
- "further they (other AI approaches) say nothing about ... deliberation time ... or how a choice can appear more attractive at one time, less attractive at another" - I think that the authors make a good point. Neither connectionism nor symbolism can explain these facts in human cognition. These approaches do not account for the fact that the same input at a different time can elicit a whole range or distribution (Gaussian?) of responses and deliberation time; the other approaches spew out only one answer. The attraction of backprop networks are that they begin to stray away from this single deliberate answer but rather can output a mixed answer. However, even backprop networks do not go far enough for their model does not explain very well variation in deliberation time and variation in answer over time.
- "the cognitive system does not interact with other aspects of the world by passing messages or commands; rather, it continuously coevolves with them" - I think that they hit the spot. By limiting out models to discrete "messages" and commands" we constrain our models to discrete 'cognition' and eliminate any possibility of variation or fluidity.
-"representations of stored items are point attractors in the phase space of the system. Recalling or recognizing an item is a matter of settling into its attractor, a process that is governed by purely numerical dynamical rules." - YES! I think that they have hit the mark with this statement! This approach and belief works well because it explains the variability of decisions and the self-organization of thought. Since dynamical systems naturally self-organize around attractors, there is no need for a teacher telling you when you are right or wrong. Also, this reminds of the Lorenz phase plot. Basically, the chaotic system remains continuously bounded by two attractors over time yet never remain on the same path. Hence, analogously, a thought or cognition might be represented by an attractor (which self-organizes); however, because of the fact that it never takes the exact same path twice, the thought continues to exhibit both variability and fluidity!!! Wow!
- a problem: how can one attach a motive or a goal to a dynamical system? For instance, in a backprop network the goal is created by presenting the desired output to the system. How would one likewise train a dynamical system? It seems that one would need to somehow manipulate or change the attractor of the system, or possibly change the meaning of the attractor inherent in the system (i.e., change the translation of the attractor from apples=good to work='not fun', etc.).
- "the claim is that models must be capable of describing change from one state to another arbitrarily close to it, as well as sudden change from one state to another discretely distinct from it" - This again is an extremely good point (the idea of fluidity of thought) and reason why symbolic systems (and possibly connectionist systems) are flawed in their approach.
- "dynamical systems are just the simultaneous, mutual influencing activity of multiple parts or aspects" - WOW! Again, their approach seems to agree with observation- that is dynamical system naturally require parallel processing!
WOWOWOWOW! I thought that this reading was great! The ideas presented in this chapter strongly agree with the views I had of the brain coming into the class. I think that the brain is definitely intertwined and composed of billions of nonlinear feedbacks, all of which create self-organization and fluidity of thought. This approach has definite merit, as it really agrees with my perception of the brain. Also, if true, this approach demonstrates the vast complexity of the brain. Indeed, one linear partial differential equation in two variables is hard to solve - an hour or two on paper, generating an utter mess. Imagine a NONlinear partial differential equation in several billion variables (by the way, even a simple two-variable nonlinear equation can be literally impossible to solve analytically).
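
As promised above, here is a minimal sketch of recall-as-settling in Python: a tiny Hopfield-style network, the textbook case of stored items becoming point attractors (the patterns and sizes are made up for illustration):

```python
import numpy as np

# Tiny Hopfield-style network: stored patterns become point attractors,
# and recall is literally settling into the nearest one.
patterns = np.array([[ 1, -1,  1, -1,  1, -1,  1, -1],
                     [ 1,  1,  1,  1, -1, -1, -1, -1]])
W = sum(np.outer(p, p) for p in patterns).astype(float)
np.fill_diagonal(W, 0)                           # no self-connections

probe = np.array([1, -1, 1, 1, 1, -1, 1, -1])    # pattern 0 with one bit flipped
for _ in range(5):                               # settles almost immediately here
    probe = np.where(W @ probe >= 0, 1, -1)
print(probe)    # -> the first stored pattern, exactly
```

Start the dynamics from any nearby corrupted state and it falls into the same attractor, which is just the "settling" the quote describes.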

RESONANCE AND THE PERCEPTION OF MUSICAL METER
I liked this article, as it applied a lot of the theory from the previous reading to the problem of musical meter. The whole development of the theory behind creating a system capable of this feat is very interesting. Specifically, the article describes an implementation of this model in which meter is the "oscillatory behavior of interest, and synchronization arises given simple couplings." Keeping this in mind, the previous article mentioned that "dynamical systems are known to be able to create structure both in space and in time." Both articles point toward the fact that large dynamical systems naturally converge on some structure (musical meter being just one example). These quotes reminded me of a math lecture last month on chaos by Steven Strogatz. In his lecture, he talked about fireflies in Borneo(?) which at night naturally begin to 'orchestrate' their light. Roughly an hour after nightfall, the fireflies synchronize their behavior and light up their bodies at the exact same frequency. In any case, Strogatz proved (partially) the mathematical reason for this. What it boiled down to is that nonlinear systems naturally organize. Wow! It is nice to see this math relate to theories of how the brain recognizes musical meter (or anything else)!
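
The usual toy model for that firefly story is Kuramoto's coupled-oscillator system; here is a tiny simulation in Python (the standard textbook model, with constants I made up) showing the kind of synchronization Strogatz described:

```python
import numpy as np

# Kuramoto model: N oscillators with slightly different natural
# frequencies, each pulled toward the population's mean phase.
# Above a critical coupling strength they synchronize.
rng = np.random.default_rng(0)
N, K, dt = 100, 2.0, 0.01
theta = rng.uniform(0, 2 * np.pi, N)       # random initial phases
omega = rng.normal(1.0, 0.1, N)            # natural frequencies

def coherence(th):
    return abs(np.mean(np.exp(1j * th)))   # r = 1 means perfect sync

print("coherence before:", round(coherence(theta), 2))
for _ in range(3000):                      # integrate 30 s with Euler steps
    r = coherence(theta)
    psi = np.angle(np.mean(np.exp(1j * theta)))
    theta += dt * (omega + K * r * np.sin(psi - theta))
print("coherence after: ", round(coherence(theta), 2))   # climbs toward 1
```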

Michael Morton
Response to:
It's About Time:
An Overview of the Dynamical Approach to Cognition

This article had some interesting points, which the authors repeated over and over again (or so it seemed to me). The idea that we should represent cognition as a dynamical system and use dynamical approaches was a new one to me, and one that has a lot of merit. I especially liked the idea that perhaps our whole framework (and mindset) is wrong, and that although we have found many good approximations for cognition, we are lacking the appropriate model. The analogy to astrology really hits home. Although a dynamical approach may not be the correct one either - who knows.

The embeddedness idea is something that really makes sense to me and fits much better into a dynamical framework. I believe that things go on at many levels in our heads and work together in some sense to come up with responses to stimuli. In a discrete model, it's nearly impossible to cope with such things in parallel, by the mere fact that we are operating in time steps. Well, not exactly, but I do not think the idea that all these things happen at the same time and somehow blend together can be captured without including time in some very real sense.

"Everything is simultaneously affecting everything else." This is definetly how the world works, and modelling it as such sure makes a lot of sense. That's why we use dynamics to model the world. So why use sequential machines to model our brains? Our brains also interact with the world through our bodies. Although I believe that some of our intellectual activity is certainly just planning, much or most of what we do is interacting with the world and responding to stimuli. Why then should we not use a dynamical approach as we do in physics?

I liked the discussion of the relation of neural nets to this "new" approach. "Such networks are little more than sophisticated devices for mapping static inputs into static outputs." This is exactly what we have been saying (pretty much). NNets are just a mapping from one space to another. It is hard for them to capture any essence of time, although we attempt to via different, more sophisticated architectures. But then, are these architectures just a patch on a model that simply isn't correct?

Finally, although I enjoyed the article and it definitely got me thinking on a whole new level, it lacked concreteness in some sense. Just what is this system that they are referring to? Just what are some of the parameters or "equations"? I may be getting ahead of the chapter, but though the authors raved (and repeated) about this model, they failed to mention anything concrete.


Dave Lewis


Elaine Huang's Briefs

It's About Time: An Overview of Blah, blah, blah
Timothy van Gelder, Robert H. Port

Okay. This article started out really promisingly. I thought: wow, this is really what learning modelling needs, a way to make things continuous. I wonder how they're going to do that? That's cool.

So I was sort of disappointed. This article knocked down computationalism and promoted dynamicism, but didn't really provide any examples of how dynamicism was a feasible approach to take, or whether any successful attempts to use dynamicism had been made. There seemed to be no empirical evidence that the dynamic approach was superior to the computational approach, despite a convincing argument that cognition is dynamic as opposed to computational. And, unless I've seriously misunderstood, the authors went ahead and described a supposedly dynamic model of cognition as being discrete. Now this strikes me as a severe case of the pot arguing with the kettle.

I'm also not sure that the claim that human beings aren't computers, and that cognition therefore should not be modelled computationally, is quite as simple a statement as the authors seem to think. Are we really sure that there is some essence behind the mechanics?

I also am really confused about the statement that "the most powerful known medium of representation is symbolic, and hence cognitive processes must manipulate symbols, i.e., must be computational in nature." First, are we sure that we use the most powerful medium of representation, and secondly, didn't we just say that cognitive processes were dynamic in nature?

Also, it seems that some cognitive processes might be computational- such as the classic example of chess. True, external factors are dynamic, but the cognitive processes behind playing the game would work discretely, no?

I also have issues with their assumption that natural language is well-understood, but it seems to have been a passing comment.

In short, I think the authors made a lot of claims, and knocked down a lot of useful stuff, and then went on to either not support their own claims sufficiently, or contradict themselves. Though I may just have missed something.

Resonance and the Perception of Meter
Edward Large and John Kolen

I found this article interesting, but difficult because of my very limited math and physics background. I'm interested to know why the connectionist approaches were unsuccessful in recognizing melodies played at different tempos. It would seem that a melody played at different speeds would still be quite patterned and therefore recognizable by a network. Why were the attempts to train the networks unsuccessful? Would the use of context units help? What does this say about using networks on patterns in which time is a factor?

Also, as a technical question, what is the difference between phase-locked and frequency-locked?

How does the model respond to fluctuations in tempo that are characteristic of music as played by a human? A tempo can easily decrease from 88 beats per minute to 40 while the listener still feels the rhythm. Can the model handle such significant changes? The article describes the reaction to a ritardando as involving a "slight phase shift." Does this work with great changes in tempo?


Roger Bock

It's About Time: An Overview of the Dynamical Approach to Cognition
pg 2 - What is a research paradigm in Kuhn's classic sense?

pg 6 - I find it hard to believe that our minds can be modelled by a set of state variable equations, however numerous they may be. I can't believe that a person in one state will always do the same thing next. I believe that for every me which makes one decision, there is another hypothetical me which could just as easily have made the other choice. I think some of our behavior is governed by chance. Although, maybe it's just that our minds are so chaotic that it appears that chance is involved. In other words, a very minute difference in my state may result in my choosing a different course of action (a tiny demo of this kind of sensitivity appears below). However, because the difference in states which results in a different choice is so infinitesimal, it may as well be chance.
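
That "minute difference" intuition is exactly what dynamicists call sensitive dependence on initial conditions; here is a quick logistic-map demo in Python (a standard example, not from the chapter) in which two states differing by one part in a million diverge completely:

```python
# Sensitive dependence in the logistic map x <- r*x*(1 - x),
# the standard one-line chaotic system (r = 3.9 is in the chaotic regime).
r = 3.9
a, b = 0.500000, 0.500001        # states differing by one part in a million
for step in range(1, 41):
    a, b = r * a * (1 - a), r * b * (1 - b)
    if step % 10 == 0:
        print(f"step {step:2d}: |a - b| = {abs(a - b):.6f}")
```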

pg 24 - I think the authors are spending a little too much time redundantly disputing the virtues of the computationalist approach. It's too bad there is such rivalry inherent in some disciplines of science.

pg 27 - What is the CNS?

pg 28 - I would love to hear the computationalist rebuttals to some of the points made in this article, as well as a non-biased discussion of the deficiencies of the dynamic approach.

pg 35 - I don't think it's necessarily true that using lower dimensional mathematical models corresponds to studying a system at a higher level. After all, what if all those dimensions are needed to show some sort of emergent property?

Dynamics - An Introduction
pg 45 - This chapter was kind of boring, because I am taking Math 30 this semester. Still, it was interesting to hear about dynamics from another person's viewpoint.

pg 60 - Are asymptotically stable fixed points a subset of Lyapunov-stable fixed points?

pg 62 - What are separatrices?

pg 64 - What does it mean for a system to have a dense orbit?

Resonance and the Perception of Musical Meter
pg 178 - I'm surprised Cope's work isn't mentioned in this paper; I wouldn't have thought the field of music production would be that well populated.

pg 179 - What input would the network trained to recognize a melody at a specific tempo receive?

pg 182 - How would one have conflict between pitch structure and temporal structure?

pg 183 - What exactly is Weber's law?

pg 187 - The discussion of coupled oscillators reinforcing each other reminded me of the talk on fireflies given by the guy who wrote my Math 30 textbook, whose name I can't remember. I wonder if someone who understood that lecture a little better would be able to offer anything insightful about how the ideas he discussed relate to the issue of musical meter and resonance. Also, what is the binding problem?

pg 188 - Why is the ratio of the periods referred to as the bare winding number?

pg 189 - I don't understand what the different regions mean. Do the Arnol'd tongues overlap, and if so, what is the significance of that?

pg 191 - How do the light lines show the effect of coupling?

pg 198 - What does the graph of the combined output show, and how is it related to the graphs for oscillators one and four?

pg 201 - How do the graphs show that the oscillators are responding correctly? To me, it looks as if their output is wandering all over the place.

pg 202 - How would these high level units' behavior emerge from the behavior of individual neurons?

pg 203 - What is the quantization problem? What is the rhythmogram model?