Main questions in psychology: what is developing, and how does development occur?
- Examples: past-tense ("-ed") verb usage; conservation of volume
- These are things that can potentially be explained with modeling
- Modeling as a simulation of the real world: it lets you state your theory in terms of inputs and outputs
- AI models are simulating, not thinking
- Different types of modeling (includes AI and neural networks)
- Computational modeling is necessary in psychology because behavior is too complex to work out with pen and paper
Research involves the combination of exploratory and experimental techniques. First, the exploratory studies ask “What will happen if…?” This portion involves finding the relevant questions and discovering predictive/influential factors. Then, manipulation or observation experiments ask if the models and hypotheses developed from the exploratory stages are accurate.
There are three components to AI manipulations: the agent (e.g., a robot), the task, and the environment.
Activation functions are used to process the inputs to a neuron and determine its activation level. The inputs are summed, and then the function is applied to the total input. There are many different types of activation functions, the simplest being linear functions. Linear functions are limited in what they can express, and they do not model what actually happens in a neuron, because they have no floor or ceiling as a real neuron does. Common nonlinear activation functions include the sigmoid and the hyperbolic tangent, though almost any nonlinear function can suffice (Karl Sims mentions more in his paper "Evolving 3D Morphology and Behavior by Competition"). These nonlinear activation functions are capable of squashing values, providing us with the floor and ceiling that were missing with linear activation functions.
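A minimal sketch of this squashing behavior in Python (the sample inputs are just illustrations):

```python
import math

def linear(x):
    # No floor or ceiling: the output grows without bound.
    return x

def sigmoid(x):
    # Squashes any input into the interval (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

def tanh_act(x):
    # Squashes any input into the interval (-1, 1).
    return math.tanh(x)

# Large inputs stay bounded under the nonlinear functions but not the linear one.
for total_input in (-10.0, -1.0, 0.0, 1.0, 10.0):
    print(total_input, linear(total_input), sigmoid(total_input), tanh_act(total_input))
```

No matter how large the summed input gets, the sigmoid output never exceeds 1 and the tanh output never exceeds 1 in magnitude, which is the floor/ceiling property described above.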
Linear separability is a property of functions: a function is linearly separable if its inputs can be divided into their output classes by a linear boundary. In the 2-dimensional case, this looks like a line where everything on one side of the line maps to one output and everything on the other side maps to a different output. In the multidimensional case, we have hyperplanes that separate the values in the same manner. Generally, it's easier for neural networks to learn linearly separable problems. Learning problems that are not linearly separable requires hidden units with nonlinear activation functions. The XOR function is a simple example of a function that is not linearly separable.
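A sketch of how hidden units with a nonlinear (here, threshold) activation let a network compute XOR. The weights and thresholds are set by hand for illustration, not learned:

```python
def step(x):
    # Threshold activation: a simple nonlinear function.
    return 1 if x > 0 else 0

def xor_net(a, b):
    # Two hidden units with hand-picked weights:
    h_or = step(a + b - 0.5)    # fires if at least one input is on (OR)
    h_and = step(a + b - 1.5)   # fires only if both inputs are on (AND)
    # Output unit computes "OR but not AND" -- exactly XOR.
    return step(h_or - h_and - 0.5)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_net(a, b))
```

No single line through the four input points can separate the 1-outputs from the 0-outputs, which is why the hidden layer is needed.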
Biological neural networks (i.e. the brain) are made up of large numbers of nodes (neurons) with complex interconnections. Each neuron has many inputs (dendrites) and one output (axon), which may be connected to multiple other neurons. Additionally, there is some threshold of input required for a neuron to "fire," or become activated, at which point it propagates its activation to other neurons whose dendrites are connected to its axon.

Artificial neural networks (ANNs) are a simplified model of the biological version. ANNs are made up of nodes with multiple inputs and outputs, and each node's output is determined by a nonlinear function of a weighted sum of its inputs. Usually the nodes are organized into layers with the connections feeding forward from one layer to the next, but some types of ANNs have more complex topologies. ANNs learn by being presented with a wide variety of inputs and adjusting the connection weights to move closer to the desired outputs. More on this later (see: back propagation).

Strengths gained from the biological model:
- Distributed representation: since there is no explicit storage, the network's behavior is encoded in the weights between the neurons. This means there is no way to pinpoint a particular behavior, and removing any one neuron usually will not noticeably degrade the network's overall performance.
- Generalization to new material: during the learning process the network is not memorizing specific examples -- it should actually be "learning" the relationships between the inputs and the outputs. This means that networks usually have a reasonable response to novel inputs. This is a big advantage of ANNs over rule-based systems: they never say "that does not compute."
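A sketch of the forward pass described above: each node applies a nonlinear function to a weighted sum of its inputs, and layers feed forward. The layer sizes and weight values here are arbitrary illustrations, not learned:

```python
import math

def sigmoid(x):
    # The nonlinear squashing function applied at each node.
    return 1.0 / (1.0 + math.exp(-x))

def layer_forward(activations, weights, biases):
    # Each node's output: nonlinear function of a weighted sum of its inputs.
    return [sigmoid(sum(w * a for w, a in zip(w_row, activations)) + b)
            for w_row, b in zip(weights, biases)]

# Hypothetical 2-3-1 feed-forward network with made-up weights.
inputs = [0.5, -1.0]
hidden = layer_forward(inputs, [[0.1, 0.4], [-0.2, 0.3], [0.5, 0.5]], [0.0, 0.1, -0.1])
output = layer_forward(hidden, [[0.3, -0.6, 0.9]], [0.05])
print(hidden, output)
```

Note that the "knowledge" of this network lives entirely in the weight lists; there is no separate store of rules or examples, which is the distributed representation described above.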
Back-propagation: the error is propagated backwards through the network to adjust the weights.
Occurs in two general phases:
1. Feed forward: activations generate errors
a. Topology: multi-layer, feed-forward
b. Activations are summed (weight * activation) across all inputs and passed through the activation function to produce the output
c. The activation function must be differentiable (so the error gradient can be computed) and nonlinear for the algorithm to solve problems that are not linearly separable; the most common choice is the sigmoid
2. Send error backwards
a. The error is calculated for each output node (error = target - output)
b. The error is propagated backwards to determine whether the derivative of the error with respect to each weight is positive or negative
c. Each weight is adjusted according to the direction (and slope) of that derivative
d. Learning-rate and momentum parameters let the programmer adjust how much is learned from each teaching example
3. Advantages of Back-prop
a. Works pretty well
b. Local computation is biologically plausible
4. Problems with Back-prop
a. Fixed topology
i. No good way to know how many hidden units to use; too many can overfit
b. Step-size problem
i. How much should the weights be adjusted so the network learns quickly but does not overstep an optimum?
ii. Tends to learn slowly because of this
c. Moving target problem
i. The hidden layer solves one aspect of a problem but is then unable to solve a different aspect of the same problem
d. Catastrophic Interference
i. When learning is distributed across the whole network, learning new things can come at the expense of old ones
e. Feed-forward only is biologically implausible
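The two phases above can be sketched as code. This is a minimal 2-2-1 sigmoid network trained on XOR; the layer sizes, seed, learning rate, and epoch count are illustrative choices, not prescribed values:

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

random.seed(0)
w_h = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]  # per hidden unit: 2 weights + bias
w_o = [random.uniform(-1, 1) for _ in range(3)]                      # 2 hidden weights + bias
patterns = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
rate = 0.5  # learning rate: how much is learned from each teaching example

def forward(x):
    # Phase 1: feed activations forward (weighted sum -> sigmoid at each node).
    h = [sigmoid(w[0] * x[0] + w[1] * x[1] + w[2]) for w in w_h]
    o = sigmoid(w_o[0] * h[0] + w_o[1] * h[1] + w_o[2])
    return h, o

def total_error():
    return sum((t - forward(x)[1]) ** 2 for x, t in patterns)

error_before = total_error()
for _ in range(5000):
    for x, t in patterns:
        h, o = forward(x)
        # Phase 2: output error (target - output), scaled by the sigmoid's
        # derivative, then propagated back to the hidden units.
        delta_o = (t - o) * o * (1 - o)
        delta_h = [delta_o * w_o[i] * h[i] * (1 - h[i]) for i in range(2)]
        # Adjust each weight in the direction indicated by its derivative.
        w_o = [w_o[0] + rate * delta_o * h[0],
               w_o[1] + rate * delta_o * h[1],
               w_o[2] + rate * delta_o]
        for i in range(2):
            w_h[i][0] += rate * delta_h[i] * x[0]
            w_h[i][1] += rate * delta_h[i] * x[1]
            w_h[i][2] += rate * delta_h[i]
print(error_before, total_error())
```

The total error shrinks as training proceeds, though the step-size and moving-target problems above mean learning can be slow or can stall.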
Ways to change the genotype
There are essentially three methods of reproduction: sexually through crossovers and grafts, or asexually through mutation. The following descriptions make use of the graph representation of individuals.
Crossovers
For sexual reproduction, two parent programs are required. In a genetic crossover, the parents' nodes are lined up next to each other, and then the source of genetic information alternates between the parents. Crossover reproduction can potentially result in a situation in which the offspring inherits neural connections that go nowhere, or nodes from a parent that are left unconnected. In either case, the extra nodes/connections are discarded. However, this type of reproduction is useful in that it ensures the offspring has some of the best parts of both parents, putting the offspring in a position for success while simultaneously allowing for mutation and variation.
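A simplified sketch of this alignment-and-alternation, using flat lists of genes rather than the full graph representation (the gene format here is an assumption for illustration):

```python
def crossover(parent_a, parent_b):
    # Line the parents' nodes up and alternate which parent supplies each one.
    child = []
    for i in range(max(len(parent_a), len(parent_b))):
        source = parent_a if i % 2 == 0 else parent_b
        if i < len(source):
            child.append(source[i])
    # Inherited connections may now point past the end of the child --
    # they "go nowhere" and are discarded.
    return [(value, [c for c in conns if c < len(child)])
            for value, conns in child]

# Each gene is (node value, indices of the nodes its connections point to).
a = [(1.0, [1, 2]), (2.0, [0])]
b = [(9.0, [1, 2]), (8.0, [0]), (7.0, [1])]
child = crossover(a, b)
print(child)  # node 0 from a, node 1 from b; a's dangling connection to node 2 dropped
```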
Grafts
In grafts, after the parent nodes are lined up and copied, a random node in the first parent's copy is connected to a random node in the second copy. The unconnected nodes are removed.
The concept behind sexual reproduction is that only the best pieces of various programs survive over multiple generations.
Mutations
Each newly created program is mutated, or randomly adjusted, through various means. For example, a new node or connector can be randomly appended to the graph, or connector and node values can be altered or multiplied by -1.
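A sketch of such a mutation operator on the same list-of-genes representation; the mutation rate, value ranges, and gene format are illustrative assumptions:

```python
import random

def mutate(genome, rate=0.2):
    # Randomly adjust a copy of the genome: a gene's value may be multiplied
    # by -1 or perturbed, a gene may gain a new random connection, and
    # occasionally a whole new node is appended.
    child = []
    for value, conns in genome:
        if random.random() < rate:
            value = value * -1.0                              # flip sign
        if random.random() < rate:
            value += random.uniform(-0.5, 0.5)                # alter node value
        if random.random() < rate:
            conns = conns + [random.randrange(len(genome))]   # add a connection
        child.append((value, conns))
    if random.random() < rate:
        # Append a new node connected to a random existing node.
        child.append((random.uniform(-1, 1),
                      [random.randrange(len(child))] if child else []))
    return child

random.seed(0)
parent = [(1.0, [0]), (-2.0, [1, 0])]
print(mutate(parent))
```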
Mutations, crossovers, and grafts can lead to the loss of useful connections and often produce useless offspring. Their existence is justified, however, because diversity in populations is essential - programs need to change in order for us to be able to select for the best ones. Useless offspring will die out due to their low fitness levels, and measures can be taken to prevent the loss of useful connections. For example, the evolutionary structure can retain the best of each generation.
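One way to retain the best of each generation is elitism, sketched below on a toy problem. The population size, generation count, fitness function, and mutation are all illustrative choices, not the method from the text:

```python
import random

def evolve(population, fitness, mutate, generations=100, elite=1):
    # Elitism: the best individuals survive to the next generation unchanged,
    # so useful structure cannot be lost to a bad mutation; the rest of the
    # generation is filled with mutated copies drawn from the fitter half.
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        next_gen = ranked[:elite]  # retained verbatim
        while len(next_gen) < len(population):
            next_gen.append(mutate(random.choice(ranked[:len(ranked) // 2])))
        population = next_gen
    return max(population, key=fitness)

# Toy demonstration: evolve a number toward a target value of 42.
random.seed(0)
best = evolve([random.uniform(0, 10) for _ in range(20)],
              fitness=lambda x: -abs(x - 42),
              mutate=lambda x: x + random.uniform(-1, 1))
print(best)
```

Low-fitness offspring simply fail to be selected and die out, while the elite slot guarantees the best solution found so far is never lost.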