CS81 Section 1

Week 1 Summary

*NON-AI GROUP*

Modeling in psychological development
  • The main question in psychological development is: What is developing, and how does it occur?
  • Modeling as a simulation of the real world allows theories to be explained in terms of input and output.
  • AI simulations simulate thinking; they do not actually think.
  • Psychology is too complex to be modeled with verbal theories alone; computational modeling allows for precision and complexity. It also produces a concrete result that can be tested and corrected if it turns out to be wrong.

Examples of developmental phenomena that modeling can potentially explain include past-tense ("-ed") verb usage and conservation of volume. There are several different types of modeling, including AI models and neural networks.

Empirical Research

Research combines exploratory and experimental techniques. First, exploratory studies ask “What will happen if…?” This stage involves finding the relevant questions and discovering predictive or influential factors. Then, manipulation or observation experiments test whether the models and hypotheses developed during the exploratory stage are accurate.

There are three components to an AI manipulation experiment: the agent (e.g. a robot), the task, and the environment.

*AI GROUP*

Structure of a neural network
  • basic components are nodes and the connections between them, modeling neurons and synapses
  • information is held in two ways:
    • temporarily: activation level of a node
      • a float between 0 and 1 (inclusive)
    • more permanently: one-way connection weights between nodes—how much the first node’s firing affects the second’s activation, and in what direction
      • a positive or negative float of any size
  • often, but not always, organized in layers—see Network topologies, below
  • types of nodes:
    • input nodes activate based on environmental input (e.g. sensor information encoded numerically)
    • output nodes' activations determine the network's response, i.e. the behavior of the system the network is being used to control or optimize
  • activation of a node based on inputs from the nodes feeding into it (see the sketch after this list):
    • for one unit, netInput = ∑(a_i * w_i), where i ranges over the nodes connecting into the unit, a_i is node i's activation, and w_i is the weight of that connection
    • activation = f(netInput), where f is the unit's activation function
  • learning generally only adjusts network weights
    • generative networks can also adjust the shape of the network (extra or pruned nodes, connections)
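
A minimal sketch of this update in Python (the sigmoid used as f here is just one common choice, and the example activations and weights are made up for illustration):

  import math

  def sigmoid(net_input):
      # Squashing activation function: maps any real number into (0, 1).
      return 1.0 / (1.0 + math.exp(-net_input))

  def activate(activations, weights):
      # activations[i] is a_i for node i feeding into this unit;
      # weights[i] is w_i, the weight of that connection.
      net_input = sum(a * w for a, w in zip(activations, weights))
      return sigmoid(net_input)

  # Example: a unit receiving input from three other nodes.
  print(activate([0.9, 0.2, 0.5], [1.5, -0.7, 0.3]))   # about 0.80
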
Network Topologies
  • Arbitrary
    • No fixed pattern to the links between units; arbitrary wiring can produce quite complicated and interesting topologies.
  • Feed Forward
    • Has one input layer and one output layer
    • Often has some number of hidden layers
      • Difficult to solve complex problems without hidden layers
    • Each unit of a layer is connected to every unit of the previous and next layers (see the forward-pass sketch after this list)
  • Recurrent
    • Outputs of units in one layer feed back as inputs to units in an earlier layer, so activation can cycle through the network
  • Cascade Correlation
    • The network begins with no hidden nodes and attempts to solve the given task. If it is not successful, the network adds a hidden node and attempts to learn the problem again, repeating until the task is solved.
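
A minimal sketch of a forward pass through a feed-forward network in Python (the layer sizes and weight values below are arbitrary illustrations, not values from class):

  import math

  def sigmoid(x):
      return 1.0 / (1.0 + math.exp(-x))

  def forward(inputs, layers):
      # layers is a list of weight matrices; layers[k][j][i] is the weight
      # from unit i of the current layer to unit j of the next layer, so
      # every unit in one layer connects to every unit in the next.
      activations = inputs
      for weights in layers:
          activations = [sigmoid(sum(a * w for a, w in zip(activations, row)))
                         for row in weights]
      return activations

  # 2 inputs -> 2 hidden units -> 1 output
  hidden_weights = [[0.5, -0.4], [0.3, 0.8]]
  output_weights = [[1.0, -1.0]]
  print(forward([1.0, 0.0], [hidden_weights, output_weights]))
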
Linear Separability and Activation Functions

Activation functions process the inputs to a neuron and determine its activation level: the inputs are summed, and then the function is applied to the total input. There are many different activation functions, the simplest being linear functions. Linear functions are limited in what they can express, and they do not model what actually happens in a neuron, since they have no floor or ceiling as a real neuron does. Common nonlinear activation functions include the sigmoid, asigmoid, and hyperbolic tangent, though many other nonlinear functions can work (Karl Sims mentions several more in his paper “Evolving 3D Morphology and Behavior by Competition”). These nonlinear activation functions squash values, providing the floor and ceiling that linear activation functions lack.
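
The squashing behavior is easy to see numerically; here is a small sketch comparing a linear function with two common squashing functions (the sample inputs are chosen only for illustration):

  import math

  def linear(x):
      return x                               # unbounded: no floor or ceiling

  def sigmoid(x):
      return 1.0 / (1.0 + math.exp(-x))      # squashes any input into (0, 1)

  # math.tanh squashes any input into (-1, 1).
  for net_input in (-10.0, -1.0, 0.0, 1.0, 10.0):
      print(net_input, linear(net_input),
            round(sigmoid(net_input), 3), round(math.tanh(net_input), 3))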

Linear separability is a property of a function: a function is linearly separable if the inputs that produce one output can be divided from the inputs that produce the other by a linear boundary. In the 2-dimensional case this boundary is a line, with everything on one side of the line mapping to one output and everything on the other side mapping to a different output. In the multidimensional case, a hyperplane separates the values in the same manner. Generally, it is easier for neural networks to learn linearly separable problems. Learning problems that are not linearly separable requires hidden units with nonlinear activation functions. The XOR function is a simple example of a function that is not linearly separable.
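
XOR outputs 1 only when exactly one input is 1, and no single line can separate the (0,1) and (1,0) points from the (0,0) and (1,1) points. A network with hidden units can still compute it; below is a minimal sketch using step-threshold units with hand-picked weights (the particular weights and thresholds are illustrative, not taken from class):

  def step(total, threshold):
      # Simple threshold activation: fire (1) if the summed input exceeds the threshold.
      return 1 if total > threshold else 0

  def xor_net(x1, x2):
      # Hidden unit 1 computes OR of the inputs, hidden unit 2 computes AND.
      h_or  = step(1.0 * x1 + 1.0 * x2, threshold=0.5)
      h_and = step(1.0 * x1 + 1.0 * x2, threshold=1.5)
      # The output fires when OR is on but AND is off, i.e. exactly one input is 1.
      return step(1.0 * h_or - 1.0 * h_and, threshold=0.5)

  for a in (0, 1):
      for b in (0, 1):
          print(a, b, "->", xor_net(a, b))   # prints the XOR truth table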

Biological Basis of Artificial Neural Networks
  Biological neural networks (i.e. the brain) are made up of large numbers of nodes (neurons) with complex interconnections. Each neuron has many inputs (dendrites) and an output (axon), which may be connected to multiple other neurons. Additionally, there is some threshold of input that is required for a neuron to "fire", or become activated, at which point it propagates the activation to other neurons which have dendrites connected to its axon.
  Artificial neural networks (ANNs) are a simplified model of the biological version. ANNs are made up of nodes with multiple inputs and outputs, and the output is determined by a nonlinear function of a weighted sum of the inputs. Usually the nodes are organized into layers with the connections feeding forward from one layer to the next, but some types of ANNs have more complex topologies.
  ANNs learn by being presented with a wide variety of inputs and adjusting their connection weights to move the outputs closer to the desired outputs. More on this later (see: back propagation).
  Strengths gained from the biological model
      Distributed representation: since there is no explicit storage, the network's behavior becomes encoded in the weights between the neurons. This means there is no way to pinpoint a particular behavior, and usually the removal of any one neuron will not noticeably degrade the network's overall performance.
      Generalization to new material: during the learning process the network is not memorizing specific examples; it should actually be "learning" the relationships between the inputs and the outputs. This means that networks usually have a reasonable response to novel inputs. This is a big advantage of ANNs over rule-based systems: they never say "that does not compute."
      
Back Propagation

The core concept is that error is propagated backward through the network to adjust the weights.

Backpropagation occurs in two general phases:

  Feed activations forward to generate errors
      Topology: multi-layer, feed-forward
      For each unit, the weighted activations (weight * activation) are summed across all inputs and passed through the activation function to produce an output
      The activation function must be differentiable for the algorithm to solve problems that are not linearly separable (the sigmoid is most common)
  Send the error backwards
      Error is calculated for each output node (error = target - output)
      The error is propagated backwards to determine whether the derivative of the error with respect to each weight is positive or negative
      Each weight is adjusted according to the direction (and magnitude) of that derivative
      Learning rate and momentum parameters let the programmer adjust how much is learned from each training example (see the sketch below)
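
A minimal sketch of this procedure in Python for a tiny 2-2-1 network trained on XOR (the network size, learning rate, momentum, and number of epochs are illustrative choices, not values from class; with unlucky random initial weights the net can still get stuck in a local minimum):

  import math
  import random

  def sigmoid(x):
      return 1.0 / (1.0 + math.exp(-x))

  class TinyNet:
      # A 2-input, 2-hidden, 1-output feed-forward network trained with backprop.
      def __init__(self, learning_rate=0.5, momentum=0.9):
          self.lr = learning_rate
          self.momentum = momentum
          # Each unit gets a bias, treated as a weight from an extra input fixed at 1.0.
          self.w_hidden = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
          self.w_output = [random.uniform(-1, 1) for _ in range(3)]
          self.dw_hidden = [[0.0] * 3 for _ in range(2)]   # previous weight changes,
          self.dw_output = [0.0] * 3                       # used for the momentum term

      def forward(self, x1, x2):
          # Phase 1: feed activations forward through the network.
          self.inputs = [x1, x2, 1.0]
          self.hidden = [sigmoid(sum(w * a for w, a in zip(row, self.inputs)))
                         for row in self.w_hidden]
          self.h_plus_bias = self.hidden + [1.0]
          self.output = sigmoid(sum(w * a for w, a in zip(self.w_output, self.h_plus_bias)))
          return self.output

      def backward(self, x1, x2, target):
          # Phase 2: send the error backwards and adjust every weight.
          out = self.forward(x1, x2)
          # Output delta: error times the derivative of the sigmoid.
          delta_out = (target - out) * out * (1.0 - out)
          # Hidden deltas: each hidden unit's share of the output error,
          # again times the derivative of its own sigmoid.
          delta_hidden = [delta_out * self.w_output[j]
                          * self.hidden[j] * (1.0 - self.hidden[j])
                          for j in range(2)]
          # Weight change = learning rate * delta * incoming activation,
          # plus momentum times the previous change.
          for j in range(3):
              self.dw_output[j] = (self.lr * delta_out * self.h_plus_bias[j]
                                   + self.momentum * self.dw_output[j])
              self.w_output[j] += self.dw_output[j]
          for j in range(2):
              for i in range(3):
                  self.dw_hidden[j][i] = (self.lr * delta_hidden[j] * self.inputs[i]
                                          + self.momentum * self.dw_hidden[j][i])
                  self.w_hidden[j][i] += self.dw_hidden[j][i]

  # Train on XOR, a problem that is not linearly separable.
  net = TinyNet()
  patterns = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
  for epoch in range(5000):
      for x1, x2, target in patterns:
          net.backward(x1, x2, target)
  for x1, x2, target in patterns:
      print(x1, x2, round(net.forward(x1, x2), 2), "target", target)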

Advantages of Back-prop

  a. It works quite well in practice.
  b. Local computation is biologically plausible.

Problems with Back-prop

  a. Fixed topology
      i. There is no good way to know how many hidden units to use; too many can overfit.
  b. Step-size problem
      i. How much should the weights be adjusted so that the network learns quickly but does not overstep an optimum?
      ii. Back-prop tends to learn slowly because of this.
  c. Moving target problem
      i. A hidden layer may solve one aspect of a problem but then be unable to solve a different aspect of the same problem.
  d. Catastrophic interference
      i. Because learning is distributed across the whole network, learning new things can come at the expense of old ones.
  e. A strictly feed-forward architecture is biologically implausible.

Successfully solved problems/examples

Ways to change genotype

There are essentially three methods of reproduction: sexually through crossovers and grafts, or asexually through mutation. The following descriptions make use of the graph representation of individuals.

Crossovers For sexual reproduction, two parent programs are required. In a genetic crossover, the parents' nodes are lined up next to each other, and the source of genetic information then alternates between the parents. Crossover can leave the offspring with neural connections that go nowhere, or with nodes inherited from a parent that end up unconnected; in either case, the extra nodes and connections are discarded. This type of reproduction is useful because it ensures that the offspring gets some of the best parts of both parents, putting it in a position for success while simultaneously allowing for mutation and variation.

Grafts In grafting, after the parent nodes are lined up and copied, a random node in the first parent's copy is connected to a random node in the second parent's copy. Nodes that end up unconnected are removed.

The concept behind sexual reproduction is that only the best pieces of various programs survive over multiple generations.

Mutations Each newly created program is mutated, or randomly adjusted, in various ways. For example, a new node or connection can be randomly appended to the graph, or the values of existing nodes and connections can be altered or multiplied by -1.

Mutations, crossovers, and grafts can lead to the loss of useful connections and often produce useless offspring. Their existence is justified, however, because diversity in populations is essential: programs need to change in order for us to be able to select for the best ones. Useless offspring will die out because of their low fitness, and measures can be taken to prevent the loss of useful connections; for example, the evolutionary process can retain the best individuals of each generation.
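
These operators are easiest to see on a simplified genotype. Below is a toy sketch in Python that treats the genotype as a flat list of connection weights rather than the graph representation described above (the operator details, switch probabilities, and mutation rates are illustrative assumptions, not the paper's exact algorithms):

  import random

  def crossover(parent_a, parent_b):
      # Build a child gene by gene, alternating the source parent at
      # random crossover points (assumes parents have equal length).
      child = []
      source = parent_a
      for i in range(len(parent_a)):
          if random.random() < 0.2:             # occasionally switch parents
              source = parent_b if source is parent_a else parent_a
          child.append(source[i])
      return child

  def graft(parent_a, parent_b):
      # Copy the first parent, then attach a random chunk of the second onto it
      # (the child genotype can grow, as when two graphs are grafted together).
      cut = random.randrange(len(parent_b))
      return list(parent_a) + list(parent_b[cut:])

  def mutate(genotype, rate=0.1):
      # Randomly perturb genes; occasionally flip a gene's sign (multiply by -1).
      child = []
      for gene in genotype:
          if random.random() < rate:
              gene = gene + random.gauss(0.0, 0.5)
          if random.random() < rate / 2:
              gene = -gene
          child.append(gene)
      return child

  # Toy genotypes: flat lists of connection weights.
  a = [0.5, -0.3, 0.8, 0.1]
  b = [-0.2, 0.9, -0.6, 0.4]
  print(mutate(crossover(a, b)))
  print(mutate(graft(a, b)))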

Week 2 Summary

Week 3 Summary

Week 4 Summary

Week 5 Summary

Week 6 Summary

Week 7 Summary
