In his 1990 paper "Finding Structure in Time," Elman argues that time underlies many interesting human behaviors, including language. He describes several experiments that use a simple recurrent network to discover the structure present in sequences of data. The networks are trained using back-propagation to predict the next item in the sequence. The first experiment relates to phonology. We will reproduce this experiment, then modify it and test whether the simple recurrent network can discover the structure present in more complicated sequences.
In Elman's first experiment, the sequence of data was made up of three consonants (b, d, g) and three vowels (a, i, u). He randomly generated a sequence of length 1000 using only the consonants. Then he made the following replacements:
b -> ba
d -> dii
g -> guuu

Only some aspects of this sequence are predictable. The vowels are always predictable, but the consonants are not. In this experiment, the network's task is shown below. Given d, it should predict i. Given the first i, it should predict the second i. Given the second i, it should predict a consonant, but it cannot know which one. Given that next consonant, it should predict the appropriate vowels.
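The generation procedure can be sketched as follows (the function name and seed are assumptions for illustration; the actual ElmanExp1.py may do this differently):

```python
import random

# Elman's substitution rules: each consonant expands to a fixed substring.
rules = {"b": "ba", "d": "dii", "g": "guuu"}

def make_sequence(n_consonants=1000, seed=0):
    """Generate a random consonant string, then apply the replacements."""
    rng = random.Random(seed)
    consonants = [rng.choice("bdg") for _ in range(n_consonants)]
    return "".join(rules[c] for c in consonants)

seq = make_sequence(10)
print(seq)  # a short sample of the letter sequence
```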
Input:  diiguuudiibabaguuudii...
Output: iiguuudiibabaguuudii...
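Constructing the training pairs amounts to a one-letter shift: the target at each time step is simply the next input letter. A minimal sketch (the helper name is hypothetical):

```python
def make_pairs(seq):
    # Input at time t is seq[t]; the prediction target is seq[t + 1].
    return list(zip(seq[:-1], seq[1:]))

pairs = make_pairs("badii")
# pairs == [("b", "a"), ("a", "d"), ("d", "i"), ("i", "i")]
```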
Elman encoded each letter as a 6-bit vector, with one bit per phonetic feature (Consonant, Vowel, Interrupted, High, and so on). Notice that each letter has a unique pattern, but that the consonants share a pattern that distinguishes them from the vowels.
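The encoding can be written as a lookup table. The feature values below follow my reading of Elman's Table 1 and should be double-checked against the paper; what matters for the experiment is that each letter is unique and that the Consonant/Vowel bits cleanly separate the two classes:

```python
# One bit per phonetic feature, in the order:
# [Consonant, Vowel, Interrupted, High, Back, Voiced].
# Values are my reconstruction of Elman (1990), Table 1 -- verify them.
ENCODING = {
    "b": (1, 0, 1, 0, 0, 1),
    "d": (1, 0, 1, 1, 0, 1),
    "g": (1, 0, 1, 0, 1, 1),
    "a": (0, 1, 0, 0, 1, 1),
    "i": (0, 1, 0, 1, 0, 1),
    "u": (0, 1, 0, 1, 1, 1),
}

def encode(seq):
    """Map a letter string to a list of 6-bit feature vectors."""
    return [ENCODING[letter] for letter in seq]
```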
After training the network, Elman examined the fluctuations in the overall error and found that it consistently drops at the points in the sequence that are predictable and rises at the points that are not. This demonstrates that the network picked up on the structure of the letter sequence. In addition, Elman examined the error on a unit-by-unit basis and found that units that were consistently predictable (such as the units representing Consonant, Vowel, Interrupted, and Voiced) showed consistently low error throughout testing.
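The architecture behind these results can be sketched in plain Python: at each step the hidden activations are copied into context units, which feed back into the hidden layer on the next step and give the network a memory of the sequence. Layer sizes and weight initialization below are illustrative, not Elman's actual settings:

```python
import math
import random

random.seed(0)
N_IN, N_HID, N_OUT = 6, 8, 6  # 6-bit letters in and out; hidden size assumed

def rand_matrix(rows, cols):
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

W_ih = rand_matrix(N_HID, N_IN)    # input  -> hidden
W_ch = rand_matrix(N_HID, N_HID)   # context -> hidden (the recurrence)
W_ho = rand_matrix(N_OUT, N_HID)   # hidden -> output

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def forward(inputs):
    """Run the simple recurrent network over a sequence of 6-bit vectors."""
    context = [0.0] * N_HID            # context units start at zero
    outputs = []
    for x in inputs:
        pre = [a + b for a, b in zip(matvec(W_ih, x), matvec(W_ch, context))]
        hidden = [sigmoid(p) for p in pre]
        outputs.append([sigmoid(o) for o in matvec(W_ho, hidden)])
        context = hidden               # copy hidden state into context units
    return outputs
```

Training adjusts the three weight matrices with back-propagation so that each output approximates the encoding of the next letter; this sketch shows only the forward pass.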
python ElmanExp1.py &

This will train the simple recurrent network for 20 epochs on the letter sequence task. The network is trying to predict the next letter in the sequence. Notice that two types of performance measures are reported: percent correct by output unit, and percent correct on the entire output pattern. Remember that even though the network cannot accurately predict which consonant is coming next, it does know when a consonant is coming. Therefore some of the individual output units are predictable, although the entire pattern is not.
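The two performance measures can be computed as in the sketch below (the function name and 0.5 threshold are assumptions; the actual script may score differently). A unit is counted correct when its thresholded output matches the target bit; a whole pattern is correct only when every unit is:

```python
def percent_correct(targets, outputs, threshold=0.5):
    """Return (percent correct per output unit, percent of whole
    output patterns that are entirely correct)."""
    n_units = len(targets[0])
    unit_hits = [0] * n_units
    pattern_hits = 0
    for target, output in zip(targets, outputs):
        bits = [int(y >= threshold) for y in output]
        hits = [b == t for b, t in zip(bits, target)]
        for i, h in enumerate(hits):
            unit_hits[i] += h
        pattern_hits += all(hits)
    n = len(targets)
    return [100.0 * h / n for h in unit_hits], 100.0 * pattern_hits / n
```

This makes the point in the text concrete: a unit like Consonant can score near 100% even while the whole-pattern score stays low, because the identity of the next consonant is unpredictable.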
python TestExp1.py

This will run quickly and print the test string used.
xgraph -P err*

The test string used was "badiiguuubadii". The first point on the graph therefore corresponds to the network seeing "b" as input and trying to predict the next letter, which is "a". Print out this graph and label each point with the letter that the network is trying to predict. You should see that the total error rises at the points where the network must predict a consonant and drops at the points where it is predicting vowels. You should also see that the error on the Consonant bit stays consistently low, while the error on the High bit fluctuates in step with the total error. You have now reproduced Elman's first experiment and shown that the simple recurrent network was able to recognize the structure in the letter sequence.
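The plotted curves can be understood as prediction errors at each time step. A sketch of how the total error and the per-unit errors might be computed, assuming a squared-error measure (the actual script may use a different one):

```python
def squared_errors(targets, outputs):
    """Per-time-step errors of the kind plotted with xgraph:
    one total error per step, plus each output unit's own error."""
    total, per_unit = [], []
    for target, output in zip(targets, outputs):
        errs = [(t - y) ** 2 for t, y in zip(target, output)]
        total.append(sum(errs))   # the "total error" curve
        per_unit.append(errs)     # one curve per output bit
    return total, per_unit
```

The total-error curve spikes wherever the target is an unpredictable consonant, while a single unit's curve (e.g. the Consonant bit) can remain flat throughout.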
emacs ElmanExp1.py

Notice that there are three lines at the top of the file listing the possible substrings within the sequence. Change these in some interesting way; however, you must restrict the letters you use to be the same as Elman's letters (b, d, g, a, i, or u). Here are some possible modifications:
emacs TestExp1.py

Notice that there is a line at the top containing a test letter sequence. Change this sequence to test your changes. Save the file and exit the editor. Re-run the test, and then view the resulting graph. Print this graph.