$ python3 textinfo.py
This program will identify the words that most distinguish
one text corpus from another.

Here are the files in the text folder:
[ 0] ALL.txt
[ 1] AUSTEN.txt
[ 2] BRUNTON.txt
[ 3] BURNEY.txt
[ 4] CANON.txt
[ 5] CHAWTON.txt
[ 6] DEFOE.txt
[ 7] FIELDING.txt
[ 8] PAMELAPARODIES.txt
[ 9] RICHARDSON.txt
[10] SCOTT.txt
[11] SHELLEY.txt
[12] SINGERMENDENHALL.txt
[13] anonymous_beer.txt
[14] anonymous_kid.txt
[15] austen_emma.txt
[16] austen_lady_susan.txt
[17] austen_mansfield_park.txt
[18] austen_northanger_abbey.txt
[19] austen_persuasion.txt
[20] austen_pride_and_prejudice.txt
[21] austen_sense_and_sensibility.txt
[22] brunton_discipline.txt
[23] brunton_self-control.txt
[24] burney_cecilia_1.txt
[25] burney_cecilia_2.txt
[26] burney_cecilia_3.txt
[27] burney_evelina.txt
[28] burney_the_wanderer_1.txt
[29] burney_the_wanderer_2.txt
[30] burney_the_wanderer_3.txt
[31] burney_the_wanderer_4.txt
[32] burney_the_wanderer_5.txt
[33] scott_bride_of_lammermoor.txt
[34] scott_ivanhoe.txt
[35] scott_rob_roy.txt
[36] scott_the_lady_of_the_lake.txt
[37] scott_waverly.txt
[38] shelley_frankenstein.txt
[39] shelley_mathilda.txt
[40] shelley_the_last_man.txt
[41] taylor_twinkle.txt

To get started, you may want to try austen_emma.txt and austen_northanger_abbey.txt

Enter choice for the 1st corpus: 38
Enter choice for the 2nd corpus: 39

The 10 most frequent words in /data/cs21/novels/shelley_frankenstein.txt are:
Frequency | Word
----------|-------------------------
     4194 | the
     2976 | and
     2850 | i
     2642 | of
     2094 | to
     1776 | my
     1391 | a
     1129 | in
     1021 | was
     1018 | that

The 10 most frequent words in /data/cs21/novels/shelley_mathilda.txt are:
Frequency | Word
----------|-------------------------
     1591 | the
     1488 | i
     1282 | and
     1092 | to
     1065 | of
      813 | my
      666 | a
      656 | that
      571 | was
      557 | me

The 10 most prevalent words in /data/cs21/novels/shelley_frankenstein.txt relative to /data/cs21/novels/shelley_mathilda.txt
Score  | Word
-------|-------------------------
  18.0 | forever
  17.5 | discovered
  12.9 | cousin
  12.3 | brother
  11.8 | m
  11.8 | home
  11.8 | cottagers
  11.3 | stranger
  11.1 | ice
  10.3 | creator

The 10 most prevalent words in /data/cs21/novels/shelley_mathilda.txt relative to /data/cs21/novels/shelley_frankenstein.txt
Score  | Word
-------|-------------------------
  29.2 | heath
  19.4 | brow
  15.6 | flower
  15.6 | laid
  15.6 | pour
  15.6 | sixteen
  13.6 | contemplation
  13.6 | dazzling
  13.6 | existed
  13.6 | gardens

Choose a word to see it in context: dazzling

Here are the occurences of dazzling in /data/cs21/novels/shelley_frankenstein.txt:
------------------------------------
so soon as the dazzling light vanished the oak
------------------------------------
Here are the occurences of dazzling in /data/cs21/novels/shelley_mathilda.txt:
------------------------------------
from the slant and dazzling beams of the descending
their sight after the dazzling light the oak no
his surpassing beauty the dazzling fire of his eyes
would have been too dazzling for me when we
plumage of a bird dazzling as lightning and like
whose light was too dazzling gay to be reflected
was blue but not dazzling like that of rome

Choose a word to see it in context: flower

Here are the occurences of flower in /data/cs21/novels/shelley_frankenstein.txt:
------------------------------------
the first little white flower that peeped out from
------------------------------------
Here are the occurences of flower in /data/cs21/novels/shelley_mathilda.txt:
------------------------------------
these lovely solitudes gathering flower after flower ond era
solitudes gathering flower after flower ond era pinta tutta
here like a decaying flower still withering under his
blighting influence as no flower so sweet ever did
or plant one fair flower let that be motive
down to gather a flower for my wreath on
bleak plain where no flower grew when i awoke
pluck from thence a flower and lay it to

Choose a word to see it in context:

The program ends when the user presses Enter at the prompt without typing a word.
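
The context lookup and the exit on empty input can be sketched as follows. This is only an illustration, not the program's actual code. The function names `contexts` and `context_loop`, the `corpora` dictionary, and the `read_line` and `show` parameters are all hypothetical. The four-word window is inferred from the sample run, where each context line shows four words on either side of the chosen word. The sketch also assumes each corpus has already been read into a list of lowercase words with punctuation removed.

```python
def contexts(words, target, width=4):
    """Return one string per occurrence of target in words.

    Each string holds the target plus up to `width` words on either side.
    The width of 4 is an assumption inferred from the sample output.
    """
    found = []
    for i, word in enumerate(words):
        if word == target:
            start = max(0, i - width)  # do not run off the front of the list
            found.append(" ".join(words[start:i + width + 1]))
    return found


def context_loop(corpora, read_line=input, show=print):
    """Ask for words until the user enters an empty line.

    corpora is a hypothetical dict mapping a file name to its word list.
    read_line and show default to input and print; they are parameters
    only so the loop can be exercised without a keyboard.
    """
    while True:
        word = read_line("Choose a word to see it in context: ").strip().lower()
        if word == "":
            break  # empty input ends the program
        for name, words in corpora.items():
            show("Occurrences of %s in %s:" % (word, name))
            for line in contexts(words, word):
                show(line)
```

Calling `contexts("these lovely solitudes gathering flower after flower ond era pinta tutta".split(), "flower")` returns two overlapping lines, one for each occurrence, which matches the first two lines shown for `flower` in the sample run.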