day1/exercise/word_frequencies.py
author Prabhu Ramachandran <prabhu@aero.iitb.ac.in>
Sat, 19 Jun 2010 01:27:20 -0400
branch scipy2010
changeset 409 4442da6bf693
parent 380 669b72283b55
permissions -rw-r--r--
ENH: Minor cleanup. Also added slide to introduce IPython's %timeit and %time.
380
669b72283b55 Updated after Day 2 at GRDCS
Santosh G. Vattam <vattam.santosh@gmail.com>
parents: 64
f = open('/home/vattam/Desktop/circulate/word-freq/holmes.txt')
64
333092b68926 Added quiz tex file and all exercise problems Madhu worked out.
Madhusudan.C.S <madhusudancs@gmail.com>
freq = {}
for line in f:
    words = line.split()
    for word in words:
        key = word.strip(',.!;?\'" ')
        if key in freq:
            freq[key] += 1
        else:
            freq[key] = 1
# The parenthesized form prints the dict under both Python 2 and Python 3.
print(freq)
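
The counting loop above can also be sketched as a reusable function, which makes it possible to try the logic on a few in-memory lines instead of the hard-coded file path. This is only a sketch, assuming Python 3 and the standard library's `collections.Counter`. The names `word_frequencies`, `PUNCTUATION`, and `sample` are hypothetical and not part of the exercise, and skipping empty keys is an added assumption that the script itself does not make.

```python
from collections import Counter

# Same characters the script strips from both ends of each word.
PUNCTUATION = ',.!;?\'" '


def word_frequencies(lines):
    """Count words in an iterable of text lines (hypothetical helper)."""
    freq = Counter()
    for line in lines:
        for word in line.split():
            key = word.strip(PUNCTUATION)
            # Assumption: drop tokens that were only punctuation.
            # The original script would count them under the key ''.
            if key:
                freq[key] += 1
    return freq


# Hypothetical sample input standing in for the lines of the text file.
sample = ["The dog, the cat; the bird.", "A dog? The end!"]
counts = word_frequencies(sample)

# Show the three most frequent words with their counts.
for word, count in counts.most_common(3):
    print(word, count)
```

Like the script, this sketch is case-sensitive, so `The` and `the` are counted separately. To run it on a file, the function could be called as `word_frequencies(f)` inside a `with open(path) as f:` block, where `path` points at the text file.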