Edits to statistics.txt.
--- a/statistics.txt Tue Apr 13 12:24:44 2010 +0530
+++ b/statistics.txt Tue Apr 13 14:26:52 2010 +0530
@@ -1,16 +1,16 @@
-Hello and welcome to the tutorial on handling large data files and processing them to get desired results.
+Hello and welcome to the tutorial on handling large data files and processing them.
Till now we have covered:
* How to create plots.
-* How to read data from file and process it.
+* How to read data from files and process it.
-In this session, we will use them and some new concepts to solve a problem/exercise.
+In this session, we will use these concepts and some new ones to solve a problem/exercise.
We have a file named sslc.txt.
It contains record of students and their performance in one of the State Secondary Board Examination. It has 180, 000 lines of record. We are going to read it and process this data.
We can see the content of file by opening with any text editor.
Please don't edit the data.
-This file has a particular structure. Each line in the file is a set of 11 fields:
+This file has a particular structure. Each line in the file is a set of 11 fields separated by semicolons:
A;015163;JOSEPH RAJ S;083;042;47;AA;72;244;;;
The following are the fields in any given line.
* Region Code which is 'A'
@@ -43,9 +43,9 @@
We earlier used lists briefly. Back then we just created lists and appended items into them.
x = [1, 4, 2, 7, 6]
In order to access any element in a list, we use its index number. Index starts from 0.
-For eg. x[0] will give 1 and x[3] will 7.
+For example, x[0] will give 1 and x[3] will give 7.
-There are times when we can't access things through integer indexes. For example consider a telephone directory, we give it a name and it should return back corresponding number. List is not the best kind of data structure for such problems, and hence Python provides support for dictionaries. Dictionaries are key value pairs. Lists are indexed by integers while dictionaries are indexed by strings. For example:
+But using integer indexes isn't always convenient. For example, consider a telephone directory: we give it a name and it should return the corresponding number. A list is not well suited to such problems; Python's dictionaries are a better fit. Dictionaries are just collections of key-value pairs. For example:
d = {'png' : 'image',
'txt' : 'text',
@@ -55,7 +55,7 @@
d is a dictionary. The first element in the pair is called the `key' and the second is called the `value'. The key always has to be a string while the value can be of any type.
-Dictionaries are indexed using their keys as shown
+Lists are indexed by integers, while dictionaries are indexed by their keys (strings, in this example), as shown:
In []: d['txt']
Out[]: 'text'
@@ -69,10 +69,10 @@
'jpg' in d
False
-Please note the values cannot be searched in a dictionaries.
+Please note that keys, and not values, are searched.
'In a telephone directory one can search for a number based on a name, but not for a name based on a number'
-to obtain the list of all keys in a dictionary type
+To obtain the list of all keys in a dictionary, type
d.keys()
['py', 'txt', 'png']