statistics.txt
author amit@thunder
Fri, 02 Apr 2010 19:19:14 +0530
Made changes to incorporate suggestions for adding info about ipython and pointers from spoken tutorials workshop

Hello and welcome to the tutorial on statistics and dictionaries in Python.

In the previous tutorial we saw the `for' loop and lists. Here we shall
calculate the mean for the same pendulum experiment and then move on to calculating
the mean, median and standard deviation for a very large data set.

Let's start with calculating the mean acceleration due to gravity based on the data from pendulum.txt.

We first create an empty list `g_list' to which we shall append the values of `g'.
In []: g_list = []

For each pair of `L' and `t' values in the file `pendulum.txt' we calculate the
value of `g' and append it to the list `g_list'. The constant `pi' is used here
without an import, which assumes that IPython was started with the -pylab option.
In []: for line in open('pendulum.txt'):
 ....:     point = line.split()
 ....:     L = float(point[0])
 ....:     t = float(point[1])
 ....:     g = 4 * pi * pi * L / (t * t)
 ....:     g_list.append(g)
 ....:
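
Outside IPython the same calculation can be written as a small function. The
following is only a sketch: the name `g_values', the skipping of blank lines and
the sample lines are assumptions made for illustration, and the sample lines are
made up, not taken from `pendulum.txt'.

```python
from math import pi  # under `ipython -pylab' pi is already available


def g_values(lines):
    """Return a list with g = 4 * pi^2 * L / t^2 for each 'L t' line."""
    g_list = []
    for line in lines:
        point = line.split()
        if len(point) < 2:
            # Assumption: blank or incomplete lines are skipped.
            continue
        L = float(point[0])
        t = float(point[1])
        g_list.append(4 * pi * pi * L / (t * t))
    return g_list


# Made-up sample lines, only for illustration.
sample = ['1.0 2.0', '', '0.25 1.0']
g_sample = g_values(sample)
```

With the real data the call would be g_values(open('pendulum.txt')), since a file
object yields its lines one at a time just like the list above.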

We proceed to calculate the mean of the values of `g' in the list `g_list'.
Here we shall show three ways of calculating the mean.
First, we calculate the sum `total' of the values in `g_list'.
In []: total = 0
In []: for g in g_list:
 ....:     total += g
 ....:

Once we have the total, we calculate the mean by dividing `total' by the length of `g_list'.

In []: g_mean = total / len(g_list)
In []: print 'Mean: ', g_mean

The second method is slightly simpler. Python provides a built-in function `sum' that computes the sum of all the elements in a list.
In []: g_mean = sum(g_list) / len(g_list)
In []: print 'Mean: ', g_mean

The third method is the simplest. The function `mean' calculates the mean of all
the elements in a list. It is not a Python built-in; it comes from NumPy and is
available directly when IPython is started with the -pylab option.
In []: g_mean = mean(g_list)
In []: print 'Mean: ', g_mean
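
One point to watch with the first two methods: the values in `g_list' are floats,
so the division gives the expected result, but in Python 2 dividing an integer
sum by an integer length drops the fractional part. A minimal sketch that guards
against this is shown here; the name `mean_of' and the readings are hypothetical.

```python
def mean_of(values):
    """Arithmetic mean using only built-in functions."""
    # float() guards against integer division in Python 2,
    # where 3 / 2 evaluates to 1 instead of 1.5.
    return sum(values) / float(len(values))


readings = [1, 2, 4]  # hypothetical integer readings
m = mean_of(readings)
print('Mean: %s' % m)
```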

Python provides support for dictionaries. A dictionary is a collection of key-value pairs. Lists are indexed by integers while dictionaries are indexed by keys, which are most often strings. For example:
In []: d = {'png' : 'image',
 ....:      'txt' : 'text',
 ....:      'py' : 'python'}
is a dictionary. The first element in each pair is called the `key' and the second
is called the `value'. The key can be of any immutable (hashable) type, such as a
string or a number, while the value can be of any type.
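
Strings are the most common keys, but any immutable (hashable) value can serve as
a key. A small sketch with integer and tuple keys follows; the dictionaries
`squares' and `grid' are made up for illustration.

```python
# Integer keys
squares = {1: 1, 2: 4, 3: 9}
# Tuple keys
grid = {(0, 0): 'origin', (1, 0): 'east'}

four = squares[2]
where = grid[(0, 0)]

# A list is mutable, so Python refuses to use it as a key.
list_key_rejected = False
try:
    bad = {[1, 2]: 'not allowed'}
except TypeError:
    list_key_rejected = True
```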

Dictionaries are indexed using their keys, as shown:
In []: d['txt']
Out[]: 'text'

In []: d['png']
Out[]: 'image'
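
Indexing with a key that is not in the dictionary raises a `KeyError'. The
sketch below shows this along with the dictionary method `get', which returns a
fallback value instead; the fallback 'unknown' is chosen only for illustration.

```python
d = {'png': 'image', 'txt': 'text', 'py': 'python'}

# d['jpg'] raises KeyError because 'jpg' is not a key of d.
missing = False
try:
    d['jpg']
except KeyError:
    missing = True

# get() returns the second argument when the key is absent.
kind = d.get('jpg', 'unknown')
known = d.get('txt', 'unknown')
```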

A dictionary can be searched for the presence of a certain key by typing
In []: 'py' in d
Out[]: True

In []: 'jpg' in d
Out[]: False
Please note that `in' checks only the keys; it does not search the values of a dictionary.
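
If a value does have to be looked for, it can be searched in the list returned
by `d.values()'. The following is a sketch of one way to do it, together with a
reverse lookup of the keys that hold a given value.

```python
d = {'png': 'image', 'txt': 'text', 'py': 'python'}

# `in' applied to d.values() searches the values, not the keys.
has_text = 'text' in d.values()
has_audio = 'audio' in d.values()

# Reverse lookup: collect every key whose value is 'text'.
keys_for_text = [key for key in d if d[key] == 'text']
```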

The list of all keys in a dictionary is obtained using
In []: d.keys()
Out[]: ['py', 'txt', 'png']

Similarly, the list of all values in a dictionary is obtained using
In []: d.values()
Out[]: ['python', 'text', 'image']

In []: d
Out[]: {'png': 'image', 'py': 'python', 'txt': 'text'}
Please observe that, in the Python 2 used here, dictionaries do not preserve the
order in which the items were entered. The order of the elements in a dictionary
should not be relied upon.
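
When a predictable order is needed, the keys can be sorted before use. A minimal
sketch using the built-in function `sorted':

```python
d = {'png': 'image', 'txt': 'text', 'py': 'python'}

# sorted(d) returns the keys of d as a new, alphabetically ordered list.
ordered = sorted(d)

for key in ordered:
    print('%s -> %s' % (key, d[key]))
```

The loop prints the pairs for `png', `py' and `txt' in that order, whatever
order the dictionary stores them in internally.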