|
1 Hello friends and welcome to the third tutorial in the series of tutorials on "Python for scientific computing." |
|
2 |
|
3 In the previous tutorial we learnt how to read data from a file and plot the data. |
|
4 We used 'for' loops and lists to get data in desired format. |
|
5 IPython with -pylab also provides a function, 'loadtxt', which gets us the data without much hassle. |
|
6 |
|
7 We know that pendulum.txt contains two columns, with length in the first column and time period in the second. So, to get the two columns in two separate variables, we type |
|
8 |
|
9 l, t = loadtxt('pendulum.txt', unpack=True) |
|
10 |
|
11 unpack=True gives us all of the first column (length) in l and all of the second column (time period) in t. |
|
12 |
|
13 To get more help, type |
|
14 |
|
15 loadtxt? |
|
16 This is a really powerful tool for loading data directly from files which are well structured and formatted. It supports many features, like getting particular columns. |
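As a rough sketch of how loadtxt behaves with unpack=True and with particular columns, the following writes a small made-up file and reads it back. The file name sample_pendulum.txt and the values in it are hypothetical stand-ins for pendulum.txt, and loadtxt is imported from NumPy explicitly instead of relying on the -pylab session. The usecols argument is the one that selects particular columns.

```python
import os
import tempfile

from numpy import loadtxt

# Hypothetical sample data (length, time period); not the real pendulum.txt.
sample = "0.1 0.63\n0.2 0.90\n0.3 1.10\n"

# Write the sample to a temporary file so the example is self-contained.
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "sample_pendulum.txt")
with open(path, "w") as f:
    f.write(sample)

# unpack=True transposes the data, so each column lands in its own variable.
l, t = loadtxt(path, unpack=True)

# usecols picks particular columns; here only the second one (index 1).
t_only = loadtxt(path, usecols=(1,))

print(l)
print(t)
print(t_only)
```

Without unpack=True, loadtxt would return a single two-dimensional array with one row per line of the file.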
|
17 Now, to get the squared values of t, we can simply do |
|
18 |
|
19 tsq = t*t |
|
20 |
|
21 and we don't have to use a 'for' loop anymore. This is the benefit of arrays. If we try to do something similar with lists, we can't escape a 'for' loop. |
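To make the difference between arrays and lists concrete, here is a minimal sketch that squares the same values both ways. The time-period values are made up for illustration, and array is imported from NumPy explicitly.

```python
from numpy import array

# Hypothetical time periods, for illustration only.
t_list = [0.5, 1.0, 1.5]
t = array(t_list)

# With an array, multiplication works element by element.
tsq = t * t

# With a plain list, we need an explicit loop to square each value.
tsq_list = []
for value in t_list:
    tsq_list.append(value * value)

print(tsq)
print(tsq_list)
```

Writing t_list * t_list with the plain lists would raise a TypeError, which is why the loop is needed there.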
|
22 |
|
23 Now plotting l vs tsq is the same as what we did in the previous session |
|
24 |
|
25 plot(l, tsq, 'o') |
|
26 |
|
27 |
|
28 In this tutorial we shall learn how to compute statistics using Python. |
|
29 We shall also learn how to represent data in the form of pie charts. |
|
30 |
|
31 Let us start with the most basic need in statistics, the mean. |
|
32 |
|
33 We shall calculate the mean acceleration due to gravity using the same 'pendulum.txt' that we used in the previous session. |
|
34 |
|
35 As we know, 'pendulum.txt' contains two values in each line, the first being the length of the pendulum and the second the time period. |
|
36 To calculate the acceleration due to gravity from these values, we shall use the expression T = 2*pi*sqrt(L/g). |
|
37 So, re-arranging this equation, we get g = 4*pi**2*L/T**2. |
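One quick way to convince ourselves that the re-arrangement is right is a round trip: start from an assumed g, compute T with the first expression, and recover g with the second. The values and units below are assumptions chosen only for this check; they are not taken from pendulum.txt.

```python
from math import pi, sqrt

# Assumed values, chosen only to check the algebra.
L = 1.0        # length, assumed to be in metres
g_true = 9.8   # acceleration due to gravity, assumed in m/s^2

# Forward expression: time period of a simple pendulum.
T = 2 * pi * sqrt(L / g_true)

# Re-arranged expression: recover g from L and T.
g = 4 * pi**2 * L / T**2

print(T)
print(g)
```

The recovered g should match the assumed one up to floating-point rounding.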
|
38 |
|
39 We shall calculate the value of g for each pair of L and t, and then calculate the mean of all those g values. |
|
40 |
|
41 ## if we do loadtxt and numpy arrays then this part will change |
|
42 First we need something to store each value of g that we are going to compute. |
|
43 So we start by initialising an empty list called 'g_list', by typing g_list = [] |
|
44 |
|
45 Now we read each line from the file 'pendulum.txt', calculate the g value for that pair of L and t, and then append the computed g to our 'g_list'. |
|
46 |
|
47 In []: for line in open('pendulum.txt'): |
|
48 ....:     point = line.split() |
|
49 ....:     L = float(point[0]) |
|
50 ....:     t = float(point[1]) |
|
51 ....:     g = 4 * pi * pi * L / (t * t) |
|
52 ....:     g_list.append(g) |
|
53 |
|
54 The first four lines of this code should be familiar. We read the file and store the values. |
|
55 The fifth line, where we do g equals 4 star pi star and so on, is the line which calculates g for each pair of L and t values from the file. The last line simply stores the computed g value. In technical terms, it appends the computed value to g_list. |
|
56 |
|
57 Let us type this code in and see what g_list contains. |
|
58 ############################### |
|
59 |
|
60 Each value in g_list is the g value computed from a pair of L and t values. |
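For comparison, here is a minimal sketch of how the same computation might look with arrays instead of a list and a loop. The L and t values are hypothetical; in the tutorial they would come from loadtxt('pendulum.txt', unpack=True). array and pi are imported from NumPy explicitly.

```python
from numpy import array, pi

# Hypothetical length and time-period values; in the tutorial these would
# come from loadtxt('pendulum.txt', unpack=True).
L = array([0.1, 0.2, 0.3])
t = array([0.63, 0.90, 1.10])

# One expression computes g for every pair of L and t at once.
g = 4 * pi * pi * L / (t * t)

print(g)
```

Here g is an array with one value per pair of L and t, playing the same role as g_list.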
|
61 |
|
62 Now we have all the values for g. We must find the mean of these values, that is, the sum of all these values divided by the total number of values. |
|
63 |
|
64 The number of values can be found using len(g_list). |
|
65 |
|
66 So we are left with the problem of finding the sum. |
|
67 We shall create a variable, loop over the list and add each g value to that variable. |
|
68 Let's call it total. |
|
69 |
|
70 In []: total = 0 |
|
71 In []: for g in g_list: |
|
72 ....:     total += g |
|
73 ....: |
|
74 |
|
75 So at the end of this piece of code, we will have the sum of all the g values in the variable total. |
|
76 |
|
77 Now calculating the mean of g is as simple as dividing total by len(g_list). |
|
78 |
|
79 In []: g_mean = total / len(g_list) |
|
80 In []: print 'Mean: ', g_mean |
|
81 |
|
82 If we observe, we had to write a loop to do a very simple thing such as finding the sum of a list of values. |
|
83 Python has a built-in function called sum to ease things. |
|
84 |
|
85 sum takes a list of values and returns the sum of those values. |
|
86 Now calculating the mean is much simpler. |
|
87 We don't have to write any 'for' loop. |
|
88 We can directly use g_mean = sum(g_list) / len(g_list) |
|
89 |
|
90 Still, calculating the mean needs writing an expression. |
|
91 What if we had a built-in for calculating the mean directly? |
|
92 We do have one, and it is available through the pylab library. |
|
93 |
|
94 Now the job of calculating the mean is just a function call away. |
|
95 Call mean(g_list) directly and it gives you the mean of the values in g_list. |
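To see the three approaches side by side, here is a small sketch using made-up g values. It assumes that the mean function available in a -pylab session is NumPy's mean, so it imports mean from NumPy directly, and it uses the print function so that it also runs outside an IPython -pylab session.

```python
from numpy import mean

# Hypothetical g values, for illustration only.
g_list = [9.7, 9.8, 9.9]

# 1. Explicit loop that accumulates the sum.
total = 0
for g in g_list:
    total += g
mean_loop = total / len(g_list)

# 2. Built-in sum, no loop needed.
mean_sum = sum(g_list) / len(g_list)

# 3. The mean function does the whole job in one call.
mean_np = mean(g_list)

print(mean_loop, mean_sum, mean_np)
```

All three should agree up to floating-point rounding. Note that storing a result in a variable named mean would hide the mean function, so the results are kept under other names here.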
|
96 |
|
97 Isn't that sweet? Yes, and that is why I use Python. |
|
98 |
|
99 |