author | Shantanu <shantanu@fossee.in> |
Fri, 16 Apr 2010 12:01:01 +0530 | |
changeset 71 | bc3f351aeec9 |
parent 46 | 34df59770550 |
permissions | -rw-r--r-- |
Hello friends and welcome to the third tutorial in the series of tutorials on "Python for scientific computing."

This session is a continuation of the tutorial on Plotting Experimental data.

Here we shall look at plotting experimental data using slightly more advanced methods, and then look into some statistical operations.

In the previous tutorial we learnt how to read data from a file and plot it.
We used 'for' loops and lists to get data in the desired format.
IPython -pylab also provides a function called 'loadtxt' that can get us the same data in the desired format without much hassle.

We shall use the same pendulum.txt file that we used in the previous session.
We know that pendulum.txt contains two columns, with the length in the first column and the time period in the second, so to get both columns into two separate variables we type

l, t = loadtxt('pendulum.txt', unpack=True)

(unpack=True) gives us all the data from the first column, which is the length, in l and all the data from the second column, which is the time period, in t. Here both l and t are arrays. We shall look into what arrays are in subsequent tutorials.
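
To see what unpack=True does without needing the real file, here is a minimal sketch. It assumes NumPy's loadtxt (the function pylab exposes) and uses a small in-memory sample whose numbers are made up for illustration; they are not the contents of pendulum.txt.

```python
import io
import numpy as np

# Hypothetical stand-in for pendulum.txt: length first, time period second.
sample = io.StringIO("0.1 0.63\n0.2 0.90\n0.3 1.10\n")

# unpack=True transposes the result, so each column comes back as its own array.
l, t = np.loadtxt(sample, unpack=True)

print(l)  # the first column (lengths)
print(t)  # the second column (time periods)
```

Without unpack=True, loadtxt would return a single two-dimensional array with one row per line of the file.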

To know more about loadtxt, type

loadtxt?
This is a really powerful tool for loading data directly from files which are well structured and formatted. It supports many features, such as reading only selected columns or skipping rows.
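
As a rough illustration of those two features, the sketch below uses the skiprows and usecols parameters of NumPy's loadtxt. The file layout (a header line and a third column) is an assumption made up for this example.

```python
import io
import numpy as np

# Hypothetical file with a header row and a third column we do not need.
sample = io.StringIO(
    "length period trial\n"
    "0.1 0.63 1\n"
    "0.2 0.90 2\n"
    "0.3 1.10 3\n"
)

# skiprows=1 skips the header line; usecols=(0, 1) keeps only the first two columns.
l, t = np.loadtxt(sample, skiprows=1, usecols=(0, 1), unpack=True)

print(l)
print(t)
```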

Let's get back to the problem; hit q to exit the help. Now to get the squared values of t we can simply do

tsq = t*t

Note that we don't have to use the 'for' loop anymore. This is the benefit of arrays. If we try to do something similar using lists, we won't be able to escape the use of the 'for' loop.
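
The difference can be seen side by side in the following sketch. The time periods are made-up values, and the names t_list and tsq_list are introduced only for this comparison.

```python
import numpy as np

# Hypothetical time periods; the real values come from pendulum.txt.
t_list = [0.63, 0.90, 1.10]

# With a list we have to visit each element ourselves.
tsq_list = []
for value in t_list:
    tsq_list.append(value * value)

# With an array, * acts on every element at once.
t = np.array(t_list)
tsq = t * t

print(tsq_list)
print(tsq)
```

Writing t_list * t_list with plain lists does not square the elements; Python raises a TypeError instead.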

Let's now plot l vs tsq just as we did in the previous session

plot(l, tsq, 'o')

Let's continue with the pendulum experiment to obtain the value of the acceleration due to gravity. The basic equation for finding the time period of a simple pendulum is:

T = 2*pi*sqrt(L/g)

Rearranging this equation, we obtain the value of g as
g = 4 pi squared into L by T squared.
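
For reference, the rearrangement can be written out step by step. This only restates the formula given above, with T the time period, L the length and g the acceleration due to gravity: square both sides, then solve for g.

```latex
\begin{align*}
T   &= 2\pi\sqrt{\frac{L}{g}} \\
T^2 &= 4\pi^2\,\frac{L}{g} \\
g   &= \frac{4\pi^2 L}{T^2}
\end{align*}
```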

In this case we have the values of t and l already, so to find the value of g for each element we can simply use:

g = 4*pi**2*l/t**2

Here g is an array; we can take the average of all these values to get the acceleration due to gravity ('g') by

print(mean(g))

mean is again provided by the pylab module; it calculates the average of the given set of values.
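
A minimal end-to-end sketch of this calculation is given below. Since the contents of pendulum.txt are not reproduced here, the lengths are made up and the time periods are generated from the pendulum formula with an assumed g of 9.8, so the mean should simply come back as that value. G_ASSUMED and g_mean are names chosen for this example.

```python
import numpy as np

G_ASSUMED = 9.8  # used only to generate the synthetic sample data

# Hypothetical lengths in metres; periods follow T = 2*pi*sqrt(L/g) exactly.
l = np.array([0.1, 0.2, 0.3, 0.4, 0.5])
t = 2 * np.pi * np.sqrt(l / G_ASSUMED)

# One estimate of g per (l, t) pair, with no explicit loop.
g = 4 * np.pi**2 * l / t**2

g_mean = np.mean(g)
print(g_mean)
```

With real measurements the individual g values would scatter around the true value, which is why we average them.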
There are other handy statistical functions available, such as median and std (for standard deviation). A mode function is not part of pylab itself; scipy.stats provides one.
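
Here is a small sketch of these functions on made-up values chosen so the results are easy to check by hand. It assumes NumPy's mean, median and std, and for the mode it uses the standard library's statistics.mode purely to keep the example free of extra dependencies.

```python
import statistics
import numpy as np

# Hypothetical values, picked so each statistic is easy to verify by hand.
values = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

print(np.mean(values))    # 5.0
print(np.median(values))  # 4.5
print(np.std(values))     # 2.0, the population standard deviation
print(statistics.mode(values.tolist()))  # 4.0, the most frequent value
```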

In this short session we have covered a better way of loading data from text files.
We have also seen why arrays are a better choice than lists in some cases, and how they are more helpful with mathematical operations.

Hope it was useful to you. Thank you!
-----------------------------------------------------------------------------------------------------------
In this tutorial we shall learn how to compute statistics using Python.
We shall also learn how to represent data in the form of pie charts.

Let us start with the most basic need in statistics, the mean.

We shall calculate the mean acceleration due to gravity using the same 'pendulum.txt' that we used in the previous session.

As we know, 'pendulum.txt' contains two values in each line: the first is the length of the pendulum and the second is the time period.
To calculate the acceleration due to gravity from these values, we shall use the expression T = 2*pi*sqrt(L/g)
So re-arranging this equation, we get g = 4*pi**2*L/T**2 .

We shall calculate the value of g for each pair of L and t and then calculate the mean of all those g values.

## if we do loadtxt and numpy arrays then this part will change
First we need something to store each value of g that we are going to compute.
So we start by initialising an empty list called 'g_list'.

In []: g_list = []

Now we read each line from the file 'pendulum.txt', calculate the g value for that pair of L and t, and then append the computed g to our 'g_list'.

In []: for line in open('pendulum.txt'):
  ....:     point = line.split()
  ....:     L = float(point[0])
  ....:     t = float(point[1])
  ....:     g = 4 * pi * pi * L / (t * t)
  ....:     g_list.append(g)

The first four lines of this code should be familiar: we read the file and store the values.
The fifth line, where we do g equals 4 star pi star and so on, is the line which calculates g for each pair of L and t values from the file. The last line simply stores the computed g value; in technical terms, it appends the computed value to g_list.
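
For trying this out away from the IPython session, the loop can be sketched as a standalone script. Two things here are assumptions for the sake of a runnable example: pi is imported from the math module (in the session it comes from pylab), and an in-memory sample with made-up numbers stands in for open('pendulum.txt').

```python
import io
from math import pi

# Hypothetical stand-in for open('pendulum.txt'); iterating yields one line at a time.
pendulum_file = io.StringIO("0.1 0.63\n0.2 0.90\n0.3 1.10\n")

g_list = []  # must exist before the loop appends to it
for line in pendulum_file:
    point = line.split()           # e.g. ['0.1', '0.63']
    L = float(point[0])            # length
    t = float(point[1])            # time period
    g = 4 * pi * pi * L / (t * t)  # g for this pair of values
    g_list.append(g)

print(g_list)
```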

Let us type this code in and see what g_list contains.
###############################

Each value in g_list is the g value computed from a pair of L and t values.

Now we have all the values for g. We must find the mean of these values, that is, the sum of all these values divided by the total number of values.

The number of values can be found using len(g_list)

So we are left with the problem of finding the sum.
We shall create a variable and loop over the list and add each g value to that variable.
Let's call it total.

In []: total = 0
In []: for g in g_list:
  ....:     total += g
  ....:

So at the end of this piece of code we will have the sum of all the g values in the variable total.

Now calculating the mean of g is as simple as dividing total by len(g_list)

In []: g_mean = total / len(g_list)
In []: print('Mean: ', g_mean)

As we can see, we had to write a loop to do a very simple thing: finding the sum of a list of values.
Python has a built-in function called sum to ease things.

sum takes a list of values and returns the sum of those values.
Now calculating the mean is much simpler.
We don't have to write any for loop.
We can directly use g_mean = sum(g_list) / len(g_list)
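
The two approaches can be compared directly in a short sketch. The values in g_list below are made up for illustration; in the session the list is built from pendulum.txt.

```python
# Hypothetical g values; in the tutorial g_list is built from pendulum.txt.
g_list = [9.5, 10.0, 9.9]

# Explicit loop, as in the earlier step.
total = 0
for g in g_list:
    total += g

# The built-in sum gives the same total in one call.
g_mean = sum(g_list) / len(g_list)

print('Total:', total)
print('Mean:', g_mean)
```

The result is stored in g_mean rather than mean so that the name mean stays free for the pylab function of the same name.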

Still, calculating the mean needs writing an expression.
What if we had a built-in for calculating the mean directly?
We do have one, and it is available through the pylab library.

Now the job of calculating the mean is just a function away.
Call mean(g_list) directly and it gives you the mean of the values in g_list.
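
As a quick check that the function agrees with the expression, here is a sketch that calls NumPy's mean, on the assumption that this is the mean pylab makes available. The g values and the names by_hand and with_mean are made up for this example.

```python
import numpy as np

# Hypothetical g values; in the tutorial g_list is built from pendulum.txt.
g_list = [9.5, 10.0, 9.9]

by_hand = sum(g_list) / len(g_list)
with_mean = np.mean(g_list)  # accepts a plain list as well as an array

print(by_hand)
print(with_mean)
```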

Isn't that sweet? Yes, and that is why I use Python.