Added script for third session first part.
author Shantanu <shantanu@fossee.in>
Sun, 11 Apr 2010 01:30:44 +0530
changeset 37 c2634d874e33
parent 36 57ed95acb13f
child 38 f248e91b1510
plotting-script.txt
statistics-script
--- a/plotting-script.txt	Sat Apr 10 15:28:52 2010 +0530
+++ b/plotting-script.txt	Sun Apr 11 01:30:44 2010 +0530
@@ -12,27 +12,27 @@
 
 First we shall look into using lists to input the data and then we shall plot it. 
 Type
-x = 0, 1, 2.1, 3.1, 4.2, 5.2 within square brackets.
+x = open square bracket 0, 1, 2.1, 3.1, 4.2, 5.2 close square bracket.
 here x is a list. In python, list is a container that holds a number of objects in the given order. 
 We shall look into other functions related to lists a little later. 
 
 Now for the corresponding Y values type
-y = 0, 0.8, 0.9, 0, -0.9, -0.8 within square brackets.
+y = open square bracket 0, 0.8, 0.9, 0, -0.9, -0.8 close square bracket.
  
 Now that we have x and y in two separate lists and we plot x vs. y using
-plot (x, y, 'o') The o within quotes plots with filled circles. We saw the various style options in the previous tutorial.
+plot (x, y, 'o') The o within quotes plots with filled circles. And lo! We have our plot! 
 
-And lo! We have our plot! 
+
 [We close the plot window. ] 
 
 Now, that we know how to plot data from lists, we will look at plotting data from a text file. Essentially, if we read the data from the file and fit them into lists, we can easily plot the data, just as we did previously. 
 
-Here we shall use the data collected from a simple pendulum experiment as an example. 
+Here we shall use the data collected from a simple pendulum experiment as an example. The aim of the experiment is to plot the length versus the square of the time period.
 Let us check out what pendulum.txt contains. Type cat pendulum.txt
 
 Windows users just double click on the file to open it. Please be careful not to edit the file.
 
-The first column is the length of the pendulum and the second column is the time. We read the file line-by-line, collect the data into lists and plot them.
+The first column is the length of the pendulum and the second column is the time period. We read the file line-by-line, collect the data into lists and plot them.
 
 Let's begin with initializing three empty lists for length, time-period and square of the time-period.
 l = []
@@ -42,16 +42,16 @@
 Initializing an empty list is done as shown above using just a pair of square brackets without any content in them.
 
 Now we open the file and read it line by line. 
-for line in open('pendulum.txt'): 
+for line in open('pendulum.txt'):  [the file name goes within quotes, inside the brackets]
 
 The ':' at the end of the 'for' statement marks the beginning of the for block.
 'open' returns an iterable object which we traverse using the 'for' loop. In  python, 'for' iterates over items of a sequence.
 For more details regarding the for loop refer to our tutorial on loops and data structures.
-'line' here is a string variable that contains one line of the file at a time as the 'for' loop iterates through the file.
+Whatever we read from a file is in the form of strings. Thus 'line' here is a string variable that contains one line of the file at a time as the 'for' loop iterates through the file.
 
 We split each line at the space using
      point = line.split() 
-the split function returns a list of elements from the 'line' variable split over spaces. In this case it will have two elements, first is length and second is time. 
+the split function returns a list of elements from the 'line' variable split over spaces. In this case 'point' contains two elements: the first is the length and the second is the time period.
 
 Note the indentation here. Everything inside the 'for' loop has to be indented by 4 spaces.
 Then we append the length and time values to the appropriate lists. Since we cannot perform mathematical operations on strings, we need to convert the strings to floats, before appending to the lists. 
@@ -59,6 +59,8 @@
 append is a function used to append a single element to a list.
     t.append(float(point[1]))
 
+That's it, now we need to exit the loop. Hit the enter key twice.
+
 Now we have the time and length values in two lists. Now to get the square of the time values, we shall write one more 'for' loop which will iterate through list 't'
 
 for time in t:
@@ -70,5 +72,9 @@
 Now we have verified that all three have the same dimensions. lists l and tsq have the required data. Let's now plot them, as we did earlier. 
 plot(l, tsq, 'o')
 
-So here is the required plot. 
-In this way, you can plot data from files.  Hope this information was helpful. Thank you.
+So here is the required plot. We may proceed to label the axes, title the plot and save it. 
+
+In this tutorial we have learnt how to create lists and append items to them. We have learnt how to process data using lists, how to open and read files, and how to use the 'for' loop.
+
+That brings us to the end of this session.
+Hope this information was helpful. Thank you.
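
The steps described in the script above (read line by line, split, convert to float, append, then square) can be sketched as one small program. This is only an illustrative sketch in Python 3, not the tutorial's own session: the helper name read_columns and the three sample rows are made up, and an in-memory io.StringIO object stands in for open('pendulum.txt') so that the example runs without the data file. Skipping blank lines is an added assumption.

```python
import io


def read_columns(handle):
    """Collect the two whitespace-separated columns of a file into two lists of floats."""
    l = []
    t = []
    for line in handle:
        point = line.split()
        if not point:
            # Assumption: blank lines are skipped rather than treated as errors.
            continue
        # Whatever is read from a file is a string, so convert before appending.
        l.append(float(point[0]))
        t.append(float(point[1]))
    return l, t


# Made-up rows in the same layout as pendulum.txt (length, time period).
sample = io.StringIO("0.1 0.5\n0.2 1.0\n0.3 1.5\n\n")
l, t = read_columns(sample)

# Square each time period with a second 'for' loop, as in the script.
tsq = []
for time in t:
    tsq.append(time * time)

# All three lists should have the same length before plotting.
print(len(l), len(t), len(tsq))
```

With real data, passing open('pendulum.txt') to read_columns in place of the sample object would give the lists that the script then hands to plot(l, tsq, 'o').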
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/statistics-script	Sun Apr 11 01:30:44 2010 +0530
@@ -0,0 +1,99 @@
+Hello friends and welcome to the third tutorial in the series of tutorials on "Python for scientific computing."
+
+In the previous tutorial we learnt how to read data from a file and plot the data.
+We used 'for' loops and lists to get the data in the desired format.
+IPython with the -pylab option also provides a function 'loadtxt' which can get us the data without much hassle.
+
+We know that pendulum.txt contains two columns, with the length in the first column and the time period in the second. To get both columns in two separate variables we type
+
+l, t = loadtxt('pendulum.txt', unpack=True)
+
+unpack=True gives us all of the first column (length) in l and all of the second column (time period) in t.
+
+to get more help type 
+
+loadtxt?
+This is a really powerful tool for loading data directly from files which are well structured and formatted. It supports many features, such as reading only particular columns.
+Now to get the squared values of t we can simply do
+
+tsq = t*t
+
+and we don't have to use a 'for' loop anymore. This is the benefit of arrays. If we try to do something similar with lists, we can't escape a 'for' loop.
+
+Plotting l vs tsq is now the same as in the previous session
+
+plot(l, tsq, 'o')
+
+
+In this tutorial we shall learn how to compute statistics using python.
+We also shall learn how to represent data in the form of pie charts.
+
+Let us start with the most basic need in statistics, the mean.
+
+We shall calculate the mean acceleration due to gravity using the same 'pendulum.txt' that we used in the previous session.
+
+As we know, 'pendulum.txt' contains two values in each line. The first is the length of the pendulum and the second is the time period.
+To calculate acceleration due to gravity from these values, we shall use the expression T = 2*pi*sqrt(L/g)
+So re-arranging this equation, we get g = 4*pi**2*L/T**2 .
+
+We shall calculate the value of g for each pair of L and t and then calculate mean of all those g values.
+
+## if we do loadtxt and numpy arrays then this part will change
+	First we need something to store each value of g that we are going to compute.
+	So we start with initialising an empty list called `g_list' by typing g_list = []
+
+	Now we read each line from the file 'pendulum.txt' and calculate g value for that pair of L and t and then append the computed g to our `g_list'.
+
+	In []: for line in open('pendulum.txt'):
+	  ....     point = line.split()
+	  ....     L = float(point[0])
+	  ....     t = float(point[1])
+	  ....     g = 4 * pi * pi * L / (t * t)
+	  ....     g_list.append(g)
+
+	The first four lines of this code should be familiar. We read the file and store the values. 
+	The fifth line, where we do g equals to 4 star pi star and so on, is the line which calculates g for each pair of L and t values from the file. The last line simply stores the computed g value. In technical terms, it appends the computed value to g_list.
+
+	Let us type this code in and see what g_list contains.
+###############################
+
+Each value in g_list is the g value computed from a pair of L and t values.
+
+Now we have all the values for g. We must find the mean of these values. That is the sum of all these values divided by the total number of values.
+
+The number of values can be found using len(g_list)
+
+So we are left with the problem of finding the sum.
+We shall create a variable and loop over the list and add each g value to that variable.
+Let's call it total.
+
+In []: total = 0 
+In []: for g in g_list:
+ ....:     total += g
+ ....:
+
+So at the end of this piece of code we will have the sum of all the g values in the variable total.
+
+Now calculating mean of g is as simple as doing total divided by len(g_list)
+
+In []: g_mean = total / len(g_list)
+In []: print 'Mean: ', g_mean
+
+As we can see, we had to write a loop to do a very simple thing such as finding the sum of a list of values.
+Python has a built-in function called sum to ease things.
+
+sum takes a list of values and returns the sum of those values.
+Now calculating the mean is much simpler.
+We don't have to write any 'for' loop.
+We can directly use mean = sum(g_list) / len(g_list)
+
+Still, calculating the mean needs writing an expression.
+What if we had a built-in for calculating the mean directly?
+We do have one, and it is available through the pylab library.
+
+Now the job of calculating mean is just a function away.
+Call mean(g_list) directly and it gives you the mean of values in g_list.
+
+Isn't that sweet? Yes, and that is why I use Python.
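
The three ways of computing the mean described in the script above can be compared side by side. This is a hedged sketch in Python 3, not the tutorial's IPython session: the helper name compute_g_values is made up, the sample rows are synthetic (periods generated from T = 2*pi*sqrt(L/g) with g = 9.8, so they are not the contents of pendulum.txt), and mean from the standard statistics module is used as a stand-in for the mean function that pylab provides.

```python
import math
from statistics import mean  # stdlib stand-in for pylab's mean()

# Synthetic rows in the two-column layout of pendulum.txt (length, time period).
# The periods are generated from T = 2*pi*sqrt(L/g) with g = 9.8; they are
# illustrative values only, not measurements.
lengths = [0.1, 0.2, 0.3, 0.4, 0.5]
sample_lines = ["%.2f %.6f" % (L, 2 * math.pi * math.sqrt(L / 9.8)) for L in lengths]


def compute_g_values(lines):
    """Return g = 4*pi**2*L/T**2 for every 'length period' line."""
    g_list = []
    for line in lines:
        point = line.split()
        L = float(point[0])
        t = float(point[1])
        g_list.append(4 * math.pi ** 2 * L / (t * t))
    return g_list


g_list = compute_g_values(sample_lines)

# 1. Mean with an explicit accumulator loop.
total = 0.0
for g in g_list:
    total += g
loop_mean = total / len(g_list)

# 2. Mean with the built-in sum.
sum_mean = sum(g_list) / len(g_list)

# 3. Mean with a library function.
lib_mean = mean(g_list)

print("Mean:", loop_mean, sum_mean, lib_mean)
```

All three values agree up to floating-point rounding; the library call is simply the shortest to write.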
+
+