statistics/script.rst
author Puneeth Chaganti <punchagan@fossee.in>
Sat, 06 Nov 2010 19:18:54 +0530
.. Objectives
.. ----------

.. By the end of this tutorial you will --

.. 1. Get to know simple statistics functions like mean, std, etc. (Remembering)
.. #. Apply them on a real world example. (Applying)


.. Prerequisites
.. -------------

.. Getting started with IPython
.. Loading Data from files
.. Getting started with Lists

.. Author              : Puneeth
   Internal Reviewer   : Anoop Jacob Thomas<anoop@fossee.in>
   External Reviewer   :
   Checklist OK?       : <put date stamp here, if OK> [2010-10-05]

.. #[punch; add slides, exercises!]

Hello friends, and welcome to the tutorial on Statistics using Python.

{{{ Show the slide containing title }}}

{{{ Show the outline slide }}}

In this tutorial, we shall learn to

 * Do simple statistical operations in Python
 * Apply these to real-world problems

.. #[punch: the prerequisites part may be skipped in the tutorial. It
.. will be provided separately.]

You will need IPython with pylab running on your computer to use this
tutorial.

You will also need to know how to load data using ``loadtxt`` to be able
to follow the real-world application.

.. #[punch: since loadtxt is anyway a pre-req, I would recommend you
.. to use a data file and load data from that. that is good, since you
.. would get to deal with arrays, instead of lists.

.. Talking of rows and columns of 2-D lists etc is confusing. Also,
.. converting to float can be avoided. The tutorial will feel more
.. natural, is what I think.

.. The idea of separating the main problem and giving toy examples
.. doesn't sound good. Use the same problem to explain stuff. Or use a
.. smaller data-set or something. Using lists doesn't seem natural.]

We will start with the most basic statistical operation, i.e. finding
the mean.

We have a list of ages of a random group of people ::

   age_list = [4,45,23,34,34,38,65,42,32,7]

One way of getting the mean is to take the sum of all the ages and
divide it by the number of people in the group. ::

    sum_age_list = sum(age_list)

The ``sum`` function gives us the sum of the elements. Note that
``sum_age_list`` is an integer, and the number of people, i.e. the
length of the list, is also an integer. We need to convert one of
them to a float before carrying out the division, so that the
fractional part is not discarded (as integer division does in
Python 2). ::

    mean_using_sum = float(sum_age_list)/len(age_list)

This gives the mean age, but there is a simpler way to do it: the
``mean`` function that pylab provides ::

       mean(age_list)

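
As a minimal sketch of the two approaches above, the following standalone snippet computes the mean both ways. It assumes NumPy is installed and imports it explicitly as ``np``; inside an IPython session started with pylab, ``mean`` is available without the import. The names ``mean_using_numpy`` and ``np`` are illustrative and not part of the original script.

```python
import numpy as np

age_list = [4, 45, 23, 34, 34, 38, 65, 42, 32, 7]

# Mean computed by hand: total of the ages divided by the number of people.
# float() keeps the fractional part even where "/" is integer division.
mean_using_sum = float(sum(age_list)) / len(age_list)

# The same value from NumPy's mean function.
mean_using_numpy = np.mean(age_list)

print(mean_using_sum)    # 32.4
print(mean_using_numpy)
```

Both values should agree, which is a quick way to check the hand-written calculation.
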
``mean`` can be used in more ways with two-dimensional lists. Take a
two-dimensional list ::

     two_dimension=[[1,5,6,8],[1,3,4,5]]

By default, the ``mean`` function gives the mean of the flattened
sequence. A flattened sequence is the list obtained by concatenating
all the smaller lists into one long list; in this case, the list
obtained by writing the two lists one after the other. ::

    mean(two_dimension)
    flattened_seq=[1,5,6,8,1,3,4,5]
    mean(flattened_seq)

As you can see, both results are the same. The ``mean`` function can
also give us the mean of each column, i.e. the mean of corresponding
elements in the smaller lists. For that, we pass an extra argument
``0``. ::

   mean(two_dimension, 0)
   array([ 1. ,  4. ,  5. ,  6.5])

If we pass the argument ``1`` instead, we obtain the mean of each
row. ::

   mean(two_dimension, 1)
   array([ 5.  ,  3.25])

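
The three ways of averaging a two-dimensional list can be sketched side by side as follows. This is an illustrative standalone snippet that assumes NumPy is installed; it spells the extra argument as the keyword ``axis``, which is the same thing as passing ``0`` or ``1`` as the second positional argument. The variable names other than ``two_dimension`` are made up for this example.

```python
import numpy as np

two_dimension = [[1, 5, 6, 8], [1, 3, 4, 5]]

# No axis argument: mean of all eight numbers taken together.
overall_mean = np.mean(two_dimension)

# axis=0: one mean per column (corresponding elements of the two lists).
column_means = np.mean(two_dimension, axis=0)

# axis=1: one mean per row (one value for each inner list).
row_means = np.mean(two_dimension, axis=1)

print(overall_mean)   # 4.125
print(column_means)
print(row_means)
```
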
We can see more options of ``mean`` using ::

   mean?

Similarly, we can calculate the median and standard deviation of a list
using the functions ``median`` and ``std`` ::

      median(age_list)
      std(age_list)

``median`` and ``std`` can also be calculated for two-dimensional arrays
along columns and rows, just like ``mean``.

For example ::

       median(two_dimension, 0)
       std(two_dimension, 1)

This gives us the median of each column and the standard deviation of
each row.
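
A short sketch of ``median`` and ``std`` on the same data is given below, assuming NumPy is installed. One detail worth knowing is that NumPy's ``std`` divides by N by default (the population form); the ``ddof=1`` mentioned in the comment is a NumPy option, not something the tutorial itself uses. Variable names other than ``age_list`` and ``two_dimension`` are illustrative.

```python
import numpy as np

age_list = [4, 45, 23, 34, 34, 38, 65, 42, 32, 7]
two_dimension = [[1, 5, 6, 8], [1, 3, 4, 5]]

# Median: the middle value of the sorted ages
# (the average of the two middle values, since there are ten ages).
median_age = np.median(age_list)

# Standard deviation. By default NumPy divides by N (population form);
# passing ddof=1 would give the sample form, which divides by N - 1.
std_age = np.std(age_list)

# The axis argument works as it does for mean.
column_medians = np.median(two_dimension, axis=0)  # one value per column
row_stds = np.std(two_dimension, axis=1)           # one value per row

print(median_age)   # 34.0
print(std_age)
print(column_medians)
print(row_stds)
```
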
       
Now let's apply this to a real-world example.

We will use a data file located at the path ``/home/fossee/sslc2.txt``.
It contains records of students and their performance in one of the
State Secondary Board Examinations. It has 180,000 lines of records. We
are going to read and process this data. We can see the contents of
the file by double-clicking on it. It might take some time to open, since
it is quite a large file. Please don't edit the data. This file has
a particular structure.

We can also do ::

   cat /home/fossee/sslc2.txt

to check the contents of the file.

Each line in the file is a set of 11 fields separated by semicolons.
Consider a sample line from this file. ::

   A;015163;JOSEPH RAJ S;083;042;47;00;72;244;;;

The following are the fields in any given line.

* Region Code, which is 'A'
* Roll Number 015163
* Name JOSEPH RAJ S
* Marks of 5 subjects:

  * English 083
  * Hindi 042
  * Maths 47
  * Science 00
  * Social 72

* Total marks 244

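
To make the record layout concrete, here is a small sketch in plain Python that splits the sample line into its fields and checks that the five subject marks add up to the total. It is only an illustration of the structure; the variable names are hypothetical, and the tutorial itself reads the file with ``loadtxt`` rather than splitting lines by hand.

```python
# One record, copied from the sample line in the tutorial.
record = "A;015163;JOSEPH RAJ S;083;042;47;00;72;244;;;"

# Split the record into its fields at every semicolon.
fields = record.strip().split(";")

region_code = fields[0]
roll_number = fields[1]
name = fields[2]
# Fields 3 to 7 hold the marks in the five subjects.
marks = [int(field) for field in fields[3:8]]
total = int(fields[8])

print(region_code, roll_number, name)
print(marks)           # [83, 42, 47, 0, 72]
print(sum(marks), total)
```

Note that positions are counted from 0, so the English marks sit at index 3.
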
Now let's try to find the mean of the English marks of all students.

For this we do ::

     L=loadtxt('/home/fossee/sslc2.txt',usecols=(3,),delimiter=';')
     L
     mean(L)

The ``loadtxt`` function loads data from an external file. ``delimiter``
specifies the character that separates the fields of data.
``usecols`` specifies the columns to be used. Here it is ``(3,)``, because
the English marks are in the fourth field and columns are counted from 0.
The comma is needed because ``usecols`` is a sequence, and ``(3,)`` is how
a one-element tuple is written.

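
Since ``/home/fossee/sslc2.txt`` may not exist on every machine, the following sketch shows the same ``loadtxt`` call on a tiny in-memory stand-in. It assumes NumPy is installed. The first record is the sample line from this tutorial; the other two records, and the names ``sample_text`` and ``english``, are invented purely for illustration.

```python
import io
import numpy as np

# A tiny stand-in for sslc2.txt. The first record is the sample line from
# the tutorial; the other two are invented purely for illustration.
sample_text = (
    "A;015163;JOSEPH RAJ S;083;042;47;00;72;244;;;\n"
    "A;000001;STUDENT ONE;060;050;40;30;70;250;;;\n"
    "A;000002;STUDENT TWO;070;061;55;45;65;296;;;\n"
)

# loadtxt accepts a file-like object as well as a file name.
# usecols=(3,) selects only the fourth field (English marks),
# and delimiter=';' splits each line at the semicolons.
english = np.loadtxt(io.StringIO(sample_text), usecols=(3,), delimiter=";")

print(english)
print(np.mean(english))
```

Replacing ``io.StringIO(sample_text)`` with the path of the real file should give the call used in the tutorial.
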
To get the median marks ::

    median(L)

And to get the standard deviation ::

    std(L)

Now let's try to get the mean for all the subjects ::

     L=loadtxt('/home/fossee/sslc2.txt',usecols=(3,4,5,6,7),delimiter=';')
     mean(L,0)
     array([ 73.55452504,  53.79828941,  62.83342759,  50.69806158,  63.17056881])

The result of ``mean(L,0)`` is a sequence holding, for each of the five
subjects, the mean marks of all students who took the exam.

And ::

    mean(L,1)

gives the average marks of each individual student across the five
subjects. In other words, ``mean(L,0)`` computes one mean per column
(one per subject), while ``mean(L,1)`` computes one mean per row (one
per student).

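
The difference between the two calls is easier to see on a small array. The sketch below assumes NumPy is installed and again uses an in-memory stand-in for the data file: only the first record comes from this tutorial, and the other two records and all variable names are made up for the example.

```python
import io
import numpy as np

# Stand-in data: the first record is the tutorial's sample line,
# the other two are invented purely for illustration.
sample_text = (
    "A;015163;JOSEPH RAJ S;083;042;47;00;72;244;;;\n"
    "A;000001;STUDENT ONE;060;050;40;30;70;250;;;\n"
    "A;000002;STUDENT TWO;070;061;55;45;65;296;;;\n"
)

# Load the five subject columns (fields 3 to 7).
marks = np.loadtxt(io.StringIO(sample_text),
                   usecols=(3, 4, 5, 6, 7), delimiter=";")

print(marks.shape)   # (3, 5): 3 students (rows), 5 subjects (columns)

subject_means = np.mean(marks, axis=0)   # one mean per subject (column)
student_means = np.mean(marks, axis=1)   # one mean per student (row)

print(subject_means)
print(student_means)
```

The first result has five entries, one per subject, and the second has three, one per student.
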
{{{ Show summary slide }}}

This brings us to the end of the tutorial.
We have learnt

 * How to do the standard statistical operations sum, mean,
   median and standard deviation in Python.
 * How to combine text loading and the statistical operations to solve
   real-world problems.

{{{ Show the "sponsored by FOSSEE" slide }}}

This tutorial was created as a part of the FOSSEE project, NME ICT, MHRD India.

Hope you have enjoyed it and found it useful.

Thank you!