statistics/script.rst
author Puneeth Chaganti <punchagan@fossee.in>
Sat, 06 Nov 2010 19:17:21 +0530
changeset 382 aa8ea9119476
parent 362 a77a27916f81
child 383 4a6d548d4369
permissions -rw-r--r--
Reviewed statistics script.
.. Objectives
.. ----------

.. By the end of this tutorial you will --

.. 1. Get to know simple statistics functions like mean,std etc .. (Remembering)
.. #. Apply them on a real world example. (Applying)


.. Prerequisites
.. -------------

.. Getting started with IPython
.. Loading Data from files
.. Getting started with Lists

.. Author              : Puneeth 
   Internal Reviewer   : Anoop Jacob Thomas<anoop@fossee.in>
   External Reviewer   :
   Checklist OK?       : <put date stamp here, if OK> [2010-10-05]

Hello friends and welcome to the tutorial on Statistics using Python.

{{{ Show the slide containing title }}}

{{{ Show the slide containing the outline slide }}}

In this tutorial, we shall learn
 * Doing simple statistical operations in Python  
 * Applying these to real world problems 

.. #[punch: the prerequisites part may be skipped in the tutorial. It
.. will be provided separately.]

You will need IPython with pylab running on your computer to use this
tutorial.

You will also need to know how to load data using ``loadtxt`` to be able
to follow the real world application.

.. #[punch: since loadtxt is anyway a pre-req, I would recommend you
.. to use a data file and load data from that. that is good, since you
.. would get to deal with arrays, instead of lists. 

.. Talking of rows and columns of 2-D lists etc is confusing. Also,
.. converting to float can be avoided. The tutorial will feel more
.. natural, is what I think. 

.. The idea of separating the main problem and giving toy examples
.. doesn't sound good. Use the same problem to explain stuff. Or use a
.. smaller data-set or something. Using lists doesn't seem natural.]


We will first start with the most basic statistical operation, i.e.
finding the mean.

We have a list of ages of a random group of people ::

   age_list = [4,45,23,34,34,38,65,42,32,7]

One way of getting the mean is to compute the sum of all the ages and
divide it by the number of people in the group. ::

    sum_age_list = sum(age_list)

The ``sum`` function gives us the sum of the elements. Note that
``sum_age_list`` is an integer, and the number of people, i.e. the
length of the list, is also an integer. Dividing one integer by another
in Python 2 discards the fractional part, so we need to convert one of
them to a float before carrying out the division. ::

    mean_using_sum = float(sum_age_list)/len(age_list)
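
The steps above can be tried outside the pylab session as well. The following is a minimal sketch that uses only built-in Python functions; the names ``number_of_people`` and ``truncated`` are introduced here purely for illustration.

```python
# Ages from the example above.
age_list = [4, 45, 23, 34, 34, 38, 65, 42, 32, 7]

sum_age_list = sum(age_list)        # total of all the ages
number_of_people = len(age_list)    # size of the group

# Converting one operand to float gives a fractional result
# in Python 2 as well as in Python 3.
mean_using_sum = float(sum_age_list) / number_of_people

# Floor division shows what is lost when the fractional part is discarded.
truncated = sum_age_list // number_of_people

print(mean_using_sum)   # 32.4
print(truncated)        # 32
```

The difference between the two printed values is exactly the fractional part that the conversion to float preserves.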

This gives us the mean age, but there is a simpler way to do it, using
the ``mean`` function that is available in the pylab session::

       mean(age_list)
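
Outside a pylab session the name ``mean`` is not defined by default. The sketch below assumes that NumPy is installed and imports it explicitly; the variable name ``mean_age`` is only illustrative.

```python
import numpy as np

age_list = [4, 45, 23, 34, 34, 38, 65, 42, 32, 7]

# np.mean accepts a plain list and returns a floating point value,
# so no explicit conversion to float is needed.
mean_age = np.mean(age_list)

print(mean_age)
```

This should agree with the value obtained by dividing the sum by the length of the list.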

``mean`` can be used in more ways in the case of two-dimensional lists.
Take a two-dimensional list ::
     
     two_dimension=[[1,5,6,8],[1,3,4,5]]

By default, the ``mean`` function gives the mean of the flattened
sequence. A flattened sequence is the list obtained by concatenating all
the smaller lists into one long list; in this case, the list obtained by
writing the two lists one after the other. ::

    mean(two_dimension)
    flattened_seq=[1,5,6,8,1,3,4,5]
    mean(flattened_seq)
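
To check this claim without typing the flattened list by hand, one possible sketch is given below. It assumes NumPy is imported explicitly as ``np``; the names ``mean_2d`` and ``mean_flat`` are illustrative.

```python
import numpy as np

two_dimension = [[1, 5, 6, 8], [1, 3, 4, 5]]

# Flatten the nested list by writing the inner lists one after the other.
flattened_seq = [value for row in two_dimension for value in row]

mean_2d = np.mean(two_dimension)    # no axis given: mean over all elements
mean_flat = np.mean(flattened_seq)

print(flattened_seq)
print(mean_2d, mean_flat)
```

Both calls average the same eight numbers, which is why the two results match.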

As you can see, both the results are the same. The ``mean`` function can
also give us the mean of each column, i.e. the mean of corresponding
elements in the smaller lists. ::

   mean(two_dimension, 0)
   array([ 1. ,  4. ,  5. ,  6.5])

We pass an extra argument, 0, in that case.

If we pass 1 as the argument instead, we obtain the mean of each row. ::

   mean(two_dimension, 1)
   array([ 5.  ,  3.25])
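
The second argument is the axis along which the mean is taken. The sketch below assumes NumPy imported as ``np`` and passes the axis by keyword to make the intent explicit; it also repeats the column-wise calculation in plain Python. The names ``column_means``, ``row_means`` and ``column_means_plain`` are illustrative.

```python
import numpy as np

two_dimension = np.array([[1, 5, 6, 8],
                          [1, 3, 4, 5]])

# axis=0 averages down each column: one value per column.
column_means = np.mean(two_dimension, axis=0)

# axis=1 averages across each row: one value per row.
row_means = np.mean(two_dimension, axis=1)

# The same column means in plain Python: zip(*rows) groups the
# corresponding elements of the inner lists together.
rows = two_dimension.tolist()
column_means_plain = [sum(col) / float(len(col)) for col in zip(*rows)]

print(column_means)
print(row_means)
print(column_means_plain)
```

A handy way to remember this: the result has one entry for every position along the axis that was *not* averaged over.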

We can see more options of ``mean`` by looking at its documentation ::
   
   mean?

Similarly, we can calculate the median and the standard deviation of a
list using the functions ``median`` and ``std``::
      
      median(age_list)
      std(age_list)
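
As a cross-check of what these two functions compute, here is a small sketch assuming NumPy imported as ``np``. It also writes the standard deviation out step by step. Note that, as far as we know, NumPy's ``std`` divides by the number of elements by default; passing ``ddof=1`` divides by one less than that. The names ``std_by_hand`` and ``sample_std`` are illustrative.

```python
import math
import numpy as np

age_list = [4, 45, 23, 34, 34, 38, 65, 42, 32, 7]

median_age = np.median(age_list)   # middle value of the sorted ages
std_age = np.std(age_list)         # default ddof=0: divides by N

# The same standard deviation written out step by step.
mean_age = sum(age_list) / float(len(age_list))
variance = sum((age - mean_age) ** 2 for age in age_list) / len(age_list)
std_by_hand = math.sqrt(variance)

# ddof=1 divides by N - 1 instead, giving a slightly larger value.
sample_std = np.std(age_list, ddof=1)

print(median_age)
print(std_age, std_by_hand, sample_std)
```

Since the list has an even number of elements, the median is the average of the two middle values of the sorted list.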

``median`` and ``std`` can also be calculated for two-dimensional arrays
along columns and rows, just like ``mean``.

For example ::
       
       median(two_dimension, 0)
       std(two_dimension, 1)

This gives us the median along the columns and the standard deviation
along the rows.
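
A short sketch of these two calls, assuming NumPy imported as ``np`` and the same ``two_dimension`` data as before, with the axis passed by keyword. The result names are illustrative.

```python
import numpy as np

two_dimension = np.array([[1, 5, 6, 8],
                          [1, 3, 4, 5]])

# One median per column. With only two rows, each median is simply
# the average of the two values in that column.
column_medians = np.median(two_dimension, axis=0)

# One standard deviation per row.
row_stds = np.std(two_dimension, axis=1)

print(column_medians)
print(row_stds)
```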

Now let's apply this to a real world example.

We will use a data file located at the path ``/home/fossee/sslc2.txt``.
It contains records of students and their performance in one of the
State Secondary Board Examinations. It has 180,000 lines of records. We
are going to read and process this data. We can see the contents of the
file by double clicking on it. It might take some time to open, since
it is quite a large file. Please don't edit the data. This file has
a particular structure.

We can do ::
   
   cat /home/fossee/sslc2.txt

to check the contents of the file.

Each line in the file is a set of 11 fields separated
by semi-colons. Consider a sample line from this file.

``A;015163;JOSEPH RAJ S;083;042;47;00;72;244;;;``

The following are the fields in any given line.

* Region Code, which is 'A'
* Roll Number, 015163
* Name, JOSEPH RAJ S
* Marks of 5 subjects:

  * English 083
  * Hindi 042
  * Maths 47
  * Science 00
  * Social 72

* Total marks, 244
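
To make the field layout concrete, the sample line can be pulled apart in plain Python. This is only an illustrative sketch and not how the tutorial itself loads the data; the variable names are chosen here for readability.

```python
# The sample line shown above, without its line ending.
sample_line = "A;015163;JOSEPH RAJ S;083;042;47;00;72;244;;;"

# Split the line at every semi-colon.
fields = sample_line.split(";")

region_code = fields[0]
roll_number = fields[1]
name = fields[2]

# Positions 3 to 7 (counting from 0) hold the marks in the five subjects.
marks = [int(mark) for mark in fields[3:8]]
total = int(fields[8])

print(region_code, roll_number, name)
print(marks)
print(sum(marks) == total)
```

The marks of the five subjects add up to the total recorded in the line, which is a useful sanity check on the field positions.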


Now let's try to find the mean of the English marks of all students.

For this we do ::

     L=loadtxt('/home/fossee/sslc2.txt',usecols=(3,),delimiter=';')
     L
     mean(L)

The ``loadtxt`` function loads data from an external file. ``delimiter``
specifies the character that separates the fields of data.
``usecols`` specifies the columns to be used; columns are counted from
0, so ``(3,)`` selects the English marks. The comma is added
because ``usecols`` is a sequence, and ``(3,)`` is a tuple with a single
element.
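
Since the real data file may not be at hand, the call can be tried on a small in-memory stand-in. The sketch below assumes NumPy imported as ``np``. The first record is the sample line from the file; the other two records, and the name ``english``, are made up here purely for illustration.

```python
import io
import numpy as np

# Three lines in the format described above. The first is the sample
# line from the file; the other two are made-up records.
data = io.StringIO(
    "A;015163;JOSEPH RAJ S;083;042;47;00;72;244;;;\n"
    "A;015164;STUDENT TWO;060;050;70;40;65;285;;;\n"
    "B;015165;STUDENT THREE;091;071;88;64;80;394;;;\n"
)

# Columns are counted from 0, so column 3 holds the English marks.
english = np.loadtxt(data, usecols=(3,), delimiter=";")

print(english)
print(np.mean(english))

# (3,) is a tuple with one element, whereas (3) is just the integer 3.
print(type((3,)), type((3)))
```

Only the selected column is converted to numbers, so the name field does not get in the way.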

To get the median marks ::
   
    median(L)
   
And to get the standard deviation ::

    std(L)


Now let's try to get the mean for all the subjects ::

     L=loadtxt('/home/fossee/sslc2.txt',usecols=(3,4,5,6,7),delimiter=';')
     mean(L,0)
     array([ 73.55452504,  53.79828941,  62.83342759,  50.69806158,  63.17056881])

As we can see from the result of ``mean(L,0)``, the resulting sequence
contains the mean marks, over all students who took the exam, for each
of the five subjects.

and ::
    
    mean(L,1)

is the mean marks of each individual student across the five subjects.
Clearly, ``mean(L,0)`` computed one mean per column, i.e. per subject,
while ``mean(L,1)`` computed one mean per row, i.e. per student.
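
The shapes of the two results make the difference easy to see. The following sketch assumes NumPy imported as ``np`` and uses a three-line in-memory stand-in for the data file. The first record is the sample line from the file; the other two records and the result names are invented for illustration.

```python
import io
import numpy as np

# Stand-in for the data file: one real sample line, two made-up records.
data = io.StringIO(
    "A;015163;JOSEPH RAJ S;083;042;47;00;72;244;;;\n"
    "A;015164;STUDENT TWO;060;050;70;40;65;285;;;\n"
    "B;015165;STUDENT THREE;091;071;88;64;80;394;;;\n"
)

# Load the five subject columns: one row per student, one column per subject.
marks = np.loadtxt(data, usecols=(3, 4, 5, 6, 7), delimiter=";")

subject_means = np.mean(marks, axis=0)   # one value per subject (column)
student_means = np.mean(marks, axis=1)   # one value per student (row)

print(marks.shape)           # (3, 5)
print(subject_means.shape)   # (5,)
print(student_means.shape)   # (3,)
print(student_means)
```

With the real file, ``mean(L,1)`` would similarly contain one value for every line of the file.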


{{{ Show summary slide }}}

This brings us to the end of the tutorial.
We have learnt

 * How to do the standard statistical operations sum, mean,
   median and standard deviation in Python.
 * How to combine text loading and the statistical operations to solve
   real world problems.

{{{ Show the "sponsored by FOSSEE" slide }}}


This tutorial was created as a part of the FOSSEE project, NME ICT, MHRD India.

Hope you have enjoyed it and found it useful.

Thank you!