least-squares.org
author Santosh G. Vattam <vattam.santosh@gmail.com>
Sat, 17 Apr 2010 15:51:43 +0530
changeset 81 2eff0ebac2dc
parent 76 6dfdb6fc9d80
permissions -rw-r--r--
Minor edits.
008c0edc6eac Added scripts for session-4 and session-6 of day-1.
Puneeth Chaganti <punchagan@gmail.com>
* Least Squares Fit
*** Outline
***** Introduction
******* What do we want to do? Why?
********* What's a least squares fit?
********* Why is it useful?
******* How are we doing it?
******* Arsenal Required
********* working knowledge of arrays
********* plotting
********* file reading
***** Procedure
******* The equation (for a single point)
******* Its matrix form
******* Getting the required matrices
******* Getting the solution
******* Plotting
*** Script
    Welcome.

    In this tutorial we shall look at obtaining the least squares fit
    of a given data-set. For this purpose, we shall use the same
    pendulum data that we used in the tutorial on plotting from files.

    To be able to follow this tutorial comfortably, you should have a
    working knowledge of arrays, plotting and file reading.

    A least squares fit curve is the curve for which the sum of the
    squares of its distances from the given set of points is
    minimum.

    Previously, when we plotted the data from pendulum.txt we got a
    scatter plot of points as shown.

    In our example, we know that the length of the pendulum, L, is
    directly proportional to the square of the time-period, T^2. But
    when we plot the data using lines we get a distorted line as shown.
    What we expect ideally is something like the red line in this graph.
    Experimental data invariably contains errors and hence does
    not produce an ideal plot. The best fit curve for this data has
    to be a straight line, and this can be obtained by performing a
    least squares fit on the data set. We shall use the lstsq function
    to obtain the least squares fit curve.
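
    For reference, the proportionality between L and T^2 follows from
    the usual small-angle formula for the period of a simple pendulum.
    The symbol g (the acceleration due to gravity) does not appear
    elsewhere in this script and is introduced here only for
    illustration:

```latex
\begin{equation}
T = 2\pi\sqrt{\frac{L}{g}}
\quad\Longrightarrow\quad
T^2 = \frac{4\pi^2}{g}\,L
\end{equation}
```

    So, for ideal data, a plot of T^2 against L would be a straight
    line through the origin with slope 4\pi^2/g.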

    The equation of the line is of the form T^2 = mL + c. We have a set
    of values for L and the corresponding T^2 values. Using this, we
    wish to obtain the equation of the straight line.

    In matrix form, the equation is represented as Tsq = A.p, where
    Tsq is an Nx1 matrix and A is an Nx2 matrix, as shown, and p is a
    2x1 matrix holding the slope and the Y-intercept. In order to
    obtain the least squares fit curve, we need to find the matrix p.
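
    Written out, the matrix form described above might look like the
    following sketch. The subscripts 1..N simply number the data
    points, and S(m, c) is a name introduced here for the quantity
    being minimised:

```latex
\begin{equation}
\underbrace{\begin{bmatrix} T_1^2 \\ T_2^2 \\ \vdots \\ T_N^2 \end{bmatrix}}_{Tsq}
=
\underbrace{\begin{bmatrix} L_1 & 1 \\ L_2 & 1 \\ \vdots & \vdots \\ L_N & 1 \end{bmatrix}}_{A}
\underbrace{\begin{bmatrix} m \\ c \end{bmatrix}}_{p},
\qquad
S(m, c) = \sum_{i=1}^{N} \left( T_i^2 - (m L_i + c) \right)^2
\end{equation}
```

    With noisy data, no p satisfies every row exactly; the least
    squares fit is the p that makes S(m, c) as small as possible.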

    Let's get started. As you can see, the file pendulum.txt
    is on our Desktop, and hence we navigate to the Desktop by typing
    cd Desktop. Let's now fire up IPython: ipython -pylab
    (in newer versions of IPython the option is spelt --pylab).

    We have already seen, in a previous tutorial, how to read a file
    and obtain the data set using loadtxt(). Let's quickly get the
    required data from our file.

    l, t = loadtxt('pendulum.txt', unpack=True)

    loadtxt() directly stores the values in pendulum.txt into the
    arrays l and t.
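
    As a small illustration of what unpack=True does, here is a sketch
    that reads two columns from an in-memory stand-in for the file.
    The three sample rows are made up for this example and are not
    taken from pendulum.txt; the explicit numpy import is only needed
    outside an ipython -pylab session:

```python
import io
import numpy as np

# Hypothetical stand-in for pendulum.txt: one "length period" pair per line.
sample = io.StringIO("0.10 0.69\n0.11 0.69\n0.12 0.74\n")

# unpack=True transposes the result, so each column becomes its own array.
l, t = np.loadtxt(sample, unpack=True)

print(l)  # first column: lengths
print(t)  # second column: time-periods
```

    Without unpack=True, loadtxt would return a single two-dimensional
    array with one row per line of the file.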
    Let's now calculate the square of each time-period.

    tsq = t*t

    Now we shall obtain A in the desired form, using some simple array
    manipulation.

    A = array([l, ones_like(l)])

    As we have seen in a previous tutorial, ones_like() gives an array
    with the same shape as the given array, in this case l, with all
    the elements set to 1. Please note, this is how we create an array
    from existing arrays.

    Let's now look at the shape of A.
    A.shape
    This is a 2x90 matrix. But we need a 90x2 matrix, so we shall transpose it.

    A = A.T

    Type A to confirm that we have obtained the desired array.
    A
    Also note the shape of A.
    A.shape
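
    The same construction can be tried on a tiny array to watch the
    shapes change. The four length values below are made up for
    illustration; column_stack is shown only as an alternative that
    builds the Nx2 array in one step and is not used in this tutorial:

```python
import numpy as np

# Hypothetical lengths; the real data set has 90 values.
l = np.array([0.1, 0.2, 0.3, 0.4])

# Stack l and a row of ones: the shape is (2, N).
A = np.array([l, np.ones_like(l)])
print(A.shape)

# Transpose to get one row [L_i, 1] per data point: the shape is (N, 2).
A = A.T
print(A.shape)

# Alternative: build the (N, 2) array directly.
A_alt = np.column_stack([l, np.ones_like(l)])
print(np.array_equal(A, A_alt))
```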

    We shall now use the lstsq function to obtain the coefficients m
    and c. lstsq returns several other things along with these
    coefficients: the sum of squared residuals, the rank of A and the
    singular values of A. We may look at the documentation of lstsq
    for more information by typing lstsq?
    result = lstsq(A,tsq)

    We extract the required coefficients, which are the first element
    of the tuple that lstsq returns, and store them in the variable coef.
    coef = result[0]
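
    Here is a self-contained sketch of the same steps on synthetic
    data, so that the expected answer is known in advance. The slope
    4.0 and intercept 0.5 are arbitrary values chosen for this
    example, and rcond=None is passed only to select NumPy's current
    default explicitly:

```python
import numpy as np

# Synthetic, noise-free data lying exactly on tsq = 4.0*l + 0.5.
l = np.linspace(0.1, 0.9, 9)
tsq = 4.0 * l + 0.5

# One row [L_i, 1] per data point.
A = np.array([l, np.ones_like(l)]).T

# lstsq returns a tuple: (solution, residuals, rank, singular values).
result = np.linalg.lstsq(A, tsq, rcond=None)
coef = result[0]

m, c = coef
print(m, c)  # close to 4.0 and 0.5, up to floating-point rounding
```

    With real, noisy measurements the recovered m and c will of course
    not match any "true" values exactly.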

    To obtain the plot of the line, we simply use the equation of the
    line that we noted before, T^2 = mL + c.

    Tline = coef[0]*l + coef[1]
    plot(l, Tline, 'r')

    Also, it would be nice to have a plot of the points. So,
    plot(l, tsq, 'o')
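
    Outside an ipython -pylab session, the two plot calls could be
    sketched with matplotlib's pyplot interface as below. The data is
    synthetic, the noise level and random seed are arbitrary, and the
    figure is rendered into an in-memory buffer purely to keep the
    example self-contained:

```python
import io
import matplotlib
matplotlib.use("Agg")  # render off-screen; no display needed
import matplotlib.pyplot as plt
import numpy as np

# Synthetic data: a straight line plus a little random scatter.
rng = np.random.default_rng(0)
l = np.linspace(0.1, 0.9, 30)
tsq = 4.0 * l + rng.normal(scale=0.05, size=l.size)

# Fit and evaluate the line, as in the tutorial.
A = np.array([l, np.ones_like(l)]).T
coef = np.linalg.lstsq(A, tsq, rcond=None)[0]
Tline = coef[0] * l + coef[1]

fig, ax = plt.subplots()
ax.plot(l, Tline, 'r')  # fitted line in red
ax.plot(l, tsq, 'o')    # data points as circles
ax.set_xlabel("L")
ax.set_ylabel("T^2")

# Replace buf with a file name to write the image to disk instead.
buf = io.BytesIO()
fig.savefig(buf, format="png")
```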

    This brings us to the end of this tutorial. In this tutorial,
    you've learnt how to obtain a least squares fit curve for a given
    set of points using lstsq. There are other curve fitting functions
    available in Pylab, such as polyfit.
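
    As a quick pointer for polyfit, the sketch below fits a degree-1
    polynomial to synthetic data and checks that it agrees with the
    lstsq approach used in this tutorial. The data values are made up
    for this example:

```python
import numpy as np

# Synthetic data with a little deterministic scatter around tsq = 4*l.
l = np.linspace(0.1, 0.9, 9)
tsq = 4.0 * l + 0.05 * np.sin(20 * l)

# polyfit returns coefficients from the highest power down: [m, c].
m_poly, c_poly = np.polyfit(l, tsq, 1)

# The same fit via lstsq.
A = np.array([l, np.ones_like(l)]).T
m_lsq, c_lsq = np.linalg.lstsq(A, tsq, rcond=None)[0]

print(np.allclose([m_poly, c_poly], [m_lsq, c_lsq]))
```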

    Hope you enjoyed it. Thanks.

*** Notes