========
 Script
========
       
Welcome to this tutorial on loading data from files.

{{{ Screen shows welcome slide }}}
       
We often need to plot points obtained from experimental
observations. In this tutorial we shall learn to read data from files
and save it into sequences that can later be used for plotting.

{{{ Show the outline for this tutorial }}}
       
We shall use the ``loadtxt`` command to load data from files. We will
be looking at how to read a file with multiple columns of data and
load each column of data into a sequence.

{{{ switch back to the terminal }}}
       
As usual, let us start IPython, using
::

  ipython -pylab
       
Now, let us begin by reading the file ``primes.txt``, which contains
a single column of primes, using the ``loadtxt`` command.
In our case, the file is at ``/home/fossee/primes.txt``.

{{{ Navigate to the path in the OS, open the file and show it }}}
       
.. #[punch: do we need a slide for showing the path?]

.. We use the ``cat`` command to see the contents of this file.

.. #[punch: should we show the cat command here? seems like a good place
   to do it] ::

     cat /home/fossee/primes.txt

.. #[Nishanth]: A problem for windows users.
                Should we simply open the file and show them the data
                so that we can be fine with GNU/Linux ;) and windows?
       
Now let us read this list into the variable ``primes``.
::

  primes = loadtxt('/home/fossee/primes.txt')

``primes`` is now a sequence of the primes that were listed in the file
``primes.txt``.

We now type ``print primes`` to see the sequence printed.
       
We observe that all of the numbers end with a period. This is
because ``loadtxt`` reads these numbers as ``floats``. We shall learn
about them later.
       
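The behaviour described above can be checked outside the screencast
with a minimal sketch like the one below. It assumes NumPy is installed
and that the code is run as a plain Python script, so ``loadtxt`` has to
be imported explicitly (``ipython -pylab`` does this for us). The
temporary file and its five values are hypothetical stand-ins for
``primes.txt``, not the tutorial's actual data.
::

  import os
  import tempfile
  from numpy import loadtxt  # explicit import; not needed inside ipython -pylab

  # Write a small stand-in for primes.txt (hypothetical contents).
  path = os.path.join(tempfile.mkdtemp(), 'primes.txt')
  with open(path, 'w') as f:
      f.write('2\n3\n5\n7\n11\n')

  primes = loadtxt(path)                 # values are read as floats by default
  print(primes.dtype)

  primes_int = loadtxt(path, dtype=int)  # the dtype argument requests integers
  print(primes_int)

The ``dtype`` argument is not covered in this script; it is shown here
only to illustrate that the trailing periods come from the default
float conversion and not from the file itself.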
       
Now, let us use the ``loadtxt`` command to read a file that contains
two columns of data, ``pendulum.txt``. This file contains the length
of the pendulum in the first column and the corresponding time period
in the second.
       
%%1%% Pause the video here, and use the ``cat`` command to view the
contents of this file and then resume the video.

This is how we look at the contents of the file, ``pendulum.txt``
::

  cat /home/fossee/pendulum.txt
       
.. #[Nishanth]: The first column is L values and second is T values
                from a simple pendulum experiment.
                Since you are using the variable names later in the
                script.
                Not necessary but can be included also.
       
Let us now read the data into the variable ``pend``. Again, it is
assumed that the file is in ``/home/fossee/``
::

  pend = loadtxt('/home/fossee/pendulum.txt')

Let us now print the variable ``pend`` and see what's in it.
::

  print pend
       
Notice that ``pend`` is not a simple sequence like ``primes``. It
holds both columns of the data file, one pair of values per row. Let us
use an additional argument of the ``loadtxt`` command to read it into
two separate, simple sequences.
::

  L, T = loadtxt('/home/fossee/pendulum.txt', unpack=True)
       
.. #[Nishanth]: It has a sequence of items in which each item contains
                two values. first is l and second is t

Let us now print the variables L and T, to see what they contain.
::

  print L
  print T

.. #[Nishanth]: Stress on ``unpack=True`` ??
       
Notice that L and T now contain the first and second columns of data
from the data file, ``pendulum.txt``, and they are both simple
sequences. ``unpack=True`` has given us the two columns in two
separate sequences instead of one complex sequence.
       
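To make the difference concrete, here is a small sketch that loads the
same two-column text with and without ``unpack=True``. It assumes NumPy
is installed and is run as a plain Python script. The three
length/period pairs are made-up values used only for illustration, and
an in-memory ``StringIO`` object stands in for ``pendulum.txt``.
::

  from io import StringIO
  from numpy import loadtxt  # explicit import; not needed inside ipython -pylab

  # Made-up length/period pairs standing in for pendulum.txt.
  text = u'0.1 0.69\n0.2 0.90\n0.3 1.19\n'

  # Without unpack: one two-dimensional array, one row per line of the file.
  pend = loadtxt(StringIO(text))
  print(pend.shape)

  # With unpack=True: the array is transposed, so each column can be
  # assigned to its own variable.
  L, T = loadtxt(StringIO(text), unpack=True)
  print(L)
  print(T)

  # The first column of pend holds the same values as L.
  print((pend[:, 0] == L).all())

In other words, ``unpack=True`` does not read the file differently; it
hands back the columns instead of the rows.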
       
{{{ show the slide with loadtxt --- other features }}}

In this tutorial, we have learnt the basic use of the ``loadtxt``
command. It can do a lot more than we have used it for so far; for
example, it can read files whose columns are separated by characters
other than spaces.
       
%%2%% Pause the video here, and read the file
``pendulum_semicolon.txt`` which contains the same data as
``pendulum.txt``, but the columns are separated by semi-colons instead
of spaces. Use the IPython help to see how to do this. Once you have
finished, resume the video to look at the solution.
       
{{{ switch back to the terminal }}}
::

  L, T = loadtxt('/home/fossee/pendulum_semicolon.txt', unpack=True, delimiter=';')

  print L

  print T
       
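The role of the ``delimiter`` argument can be seen in a short sketch
that tries the same semicolon-separated text both ways. It assumes
NumPy is installed and is run as a plain Python script. The values are
made up, and an in-memory ``StringIO`` object stands in for
``pendulum_semicolon.txt``.
::

  from io import StringIO
  from numpy import loadtxt  # explicit import; not needed inside ipython -pylab

  # Made-up values standing in for pendulum_semicolon.txt.
  text = u'0.1;0.69\n0.2;0.90\n0.3;1.19\n'

  # By default loadtxt splits each line on whitespace, so '0.1;0.69' is
  # treated as a single value and cannot be converted to a float.
  try:
      loadtxt(StringIO(text))
  except ValueError as err:
      print('without delimiter: %s' % err)

  # Naming the separator lets loadtxt split the two columns correctly.
  L, T = loadtxt(StringIO(text), unpack=True, delimiter=';')
  print(L)
  print(T)

The exact wording of the error message depends on the NumPy version, so
it is printed rather than relied upon.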
       
This brings us to the end of this tutorial.

{{{ show the summary slide }}}
       
You should now be able to do the following comfortably.

  + Read data from files containing a single column of data, using the
    ``loadtxt`` command.
  + Read multiple columns of data, separated by spaces or other
    delimiters.

Thank you!