getting-started-files.rst
author nishanth
Fri, 17 Sep 2010 14:45:24 +0530
changeset 153 22521a1d6841
parent 144 476ea1730aee
permissions -rw-r--r--
fixed a syntax error
144
476ea1730aee Added rst files for scripts.
Puneeth Chaganti <punchagan@gmail.com>

========
 Script
========

Welcome to the tutorial on getting started with files.

{{{ Screen shows welcome slide }}}

{{{ Show the outline for this tutorial }}}

In this tutorial we shall learn to read files and perform some basic
operations on them: opening and reading a file, closing a file,
iterating through the file line-by-line, and appending the lines of
a file to a list.

{{{ switch back to the terminal }}}

As usual, we start IPython, using
::

  ipython -pylab

Let us first open the file ``pendulum.txt``, present in
``/home/fossee/``.
::

  f = open('/home/fossee/pendulum.txt')

``f`` is called a file object. Let us type ``f`` on the terminal to
see what it is.
::

  f

The file object shows the file which is open and the mode (read
or write) in which it is open.
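
The same information can also be read from attributes of the file
object. The following is a minimal sketch, not part of the recorded
session: it writes a small stand-in file to a temporary directory
(the numbers in it are made-up sample data, not the contents of the
real ``pendulum.txt``) and then inspects the object returned by
``open``.

```python
import os
import tempfile

# Stand-in for pendulum.txt: the numbers below are made-up sample data.
path = os.path.join(tempfile.mkdtemp(), 'pendulum.txt')
out = open(path, 'w')
out.write('0.1 0.69\n0.2 0.90\n0.3 1.18\n')
out.close()

f = open(path)     # no mode given, so the file is opened for reading
print(f.name)      # the path that was passed to open
print(f.mode)      # r
print(f.closed)    # False, the file is still open
f.close()
print(f.closed)    # True
```

Opening with ``open(path, 'w')`` instead gives a file object whose
mode is ``w``, which is what the "read or write" above refers to.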

We shall first learn to read the whole file into a single
variable. Later, we shall look at reading it line-by-line. We use
the ``read`` method of ``f`` to read all the contents of the file
into the variable ``pend``.
::

  pend = f.read()

Now, let us see what is in ``pend``, by typing
::

  print(pend)

We can see that ``pend`` has all the data of the file. Type just
``pend`` to see more explicitly what it contains: a single string,
with the newline characters shown as ``\n``.
::

  pend

%%1%% Pause the video here and split the variable into a list,
``pend_list``, of the lines in the file and then resume the
video. Hint: use tab completion to see what methods the string
variable has.

#[punch: should this even be put? add dependency to strings LO,
where we mention that strings have methods for manipulation. hint:
use splitlines()]
::

  pend_list = pend.splitlines()

  pend_list
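
To make the result of ``splitlines`` concrete, here is a small
sketch on a stand-in file. The file contents are made-up sample
data, and the comparison with ``split('\n')`` is an addition for
illustration, not something the tutorial itself uses.

```python
import os
import tempfile

# Stand-in for pendulum.txt: the numbers below are made-up sample data.
path = os.path.join(tempfile.mkdtemp(), 'pendulum.txt')
out = open(path, 'w')
out.write('0.1 0.69\n0.2 0.90\n0.3 1.18\n')
out.close()

f = open(path)
pend = f.read()                 # the whole file as one string
f.close()

pend_list = pend.splitlines()   # one entry per line, newlines removed
print(pend_list)                # ['0.1 0.69', '0.2 0.90', '0.3 1.18']

# Splitting on '\n' by hand leaves an empty string after the last newline.
print(pend.split('\n'))         # ['0.1 0.69', '0.2 0.90', '0.3 1.18', '']
```

``splitlines`` is the more convenient choice here because it does
not produce the trailing empty entry.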

Now, let us learn to read the file line-by-line. But before that,
we will have to close the file and open it again, since the file
has already been read till the end.
#[punch: should we mention file-pointer?]
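
What "read till the end" means can be checked directly. The sketch
below uses a stand-in file with made-up sample data and shows that a
second ``read`` on the same file object returns an empty string.
It also uses ``seek(0)`` to go back to the start of the file; that
is an alternative to closing and re-opening, and is not the approach
this tutorial takes.

```python
import os
import tempfile

# Stand-in for pendulum.txt: the numbers below are made-up sample data.
path = os.path.join(tempfile.mkdtemp(), 'pendulum.txt')
out = open(path, 'w')
out.write('0.1 0.69\n0.2 0.90\n0.3 1.18\n')
out.close()

f = open(path)
first = f.read()     # reads everything, leaving the position at the end
second = f.read()    # nothing is left to read
print(repr(second))  # ''

f.seek(0)            # move the position back to the start of the file
again = f.read()     # the full contents are available once more
f.close()
print(again == first)  # True
```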

Let us close the file that we opened as ``f``.
::

  f.close()

Let us again type ``f`` at the prompt to see what it shows.
::

  f

Notice that it now says the file has been closed. It is good
programming practice to close any file objects that we have
opened, after their job is done.
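
One way to make sure a file is always closed is Python's ``with``
statement, which closes the file when the indented block ends. This
is not covered in the tutorial; the sketch below is only an
illustration, using a stand-in file with made-up sample data.

```python
import os
import tempfile

# Stand-in for pendulum.txt: the numbers below are made-up sample data.
path = os.path.join(tempfile.mkdtemp(), 'pendulum.txt')
out = open(path, 'w')
out.write('0.1 0.69\n0.2 0.90\n0.3 1.18\n')
out.close()

# The file is closed automatically when the with block is left.
with open(path) as f:
    pend = f.read()

print(f.closed)   # True
```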

Let us now move on to reading files line-by-line.

%%1%% Pause the video here and re-open the file ``pendulum.txt``
with ``f`` as the file object, and then resume the video.

We just use the up arrow until we reach the open command and issue
it again.
::

  f = open('/home/fossee/pendulum.txt')

Now, to read the file line-by-line, we iterate over the file
object using the ``for`` statement. Let us iterate over the file
and print each of the lines.
::

  for line in f:
      print(line)

As we already know, ``line`` is just a dummy variable, and not a
keyword. We could have used any other variable name, but ``line``
seems meaningful enough.

Instead of just printing the lines, let us append them to a list,
``line_list``. We first initialize an empty list, ``line_list``.
::

  line_list = [ ]

Let us then read the file line-by-line and append each of the
lines to the list. We could, as usual, close the file using
``f.close()`` and re-open it. But this time, let's leave the
file object ``f`` alone and directly open the file within the
``for`` statement. This way we do not keep a name for the file
object, which saves us the trouble of closing the file each time
we open it.
::

  for line in open('/home/fossee/pendulum.txt'):
      line_list.append(line)

Let us see what ``line_list`` contains.
::

  line_list

Notice that ``line_list`` is a list of the lines in the file, along
with the newline characters. Recall that ``pend_list`` did not
contain the newline characters, because the string ``pend`` was
split on the newline characters.
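
The difference between the two lists can be seen side by side in the
sketch below. It uses a stand-in file with made-up sample data, and
the name ``stripped_list`` is introduced here only for illustration.
Removing the trailing newline from each entry of ``line_list`` with
the string method ``rstrip`` gives the same list as ``pend_list``.

```python
import os
import tempfile

# Stand-in for pendulum.txt: the numbers below are made-up sample data.
path = os.path.join(tempfile.mkdtemp(), 'pendulum.txt')
out = open(path, 'w')
out.write('0.1 0.69\n0.2 0.90\n0.3 1.18\n')
out.close()

# Iterating over the file keeps the newline at the end of each line.
line_list = []
for line in open(path):
    line_list.append(line)
print(line_list)    # ['0.1 0.69\n', '0.2 0.90\n', '0.3 1.18\n']

# Reading the whole file and splitting it drops the newlines.
f = open(path)
pend_list = f.read().splitlines()
f.close()
print(pend_list)    # ['0.1 0.69', '0.2 0.90', '0.3 1.18']

# Stripping the newline from each line makes the two lists equal.
stripped_list = []
for line in line_list:
    stripped_list.append(line.rstrip('\n'))
print(stripped_list == pend_list)   # True
```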

{{{ show the summary slide }}}

That brings us to the end of this tutorial. In this tutorial we
have learnt to open and close files, and to read the data in a file
either as a whole, using the ``read`` method, or line by line, by
iterating over the file object.

Thank you!