Mercurial: The Definitive Guide by Bryan O'Sullivan


Functions allow us to enclose a set of statements and call them again and again instead of repeating that group of statements every time. Functions also isolate a piece of code from all the other code, which provides the convenience of not polluting the global namespace.

A function in Python is defined with the keyword def followed by the name of the function, in turn followed by a pair of parentheses that enclose the list of parameters to the function. The definition line ends with a ':'. The definition line is followed by the body of the function, indented by one level. A function typically returns a value with the return statement:

def factorial(n):
  fact = 1
  # range() stops before its upper bound, so use n + 1 to include n
  for i in range(2, n + 1):
    fact *= i

  return fact

The code snippet above defines a function with the name factorial, takes the number for which the factorial must be computed, computes the factorial and returns the value.

A function, once defined, can be called anywhere else in the program. We call a function with its name followed by a pair of parentheses that enclose the arguments to the function.

The value that the function returns can be assigned to a variable. Let's call the above function and store the factorial in a variable:

fact5 = factorial(5)

The value of fact5 will now be 120, which is the factorial of 5. Note that we passed 5 as the argument to the function.
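As a quick sanity check, the sketch below compares a loop-based factorial with `math.factorial` from the standard library. It is written for Python 3 and restates the function with an inclusive upper bound so that it is self-contained; the name `factorial_loop` is chosen here only for illustration.

```python
import math


def factorial_loop(n):
    """Return n! for a non-negative integer n using a simple loop."""
    fact = 1
    # range() excludes its upper bound, so n + 1 is needed to include n.
    for i in range(2, n + 1):
        fact *= i
    return fact


fact5 = factorial_loop(5)
print(fact5)

# Cross-check the loop against the standard library for small inputs.
matches = all(factorial_loop(k) == math.factorial(k) for k in range(10))
print(matches)
```

Comparing against a trusted implementation like this is a cheap way to catch off-by-one mistakes in the loop bounds.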

It is often necessary to document what each function does, to help the person who reads our code understand it better. To support this, Python allows the first statement of the function body to be a string. This string is called a documentation string, or docstring. Docstrings prove to be very handy, since there are a number of tools that can pull out all the docstrings from Python functions and generate documentation automatically from them. A docstring for a function can be written as follows:

def factorial(n):
  'Returns the factorial for the number n.'
  fact = 1
  # range() stops before its upper bound, so use n + 1 to include n
  for i in range(2, n + 1):
    fact *= i

  return fact
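A docstring is stored on the function object itself, which is how documentation tools find it. The following sketch, written for Python 3, reads it back through the `__doc__` attribute and through `inspect.getdoc()` from the standard library; the wording of the docstring is illustrative.

```python
import inspect


def factorial(n):
    """Return the factorial of the non-negative integer n."""
    fact = 1
    for i in range(2, n + 1):
        fact *= i
    return fact


# The docstring is available as an attribute of the function.
doc = factorial.__doc__
print(doc)

# inspect.getdoc() returns the same text with indentation cleaned up,
# which matters mostly for multi-line docstrings.
clean_doc = inspect.getdoc(factorial)
print(clean_doc)
```

In an interactive session, `help(factorial)` displays the same text.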

An important point to note here is that a function can return any Python value or object, including a tuple. A tuple is just a collection of values, and those values can themselves be of any valid Python datatype, including lists, tuples, and dictionaries, among other things. So, effectively, because a function can return a tuple, it can return any number of values through that tuple.

Let us write a small function to swap two values:

def swap(a, b):
  return b, a

c, d = swap(a, b)
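To make the idea of returning several values concrete, here is a small sketch for Python 3. The helper `min_max` is a hypothetical name introduced for this example; it returns two values packed into one tuple, which the caller can either keep whole or unpack.

```python
def min_max(values):
    """Return the smallest and largest item of a non-empty sequence."""
    # The comma builds a tuple, so both results travel back together.
    return min(values), max(values)


def swap(a, b):
    return b, a


result = min_max([3, 1, 4, 1, 5])
print(result)

# Tuple unpacking assigns each element to its own name.
lowest, highest = result
c, d = swap(lowest, highest)
print(c, d)
```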

Function scope

The variables used inside a function are confined to the function's scope and do not affect variables of the same name outside that scope. Also, assigning a new value to a parameter inside the function only rebinds the local name, so an argument of a basic, immutable Python data type appears to be passed by value:

def cant_change(n):
  n = 10

n = 5
cant_change(n)

After running this code, what do you think has happened to the value of n, which was assigned 5 before the function call? If you have already tried the snippet in the interpreter, you know that the value of n is not changed. This is true of all immutable Python types, such as numbers, strings and tuples. But when you pass mutable objects like lists and dictionaries, changes made to them inside the function are visible outside the function as well:

>>> def can_change(n):
...   n[1] = 'James'
...
>>> name = ['Mr.', 'Steve', 'Gosling']
>>> can_change(name)
>>> name
['Mr.', 'James', 'Gosling']

If nothing is returned by the function explicitly, Python returns None when the function is called.
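The difference between rebinding a parameter and mutating the object it refers to can be seen side by side in the following Python 3 sketch. The function names `rebind` and `mutate` are made up for this illustration. The last line also shows the implicit None return value.

```python
def rebind(items):
    # Assignment gives the local name a new object; the caller's
    # list is left untouched.
    items = ['new', 'list']


def mutate(items):
    # append() changes the object that the caller also refers to.
    items.append('added')


names = ['a', 'b']

rebind(names)
after_rebind = list(names)   # copy of the list as it is right now

result = mutate(names)       # mutate() has no return statement

print(after_rebind)
print(names)
print(result)
```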

1. Default Arguments

There may be situations where we need to allow a function to take some arguments optionally. Python allows us to define functions this way through a facility called default arguments. For example, suppose we need to write a function that returns a list of Fibonacci numbers. Since our function cannot generate an infinite list of Fibonacci numbers, we need to specify the number of elements that the sequence must contain. Suppose, additionally, that we want the function to return 10 numbers in the sequence if no count is specified. We can define the function as follows:

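def fib(n=10):
  fib_list = [0, 1]
  for i in range(n - 2):
    # avoid the name 'next', which would shadow a Python built-in
    next_num = fib_list[-2] + fib_list[-1]
    fib_list.append(next_num)
  # slicing keeps the result correct when n is less than 2
  return fib_list[:n]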

When we call this function, we can optionally specify a value for the parameter n as an argument. Calling it with no argument and with the argument 5 returns the following Fibonacci sequences:

>>> fib()
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
>>> fib(5)
[0, 1, 1, 2, 3]
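Here is another minimal sketch of a default argument, written for Python 3. The function `power` is a hypothetical example, not part of the text above. It also shows that the default values are stored on the function object in its `__defaults__` attribute.

```python
def power(base, exponent=2):
    """Raise base to exponent; exponent defaults to 2 when omitted."""
    return base ** exponent


squared = power(4)      # exponent falls back to its default, 2
cubed = power(4, 3)     # an explicit argument overrides the default

# Defaults are kept in a tuple on the function object.
defaults = power.__defaults__

print(squared, cubed, defaults)
```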

2. Keyword Arguments

When a function takes a large number of arguments, it may be difficult to remember the order of the parameters in the function definition, or it may be necessary to pass values to only certain parameters because the others take their default values. In either of these cases, Python lets us pass arguments by specifying the name of the parameter as given in the function definition. These are known as keyword arguments.

In a function call, keyword arguments can be used for each argument, in the following fashion:

argument_name=argument_value
Also denoted as: keyword=argument

def wish(name='World', greetings='Hello'):
  print "%s, %s!" % (greetings, name)

This function can be called in either of the following ways. It is important to note that no restriction is imposed on the order in which keyword arguments are specified. Also note that we have combined keyword arguments with default arguments in this example; however, this is not necessary:

wish(name='Guido', greetings='Hey')
wish(greetings='Hey', name='Guido')

Arguments passed in the order of the parameters in the function definition are called positional arguments, as opposed to keyword arguments. It is possible to use both positional arguments and keyword arguments in a single function call, but Python does not allow us to interleave them arbitrarily. The arguments in the call must always start with the positional arguments, which are then followed by the keyword arguments:

def my_func(x, y, z, u, v, w):
  # initialize variables.
  ...
  # do some stuff
  ...
  # return the value

It is valid to call the above function in the following ways:

my_func(10, 20, 30, u=1.0, v=2.0, w=3.0)
my_func(10, 20, 30, 1.0, 2.0, w=3.0)
my_func(10, 20, z=30, u=1.0, v=2.0, w=3.0)
my_func(x=10, y=20, z=30, u=1.0, v=2.0, w=3.0)

The following calls are invalid, because in each of them a positional argument follows a keyword argument:

my_func(10, 20, z=30, 1.0, 2.0, 3.0)
my_func(x=10, 20, z=30, 1.0, 2.0, 3.0)
my_func(x=10, y=20, z=30, u=1.0, v=2.0, 3.0)
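The ordering rule is enforced when the code is compiled, before anything runs. The Python 3 sketch below demonstrates this by handing one of the invalid calls to the built-in `compile()` as a string. The body of `my_func` is a stand-in that simply returns its arguments, since the original body is only outlined.

```python
def my_func(x, y, z, u, v, w):
    # Stand-in body: return the arguments so the binding is visible.
    return (x, y, z, u, v, w)


# Positional arguments first, keyword arguments afterwards: accepted.
valid = my_func(10, 20, z=30, u=1.0, v=2.0, w=3.0)
print(valid)

# A positional argument after a keyword argument is a syntax error,
# so the call is rejected before my_func is ever invoked.
try:
    compile("my_func(10, 20, z=30, 1.0, 2.0, 3.0)", "<example>", "eval")
    rejected = False
except SyntaxError:
    rejected = True

print(rejected)
```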

3. Parameter Packing and Unpacking

The positional arguments passed to a function can be collected in a tuple parameter, and keyword arguments can be collected in a dictionary parameter. Since keyword arguments must always be the last set of arguments passed to a function, the dictionary parameter must be the last parameter. The function definition must list the explicit parameters first, followed by the tuple parameter that collects positional arguments, whose name is preceded by a *, in turn followed by the dictionary parameter that collects keyword arguments, whose name is preceded by a **:

def print_report(title, *args, **name):
  """Structure of *args*
  (age, email-id)
  Structure of *name*
  {
      'first': First Name
      'middle': Middle Name
      'last': Last Name
  }
  """

  print "Title: %s" % (title)
  print "Full name: %(first)s %(middle)s %(last)s" % name
  print "Age: %d\nEmail-ID: %s" % args

The above function can be called as follows. Note that the order of the keyword arguments can be interchanged:

>>> print_report('Employee Report', 29, 'johny@example.com', first='Johny',
                 last='Charles', middle='Douglas')
Title: Employee Report
Full name: Johny Douglas Charles
Age: 29
Email-ID: johny@example.com

The reverse can also be achieved by using a nearly identical syntax when calling the function. A tuple or a dictionary can be passed in place of a list of positional arguments or keyword arguments, respectively, using * or **:

def print_report(title, age, email, first, middle, last):
  print "Title: %s" % (title)
  print "Full name: %s %s %s" % (first, middle, last)
  print "Age: %d\nEmail-ID: %s" % (age, email)

>>> args = (29, 'johny@example.com')
>>> name = {
        'first': 'Johny',
        'middle': 'Charles',
        'last': 'Douglas'
        }
>>> print_report('Employee Report', *args, **name)
Title: Employee Report
Full name: Johny Charles Douglas
Age: 29
Email-ID: johny@example.com
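Both directions can be sketched in a few lines of Python 3. The functions `describe` and `full_name` are hypothetical helpers written for this illustration. The first collects extra arguments into a tuple and a dictionary, and the second receives its arguments unpacked from a dictionary.

```python
def describe(title, *args, **kwargs):
    # args is a tuple of the extra positional arguments,
    # kwargs is a dict of the keyword arguments.
    return title, args, kwargs


def full_name(first, middle, last):
    return "%s %s %s" % (first, middle, last)


# Packing: the call site passes loose arguments.
packed = describe('Report', 29, 'johny@example.com',
                  first='Johny', last='Charles')
print(packed)

# Unpacking: ** spreads the dictionary into keyword arguments.
parts = {'first': 'Johny', 'middle': 'Douglas', 'last': 'Charles'}
unpacked = full_name(**parts)
print(unpacked)
```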

4. Nested Functions and Scopes

Python allows nesting one function inside another. This style of programming turns out to be an extremely flexible and powerful feature when we use Python decorators. Decorators are beyond the scope of this course, so we will not discuss them here. If you are interested in knowing more about decorator programming in Python, the following articles are suggested reading:

      http://avinashv.net/2008/04/python-decorators-syntactic-sugar/
      http://personalpages.tds.net/~kent37/kk/00001.html

However, the following is an example for nested functions in Python:

def outer():
  print "Outer..."
  def inner():
    print "Inner..."
  print "Outer..."
  inner()

>>> outer()
Outer...
Outer...
Inner...
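One common use of nesting, shown here as a Python 3 sketch, is an inner function that reads a variable from the enclosing function's scope and is then returned to the caller. The names `make_multiplier` and `multiply` are invented for this example. The pattern goes slightly beyond the snippet above, but it is the foundation that decorators build on.

```python
def make_multiplier(factor):
    def multiply(x):
        # factor is looked up in the enclosing function's scope.
        return x * factor
    # The inner function is returned without being called.
    return multiply


double = make_multiplier(2)
triple = make_multiplier(3)

print(double(5))
print(triple(5))
```

Each call to `make_multiplier` creates a separate inner function that remembers its own `factor`.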

5. map, reduce and filter functions

Python provides several built-in functions for convenience. The map(), reduce() and filter() functions prove to be very useful with sequences like lists.

The map(function, sequence) function takes two arguments: function and sequence. The function argument must be a function that takes a single argument, an individual element of the sequence. map calls function(item) for each item in the sequence and returns a list of values, where each value is the result of one call to function(item). map() also accepts more than one sequence. In this case, function must take as many arguments as the number of sequences passed. It is called with the corresponding elements from each of the sequences, or with None in place of an element once one of the sequences is exhausted:

def square(x):
  return x*x

>>> map(square, [1, 2, 3, 4])
[1, 4, 9, 16]

def mul(x, y):
  return x*y

>>> map(mul, [1, 2, 3, 4], [6, 7, 8, 9])
[6, 14, 24, 36]

The filter(function, sequence) function takes two arguments, similar to the map() function. filter calls function(item) for each item in the sequence and returns all the elements of the sequence for which function(item) returned True:

def even(x):
  # x % 2 is 0 for even numbers, so compare against 0 explicitly
  return x % 2 == 0

>>> filter(even, range(1, 10))
[2, 4, 6, 8]

The reduce(function, sequence) function takes two arguments, similar to the map function; however, multiple sequences are not allowed. reduce calls function with the first two elements of the sequence, obtains the result, then calls function with that result and the next element in the sequence, and so on until the end of the sequence, and returns the final result:

def mul(x, y):
  return x*y

>>> reduce(mul, [1, 2, 3, 4])
24
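The interpreter sessions above reflect Python 2. For readers on Python 3, the sketch below shows the same three operations. There, `map()` and `filter()` return lazy iterators, so they are wrapped in `list()` here, and `reduce()` has to be imported from the `functools` module. The helper name `is_even` is chosen for this example.

```python
from functools import reduce


def square(x):
    return x * x


def mul(x, y):
    return x * y


def is_even(x):
    return x % 2 == 0


# list() forces the lazy iterators so the results can be inspected.
squares = list(map(square, [1, 2, 3, 4]))
products = list(map(mul, [1, 2, 3, 4], [6, 7, 8, 9]))
evens = list(filter(is_even, range(1, 10)))

# reduce() folds the sequence pairwise: ((1 * 2) * 3) * 4
product = reduce(mul, [1, 2, 3, 4])

print(squares, products, evens, product)
```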

5.1. List Comprehensions

A list comprehension is a convenience provided by Python. It is syntactic sugar for creating lists. Using list comprehensions, one can create lists from other sequential data structures or from other lists. The syntax of a list comprehension consists of square brackets, indicating that the result is a list, within which we write an expression followed by at least one for clause and, optionally, further for or if clauses. This will be clearer with an example:

>>> num = [1, 2, 3]
>>> sq = [x*x for x in num]
>>> sq
[1, 4, 9]
>>> all_num = [1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> even = [x for x in all_num if x%2 == 0]
>>> even
[2, 4, 6, 8]

The syntax reads almost like the sentence it expresses. It can be translated into English as, "for each element x in the list all_num, if the remainder of x divided by 2 is 0, add x to the list."
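The English translation maps directly onto an ordinary loop. The following Python 3 sketch builds the same list both ways, and then shows a comprehension with two for clauses. The names `even_loop` and `pairs` are introduced here for illustration.

```python
all_num = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Comprehension form.
even = [x for x in all_num if x % 2 == 0]

# Equivalent explicit loop.
even_loop = []
for x in all_num:
    if x % 2 == 0:
        even_loop.append(x)

# Several for clauses behave like nested loops, leftmost outermost.
pairs = [(x, y) for x in [1, 2] for y in ['a', 'b']]

print(even, even_loop, pairs)
```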


diff -r 000000000000 -r 8083d21c0020 web/html/backup/abc.html
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/web/html/backup/abc.html Mon Jan 25 18:56:45 2010 +0530
@@ -0,0 +1,530 @@


Chapter 14. Adding functionality with extensions

Table of Contents

14.1. Improve performance with the inotify extension

14.2. Flexible diff support with the extdiff extension

14.2.1. Defining command aliases

14.3. Cherrypicking changes with the transplant extension

14.4. Send changes via email with the patchbomb extension

14.4.1. Changing the behavior of patchbombs


Section 3.3, “Simplifying the pull-merge-commit sequence” covers the fetch extension; this combines pulling new changes and merging them with local changes into a single command, fetch.
In Chapter 10, Handling repository events with hooks, we covered several extensions that are useful for hook-related functionality: acl adds access control lists; bugzilla adds integration with the Bugzilla bug tracking system; and notify sends notification emails on new changes.
The Mercurial Queues patch management extension is so invaluable that it merits two chapters and an appendix all to itself. Chapter 12, Managing change with Mercurial Queues covers the basics; Chapter 13, Advanced uses of Mercurial Queues discusses advanced topics; and Appendix B, Mercurial Queues reference goes into detail on each command.

In Section 14.1, “Improve performance with the inotify extension”, we'll discuss the possibility of huge performance improvements using the inotify extension.

14.1. Improve performance with the `inotify` extension

Are you interested in having some of the most common Mercurial operations run as much as a hundred times faster? Read on!

Before we continue, please pay attention to some caveats.

The inotify extension is Linux-specific. Because it interfaces directly to the Linux kernel's inotify subsystem, it does not work on other operating systems.
It should work on any Linux distribution that was released after early 2005. Older distributions are likely to have a kernel that lacks inotify, or a version of glibc that does not have the necessary interfacing support.
Not all filesystems are suitable for use with the inotify extension. Network filesystems such as NFS are a non-starter, for example, particularly if you're running Mercurial on several systems, all mounting the same network filesystem. The kernel's inotify system has no way of knowing about changes made on another system. Most local filesystems (e.g. ext3, XFS, ReiserFS) should work fine.

The inotify extension is not yet shipped with Mercurial as of May 2007, so it's a little more involved to set up than other extensions. But the performance improvement is worth it!

The extension currently comes in two parts: a set of patches to the Mercurial source code, and a library of Python bindings to the inotify subsystem.

	Note
	There are two Python `inotify` binding libraries. One of them is called `pyinotify`, and is packaged by some Linux distributions as `python-inotify`. This is not the one you'll need, as it is too buggy and inefficient to be practical.

To get going, it's best to already have a functioning copy of Mercurial installed.

	Note
	If you follow the instructions below, you'll be replacing and overwriting any existing installation of Mercurial that you might already have, using the latest “bleeding edge” Mercurial code. Don't say you weren't warned!

Clone the Python inotify binding repository. Build and install it.

hg clone http://hg.kublai.com/python/inotify
cd inotify
python setup.py build --force
sudo python setup.py install --skip-build

Clone the crew Mercurial repository. Clone the inotify patch repository so that Mercurial Queues will be able to apply patches to your copy of the crew repository.

```
hg clone http://hg.intevation.org/mercurial/crew
hg clone crew inotify
hg clone http://hg.kublai.com/mercurial/patches/inotify inotify/.hg/patches
```

Make sure that you have the Mercurial Queues extension, mq, enabled. If you've never used MQ, read Section 12.5, “Getting started with Mercurial Queues” to get started quickly.

Go into the inotify repo, and apply all of the inotify patches using the -a option to the qpush command.

```
cd inotify
hg qpush -a
```

If you get an error message from qpush, you should not continue. Instead, ask for help.

Build and install the patched version of Mercurial.

python setup.py build --force
sudo python setup.py install --skip-build

Once you've built a suitably patched version of Mercurial, all you need to do to enable the inotify extension is add an entry to your ~/.hgrc.

[extensions]
inotify =

14.2. Flexible diff support with the `extdiff` extension

Mercurial's built-in hg diff command outputs plaintext unified diffs.

$ hg diff
diff -r 80997726a0ea myfile
--- a/myfile	Wed Jan 06 06:50:18 2010 +0000
+++ b/myfile	Wed Jan 06 06:50:18 2010 +0000
@@ -1,1 +1,2 @@
 The first line.
+The second line.

If you would like to use an external tool to display modifications, you'll want to use the extdiff extension. This will let you use, for example, a graphical diff tool.

The extdiff extension is bundled with Mercurial, so it's easy to set up. In the extensions section of your ~/.hgrc, simply add a one-line entry to enable the extension.

[extensions]
extdiff =

This introduces a command named extdiff, which by default uses your system's diff command to generate a unified diff in the same form as the built-in hg diff command.

$ hg extdiff
--- a.80997726a0ea/myfile	2010-01-06 06:50:18.613674526 +0000
+++ /tmp/extdiffNErQlu/a/myfile	2010-01-06 06:50:18.437687076 +0000
@@ -1 +1,2 @@
 The first line.
+The second line.

The result won't be exactly the same as with the built-in hg diff variations, because the output of diff varies from one system to another, even when passed the same options.

$ hg extdiff -o -NprcC5
*** a.80997726a0ea/myfile	Wed Jan  6 06:50:18 2010
--- /tmp/extdiffNErQlu/a/myfile	Wed Jan  6 06:50:18 2010
***************
*** 1 ****
--- 1,2 ----
  The first line.
+ The second line.

Launching a visual diff tool is just as easy. Here's how to launch the kdiff3 viewer.

hg extdiff -p kdiff3 -o

14.2.1. Defining command aliases

[extdiff]
cmd.kdiff3 =

[extdiff]
cmd.wibble = kdiff3

[extdiff]
cmd.vimdiff = vim
opts.vimdiff = -f '+next' '+execute "DirDiff" argv(0) argv(1)'

14.3. Cherrypicking changes with the `transplant` extension

Need to have a long chat with Brendan about this.

14.4. Send changes via email with the `patchbomb` extension

As usual, the basic configuration of the patchbomb extension takes just one or two lines in your ~/.hgrc.

[extensions]
patchbomb =

Once you've enabled the extension, you will have a new command available, named email.

The email command accepts the same kind of revision syntax as every other Mercurial command. For example, this command will send every revision between 7 and tip, inclusive.

hg email -n 7:tip

When you are sending just one revision, the email command will by default use the first line of the changeset description as the subject of the single email message it sends.

14.4.1. Changing the behavior of patchbombs

Not every project has exactly the same conventions for sending changes in email; the patchbomb extension tries to accommodate a number of variations through command line options.

You can write a subject for the introductory message on the command line using the -s option. This takes one argument, the text of the subject to use.
To change the email address from which the messages originate, use the -f option. This takes one argument, the email address to use.
The default behavior is to send unified diffs (see Section 12.4, “Understanding patches” for a description of the format), one per message. You can send a binary bundle instead with the -b option.
Unified diffs are normally prefaced with a metadata header. You can omit this, and send unadorned diffs, with the --plain option.
Diffs are normally sent “inline”, in the same body part as the description of a patch. This makes it easiest for the largest number of readers to quote and respond to parts of a diff, as some mail clients will only quote the first MIME body part in a message. If you'd prefer to send the description and the diff in separate body parts, use the -a option.
Instead of sending mail messages, you can write them to an mbox-format mail folder using the -m option. That option takes one argument, the name of the file to write to.
If you would like to add a diffstat-format summary to each patch, and one to the introductory message, use the -d option. The diffstat command displays a table containing the name of each file patched, the number of lines affected, and a histogram showing how much each file is modified. This gives readers a qualitative glance at how complex a patch is.


diff -r 000000000000 -r 8083d21c0020 web/html/backup/abc.py
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/web/html/backup/abc.py Mon Jan 25 18:56:45 2010 +0530
@@ -0,0 +1,2 @@
pid_list=['x_4fe', 'x_4ff', 'x_546', 'x_503', 'x_505', 'x_506', 'x_507', 'x_508', 'x_509', 'x_50a', 'x_50b', 'x_50c', 'x_510', 'x_511', 'x_513', 'x_515', 'x_516', 'x_518', 'x_51a', 'x_51b', 'x_51c', 'x_51d', 'x_51e', 'x_51f', 'x_520', 'x_521', 'x_522', 'x_523', 'x_524', 'x_525', 'x_526', 'x_527', 'x_528', 'x_529', 'x_52a', 'x_52b', 'x_52c', 'x_52d', 'x_52e', 'x_52f', 'x_530', 'x_531', 'x_532', 'x_533', 'x_534', 'x_535', 'x_536', 'x_537', 'x_538', 'x_539', 'x_53a', 'x_53b']

diff -r 000000000000 -r 8083d21c0020 web/html/backup/abcd.html
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/web/html/backup/abcd.html Mon Jan 25 18:56:45 2010 +0530
@@ -0,0 +1,929 @@


Chapter 9. Finding and fixing mistakes

To err might be human, but to really handle the consequences well takes a top-notch revision control system. In this chapter, we'll discuss some of the techniques you can use when you find that a problem has crept into your project. Mercurial has some highly capable features that will help you to isolate the sources of problems, and to handle them appropriately.

Erasing local history

The accidental commit

I have the occasional but persistent problem of typing rather more quickly than I can think, which sometimes results in me committing a changeset that is either incomplete or plain wrong. In my case, the usual kind of incomplete changeset is one in which I've created a new source file, but forgotten to hg add it. A “plain wrong” changeset is not as common, but no less annoying.

Rolling back a transaction

In the section called “Safe operation”, I mentioned that Mercurial treats each modification of a repository as a transaction. Every time you commit a changeset or pull changes from another repository, Mercurial remembers what you did. You can undo, or roll back, exactly one of these actions using the hg rollback command. (See the section called “Rolling back is useless once you've pushed” for an important caveat about the use of this command.)

Here's a mistake that I often find myself making: committing a change in which I've created a new file, but forgotten to hg add it.

$ hg status
M a
$ echo b > b
$ hg commit -m 'Add file b'

Looking at the output of hg status after the commit immediately confirms the error.

$ hg status
? b
$ hg tip
changeset:   1:f2db1de2ba4f
tag:         tip
user:        Bryan O'Sullivan <bos@serpentine.com>
date:        Tue May 05 06:55:44 2009 +0000
summary:     Add file b

The commit captured the changes to the file a, but not the new file b. If I were to push this changeset to a repository that I shared with a colleague, the chances are high that something in a would refer to b, which would not be present in their repository when they pulled my changes. I would thus become the object of some indignation.

However, luck is with me: I've caught my error before I pushed the changeset. I use the hg rollback command, and Mercurial makes that last changeset vanish.

$ hg rollback
rolling back last transaction
$ hg tip
changeset:   0:cde70bc943e1
tag:         tip
user:        Bryan O'Sullivan <bos@serpentine.com>
date:        Tue May 05 06:55:44 2009 +0000
summary:     First commit

$ hg status
M a
? b

Notice that the changeset is no longer present in the repository's history, and the working directory once again thinks that the file a is modified. The commit and rollback have left the working directory exactly as it was prior to the commit; the changeset has been completely erased. I can now safely hg add the file b, and rerun my commit.

$ hg add b
$ hg commit -m 'Add file b, this time for real'

The erroneous pull

It's common practice with Mercurial to maintain separate development branches of a project in different repositories. Your development team might have one shared repository for your project's “0.9” release, and another, containing different changes, for the “1.0” release.

Given this, you can imagine that the consequences could be messy if you had a local “0.9” repository, and accidentally pulled changes from the shared “1.0” repository into it. At worst, you could be paying insufficient attention, and push those changes into the shared “0.9” tree, confusing your entire team (but don't worry, we'll return to this horror scenario later). However, it's more likely that you'll notice immediately, because Mercurial will display the URL it's pulling from, or you will see it pull a suspiciously large number of changes into the repository.

The hg rollback command will work nicely to expunge all of the changesets that you just pulled. Mercurial groups all changes from one hg pull into a single transaction, so one hg rollback is all you need to undo this mistake.

Rolling back is useless once you've pushed

The value of the hg rollback command drops to zero once you've pushed your changes to another repository. Rolling back a change makes it disappear entirely, but only in the repository in which you perform the hg rollback. Because a rollback eliminates history, there's no way for the disappearance of a change to propagate between repositories.

If you've pushed a change to another repository, particularly if it's a shared repository, it has essentially “escaped into the wild,” and you'll have to recover from your mistake in a different way. If you push a changeset somewhere, then roll it back, then pull from the repository you pushed to, the changeset you thought you'd gotten rid of will simply reappear in your repository.

(If you absolutely know for sure that the change you want to roll back is the most recent change in the repository that you pushed to, and you know that nobody else could have pulled it from that repository, you can roll back the changeset there, too, but you really should not expect this to work reliably. Sooner or later a change really will make it into a repository that you don't directly control (or have forgotten about), and come back to bite you.)

You can only roll back once

Mercurial stores exactly one transaction in its transaction log; that transaction is the most recent one that occurred in the repository. This means that you can only roll back one transaction. If you expect to be able to roll back one transaction, then its predecessor, this is not the behavior you will get.

$ hg rollback
rolling back last transaction
$ hg rollback
no rollback information available

Once you've rolled back one transaction in a repository, + you can't roll back again in that repository until you perform + another commit or pull.

Reverting the mistaken change

If you make a modification to a file, and decide that you really didn't want to change the file at all, and you haven't yet committed your changes, the hg revert command is the one you'll need. It looks at the changeset that's the parent of the working directory, and restores the contents of the file to their state as of that changeset. (That's a long-winded way of saying that, in the normal case, it undoes your modifications.)

Let's illustrate how the hg revert command works with yet another small example. We'll begin by modifying a file that Mercurial is already tracking.

$ cat file
original content
$ echo unwanted change >> file
$ hg diff file
diff -r b52afd4afc59 file
--- a/file	Tue May 05 06:55:32 2009 +0000
+++ b/file	Tue May 05 06:55:32 2009 +0000
@@ -1,1 +1,2 @@
 original content
+unwanted change

If we don't want that change, we can simply hg revert the file.

$ hg status
M file
$ hg revert file
$ cat file
original content

The hg revert command provides us with an extra degree of safety by saving our modified file with a .orig extension.

$ hg status
? file.orig
$ cat file.orig
original content
unwanted change

Be careful with .orig files

It's extremely unlikely that you are either using Mercurial to manage files with .orig extensions or that you even care about the contents of such files. Just in case, though, it's useful to remember that hg revert will unconditionally overwrite an existing file with a .orig extension. For instance, if you already have a file named foo.orig when you revert foo, the contents of foo.orig will be clobbered.

Here is a summary of the cases that the hg revert command can deal with. We will describe each of these in more detail in the section that follows.

If you modify a file, it will restore the file to its unmodified state.
If you hg add a file, it will undo the “added” state of the file, but leave the file itself untouched.
If you delete a file without telling Mercurial, it will restore the file to its unmodified contents.
If you use the hg remove command to remove a file, it will undo the “removed” state of the file, and restore the file to its unmodified contents.

File management errors

The hg revert command is useful for more than just modified files. It lets you reverse the results of all of Mercurial's file management commands: hg add, hg remove, and so on.

If you hg add a file, then decide that in fact you don't want Mercurial to track it, use hg revert to undo the add. Don't worry; Mercurial will not modify the file in any way. It will just “unmark” the file.

$ echo oops > oops
+
+$ hg add oops
+$ hg status oops
+A oops
+$ hg revert oops
+$ hg status
+
+? oops
+

Similarly, if you ask Mercurial to hg remove a file, you can use hg revert to restore it to the contents it had as of the parent of the working directory.

$ hg remove file
+$ hg status
+R file
+
+$ hg revert file
+$ hg status
+$ ls file
+file
+

This works just as well for a file that you deleted by hand, without telling Mercurial (recall that in Mercurial terminology, this kind of file is called “missing”).

$ rm file
+
+$ hg status
+! file
+$ hg revert file
+$ ls file
+file
+

If you revert a hg copy, the copied-to file remains in your working directory afterwards, untracked. Since a copy doesn't affect the copied-from file in any way, Mercurial doesn't do anything with the copied-from file.

$ hg copy file new-file
+
+$ hg revert new-file
+$ hg status
+? new-file
+

Dealing with committed changes

Consider a case where you have committed a change a, and another change b on top of it; you then realise that change a was incorrect. Mercurial lets you “back out” an entire changeset automatically, and provides building blocks that let you reverse part of a changeset by hand.

Before you read this section, here's something to keep in mind: the hg backout command undoes the effect of a change by adding to your repository's history, not by modifying or erasing it. It's the right tool to use if you're fixing bugs, but not if you're trying to undo some change that has catastrophic consequences. To deal with those, see the section called “Changes that should never have been”.

Backing out a changeset

The hg backout command lets you “undo” the effects of an entire changeset in an automated fashion. Because Mercurial's history is immutable, this command does not get rid of the changeset you want to undo. Instead, it creates a new changeset that reverses the effect of the to-be-undone changeset.

The operation of the hg backout command is a little intricate, so let's illustrate it with some examples. First, we'll create a repository with some simple changes.

$ hg init myrepo
+
+$ cd myrepo
+$ echo first change >> myfile
+$ hg add myfile
+$ hg commit -m 'first change'
+
+$ echo second change >> myfile
+$ hg commit -m 'second change'
+

The hg backout command takes a single changeset ID as its argument; this is the changeset to back out. Normally, hg backout will drop you into a text editor to write a commit message, so you can record why you're backing the change out. In this example, we provide a commit message on the command line using the -m option.

Backing out the tip changeset

We're going to start by backing out the last changeset we committed.

$ hg backout -m 'back out second change' tip
+
+reverting myfile
+changeset 2:01adc4672142 backs out changeset 1:7e341ee3be7a
+$ cat myfile
+first change
+

You can see that the second line from myfile is no longer present. Taking a look at the output of hg log gives us an idea of what the hg backout command has done.

$ hg log --style compact
+2[tip]   01adc4672142   2009-05-05 06:55 +0000   bos
+  back out second change
+
+1   7e341ee3be7a   2009-05-05 06:55 +0000   bos
+  second change
+
+0   56b97fc928f2   2009-05-05 06:55 +0000   bos
+  first change
+
+

Notice that the new changeset that hg backout has created is a child of the changeset we backed out. It's easier to see this in Figure 9.1, “Backing out a change using the hg backout command”, which presents a graphical view of the change history. As you can see, the history is nice and linear.

Figure 9.1. Backing out a change using the hg backout command

Backing out a non-tip change

If you want to back out a change other than the last one you committed, pass the --merge option to the hg backout command.

$ cd ..
+
+$ hg clone -r1 myrepo non-tip-repo
+requesting all changes
+adding changesets
+adding manifests
+adding file changes
+added 2 changesets with 2 changes to 1 files
+updating working directory
+1 files updated, 0 files merged, 0 files removed, 0 files unresolved
+$ cd non-tip-repo
+

This makes backing out any changeset a “one-shot” operation that's usually simple and fast.

$ echo third change >> myfile
+
+$ hg commit -m 'third change'
+$ hg backout --merge -m 'back out second change' 1
+reverting myfile
+created new head
+changeset 3:abc7fd860049 backs out changeset 1:7e341ee3be7a
+merging with changeset 3:abc7fd860049
+merging myfile
+0 files updated, 1 files merged, 0 files removed, 0 files unresolved
+(branch merge, don't forget to commit)
+

If you take a look at the contents of myfile after the backout finishes, you'll see that the first and third changes are present, but not the second.

$ cat myfile
+
+first change
+third change
+

As the graphical history in Figure 9.2, “Automated backout of a non-tip change using the hg backout command” illustrates, Mercurial still commits one change in this kind of situation (the box-shaped node is the one that Mercurial commits automatically), but the revision graph now looks different. Before Mercurial begins the backout process, it first remembers what the current parent of the working directory is. It then backs out the target changeset, and commits that as a changeset. Finally, it merges back to the previous parent of the working directory, but notice that it does not commit the result of the merge. The repository now contains two heads, and the working directory is in a merge state.

Figure 9.2. Automated backout of a non-tip change using the hg backout command

The result is that you end up “back where you were”, only with some extra history that undoes the effect of the changeset you wanted to back out.

You might wonder why Mercurial does not commit the result of the merge that it performed. The reason lies in Mercurial behaving conservatively: a merge naturally has more scope for error than simply undoing the effect of the tip changeset, so your work will be safest if you first inspect (and test!) the result of the merge, then commit it.

Always use the `--merge` option

In fact, since the --merge option will do the “right thing” whether or not the changeset you're backing out is the tip (i.e. it won't try to merge if it's backing out the tip, since there's no need), you should always use this option when you run the hg backout command.

Gaining more control of the backout process

While I've recommended that you always use the --merge option when backing out a change, the hg backout command lets you decide how to merge a backout changeset. Taking control of the backout process by hand is something you will rarely need to do, but it can be useful to understand what the hg backout command is doing for you automatically. To illustrate this, let's clone our first repository, but omit the backout change that it contains.

$ cd ..
+$ hg clone -r1 myrepo newrepo
+requesting all changes
+adding changesets
+adding manifests
+adding file changes
+added 2 changesets with 2 changes to 1 files
+updating working directory
+1 files updated, 0 files merged, 0 files removed, 0 files unresolved
+$ cd newrepo
+
+

As with our earlier example, we'll commit a third changeset, then back out its parent, and see what happens.

$ echo third change >> myfile
+$ hg commit -m 'third change'
+$ hg backout -m 'back out second change' 1
+reverting myfile
+created new head
+changeset 3:abc7fd860049 backs out changeset 1:7e341ee3be7a
+the backout changeset is a new head - do not forget to merge
+(use "backout --merge" if you want to auto-merge)
+

Our new changeset is again a descendant of the changeset we backed out; it's thus a new head, not a descendant of the changeset that was the tip. The hg backout command was quite explicit in telling us this.

$ hg log --style compact
+3[tip]:1   abc7fd860049   2009-05-05 06:55 +0000   bos
+  back out second change
+
+2   bae4005ddac4   2009-05-05 06:55 +0000   bos
+  third change
+
+1   7e341ee3be7a   2009-05-05 06:55 +0000   bos
+  second change
+
+0   56b97fc928f2   2009-05-05 06:55 +0000   bos
+  first change
+
+

Again, it's easier to see what has happened by looking at a graph of the revision history, in Figure 9.3, “Backing out a change using the hg backout command”. This makes it clear that when we use hg backout to back out a change other than the tip, Mercurial adds a new head to the repository (the change it committed is box-shaped).

Figure 9.3. Backing out a change using the hg backout command

After the hg backout command has completed, it leaves the new “backout” changeset as the parent of the working directory.

$ hg parents
+changeset:   2:bae4005ddac4
+user:        Bryan O'Sullivan <bos@serpentine.com>
+date:        Tue May 05 06:55:12 2009 +0000
+summary:     third change
+
+

Now we have two isolated sets of changes.

$ hg heads
+
+changeset:   3:abc7fd860049
+tag:         tip
+parent:      1:7e341ee3be7a
+user:        Bryan O'Sullivan <bos@serpentine.com>
+date:        Tue May 05 06:55:12 2009 +0000
+summary:     back out second change
+
+changeset:   2:bae4005ddac4
+user:        Bryan O'Sullivan <bos@serpentine.com>
+date:        Tue May 05 06:55:12 2009 +0000
+summary:     third change
+
+

Let's think about what we expect to see as the contents of myfile now. The first change should be present, because we've never backed it out. The second change should be missing, as that's the change we backed out. Since the history graph shows the third change as a separate head, we don't expect to see the third change present in myfile.

$ cat myfile
+
+first change
+

To get the third change back into the file, we just do a normal merge of our two heads.

$ hg merge
+abort: outstanding uncommitted changes
+$ hg commit -m 'merged backout with previous tip'
+$ cat myfile
+first change
+

Afterwards, the graphical history of our repository looks like Figure 9.4, “Manually merging a backout change”.

Figure 9.4. Manually merging a backout change

Why hg backout works as it does

Here's a brief description of how the hg backout command works.

It ensures that the working directory is “clean”, i.e. that the output of hg status would be empty.
It remembers the current parent of the working directory. Let's call this changeset orig.
It does the equivalent of a hg update to sync the working directory to the changeset you want to back out. Let's call this changeset backout.
It finds the parent of that changeset. Let's call that changeset parent.
For each file that the backout changeset affected, it does the equivalent of a hg revert -r parent on that file, to restore it to the contents it had before that changeset was committed.
It commits the result as a new changeset. This changeset has backout as its parent.
If you specify --merge on the command line, it merges with orig, and leaves the result of the merge uncommitted so that you can inspect it before committing.
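To make these steps concrete, here is a rough shell sketch that strings together ordinary Mercurial commands in the same order. The function name manual_backout and its variable names are invented for this illustration; it is an approximation for study, not how hg backout is actually implemented, and it assumes the target changeset has a parent. As in the --merge session shown earlier, the final merge is left uncommitted.

```shell
# Hypothetical helper approximating "hg backout --merge REV" by hand.
# For illustration only; prefer the real hg backout command.
manual_backout() {
  target=$1

  # Insist on a clean working directory.
  if [ -n "$(hg status -mard)" ]; then
    echo "working directory has uncommitted changes" >&2
    return 1
  fi

  # Remember the current parent of the working directory (orig).
  orig=$(hg parents --template '{node}\n' | head -n 1)

  # Sync the working directory to the changeset to back out.
  hg update -C "$target" || return 1

  # Find that changeset's (first) parent.
  parent=$(hg parents -r "$target" --template '{node}\n' | head -n 1)

  # Restore every file to the contents it had in the parent.
  hg revert --all --no-backup -r "$parent" || return 1

  # Commit; the new changeset is a child of the target.
  hg commit -m "back out changeset $target" || return 1
  new=$(hg parents --template '{node}\n' | head -n 1)

  # Return to orig and merge in the new changeset, unless orig was the
  # target itself, in which case history is still linear.
  target_node=$(hg log -r "$target" --template '{node}\n')
  if [ "$orig" != "$target_node" ]; then
    hg update -C "$orig" && hg merge -r "$new"
  fi
}
```

Running manual_backout 1 in a clone like the ones above should leave you with a new head and an uncommitted merge to inspect.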

An alternative way to implement the hg backout command would be to hg export the to-be-backed-out changeset as a diff, then use the --reverse option to the patch command to reverse the effect of the change without fiddling with the working directory. This sounds much simpler, but it would not work nearly as well.

The reason that hg backout does an update, a commit, and a merge (which you then commit yourself) is to give the merge machinery the best chance to do a good job when dealing with all the changes between the change you're backing out and the current tip.

If you're backing out a changeset that's 100 revisions back in your project's history, the chances that the patch command will be able to apply a reverse diff cleanly are not good, because intervening changes are likely to have “broken the context” that patch uses to determine whether it can apply a patch (if this sounds like gibberish, see the section called “Understanding patches” for a discussion of the patch command). Also, Mercurial's merge machinery will handle files and directories being renamed, permission changes, and modifications to binary files, none of which patch can deal with.

Changes that should never have been

Most of the time, the hg backout command is exactly what you need if you want to undo the effects of a change. It leaves a permanent record of exactly what you did, both when committing the original changeset and when you cleaned up after it.

On rare occasions, though, you may find that you've committed a change that really should not be present in the repository at all. For example, it would be very unusual, and usually considered a mistake, to commit a software project's object files as well as its source files. Object files have almost no intrinsic value, and they're big, so they increase the size of the repository and the amount of time it takes to clone or pull changes.

Before I discuss the options that you have if you commit a “brown paper bag” change (the kind that's so bad that you want to pull a brown paper bag over your head), let me first discuss some approaches that probably won't work.

Since Mercurial treats history as accumulative (every change builds on top of all changes that preceded it), you generally can't just make disastrous changes disappear. The one exception is when you've just committed a change, and it hasn't been pushed or pulled into another repository. That's when you can safely use the hg rollback command, as I detailed in the section called “Rolling back a transaction”.

After you've pushed a bad change to another repository, you could still use hg rollback to make your local copy of the change disappear, but it won't have the consequences you want. The change will still be present in the remote repository, so it will reappear in your local repository the next time you pull.

If a situation like this arises, and you know which repositories your bad change has propagated into, you can try to get rid of the change from every one of those repositories. This is, of course, not a satisfactory solution: if you miss even a single repository while you're expunging, the change is still “in the wild”, and could propagate further.

If you've committed one or more changes after the change that you'd like to see disappear, your options are further reduced. Mercurial doesn't provide a way to “punch a hole” in history, leaving changesets intact.

Backing out a merge

Since merges are often complicated, it is not unheard of for a merge to be mangled badly, but committed erroneously. Mercurial provides an important safeguard against bad merges by refusing to commit unresolved files, but human ingenuity guarantees that it is still possible to mess a merge up and commit it.

Given a bad merge that has been committed, usually the best way to approach it is to simply try to repair the damage by hand. A complete disaster that cannot be easily fixed up by hand ought to be very rare, but the hg backout command may help in making the cleanup easier. It offers a --parent option, which lets you specify which parent to revert to when backing out a merge.

Figure 9.5. A bad merge

Suppose we have a revision graph like that in Figure 9.5, “A bad merge”. What we'd like is to redo the merge of revisions 2 and 3.

One way to do so would be as follows.

Call hg backout --rev=4 --parent=2. This tells hg backout to back out revision 4, which is the bad merge, and, when deciding which revision to prefer, to choose parent 2, one of the parents of the merge. The effect can be seen in Figure 9.6, “Backing out the merge, favoring one parent”.
Figure 9.6. Backing out the merge, favoring one parent
Call hg backout --rev=4 --parent=3. This tells hg backout to back out revision 4 again, but this time to choose parent 3, the other parent of the merge. The result is visible in Figure 9.7, “Backing out the merge, favoring the other parent”, in which the repository now contains three heads.
Figure 9.7. Backing out the merge, favoring the other parent
Redo the bad merge by merging the two backout heads, which reduces the number of heads in the repository to two, as can be seen in Figure 9.8, “Merging the backouts”.
Figure 9.8. Merging the backouts
Merge with the commit that was made after the bad merge, as shown in Figure 9.9, “Merging the backouts”.
Figure 9.9. Merging the backouts

Protect yourself from “escaped” changes

If you've committed some changes to your local repository and they've been pushed or pulled somewhere else, this isn't necessarily a disaster. You can protect yourself ahead of time against some classes of bad changeset. This is particularly easy if your team usually pulls changes from a central repository.

By configuring some hooks on that repository to validate incoming changesets (see Chapter 10, Handling repository events with hooks), you can automatically prevent some kinds of bad changeset from being pushed to the central repository at all. With such a configuration in place, some kinds of bad changeset will naturally tend to “die out” because they can't propagate into the central repository. Better yet, this happens without any need for explicit intervention.

For instance, an incoming change hook that verifies that a changeset will actually compile can prevent people from inadvertently “breaking the build”.

What to do about sensitive changes that escape

Even a carefully run project can suffer an unfortunate event such as the committing and uncontrolled propagation of a file that contains important passwords.

If something like this happens to you, and the information that gets accidentally propagated is truly sensitive, your first step should be to mitigate the effect of the leak without trying to control the leak itself. If you are not 100% certain that you know exactly who could have seen the changes, you should immediately change passwords, cancel credit cards, or find some other way to make sure that the information that has leaked is no longer useful. In other words, assume that the change has propagated far and wide, and that there's nothing more you can do.

You might hope that there would be mechanisms you could use to either figure out who has seen a change or to erase the change permanently everywhere, but there are good reasons why these are not possible.

Mercurial does not provide an audit trail of who has pulled changes from a repository, because it is usually either impossible to record such information or trivial to spoof it. In a multi-user or networked environment, you should thus be extremely skeptical of yourself if you think that you have identified every place that a sensitive changeset has propagated to. Don't forget that people can and will send bundles by email, have their backup software save data offsite, carry repositories on USB sticks, and find other completely innocent ways to confound your attempts to track down every copy of a problematic change.

Mercurial also does not provide a way to make a file or changeset completely disappear from history, because there is no way to enforce its disappearance; someone could easily modify their copy of Mercurial to ignore such directives. In addition, even if Mercurial provided such a capability, someone who simply hadn't pulled a “make this file disappear” changeset wouldn't be affected by it, nor would web crawlers visiting at the wrong time, disk backups, or other mechanisms. Indeed, no distributed revision control system can make data reliably vanish. Providing the illusion of such control could easily give a false sense of security, and be worse than not providing it at all.

Finding the source of a bug

While it's all very well to be able to back out a changeset that introduced a bug, this requires that you know which changeset to back out. Mercurial provides an invaluable command, called hg bisect, that helps you to automate this process and accomplish it very efficiently.

The idea behind the hg bisect command is that a changeset has introduced some change of behavior that you can identify with a simple pass/fail test. You don't know which piece of code introduced the change, but you know how to test for the presence of the bug. The hg bisect command uses your test to direct its search for the changeset that introduced the code that caused the bug.

Here are a few scenarios to help you understand how you might apply this command.

The most recent version of your software has a bug that you remember wasn't present a few weeks ago, but you don't know when it was introduced. Here, your binary test checks for the presence of that bug.
You fixed a bug in a rush, and now it's time to close the entry in your team's bug database. The bug database requires a changeset ID when you close an entry, but you don't remember which changeset you fixed the bug in. Once again, your binary test checks for the presence of the bug.
Your software works correctly, but runs 15% slower than the last time you measured it. You want to know which changeset introduced the performance regression. In this case, your binary test measures the performance of your software, to see whether it's “fast” or “slow”.
The sizes of the components of your project that you ship exploded recently, and you suspect that something changed in the way you build your project.

From these examples, it should be clear that the hg bisect command is not useful only for finding the sources of bugs. You can use it to find any “emergent property” of a repository (anything that you can't find from a simple text search of the files in the tree) for which you can write a binary test.

We'll introduce a little bit of terminology here, just to make it clear which parts of the search process are your responsibility, and which are Mercurial's. A test is something that you run when hg bisect chooses a changeset. A probe is what hg bisect runs to tell whether a revision is good. Finally, we'll use the word “bisect”, as both a noun and a verb, to stand in for the phrase “search using the hg bisect command”.

One simple way to automate the searching process would be simply to probe every changeset. However, this scales poorly. If it took ten minutes to test a single changeset, and you had 10,000 changesets in your repository, the exhaustive approach would take on average 35 days to find the changeset that introduced a bug. Even if you knew that the bug was introduced by one of the last 500 changesets, and limited your search to those, you'd still be looking at over 40 hours to find the changeset that introduced your bug.

What the hg bisect command does is use its knowledge of the “shape” of your project's revision history to perform a search in time proportional to the logarithm of the number of changesets to check (the kind of search it performs is called a dichotomic search). With this approach, searching through 10,000 changesets will take less than three hours, even at ten minutes per test (the search will require about 14 tests). Limit your search to the last hundred changesets, and it will take only about an hour (roughly seven tests).
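If you'd like to check these estimates yourself, the arithmetic is easy to reproduce. The following shell sketch counts how many halvings are needed to narrow a range of changesets down to one, and compares that with testing half of the changesets on average. The function name bisect_tests is made up for this illustration, and the count assumes a simple linear history; a real search over branchy history may need a slightly different number of tests.

```shell
# Number of tests a halving search needs for n candidates: ceil(log2(n)).
bisect_tests() {
  n=$1
  tests=0
  span=1
  while [ "$span" -lt "$n" ]; do
    span=$((span * 2))
    tests=$((tests + 1))
  done
  echo "$tests"
}

minutes_per_test=10
for n in 100 500 10000; do
  t=$(bisect_tests "$n")
  # An exhaustive search tests about half of the changesets on average.
  exhaustive=$((n * minutes_per_test / 2))
  echo "$n changesets: exhaustive ~$exhaustive min, bisect $t tests ($((t * minutes_per_test)) min)"
done
```

For 10,000 changesets this reports 14 tests and 140 minutes, against an exhaustive average of 50,000 minutes, which is about 35 days.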

The hg bisect command is aware of the “branchy” nature of a Mercurial project's revision history, so it has no problems dealing with branches, merges, or multiple heads in a repository. It can prune entire branches of history with a single probe, which is how it operates so efficiently.

Using the hg bisect command

Here's an example of hg bisect in action.

	Note
	In versions 0.9.5 and earlier of Mercurial, hg bisect was not a core command: it was distributed with Mercurial as an extension. This section describes the built-in command, not the old extension.

Now let's create a repository, so that we can try out the hg bisect command in isolation.

$ hg init mybug
+
+$ cd mybug
+

We'll simulate a project that has a bug in it in a simple-minded way: create trivial changes in a loop, and nominate one specific change that will have the “bug”. This loop creates 35 changesets, each adding a single file to the repository. We'll represent our “bug” with a file that contains the text “i have a gub”.

$ buggy_change=22
+$ for (( i = 0; i < 35; i++ )); do
+
+>   if [[ $i = $buggy_change ]]; then
+>     echo 'i have a gub' > myfile$i
+>     hg commit -q -A -m 'buggy changeset'
+>   else
+
+>     echo 'nothing to see here, move along' > myfile$i
+>     hg commit -q -A -m 'normal changeset'
+>   fi
+> done
+
+

The next thing that we'd like to do is figure out how to use the hg bisect command. We can use Mercurial's normal built-in help mechanism for this.

$ hg help bisect
+hg bisect [-gbsr] [-c CMD] [REV]
+
+subdivision search of changesets
+
+    This command helps to find changesets which introduce problems.
+    To use, mark the earliest changeset you know exhibits the problem
+    as bad, then mark the latest changeset which is free from the
+    problem as good. Bisect will update your working directory to a
+    revision for testing (unless the --noupdate option is specified).
+    Once you have performed tests, mark the working directory as bad
+    or good and bisect will either update to another candidate changeset
+    or announce that it has found the bad revision.
+
+    As a shortcut, you can also use the revision argument to mark a
+    revision as good or bad without checking it out first.
+
+    If you supply a command it will be used for automatic bisection. Its exit
+    status will be used as flag to mark revision as bad or good. In case exit
+    status is 0 the revision is marked as good, 125 - skipped, 127 (command not
+    found) - bisection will be aborted; any other status bigger than 0 will
+    mark revision as bad.
+
+options:
+
+ -r --reset     reset bisect state
+ -g --good      mark changeset good
+ -b --bad       mark changeset bad
+ -s --skip      skip testing changeset
+ -c --command   use command to check changeset state
+ -U --noupdate  do not update to target
+
+use "hg -v help bisect" to show global options
+

The hg bisect command works in steps. Each step proceeds as follows.

You run your binary test.
- If the test succeeded, you tell hg bisect by running the hg bisect --good command.
- If it failed, run the hg bisect --bad command.
The command uses your information to decide which changeset to test next.
It updates the working directory to that changeset, and the process begins again.

The process ends when hg bisect identifies a unique changeset that marks the point where your test transitioned from “succeeding” to “failing”.

To start the search, we must run the hg bisect --reset command.

$ hg bisect --reset
+
+

In our case, the binary test we use is simple: we check to see if any file in the repository contains the string “i have a gub”. If it does, this changeset contains the change that “caused the bug”. By convention, a changeset that has the property we're searching for is “bad”, while one that doesn't is “good”.

Most of the time, the revision to which the working directory is synced (usually the tip) already exhibits the problem introduced by the buggy change, so we'll mark it as “bad”.

$ hg bisect --bad
+

Our next task is to nominate a changeset that we know doesn't have the bug; the hg bisect command will “bracket” its search between the first pair of good and bad changesets. In our case, we know that revision 10 didn't have the bug. (I'll have more words about choosing the first “good” changeset later.)

$ hg bisect --good 10
+
+Testing changeset 22:b8789808fc48 (24 changesets remaining, ~4 tests)
+0 files updated, 0 files merged, 12 files removed, 0 files unresolved
+

Notice that this command printed some output.

It told us how many changesets it must consider before it can identify the one that introduced the bug, and how many tests that will require.
It updated the working directory to the next changeset to test, and told us which changeset it's testing.

We now run our test in the working directory. We use the grep command to see if our “bad” file is present in the working directory. If it is, this revision is bad; if not, this revision is good.

$ if grep -q 'i have a gub' *
+> then
+
+>   result=bad
+> else
+>   result=good
+> fi
+$ echo this revision is $result
+
+this revision is bad
+$ hg bisect --$result
+Testing changeset 16:e61fdddff53e (12 changesets remaining, ~3 tests)
+0 files updated, 0 files merged, 6 files removed, 0 files unresolved
+


This test looks like a perfect candidate for automation, so let's turn it into a shell function.

$ mytest() {
+>   if grep -q 'i have a gub' *
+
+>   then
+>     result=bad
+>   else
+>     result=good
+>   fi
+
+>   echo this revision is $result
+>   hg bisect --$result
+> }
+

We can now run an entire test step with a single command, mytest.

$ mytest
+
+this revision is good
+Testing changeset 19:706df39b003b (6 changesets remaining, ~2 tests)
+3 files updated, 0 files merged, 0 files removed, 0 files unresolved
+

A few more invocations of our canned test step command, and we're done.

$ mytest
+this revision is good
+Testing changeset 20:bf7ea9a054e6 (3 changesets remaining, ~1 tests)
+1 files updated, 0 files merged, 0 files removed, 0 files unresolved
+$ mytest
+this revision is good
+Testing changeset 21:921391dd45c1 (2 changesets remaining, ~1 tests)
+1 files updated, 0 files merged, 0 files removed, 0 files unresolved
+$ mytest
+this revision is good
+The first bad revision is:
+changeset:   22:b8789808fc48
+user:        Bryan O'Sullivan <bos@serpentine.com>
+
+date:        Tue May 05 06:55:14 2009 +0000
+summary:     buggy changeset
+
+

Even though we had 35 changesets to search through, the hg bisect command let us find the changeset that introduced our “bug” with only five tests. Because the number of tests that the hg bisect command performs grows logarithmically with the number of changesets to search, the advantage that it has over the “brute force” search approach increases with every changeset you add.
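To see where the five tests come from, it can help to replay the search without Mercurial at all. The sketch below is a plain halving search over revision numbers, assuming a strictly linear history; first_bad is a name invented for this illustration, and Mercurial's own algorithm is more general because it has to cope with branches and merges.

```shell
# Replay a halving search between a known good and a known bad revision.
# Arguments: last known good revision, known bad revision, and the
# revision that (unknown to the searcher) introduced the bug.
first_bad() {
  good=$1
  bad=$2
  bug=$3
  tests=0
  while [ $((bad - good)) -gt 1 ]; do
    mid=$(( (good + bad) / 2 ))
    tests=$((tests + 1))
    if [ "$mid" -ge "$bug" ]; then
      echo "test $tests: revision $mid is bad"
      bad=$mid
    else
      echo "test $tests: revision $mid is good"
      good=$mid
    fi
  done
  echo "first bad revision: $bad after $tests tests"
}

first_bad 10 34 22
```

With the values from our session (good at 10, bad at 34, bug at 22), this probes revisions 22, 16, 19, 20 and 21, the same five that hg bisect asked us to test.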

Cleaning up after your search

When you're finished using the hg bisect command in a repository, you can use the hg bisect --reset command to drop the information it was using to drive your search. The command doesn't use much space, so it doesn't matter if you forget to run this command. However, hg bisect won't let you start a new search in that repository until you do a hg bisect --reset.

$ hg bisect --reset
+
+

Tips for finding bugs effectively

Give consistent input

The hg bisect command requires that you correctly report the result of every test you perform. If you tell it that a test failed when it really succeeded, it might be able to detect the inconsistency. If it can identify an inconsistency in your reports, it will tell you that a particular changeset is both good and bad. However, it can't do this perfectly; it's about as likely to report the wrong changeset as the source of the bug.

Automate as much as possible

When I started using the hg bisect command, I tried a few times to run my tests by hand, on the command line. This is an approach that I, at least, am not suited to. After a few tries, I found that I was making enough mistakes that I was having to restart my searches several times before finally getting correct results.

My initial problems with driving the hg bisect command by hand occurred even with simple searches on small repositories; if the problem you're looking for is more subtle, or the number of tests that hg bisect must perform increases, the likelihood of operator error ruining the search is much higher. Once I started automating my tests, I had much better results.

The key to automated testing is twofold:

always test for the same symptom, and
always feed consistent input to the hg bisect command.

In my tutorial example above, the grep command tests for the symptom, and the if statement takes the result of this check and ensures that we always feed the same input to the hg bisect command. The mytest function marries these together in a reproducible way, so that every test is uniform and consistent.

Check your results

Because the output of a hg bisect search is only as good as the input you give it, don't take the changeset it reports as the absolute truth. A simple way to cross-check its report is to manually run your test at each of the following changesets:

The changeset that it reports as the first bad revision. Your test should still report this as bad.
The parent of that changeset (either parent, if it's a merge). Your test should report this changeset as good.
A child of that changeset. Your test should report this changeset as bad.

Beware interference between bugs

It's possible that your search for one bug could be disrupted by the presence of another. For example, let's say your software crashes at revision 100, and worked correctly at revision 50. Unknown to you, someone else introduced a different crashing bug at revision 60, and fixed it at revision 80. This could distort your results in one of several ways.

It is possible that this other bug completely “masks” yours, which is to say that it occurs before your bug has a chance to manifest itself. If you can't avoid that other bug (for example, it prevents your project from building), and so can't tell whether your bug is present in a particular changeset, the hg bisect command cannot help you directly. Instead, you can mark a changeset as untested by running hg bisect --skip.

A different problem could arise if your test for a bug's presence is not specific enough. If you check for “my program crashes”, then both your crashing bug and an unrelated crashing bug that masks it will look like the same thing, and mislead hg bisect.

Another useful situation in which to use hg bisect --skip is if you can't test a revision because your project was in a broken and hence untestable state at that revision, perhaps because someone checked in a change that prevented the project from building.

Bracket your search lazily

Choosing the first “good” and “bad” changesets that will mark the end points of your search is often easy, but it bears a little discussion nevertheless. From the perspective of hg bisect, the “newest” changeset is conventionally “bad”, and the older changeset is “good”.

If you're having trouble remembering when a suitable “good” change was, so that you can tell hg bisect, you could do worse than testing changesets at random. Just remember to eliminate contenders that can't possibly exhibit the bug (perhaps because the feature with the bug isn't present yet) and those where another problem masks the bug (as I discussed above).

Even if you end up “early” by thousands of changesets or months of history, you will only add a handful of tests to the total number that hg bisect must perform, thanks to its logarithmic behavior.


diff -r 000000000000 -r 8083d21c0020 web/html/backup/ar01.html
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/web/html/backup/ar01.html Mon Jan 25 18:56:45 2010 +0530
@@ -0,0 +1,387 @@

Functional Approach

Table of Contents

5.1. List Comprehensions

def factorial(n):
  fact = 1
  for i in range(2, n + 1):
    fact *= i

  return fact

The code snippet above defines a function with the name factorial, takes the number for which the factorial must be computed, computes the factorial and returns the value.

A function, once defined, can be used or called anywhere else in the program. We call a function with its name followed by a pair of parentheses which enclose the arguments to the function.

The value that the function returns can be assigned to a variable. Let's call the above function and store the factorial in a variable:

fact5 = factorial(5)

The value of fact5 will now be 120, which is the factorial of 5. Note that we passed 5 as the argument to the function.

def factorial(n):
  'Returns the factorial for the number n.'
  fact = 1
  for i in range(2, n + 1):
    fact *= i

  return fact
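
As a small illustration of how tools get at a docstring, the sketch below reads it back through the function's __doc__ attribute. The loop runs up to and including n so that factorial(5) gives 120; the variable name doc is only for this example.

```python
def factorial(n):
    'Returns the factorial for the number n.'
    fact = 1
    # range() excludes its upper bound, so go up to n + 1 to include n.
    for i in range(2, n + 1):
        fact *= i
    return fact

# The first string in the function body is stored on the function object.
doc = factorial.__doc__
fact5 = factorial(5)
```

Typing help(factorial) at the interactive prompt displays the same string.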

Let us write a small function to swap two values:

def swap(a, b):
  return b, a

a, b = 1, 2
c, d = swap(a, b)

def cant_change(n):
  n = 10

n = 5
cant_change(n)

>>> def can_change(n):
...   n[1] = 'James'
...

>>> name = ['Mr.', 'Steve', 'Gosling']
>>> can_change(name)
>>> name
['Mr.', 'James', 'Gosling']
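
The difference between the two snippets above is that cant_change rebinds its local name, while can_change modifies the list object that the caller also refers to. A minimal side-by-side sketch (the variable names number and names are chosen only for this illustration):

```python
def cant_change(n):
    n = 10                 # rebinds the local name; the caller's value is untouched

def can_change(names):
    names[1] = 'James'     # modifies the list object shared with the caller

number = 5
cant_change(number)

name = ['Mr.', 'Steve', 'Gosling']
can_change(name)
```

After both calls, number is still 5, whereas name has become ['Mr.', 'James', 'Gosling'].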

If nothing is returned by the function explicitly, Python returns None when the function is called.
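
A short sketch of this behaviour, using a hypothetical function greet that builds a message but has no return statement:

```python
def greet(name):
    message = 'Hello, %s!' % name   # computed, but never returned

# With no explicit return, the call evaluates to None.
result = greet('World')
```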

1. Default Arguments

def fib(n=10):
  fib_list = [0, 1]
  for i in range(n - 2):
    next = fib_list[-2] + fib_list[-1]
    fib_list.append(next)
  return fib_list

>>> fib()
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
>>> fib(5)
[0, 1, 1, 2, 3]

2. Keyword Arguments

In a function call, keyword arguments can be used for each argument, in the following fashion:

argument_name=argument_value
Also denoted as: keyword=argument

def wish(name='World', greetings='Hello'):
  print "%s, %s!" % (greetings, name)

wish(name='Guido', greetings='Hey')
wish(greetings='Hey', name='Guido')

def my_func(x, y, z, u, v, w):
  # initialize variables.
  ...
  # do some stuff
  ...
  # return the value

It is valid to call the above function in the following ways:

my_func(10, 20, 30, u=1.0, v=2.0, w=3.0)
my_func(10, 20, 30, 1.0, 2.0, w=3.0)
my_func(10, 20, z=30, u=1.0, v=2.0, w=3.0)
my_func(x=10, y=20, z=30, u=1.0, v=2.0, w=3.0)

The following are some invalid calls, in which a positional argument follows a keyword argument:

my_func(10, 20, z=30, 1.0, 2.0, 3.0)
my_func(x=10, 20, z=30, 1.0, 2.0, 3.0)
my_func(x=10, y=20, z=30, u=1.0, v=2.0, 3.0)
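
The invalid calls are rejected before anything runs, because a positional argument after a keyword argument is a syntax error. One way to check this without defining my_func at all is to hand the call text to the built-in compile(); the helper name is_valid_call is hypothetical and exists only in this sketch.

```python
def is_valid_call(source):
    """Return True if `source` parses as a Python expression."""
    try:
        # compile() only checks syntax; my_func is never looked up or called.
        compile(source, '<example>', 'eval')
    except SyntaxError:
        return False
    return True

valid = is_valid_call('my_func(10, 20, 30, 1.0, 2.0, w=3.0)')
invalid = is_valid_call('my_func(10, 20, z=30, 1.0, 2.0, 3.0)')
```

Here valid is True and invalid is False.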

3. Parameter Packing and Unpacking

def print_report(title, *args, **name):
  """Structure of *args*
  (age, email-id)
  Structure of *name*
  {
      'first': First Name
      'middle': Middle Name
      'last': Last Name
  }
  """

  print "Title: %s" % (title)
  print "Full name: %(first)s %(middle)s %(last)s" % name
  print "Age: %d\nEmail-ID: %s" % args

The above function can be called as follows. Note that the order of the keyword parameters can be interchanged:

>>> print_report('Employee Report', 29, 'johny@example.com', first='Johny',
                 last='Charles', middle='Douglas')
Title: Employee Report
Full name: Johny Douglas Charles
Age: 29
Email-ID: johny@example.com

def print_report(title, age, email, first, middle, last):
  print "Title: %s" % (title)
  print "Full name: %s %s %s" % (first, middle, last)
  print "Age: %d\nEmail-ID: %s" % (age, email)

>>> args = (29, 'johny@example.com')
>>> name = {
        'first': 'Johny',
        'middle': 'Charles',
        'last': 'Douglas'
        }
>>> print_report('Employee Report', *args, **name)
Title: Employee Report
Full name: Johny Charles Douglas
Age: 29
Email-ID: johny@example.com
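
To see exactly what lands in the packed parameters, the sketch below returns them instead of printing. The function name collect is hypothetical; it is written without print so that it runs unchanged on Python 2.6 and Python 3.

```python
def collect(title, *args, **name):
    # args is a tuple of the extra positional arguments,
    # name is a dict of the extra keyword arguments.
    return title, args, name

args = (29, 'johny@example.com')
name = {'first': 'Johny', 'middle': 'Charles', 'last': 'Douglas'}

# * unpacks the tuple into positional arguments, ** unpacks the dict
# into keyword arguments; collect() then packs them up again.
packed = collect('Employee Report', *args, **name)
```

The result is a tuple holding the title, the tuple (29, 'johny@example.com') and a dictionary equal to name.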

4. Nested Functions and Scopes

      http://avinashv.net/2008/04/python-decorators-syntactic-sugar/
      http://personalpages.tds.net/~kent37/kk/00001.html

However, the following is an example of nested functions in Python:

def outer():
  print "Outer..."
  def inner():
    print "Inner..."
  print "Outer..."
  inner()

>>> outer()
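
Nested functions become more useful once the inner function reads a name from the enclosing scope. The sketch below is one common pattern, with hypothetical names make_multiplier and multiply; it is not taken from the linked articles.

```python
def make_multiplier(factor):
    def multiply(x):
        # factor is not local to multiply; it is looked up in the
        # enclosing function's scope when multiply is called.
        return x * factor
    return multiply

double = make_multiplier(2)
triple = make_multiplier(3)
results = (double(5), triple(5))
```

Each returned function remembers the factor it was created with, so results is (10, 15).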

5. map, reduce and filter functions

Python provides several built-in functions for convenience. The map(), reduce() and filter() functions prove to be very useful with sequences like lists.

def square(x):
  return x*x

>>> map(square, [1, 2, 3, 4])
[1, 4, 9, 16]

def mul(x, y):
  return x*y

>>> map(mul, [1, 2, 3, 4], [6, 7, 8, 9])
[6, 14, 24, 36]

def even(x):
  if x % 2 == 0:
    return True
  else:
    return False

>>> filter(even, range(1, 10))
[2, 4, 6, 8]

def mul(x, y):
  return x*y

>>> reduce(mul, [1, 2, 3, 4])
24
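
The three functions can be tried together in one script. This sketch makes two choices that are assumptions beyond the text above: it imports reduce from functools (available from Python 2.6, and required on Python 3) and wraps map() and filter() in list(), which is harmless on Python 2 and necessary on Python 3 where they return iterators. The name is_even is used only here.

```python
from functools import reduce  # also available as a built-in on Python 2

def square(x):
    return x * x

def mul(x, y):
    return x * y

def is_even(x):
    return x % 2 == 0

squares = list(map(square, [1, 2, 3, 4]))               # apply to each element
products = list(map(mul, [1, 2, 3, 4], [6, 7, 8, 9]))   # pair up two lists
evens = list(filter(is_even, range(1, 10)))             # keep matching elements
product = reduce(mul, [1, 2, 3, 4])                     # ((1*2)*3)*4
```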

5.1. List Comprehensions

>>> num = [1, 2, 3]
>>> sq = [x*x for x in num]
>>> sq
[1, 4, 9]
>>> all_num = [1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> even = [x for x in all_num if x%2 == 0]

The syntax used here reads almost like plain English. It can be translated as, "for each element x in the list all_num, if the remainder of x divided by 2 is 0, add x to the list."
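
The English translation corresponds directly to an ordinary loop. The sketch below builds the same list both ways; the name even_loop is introduced only for the comparison.

```python
all_num = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# List comprehension form.
even = [x for x in all_num if x % 2 == 0]

# Equivalent explicit loop.
even_loop = []
for x in all_num:
    if x % 2 == 0:
        even_loop.append(x)
```

Both even and even_loop end up as [2, 4, 6, 8].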


diff -r 000000000000 -r 8083d21c0020 web/html/backup/autoid.py
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/web/html/backup/autoid.py Mon Jan 25 18:56:45 2010 +0530
@@ -0,0 +1,47 @@
#!/usr/bin/env python
#
# Add unique ID attributes to para tags.  This script should only be
# run by one person, since otherwise it introduces the possibility of
# chaotic conflicts among tags.

import glob, os, re, sys

tagged = re.compile('<para[^>]* id="x_([0-9a-f]+)"[^>]*>', re.M)
untagged = re.compile('<para>')

names = glob.glob('ch*.xml') + glob.glob('app*.xml')

# First pass: find the highest-numbered paragraph ID.

biggest_id = 0
seen = set()
errs = 0

for name in names:
    for m in tagged.finditer(open(name).read()):
        i = int(m.group(1),16)
        if i in seen:
            print >> sys.stderr, '%s: duplication of ID %s' % (name, i)
            errs += 1
        seen.add(i)
        if i > biggest_id:
            biggest_id = i

def retag(s):
    global biggest_id
    biggest_id += 1
    return '<para id="x_%x">' % biggest_id

# Second pass: add IDs to paragraphs that currently lack them.

for name in names:
    f = open(name).read()
    f1 = untagged.sub(retag, f)
    if f1 != f:
        tmpname = name + '.tmp'
        fp = open(tmpname, 'w')
        fp.write(f1)
        fp.close()
        os.rename(tmpname, name)

sys.exit(errs)
diff -r 000000000000 -r 8083d21c0020 web/html/backup/ch01-intro.html
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/web/html/backup/ch01-intro.html Mon Jan 25 18:56:45 2010 +0530
@@ -0,0 +1,686 @@

Basic Python

Table of Contents

2.1. The Interactive Interpreter
2.2. ipython - An enhanced interactive Python interpreter

4.1. Numbers
4.2. Variables
4.3. Strings
4.4. Boolean

5. The while loop

6. The if conditional

7. raw_input()

8. int() method

This document is intended to be handed out at the end of the workshop. It has been designed for Engineering students who are Python beginners and have basic programming skills. The focus is on basic numerics and plotting using Python.

The system requirements:

Python - version 2.5.x or newer.
IPython
Text editor - scite, vim, emacs or whatever you are comfortable with.

1. Introduction

The Python programming language was created by a Dutch programmer named Guido van Rossum. The idea of Python was conceived in December 1989. The name Python has nothing to do with the reptile; it is named after the 70s comedy series "Monty Python's Flying Circus", which happens to be Guido's favourite TV series.

The current stable version of Python is 2.6.x. Although Python 3.0 is also a stable release, it is not backwards compatible with the previous versions and is hence not widely used at the moment. This material will focus on the 2.6.x series.

Python is licensed under the Python Software Foundation License (PSF License), which is a GPL-compatible Free Software license (except for license versions 1.6 and 2.0). It is a no-strings-attached license, which means the source code is free to modify and redistribute.

The Python docs define Python as follows: "Python is an interpreted, object-oriented, high-level programming language with dynamic semantics." A more detailed summary can be found at http://www.python.org/doc/essays/blurb.html. Python is a language that has been designed to help the programmer concentrate on solving the problem at hand and not worry about the idiosyncrasies of the programming language.

Python is highly portable across platforms, largely because it is an interpreted language. It has even been adapted to run on the Nokia Series 60 phones. Python has been designed to be readable and easy to use.

Resources available for reference

Web: http://www.python.org
Doc: http://www.python.org/doc
Free Tutorials:
Official Python Tutorial: http://docs.python.org/tut/tut.html
Byte of Python: http://www.byteofpython.info/
Dive into Python: http://diveintopython.org/

Advantages of Python - Why Python?

Python has been designed for readability and ease of use. It has been designed in such a fashion that it imposes readability on the programmer. Python does away with the braces and the semicolons and instead implements code blocks based on indentation, thus enhancing readability.
Python is a high level, interpreted, modular and object oriented language. Python performs memory management on its own, thus the programmer need not bother about allocating and deallocating memory to variables. Python provides extensibility by providing modules which can be easily imported similar to headers in C and packages in Java. Python is object oriented and hence provides all the object oriented characteristics such as inheritance, encapsulation and polymorphism.
Python offers a highly powerful interactive programming interface in the form of the 'Interactive Interpreter' which will be discussed in more detail in the following sections.
Python provides a rich standard library and an extensive set of modules. The power of Python modules can be seen in this slightly exaggerated cartoon: http://xkcd.com/353/
Python interfaces well with most other programming languages such as C, C++ and FORTRAN.

Python does have one setback, though: it is not as fast as some of the compiled languages like C or C++. Yet the amount of flexibility and power it offers more than makes up for this setback.

2. The Python Interpreter

2.1. The Interactive Interpreter

Typing python at the shell prompt on any standard Unix/GNU-Linux system and hitting the enter key fires up the Python 'Interactive Interpreter'. The Python interpreter is one of the most integral features of Python. The prompt obtained when the interactive interpreter starts is similar to what is shown below. The exact appearance might differ based on the version of Python being used. The >>> shown is the Python prompt. When something is typed at the prompt and the enter key is hit, the Python interpreter interprets the command entered and performs the appropriate action. All the examples presented in this document are meant to be tried hands-on in the interactive interpreter.

Python 2.5.2 (r252:60911, Oct  5 2008, 19:24:49) 
[GCC 4.3.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> 

Let's try an example: type print 'Hello, World!' at the prompt and hit the enter key.

>>> print 'Hello, World!'
Hello, World!

This example was quite straightforward, and thus we have written our first line of Python code. Now let us try typing something arbitrary at the prompt. For example:

>>> arbit word
  File "<stdin>", line 1
    arbit word
            ^
SyntaxError: invalid syntax
>>>

The interpreter gave an error message saying that 'arbit word' was invalid syntax, which is correct. The interpreter is an amazing tool when learning to program in Python. The interpreter provides a help function that provides the necessary documentation regarding all Python syntax, constructs, modules and objects. Typing help() at the prompt gives the following output:

>>> help()

Welcome to Python 2.5!  This is the online help utility.

If this is your first time using Python, you should definitely check out
the tutorial on the Internet at http://www.python.org/doc/tut/.

Enter the name of any module, keyword, or topic to get help on writing
Python programs and using Python modules.  To quit this help utility and
return to the interpreter, just type "quit".

To get a list of available modules, keywords, or topics, type "modules",
"keywords", or "topics".  Each module also comes with a one-line summary
of what it does; to list the modules whose summaries contain a given word
such as "spam", type "modules spam".

help> 

As mentioned in the output, entering the name of any module, keyword or topic will provide the documentation and help regarding the same through the online help utility. Pressing Ctrl+d exits the help prompt and returns to the Python prompt.

Let us now try a few examples at the python interpreter.

Eg 1:

>>> print 'Hello, python!'
Hello, python!
>>>

Eg 2:

>>> print 4321*567890
2453852690
>>> 

Eg 3:

>>> 4321*567890
2453852690L
>>>

Note: Notice the 'L' at the end of the output. The 'L' signifies that the
output of the operation is of type *long*. It was absent in the previous
example because we used the print statement. This is because *print* formats
the output before displaying.

Eg 4:

>>> big = 12345678901234567890 ** 3
>>> print big
1881676372353657772490265749424677022198701224860897069000
>>> 

This example is to show that unlike in C or C++ there is no limit on the
value of an integer.

Try this on the interactive interpreter: import this

Hint: The output gives an idea of the power of Python.

2.2. ipython - An enhanced interactive Python interpreter

The power and the importance of the interactive interpreter was the highlight of the previous section. This section provides insight into ipython, an enhanced interpreter with a more advanced set of features. Entering ipython at the shell prompt fires up the interactive interpreter.

$ ipython
Python 2.5.2 (r252:60911, Oct  5 2008, 19:24:49) 
Type "copyright", "credits" or "license" for more information.

IPython 0.8.4 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object'. ?object also works, ?? prints more.

In [1]: 

This is the output obtained upon firing up ipython. The exact appearance may change based on the Python version installed. The following are some of the features provided by ipython:

Suggestions - ipython provides suggestions of the possible methods and operations available for the given Python object.

Eg 5:

In [4]: a = 6

In [5]: a.
a.__abs__           a.__divmod__        a.__index__         a.__neg__          a.__rand__          a.__rmod__          a.__rxor__
a.__add__           a.__doc__           a.__init__          a.__new__          a.__rdiv__          a.__rmul__          a.__setattr__
a.__and__           a.__float__         a.__int__           a.__nonzero__      a.__rdivmod__       a.__ror__           a.__str__
a.__class__         a.__floordiv__      a.__invert__        a.__oct__          a.__reduce__        a.__rpow__          a.__sub__
a.__cmp__           a.__getattribute__  a.__long__          a.__or__           a.__reduce_ex__     a.__rrshift__       a.__truediv__
a.__coerce__        a.__getnewargs__    a.__lshift__        a.__pos__          a.__repr__          a.__rshift__        a.__xor__
a.__delattr__       a.__hash__          a.__mod__           a.__pow__          a.__rfloordiv__     a.__rsub__          
a.__div__           a.__hex__           a.__mul__           a.__radd__         a.__rlshift__       a.__rtruediv__      

In this example, we initialized 'a' (a variable, a concept that will be discussed in the subsequent sections) to 6. In the next line, when the tab key is pressed after typing 'a.', ipython displays the set of all possible methods that are applicable to the object 'a' (an integer in this context). ipython provides many such datatype-specific features, which will be presented in later sections as and when the datatypes are introduced.

3. Editing and running a python file

The previous sections focused on the use of the interpreter to run Python code. While the interpreter is an excellent tool to test simple solutions and experiment with small code snippets, its main disadvantage is that everything written in the interpreter is lost once it is quit. Most of the time a program is used by people other than the author, so programs have to be available in some form suitable for distribution, and hence they are written in files. This section will focus on editing and running Python files. Start by opening a text editor (it is recommended you choose one from the list at the top of this page). In the editor, type in the Python code and save the file with the extension .py (Python files have an extension of .py). Once done with the editing, save the file and exit the editor.

Let us look at a simple example of calculating the gcd of 2 numbers using Python:

Creating the first python script (file):

$ emacs gcd.py
  def gcd(x,y):
    if x % y == 0:
      return y
    return gcd(y, x%y)

  print gcd(72, 92)

To run the script, open the shell prompt, navigate to the directory that contains the python file and run python <filename.py> at the prompt (in this case the filename is gcd.py).

Running the python script:

$ python gcd.py
4
$ 
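
The recursive gcd above can also be written as a loop. The following is an alternative sketch of the same Euclidean algorithm, not a replacement for the script; it avoids print so that it runs unchanged on Python 2.6 and Python 3.

```python
def gcd(x, y):
    # Repeatedly replace (x, y) with (y, x % y) until the remainder is 0.
    while y:
        x, y = y, x % y
    return x

result = gcd(72, 92)
```

For the inputs 72 and 92 this gives 4, the same value the script prints.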

Another method to run a python script would be to include the line

#! /usr/bin/python

at the beginning of the python file and then make the file executable by

$ chmod a+x filename.py

Once this is done, the script can be run as a standalone program as follows:

$ ./filename.py

4. Basic Datatypes and operators in Python

Python provides the following set of basic datatypes.

Numbers: int, float, long, complex
Strings
Boolean

4.1. Numbers

Numbers were introduced in the examples presented in the interactive interpreter section. Numbers include the types mentioned earlier, viz. int (integers), float (floating point numbers), long (large integers) and complex (complex numbers with real and imaginary parts). Python is a dynamically typed language, which means the type of a variable need not be declared when it is initialized. Let us look at a few examples.

Eg 6:

>>> a = 1 #here a is an integer variable

Eg 7:

>>> lng = 122333444455555666666777777788888888999999999 #here lng is a variable of type long
>>> lng
122333444455555666666777777788888888999999999L #notice the trailing 'L'
>>> print lng
122333444455555666666777777788888888999999999 #notice the absence of the trailing 'L'
>>> lng+1
122333444455555666666777777788888889000000000L

Long numbers are the same as integers in almost all aspects. They can be used in operations just like integers, and along with integers, without any distinction. The only distinction comes during type checking (which is not a healthy practice). Long numbers are displayed with a trailing 'L' just to signify that they are long. Notice that in the example, typing just lng at the prompt displays the value of the variable with the 'L', whereas print lng displays it without the 'L'. This is because print formats the output before printing. Also notice that adding an integer to a long does not give any errors and the result is as expected. So for all practical purposes longs can be treated as ints.

Eg 8:

>>> fl = 3.14159 #fl is a float variable
>>> e = 1.234e-4 #e is also a float variable, specified in the exponential form
>>> a = 1
>>> b = 2
>>> a/b #integer division
0
>>> a/fl #floating point division
0.31831015504887655
>>> e/fl
3.9279473133031364e-05

Floating point numbers, simply called floats, are real numbers with a decimal point. The example above shows the initialization of a float variable. Also shown in this example is the difference between integer division and floating point division. 'a' and 'b' here are integer variables and hence the division gives 0 as the quotient. When either of the operands is a float, the operation is a floating point division, and the result is also a float, as illustrated.
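
When both operands are integers but a fractional result is wanted, one option is to convert an operand with float() before dividing. The sketch below also uses the // operator, which always performs floor division; both lines behave the same on Python 2 and Python 3, whereas a plain a/b on two integers gives 0 only on Python 2. The variable names are chosen for this illustration.

```python
a = 1
b = 2

floor_quotient = a // b        # floor division: the fractional part is dropped
true_quotient = a / float(b)   # one float operand makes it a floating point division
```

Here floor_quotient is 0 and true_quotient is 0.5.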

Eg 9:

>>> cplx = 3 + 4j #cplx is a complex variable
>>> cplx
(3+4j)
>>> print cplx.real #prints the real part of the complex number
3.0
>>> print cplx.imag #prints the imaginary part of the complex number
4.0
>>> print cplx*fl  #multiplies the real and imag parts of the complex number with the multiplier
(9.42477+12.56636j)
>>> abs(cplx) #returns the absolute value of the complex number
5.0

Python provides a datatype for complex numbers. Complex numbers are initialized as shown in the example above. The real and imag attributes give the real and imaginary parts of the complex number as shown. The abs() function returns the absolute value of the complex number.

4.2. Variables

Variables are just names that represent a value. Variables have already been introduced in the various examples from the previous sections. Certain rules about using variables:

Variables have to be initialized or assigned a value before being used.
Variable names can consist of letters, digits and underscores.
Variable names cannot begin with digits, but can contain digits in them.

In reference to the previous section examples, 'a', 'b', 'lng', 'fl', 'e' and 'cplx' are all variables of various datatypes.

Note: Python is a dynamically typed language and hence a variable that holds an integer
can at a later stage be assigned a float as well.

4.3. Strings

Strings are one of the essential data structures of any programming language. The print "Hello, World!" program was introduced in the earlier section, and the "Hello, World!" in the print statement is a string. A string is basically a sequence of characters. Strings can be represented in the various ways shown below:

s = 'this is a string'              # a string variable can be represented using single quotes
s = 'This one has "quotes" inside!' # The string can have quotes inside it as shown
s = "I have 'single-quotes' inside!"
l = "A string spanning many lines\
one more line\
yet another"                        # a string can span more than a single line.
# another way of representing multiline strings:
t = """A triple quoted string does
not need to be escaped at the end and
"can have nested quotes" etc."""

Try the following on the interpreter: s = 'this is a string with 'quotes' of similar kind'

Exercise: How to use single quotes within single quotes in a string as shown in the above example without getting an error?

4.3.1. String operations

A few basic string operations are presented here.

String concatenation: String concatenation is done by simple addition of two strings.

>>> x = 'Hello'
>>> y = ' Python'
>>> print x+y
Hello Python

Try this yourself:

>>> somenum = 13
>>> print x+somenum

The problem with the above example is that a string and an integer are being concatenated, which Python does not allow. To obtain the desired result, the integer can first be converted to a string with str(), repr() or backticks (``).

str() simply converts a value to a string in a reasonable form. repr() creates a string that is a representation of the value.

The difference can be seen in the example shown below:

>>> str(1000000000000000000000000000000000000000000000000L)
'1000000000000000000000000000000000000000000000000'
>>> repr(1000000000000000000000000000000000000000000000000L)
'1000000000000000000000000000000000000000000000000L'

It can be observed that the 'L' in the long value shown was omitted by str(), whereas repr() converted that into a string too. An alternative way of writing repr(value) is `value`.
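
The sketch below shows the conversion fixing the concatenation attempted earlier, and the same str()/repr() difference on a string value, where repr() keeps the quotes. It deliberately avoids the long literal and the backtick form, which exist only in Python 2, so it runs on both Python 2.6 and Python 3; the variable names are for illustration.

```python
x = 'Hello'
somenum = 13

# Convert the integer explicitly before concatenating.
greeting = x + ' ' + str(somenum)

plain = str('Pycon')     # the characters themselves
quoted = repr('Pycon')   # a representation, including the quotes
```

greeting is 'Hello 13', plain has five characters and quoted has seven.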

A few more examples:

>>> x = "Let's go \nto Pycon"
>>> print x
Let's go 
to Pycon

In the above example, notice that the '\n' (newline) character is interpreted and the string is printed on two lines. The strings discussed until now were normal strings. Other than these there are two other types of strings, namely raw strings and unicode strings.

Raw strings are strings which are unformatted, that is, the backslashes (\) are not parsed and are left as they are in the string. Raw strings are represented with an 'r' at the start of the string. Let us look at an example:

>>> x = r"Let's go \nto Pycon"
>>> print x
Let's go \nto Pycon

Note: The '\n' is not being parsed into a new line and is left as it is.

Try this yourself:

>>> x = r"Let's go to Pycon\"

Unicode strings are strings where the characters are Unicode characters as opposed to ASCII characters. Unicode strings are represented with a 'u' at the start of the string. Let us look at an example:

>>> x = u"Let's go to Pycon!"
>>> print x
Let's go to Pycon!

4.4. Boolean

Python also provides a special Boolean datatype. A boolean variable can assume a value of either True or False (note the capitalization).

Let us look at examples:

>>> t = True
>>> f = not t
>>> print f
False
>>> f or t
True
>>> f and t
False

5. The while loop

The Python while loop is similar to the C/C++ while loop. The syntax is as follows:

statement 0
while condition:
  statement 1 #while block
  statement 2 #while block
statement 3 #outside the while block.

Let us look at an example:

>>> x = 1  
>>> while x <= 5:
...   print x
...   x += 1
... 
1
2
3
4
5

6. The if conditional

The Python if block provides conditional execution of statements. If the condition evaluates as true, the block of statements defined under the if block is executed.

If the first block is not executed on account of the condition not being satisfied, the set of statements in the else block is executed.

The elif block allows multiple conditions to be evaluated, as shown in the example.

The syntax is as follows:

if condition:
    statement_1
    statement_2
elif condition:
    statement_3
    statement_4
else:
    statement_5
    statement_6

Let us look at an example:

>>> n = int(raw_input("Input a number:"))
>>> if n < 0:
...   print n, "is negative"
... elif n > 0:
...   print n, "is positive"
... else:
...   print n, "is 0"
...
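
The same three-way decision can be wrapped in a function so that it is easy to try on several values. This is a sketch with a hypothetical name, describe_sign; it converts each input with int() first, because text read from the console is a string and comparing a string with 0 would not give a meaningful answer.

```python
def describe_sign(n):
    if n < 0:
        return 'negative'
    elif n > 0:
        return 'positive'
    else:
        return 'zero'

# Stand-ins for text typed at the console; each is converted before testing.
labels = [describe_sign(int(text)) for text in ['-3', '0', '7']]
```

labels comes out as ['negative', 'zero', 'positive'].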

7. raw_input()

In the previous example we saw a call to raw_input(). The raw_input() function is used to take user input through the console. Unlike input(), which evaluates the data entered by the user as a Python expression, raw_input() treats all input as raw data and returns it as a string. To illustrate this let us look at an example.

>>> input("Enter a number thats a palindrome:")
Enter a number thats a palindrome:121
121

>>> input("Enter your name:")
Enter your name:PythonFreak
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 1, in <module>
NameError: name 'PythonFreak' is not defined

As shown above, input() assumes that the data entered is a valid Python expression. In the first call it prompts for an integer and, when one is entered, accepts it as an integer. In the second call, when the string is entered without quotes, input() still tries to evaluate it as a Python expression and hence raises an exception saying PythonFreak is not defined.

>>> input("Enter your name:")
Enter your name:'PythonFreak'
'PythonFreak'
>>> 

Here the name is accepted because it is entered as a string (within quotes). But it is unreasonable to go on using quotes each time a string is entered. Hence the alternative is to use raw_input().

Let us now look at how raw_input() operates with an example.

>>> raw_input("Enter your name:")
Enter your name:PythonFreak
'PythonFreak'

Observe that raw_input() returns the input as a string all by itself.

>>> raw_input("Enter a number thats a palindrome:")
Enter a number thats a palindrome:121
'121'

Observe that raw_input() returns the number 121 as the string '121' as well. Let us look at another example:

>>> pal = raw_input("Enter a number thats a palindrome:")
+Enter a number thats a palindrome:121
+>>> pal + 2
+Traceback (most recent call last):
+  File "<stdin>", line 1, in <module>
+TypeError: cannot concatenate 'str' and 'int' objects
+>>> pal
+'121'
+
+

Observe that the variable pal holds a string, so integer operations cannot be performed on it. Adding 2 to it therefore raises a TypeError.

+8. + int() method +

Generally, computation is performed not on strings or raw data but on integers, floats and similar numeric types. The data obtained from raw_input() is raw data in the form of strings. To obtain integers from strings we use the int() function.

Let us look at an example.

>>> intpal = int(pal)
+>>> intpal
+121
+
+

In the previous example we observed that pal was a string. Here the int() function was used to convert the string pal to an integer.
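Not every string can be converted: int() raises a ValueError when the text is not a whole number. The following is a small sketch of one way to guard against that. The helper name to_int and its default parameter are hypothetical, not part of Python.

```python
def to_int(text, default=None):
    """Convert text to an integer, returning default if it is not a valid number."""
    try:
        return int(text)
    except ValueError:
        # int() raises ValueError for strings such as 'abc' or '12.5'
        return default


print(to_int('121'))
print(to_int('PythonFreak'))
print(to_int(' 42 '))   # surrounding whitespace is ignored by int()
```

Returning a default keeps the calling code simple. A program that must not continue with bad input could let the ValueError propagate instead.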

+ Try This Yourself: +

>>> stringvar = raw_input("Enter a name:")
+Enter a name:Guido Van Rossum
+>>> stringvar
+'Guido Van Rossum'
+>>> numvar = int(stringvar)
+
+


+ + diff -r 000000000000 -r 8083d21c0020 web/html/backup/ch02-list_tuples.html --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/web/html/backup/ch02-list_tuples.html Mon Jan 25 18:56:45 2010 +0530 @@ -0,0 +1,652 @@ + + + +Lists and Tuples + + + + + + + +


+Lists and Tuples

Table of Contents

1. Common List Operations

1.1. Indexing
1.2. Concatenating
1.3. Slicing
1.4. Multiplication
1.5. Membership
1.6. Length, Maximum and Minimum
1.7. Changing Elements
1.8. Deleting Elements
1.9. Assign to Slices

2. None, Empty Lists, and Initialization

3. Nested Lists

4. List Methods

4.1. append
4.2. count
4.3. extend
4.4. index
4.5. insert
4.6. pop
4.7. remove
4.8. reverse
4.9. sort

5. Tuples

5.1. Common Tuple Operations

6. Additional Syntax

6.1. range()
6.2. for

7. Conclusion

Lists +

Python provides an intuitive way to represent a group of items, called Lists. The items of a List are called its elements. Unlike arrays in C/C++, the elements of a List need not all be of the same type. A List is written as comma-separated elements enclosed in square brackets:

>>> a = [10, 'Python programming', 20.3523, 23, 3534534L]
+>>> a
+[10, 'Python programming', 20.3523, 23, 3534534L]
+
+
+

+1. Common List Operations

The following are some of the most commonly used operations on + Lists. +

+1.1. Indexing

Individual elements of a List can be accessed using the index of the element. Indices start at 0. Elements can also be accessed from the end of the List using negative indices:

>>> a[1]
+'Python programming'
+>>> a[-1]
+3534534L
+
+

It is important to note here that the last element of the + List has an index of -1. +

+1.2. Concatenating

Two or more + Lists can be concatenated using the + operator: +

>>> a + ['foo', 12, 23.3432, 54]
[10, 'Python programming', 20.3523, 23, 3534534L, 'foo', 12, 23.3432, 54]
>>> [54, 75, 23] + ['write', 67, 'read']
[54, 75, 23, 'write', 67, 'read']

+1.3. Slicing

A List can be sliced to obtain a subset of its elements. A slice is written as two indices separated by a colon, where the first index is inclusive and the second index is exclusive. The resulting slice is also a List:

>>> num = [1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> num[3:6]
[4, 5, 6]
>>> num[0:1]
[1]
>>> num[6:10]
[7, 8, 9]

The last example accesses the last 3 elements of the List. There is a small catch here: the second index, 10, lies past the end of the List, whose last valid index is 8. This is still valid in a slice, because an index beyond the end is treated as the end of the List. The same elements can be obtained more easily with a negative index:

>>> num[-3:]
[7, 8, 9]

Omitting the first index means that the slice starts at the beginning of the List, while omitting the second index includes all the elements up to the end of the List. A slice also takes a third parameter, the step, which defaults to 1. It is specified as a value following a second colon after the second index:

>>> num[:4]
+[1, 2, 3, 4]
+>>> num[7:]
+[8, 9]
+>>> num[-3:]
+[7, 8, 9]
+>>> num[:]
+[1, 2, 3, 4, 5, 6, 7, 8, 9]
+>>> num[4:9:3]
+[5, 8]
+>>> num[3::2]
+[4, 6, 8]
+>>> num[::4]
+[1, 5, 9]
+
+
+
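The slicing rules above can be collected into one short script. This is a sketch only: the variable names are hypothetical, and the negative step in the third slice is an extra that the text does not cover.

```python
num = [1, 2, 3, 4, 5, 6, 7, 8, 9]

last_three = num[-3:]      # from the third-last element to the end
every_other = num[::2]     # step of 2 over the whole list
reversed_copy = num[::-1]  # a negative step walks the list backwards
clamped = num[6:100]       # an end index past the list is treated as the end

print(last_three)
print(every_other)
print(reversed_copy)
print(clamped)
```

Each slice builds a new List, so num itself is left unchanged by all four operations.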

+1.4. Multiplication

A + List can be multiplied with an integer to repeat itself: +

>>> [20] * 5
+[20, 20, 20, 20, 20]
+>>> [42, 'Python', 54] * 3
+[42, 'Python', 54, 42, 'Python', 54, 42, 'Python', 54]
+
+
+

+1.5. Membership

The in operator is used to find whether an element is part of the List. It returns True if the element is present in the List and False if it is not. Since this operator returns a Boolean value, it is called a Boolean operator:

>>> names = ['Guido', 'Alex', 'Tim']
+>>> 'Tim' in names
+True
+>>> 'Adam' in names
+False
+
+
+

+1.6. Length, Maximum and Minimum

The length of a List can be found using the len function. The max function returns the element with the largest value and the min function returns the element with the smallest value:

>>> num = [4, 1, 32, 12, 67, 34, 65]
+>>> len(num)
+7
+>>> max(num)
+67
+>>> min(num)
+1
+
+
+

+1.7. Changing Elements

Unlike Strings, Lists are mutable, i.e. the elements of a List can be changed:

>>> a = [1, 3, 5, 7]
+>>> a[2] = 9
+>>> a
+[1, 3, 9, 7]
+
+
+

+1.8. Deleting Elements

An element or a slice of a + List can be deleted by using the + del statement: +

>>> a = [1, 3, 5, 7, 9, 11]
+>>> del a[-2:]
+>>> a
+[1, 3, 5, 7]
+>>> del a[1]
+>>> a
+[1, 5, 7]
+
+
+

+1.9. Assign to Slices

Just as values can be assigned to individual elements of a List, a List of elements can be assigned to a slice:

>>> a = [2, 3, 4, 5]
>>> a[:2] = [0, 1]
>>> a
[0, 1, 4, 5]
>>> a[2:2] = [2, 3]
>>> a
[0, 1, 2, 3, 4, 5]
>>> a[2:4] = []
>>> a
[0, 1, 4, 5]

The last two examples deserve particular attention. The second-last example inserts a list of elements into the List without replacing anything, and the last example deletes a range of elements from the List.
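Insertion and deletion through slice assignment can be given names to make the intent explicit. The two helper functions below are hypothetical and serve only as a sketch of the technique.

```python
def insert_at(items, position, new_items):
    """Insert new_items into items before position, in place."""
    items[position:position] = new_items   # an empty slice replaces nothing


def delete_range(items, start, stop):
    """Delete items[start:stop] in place."""
    items[start:stop] = []                 # assigning an empty list removes the slice


a = [0, 1, 4, 5]
insert_at(a, 2, [2, 3])
print(a)
delete_range(a, 1, 3)
print(a)
```

Both helpers modify the List passed to them rather than returning a new one, which matches the behaviour of slice assignment itself.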

+2. None, Empty Lists, and Initialization

An Empty List is a List with no elements and is simply written as []. A None List is one in which every element is None. It serves as a container of some fixed number of elements that do not yet have a value:

>>> a = []
+>>> a
+[]
+>>> n = [None] * 10
+>>> n
+[None, None, None, None, None, None, None, None, None, None]
+
+
+

+3. Nested Lists

As mentioned earlier, a List can contain elements of any data type. This implies that a List can have Lists themselves as its elements. These are called Nested Lists. There is no limit on the depth of nesting:

>>> a = [1, [1, 2, 3], 3, [1, [1, 2, 3]], 7]
+
+
+
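Elements of a Nested List are reached by chaining indices, one pair of square brackets per level. A brief sketch using the list above (the variable names inner, second and deep are hypothetical):

```python
a = [1, [1, 2, 3], 3, [1, [1, 2, 3]], 7]

inner = a[1]        # the nested list [1, 2, 3]
second = a[1][1]    # element at index 1 of that nested list
deep = a[3][1][2]   # two levels down: a[3] is [1, [1, 2, 3]], a[3][1] is [1, 2, 3]

print(inner)
print(second)
print(deep)
print(len(a))       # a nested list counts as a single element
```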

+4. List Methods

A method is a function that is coupled to an object. Objects and their methods are discussed further in the Advanced Python module. In general, a method is called like:

object.method(arguments)
+
+

For now, it is enough to know that a list is an object, so List methods can be called on it. Some of the methods change the List in-place, meaning they modify the existing list instead of creating a new one, while other methods do not. Take note of which is which as we run through the List methods.

Some of the most commonly used + List methods are as follows: +

+4.1. append

The + append method is used to append an object at the end of the list: +

>>> prime = [2, 3, 5]
+>>> prime.append(7)
+>>> prime
+[2, 3, 5, 7]
+
+

It is important to note that append changes the + List in-place. +

+4.2. count

The count method returns the number of occurrences of a particular element in a list:

>>> [1, 4, 4, 9, 9, 9].count(9)
+3
+>>> tlst = ['Python', 'is', 'a', 'beautiful', 'language']
+>>> tlst.count('Python')
+1
+
+
+

+4.3. extend

The extend method extends the list on which it is called with the elements of the list supplied as its argument:

>>> a = [1, 2, 3]
>>> b = [4, 5, 6]
>>> a.extend(b)
>>> a
[1, 2, 3, 4, 5, 6]

This is an in-place method. It has the same effect as concatenating with the + operator, except that the + operator returns a new list and leaves the original unchanged.
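The difference between the in-place extend and the + operator shows up when a second name refers to the same list. The sketch below uses the is operator, which the text has not introduced; it tests whether two names refer to the very same object.

```python
a = [1, 2, 3]
b = [4, 5, 6]

alias = a            # a second name for the same list object
c = a + b            # builds a new list; a is untouched at this point
a.extend(b)          # modifies the existing list in place

print(c)
print(a)
print(alias is a)    # the alias still refers to the extended list
print(c is a)        # c is a separate list with equal contents
```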

+4.4. index

The + index method returns the index position of the element in the list specified as argument: +

>>> a = [1, 2, 3, 4, 5]
>>> a.index(4)
3

+4.5. insert

The + insert method is used to insert an element specified as the second argument to the list at the position specified by the first argument: +

>>> a = ['Python', 'is', 'cool']
+>>> a.insert(2, 'so')
+>>> a
+['Python', 'is', 'so', 'cool']
+
+

The + insert method changes the + List in-place. +

+4.6. pop

The pop method removes an element from the list and returns it. The index of the element to be removed can be passed as an argument; if it is omitted, pop removes the last element:

>>> a = [1, 2, 3, 4, 5]
>>> a.pop()
5
>>> a
[1, 2, 3, 4]
>>> a.pop(2)
3
>>> a
[1, 2, 4]

The + pop method changes the + List in-place. +

+4.7. remove

The remove method removes the first occurrence of the element supplied as its argument:

>>> a = [1, 2, 3, 4, 2, 5, 2]
+>>> a.remove(2)
+>>> a
+[1, 3, 4, 2, 5, 2]
+
+
+

+4.8. reverse

The reverse method reverses the elements in the list. It is important to note that the reverse method changes the list in-place and does not return anything:

>>> a = ['guido', 'alex', 'tim']
+>>> a.reverse()
+>>> a
+['tim', 'alex', 'guido']
+
+
+

+4.9. sort

The + sort method is used to sort the elements of the list. The + sort method also sorts in-place and does not return anything: +

>>> a = [5, 1, 3, 7, 4]
+>>> a.sort()
+>>> a
+[1, 3, 4, 5, 7]
+
+

In addition to the sort method on a + List object we can also use the built-in + sorted function. This function takes the + List as a parameter and returns a sorted copy of the list. However the original list is left intact: +

>>> a = [5, 1, 3, 7, 4]
+>>> b = sorted(a)
+>>> b
+[1, 3, 4, 5, 7]
+>>> a
+[5, 1, 3, 7, 4]
+
+
+
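Both sort and sorted accept the optional keyword arguments reverse and key (available since Python 2.4), which the text above does not cover. The following sketch illustrates them; the sample names are made up for the example.

```python
names = ['guido', 'Alex', 'Tim']

descending = sorted([5, 1, 3, 7, 4], reverse=True)
by_lowercase = sorted(names, key=str.lower)   # compare names ignoring case

result = names.sort()   # sorts in place and returns None

print(descending)
print(by_lowercase)
print(result)
print(names)            # uppercase letters sort before lowercase ones
```

Because sort returns None, writing names = names.sort() is a common mistake that discards the list.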

+5. Tuples

Tuples are sequences just like Lists, but they are immutable. In other words, a Tuple represents a group of items that cannot be changed in any way. The syntax of a Tuple is also very similar to that of a List: the items, called the elements of the Tuple, are separated by commas, and the whole is enclosed in parentheses. The parentheses are optional, but they are necessary in some cases:

>>> a = 1, 2, 3
+>>> a
+(1, 2, 3)
+>>> b = 1,
+>>> b
+(1,)
+
+

The second example is worth noting: a single value followed by a comma creates a Tuple with only one element. Note also that the output is always shown in parentheses, whether or not the input had them.

The first example is also known as Tuple packing, because values are being packed into a tuple. The reverse, Tuple unpacking, is more interesting and is best understood by example. Say we have a co-ordinate pair from which we need to separate the x and y co-ordinates:

>>> a = (1, 2)
+>>> x, y = a
+>>> x
+1
+>>> y
+2
+
+

Tuple unpacking has several other use cases, of which the most interesting is swapping the values of two variables. In a language like C this takes an extra temporary variable and several lines of code. Python does it intuitively in just one line. Say we want to swap the co-ordinates in the above example:

>>> x, y = y, x
+>>> x
+2
+>>> y
+1
+
+
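Packing and unpacking also make it easy for a function to hand back more than one value. A minimal sketch, in which the function name min_max is hypothetical:

```python
def min_max(values):
    """Return the smallest and largest element of values as a tuple."""
    return min(values), max(values)   # the two results are packed into a tuple


lowest, highest = min_max([4, 1, 32, 12, 67])   # and unpacked again here
print(lowest)
print(highest)

point = (1, 2)
x, y = point
x, y = y, x        # swap via packing and unpacking
print((x, y))
```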

+5.1. Common Tuple Operations

There is no need to introduce the Tuple operations again, since Tuples support the following List operations in exactly the same way:

Indexing
Concatenating
Slicing
Membership
Multiplication
Length, Maximum, Minimum

The following examples illustrate the above operations:

>>> a = (1, 2, 3, 4, 5, 6)
+>>> a[5]
+6
+>>> b = (7, 8, 9)
+>>> a + b
+(1, 2, 3, 4, 5, 6, 7, 8, 9)
+>>> a[3:5]
+(4, 5)
+>>> 5 in a
+True
+>>> c = (1,)
+>>> c * 5
+(1, 1, 1, 1, 1)
+>>> len(a)
+6
+>>> max(a)
+6
+>>> min(a)
+1
+
+

However, the following List operations are not supported by Tuples, because Tuples cannot be changed once they are created:

Changing elements
Deleting elements
Assigning to slices

The similarity to Lists leads to questions like: why not use Lists only? Why do we even want Tuples? Can we not do the same with Lists? The answer is yes, we can, but Tuples are helpful at times: a function can return a Tuple, some built-in functions and methods return Tuples, and a Tuple is a natural fit for fixed groups of values such as co-ordinates.

+6. Additional Syntax

The following additional syntax is introduced to make it easier to operate on Lists.

+6.1. range()

The range function takes one required argument and up to two additional optional arguments. If two or more arguments are specified, range returns a list of integers starting from the first argument and stopping just before the second argument. The third argument, if specified, is used as the step. If only one argument is specified, range returns a list of integers starting from 0 and stopping just before the argument specified:

>>> range(5, 10, 2)
+[5, 7, 9]
+>>> range(2, 15)
+[2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
+>>> range(12)
+[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
+
+
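A few more calls show how the arguments interact. This sketch wraps each call in list() so that it behaves the same on Python 3, where range returns a lazy range object instead of a list; on Python 2 the wrapper simply copies the list.

```python
evens = list(range(0, 10, 2))      # the stop value 10 is never included
countdown = list(range(5, 0, -1))  # a negative step counts downwards
empty = list(range(5, 5))          # start equal to stop gives no elements

print(evens)
print(countdown)
print(empty)
```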

+6.2. for

The for keyword is used as part of the looping construct. Unlike for loops in some other languages, Python's for iterates over the elements of sequences and other containers such as Lists, Tuples and Dictionaries. The syntax of the for loop consists of for, followed by a variable that holds the current element during iteration, then in, followed by the sequence and a colon (':'). The statements that form the body of the loop start on the next line and must be indented:

>>> names = ['Guido', 'Alex', 'Tim']
+>>> for name in names:
+...   print "Name =", name
+... 
+Name = Guido
+Name = Alex
+Name = Tim
+
+
+
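A for loop is often used to build a new List from an existing one. The sketch below also uses the built-in enumerate function, which the text has not introduced; it yields the index along with each element.

```python
names = ['Guido', 'Alex', 'Tim']

lengths = []
for name in names:
    lengths.append(len(name))   # collect one result per element

for position, name in enumerate(names):   # yields (index, element) pairs
    print("%d: %s" % (position, name))

print(lengths)
```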

+7. Conclusion

This section on Lists and Tuples introduces almost all the machinery required to work with Lists and Tuples. How to use these data structures in bigger, more useful programs will be introduced in the subsequent chapters.


+ + diff -r 000000000000 -r 8083d21c0020 web/html/backup/ch03-oop.html --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/web/html/backup/ch03-oop.html Mon Jan 25 18:56:45 2010 +0530 @@ -0,0 +1,109 @@ + + + +Classes and Objects + + + + + + + +


+Classes and Objects

In the previous sections we learnt about functions, which provide a certain level of abstraction by holding code that performs one or more specific tasks. We were able to use a function as many times as we wanted. In addition to functions, Python also provides a higher level of abstraction through Classes and Objects. Objects can be loosely defined as a collection of data items together with a set of methods. The data items can be any valid Python variable or any Python object. Functions enclosed within a class are called methods. If methods are functions, why is there a distinction between the two? The answer will become clear as we walk through the concepts of Classes and Objects. Classes contain the definition for the Objects. Objects are instances of Classes.

A class is defined using the keyword class followed by the class name, in turn followed by a colon. The statements that a Class encloses are written in a new block, i.e. on the next indentation level:

class Employee:
+  def setName(self, name):
+    self.name = name
+
+  def getName(self):
+    return self.name
+
+

In the above example, we defined a class with the name Employee. We also defined two methods, setName and getName, for this class. It is important to note the differences between normal Python functions and the methods defined above. Each method of the class takes, as its first argument, the instance of the class (the object) on which it was called. This argument is conventionally named self. Note that self is only a convention. You can use any other name, but the first argument to the method will always be the object on which the method was called. The data members that belong to an object are called its attributes. An attribute is written as the object followed by a dot and the attribute name. In the above example, name is an attribute since it is set on the self object. Attributes set on self can be accessed from any method of the class.

We can create objects of a class outside the class definition by using the same syntax we use to call a function with no parameters. We can assign this object to a variable:

emp = Employee()
+
+

In the above example, we create an object named emp of the class Employee. All the attributes and methods of the class can be accessed through the object using the standard notation object.attribute or object.method(). Although the first parameter of a method is the self object, it must not be passed as an argument when calling the method. The self object is passed to the method implicitly by the Python interpreter. All other argument-passing rules, such as default arguments, keyword arguments, and argument packing and unpacking, are the same as for ordinary Python functions:

>>> emp.setName('John')
+>>> name = emp.getName()
+>>> print name
+John
+>>> print emp.name
+John
+
+

If we try to access an attribute before assigning a value to it, i.e. before creating it, Python raises an AttributeError, much as it raises a NameError when an undefined variable is accessed:

>>> emp = Employee()
>>> emp.name
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: Employee instance has no attribute 'name'
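One common way to avoid this error is to create the attribute as soon as the object is made, using the special method __init__, which Python calls automatically when an object is created. This goes beyond what the text above covers, so treat it as a sketch: the default name 'unknown' is an assumption made for the example.

```python
class Employee(object):
    """Variant of the Employee class whose name attribute always exists."""

    def __init__(self, name='unknown'):
        # Runs when Employee(...) is called, so name is set from the start.
        self.name = name

    def setName(self, name):
        self.name = name

    def getName(self):
        return self.name


emp = Employee()
print(emp.getName())
emp.setName('John')
print(emp.name)
print(hasattr(emp, 'salary'))   # attributes that were never assigned are absent
```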


+ + diff -r 000000000000 -r 8083d21c0020 web/html/backup/chap_intro.py --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/web/html/backup/chap_intro.py Mon Jan 25 18:56:45 2010 +0530 @@ -0,0 +1,2 @@ +p_list={'chap_intro':['x_38', 'x_39', 'x_3d', 'x_3e', 'x_3f', 'x_40', 'x_41', 'x_42', 'x_46', 'x_4c', 'x_4d', 'x_4e', 'x_4f', 'x_50', 'x_51', 'x_52', 'x_53', 'x_54', 'x_55', 'x_56', 'x_57', 'x_58', 'x_59', 'x_5a', 'x_5b', 'x_5c', 'x_5d', 'x_5e', 'x_5f', 'x_60', 'x_61', 'x_62', 'x_63', 'x_64', 'x_65', 'x_66', 'x_67', 'x_68', 'x_69', 'x_6d', 'x_6e', 'x_6f', 'x_70', 'x_71', 'x_72', 'x_73', 'x_74', 'x_75', 'x_79', 'x_7a', 'x_7b', 'x_7c', 'x_7d', 'x_7e', 'x_7f', 'x_80', 'x_81', 'x_82', 'x_83', 'x_84', 'x_85', 'x_86', 'x_87', 'x_88', 'x_89', 'x_8a', 'x_8b', 'x_8c', 'x_8d', 'x_8e', 'x_8f', 'x_90', 'x_91', 'x_92', 'x_93', 'x_94', 'x_95', 'x_96', 'x_97', 'x_98', 'x_99', 'x_9a', 'x_9b', 'x_9c', 'x_9d']} + diff -r 000000000000 -r 8083d21c0020 web/html/backup/func.html --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/web/html/backup/func.html Mon Jan 25 18:56:45 2010 +0530 @@ -0,0 +1,726 @@ + + +Chapter 9. Finding and fixing mistakes + + + + + + + + + + + +



+Functional Approach

Table of Contents

5.1. List Comprehensions

def factorial(n):
  fact = 1
  # range excludes its end value, so n + 1 is needed to include n
  for i in range(2, n + 1):
    fact *= i

  return fact

The code snippet above defines a function with the name factorial, takes the number for which the factorial must be computed, computes the factorial and returns the value.

A Function once defined can be used or called anywhere else in the program. We call a function with its name followed by a pair of parentheses which enclose the arguments to the function.

The value that function returns can be assigned to a variable. Let's call the above function and store the factorial in a variable:

fact5 = factorial(5)
+
+

The value of fact5 will now be 120, which is the factorial of 5. Note that we passed 5 as the argument to the function.

def factorial(n):
  'Returns the factorial for the number n.'
  fact = 1
  # range excludes its end value, so n + 1 is needed to include n
  for i in range(2, n + 1):
    fact *= i

  return fact
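The docstring is stored on the function object itself, in the attribute __doc__, which is what documentation tools read. A short sketch using the factorial function (with the loop running up to and including n):

```python
def factorial(n):
    'Returns the factorial for the number n.'
    fact = 1
    for i in range(2, n + 1):
        fact *= i
    return fact


print(factorial.__doc__)   # the docstring is available at run time
print(factorial(5))
```

At the interactive prompt, help(factorial) displays the same string.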

Let us write a small function to swap two values:

def swap(a, b):
  return b, a

# example values; any two objects will do
a, b = 1, 2
c, d = swap(a, b)

def cant_change(n):
+  n = 10
+
+n = 5
+cant_change(n)
+
+

>>> def can_change(n):
...   n[1] = 'James'
...
>>> name = ['Mr.', 'Steve', 'Gosling']
>>> can_change(name)
>>> name
['Mr.', 'James', 'Gosling']
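The two examples above differ in one respect: assigning to a parameter name only rebinds the local name, while assigning to an element changes the object that the caller also sees. The following sketch puts the two side by side; the function names rebind and mutate are hypothetical.

```python
def rebind(n):
    n = 10              # rebinds the local name only; the caller is unaffected


def mutate(items):
    items[1] = 'James'  # changes the list object shared with the caller


number = 5
rebind(number)

name = ['Mr.', 'Steve', 'Gosling']
mutate(name)

print(number)
print(name)
```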

If a function does not return anything explicitly, Python returns None when the function is called.

+1. Default Arguments

def fib(n=10):
  fib_list = [0, 1]
  for i in range(n - 2):
    # next_term avoids shadowing the built-in next()
    next_term = fib_list[-2] + fib_list[-1]
    fib_list.append(next_term)
  return fib_list

>>> fib()
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
>>> fib(5)
[0, 1, 1, 2, 3]

+2. Keyword Arguments

In a function call, Keyword arguments can be used for each argument, in the following fashion:

argument_name=argument_value

This is also denoted as keyword=argument. For example:

def wish(name='World', greetings='Hello'):
  print "%s, %s!" % (greetings, name)

wish(name='Guido', greetings='Hey')
+wish(greetings='Hey', name='Guido')
+
+

def my_func(x, y, z, u, v, w):
+  # initialize variables.
+  ...
+  # do some stuff 
+  ...
+  # return the value
+
+

It is valid to call the above function in the following ways:

my_func(10, 20, 30, u=1.0, v=2.0, w=3.0)
+my_func(10, 20, 30, 1.0, 2.0, w=3.0)
+my_func(10, 20, z=30, u=1.0, v=2.0, w=3.0)
+my_func(x=10, y=20, z=30, u=1.0, v=2.0, w=3.0)
+
+

The following are some of the invalid calls:

my_func(10, 20, z=30, 1.0, 2.0, 3.0)
+my_func(x=10, 20, z=30, 1.0, 2.0, 3.0)
+my_func(x=10, y=20, z=30, u=1.0, v=2.0, 3.0)
+
+
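The invalid calls above are rejected before the program even runs, because a positional argument may not follow a keyword argument. One way to check this without defining my_func is to ask Python to compile the call as text. This is only a sketch: the helper name is_valid_call is hypothetical, and compile checks the syntax alone without calling anything.

```python
def is_valid_call(source):
    """Return True if source parses as a Python expression."""
    try:
        compile(source, '<example>', 'eval')
        return True
    except SyntaxError:
        # raised when a positional argument follows a keyword argument
        return False


print(is_valid_call('my_func(10, 20, 30, 1.0, 2.0, w=3.0)'))
print(is_valid_call('my_func(10, 20, z=30, 1.0, 2.0, 3.0)'))
```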

+3. Parameter Packing and Unpacking

def print_report(title, *args, **name):
+  """Structure of *args*
+  (age, email-id)
+  Structure of *name*
+  {
+      'first': First Name
+      'middle': Middle Name
+      'last': Last Name
+  }
+  """
+
+  print "Title: %s" % (title)
+  print "Full name: %(first)s %(middle)s %(last)s" % name
+  print "Age: %d\nEmail-ID: %s" % args
+
+

The above function can be called as follows. Note that the order of the keyword parameters can be interchanged:

>>> print_report('Employee Report', 29, 'johny@example.com', first='Johny',
+                 last='Charles', middle='Douglas')
+Title: Employee Report
+Full name: Johny Douglas Charles
+Age: 29
+Email-ID: johny@example.com
+
+

def print_report(title, age, email, first, middle, last):
+  print "Title: %s" % (title)
+  print "Full name: %s %s %s" % (first, middle, last)
+  print "Age: %d\nEmail-ID: %s" % (age, email)
+
+>>> args = (29, 'johny@example.com')
+>>> name = {
+        'first': 'Johny',
+        'middle': 'Charles',
+        'last': 'Douglas'
+        }
+>>> print_report('Employee Report', *args, **name)
+Title: Employee Report
+Full name: Johny Charles Douglas
+Age: 29
+Email-ID: johny@example.com
+
+
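The two directions, packing in the function definition and unpacking in the function call, can be seen side by side in a smaller sketch. The function names describe and full_name are hypothetical and are not the print_report function above.

```python
def describe(title, *args, **kwargs):
    """Return the arguments exactly as they were packed."""
    return title, args, kwargs


def full_name(first, middle, last):
    return "%s %s %s" % (first, middle, last)


packed = describe('Employee Report', 29, 'johny@example.com', first='Johny')
print(packed[1])    # extra positional arguments arrive as a tuple
print(packed[2])    # extra keyword arguments arrive as a dictionary

name = {'first': 'Johny', 'middle': 'Charles', 'last': 'Douglas'}
parts = ('Johny', 'Charles', 'Douglas')
print(full_name(**name))    # dictionary unpacked into keyword arguments
print(full_name(*parts))    # tuple unpacked into positional arguments
```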

+4. Nested Functions and Scopes

+      http://avinashv.net/2008/04/python-decorators-syntactic-sugar/
+      http://personalpages.tds.net/~kent37/kk/00001.html
+

However, the following is an example for nested functions in Python:

def outer():
  print "Outer..."
  def inner():
    print "Inner..."
  print "Outer..."
  inner()

>>> outer()
Outer...
Outer...
Inner...
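A nested function can also read the variables of the function that encloses it, and the outer function can return the inner one. The following is a sketch of that idea; the names make_multiplier and multiply are hypothetical.

```python
def make_multiplier(factor):
    """Return a function that multiplies its argument by factor."""
    def multiply(x):
        # factor is looked up in the enclosing function's scope
        return x * factor
    return multiply


double = make_multiplier(2)
triple = make_multiplier(3)
print(double(5))
print(triple(5))
```

Each call to make_multiplier produces a separate inner function that remembers its own value of factor.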

+5. map, reduce and filter functions

Python provides several built-in functions for convenience. The + map(), + reduce() and + filter() functions prove to be very useful with sequences like + Lists. +

def square(x):
  return x*x

>>> map(square, [1, 2, 3, 4])
[1, 4, 9, 16]

def mul(x, y):
  return x*y

>>> map(mul, [1, 2, 3, 4], [6, 7, 8, 9])
[6, 14, 24, 36]

def even(x):
  # x % 2 is 0 for even numbers, so compare with 0 explicitly
  return x % 2 == 0

>>> filter(even, range(1, 10))
[2, 4, 6, 8]

def mul(x, y):
+  return x*y
+
+>>> reduce(mul, [1, 2, 3, 4])
+24
+
+
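The three functions can be tried together in one script. The sketch below is written so that it also runs on Python 3, where map and filter return lazy iterators (hence the list() calls) and reduce lives in the functools module. It assumes Python 2.6 or newer for the import, and the name is_even is hypothetical.

```python
from functools import reduce   # also a built-in on Python 2


def square(x):
    return x * x


def mul(x, y):
    return x * y


def is_even(x):
    return x % 2 == 0


squares = list(map(square, [1, 2, 3, 4]))                # apply to each element
products = list(map(mul, [1, 2, 3, 4], [6, 7, 8, 9]))    # pair up two lists
evens = list(filter(is_even, range(1, 10)))              # keep matching elements
product_of_all = reduce(mul, [1, 2, 3, 4])               # ((1*2)*3)*4

print(squares)
print(products)
print(evens)
print(product_of_all)
```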

+5.1. List Comprehensions

>>> num = [1, 2, 3]
>>> sq = [x*x for x in num]
>>> sq
[1, 4, 9]
>>> all_num = [1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> even = [x for x in all_num if x%2 == 0]
>>> even
[2, 4, 6, 8]

The syntax reads almost like the sentence it stands for. It can be translated into English as, "for each element x in the list all_num, if the remainder of x divided by 2 is 0, add x to the list."
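A list comprehension is shorthand for a loop that appends to a list. The sketch below writes the same selection both ways and then combines a transformation with the filter; the variable names are hypothetical.

```python
all_num = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# The explicit loop ...
even_loop = []
for x in all_num:
    if x % 2 == 0:
        even_loop.append(x)

# ... and the equivalent list comprehension.
even_comp = [x for x in all_num if x % 2 == 0]

# A comprehension can transform and filter at once.
even_squares = [x * x for x in all_num if x % 2 == 0]

print(even_loop == even_comp)
print(even_squares)
```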

+ diff -r 000000000000 -r 8083d21c0020 web/html/backup/paragraphlist.py --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/web/html/backup/paragraphlist.py Mon Jan 25 18:56:45 2010 +0530 @@ -0,0 +1,8 @@ +import sys +f=open(sys.argv[1],'r') +pid_list=[] +for i in f.readlines(): + if i.startswith('


+Basic Python

Table of Contents

2.1. The Interactive Interpreter
2.2. ipython - An enhanced interactive Python interpreter

4.1. Numbers
4.2. Variables
4.3. Strings
4.4. Boolean

5. The while loop

6. The if conditional

7. raw_input()

8. int() method

The system requirements:

Python - version 2.5.x or newer.
IPython
Text editor - scite, vim, emacs or whatever you are comfortable with.

+1. Introduction

Resources available for reference

Web: http://www.python.org
Doc: http://www.python.org/doc
Free Tutorials:
Official Python Tutorial: http://docs.python.org/tut/tut.html
Byte of Python: http://www.byteofpython.info/
Dive into Python: http://diveintopython.org/

Advantages of Python - Why Python?

Python has been designed for readability and ease of use. It is designed in such a fashion that it imposes readability on the programmer. Python does away with braces and semicolons and instead defines code blocks by indentation, thus enhancing readability.
Python is a high level, interpreted, modular and object oriented language. Python performs memory management on its own, so the programmer need not bother about allocating and deallocating memory for variables. Python provides extensibility through modules, which can be imported easily, similar to headers in C and packages in Java. Python is object oriented and hence provides object oriented characteristics such as inheritance, encapsulation and polymorphism.
Python offers a highly powerful interactive programming interface in the form of the 'Interactive Interpreter', which will be discussed in more detail in the following sections.
Python provides a rich standard library and an extensive set of modules. The power of Python modules can be seen in this slightly exaggerated cartoon: http://xkcd.com/353/
Python interfaces well with most other programming languages such as C, C++ and FORTRAN.

Python does have one drawback, however: it is not as fast as some compiled languages like C or C++. Yet its flexibility and power more than make up for this drawback.

+2. The Python Interpreter

+2.1. The Interactive Interpreter

Python 2.5.2 (r252:60911, Oct  5 2008, 19:24:49) 
+[GCC 4.3.2] on linux2
+Type "help", "copyright", "credits" or "license" for more information.
+>>> 
+
+

Let's try an example: type print 'Hello, World!' at the prompt and hit the Enter key.

>>> print 'Hello, World!'
+Hello, World!
+
+

This example was quite straightforward, and with it we have written our first line of Python code. Now let us try typing something arbitrary at the prompt. For example:

>>> arbit word
+  File "<stdin>", line 1
+    arbit word
+            ^
+SyntaxError: invalid syntax
+>>>
+
+

>>> help()
+
+Welcome to Python 2.5!  This is the online help utility.
+
+If this is your first time using Python, you should definitely check out
+the tutorial on the Internet at http://www.python.org/doc/tut/.
+
+Enter the name of any module, keyword, or topic to get help on writing
+Python programs and using Python modules.  To quit this help utility and
+return to the interpreter, just type "quit".
+
+To get a list of available modules, keywords, or topics, type "modules",
+"keywords", or "topics".  Each module also comes with a one-line summary
+of what it does; to list the modules whose summaries contain a given word
+such as "spam", type "modules spam".
+
+help> 
+
+
+

Let us now try a few examples at the python interpreter.

Eg 1:

>>> print 'Hello, python!'
+Hello, python!
+>>>
+
+

Eg 2:

>>> print 4321*567890
+2453852690
+>>> 
+
+

Eg 3:

>>> 4321*567890
+2453852690L
+>>>
+
+

Note: Notice the 'L' at the end of the output. The 'L' signifies that the
+result of the operation is of type *long*; on this 32-bit build the product
+is too large for a plain *int*. It was absent in the previous example because
+*print* displays the str() form of a value, which omits the 'L', whereas the
+interpreter echoes the repr() form.
+
+

Eg 4:

>>> big = 12345678901234567890 ** 3
+>>> print big
+1881676372353657772490265749424677022198701224860897069000
+>>> 
+
+

This example shows that, unlike in C or C++, there is no fixed limit on the
+value of an integer; it is bounded only by the available memory.
+
+
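The same behaviour can be sketched as a short script. The sketch assumes Python 3 syntax (print as a function, and a single int type with no trailing 'L'), which differs from the Python 2.5 session shown above; the names small and big are chosen only for illustration:

```python
# Python integers grow as needed instead of overflowing at a fixed width.
small = 2 ** 31 - 1        # largest value of a signed 32-bit C int
big = 2 ** 64              # one more than the largest unsigned 64-bit value

print(small + 1)           # 2147483648
print(big)                 # 18446744073709551616
print(big.bit_length())    # 65
```

No special declaration is needed for the larger values; the interpreter switches to arbitrary precision on its own.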

Try this on the interactive interpreter: import this

Hint: The output gives an idea of the power of Python.

+2.2. ipython - An enhanced interactive Python interpreter

$ ipython
+Python 2.5.2 (r252:60911, Oct  5 2008, 19:24:49) 
+Type "copyright", "credits" or "license" for more information.
+
+IPython 0.8.4 -- An enhanced Interactive Python.
+?         -> Introduction and overview of IPython's features.
+%quickref -> Quick reference.
+help      -> Python's own help system.
+object?   -> Details about 'object'. ?object also works, ?? prints more.
+
+In [1]: 
+
+

This is the output obtained on starting ipython. The exact appearance may vary with the Python version installed. The following are some of the features provided by ipython:

Suggestions - when the Tab key is pressed, ipython suggests the methods and operations available for the given Python object.

Eg 5:

In [4]: a = 6
+
+In [5]: a.
+a.__abs__           a.__divmod__        a.__index__         a.__neg__          a.__rand__          a.__rmod__          a.__rxor__
+a.__add__           a.__doc__           a.__init__          a.__new__          a.__rdiv__          a.__rmul__          a.__setattr__
+a.__and__           a.__float__         a.__int__           a.__nonzero__      a.__rdivmod__       a.__ror__           a.__str__
+a.__class__         a.__floordiv__      a.__invert__        a.__oct__          a.__reduce__        a.__rpow__          a.__sub__
+a.__cmp__           a.__getattribute__  a.__long__          a.__or__           a.__reduce_ex__     a.__rrshift__       a.__truediv__
+a.__coerce__        a.__getnewargs__    a.__lshift__        a.__pos__          a.__repr__          a.__rshift__        a.__xor__
+a.__delattr__       a.__hash__          a.__mod__           a.__pow__          a.__rfloordiv__     a.__rsub__          
+a.__div__           a.__hex__           a.__mul__           a.__radd__         a.__rlshift__       a.__rtruediv__      
+
+

+3. Editing and running a python file

Let us look at a simple example of calculating the gcd of 2 numbers using Python:

Creating the first python script (file):

$ emacs gcd.py
+  def gcd(x,y):
+    if x % y == 0:
+      return y
+    return gcd(y, x%y)
+
+  print gcd(72, 92)
+
+

To run the script, open the shell prompt, navigate to the directory that contains the python file and run python <filename.py> at the prompt (in this case the filename is gcd.py).

Running the python script:

$ python gcd.py
+4
+$ 
+
+

Another method to run a python script would be to include the line

#!/usr/bin/python

at the beginning of the python file and then make the file executable by

$ chmod a+x filename.py

Once this is done, the script can be run as a standalone program as follows:

$ ./filename.py

+4. Basic Datatypes and operators in Python

Python provides the following set of basic datatypes.

Numbers: int, float, long, complex
Strings
Boolean

+4.1. Numbers

Eg 6:

>>> a = 1 #here a is an integer variable
+
+

Eg 7:

>>> lng = 122333444455555666666777777788888888999999999 #here lng is a variable of type long
+>>> lng
+122333444455555666666777777788888888999999999L #notice the trailing 'L'
+>>> print lng
+122333444455555666666777777788888888999999999 #notice the absence of the trailing 'L'
+>>> lng+1
+122333444455555666666777777788888889000000000L
+
+
+

Eg 8:

>>> fl = 3.14159 #fl is a float variable
+>>> e = 1.234e-4 #e is also a float variable, specified in the exponential form
+>>> a = 1
+>>> b = 2
+>>> a/b #integer division
+0
+>>> a/fl #floating point division
+0.31831015504887655
+>>> e/fl
+3.9279473133031364e-05
+
+
+
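How the / operator behaves depends on the operand types, and also on the Python version in use. As a hedged sketch, written so that it should run under both Python 2 and Python 3, the floor-division operator // and an explicit float() conversion make the intent unambiguous. The names a and b mirror Eg 8; floor_result and true_result are introduced only for illustration:

```python
a = 1
b = 2

floor_result = a // b          # floor division: the fractional part is discarded
true_result = float(a) / b     # converting one operand forces floating point division

print(floor_result)            # 0
print(true_result)             # 0.5
```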

Eg 9:

>>> cplx = 3 + 4j #cplx is a complex variable
+>>> cplx
+(3+4j)
+>>> print cplx.real #prints the real part of the complex number
+3.0
+>>> print cplx.imag #prints the imaginary part of the complex number
+4.0
+>>> print cplx*fl  #multiplies the real and imag parts of the complex number with the multiplier
+(9.42477+12.56636j)
+>>> abs(cplx) #returns the absolute value of the complex number
+5.0
+
+

+4.2. Variables

Variables are just names that represent a value. Variables have already been introduced in the various examples from the previous sections. Certain rules about using variables:

Variables have to be initialized or assigned a value before being used.
Variable names can consist of letters, digits and underscores.
Variable names cannot begin with digits, but can contain digits in them.

In reference to the previous section examples, 'a', 'b', 'lng', 'fl', 'e' and 'cplx' are all variables of various datatypes.

Note: Python is dynamically typed: a name is not tied to one datatype, so a variable
+that holds an integer can at a later stage be assigned a float as well.
+
+
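A small sketch of this rebinding, assuming Python 3 syntax for print; first_type and second_type are hypothetical names used only to record what type() reports at each step:

```python
a = 1
first_type = type(a).__name__    # the name a currently refers to an int

a = a / 2.0                      # the same name is now bound to a float
second_type = type(a).__name__

print(first_type)                # int
print(second_type)               # float
print(a)                         # 0.5
```

The type belongs to the value, not to the name, which is why no declaration is needed when the kind of value changes.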

+4.3. Strings

s = 'this is a string'              # a string can be written using single quotes
+s = 'This one has "quotes" inside!' # double quotes can appear inside single quotes
+s = "I have 'single-quotes' inside!" # and single quotes inside double quotes
+l = "A string spanning many lines\
+one more line\
+yet another"                        # a backslash at the end of a line continues the string
+# triple quotes are another way of representing multiline strings
+t = """A triple quoted string does
+not need to be escaped at the end and
+"can have nested quotes" etc."""
+
+

Try the following on the interpreter: s = 'this is a string with 'quotes' of similar kind'

Exercise: How can single quotes be used inside a single-quoted string, as in the example above, without causing an error?

+4.3.1. String operations

A few basic string operations are presented here.

String concatenation: two strings are concatenated by simply adding them with the + operator.

>>> x = 'Hello'
+>>> y = ' Python'
+>>> print x+y
+Hello Python
+
+

Try this yourself:

>>> somenum = 13
+>>> print x+somenum
+
+

str() simply converts a value to a string in a reasonable, readable form. repr() creates a string that is a representation of the value as it would be written in Python code.

The difference can be seen in the example shown below:

>>> str(1000000000000000000000000000000000000000000000000L)
+'1000000000000000000000000000000000000000000000000'
+>>> repr(1000000000000000000000000000000000000000000000000L)
+'1000000000000000000000000000000000000000000000000L'
+
+

It can be observed that str() omitted the 'L' of the long value, whereas repr() kept it in the string. An alternative way of writing repr(value) is `value`, with the value enclosed in backticks.
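The same difference shows up with strings. The sketch below assumes Python 3, where the 'L' suffix and the backtick shorthand no longer exist, so a string value is used instead; the name text is chosen only for illustration:

```python
text = "Let's go"

print(str(text))     # Let's go
print(repr(text))    # "Let's go"
```

str() gives the text as a reader would expect to see it, while repr() includes the quotes needed to write the value back as a Python literal.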

A few more examples:

>>> x = "Let's go \nto Pycon"
+>>> print x
+Let's go 
+to Pycon
+
+

>>> x = r"Let's go \nto Pycon"
+>>> print x
+Let's go \nto Pycon
+
+

Note: The r prefix makes this a raw string, so the '\n' is not parsed into a new line and is left as it is.
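One way to see the difference without printing is to compare the two strings directly. This is a small sketch with hypothetical names normal and raw; it should behave the same under Python 2 and Python 3:

```python
normal = "Let's go \nto Pycon"    # \n is one newline character
raw = r"Let's go \nto Pycon"      # \n stays as a backslash followed by n

print(len(raw) - len(normal))     # 1
print("\n" in normal)             # True
print("\n" in raw)                # False
```

The raw string is one character longer because the backslash and the n are kept as two separate characters.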

Try this yourself:

>>> x = r"Let's go to Pycon\"
+
+

>>> x = u"Let's go to Pycon!"
+>>> print x
+Let's go to Pycon!
+
+

+4.4. Boolean

Python also provides a special Boolean datatype. A boolean variable can assume a value of either True or False (note the capitalization).

Let us look at examples:

>>> t = True
+>>> f = not t
+>>> print f
+False
+>>> f or t
+True
+>>> f and t
+False
+
+

+5. The while loop

The Python while loop is similar to the C/C++ while loop. The syntax is as follows:

statement 0
+while condition:
+  statement 1 #while block
+  statement 2 #while block
+statement 3 #outside the while block.
+
+

Let us look at an example:

>>> x = 1  
+>>> while x <= 5:
+...   print x
+...   x += 1
+... 
+1
+2
+3
+4
+5
+
+

+6. The if conditional

The Python if block provides conditional execution of statements. If the condition evaluates to true, the block of statements under the if is executed.

If that block is not executed because its condition is not satisfied, the statements in the else block are executed.

The elif block allows multiple conditions to be evaluated in turn, as shown in the example.

The syntax is as follows:

if condition:
+    statement_1
+    statement_2
+elif condition:
+    statement_3
+    statement_4
+else:
+    statement_5
+    statement_6
+
+

Let us look at an example:

>>> n = int(raw_input("Input a number:")) # raw_input() returns a string, so convert it first
+>>> if n < 0:
+...     print n, "is negative"
+... elif n > 0:
+...     print n, "is positive"
+... else:
+...     print n, "is 0"
+...
+
+
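The same decision can be wrapped in a function so that it is easy to try with several values. This is only a sketch: the function name describe and the sample inputs are hypothetical, and the code is written so that it should run under both Python 2 and Python 3:

```python
def describe(n):
    """Return a short description of the sign of the integer n."""
    if n < 0:
        return "%d is negative" % n
    elif n > 0:
        return "%d is positive" % n
    else:
        return "%d is 0" % n

# Stand-ins for what a user might type; int() converts each string first.
for text in ["-3", "0", "7"]:
    print(describe(int(text)))
```

Exactly one of the three branches runs for each value, which is what the if / elif / else chain guarantees.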

+7. raw_input()

>>> input("Enter a number thats a palindrome:")
+Enter a number thats a palindrome:121
+121
+
+>>> input("Enter your name:")
+Enter your name:PythonFreak
+Traceback (most recent call last):
+  File "<stdin>", line 1, in <module>
+  File "<string>", line 1, in <module>
+NameError: name 'PythonFreak' is not defined
+
+

>>> input("Enter your name:")
+Enter your name:'PythonFreak'
+'PythonFreak'
+>>> 
+
+

Let us now look at how raw_input() operates with an example.

>>> raw_input("Enter your name:")
+Enter your name:PythonFreak
+'PythonFreak'
+
+

Observe that raw_input() returns the typed text as a string all by itself, without any quotes being entered.

>>> raw_input("Enter a number thats a palindrome:")
+Enter a number thats a palindrome:121
+'121'
+
+

Observe that raw_input() returns even the number 121 as the string '121'. Let us look at another example:

>>> pal = raw_input("Enter a number thats a palindrome:")
+Enter a number thats a palindrome:121
+>>> pal + 2
+Traceback (most recent call last):
+  File "<stdin>", line 1, in <module>
+TypeError: cannot concatenate 'str' and 'int' objects
+>>> pal
+'121'
+
+

Observe here that the variable pal is a string, so integer operations cannot be performed on it, and a TypeError is raised.

+8. int() method

Let us look at an example.

>>> intpal = int(pal)
+>>> intpal
+121
+
+

In the previous example it was observed that pal was a string variable. Here, int() converts the string pal to an integer, which is stored in intpal.
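To contrast the two types side by side, here is a short sketch. The string "121" stands in for what raw_input() would have returned, and the code is written so that it should run under both Python 2 and Python 3:

```python
pal = "121"                 # stand-in for the string returned by raw_input()
intpal = int(pal)           # convert the string to an integer

print(intpal + 2)           # 123   (integer addition)
print(pal + str(2))         # 1212  (string concatenation)
```

Converting in the other direction with str() is what makes the second line legal: both operands of + are then strings.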

Try This Yourself:

>>> stringvar = raw_input("Enter a name:")
+Enter a name:Guido Van Rossum
+>>> stringvar
+'Guido Van Rossum'
+>>> numvar = int(stringvar)
+
+


diff -r 000000000000 -r 8083d21c0020 web/html/ch01-introduction.html~
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/web/html/ch01-introduction.html~ Mon Jan 25 18:56:45 2010 +0530
@@ -0,0 +1,44 @@
+Chapter 1.

Table of Contents

Introduction to the Course

+Introduction to the Course

Engineering students use computers for a large number of curricular tasks – mostly computation centred. However, they do not see these as coding or programming tasks and usually are not even aware of the tools and techniques that will help them to handle these tasks better. This results in less than optimal use of their time and resources. It also causes difficulties when it comes to collaboration and building on other people’s work. This course is intended to train such students in good software practices and tools for producing code and documentation.

After successfully completing the program, the participants will be able to:

understand how software tools work together and how they can be used in tandem to carry out tasks,
use unix command line tools to carry out common (mostly text processing) tasks,
generate professional documents,
use version control effectively – for both code and documents,
automate tasks by writing shell scripts and python scripts,
realise the impact of coding style and readability on quality,
write mid-sized programs that carry out typical engineering / numerical computations such as those that involve (basic) manipulation of large arrays in an efficient manner,
generate 2D and simple 3D plots,
debug programs using a standardised approach,
understand the importance of tests and the philosophy of Test Driven Development,
write unit tests and improve the quality of code.

diff -r 000000000000 -r 8083d21c0020 web/html/ch02-basic_intro.html~
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/web/html/ch02-basic_intro.html~ Mon Jan 25 18:56:45 2010 +0530
@@ -0,0 +1,658 @@

diff -r 000000000000 -r 8083d21c0020 web/html/ch02-oop.html~
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/web/html/ch02-oop.html~ Mon Jan 25 18:56:45 2010 +0530
@@ -0,0 +1,82 @@

+Classes and Objects

class Employee:
+  def setName(self, name):
+    self.name = name
+
+  def getName(self):
+    return self.name
+
+

We can create objects of a class outside the class definition by using the same syntax we use to call a function with no parameters. We can assign this object to a variable:

emp = Employee()
+
+

>>> emp.setName('John')
+>>> name = emp.getName()
+>>> print name
+John
+>>> print emp.name
+John
+
+

If we try to access an instance attribute before assigning a value to it, i.e. before creating it, Python raises an AttributeError, much as it raises a NameError when an undefined variable is accessed:

>>> emp = Employee()
+>>> emp.name
+Traceback (most recent call last):
+  File "<stdin>", line 1, in <module>
+AttributeError: Employee instance has no attribute 'name'
+
+
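The order of events can be sketched as a short script. It assumes Python 3 syntax for print (the wording of the error message differs between Python versions, so the message itself is not printed), and the flag name accessed is hypothetical:

```python
class Employee:
    def setName(self, name):
        self.name = name      # the attribute is created here, on first assignment

    def getName(self):
        return self.name

emp = Employee()

# Reading the attribute before setName() has run raises AttributeError.
try:
    emp.getName()
    accessed = True
except AttributeError:
    accessed = False

emp.setName('John')           # now the attribute exists
name = emp.getName()

print(accessed)               # False
print(name)                   # John
```

A common way to avoid this situation is to assign every attribute when the object is created, but that is a design choice beyond what the class above does.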

diff -r 000000000000 -r 8083d21c0020 web/html/ch03-session4.html~
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/web/html/ch03-session4.html~ Mon Jan 25 18:56:45 2010 +0530
@@ -0,0 +1,1187 @@
+Chapter 1.

Table of Contents

More text processing

1. uniq
2. join
3. Generating a word frequency list
4. Basic editing and editors
4.1. vim
4.2. SciTE
5. Personalizing your Environment
5.1. .bashrc
5.2. .vimrc
6. Subshells and source

+More text processing

+sort

Let's say we have a file, stalwarts.txt, which lists a few of the stalwarts of the open source community and a few details about them, like their "other" name and what they are well known for, i.e. their claim to fame. The fields on each line are separated by a % character.

Richard Stallman%rms%GNU Project
+Eric Raymond%ESR%Jargon File
+Ian Murdock% %Debian
+Lawrence Lessig% %Creative Commons
+Linus Torvalds% %Linux Kernel
+Guido van Rossum%BDFL%Python
+Larry Wall% %Perl
+
+
+

Suppose we want these lines in alphabetical order. The sort command enables us to do this in a flash! Just running the sort command with the file name as a parameter sorts the lines of the file alphabetically and prints the output on the terminal:

$ sort stalwarts.txt 
+Eric Raymond%ESR%Jargon File
+Guido van Rossum%BDFL%Python
+Ian Murdock% %Debian
+Larry Wall% %Perl
+Lawrence Lessig% %Creative Commons
+Linus Torvalds% %Linux Kernel
+Richard Stallman%rms%GNU Project
+
+

If you wish to sort them in reverse alphabetical order, you just need to pass the -r option. Now, you might want to sort the lines based on each person's claim to fame or their "other" name. What do we do in that case?

Below is an example that sorts the file based on "other" names:

$ sort -t % -k 2,2  stalwarts.txt
+
+Ian Murdock% %Debian
+Larry Wall% %Perl
+Lawrence Lessig% %Creative Commons
+Linus Torvalds% %Linux Kernel
+Guido van Rossum%BDFL%Python
+Eric Raymond%ESR%Jargon File
+Richard Stallman%rms%GNU Project
+
+

By default, the sort command treats white space as the delimiter between the columns of each line. The -t option specifies the delimiting character, which is % in this case.

The -k option starts a key at position 2 and ends it at 2, essentially telling the sort command that it should sort based on the 2nd column, which is the other name. sort also supports conflict resolution using multiple columns for sorting. You can see that the first four lines have nothing in the "other" names column. We could resolve the conflict by sorting based on the project names (the 3rd column).

$ sort -t % -k 2,2 -k 3,3 stalwarts.txt
Lawrence Lessig% %Creative Commons
Ian Murdock% %Debian
Linus Torvalds% %Linux Kernel
Larry Wall% %Perl
Guido van Rossum%BDFL%Python
Eric Raymond%ESR%Jargon File
Richard Stallman%rms%GNU Project

sort also has a lot of other options, like ignoring case differences, month sort (JAN < FEB < ...), and merging already sorted files. man sort gives you a lot more information.
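As a small sketch of two more variations, the snippet below recreates the sample file in a scratch directory and sorts it by project name (the 3rd column) and in reverse order. The scratch directory and the variable names are made up for illustration; LC_ALL=C is set only so that the ordering does not depend on your locale.

```shell
# Scratch directory, so that no existing file is touched (illustrative).
workdir=$(mktemp -d)

# Recreate the sample file used in the text.
cat > "$workdir/stalwarts.txt" <<'EOF'
Richard Stallman%rms%GNU Project
Eric Raymond%ESR%Jargon File
Ian Murdock% %Debian
Lawrence Lessig% %Creative Commons
Linus Torvalds% %Linux Kernel
Guido van Rossum%BDFL%Python
Larry Wall% %Perl
EOF

# Sort on the 3rd column only (the project name).
by_project=$(LC_ALL=C sort -t % -k 3,3 "$workdir/stalwarts.txt")

# Sort whole lines in reverse alphabetical order.
reversed=$(LC_ALL=C sort -r "$workdir/stalwarts.txt")

printf '%s\n' "$by_project"
printf '%s\n' "$reversed"
```

Storing the output in variables is only for convenience here; on the terminal you would simply run the two sort commands directly.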

1. `uniq`

Suppose we have a list of items, say books, and we wish to obtain a list which names each book only once, without any duplicates. We use the uniq command to achieve this.

Programming Pearls
The C Programming Language
The Mythical Man Month: Essays on Software Engineering
Programming Pearls
The C Programming Language
Structure and Interpretation of Computer Programs
Programming Pearls
Compilers: Principles, Techniques, and Tools
The C Programming Language
The Art of UNIX Programming
Programming Pearls
The Art of Computer Programming
Introduction to Algorithms
The Art of UNIX Programming
The Pragmatic Programmer: From Journeyman to Master
Programming Pearls
Unix Power Tools
The Art of UNIX Programming

Let us try and get rid of the duplicate lines from this file using the uniq command.

$ uniq items.txt
Programming Pearls
The C Programming Language
The Mythical Man Month: Essays on Software Engineering
Programming Pearls
The C Programming Language
Structure and Interpretation of Computer Programs
Programming Pearls
Compilers: Principles, Techniques, and Tools
The C Programming Language
The Art of UNIX Programming
Programming Pearls
The Art of Computer Programming
Introduction to Algorithms
The Art of UNIX Programming
The Pragmatic Programmer: From Journeyman to Master
Programming Pearls
Unix Power Tools
The Art of UNIX Programming

Nothing happens! Why? The uniq command removes duplicate lines only when they are next to each other. So, we first sort the original file and work with the sorted file from here on.

$ sort items.txt > items-sorted.txt
$ uniq items-sorted.txt
Compilers: Principles, Techniques, and Tools
Introduction to Algorithms
Programming Pearls
Structure and Interpretation of Computer Programs
The Art of Computer Programming
The Art of UNIX Programming
The C Programming Language
The Mythical Man Month: Essays on Software Engineering
The Pragmatic Programmer: From Journeyman to Master
Unix Power Tools

The uniq -u command gives only the lines which are unique and do not have any duplicates in the file. uniq -d outputs only those lines which have duplicates. The -c option displays the number of times each line occurs in the file:

$ uniq -u items-sorted.txt
Compilers: Principles, Techniques, and Tools
Introduction to Algorithms
Structure and Interpretation of Computer Programs
The Art of Computer Programming
The Mythical Man Month: Essays on Software Engineering
The Pragmatic Programmer: From Journeyman to Master
Unix Power Tools

$ uniq -dc items-sorted.txt
5 Programming Pearls
3 The Art of UNIX Programming
3 The C Programming Language
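The intermediate sorted file is not strictly necessary: sort and uniq can be chained with a pipe. The following is a minimal sketch on a shortened, made-up subset of the book list; the variable names are illustrative only.

```shell
# A shortened list with duplicates that are not adjacent (illustrative data).
books=$(printf '%s\n' \
    'Programming Pearls' \
    'Unix Power Tools' \
    'Programming Pearls' \
    'The Art of UNIX Programming' \
    'Programming Pearls')

# Sort first so that duplicates become adjacent, then let uniq collapse them.
unique=$(printf '%s\n' "$books" | LC_ALL=C sort | uniq)

# The same pipeline with -c prefixes each line with its count.
counts=$(printf '%s\n' "$books" | LC_ALL=C sort | uniq -c)

printf '%s\n' "$unique"
printf '%s\n' "$counts"
```

sort -u gives the same result as sort | uniq when no counts are needed.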

2. `join`

Now suppose we had the file stalwarts1.txt, which lists the home pages of all the people listed in stalwarts.txt:

Richard Stallman%http://www.stallman.org
Eric Raymond%http://www.catb.org/~esr/
Ian Murdock%http://ianmurdock.com/
Lawrence Lessig%http://lessig.org
Linus Torvalds%http://torvalds-family.blogspot.com/
Guido van Rossum%http://www.python.org/~guido/
Larry Wall%http://www.wall.org/~larry/

It would be nice to have a single file with the information in both the files. To achieve this we use the join command:

$ join -t % stalwarts.txt stalwarts1.txt
Richard Stallman%rms%GNU Project%http://www.stallman.org
Eric Raymond%ESR%Jargon File%http://www.catb.org/~esr/
Ian Murdock% %Debian%http://ianmurdock.com/
Lawrence Lessig% %Creative Commons%http://lessig.org
Linus Torvalds% %Linux Kernel%http://torvalds-family.blogspot.com/
Guido van Rossum%BDFL%Python%http://www.python.org/~guido/
Larry Wall% %Perl%http://www.wall.org/~larry/

The join command joins the two files based on the common field present in both the files, which is the name in this case.

The -t option again specifies the delimiting character. Unless that is specified, join assumes that the fields are separated by spaces.

Note that, for join to work, the common field should be in the same order in both the files (join formally expects both inputs to be sorted on that field). If this is not so, you could use sort to sort the files on the common field and then join the files. In the above example, the common field is the first column in both the files. If this is not the case, we could use the -1 and -2 options to specify the field to be used for joining in the first and the second file respectively. The example below assumes a file stalwarts2.txt in which the name is the second column:

$ join -t % -2 2 stalwarts.txt stalwarts2.txt
Richard Stallman%rms%GNU Project%http://www.stallman.org
Eric Raymond%ESR%Jargon File%http://www.catb.org/~esr/
Ian Murdock% %Debian%http://ianmurdock.com/
Lawrence Lessig% %Creative Commons%http://lessig.org
Linus Torvalds% %Linux Kernel%http://torvalds-family.blogspot.com/
Guido van Rossum%BDFL%Python%http://www.python.org/~guido/
Larry Wall% %Perl%http://www.wall.org/~larry/
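When the two files list people in different orders, sorting both on the common field first makes the join reliable. The sketch below uses two small, hypothetical files (projects.txt and homepages.txt, not part of the course material) to show the idea.

```shell
workdir=$(mktemp -d)

# Two hypothetical files whose lines are in different orders.
cat > "$workdir/projects.txt" <<'EOF'
Richard Stallman%GNU Project
Eric Raymond%Jargon File
Larry Wall%Perl
EOF

cat > "$workdir/homepages.txt" <<'EOF'
Larry Wall%http://www.wall.org/~larry/
Richard Stallman%http://www.stallman.org
Eric Raymond%http://www.catb.org/~esr/
EOF

# Sort both files on the first field, using the same locale as join.
LC_ALL=C sort -t % -k 1,1 "$workdir/projects.txt" > "$workdir/projects-sorted.txt"
LC_ALL=C sort -t % -k 1,1 "$workdir/homepages.txt" > "$workdir/homepages-sorted.txt"

# Join on the first field of both files.
joined=$(LC_ALL=C join -t % "$workdir/projects-sorted.txt" "$workdir/homepages-sorted.txt")

printf '%s\n' "$joined"
```

Using the same locale for sort and join matters, since join expects its input in the order that sort would produce.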

3. Generating a word frequency list

Now, let us use the tools we have learnt to generate a word frequency list of a text file. We shall use the free text of Alice in Wonderland.

The basic steps to achieve this task would be:

Eliminate the punctuation and spaces from the document.
Generate a list of words.
Count the words.

We first use grep and some elementary regex to eliminate the non-alphabetic characters:

$ grep "[A-Za-z]*" alice-in-wonderland.txt

This outputs every line of the file: since * also matches zero characters, the pattern matches on every line, so nothing has been filtered out yet. We only require the alphabetic characters, without any of the other junk. man grep shows us the -o option for outputting only the text which matches the regular expression:

$ grep "[A-Za-z]*" -o alice-in-wonderland.txt

Not very surprisingly, we have all the words, spit out in the form of a list! Now that we have a list of words, it is quite simple to count the occurrences of the words. You would've realized that we can make use of the sort and uniq commands. We pipe the output from grep to sort and then pipe its output to uniq:

$ grep "[A-Za-z]*" -o alice-in-wonderland.txt | sort | uniq -c

Notice that you get the list of all words in the document in alphabetical order, with each word's frequency written next to it. But you might have observed that capitalized words and lower case words are being counted as different words. We therefore replace all the upper case characters with lower case ones, using the tr command:

$ grep "[A-Za-z]*" -o alice-in-wonderland.txt | tr 'A-Z' 'a-z' | sort | uniq -c

Now, it would also be nice to have the list ordered in decreasing order of the frequency of the words. We sort the output of the uniq command with the -n and -r options to get the desired output:

$ grep "[A-Za-z]*" -o alice-in-wonderland.txt | tr 'A-Z' 'a-z' | sort | uniq -c | sort -nr
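To try the whole pipeline without downloading the book, here is a minimal sketch that runs it on a short made-up sentence. It uses the pattern [A-Za-z][A-Za-z]*, a small variation on the one above that requires at least one letter per match; the variable names are illustrative.

```shell
# A tiny made-up input instead of alice-in-wonderland.txt.
text='The cat and the hat. The end!'

# Extract words, lower-case them, count them, and keep the most frequent one.
top=$(printf '%s\n' "$text" \
    | grep -o '[A-Za-z][A-Za-z]*' \
    | tr 'A-Z' 'a-z' \
    | sort \
    | uniq -c \
    | sort -nr \
    | head -n 1)

printf '%s\n' "$top"
```

The count is printed before the word, padded with leading spaces by uniq -c.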

4. Basic editing and editors

4.1. vim

Vim is a very powerful editor. It has a lot of commands, and not all of them can be explained here. We shall try and look at a few, so that you can find your way around in vim.

To open a file in vim, we pass the filename as a parameter to the vim command. If a file with that filename does not exist, a new file is created:

$ vim first.txt

To start inserting text into the new file that we have opened, we need to press the i key. This will take us into the insert mode from the command mode. Hitting the Esc key will bring us back to the command mode. There is also another mode of vim, called the visual mode, which will be discussed later in the course.

In general, it is good to spend as little time as possible in the insert mode and extensively use the command mode to achieve various tasks.

To save the file, use :w in the command mode. From here on, it is understood that we are in the command mode whenever we are issuing any command to vim.

To save a file and continue editing, use :w FILENAME. The file name is optional. If you do not specify a filename, the text is saved in the same file that you opened. If a file name different from the one you opened is specified, the text is saved with the new name, but you continue editing the file that you opened. The next time you save it without specifying a name, it gets saved with the name of the file that you initially opened.

To save the file with a new name and continue editing the new file, use :saveas FILENAME

To save and quit, use :wq

To quit, use :q

To quit without saving, use :q!

4.1.1. Moving around

While you are typing in a file, it is inconvenient to keep moving your fingers from the standard typing position to the arrow keys. Vim, therefore, provides alternate keys for moving in the document. Note again that you should be in the command mode when issuing any commands to vim.

The basic cursor movement can be achieved using the keys h (left), l (right), k (up) and j (down):

    ^
    k
< h   l >
    j
    v

Note: Most commands can be prefixed with a number, to repeat the command. For instance, 10j will move the cursor down 10 lines.

4.1.1.1. Moving within a line

Cursor Movement	Command
Beginning of line	`0`
First non-space character of line	`^`
End of line	`$`
Last non-space character of line	`g_`

4.1.1.2. Moving by words and sentences

Cursor Movement	Command
Forward, word beginning	`w`
Backward, word beginning	`b`
Forward, word end	`e`
Backward, word end	`ge`
Forward, sentence beginning	`)`
Backward, sentence beginning	`(`
Forward, paragraph beginning	`}`
Backward, paragraph beginning	`{`

4.1.1.3. More movement commands

Cursor Movement	Command
Forward by a screenful of text	`C-f`
Backward by a screenful of text	`C-b`
Beginning of the screen	`H`
Middle of the screen	`M`
End of the screen	`L`
End of file	`G`
Line number `num`	`[num]G`
Beginning of file	`gg`
Next occurrence of the text under the cursor	`*`
Previous occurrence of the text under the cursor	`#`

Note: C-x is Ctrl + x

4.1.2. The visual mode

The visual mode is a special mode that is not present in the original vi editor. It allows us to highlight text and perform actions on it. All the movement commands that have been discussed so far work in the visual mode too. The editing commands discussed below also work on the selected visual blocks.

4.1.3. Editing commands

The editing commands usually take the movements as arguments. A movement is equivalent to a selection in the visual mode. The cursor is assumed to have moved over the text in between the initial and the final points of the movement. The motion or the visual block that's been highlighted can be passed as arguments to the editing commands.

Editing effect	Command
Cutting text	`d`
Copying/Yanking text	`y`
Pasting copied/cut text	`p`

The cut and copy commands take the motions or visual blocks as arguments and act on them. For instance, if you wish to delete the text from the current position to the beginning of the next word, type dw. If you wish to copy the text from the current position to the end of this sentence, type y).

Apart from the above commands, which take any motion or visual block as an argument, there are additional special commands.

Editing effect	Command
Cut the character under the cursor	`x`
Replace the character under the cursor with `a`	`ra`
Cut an entire line	`dd`
Copy/yank an entire line	`yy`

Note: You can prefix numbers to any of the commands, to repeat them.

4.1.4. Undo and Redo

You can undo almost anything using u.

To redo, that is, to undo the undo command, type C-r.

4.1.5. Searching and Replacing

Finding	Command
Next occurrence of `text`, forward	`/text`
Next occurrence of `text`, backward	`?text`
Search again in the same direction	`n`
Search again in the opposite direction	`N`
Next occurrence of `x` in the line	`fx`
Previous occurrence of `x` in the line	`Fx`

Finding and Replacing	Command
Replace the first instance of `old` with `new` in the current line.	`:s/old/new`
Replace all instances of `old` with `new` in the current line.	`:s/old/new/g`
Replace all instances of `old` with `new` in the current line, but ask for confirmation each time.	`:s/old/new/gc`
Replace the first instance of `old` on each line with `new`, in the entire file.	`:%s/old/new`
Replace all instances of `old` with `new` in the entire file.	`:%s/old/new/g`
Replace all instances of `old` with `new` in the entire file, but ask for confirmation each time.	`:%s/old/new/gc`

4.2. SciTE

SciTE is a source code editor that has a feel similar to the commonly used GUI text editors. It has a wide range of features that are extremely useful for a programmer editing code. It also aims to keep configuration simple: the user edits a text file to configure SciTE to their liking.

Opening, saving and editing files with SciTE is straightforward. Knowledge of using a text editor will suffice.

SciTE can syntax highlight code in various languages. It also has auto-indentation, code-folding and other such features which are useful when editing code.

SciTE also gives you the option to (compile and) run your code from within the editor.

5. Personalizing your Environment

5.1. .bashrc

What would you do if you want bash to execute a particular command each time you start it up? For instance, say you want the current directory to be your Desktop instead of your home folder, each time bash starts up. How would you achieve this? Bash reads and executes commands in a whole bunch of files called start-up files, when it starts up.

When bash starts up as an interactive login shell, it first reads /etc/profile. It then looks for ~/.bash_profile, ~/.bash_login, and ~/.profile, in that order, and reads only the first one that exists and is readable.

When it is an interactive shell that is not a login shell, ~/.bashrc is read and the commands in it are executed. This can be prevented using the --norc option. To force bash to use another file instead of ~/.bashrc on start-up, the --rcfile option may be used.

Now you know what you should do to change the current directory to your Desktop. Just put a cd ~/Desktop into your ~/.bashrc and you are set!

This example is quite a simple and lame one. The start-up files are used for a lot more complex things than this. You could set (or unset) aliases and a whole bunch of environment variables in the .bashrc. We shall look at them in the next section, where we look at environment variables and the set command.
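As a rough sketch of what such a file might contain, here are a few lines one could add to ~/.bashrc. The alias name ll and the choice of vim as editor are only examples, not settings the course prescribes.

```shell
# Illustrative lines for ~/.bashrc; adapt or drop as you see fit.

# A shorthand for a long directory listing (the name "ll" is just a convention).
alias ll='ls -l'

# An environment variable that many programs consult to pick an editor.
export EDITOR=vim

# Start in the Desktop folder, but only if it actually exists.
if [ -d "$HOME/Desktop" ]; then
    cd "$HOME/Desktop"
fi
```

The check around cd avoids an error message on machines that have no Desktop folder.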

5.2. .vimrc

.vimrc is a file similar to .bashrc, for vim. It is a start-up file that vim reads and executes each time it starts up. The options that you would like to be set every time you use vim are placed in the .vimrc file, so that they are automatically set each time vim starts. The recommended place for your .vimrc is also your home directory.

The file /etc/vimrc is the global config file and shouldn't usually be edited. You can instead edit the ~/.vimrc file that is present in your home folder.

There are a whole bunch of variables that you could set in the .vimrc file. You can look at all the options available using the :set all command in vim. You could use :help option_name to get more information about the option that you want to set. Once you are comfortable with what you want to set a particular variable to, you could add it to .vimrc. You should also look at :help vimrc for more info on the .vimrc file. If you already have a .vimrc file, you can edit it from within vim using the :e $MYVIMRC command. We shall look at some of the most commonly used options.

Command	Vim action
`set nocompatible`	Explicitly disable compatibility with vi
`set backspace=indent,eol,start`	In the insert mode, vim allows the backspace key to delete white spaces at the start of line, line breaks and the character before which insert mode started.
`set autoindent`	Vim indents a new line with the same indentation of the previous line.
`set backup`	Vim keeps a backup copy of a file when overwriting it.
`set history=50`	Vim keeps 50 commands and 50 search patterns in the history.
`set ruler`	Displays the current cursor position in the lower right corner of the vim window.
`set showcmd`	Displays the incomplete command in the lower right corner.
`set incsearch`	Turns on incremental searching. Displays search results while you type.

You can see the effect of the changes made to your .vimrc file by restarting vim. If you want to see the changes immediately, you could source the file from within vim.

If the .vimrc file was sourced when this instance of vim was started, you could just source the file again:

:so $MYVIMRC

If you just created the .vimrc file, or it was not sourced when you started this instance of vim, just replace the $MYVIMRC variable above with the location of the .vimrc file that you created or edited.

6. Subshells and `source`

A subshell is just a separate instance of the shell which is a child process of the shell that launches it. Bash creates a subshell in various circumstances. Creation of subshells allows the execution of various processes simultaneously.

When an external command is executed, a new subshell is created. Any built-in commands of bash are executed within the same shell, and no new subshell is started. When an external command is run, the bash shell copies itself (along with its environment), creating a subshell, and that process is then replaced by the external command. The subshell is a child process of this shell.
Any pipes being used create a subshell. The commands on the input and output ends of the pipe are run in different subshells.
You could also explicitly tell bash to start a subshell by enclosing a list of commands between parentheses. All the commands in the list are executed within a single new subshell.

To avoid creating a subshell when running a shell script, you could use the source command:

$ source script.sh

This will run script.sh within the present shell without creating a subshell. The . command is an alias for the source command. . script.sh is therefore equivalent to source script.sh.
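The practical difference shows up with variables: changes made in a subshell or a child process are lost, while changes made by a sourced script remain. The sketch below illustrates this with a throwaway script in a scratch directory; the variable name greeting and the script contents are made up for the demonstration.

```shell
workdir=$(mktemp -d)

# A one-line script that sets a variable.
printf 'greeting="set by script"\n' > "$workdir/script.sh"

greeting="original"

# Parentheses start a subshell; the assignment is lost when it exits.
( greeting="changed in subshell" )
after_subshell=$greeting

# Running the script with sh starts a child process; our variable is untouched.
sh "$workdir/script.sh"
after_child=$greeting

# Sourcing with . runs the script in the current shell, so the change sticks.
. "$workdir/script.sh"
after_source=$greeting

printf '%s\n' "$after_subshell" "$after_child" "$after_source"
```

This is also why start-up files such as .bashrc are sourced rather than executed: their settings have to end up in the current shell.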


Table of Contents

LaTeX

1. TeX & LaTeX

1.1. TeX
1.2. LaTeX

2. WYSIWYG vs. WYSIWYM

3. Hello World

3.1. Compiling & Output
3.2. A peek at the source

4. Where do we want to go

5. Some Basics

5.1. Spaces
5.2. Line & Page Breaks
5.3. Paragraphs
5.4. Special Characters
5.5. Commands
5.6. Environments

6. Some Structural Elements

6.1. \documentclass
6.2. Parts, Chapters and Sections
6.3. Top Matter
6.4. Abstract
6.5. Appendices
6.6. Table of Contents

7. Elementary Text Typesetting

7.1. Emphasizing
7.2. Quotation Marks
7.3. Dashes and Hyphens
7.4. Footnotes
7.5. Flushleft, Flushright, and Center
7.6. Itemize, Enumerate, and Description
7.7. Quote, Quotation, and Verse
7.8. Verbatim

8. Tables, Figures and Captions

8.1. The \tabular environment
8.2. Importing Graphics
8.3. Floats
8.4. Captions
8.5. List of Figures, Tables
8.6. Cross References

9. Bibliography

9.1. thebibliography environment
9.2. BibTeX

10. Typesetting Math

10.1. Math Mode
10.2. Single Equations
10.3. Basic Elements
10.4. Multiple Equations
10.5. Arrays and Matrices

11. Miscellaneous Stuff

11.1. Presentations
11.2. Including Code
11.3. Including files

12. Recommended Reading

LaTeX

Introduction

LaTeX is a typesetting program used to produce excellently typeset documents. It is extensively used for producing high quality scientific and mathematical documents. It may also be used for producing other kinds of documents, ranging from simple one page articles or letters

1. TeX & LaTeX

1.1. TeX

TeX is a typesetting system designed by Donald Knuth, the renowned Computer Scientist and Emeritus professor at Stanford University. Typesetting is placing text onto a page with all the style formatting defined, so that content looks as intended.

It was designed with two goals in mind:

To allow anybody to produce high-quality books using a reasonable amount of effort.
To provide a system that would give the exact same results on all computers, now and in the future.

TeX is well known for its stability and portability.

TeX is pronounced as "tech".

At the time of writing, the version of TeX was 3.1415926. Each new release adds a digit, so the version number is converging to π.

1.2. LaTeX

LaTeX was originally written by Leslie Lamport in the early 1980s. It is an extension of TeX, consisting of TeX macros and a program to parse the LaTeX files. It is easier to use than TeX itself, while producing the same quality of output.

LaTeX is pronounced either as "Lah-tech" or "Lay-tech".

2. WYSIWYG vs. WYSIWYM

WYSIWYG is an acronym for "What You See Is What You Get". Word processors are typically WYSIWYG tools. LaTeX, TeX and other TeX based tools are not. They are typesetting or text formatting or document description programs. They can be called WYSIWYM or "What You See Is What You Mean" systems, since you describe what each part of the document is, and LaTeX typesets the document for you.

Here are a few reasons why you should use LaTeX:

LaTeX produces documents with excellent visual quality, especially mathematical and scientific documents.
It does the typesetting for you. Typically, when one works with a word processor, the user is doing the text formatting or typesetting along with typing out the content. LaTeX allows the user to concentrate on the content, leaving the typesetting to LaTeX.
It is light on your resources as compared to most of the word processors available today.
It is well known for its stability and for its virtually bug free code base.
It encourages users to structure documents by meaning rather than appearance, thereby helping produce well structured documents.
It uses plain text files as input, which have a lot of well known advantages over binary files. To state a few, they can be opened with any editor on any operating system, they are smaller in size compared to the binaries, they can be version controlled, and they can be processed using widely used text processing utilities.
The output can be generated in more than one format.
It is free software (free as in freedom) and gratis too.
It is widely used.

3. Hello World

OK, let's get started with our first LaTeX document. Open up your favorite editor and type in the following code.

%hello.tex - First LaTeX document
\documentclass{article}

\begin{document}
  Hello, World!
\end{document}

Save the file as hello.tex and open up a terminal to compile your tex file to get the output in pdf format.

3.1. Compiling & Output

$ pdflatex hello.tex

Output written on hello.pdf (1 page, 5733 bytes).
Transcript written on hello.log.

Open hello.pdf to see the output.

Note: The command latex is often used to get dvi output. But, throughout this course, we shall use pdflatex to compile our documents.

3.2. A peek at the source

%hello.tex - First LaTeX document

This line is a comment. LaTeX ignores anything after a % symbol up to the end of the line, so the comment is meant only for human readers.

\documentclass{article}

This line is a command and sets the documentclass of the document to article. LaTeX has other classes like report, book, letter, etc. The typesetting of the document varies depending on the documentclass of the document.

\begin{document}

This line informs LaTeX that this is the beginning of the content of the document.

Hello, World!

This is the actual text displayed in the document.

\end{document}

This line tells LaTeX that the document is complete. LaTeX will simply ignore anything written after this line.

4. Where do we want to go

During the course of this session we will learn how to do various things in LaTeX and try to produce the sample document provided.

5. Some Basics

Before we get started with creating the document, let's try to understand a few things that would be useful during the course of this session.

5.1. Spaces

LaTeX treats multiple empty spaces (or lines) as a single space (or line). An empty line between two lines of text is considered as a change of paragraphs.

5.2. Line & Page Breaks

LaTeX usually does the job of breaking up your content into lines and pages, and does it well. But under some circumstances, you might want to instruct LaTeX to break a line or start a new page at a particular point.

The \\ or \newline command is used to break the line at the point where the command is issued. Appending * to \\ (that is, \\*) breaks the line but tells LaTeX not to start a new page at that point.

5.3. Paragraphs

As already mentioned, LaTeX considers an empty line between two lines of text as a new paragraph. The \par command may also be used to start a new paragraph. It is equivalent to the blank line.

By default LaTeX indents new paragraphs. If you do not wish to have the paragraph indented, you can use the \noindent command at the beginning of the paragraph.

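5.4. Special Characters

LaTeX associates special meaning to the characters ~ # $ % ^ & _ { } \.

To have most of these characters in the text of your document, you need to prefix a backslash to them: \# \$ \% \& \_ \{ \}. The remaining three need a little more care. \~ and \^ are accent commands, so write \~{} and \^{} to get the bare characters, and use \textbackslash for a backslash, since \\ is the line break command.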
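A minimal document to try these out might look like the following sketch; the sample sentence is made up purely to exercise each escape.

```latex
% special.tex - printing characters that LaTeX treats specially
\documentclass{article}

\begin{document}
  % Characters that only need a leading backslash
  50\% of \$10 \& item \#1 in snake\_case \{braces\}

  % Tilde and caret need an empty argument; backslash has its own command
  \~{} \^{} \textbackslash
\end{document}
```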

5.5. Commands

All LaTeX commands start with a backslash \.
Like the commands in Linux, they are case sensitive.
They usually consist of a backslash followed by a name made up of letters only. Any character other than a letter, such as a space, a number or a special character, terminates the command.
The commands for producing special characters in the text are an exception. They consist of a backslash followed by a single special character.
Commands may have parameters, which are supplied to them by enclosing them in curly braces { }.
They may also have a few optional parameters, which are added after the name in square brackets [ ].

5.6. Environments

Environments are very similar to commands, except that they affect larger parts of the document. For example, we used the document environment in our first LaTeX document.

They begin with a \begin and end with an \end.
In general, environments can be nested within each other.
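As an illustration of nesting, the sketch below places an itemize environment inside a quote environment, both inside document. The choice of these two environments is arbitrary; any properly closed pair would do.

```latex
\documentclass{article}

\begin{document}
  \begin{quote}
    % itemize is opened and closed entirely inside quote
    \begin{itemize}
      \item First point
      \item Second point
    \end{itemize}
  \end{quote}
\end{document}
```

Note that environments must be closed in the reverse order of opening: the inner one ends before the outer one.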

6. Some Structural Elements

6.1. `\documentclass`

As already stated, the documentclass command tells LaTeX the type of the document that you intend to create. Each class has a few differences in how the content of the document is typeset. We presently have it set to the article class. Let us try changing it to the report class.

Note that the top matter of the document appears on a separate page for the report class.

Some of the LaTeX classes that you may want to use are article, proc, report, book, slides and letter.

The documentclass command also accepts a few optional parameters. For example:

\documentclass[12pt,a4paper,oneside,draft]{report}

12pt specifies the size of the main font in the document. The relative sizes of the various fonts are maintained when the font size is changed. If no size is specified, 10pt is assumed by default.

a4paper specifies the size of the paper to be used for the document.

oneside specifies that the document will be printed only on one side of the paper. The article and report classes are oneside by default and the book class is twoside.

draft marks the hyphenation and justification problems in the document with a small square in the right hand margin of the document, so that they can be easily spotted.

Note: Everything written in between the \documentclass command and the \begin{document} command is called the Preamble.

6.2. Parts, Chapters and Sections

Often documents are divided into various parts, chapters, sections and subsections. LaTeX provides an intuitive mechanism to include this in your documents. It has various commands like \part, \chapter, \section, \subsection, \subsubsection, \paragraph and \subparagraph. Note that not all of these commands are available in every document class. The \chapter command is available only in books and reports. Also, the letter document class does not have any of these commands.

Let us now give our document some structure, using these commands.

Note that you do not need to provide any numbers to the commands. LaTeX automatically takes care of the numbering. Also, you do not need to enclose the text of a block within \begin and \end commands. LaTeX starts a new block each time it finds a sectioning command. A sectioning command also accepts an optional short title in square brackets, for use in the table of contents:

\section[Short Title]{This is a very long title and the Short Title will appear in the Table of Contents.}

6.2.1. Section Numbering

As already stated, you don't need to do any numbering explicitly in LaTeX. Parts are numbered using Roman numerals; chapters and sections are numbered using decimal numbers. When the table of contents is inserted into a document, all the numbered headings automatically appear in it.

By default, LaTeX numbers headings up to level 2, i.e., parts, chapters, sections and subsections are numbered. You can change this by setting the secnumdepth counter using the \setcounter command. The following command removes the numbering of subsections, so that only parts, chapters and sections are numbered:

\setcounter{secnumdepth}{1}

A sectioning command appended with an asterisk gives an unnumbered heading that is not included in the table of contents:

\section*{Introduction}

6.3. Top Matter

The information about the document, such as its title, the date and the author(s), is collectively known as the top matter. Though there is no command called topmatter, the term is frequently used in LaTeX documentation.

Let us input the top matter for our document now:

\title{LaTeX - A How-to}
\author{The FOSSEE Team}
\date{\today}

The commands \title and \author are self-explanatory. The \date command sets the date shown in the document; \today expands to the date on which the document is compiled. If \date is omitted altogether, today's date is used. Now let us compile and look at the result.

You will observe that the details do not appear in the document after recompilation. This is because LaTeX has not been told what to do with the top matter information that you have given it. Use the \maketitle command within the document environment to instruct LaTeX to place the top matter information into the document.
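
Putting the pieces together, a minimal sketch of a file whose top matter is actually typeset could look like this:

```latex
\documentclass{report}

% Top matter is declared in the preamble ...
\title{LaTeX - A How-to}
\author{The FOSSEE Team}
\date{\today}

\begin{document}
\maketitle   % ... and typeset here, inside the document environment
\end{document}
```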

6.4. Abstract

Let's now place an abstract in the document using the abstract environment of LaTeX. The abstract appears in the document after the top matter but before the main body of the document:

\begin{abstract}
The abstract abstract.
\end{abstract}

6.5. Appendices

LaTeX allows for separate numbering of appendices. The \appendix command indicates that the sections that follow are to be included in the appendix:

\appendix
\chapter{First Appendix}

6.6. Table of Contents

Parts, chapters or sections that have been automatically numbered by LaTeX appear in the Table of Contents (ToC). The \tableofcontents command places the ToC at the point where the command is issued.

The counter tocdepth specifies the depth up to which headings appear in the ToC. It can be set using the \setcounter command as shown below:

\setcounter{tocdepth}{3}

Unnumbered sections can be placed in the table of contents using the \addcontentsline command as shown below:

\section*{Introduction}
\addcontentsline{toc}{section}{Introduction}

Note: To get the correct entries in your table of contents, you need to run one extra compilation each time. This is because the entries of the table of contents are collected during each compilation of the document and used during the next compilation.

7. Elementary Text Typesetting

7.1. Emphasizing

Italic font is generally used to emphasize text. The \emph command may be used to achieve this effect in LaTeX:

This is the \emph{emphasized text}.

If the \emph command is nested within another \emph command, LaTeX emphasizes the inner text using the normal (upright) font:

\emph{Did you wonder what happens when we try \emph{emphasizing text} within \emph{emphasized text}}?

This is emphasized text, and this is emphasized text with normal font, within emphasized text.

7.2. Quotation Marks

When typing in LaTeX, the double quotation mark " character shouldn't be used. The grave accent ` character produces the left quote and the apostrophe ' character produces the right quote. To obtain double quotes, each of them is typed twice:

`` Here is an example of putting `text' in quotes ''

7.3. Dashes and Hyphens

LaTeX has four dashes of different lengths. Three of them can be produced with different numbers of consecutive dashes. The short dashes are used for hyphens, slightly longer ones for number ranges and the longest ones for comments. The fourth one is a mathematical symbol, the minus sign:

The names of these dashes are: `-' hyphen, `--' en-dash, `---' em-dash and `$-$' minus sign.

The names for these dashes are: ‘‐’ hyphen, ‘–’ en-dash, ‘—’ em-dash and ‘−’ minus sign.

7.4. Footnotes

With the command:

\footnote{footnote text}

a footnote is printed at the foot of the current page. Footnotes should always be put after the word or sentence they refer to. Footnotes referring to a sentence or part of it should therefore be put after the comma or period.

Note: Look at the \marginpar command to insert margin notes.

7.5. Flushleft, Flushright, and Center

The environments flushleft and flushright generate paragraphs that are either left- or right-aligned.

The center environment generates centered text.
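
A short sketch of the three environments in use (the sentences are placeholders):

```latex
\begin{flushleft}
This paragraph is aligned to the left margin.
\end{flushleft}

\begin{center}
This line is centered.
\end{center}

\begin{flushright}
This paragraph is aligned to the right margin.
\end{flushright}
```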

7.6. Itemize, Enumerate, and Description

LaTeX has three environments for producing lists: itemize, enumerate and description.

The itemize environment is used to produce unnumbered lists. The bullets of the list can easily be changed to use any character. The enumerate environment allows you to produce auto-numbered lists. The description environment allows you to produce a list of definitions. These environments can easily be nested within each other.

\begin{itemize}
  \item Now we move onto some elementary \emph{Text Typesetting}.
  \item How do we get \emph{emphasized or italic text}?
  \item \emph{Did you wonder what happens when we try \emph{emphasizing text} within \emph{emphasized text}}?
  \item ``Beautiful is better than ugly.''
\end{itemize}

\begin{description}
  \item[Description] This list is a description list.
  \item[Enumerate] Numbered lists are often useful.
    \begin{enumerate}
    \item First
    \item Second
    \item Third
    \item \ldots
    \end{enumerate}
  \item[Itemize] The list above this description list is an itemize list.
\end{description}

7.7. Quote, Quotation, and Verse

LaTeX provides a quote environment that can be used for quoting, highlighting important material, etc.:

The Zen of Python
\begin{quote}
  The Zen of Python, by Tim Peters

  Beautiful is better than ugly.
  Explicit is better than implicit.
  Simple is better than complex.
  Complex is better than complicated.
  Flat is better than nested.
  Sparse is better than dense.
  Readability counts.
  Special cases aren't special enough to break the rules.
  Although practicality beats purity.
  Errors should never pass silently.
  Unless explicitly silenced.
  In the face of ambiguity, refuse the temptation to guess.
  There should be one-- and preferably only one --obvious way to do it.
  Although that way may not be obvious at first unless you're Dutch.
  Now is better than never.
  Although never is often better than *right* now.
  If the implementation is hard to explain, it's a bad idea.
  If the implementation is easy to explain, it may be a good idea.
  Namespaces are one honking great idea -- let's do more of those!
\end{quote}

LaTeX provides two other similar environments, the quotation and the verse environments.

The quotation environment can be used for longer quotes which have several paragraphs, since it indents the first line of each paragraph.

The verse environment may be used to quote verses or poems, since the line breaks are important in quoting them. The lines are separated using \\ at the end of a line, and an empty line is left after each verse.
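
For instance, two short "verses" built from lines of the Zen of Python could be set as follows. This is only a sketch of the line-break and empty-line conventions described above.

```latex
\begin{verse}
Beautiful is better than ugly. \\
Explicit is better than implicit.

Simple is better than complex. \\
Complex is better than complicated.
\end{verse}
```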

7.8. Verbatim

The verbatim environment allows us to insert pre-formatted text in a LaTeX document. It is useful for inserting code samples within the document. The verbatim text needs to be enclosed between \begin{verbatim} and \end{verbatim}:

\begin{verbatim}
from numpy import *
a = linspace(0, 5, 50, endpoint = False)
\end{verbatim}

from numpy import *
a = linspace(0, 5, 50, endpoint = False)

To insert verbatim text in-line, the \verb command can be used:

The verb command allows placing \verb|verbatim text| in-line.

The | is just an example of a delimiter character. You can use any character except letters, * or space.

8. Tables, Figures and Captions

8.1. The tabular environment

The tabular environment allows you to typeset tables in LaTeX. The \begin{tabular}[pos]{col fmt} command is used to specify the parameters of the table and start creating it.

The pos argument specifies the vertical position of the table relative to the baseline of the surrounding text. It can take on the values t for top, b for bottom, or c for center.

The col fmt argument specifies the formatting of the columns of the table. You need to specify the formatting explicitly for each of the columns in the table. The col fmt argument can take on the following values.

l	left-justified column content
r	right-justified column content
c	centered column content
*{n}{col}	produces n columns with the col type of formatting; *{3}{c} is the same as {c c c}
|	produces a vertical line

Now we look at how to input the actual entries of the table. The rows of a table are separated by \\. The column entries within a row are separated by &.

The \hline command allows you to draw horizontal lines between two rows of the table, but it does not allow you to draw partial lines. \cline{a-b} draws a horizontal line from column a to column b:

\begin{tabular}{|c|c|}
  \hline
  \verb+l+ & left justified column content\\
  \hline
  \verb+r+ & right justified column content\\
  \hline
  \verb+c+ & centered column content\\
  \hline
  \verb+*{n}{col}+ & produces \verb+n+ columns with the\\
                 & \verb+col+ type of formatting\\
  \cline{2-2}
                 &\verb+*{3}{c}+ is the same as \verb+{c c c}+ \\
  \hline
  \verb+|+ & produces a vertical line\\
  \hline
\end{tabular}

8.2. Importing Graphics

To include images in LaTeX, we need an additional package called graphicx. To load a package, we use the \usepackage directive in the preamble of the document:

\usepackage{graphicx}

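When compiling with the pdflatex command, jpg, png and pdf images can be inserted (gif images are not supported directly and need to be converted first). Images are inserted with the \includegraphics command: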

\includegraphics[optional arguments]{imagename}

A few optional arguments:

width=x, height=x
If only the height or width is specified, the image is scaled, maintaining the aspect ratio.

keepaspectratio
This parameter can be set to either true or false. When set to true, the image is scaled according to both width and height without changing the aspect ratio, so that it exceeds neither the width nor the height dimension.

scale=x
Scale the image by a factor of x. For example, scale=2 will double the image size.

angle=x
This option can be used to rotate the image by x degrees, counter-clockwise.

\includegraphics[scale=0.8, angle=30]{lion_orig.png}

8.3. Floats

Tables and figures need to be treated in a special manner, since they cannot be split over pages. They are referred to as floats in LaTeX.

When there is not enough space on a page to fit a table or figure, it is floated over to the next page, and the current page is filled up with text. LaTeX has float environments called table and figure for tables and images, respectively.

Anything enclosed within the table or figure environments will be treated as a float:

\begin{figure}[pos] or
\begin{table}[pos]

The pos parameter specifies the placement of the float. The possible values it can take are as follows.

Specifier	Permission
h	at approximately the same place where it occurs in the source
t	at the top of the page
b	at the bottom of the page
p	on a special page for floats only
!	override LaTeX's internal parameters for good positions
H	place the float precisely here; nearly equivalent to h! (requires the float package)

Examples:

\begin{figure}[h]
\centering
\includegraphics[scale=0.8, angle=30]{lion_orig.png}
\end{figure}

8.4. Captions

The \caption{text} command allows you to add captions to images or tables. LaTeX automatically numbers your tables and figures, so you need not include numbers in the captions that you write. The caption appears below or above the image (or table), depending on whether you place it after or before the \includegraphics command (or the tabular environment).

\begin{figure}[h]
\centering
\includegraphics[scale=0.8]{lion_orig.png}
\caption{CTAN lion drawing by Duane Bibby; thanks to www.ctan.org}
\end{figure}

Like the section command, the caption command has an optional short caption parameter. The short caption will appear in the list of tables or figures.
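
As a sketch of the short caption parameter, the figure above could be written as follows. The short text "CTAN lion" is an illustrative choice.

```latex
\begin{figure}[h]
  \centering
  \includegraphics[scale=0.8]{lion_orig.png}
  % The text in square brackets goes into the list of figures;
  % the text in braces is printed with the figure itself.
  \caption[CTAN lion]{CTAN lion drawing by Duane Bibby; thanks to www.ctan.org}
\end{figure}
```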

8.5. List of Figures, Tables

LaTeX can automatically generate a list of tables or figures, with the table or figure numbers, the captions and the page numbers on which they appear. This can be done using the \listoftables or \listoffigures commands.

Note: Just like the table of contents, these lists also require an extra compilation.

8.6. Cross References

LaTeX has a very efficient mechanism for inserting cross-references in documents.

The command \label{name} is used to label figures, tables or segments of text. \ref{name} refers to the object marked by name by its number (figure, table, section etc.). \pageref{name} gives the page number of the object which has been labeled with name.

Note: Cross referencing also requires an extra compilation, like the table of contents.
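
The following fragment sketches how the three commands work together. The label names sec:results and fig:lion are made up for this example, and the fragment assumes that the graphicx package is loaded.

```latex
\section{Results}
\label{sec:results}

Figure~\ref{fig:lion} on page~\pageref{fig:lion} shows the CTAN lion.

\begin{figure}[h]
  \centering
  \includegraphics[scale=0.8]{lion_orig.png}
  \caption{CTAN lion drawing by Duane Bibby}
  \label{fig:lion}   % placed after \caption so that it picks up the figure number
\end{figure}

As discussed in Section~\ref{sec:results}, \ldots
```

The ~ is a non-breaking space; it keeps the word and the number on the same line.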

9. Bibliography

A bibliography or list of references can be added to LaTeX documents in two ways: using the thebibliography environment, or using BibTeX. Let's first look at using the thebibliography environment and then move on to BibTeX.

9.1. thebibliography environment

Writing bibliographies in LaTeX using the thebibliography environment is pretty easy. You simply have to list all the bibliography items within the environment.

Each entry of the bibliography begins with the command \bibitem[label]{name}. The name is used to cite the bibliography item within the document using \cite{name}. The optional label replaces the automatically generated number with the label given:

He used this lion in the illustrations for D Knuth's original TeXbook\cite{DKnuth}, for L Lamport's LaTeX book\cite{LLamport}

\begin{thebibliography}{99}
  \bibitem{DKnuth} Donald E. Knuth (1984). \emph{The TeXbook} (Computers and Typesetting, Volume A). Reading, Massachusetts: Addison-Wesley. ISBN 0-201-13448-9.

  \bibitem{LLamport} Lamport, Leslie (1994). \emph{LaTeX: A document preparation system: User's guide and reference}.
   illustrations by Duane Bibby (2nd ed.). Reading, Mass: Addison-Wesley Professional.
\end{thebibliography}

The 99 in the example above indicates the maximum width of the label that the references may get. Here we assume that the number of bibliography items will be less than 100. If your document has fewer than 10 references, you may want to replace 99 with 9.

9.2. BibTeX

The previous section explained the process of listing references at the end of a document and citing them. In this section, let us explore BibTeX for keeping track of references.

BibTeX is very convenient when writing multiple documents in a single area or field. It allows you to create a database of all your references and use them as and when required.

The BibTeX database is stored in a .bib file. The structure of the file is quite simple and an example is shown below:

@book{Lamport94,
author    = "Leslie Lamport",
title     = "A Document Preparation System: User's Guide and Reference",
publisher = "Addison-Wesley Professional",
year      = "1994",
edition   = "second",
note      = "illustrations by Duane Bibby"
}

Each bibliography entry starts with a declaration of the type of the reference being mentioned. The reference in the above example is of the book type. BibTeX has a wide range of reference types, for example, article, book, conference, manual, proceedings, unpublished.

The type of reference is followed by a left curly brace, which is immediately followed by the citation key. The citation key, Lamport94 in the example above, is used to cite this reference using the command \cite{Lamport94}.

This is followed by the relevant fields and their values, listed one by one. Each field must be followed by a comma to separate it from the next one.

To get your LaTeX document to use the bibliography database, you just add the following lines to your LaTeX document:

\bibliographystyle{plain}
\bibliography{LaTeX}

Bibliography styles are files that tell BibTeX how to format the information stored in the .bib database file. The style file for this example is plain.bst. Note that you do not need to add the .bst extension to the filename. If you wish to achieve a particular style of listing the bibliography items and citing them, you should use an appropriate style file.

The \bibliography command specifies the file that should be used as the database for references. The file used in this example is LaTeX.bib.

9.2.1. Compiling

Adding BibTeX-based references slightly complicates the process of compiling the document to obtain the desired output. The exact workings of LaTeX and BibTeX will not be explained here. The procedure for obtaining the output (without any explanations) is as follows:

Compile the .tex file using pdflatex: $ pdflatex LaTeX(.tex)
Run bibtex on the document: $ bibtex LaTeX (bibtex reads the LaTeX.aux file written in the previous step, which in turn names the .bib database)
Compile the .tex file again.
Compile the .tex file one last time!

10. Typesetting Math

It is advisable to use the AMS-LaTeX bundle to typeset mathematics in LaTeX. It is a collection of packages and classes for mathematical typesetting.

We load amsmath by issuing \usepackage{amsmath} in the preamble. Throughout this section, it is assumed that the amsmath package has been loaded.

10.1. Math Mode

There are a few differences between the math mode and the text mode:

Most spaces and line breaks do not have any significance, as all spaces are either derived logically from the mathematical expressions, or have to be specified with special commands such as \, , \quad or \qquad.
Empty lines are not allowed.
Each letter is considered to be the name of a variable and will be typeset as such. If you want to typeset normal text within a formula, then you have to enter the text using the \text{...} command.

10.2. Single Equations

Mathematical equations can be inserted in-line within a paragraph (text style), or the paragraph can be broken to typeset the equation separately (display style).

A mathematical equation within a paragraph is entered between $ and $. Larger equations are set apart from the paragraph by enclosing them within \begin{equation} and \end{equation}. If you don't wish to number a particular equation, the starred version of the environment can be used: \begin{equation*} and \end{equation*}.

The equation can also be cross-referenced using the \label and \eqref commands.
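
A small sketch showing all three forms together, assuming amsmath is loaded. The label name eq:euler is made up for this example.

```latex
The relation $a^2 + b^2 = c^2$ is typeset in-line, while
\begin{equation}
  \label{eq:euler}
  e^{i\pi} + 1 = 0
\end{equation}
is displayed and numbered. Equation~\eqref{eq:euler} can now be
referred to by its number. The following equation is displayed
without a number:
\begin{equation*}
  \sin^2 x + \cos^2 x = 1
\end{equation*}
```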

10.3. Basic Elements

Greek letters are entered as \alpha, \beta, \gamma, \delta, ... for lowercase letters and \Gamma, \Delta, \Theta, ... for uppercase ones. Uppercase Greek letters that look like Latin letters, such as capital alpha and beta, have no command of their own; the Latin letters A and B are used instead.

Exponents and subscripts can be typeset using the caret ^ and the underscore _ respectively. Most math mode commands act only on the next character. If you want a command to affect several characters, they need to be enclosed in curly braces.

The \sqrt command is used to typeset the square root symbol. The size of the root sign is determined automatically. The nth root is generated with \sqrt[n].

To show a multiplication explicitly, a dot may be used. \cdot typesets a single centered dot. \cdots gives three centered dots, while \ldots sets the dots on the baseline. Besides these, \vdots can be used for vertical dots and \ddots for diagonal dots.

A fraction can be typeset with the command \frac{..}{..}.

The integral operator is generated with \int, the sum operator with \sum, and the product operator with \prod. The upper and lower limits are specified with ^ and _, like superscripts and subscripts.

LaTeX provides all kinds of braces as delimiters. Round and square brackets can be typed directly from the keyboard; curly braces need a preceding backslash (\{ and \}). Other delimiters can be produced using special LaTeX commands. Placing \left in front of an opening delimiter and \right in front of a closing delimiter instructs LaTeX to take care of the sizes of the delimiters automatically.
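
The elements above can be combined freely. The following display, again assuming amsmath, is a sketch that uses fractions, roots, Greek letters, sums, integrals and automatically sized delimiters in one place:

```latex
\begin{equation*}
  x_{1,2} = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}, \qquad
  \sqrt[3]{\alpha \cdot \beta}, \qquad
  \sum_{i=1}^{n} i = \frac{n(n+1)}{2}, \qquad
  \left( \int_{0}^{1} x^2 \, dx \right)^{2} = \frac{1}{9}
\end{equation*}
```

Note the curly braces around multi-character subscripts and superscripts such as {1,2} and {i=1}.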

10.4. Multiple Equations

Long formulae that run over several lines, or systems of equations, can be typeset using the align or align* environments. align numbers each of the lines in the environment, and align*, as expected, does not number any of them.

The & is used to align the equations vertically and the \\ command is used to break the lines. Equation numbering can be skipped for a particular line in the align environment by placing a \nonumber before the line break.

\begin{align}
\alpha^2 + \beta^2 &= \gamma^2 \\
\sum_{i=1}^{n} i &= \frac{n(n+1)}{2}\\
\sqrt{-1} &= \pm i \nonumber
\end{align}

10.5. Arrays and Matrices

To typeset arrays, use the array environment. It works much like the tabular environment. The \\ command is used to break the lines:

\begin{equation*}
\mathbf{X} = \left(
 \begin{array}{ccc}
 a_1 & a_2 & \ldots \\
 b_1 & b_2 & \ldots \\
 \vdots & \vdots & \ddots
 \end{array} \right)
\end{equation*}

The array environment can also be used to typeset piecewise functions by using a “.” as an invisible \right delimiter:

\begin{equation*}
f(x) = \left\{
 \begin{array}{rl}
   0 & \text{if } x \le 0\\
   1 & \text{if } x > 0
 \end{array} \right.
\end{equation*}

Six different types of matrix environments are available in the amsmath package for typesetting matrices. They essentially differ in their delimiters: matrix (none), pmatrix (, bmatrix [, Bmatrix {, vmatrix | and Vmatrix ‖. In these matrix environments, the number of columns need not be specified, unlike in the array environment:

\begin{equation*}
  \begin{matrix}
  1 & 2 \\
  3 & 4
  \end{matrix} \qquad
  \begin{bmatrix}
  1 & 2 & 3 \\
  4 & 5 & 6 \\
  7 & 8 & 9
  \end{bmatrix}
\end{equation*}

11. Miscellaneous Stuff

11.1. Presentations

LaTeX has quite a few options for producing presentation slides. We shall look at the beamer class, which is well developed and easy to use. We shall only briefly look at some of the features of beamer. For the best documentation, look at the beamer user guide.

To write a beamer presentation, it is recommended that you use one of the templates that beamer provides. We shall use the speaker_introduction template to get started with beamer.

As you can see, the document begins with the \documentclass being set to beamer.

The \setbeamertemplate command sets the template for various parameters. The background canvas, headline and footline are set using this command.

The \usetheme command sets the theme to be used in the presentation.

Notice that each slide is enclosed within \begin{frame} and \end{frame} commands. The \begin{frame} command can be passed the title and subtitle of the slide as parameters.

To achieve more with beamer, it is highly recommended that you look at the beameruserguide.
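
For readers who do not have the template at hand, here is a minimal beamer sketch. It is not the speaker_introduction template itself, only an illustration of \usetheme and of frames with a title and subtitle. The Warsaw theme and the slide text are arbitrary choices.

```latex
\documentclass{beamer}
\usetheme{Warsaw}   % any installed beamer theme can be used here

\title{LaTeX - A How-to}
\author{The FOSSEE Team}

\begin{document}

\begin{frame}
  \titlepage
\end{frame}

\begin{frame}{Slide Title}{Slide Subtitle}
  \begin{itemize}
    \item First point
    \item Second point
  \end{itemize}
\end{frame}

\end{document}
```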

11.2. Including Code

The listings package can be used to embed source code into your LaTeX document. We shall briefly explore inserting Python code into our document.

First, you need to tell LaTeX that you want it to use the listings package, using the \usepackage command:

\usepackage{listings}

Then, we tell LaTeX that we are going to embed Python code into this document. Simple code highlighting for Python can be achieved using this:

\lstset{language=Python,
        showstringspaces=false,
       }

You might want to customize the code highlighting further using other keys like basicstyle, commentstyle, stringstyle, keywordstyle etc. For detailed information on all this, you should look at the listings package documentation.

You include a block of code in your document by enclosing it within the lstlisting environment:

\begin{lstlisting}
string="Hello, World! "
for i in range(10):
    print string*i
\end{lstlisting}

You can also include source code files directly in your LaTeX document, using the \lstinputlisting command:

\lstinputlisting[lastline=20]{lstexample.py}

This command includes the first 20 lines of the file lstexample.py in our LaTeX document.

11.3. Including files

When working on a large document, it is sometimes convenient to split the large file into smaller input files and put them together at the time of compiling.

The \input or \include commands may be used to embed one LaTeX file into another. The \input command is equivalent to a copy and paste of the file's contents, just before the compilation. The \include command is similar, except that it starts a new page every time it is issued.

The \input{file} or \include{file} commands will include the file file.tex within the file where the command has been issued. Note that you do not need to specify the .tex extension of the file.

The \includeonly command is useful for debugging or testing the LaTeX document that you are creating, since it restricts the \include command. Only the files which are given as arguments to the \includeonly command will be included in the document (wherever an \include command for those files has been issued).
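
A sketch of a master file that uses all three commands is shown below. The file names titlepage, chapter1 and chapter2 are hypothetical.

```latex
% main.tex
\documentclass{report}
\includeonly{chapter2}     % only chapter2.tex is processed on this run

\begin{document}
\input{titlepage}          % contents of titlepage.tex are pasted in here
\include{chapter1}         % skipped on this run because of \includeonly
\include{chapter2}         % included, starting on a new page
\end{document}
```

Removing the \includeonly line restores the full document without touching any of the \include commands.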

11.3.1. A note on filenames

Never use filenames or directories that contain spaces. Make filenames as long or short as you would like, but strictly avoid spaces. Stick to upper or lower case letters (without accents), the digits, the hyphen and the full stop or period.

12. Recommended Reading

LaTeX Wikibook
The Not So Short Introduction to LaTeX2e by Tobias Oetiker et al.

Module 3: LaTeX

Table of Contents

1. Suggested Reading
2. Session Level Split-up
Module Objectives

After completing this module, a participant will be able to:

Produce professional documents in LaTeX. RBT Ap
Typeset Mathematical equations. RBT Ap
Include figures, tables and code samples. RBT Ap
Add References and write BibTeX files. RBT Ap

1. Suggested Reading

LaTeX Wikibook
The Not So Short Introduction to LaTeX2e by Tobias Oetiker et al.

2. Session Level Split-up

+---------+---------------------------------+---------+
| Session | Topic                           | Duration|
+=========+=================================+=========+
| 1       | Introduction, TeX & LaTeX       | 5 min   |
|         | WYSIWYG vs. WYSIWYM             |         |
|         |                                 |         |
|         | Hello World, Compiling,         | 10 min  |
|         | Where we want to go, Some Basics|         |
+---------+---------------------------------+---------+
| 2       | Some Structural Elements        | 15 min  |
|         |                                 |         |
|         | Top Matter, \documentclass,     |         |
|         | Abstract,                       |         |
|         | Sections, Chapters & Parts,     |         |
|         | Appendices, Table of Contents   |         |
+---------+---------------------------------+---------+
| 3       | Emphasizing, Quotation marks,   | 5 min   |
|         | Dashes & Hyphens, Footnotes,    |         |
|         | Flushleft, Flushright & Center  |         |
|         |                                 |         |
|         | Enumerate, Itemize, Description,| 10 min  |
|         | Quote, Quotation and Verse,     |         |
|         | Verbatim                        |         |
+---------+---------------------------------+---------+
| 4       | tabular environment,            | 20 min  |
|         | Importing Graphics, Floats,     |         |
|         | Captions, List of Figures,      |         |
|         | List of Tables, Cross References|         |
+---------+---------------------------------+---------+
| 5       | thebibliography                 | 10 min  |
|         | environment, BibTeX             |         |
+---------+---------------------------------+---------+
| 6       | \usepackage{amsmath},           | 5 min   |
|         | Single Equations                |         |
|         |                                 |         |
|         | Building blocks of an equation, | 15 min  |
|         | Multiple Equations, Arrays and  |         |
|         | Matrices                        |         |
+---------+---------------------------------+---------+
| 7       | beamer, listings,               | 10 min  |
|         | Including files                 |         |
+---------+---------------------------------+---------+
| 8       | Exercises                       | 15 min  |
+---------+---------------------------------+---------+

Introduction to the Course

Engineering students use computers for a large number of curricular tasks, mostly computation centred. However, they do not see these as coding or programming tasks and are usually not even aware of the tools and techniques that would help them handle these tasks better. This results in less than optimal use of their time and resources. It also causes difficulties when it comes to collaboration and building on other people's work. This course is intended to train such students in good software practices and tools for producing code and documentation.

After successfully completing the program, the participants will be able to:

understand how software tools work together and how they can be used in tandem to carry out tasks,

use unix command line tools to carry out common (mostly text processing) tasks,

generate professional documents,

use version control effectively – for both code and documents,

automate tasks by writing shell scripts and python scripts,

realise the impact of coding style and readability on quality,

write mid-sized programs that carry out typical engineering / numerical computations such as those that involve (basic) manipulation of large arrays in an efficient manner,

generate 2D and simple 3D plots,

debug programs using a standardised approach,

understand the importance of tests and the philosophy of Test Driven Development,

write unit tests and improve the quality of code.

+ diff -r 000000000000 -r 8083d21c0020 web/html/ch1Introduction.html~ --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/web/html/ch1Introduction.html~ Mon Jan 25 18:56:45 2010 +0530 @@ -0,0 +1,37 @@ + + + +Chapter 1. Introduction + + + + + + + + + +

Table of Contents

Introduction to the Course

+Introduction to the Course


diff -r 000000000000 -r 8083d21c0020 web/html/ch2intro.html
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/web/html/ch2intro.html Mon Jan 25 18:56:45 2010 +0530
@@ -0,0 +1,591 @@
+Chapter 2. Basic Python

Table of Contents

Basic Python

2.1. The Interactive Interpreter
2.2. ipython - An enhanced interactive Python interpreter

4.1. Numbers
4.2. Variables
4.3. Strings
4.4. Boolean

+Basic Python


This document is intended to be handed out at the end of the workshop. It has been designed for Engineering students who are Python beginners and have basic programming skills. The focus is on basic numerics and plotting using Python.

+The system requirements:

Python - version 2.5.x or newer.

IPython

Text editor - scite, vim, emacs or whatever you are comfortable with.

+1. Introduction

The Python programming language was created by Guido van Rossum, a Dutch programmer. The idea of Python was conceived in December 1989. The name Python has nothing to do with the reptile; it is named after the 1970s comedy series "Monty Python's Flying Circus", which happens to be Guido's favourite TV series.

The current stable version of Python is 2.6.x. Python 3.0 is also a stable release, but it is not backwards compatible with the previous versions and is hence not widely used at the moment. This material will focus on the 2.6.x series.

Python is licensed under the Python Software Foundation License (PSF License), which is a GPL-compatible Free Software license (except for license versions 1.6 and 2.0). It is a no-strings-attached license, which means the source code is free to modify and redistribute.

The Python docs define Python as follows: "Python is an interpreted, object-oriented, high-level programming language with dynamic semantics." A more detailed summary can be found at http://www.python.org/doc/essays/blurb.html. Python is a language that has been designed to help the programmer concentrate on solving the problem at hand and not worry about the idiosyncrasies of the programming language.

Python is highly portable across platforms on account of it being an interpreted language. It is highly scalable and has even been adapted to run on Nokia Series 60 phones. Python has been designed to be readable and easy to use.

Resources available for reference

Web:

http://www.python.org

Doc:

http://www.python.org/doc

Free Tutorials:

Official Python Tutorial:

http://docs.python.org/tut/tut.html

Byte of Python:

http://www.byteofpython.info/

Dive into Python:

http://diveintopython.org/

Advantages of Python - Why Python??

Python has been designed for readability and ease of use. It has been designed in such a fashion that it imposes readability on the programmer. Python does away with braces and semicolons and instead implements code blocks based on indentation, thus enhancing readability.

Python is a high level, interpreted, modular and object oriented language. Python performs memory management on its own, so the programmer need not bother about allocating and deallocating memory for variables. Python provides extensibility through modules, which can be easily imported, similar to headers in C and packages in Java. Python is object oriented and hence provides all the object oriented characteristics such as inheritance, encapsulation and polymorphism.

Python offers a highly powerful interactive programming interface in the form of the 'Interactive Interpreter', which will be discussed in more detail in the following sections.

Python provides a rich standard library and an extensive set of modules. The power of Python modules can be seen in this slightly exaggerated cartoon: http://xkcd.com/353/

Python interfaces well with most other programming languages such as C, C++ and FORTRAN.

Python does have one drawback: it is not as fast as some of the compiled languages like C or C++. Yet its flexibility and power more than make up for this.

+2. The Python Interpreter

+2.1. The Interactive Interpreter

Typing python at the shell prompt on any standard Unix/GNU-Linux system and hitting the enter key starts the Python 'Interactive Interpreter'. The Python interpreter is one of the most integral features of Python. The prompt obtained when the interactive interpreter starts is similar to what is shown below. The exact appearance might differ based on the version of Python being used. The >>> shown is the Python prompt. When something is typed at the prompt and the enter key is hit, the Python interpreter interprets the command entered and performs the appropriate action. All the examples presented in this document are meant to be tried hands-on at the interactive interpreter.

 Python 2.5.2 (r252:60911, Oct  5 2008, 19:24:49)
+[GCC 4.3.2] on linux2
+Type "help", "copyright", "credits" or "license" for more information.
+>>>

Let's try an example: type print 'Hello, World!' at the prompt and hit the enter key.

 >>> print 'Hello, World!'
+Hello, World!

This example was quite straightforward, and thus we have written our first line of Python code. Now let us try typing something arbitrary at the prompt. For example:

 >>> arbit word
+  File "<stdin>", line 1
+    arbit word
+            ^
+SyntaxError: invalid syntax
+>>>

The interpreter gave an error message saying that 'arbit word' is invalid syntax, which is correct, since it is not a valid Python statement. The interpreter is an amazing tool when learning to program in Python. It provides a help function that gives the necessary documentation on Python syntax, constructs, modules and objects. Typing help() at the prompt gives the following output:

 >>> help()
+
+Welcome to Python 2.5!  This is the online help utility.
+
+If this is your first time using Python, you should definitely check out
+the tutorial on the Internet at http://www.python.org/doc/tut/.
+
+Enter the name of any module, keyword, or topic to get help on writing
+Python programs and using Python modules.  To quit this help utility and
+return to the interpreter, just type "quit".
+
+To get a list of available modules, keywords, or topics, type "modules",
+"keywords", or "topics".  Each module also comes with a one-line summary
+of what it does; to list the modules whose summaries contain a given word
+such as "spam", type "modules spam".
+
+help>

As mentioned in the output, entering the name of any module, keyword or topic displays the documentation and help for it through the online help utility. Pressing Ctrl+d exits the help prompt and returns to the python prompt.

Let us now try a few examples at the python interpreter.

Eg 1:

 >>> print 'Hello, python!'
+Hello, python!
+>>>

Eg 2:

 >>> print 4321*567890
+2453852690
+>>>

Eg 3:

 >>> 4321*567890
+2453852690L
+>>>

 Note: Notice the 'L' at the end of the output. The 'L' signifies that the
+output of the operation is of type *long*. It was absent in the previous
+example because we used the print statement. This is because *print* formats
+the output before displaying.

Eg 4:

 >>> big = 12345678901234567890 ** 3
+>>> print big
+1881676372353657772490265749424677022198701224860897069000
+>>>

 This example is to show that unlike in C or C++ there is no limit on the
+value of an integer.
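The point about unbounded integers can be checked with a short script. The following is a minimal sketch, written so that it runs unchanged on Python 2.6 and Python 3 (print is called with a single argument in parentheses); the names base and recovered are only illustrative.

```python
# Integers grow as large as memory allows; no overflow occurs.
base = 12345678901234567890
big = base ** 3

# Dividing the cube by the square recovers the base exactly,
# which would not be possible if any digits had been lost.
recovered = big // (base ** 2)

print(big)
print(len(str(big)))   # number of decimal digits in the result
print(recovered == base)
```

Because the arithmetic is exact, the last line prints True no matter how large the numbers get.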

Try this on the interactive interpreter: import this

Hint: The output gives an idea of the power of Python.

+2.2. ipython - An enhanced interactive Python interpreter

The power and the importance of the interactive interpreter was the highlight of the previous section. This section provides insight into ipython, an enhanced interpreter with a more advanced set of features. Entering ipython at the shell prompt starts this interactive interpreter.

 $ ipython
+Python 2.5.2 (r252:60911, Oct  5 2008, 19:24:49)
+Type "copyright", "credits" or "license" for more information.
+
+IPython 0.8.4 -- An enhanced Interactive Python.
+?         -> Introduction and overview of IPython's features.
+%quickref -> Quick reference.
+help      -> Python's own help system.
+object?   -> Details about 'object'. ?object also works, ?? prints more.
+
+In [1]:

This is the output obtained upon starting ipython. The exact appearance may change based on the Python version installed. The following are some of the features provided by ipython:

Suggestions - ipython provides suggestions of the possible methods and operations available for the given python object.

Eg 5:

 In [4]: a = 6
+
+In [5]: a.
+a.__abs__           a.__divmod__        a.__index__         a.__neg__          a.__rand__          a.__rmod__          a.__rxor__
+a.__add__           a.__doc__           a.__init__          a.__new__          a.__rdiv__          a.__rmul__          a.__setattr__
+a.__and__           a.__float__         a.__int__           a.__nonzero__      a.__rdivmod__       a.__ror__           a.__str__
+a.__class__         a.__floordiv__      a.__invert__        a.__oct__          a.__reduce__        a.__rpow__          a.__sub__
+a.__cmp__           a.__getattribute__  a.__long__          a.__or__           a.__reduce_ex__     a.__rrshift__       a.__truediv__
+a.__coerce__        a.__getnewargs__    a.__lshift__        a.__pos__          a.__repr__          a.__rshift__        a.__xor__
+a.__delattr__       a.__hash__          a.__mod__           a.__pow__          a.__rfloordiv__     a.__rsub__
+a.__div__           a.__hex__           a.__mul__           a.__radd__         a.__rlshift__       a.__rtruediv__

In this example, we initialized 'a' (a variable, a concept that will be discussed in the subsequent sections) to 6. In the next line, when the tab key is pressed after typing 'a.', ipython displays the set of all methods that are applicable to the object 'a' (an integer in this context). ipython provides many such datatype-specific features, which will be presented in later sections as and when the datatypes are introduced.
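The same list of names can be obtained in the plain Python interpreter with the built-in dir() function. The sketch below is only an illustration (the sample list of three names is an arbitrary choice) and runs on both Python 2.6 and Python 3.

```python
a = 6

# dir() returns a sorted list of the attribute names of an object,
# the same names that tab completion offers after typing 'a.'.
names = dir(a)

# Pick out a few of the arithmetic-related names as a small sample.
sample = [name for name in names if name in ("__add__", "__mul__", "__abs__")]
print(sample)
```

The exact contents of dir(a) differ between Python versions, which is why only a small, stable sample is printed here.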

+3. Editing and running a python file

The previous sections focused on the use of the interpreter to run python code. While the interpreter is an excellent tool to test simple solutions and experiment with small code snippets, its main disadvantage is that everything written in the interpreter is lost once it is quit. Most of the time a program is used by people other than the author, so programs have to be available in some form suitable for distribution, and hence they are written in files. This section focuses on editing and running python files. Start by opening a text editor (it is recommended you choose one from the list at the top of this page). In the editor, type the python code and save the file with the extension .py (python files have an extension of .py). Once done with the editing, save the file and exit the editor.

Let us look at a simple example of calculating the gcd of 2 numbers using Python:

Creating the first python script(file)

 $ emacs gcd.py
+  def gcd(x,y):
+    if x % y == 0:
+      return y
+    return gcd(y, x%y)
+
+  print gcd(72, 92)

To run the script, open the shell prompt, navigate to the directory that contains the python file and run python <filename.py> at the prompt (in this case the filename is gcd.py).

Running the python script

 $ python gcd.py
+4
+$

Another method to run a python script would be to include the line

#! /usr/bin/python

at the beginning of the python file and then make the file executable by

$ chmod a+x filename.py

Once this is done, the script can be run as a standalone program as follows:

$ ./filename.py
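Putting these pieces together, a complete standalone script might look like the sketch below. It is a variant of the gcd example rather than the original file: the loop form of Euclid's algorithm and the `__name__` check are additions made here for illustration, and the code runs on both Python 2.6 and Python 3.

```python
#! /usr/bin/python
"""Compute the greatest common divisor of two numbers."""


def gcd(x, y):
    # Euclid's algorithm: replace (x, y) by (y, x % y) until y is 0.
    while y != 0:
        x, y = y, x % y
    return x


if __name__ == "__main__":
    # This part runs only when the file is executed as a script,
    # not when it is imported from another file.
    print(gcd(72, 92))
```

Unlike the recursive version, the loop form also handles a second argument of 0 without a division error.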

+4. Basic Datatypes and operators in Python

Python provides the following set of basic datatypes.

Numbers: int, float, long, complex

Strings

Boolean

+4.1. Numbers

Numbers were introduced in the examples presented in the interactive interpreter section. As mentioned earlier, the numeric types are int (integers), float (floating point numbers), long (large integers) and complex (complex numbers with real and imaginary parts). Python is a dynamically typed language, which means the type of a variable need not be declared when it is initialized. Let us look at a few examples.

Eg 6:

 >>> a = 1 #here a is an integer variable

Eg 7:

 >>> lng = 122333444455555666666777777788888888999999999 #here lng is a variable of type long
+>>> lng
+122333444455555666666777777788888888999999999L #notice the trailing 'L'
+>>> print lng
+122333444455555666666777777788888888999999999 #notice the absence of the trailing 'L'
+>>> lng+1
+122333444455555666666777777788888889000000000L

Long numbers are the same as integers in almost all aspects. They can be used in operations just like integers and along with integers without any distinction. The only distinction comes during type checking (which is not a healthy practice). Long numbers are tagged with a trailing 'L' just to signify that they are long. Notice that in the example, typing just lng at the prompt displays the value of the variable with the 'L', whereas print lng displays it without the 'L'. This is because print formats the output before printing. Also notice that adding an integer to a long does not give any errors and the result is as expected. So for all practical purposes longs can be treated as ints.

Eg 8:

 >>> fl = 3.14159 #fl is a float variable
+>>> e = 1.234e-4 #e is also a float variable, specified in the exponential form
+>>> a = 1
+>>> b = 2
+>>> a/b #integer division
+0
+>>> a/fl #floating point division
+0.31831015504887655
+>>> e/fl
+3.9279473133031364e-05

Floating point numbers, simply called floats, are real numbers with a decimal point. The example above shows the initialization of a float variable. Shown also in this example is the difference between integer division and floating point division. 'a' and 'b' here are integer variables and hence the division gives 0 as the quotient. When either of the operands is a float, the operation is a floating point division, and the result is also a float as illustrated.
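The two kinds of division can also be requested explicitly, which avoids surprises. The sketch below uses the // operator for the integer quotient and converts one operand with float() for floating point division; it behaves the same on Python 2.6 and Python 3. The names quotient and ratio are only illustrative.

```python
a = 1
b = 2

# Floor division keeps only the integer quotient.
quotient = a // b

# Converting one operand to float forces floating point division.
ratio = float(a) / b

print(quotient)
print(ratio)
```

Writing the intent out this way makes the result independent of how the plain / operator treats two integers.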

Eg 9:

 >>> cplx = 3 + 4j #cplx is a complex variable
+>>> cplx
+(3+4j)
+>>> print cplx.real #prints the real part of the complex number
+3.0
+>>> print cplx.imag #prints the imaginary part of the complex number
+4.0
+>>> print cplx*fl  #multiplies the real and imag parts of the complex number with the multiplier
+(9.42477+12.56636j)
+>>> abs(cplx) #returns the absolute value of the complex number
+5.0

Python provides a datatype for complex numbers. Complex numbers are initialized as shown in the example above. The real and imag attributes give the real and imaginary parts of the complex number as shown. The abs() function returns the absolute value of the complex number.
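As a small worked example of the relation between abs() and the two parts of a complex number, the sketch below multiplies a number by its conjugate. It uses only built-in operations and runs on both Python 2.6 and Python 3; the variable names are illustrative.

```python
cplx = 3 + 4j

# abs() gives the magnitude: sqrt(real**2 + imag**2).
magnitude = abs(cplx)

# conjugate() flips the sign of the imaginary part.
conj = cplx.conjugate()

# A number times its conjugate is the squared magnitude,
# with an imaginary part of 0.
product = cplx * conj

print(magnitude)
print(conj)
print(product)
```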

+4.2. Variables

Variables are just names that represent a value. Variables have already been introduced in the various examples from the previous sections. Certain rules about using variables:

Variables have to be initialized or assigned a value before being used.

Variable names can consist of letters, digits and underscores(_).

Variable names cannot begin with digits, but can contain digits in them.

With reference to the examples in the previous section, 'a', 'b', 'lng', 'fl', 'e' and 'cplx' are all variables of various datatypes.

 Note: Python is dynamically typed, and hence a name that holds an integer can at a
+later stage be assigned a float value as well.
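The sketch below illustrates this behaviour: the same name is bound first to an integer and then to a float, while an operation that mixes incompatible types is still rejected at run time. It runs on both Python 2.6 and Python 3; all names in it are illustrative.

```python
value = 1
first_type = type(value).__name__    # name of the integer type

# Rebinding the same name to a float is allowed.
value = 2.5
second_type = type(value).__name__   # name of the float type

print(first_type)
print(second_type)

# Types are still checked when an operation is carried out:
# adding a string and an integer raises TypeError.
try:
    "abc" + 1
    mixed_ok = True
except TypeError:
    mixed_ok = False
print(mixed_ok)
```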

+4.3. Strings

Strings are one of the essential data structures of any programming language. The print "Hello, World!" program was introduced in the earlier section, and the "Hello, World!" in the print statement is a string. A string is basically a sequence of characters. Strings can be represented in the various ways shown below:

 s = 'this is a string'              # a string variable can be represented using single quotes
+s = 'This one has "quotes" inside!' # The string can have quotes inside it as shown
+s = "I have 'single-quotes' inside!"
+l = "A string spanning many lines\
+one more line\
+yet another"                        # a string can span more than a single line.
+# another way of representing multiline strings:
+t = """A triple quoted string does
+not need to be escaped at the end and
+"can have nested quotes" etc."""

Try the following on the interpreter: s = 'this is a string with 'quotes' of similar kind'

Exercise: How can single quotes be used within a single-quoted string, as in the above example, without getting an error?

+4.3.1. String operations

A few basic string operations are presented here.

String concatenation: concatenation is done by simple addition of two strings.

 >>> x = 'Hello'
+>>> y = ' Python'
+>>> print x+y
+Hello Python

Try this yourself:

 >>> somenum = 13
+>>> print x+somenum

The problem with the above example is that it tries to concatenate a string and an integer. To obtain the desired result, str(), repr() or the backquotes (``) can be used.

str() simply converts a value to a string in a reasonable form. repr() creates a string that is a representation of the value.

The difference can be seen in the example shown below:

 >>> str(1000000000000000000000000000000000000000000000000L)
+'1000000000000000000000000000000000000000000000000'
+>>> repr(1000000000000000000000000000000000000000000000000L)
+'1000000000000000000000000000000000000000000000000L'

It can be observed that the 'L' in the long value shown was omitted by str(), whereas repr() converted that into a string too. An alternative way of writing repr(value) is `value`.
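The difference between str() and repr() is also visible with ordinary strings, and str() is the usual way to make the earlier concatenation work. The following is a minimal sketch that runs on both Python 2.6 and Python 3; the name greeting is only illustrative.

```python
x = 'Hello'
somenum = 13

# str() turns the integer into text so it can be concatenated.
greeting = x + ' ' + str(somenum)
print(greeting)

# str() gives the plain text of a string, while repr() keeps the
# quotes, showing how the value would be written in code.
print(str(x))
print(repr(x))
```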

A few more examples:

 >>> x = "Let's go \nto Pycon"
+>>> print x
+Let's go
+to Pycon

In the above example, notice that the '\n' (newline) character is interpreted and the string is printed on two lines. The strings discussed until now were normal strings. Other than these there are two other types of strings, namely raw strings and unicode strings.

Raw strings are strings which are unformatted, that is, the backslashes (\) are not parsed and are left as they are in the string. Raw strings are written with an 'r' at the start of the string. Let us look at an example:

 >>> x = r"Let's go \nto Pycon"
+>>> print x
+Let's go \nto Pycon

Note: The '\n' is not parsed into a new line and is left as it is.
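One way to see what a raw string actually contains is to compare lengths. The sketch below is an illustration with made-up variable names and runs on both Python 2.6 and Python 3.

```python
normal = "go\nto"
raw = r"go\nto"

# The normal string holds a single newline character, while the raw
# string holds a backslash followed by the letter n.
print(len(normal))
print(len(raw))

# The raw string is the same as a normal string with an escaped backslash.
print(raw == "go\\nto")
```

The raw string is one character longer, because the backslash and the n are stored as two separate characters.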

Try this yourself:

 >>> x = r"Let's go to Pycon\"

Unicode strings are strings where the characters are Unicode characters as opposed to ASCII characters. Unicode strings are written with a 'u' at the start of the string. Let us look at an example:

 >>> x = u"Let's go to Pycon!"
+>>> print x
+Let's go to Pycon!

+4.4. Boolean

Python also provides a special Boolean datatype. A boolean variable can assume a value of either True or False (note the capitalization).

Let us look at examples:

 >>> t = True
+>>> f = not t
+>>> print f
+False
+>>> f or t
+True
+>>> f and t
+False
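Boolean values most often arise from comparisons rather than from being typed in directly. The sketch below, which runs on both Python 2.6 and Python 3, stores the results of two comparisons and combines them; the variable names are illustrative.

```python
n = 12

# Comparison operators produce Boolean values.
is_even = (n % 2 == 0)
is_small = (n < 10)

print(is_even)
print(is_small)

# Booleans combine with and, or and not.
print(is_even and not is_small)
```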

+5. The while loop

The Python while loop is similar to the C/C++ while loop. The syntax is as follows:

 statement 0
+while condition:
+  statement 1 #while block
+  statement 2 #while block
+statement 3 #outside the while block.

Let us look at an example:

 >>> x = 1
+>>> while x <= 5:
+...   print x
+...   x += 1
+...
+1
+2
+3
+4
+5
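As a slightly larger illustration, the same loop structure can accumulate a result instead of printing each value. The sketch below computes a factorial with a while loop and runs on both Python 2.6 and Python 3; the variable names are illustrative.

```python
n = 5
fact = 1
i = 2

# Multiply fact by 2, 3, ..., n in turn.
while i <= n:
    fact *= i
    i += 1

print(fact)
```

The loop stops as soon as i exceeds n, so for n = 5 the body runs four times.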

+6. The if conditional

The Python if block provides conditional execution of statements. If the condition evaluates to true, the block of statements defined under the if is executed.

If the first block is not executed on account of the condition not being satisfied, the set of statements in the else block is executed.

The elif block allows multiple conditions to be evaluated in turn, as shown in the example.

The syntax is as follows:

 if condition :
+    statement_1
+    statement_2
+
+elif condition:
+    statement_3
+    statement_4
+else:
+    statement_5
+    statement_6

Let us look at an example:

 >>> n = int(raw_input("Input a number:"))
+>>> if n < 0:
+...     print n, "is negative"
+... elif n > 0:
+...     print n, "is positive"
+... else:
+...     print n, "is 0"
+...
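The same branching logic can be wrapped in a function so that it can be tried on several inputs without typing them at a prompt. The following is a minimal sketch: the function name describe is hypothetical, the inputs are given as text and converted with int() before being compared, and the code runs on both Python 2.6 and Python 3.

```python
def describe(n):
    # Exactly one of the three branches runs for any number.
    if n < 0:
        return "negative"
    elif n > 0:
        return "positive"
    else:
        return "zero"


# The inputs are text, as they would be when read from the console,
# so int() turns each one into a number before it is compared.
for text in ["-3", "0", "7"]:
    print("%s is %s" % (text, describe(int(text))))
```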

+7. raw_input()

In the previous example we saw the call to the raw_input() function. raw_input() is used to take user input through the console. Unlike input(), which treats the data entered by the user as a standard python expression and evaluates it, raw_input() treats all input as raw data and returns it as a string. To illustrate this let us look at an example.

 >>> input("Enter a number thats a palindrome:")
+Enter a number thats a palindrome:121
+121
+
+>>> input("Enter your name:")
+Enter your name:PythonFreak
+Traceback (most recent call last):
+  File "<stdin>", line 1, in <module>
+  File "<string>", line 1, in <module>
+NameError: name 'PythonFreak' is not defined

As shown above, input() assumes that the data entered is a valid Python expression. In the first call it prompts for an integer and accepts the entered value as an integer, whereas in the second call, when the string is entered without quotes, input() again treats the entered data as a Python expression and hence raises an exception saying PythonFreak is not defined.

 >>> input("Enter your name:")
+Enter your name:'PythonFreak'
+'PythonFreak'
+>>>

Here the name is accepted because it is entered as a string (within quotes). But it is unreasonable to go on using quotes each time a string is entered. Hence the alternative is to use raw_input().

Let us now look at how raw_input() operates with an example.

 >>> raw_input("Enter your name:")
+Enter your name:PythonFreak
+'PythonFreak'

Observe that the raw_input() is converting it into a string all by itself.

 >>> raw_input("Enter a number thats a palindrome:")
+Enter a number thats a palindrome:121
+'121'

Observe that raw_input() also returns the number 121 as the string '121'. Let us look at another example:

 >>> pal = raw_input("Enter a number thats a palindrome:")
+Enter a number thats a palindrome:121
+>>> pal + 2
+Traceback (most recent call last):
+  File "<stdin>", line 1, in <module>
+TypeError: cannot concatenate 'str' and 'int' objects
+>>> pal
+'121'

Observe here that the variable pal is a string and hence integer operations cannot be performed on it. Hence the exception is raised.

+8. int() method

Generally, for computing purposes, the data used is not strings or raw data but integers, floats and similar numeric types. The data obtained from raw_input() is raw data in the form of strings. In order to obtain integers from strings we use the int() function.

Let us look at an example.

 >>> intpal = int(pal)
+>>> intpal
+121

In the previous example it was observed that pal was a string variable. Here, using int(), the string pal was converted to an integer.
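Since text typed by a user is not always a number, a program usually has to guard the conversion. The sketch below is one possible way of doing this: the helper to_int and its default parameter are hypothetical names introduced here, and the code relies only on the fact that int() raises ValueError for text that is not a whole number. It runs on both Python 2.6 and Python 3.

```python
def to_int(text, default=None):
    # int() raises ValueError when the text is not a whole number,
    # so fall back to the default value in that case.
    try:
        return int(text)
    except ValueError:
        return default


print(to_int("121"))
print(to_int("121") + 2)
print(to_int("not a number"))
```

Returning a default keeps the calling code simple; whether None or some number is the right default depends on the program.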

Try This Yourself:

 >>> stringvar = raw_input("Enter a name:")
+Enter a name:Guido Van Rossum
+>>> stringvar
+'Guido Van Rossum'
+>>> numvar = int(stringvar)

diff -r 000000000000 -r 8083d21c0020 web/html/ch2intro.html~
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/web/html/ch2intro.html~ Mon Jan 25 18:56:45 2010 +0530
@@ -0,0 +1,591 @@
+Chapter 2. Basic Python

Table of Contents

Basic Python

2.1. The Interactive Interpreter
2.2. ipython - An enhanced interactive Python interpreter

4.1. Numbers
4.2. Variables
4.3. Strings
4.4. Boolean

+Basic Python

Table of Contents

2.1. The Interactive Interpreter
2.2. ipython - An enhanced interactive Python interpreter

4.1. Numbers
4.2. Variables
4.3. Strings
4.4. Boolean

+The system requirements:

Python - version 2.5.x or newer.

IPython

Text editor - scite, vim, emacs or whatever you are comfortable with.

+1. Introduction

The Python docs define Python as "Python is an interpreted, object-oriented, +high-level programming language with dynamic semantics." A more detailed summary +can be found at

http://www.python.org/doc/essays/blurb.html

. Python is a language that +has been designed to help the programmer concentrate on solving the problem at hand +and not worry about the programming language idiosyncrasies.

Resources available for reference

Web:

http://www.python.org

Doc:

http://www.python.org/doc

Free Tutorials:

Official Python Tutorial:

http://docs.python.org/tut/tut.html

Byte of Python:

http://www.byteofpython.info/

Dive into Python:

http://diveintopython.org/

Advantages of Python - Why Python??

Python offers a highly powerful interactive programming interface in the form +of the 'Interactive Interpreter' which will be discussed in more detail in the +following sections.

Python provides a rich standard library and an extensive set of modules. The +power of Python modules can be seen in this slightly exaggerated cartoon +

http://xkcd.com/353/

Python interfaces well with most other programming languages such as C, C++ +and FORTRAN.

Although, Python has one setback. Python is not fast as some of the compiled +languages like C or C++. Yet, the amount of flexibility and power more than make +up for this setback.

+2. The Python Interpreter

+2.1. The Interactive Interpreter

 Python 2.5.2 (r252:60911, Oct  5 2008, 19:24:49)
+[GCC 4.3.2] on linux2
+Type "help", "copyright", "credits" or "license" for more information.
+>>>

Lets try with an example, type print 'Hello, World!' at the prompt and hit +the enter key.

 >>> print 'Hello, World!'
+Hello, World!

This example was quite straight forward, and thus we have written our first +line of Python code. Now let us try typing something arbitrary at the prompt. +For example:

 >>> arbit word
+  File "<stdin>", line 1
+    arbit word
+            ^
+SyntaxError: invalid syntax
+>>>

 >>> help()
+
+Welcome to Python 2.5!  This is the online help utility.
+
+If this is your first time using Python, you should definitely check out
+the tutorial on the Internet at http://www.python.org/doc/tut/.
+
+Enter the name of any module, keyword, or topic to get help on writing
+Python programs and using Python modules.  To quit this help utility and
+return to the interpreter, just type "quit".
+
+To get a list of available modules, keywords, or topics, type "modules",
+"keywords", or "topics".  Each module also comes with a one-line summary
+of what it does; to list the modules whose summaries contain a given word
+such as "spam", type "modules spam".
+
+help>

Let us now try a few examples at the python interpreter.

Eg 1:

 >>> print 'Hello, python!'
+Hello, python!
+>>>

Eg 2:

 >>> print 4321*567890
+2453852690
+>>>

Eg 3:

 >>> 4321*567890
+2453852690L
+>>>

 Note: Notice the 'L' at the end of the output. The 'L' signifies that the
+output of the operation is of type *long*. It was absent in the previous
+example because we used the print statement. This is because *print* formats
+the output before displaying.

Eg 4:

 >>> big = 12345678901234567890 ** 3
+>>> print big
+1881676372353657772490265749424677022198701224860897069000
+>>>

 This example is to show that unlike in C or C++ there is no limit on the
+value of an integer.

Try this on the interactive interpreter: +import this

Hint: The output gives an idea of Power of Python

+2.2. ipython - An enhanced interactive Python interpreter

 $ ipython
+Python 2.5.2 (r252:60911, Oct  5 2008, 19:24:49)
+Type "copyright", "credits" or "license" for more information.
+
+IPython 0.8.4 -- An enhanced Interactive Python.
+?         -> Introduction and overview of IPython's features.
+%quickref -> Quick reference.
+help      -> Python's own help system.
+object?   -> Details about 'object'. ?object also works, ?? prints more.
+
+In [1]:

This is the output obtained upon firing ipython. The exact appearance may +change based on the Python version installed. The following are some of the +various features provided by ipython:

Suggestions - ipython provides suggestions of the possible methods and +operations available for the given python object.

Eg 5:

 In [4]: a = 6
+
+In [5]: a.
+a.__abs__           a.__divmod__        a.__index__         a.__neg__          a.__rand__          a.__rmod__          a.__rxor__
+a.__add__           a.__doc__           a.__init__          a.__new__          a.__rdiv__          a.__rmul__          a.__setattr__
+a.__and__           a.__float__         a.__int__           a.__nonzero__      a.__rdivmod__       a.__ror__           a.__str__
+a.__class__         a.__floordiv__      a.__invert__        a.__oct__          a.__reduce__        a.__rpow__          a.__sub__
+a.__cmp__           a.__getattribute__  a.__long__          a.__or__           a.__reduce_ex__     a.__rrshift__       a.__truediv__
+a.__coerce__        a.__getnewargs__    a.__lshift__        a.__pos__          a.__repr__          a.__rshift__        a.__xor__
+a.__delattr__       a.__hash__          a.__mod__           a.__pow__          a.__rfloordiv__     a.__rsub__
+a.__div__           a.__hex__           a.__mul__           a.__radd__         a.__rlshift__       a.__rtruediv__

+3. Editing and running a python file

Let us look at a simple example of calculating the gcd of 2 numbers using Python:

Creating the first python script(file)

 $ emacs gcd.py
+  def gcd(x,y):
+    if x % y == 0:
+      return y
+    return gcd(y, x%y)
+
+  print gcd(72, 92)

To run the script, open the shell prompt, navigate to the directory that +contains the python file and run python <filename.py> at the prompt ( in this +case filename is gcd.py )

Running the python script

 $ python gcd.py
+4
+$

Another method to run a python script would be to include the line

#! /usr/bin/python

at the beginning of the python file and then make the file executable by

$ chmod a+x filename.py

Once this is done, the script can be run as a standalone program as follows:

$ ./filename.py

+4. Basic Datatypes and operators in Python

Python provides the following set of basic datatypes.

Numbers: int, float, long, complex

Strings

Boolean

+4.1. Numbers

Eg 6:

 >>> a = 1 #here a is an integer variable

Eg 7:

 >>> lng = 122333444455555666666777777788888888999999999 #here lng is a variable of type long
+>>> lng
+122333444455555666666777777788888888999999999L #notice the trailing 'L'
+>>> print lng
+122333444455555666666777777788888888999999999 #notice the absence of the trailing 'L'
+>>> lng+1
+122333444455555666666777777788888889000000000L

Eg 8:

 >>> fl = 3.14159 #fl is a float variable
+>>> e = 1.234e-4 #e is also a float variable, specified in the exponential form
+>>> a = 1
+>>> b = 2
+>>> a/b #integer division
+0
+>>> a/fl #floating point division
+0.31831015504887655
+>>> e/fl
+3.9279473133031364e-05

Eg 9:

 >>> cplx = 3 + 4j #cplx is a complex variable
+>>> cplx
+(3+4j)
+>>> print cplx.real #prints the real part of the complex number
+3.0
+>>> print cplx.imag #prints the imaginary part of the complex number
+4.0
+>>> print cplx*fl  #multiplies the real and imag parts of the complex number with the multiplier
+(9.42477+12.56636j)
+>>> abs(cplx) #returns the absolute value of the complex number
+5.0

+4.2. Variables

Variables are just names that represent a value. Variables have already been +introduced in the various examples from the previous sections. Certain rules about +using variables:

Variables have to be initialized or assigned a value before being used.

Variable names can consist of letters, digits and underscores(_).

Variable names cannot begin with digits, but can contain digits in them.

In reference to the previous section examples, 'a', 'b', 'lng', 'fl', 'e' and 'cplx' +are all variables of various datatypes.

 Note: Python is dynamically typed, so a variable that holds an integer can at a
later stage be assigned a float (or a value of any other type) as well.
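The note above can be illustrated with a minimal sketch. The name count is only an example; the point is that the same name can refer to values of different types over time.

```python
# A name is not tied to one type; it refers to whatever was assigned last.
count = 1
first_type = type(count).__name__    # 'int'

count = 1.5                          # rebinding the same name to a float is allowed
second_type = type(count).__name__   # 'float'
```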

+4.3. Strings

s = 'this is a string'              # a string variable can be represented using single quotes
s = 'This one has "quotes" inside!' # The string can have quotes inside it as shown
s = "I have 'single-quotes' inside!"
l = "A string spanning many lines\
one more line\
yet another"                        # a string can span more than a single line.
# another way of representing multiline strings (a comment placed inside the
# triple quotes would become part of the string itself):
t = """A triple quoted string does
not need to be escaped at the end and
"can have nested quotes" etc."""

Try the following on the interpreter:

 s = 'this is a string with 'quotes' of similar kind'

Exercise: How can single quotes be used inside a single-quoted string, as in the above example, without getting an error?

+4.3.1. String operations

A few basic string operations are presented here.

String concatenation

String concatenation is done by simple addition of two strings.

 >>> x = 'Hello'
+>>> y = ' Python'
+>>> print x+y
+Hello Python

Try this yourself:

 >>> somenum = 13
+>>> print x+somenum

str() simply converts a value to a string in a readable form. repr() creates a string that is an exact representation of the value, usually in the form in which it would be typed into the interpreter.

The difference can be seen in the example shown below:

 >>> str(1000000000000000000000000000000000000000000000000L)
+'1000000000000000000000000000000000000000000000000'
+>>> repr(1000000000000000000000000000000000000000000000000L)
+'1000000000000000000000000000000000000000000000000L'

It can be observed that the 'L' in the long value shown was omitted by str(), whereas repr() kept it in the string. An alternative way of writing repr(value) is `value` (the value enclosed in backquotes).
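The same difference shows up with strings. The following is a small sketch with an illustrative variable named text: str() gives back the text itself, while repr() gives the quoted form with the newline written as an escape sequence.

```python
text = "Let's go\nto Pycon"

as_str = str(text)     # the text itself; the newline is a real line break
as_repr = repr(text)   # the quoted form: "Let's go\nto Pycon" with a literal backslash-n
```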

A few more examples:

 >>> x = "Let's go \nto Pycon"
+>>> print x
+Let's go
+to Pycon

 >>> x = r"Let's go \nto Pycon"
+>>> print x
+Let's go \nto Pycon

Note: In a raw string (one prefixed with r) the '\n' is not converted into a newline and is left as it is.

Try this yourself:

 >>> x = r"Let's go to Pycon\"

 >>> x = u"Let's go to Pycon!"
+>>> print x
+Let's go to Pycon!

+4.4. Boolean

Python also provides a special Boolean datatype. A boolean variable can take the value True or False (note the capitalization).

Let us look at examples:

 >>> t = True
+>>> f = not t
+>>> print f
+False
+>>> f or t
+True
+>>> f and t
+False

+5. The while loop

The Python while loop is similar to the C/C++ while loop. The syntax is as follows:

 statement 0
+while condition:
+  statement 1 #while block
+  statement 2 #while block
+statement 3 #outside the while block.

Let us look at an example:

 >>> x = 1
+>>> while x <= 5:
+...   print x
+...   x += 1
+...
+1
+2
+3
+4
+5

+6. The if conditional

The Python if block provides conditional execution of statements. If the condition evaluates to true, the block of statements under the if is executed.

If the first block is not executed because the condition is not satisfied, the statements in the else block are executed instead.

The elif block allows further conditions to be tested in turn, as shown in the example.

The syntax is as follows:

if condition:
    statement_1
    statement_2
elif condition:
    statement_3
    statement_4
else:
    statement_5
    statement_6

Let us look at an example:

>>> n = int(raw_input("Input a number:"))   # raw_input() returns a string, so convert it
>>> if n < 0:
...     print n, "is negative"
... elif n > 0:
...     print n, "is positive"
... else:
...     print n, "is 0"
...

+7. raw_input() +

 >>> input("Enter a number thats a palindrome:")
+Enter a number thats a palindrome:121
+121
+
+>>> input("Enter your name:")
+Enter your name:PythonFreak
+Traceback (most recent call last):
+  File "<stdin>", line 1, in <module>
+  File "<string>", line 1, in <module>
+NameError: name 'PythonFreak' is not defined

 >>> input("Enter your name:")
+Enter your name:'PythonFreak'
+'PythonFreak'
+>>>

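Here the name is accepted because it is entered as a string (within quotes). input() evaluates whatever is typed as a Python expression, which is why the unquoted name failed earlier. But it is unreasonable to type quotes each time a string is entered. Hence the alternative is to use raw_input().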

Let us now look at how raw_input() operates with an example.

 >>> raw_input("Enter your name:")
+Enter your name:PythonFreak
+'PythonFreak'

Observe that raw_input() returns the text as a string all by itself.

>>> pal = raw_input("Enter a number thats a palindrome:")
Enter a number thats a palindrome:121
>>> pal
'121'

Observe that raw_input() returns the number 121 as the string '121' as well. Let us look at another example:

 >>> pal = raw_input("Enter a number thats a palindrome:")
+Enter a number thats a palindrome:121
+>>> pal + 2
+Traceback (most recent call last):
+  File "<stdin>", line 1, in <module>
+TypeError: cannot concatenate 'str' and 'int' objects
+>>> pal
+'121'

Observe here that the variable pal is a string and hence integer operations cannot be performed on it. Hence the exception is raised.

+8. int() method

Let us look at an example.

 >>> intpal = int(pal)
+>>> intpal
+121

In the previous example it was observed that pal was a string variable. Here the built-in int() function converts the string pal to an integer.

Try This Yourself:

 >>> stringvar = raw_input("Enter a name:")
+Enter a name:Guido Van Rossum
+>>> stringvar
+'Guido Van Rossum'
+>>> numvar = int(stringvar)

diff -r 000000000000 -r 8083d21c0020 web/html/ch4strings_dicts.html
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/web/html/ch4strings_dicts.html Mon Jan 25 18:56:45 2010 +0530
@@ -0,0 +1,465 @@
+Chapter 4. strings_dicts

Table of Contents

Strings

2.1. find
2.2. join
2.3. lower
2.4. replace
2.5. split
2.6. strip

4.1. Opening Files
4.2. Reading and Writing files

5.1. dict()
5.2. Dictionary Methods

+Strings

Strings were briefly introduced previously in the introduction document. In this section strings will be presented in greater detail. All the standard operations that can be performed on sequences, such as indexing, slicing, multiplication, length, minimum and maximum, can be performed on strings as well. One thing to be noted is that strings are immutable, which means that a string cannot be changed once it has been created. Hence, all item and slice assignments on strings are illegal. Let us look at a few examples.

 >>> name = 'PythonFreak'
+>>> print name[3]
+h
+>>> print name[-1]
+k
+>>> print name[6:]
+Freak
+>>> name[6:] = 'Maniac'
+Traceback (most recent call last):
+  File "<stdin>", line 1, in <module>
+TypeError: 'str' object does not support item assignment

This is quite expected, since string objects are immutable as already mentioned. The error message is clear in mentioning that a 'str' object does not support item assignment.

+1. String Formatting

String formatting can be performed using the string formatting operator, represented by the percent (%) sign. The string placed before the % sign is formatted with the value placed to the right of it. Let us look at a simple example.

 >>> format = 'Hello %s, from PythonFreak'
+>>> str1 = 'world!'
+>>> print format % str1
+Hello world!, from PythonFreak

The %s parts of the format string are called conversion specifiers. The conversion specifiers mark the places where the formatting has to be performed in a string. In the example the %s is replaced by the value of str1. More than one value can also be formatted at a time by supplying the values as tuples or dictionaries (explained in later sections). Let us look at an example.

 >>> format = 'Hello %s, from %s'
+>>> values = ('world!', 'PythonFreak')
+>>> print format % values
+Hello world!, from PythonFreak

In this example it can be observed that the format string contains two conversion specifiers and they are formatted using the tuple of values as shown.

The s in %s specifies that the value to be inserted is of type string. Values of other types, such as integers and floats, can be specified as well. Integers are specified as %d and floats as %f. The precision with which the integer or the float values are to be represented can also be specified using a . (dot) followed by the precision value.
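The conversion specifiers described above can be sketched as follows. The variable names and the numbers are illustrative only.

```python
# %d formats an integer and %f a float; ".2" limits the float to two decimals.
matches = 133
average = 52.5312

line = '%s played %d matches, averaging %.2f' % ('Dravid', matches, average)
# line is 'Dravid played 133 matches, averaging 52.53'

# For integers the precision is the minimum number of digits.
code = '%.3d' % 7    # '007'
```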

+2. String Methods

Similar to lists, strings also have a rich set of methods to perform various operations. Some of the most important and popular ones are presented in this section.

+2.1. find +

The find method is used to search for a substring within a given string. It returns the index at which the first (leftmost) occurrence of the substring starts. If the substring is not found in the string, it returns -1. Let us look at a few examples.

 >>> longstring = 'Hello world!, from PythonFreak'
+>>> longstring.find('Python')
+19
+>>> longstring.find('Perl')
+-1

+2.2. join +

The join method is used to join the elements of a sequence into a single string. It is called on the separator string, and the elements that are to be joined should all be strings. Let us look at a few examples.

 >>> seq = ['With', 'great', 'power', 'comes', 'great', 'responsibility']
+>>> sep = ' '
+>>> sep.join(seq)
+'With great power comes great responsibility'
+>>> sep = ',!'
+>>> sep.join(seq)
+'With,!great,!power,!comes,!great,!responsibility'

Try this yourself

 >>> seq = [12,34,56,78]
+>>> sep.join(seq)

+2.3. lower +

The lower method, as the name indicates, converts the entire text of a string to lower case. It is especially useful when dealing with case-insensitive data. Let us look at an example.

 >>> sometext = 'Hello world!, from PythonFreak'
+>>> sometext.lower()
+'hello world!, from pythonfreak'
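One common use of lower is comparing strings without regard to case. The helper below is a hypothetical sketch, not part of the standard library.

```python
def same_word(first, second):
    """Compare two strings while ignoring case (hypothetical helper)."""
    return first.lower() == second.lower()

match = same_word('PythonFreak', 'pythonfreak')   # True
mismatch = same_word('Python', 'Perl')            # False
```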

+2.4. replace +

The replace method replaces a substring with another substring within a given string and returns the new string. Let us look at an example.

 >>> sometext = 'Concise, precise and criticise is some of the words that end with ise'
+>>> sometext.replace('is', 'are')
+'Concaree, precaree and criticaree are some of the words that end with aree'

Observe here that all the occurrences of the substring 'is' have been replaced; even the 'is' inside concise, precise and criticise has been replaced.

+2.5. split +

split is one of the most important string methods. It is the opposite of the join method: it splits a string at the delimiter passed as the argument and returns a list of strings. By default, when no argument is passed, it splits on runs of whitespace (spaces, tabs and newlines). Let us look at an example.

 >>> grocerylist = 'butter, cucumber, beer(a grocery item??), wheatbread'
+>>> grocerylist.split(',')
+['butter', ' cucumber', ' beer(a grocery item??)', ' wheatbread']
+>>> grocerylist.split()
+['butter,', 'cucumber,', 'beer(a', 'grocery', 'item??),', 'wheatbread']

Observe here that in the second case, when the delimiter argument was not given, the split was done on whitespace.
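The difference between the default behaviour and an explicit single-space delimiter, and the way join reverses split, can be sketched as follows. The sample strings are made up for illustration.

```python
spaced = 'butter   cucumber\twheatbread'

default_split = spaced.split()     # ['butter', 'cucumber', 'wheatbread']
space_split = spaced.split(' ')    # ['butter', '', '', 'cucumber\twheatbread']

# join undoes split when the same separator is used for both.
csv_line = 'butter,cucumber,wheatbread'
round_trip = ','.join(csv_line.split(','))   # equal to csv_line again
```

With an explicit ' ' every single space counts as a delimiter, so consecutive spaces produce empty strings and the tab is not treated as a separator.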

+2.6. strip +

The strip method removes any whitespace at the left and right ends of a string, but not the whitespace within the string. Let us look at an example.

 >>> spacedtext = "               Where's the text??                 "
+>>> spacedtext.strip()
+"Where's the text??"

Observe that the whitespace between the words has not been removed.

 Note: A very important thing to note is that none of the methods shown above
      transform the source string. The source string still remains the same.
      Remember that strings are immutable.
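A short sketch makes the note concrete. Each method call below builds a new string, and the variable original (an illustrative name) keeps its value throughout.

```python
original = '  Hello World  '

stripped = original.strip()                      # 'Hello World'
lowered = original.lower()                       # '  hello world  '
replaced = original.replace('World', 'Python')   # '  Hello Python  '

# original is still '  Hello World  '; to keep a result it must be assigned
# to a name, as done above.
```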

+3. Introduction to the standard library

Python is often referred to as a "Batteries included!" language, mainly because of the Python Standard Library. The Python Standard Library provides an extensive set of features, some of which are available directly while others require a module to be imported first. The Standard Library provides various built-in functions like:

abs()

dict()

enumerate()

The built-in constants like True and False are provided by the Standard Library. More information about the Python Standard Library is available at:

http://docs.python.org/library/
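The distinction between features that are available directly and those that need an import can be sketched as follows. The math module is used here only as a familiar example of an importable module.

```python
import math   # features outside the built-ins need an import first

distance = abs(-7.5)                            # built-in, gives 7.5
numbered = list(enumerate(['find', 'join']))    # [(0, 'find'), (1, 'join')]
lookup = dict([('mat', 133), ('avg', 52.53)])   # built-in dict()
root = math.sqrt(16)                            # from the math module, gives 4.0
```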

+4. I/O: Reading and Writing Files

Files are a very important aspect of computing and programming. Up until now the focus has been on small programs that interact with users through input() and raw_input(). For most computational work it becomes necessary to handle files, which are often large as well. This section focuses on the basics of file handling.

+4.1. Opening Files

Files can be opened using the built-in open() function. open() accepts 3 arguments, of which 2 are optional. Let us look at the syntax of open():

f = open( filename, mode, buffering)

The filename is a compulsory argument while the mode and buffering are optional. The filename should be a string giving the path to the file to be opened (the path can be absolute or relative). Let us look at an example.

 >>> f = open ('basic_python/interim_assessment.rst')

The mode argument specifies the mode in which the file has to be opened. The following are the valid mode arguments:

r - Read mode

w - Write mode

a - Append mode

b - Binary mode

+ - Read/Write mode

The read mode opens the file as a read-only document. The write mode opens the file for writing only; if the file existed prior to the opening, its previous contents are erased. The append mode also opens the file for writing, but the previous contents are not erased and new data is appended to the end of the file. The binary and the read/write modes are special in the sense that they are added onto other modes (for example 'rb' or 'r+'). The read/write mode opens the file for reading and writing combined. The binary mode is used to open files that do not contain text; binary files such as images should be opened in the binary mode. Let us look at a few examples.

 >>> f = open ('basic_python/interim_assessment.rst', 'r')
+>>> f = open ('armstrong.py', 'r+')

The third argument to open() is the buffering argument. It takes an integer: 0 (or False) means that the file is unbuffered, so changes are written to the disk immediately; 1 (or True) means that the file is line buffered; a larger value is used as the approximate buffer size in bytes. With buffering enabled, data is held in main memory and the changes made to the file are not immediately written to the disk. If the argument is omitted, the system default is used.

+4.2. Reading and Writing files

+4.2.1. write() +

write(), evidently, is used to write data onto a file. It takes the string to be written as the argument. Values of other types, such as integers or floats, have to be converted to strings first, for example with str(). In order to be able to write data onto a file, the file has to be opened in one of the w, a or + modes.

+4.2.2. read() +

read() is used to read data from a file. It takes the number of bytes to be read as the argument. If nothing is specified, it reads the entire contents from the current position to the end of the file.

Let us look at a few examples:

>>> f = open ('randomtextfile', 'w')
>>> f.write('Hello all, this is PythonFreak. This is a random text file.')
>>> f.close()                        # close the file so that the data reaches the disk
>>> f = open ('randomtextfile', 'r')
>>> f.read(5)
'Hello'
>>> f.read()
' all, this is PythonFreak. This is a random text file.'
>>> f.close()
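Writing values that are not strings can be sketched as follows. The file name scores.txt and the temporary directory are assumptions made so that the sketch does not touch any real file.

```python
import os
import tempfile

# A throwaway path inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'scores.txt')

f = open(path, 'w')
f.write('Matches: ' + str(133) + '\n')   # numbers are converted with str() first
f.write('Average: %.2f\n' % 52.53)       # or with string formatting
f.close()

f = open(path, 'r')
first_five = f.read(5)    # 'Match'
rest = f.read()           # everything from the current position onwards
f.close()
```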

+4.2.3. readline() +

readline() is used to read a file line by line: each call reads one line of the file. When an argument is passed to readline(), it reads at most that many bytes from the current line.

Another way to read a file line by line is to iterate over the file object itself with the for construct. Let us look at this block of code as an example.

 >>> f = open('../randomtextfile', 'r')
+>>> for line in f:
+...     print line
+...
+Hello all!
+
+This is PythonFreak on the second line.
+
+This is a random text file on line 3
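The blank lines in the output above appear because each line read from the file still ends with its own newline, and print adds another one. The behaviour of readline() itself can be sketched as follows; the scratch file lines.txt is an assumption of this example.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'lines.txt')   # hypothetical scratch file
f = open(path, 'w')
f.write('Hello all!\nSecond line.\n')
f.close()

f = open(path, 'r')
first = f.readline()       # 'Hello all!\n', the newline is kept
part = f.readline(6)       # 'Second', at most 6 characters of the next line
remainder = f.readline()   # ' line.\n', the rest of that same line
at_end = f.readline()      # '', an empty string signals the end of the file
f.close()
```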

+4.2.4. close() +

One must always close all the files that have been opened, even though open files are closed automatically when the program ends. A file opened in read mode that is never closed may stay locked needlessly. For files opened in the write mode closing is even more important: Python may be buffering the writes, and if the file is not closed the buffered data may never reach the disk, so the changes made to the file are lost.
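One way to make sure a file is always closed is the with statement. This is a sketch of a technique not covered in the text above, and it assumes Python 2.6 or later; the scratch file notes.txt is likewise an assumption.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'notes.txt')   # hypothetical scratch file

# The file is closed when the with block ends, even if an error occurs inside it.
with open(path, 'w') as f:
    f.write('Hello all!\n')

closed_after_block = f.closed   # True: no explicit close() was needed

with open(path, 'r') as f:
    contents = f.read()         # 'Hello all!\n'
```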

+5. Dictionaries

Just as an ordinary dictionary is designed for looking up the meanings of words, a Python dictionary is designed for looking up a specific key and retrieving the corresponding value. Dictionaries are data structures that provide key-value mappings. Dictionaries are similar to lists, except that instead of integer indexes the values are indexed by keys, which are often strings. Let us look at an example of how to define dictionaries.

 >>> dct = { 'Sachin': 'Tendulkar', 'Rahul': 'Dravid', 'Anil': 'Kumble'}

The dictionary consists of keys (strings in this example) and their corresponding values, separated by :. The key-value pairs are separated by commas (',') and the entire structure is wrapped in a pair of curly braces {}.

 Note: The data inside a dictionary is not ordered. The order in which you enter
+the key-value pairs is not the order in which they are stored in the dictionary.
+Python has an internal storage mechanism for that which is out of the purview
+of this document.
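Looking up and adding entries can be sketched as follows. The extra entry 'Saurav' is added purely for illustration.

```python
dct = {'Sachin': 'Tendulkar', 'Rahul': 'Dravid', 'Anil': 'Kumble'}

surname = dct['Rahul']        # look up the value stored under a key: 'Dravid'
dct['Saurav'] = 'Ganguly'     # assigning to a new key adds a key-value pair
known = 'Saurav' in dct       # membership tests look at the keys: True
count = len(dct)              # number of key-value pairs: 4
```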

+5.1. dict() +

The dict() function is used to create dictionaries from other mappings, from sequences of (key, value) pairs, or from keyword arguments. Let us look at an example.

 >>> diction = dict(mat = 133, avg = 52.53)
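The other ways of calling dict() can be sketched as follows, using the same illustrative values as the example above.

```python
from_keywords = dict(mat=133, avg=52.53)
from_pairs = dict([('mat', 133), ('avg', 52.53)])   # a list of (key, value) tuples
from_mapping = dict(from_keywords)                  # a new dictionary built from a mapping

same_contents = (from_keywords == from_pairs)       # True
same_object = (from_mapping is from_keywords)       # False, it is a separate dictionary
```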

String Formatting with Dictionaries:

String formatting was discussed in the previous section and it was mentioned that dictionaries can also be used for formatting more than one value. This section focuses on the formatting of strings using dictionaries. String formatting using dictionaries is more readable than doing the same with tuples. Here a key is used as the placeholder, and the value corresponding to it is substituted in the formatted string. Let us look at an example.

 >>> player = { 'Name':'Rahul Dravid', 'Matches':133, 'Avg':52.53, '100s':26 }
+>>> strng = '%(Name)s has played %(Matches)d with an average of %(Avg).2f and has %(100s)d hundreds to his name.'
+>>> print strng % player
+Rahul Dravid has played 133 with an average of 52.53 and has 26 hundreds to his name.

+5.2. Dictionary Methods

+5.2.1. clear() +

The clear() method removes all the existing key-value pairs from a dictionary. It returns None, that is, it does not return anything useful; it is a method that changes the object in place. It has to be noted here that dictionaries, unlike strings, are mutable. Let us look at an example.

 >>> dct
+{'Anil': 'Kumble', 'Sachin': 'Tendulkar', 'Rahul': 'Dravid'}
+>>> dct.clear()
+>>> dct
+{}

+5.2.2. copy() +

The copy() method returns a (shallow) copy of a given dictionary. Let us look at an example.

 >>> dct = {'Anil': 'Kumble', 'Sachin': 'Tendulkar', 'Rahul': 'Dravid'}
+>>> dctcopy = dct.copy()
+>>> dctcopy
+{'Anil': 'Kumble', 'Sachin': 'Tendulkar', 'Rahul': 'Dravid'}
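The difference between a copy and a second name for the same dictionary can be sketched as follows. The name alias is illustrative.

```python
dct = {'Anil': 'Kumble', 'Sachin': 'Tendulkar'}

alias = dct             # a second name for the same dictionary
dctcopy = dct.copy()    # a separate dictionary with the same pairs

dct['Rahul'] = 'Dravid'

seen_by_alias = 'Rahul' in alias     # True, the alias sees the change
seen_by_copy = 'Rahul' in dctcopy    # False, the copy is unaffected
```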

+5.2.3. get() +

get() returns the value for the key passed as the argument; if the key does not exist in the dictionary, it returns None. Let us look at an example.

 >>> print dctcopy.get('Saurav')
+None
+>>> print dctcopy.get('Anil')
+Kumble
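get() also accepts an optional second argument that is returned instead of None when the key is missing. The sketch below contrasts this with plain indexing, which raises a KeyError for a missing key.

```python
dctcopy = {'Anil': 'Kumble', 'Rahul': 'Dravid'}

missing = dctcopy.get('Saurav')               # None, no error
fallback = dctcopy.get('Saurav', 'unknown')   # 'unknown', the supplied default

try:
    dctcopy['Saurav']                         # plain indexing raises KeyError instead
    raised = False
except KeyError:
    raised = True
```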

+5.2.4. has_key() +

This method returns True if the given key is in the dictionary, else it returns False. The expression key in dictionary gives the same result.

 >>> dctcopy.has_key('Saurav')
+False
+>>> dctcopy.has_key('Sachin')
+True

+5.2.5. pop() +

This method is used to retrieve the value of a given key and subsequently remove the key-value pair from the dictionary. Let us look at an example.

 >>> print dctcopy.pop('Sachin')
+Tendulkar
+>>> dctcopy
+{'Anil': 'Kumble', 'Rahul': 'Dravid'}

+5.2.6. popitem() +

This method removes an arbitrary key-value pair from a dictionary and returns it as a tuple. Let us look at an example.

 >>> print dctcopy.popitem()
+('Anil', 'Kumble')
+>>> dctcopy
+{'Rahul': 'Dravid'}
+
+Note that which item is removed is arbitrary, since dictionaries are unordered
+as mentioned earlier.

+5.2.7. update() +

The update() method updates the contents of one dictionary with the contents of another dictionary. For items with existing keys the values are updated, and the rest of the items are added. Let us look at an example.

 >>> dctcopy.update(dct)
+>>> dct
+{'Anil': 'Kumble', 'Sachin': 'Tendulkar', 'Rahul': 'Dravid'}
+>>> dctcopy
+{'Anil': 'Kumble', 'Sachin': 'Tendulkar', 'Rahul': 'Dravid'}
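The two effects of update(), overwriting existing keys and adding new ones, can be seen separately in the following sketch. The dictionaries stats and latest are illustrative.

```python
stats = {'Matches': 133, 'Avg': 52.53}
latest = {'Matches': 134, '100s': 26}

stats.update(latest)   # 'Matches' is overwritten, '100s' is added, 'Avg' is kept

# stats now holds 134 under 'Matches', 26 under '100s' and 52.53 under 'Avg';
# latest itself is left unchanged.
```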

diff -r 000000000000 -r 8083d21c0020 web/html/ch5func.html
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/web/html/ch5func.html Mon Jan 25 18:56:45 2010 +0530
@@ -0,0 +1,342 @@
+Chapter 5. Functions

Table of Contents

Functional Approach

6.1. List Comprehensions

+Functional Approach

Functions allow us to enclose a set of statements and call the function again and again instead of repeating the group of statements every time. Functions also allow us to isolate a piece of code from all the other code and provide the convenience of not polluting the global variables.

A function in Python is defined with the keyword def followed by the name of the function, in turn followed by a pair of parentheses which enclose the list of parameters to the function. The definition line ends with a ':'. The definition line is followed by the body of the function, indented by one block. A function returns a value with the return statement:

def factorial(n):
  fact = 1
  for i in range(2, n + 1):   # range() stops before its end value, hence n + 1
    fact *= i

  return fact

The code snippet above defines a function named factorial, which takes the number for which the factorial must be computed, computes the factorial and returns the value.

A function, once defined, can be used or called anywhere else in the program. We call a function with its name followed by a pair of parentheses which enclose the arguments to the function.

The value that the function returns can be assigned to a variable. Let's call the above function and store the factorial in a variable:

 fact5 = factorial(5)

The value of fact5 will now be 120, which is the factorial of 5. Note that we passed 5 as the argument to the function.

It may be necessary to document what each function does, to help the person who reads our code understand it better. For this purpose Python allows the first statement of the function body to be a string. This string is called a Documentation String or docstring. Docstrings prove to be very handy since there are a number of tools which can pull out all the docstrings from Python functions and generate the documentation automatically from them. Docstrings for functions can be written as follows:

def factorial(n):
  'Returns the factorial for the number n.'
  fact = 1
  for i in range(2, n + 1):   # range() stops before its end value, hence n + 1
    fact *= i

  return fact

An important point to note here is that a function can return any Python value or object, including a tuple. A tuple is just a collection of values, and those values can themselves be of any valid Python datatype, including lists, tuples and dictionaries among other things. So effectively, since a function can return a tuple, it can return any number of values through a tuple.

Let us write a small function to swap two values:

def swap(a, b):
  return b, a

a, b = 1, 2
c, d = swap(a, b)   # c is now 2 and d is 1
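Returning several values through a tuple is not limited to swapping. The helper min_max below is a hypothetical example; the caller can keep the tuple as it is or unpack it into separate names.

```python
def min_max(values):
    """Return the smallest and largest items as a tuple (hypothetical helper)."""
    return min(values), max(values)

result = min_max([3, 1, 4, 1, 5])   # the tuple itself: (1, 5)
lowest, highest = result            # or unpack it into separate names
```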

+1. Function scope

The variables used inside a function are confined to the function's scope and do not affect variables of the same name outside the function. Also, assigning a new value to a parameter inside the function does not change the caller's variable:

 def cant_change(n):
+  n = 10
+
+n = 5
+cant_change(n)

Upon running this code, what do you think would have happened to the value of n, which was assigned 5 before the function call? If you have already tried out that snippet on the interpreter, you already know that the value of n is not changed. This is true of all immutable Python types, like numbers, strings and tuples. But when you pass mutable objects like lists and dictionaries, changes made to them inside the function are visible outside the function as well:

 >>> def can_change(n):
+...   n[1] = 'James'
+...
+
+>>> name = ['Mr.', 'Steve', 'Gosling']
+>>> can_change(name)
+>>> name
+['Mr.', 'James', 'Gosling']
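
The difference lies in what the function does with its parameter, not only in the type of the argument. The sketch below uses two hypothetical functions, rebind and mutate, to contrast assigning a new list to the parameter with changing the list that was passed in:

```python
def rebind(names):
    # Assignment makes the local name refer to a new list;
    # the caller's list is left untouched.
    names = ['Dr.', 'Alan', 'Kay']

def mutate(names):
    # Item assignment changes the list object that the caller also holds.
    names[0] = 'Dr.'

name = ['Mr.', 'James', 'Gosling']

rebind(name)
after_rebind = list(name)   # copy taken for later comparison

mutate(name)
after_mutate = list(name)

print(after_rebind)
print(after_mutate)
```

Only the call to mutate changes the list seen by the caller.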

If nothing is returned by the function explicitly, Python returns None when the function is called.
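
A small sketch of this behaviour, using a hypothetical function build_greeting that builds a string but has no return statement:

```python
def build_greeting(name):
    # The string is created but never returned.
    message = "Hello, %s!" % name

result = build_greeting("World")

# The call still produces a value: None.
print(result is None)
```

Forgetting the return statement is a common source of unexpected None values.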

+2. Default Arguments

There may be situations where we need to allow functions to take arguments optionally. Python allows us to define functions this way by providing a facility called Default Arguments. For example, suppose we need to write a function that returns a list of Fibonacci numbers. Since our function cannot generate an infinite list of Fibonacci numbers, we need to specify the number of elements that the Fibonacci sequence must contain. Suppose, additionally, that we want the function to return 10 numbers in the sequence if no argument is specified. We can define the function as follows:

 def fib(n=10):
+  fib_list = [0, 1]
+  for i in range(n - 2):
+    next_num = fib_list[-2] + fib_list[-1]
+    fib_list.append(next_num)
+  # slicing handles n < 2, where fewer than two numbers are wanted
+  return fib_list[:n]

When we call this function, we can optionally specify the value for the parameter n as an argument. Calling it with no argument, and then with the argument 5, returns the following Fibonacci sequences:

 fib()
+[0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
+fib(5)
+[0, 1, 1, 2, 3]
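
The same idea can be sketched with a second, hypothetical function, power, whose exponent parameter falls back to 2 when the caller leaves it out. Parameters with default values must come after the parameters without them:

```python
def power(base, exponent=2):
    """Raise base to exponent; exponent defaults to 2."""
    return base ** exponent

squared = power(7)      # exponent takes its default value, 2
cubed = power(7, 3)     # the default is overridden by the caller

print(squared)
print(cubed)
```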

+3. Keyword Arguments

When a function takes a large number of arguments, it may be difficult to remember the order of the parameters in the function definition, or it may be necessary to pass values to only certain parameters since the others take their default values. In either of these cases, Python provides the facility of passing arguments by specifying the name of the parameter as defined in the function definition. These are known as Keyword Arguments.

In a function call, Keyword arguments can be used for each argument, in the following fashion:

 argument_name=argument_value
+Also denoted as: keyword=argument
+
+def wish(name='World', greetings='Hello'):
+  print "%s, %s!" % (greetings, name)

This function can be called in either of the following ways. It is important to note that no restriction is imposed on the order in which Keyword arguments can be specified. Also note that we have combined Keyword arguments with Default arguments in this example; however, this is not necessary:

 wish(name='Guido', greetings='Hey')
+wish(greetings='Hey', name='Guido')

Calling functions by specifying arguments in the order of the parameters in the function definition is called using Positional arguments, as opposed to Keyword arguments. It is possible to use both Positional arguments and Keyword arguments in a single function call, but Python doesn't allow us to mix them up arbitrarily. The arguments in the call must always start with the Positional arguments, which are in turn followed by the Keyword arguments:

 def my_func(x, y, z, u, v, w):
+  # initialize variables.
+  ...
+  # do some stuff
+  ...
+  # return the value

It is valid to call the above function in the following ways:

 my_func(10, 20, 30, u=1.0, v=2.0, w=3.0)
+my_func(10, 20, 30, 1.0, 2.0, w=3.0)
+my_func(10, 20, z=30, u=1.0, v=2.0, w=3.0)
+my_func(x=10, y=20, z=30, u=1.0, v=2.0, w=3.0)

The following are some of the invalid calls:

 my_func(10, 20, z=30, 1.0, 2.0, 3.0)
+my_func(x=10, 20, z=30, 1.0, 2.0, 3.0)
+my_func(x=10, y=20, z=30, u=1.0, v=2.0, 3.0)
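
Python rejects such calls before the program runs, with a SyntaxError. One way to observe this, sketched below, is to hand the call to the built-in compile() function as a string. The helper is_valid_call is a hypothetical name used only for this illustration, and my_func does not have to exist because the text is only compiled, never executed:

```python
def is_valid_call(source):
    """Return True if the call expression in source compiles."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True

# Positional arguments first, keyword arguments last: accepted.
valid = is_valid_call("my_func(10, 20, 30, u=1.0, v=2.0, w=3.0)")

# A positional argument after a keyword argument: rejected.
invalid = is_valid_call("my_func(10, 20, z=30, 1.0, 2.0, 3.0)")

print(valid)
print(invalid)
```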

+4. Parameter Packing and Unpacking

The positional arguments passed to a function can be collected in a tuple parameter and keyword arguments can be collected in a dictionary. Since keyword arguments must always be the last set of arguments passed to a function, the keyword dictionary parameter must be the last parameter. The function definition must include the list of explicit parameters, followed by the tuple parameter for collecting positional arguments, whose name is preceded by a *, in turn followed by the dictionary parameter for collecting keyword arguments, whose name is preceded by a **:

 def print_report(title, *args, **name):
+  """Structure of *args*
+  (age, email-id)
+  Structure of *name*
+  {
+      'first': First Name
+      'middle': Middle Name
+      'last': Last Name
+  }
+  """
+
+  print "Title: %s" % (title)
+  print "Full name: %(first)s %(middle)s %(last)s" % name
+  print "Age: %d\nEmail-ID: %s" % args

The above function can be called as follows. Note that the order of the keyword arguments can be interchanged:

 >>> print_report('Employee Report', 29, 'johny@example.com', first='Johny',
+                 last='Charles', middle='Douglas')
+Title: Employee Report
+Full name: Johny Douglas Charles
+Age: 29
+Email-ID: johny@example.com
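
To see exactly what ends up in each collecting parameter, the sketch below uses a hypothetical function describe that simply returns what it received. The names args and kwargs are only a common convention, not a requirement:

```python
def describe(title, *args, **kwargs):
    # args is a tuple of the extra positional arguments,
    # kwargs is a dictionary of the keyword arguments.
    return title, args, kwargs

title, args, kwargs = describe('Employee Report', 29, 'johny@example.com',
                               first='Johny', last='Charles')

print(title)
print(args)
print(sorted(kwargs.items()))
```

Calling describe with only the title gives an empty tuple and an empty dictionary.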

The reverse of this can also be achieved by using a very similar syntax while calling the function. A tuple or a dictionary can be passed as arguments in place of a list of Positional arguments or Keyword arguments respectively, using * or **:

 def print_report(title, age, email, first, middle, last):
+  print "Title: %s" % (title)
+  print "Full name: %s %s %s" % (first, middle, last)
+  print "Age: %d\nEmail-ID: %s" % (age, email)
+
+>>> args = (29, 'johny@example.com')
+>>> name = {
+        'first': 'Johny',
+        'middle': 'Charles',
+        'last': 'Douglas'
+        }
+>>> print_report('Employee Report', *args, **name)
+Title: Employee Report
+Full name: Johny Charles Douglas
+Age: 29
+Email-ID: johny@example.com
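
A shorter sketch of the same unpacking syntax, using a hypothetical function volume whose result is easy to check by hand:

```python
def volume(length, width, height):
    return length * width * height

dims = (2, 3, 4)
named = {'length': 2, 'width': 3, 'height': 4}

from_tuple = volume(*dims)      # same as volume(2, 3, 4)
from_dict = volume(**named)     # same as volume(length=2, width=3, height=4)

# Both forms can be combined with ordinary arguments in one call.
mixed = volume(2, *(3,), **{'height': 4})

print(from_tuple)
print(from_dict)
print(mixed)
```

The keys of the dictionary must match the parameter names of the function, otherwise the call raises a TypeError.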

+5. Nested Functions and Scopes

Python allows nesting one function inside another. This style of programming turns out to be an extremely flexible and powerful feature when we use Python decorators. We will not talk about decorators, since they are beyond the scope of this course. If you are interested in knowing more about decorator programming in Python, you are encouraged to read:

http://avinashv.net/2008/04/python-decorators-syntactic-sugar/

http://personalpages.tds.net/~kent37/kk/00001.html

The following is an example of nested functions in Python:

 def outer():
+  print "Outer..."
+  def inner():
+    print "Inner..."
+  print "Outer..."
+  inner()
+
+>>> outer()
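
Two scoping rules apply to nested functions: the inner function can read the variables of the enclosing function, and the inner function's name exists only inside the enclosing function. The sketch below is a variant of outer that returns a string rather than printing, an assumption made here so that the result can be inspected:

```python
def outer():
    message = "Hello from outer"

    def inner():
        # inner can read message from the enclosing function's scope.
        return message + ", seen from inner"

    return inner()

result = outer()

# The name inner is local to outer, so it is unknown out here.
try:
    inner()
    inner_visible = True
except NameError:
    inner_visible = False

print(result)
print(inner_visible)
```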

+6. map, reduce and filter functions

Python provides several built-in functions for convenience. The map(), reduce() and filter() functions prove to be very useful with sequences like lists.

The map(function, sequence) function takes two arguments: a function and a sequence. The function argument must be the name of a function which in turn takes a single argument, an individual element of the sequence. The map function calls function(item) for each item in the sequence and returns a list of values, where each value is the value returned by the corresponding call to function(item). The map() function also allows more than one sequence to be passed. In this case, the first argument, function, must take as many arguments as the number of sequences passed. This function is called with the corresponding elements from each of the sequences, or None if one of the sequences is exhausted:

 def square(x):
+  return x*x
+
+>>> map(square, [1, 2, 3, 4])
+[1, 4, 9, 16]
+
+def mul(x, y):
+  return x*y
+
+>>> map(mul, [1, 2, 3, 4], [6, 7, 8, 9])
+[6, 14, 24, 36]
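
The interpreter sessions above show Python 2 behaviour, where map() returns a list directly. In Python 3, map() returns an iterator instead, so the sketch below wraps each call in list(); written this way it gives the same lists on either version:

```python
def square(x):
    return x * x

def mul(x, y):
    return x * y

# One sequence: the function is called with one element at a time.
squares = list(map(square, [1, 2, 3, 4]))

# Two sequences: the function is called with one element from each.
products = list(map(mul, [1, 2, 3, 4], [6, 7, 8, 9]))

print(squares)
print(products)
```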

The filter(function, sequence) function takes two arguments, similar to the map() function. The filter function calls function(item) for each item in the sequence and returns all the elements in the sequence for which function(item) returned True:

 def even(x):
+  # x % 2 is 0 for even numbers
+  if x % 2 == 0:
+    return True
+  else:
+    return False
+
+>>> filter(even, range(1, 10))
+[2, 4, 6, 8]
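
A compact sketch of filter(), using two hypothetical predicates, is_even and is_odd. The calls are wrapped in list() because filter() returns an iterator in Python 3, while in Python 2 it returns a list directly:

```python
def is_even(x):
    # The comparison already yields True or False.
    return x % 2 == 0

def is_odd(x):
    return x % 2 == 1

evens = list(filter(is_even, range(1, 10)))
odds = list(filter(is_odd, range(1, 10)))

print(evens)
print(odds)
```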

The reduce(function, sequence) function takes two arguments, similar to the map function; however, multiple sequences are not allowed. The reduce function calls function with the first two elements in the sequence, obtains the result, calls function with that result and the next element in the sequence, and so on until the end of the list, and returns the final result:

 def mul(x, y):
+  return x*y
+
+>>> reduce(mul, [1, 2, 3, 4])
+24
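
In Python 3, reduce() is no longer a built-in function and has to be imported from the functools module; the same import also works on Python 2.6 and later. The sketch below repeats the product and then, as one possible application, rewrites factorial with reduce(), using its optional third argument as the starting value:

```python
from functools import reduce

def mul(x, y):
    return x * y

# Evaluated as ((1 * 2) * 3) * 4
product = reduce(mul, [1, 2, 3, 4])

def factorial(n):
    # The starting value 1 makes factorial(0) return 1,
    # since range(1, 1) is empty.
    return reduce(mul, range(1, n + 1), 1)

print(product)
print(factorial(5))
```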

+6.1. List Comprehensions

List Comprehension is a convenience utility provided by Python. It is syntactic sugar for creating lists. Using List Comprehensions, one can create lists from other types of sequential data structures or from other lists. The syntax of a List Comprehension consists of square brackets, to indicate that the result is a list, within which we include an expression followed by at least one for clause and, optionally, further for or if clauses. It will be clearer with an example:

 >>> num = [1, 2, 3]
+>>> sq = [x*x for x in num]
+>>> sq
+[1, 4, 9]
+>>> all_num = [1, 2, 3, 4, 5, 6, 7, 8, 9]
+>>> even = [x for x in all_num if x%2 == 0]
+>>> even
+[2, 4, 6, 8]

The syntax used here is very clear from the way it is written. The second comprehension can be translated into English as, "for each element x in the list all_num, if the remainder of x divided by 2 is 0, add x to the list."
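
The translation into English corresponds directly to an ordinary loop. The sketch below builds the same list both ways and adds a comprehension with two for clauses; the names even_loop, even_comp and pairs are chosen only for this illustration:

```python
all_num = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# The explicit loop that the comprehension stands for.
even_loop = []
for x in all_num:
    if x % 2 == 0:
        even_loop.append(x)

# The same list as a comprehension.
even_comp = [x for x in all_num if x % 2 == 0]

# Two for clauses behave like nested loops, the leftmost being the outer one.
pairs = [(x, y) for x in [1, 2] for y in ['a', 'b']]

print(even_loop)
print(even_comp)
print(pairs)
```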

diff -r 000000000000 -r 8083d21c0020 web/html/ch5func.html~
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/web/html/ch5func.html~ Mon Jan 25 18:56:45 2010 +0530
@@ -0,0 +1,342 @@
+Chapter 5. Functions

Table of Contents

Functional Approach

6.1. List Comprehensions

+Functional Approach

Table of Contents