Improved the initial Introduction on tests content.
authorMadhusudan.C.S <madhusudancs@gmail.com>
Tue, 31 Aug 2010 18:57:30 +0530
changeset 110 4e7b98636b58
parent 109 0afd25eadf41
child 111 a6a442d1bbd9
Improved the initial Introduction on tests content.
tdd/tdd.rst
--- a/tdd/tdd.rst	Wed Jul 14 00:19:54 2010 +0530
+++ b/tdd/tdd.rst	Tue Aug 31 18:57:30 2010 +0530
@@ -42,41 +42,177 @@
   def gcd(a, b):
       pass
 
-This stub does nothing other than defining a new function called gcd which
-takes two parameters a and b for which the GCD must be calculated. The body
-of the function just contains pass which means it does nothing, i.e. empty.
-We have our stub ready. One important thing we need to keep in mind when
-we adopt TDD methodology is that we need to have a clear set of results
-defined for our code units. To put it more clearly, for every given set of
-inputs as test case we must have, before hand, the exact outputs that are
-expected for those input test cases. If we don't have that we have failed
-in the first step of the TDD methodology itself. We must never run for
-outputs for our test cases after we have the code ready or even while
-writing tests. The expected outputs/behaviour must be in our hands before
-we start writing tests. Therefore let us define our test cases and the
-expected output for those inputs. Let one of our test cases be 48 and 64
-as *a* and *b* respectively. For this test case we know that the GCD is
-16. So that is the expected output. Let our second test case be 44 and
-19 as *a* and *b* respectively. We know that their GCD is 1 by simple paper
-and pen calculation.
+This stub does nothing other than defining a new function called gcd
+which takes two parameters a and b for which the GCD must be
+calculated. The body of the function just contains Python's **pass**
+statement which means it does nothing, i.e. empty. We have our stub
+ready. One important thing we need to keep in mind when we adopt TDD
+methodology is that we need to have a clear set of results defined for
+our code units. To put it more clearly, for every given set of inputs
+as test case we must have, before hand, the exact outputs that are
+expected for those input test cases. If we don't have that we have
+failed in the first step of the TDD methodology itself. We must never
+run looking for outputs for our test cases after we have the code
+ready or even while writing tests. The expected outputs/behaviour must
+be in our hands before we start writing tests. Therefore let us define
+our test cases and the expected output for those inputs. Let one of
+our test cases be 48 and 64 as *a* and *b* respectively. For this test
+case we know that the GCD is 16. So that is the expected output. Let
+our second test case be 44 and 19 as *a* and *b* respectively. We know
+that their GCD is 1 by simple paper and pen calculation.
 
 Now we know what a test is? What are the ingredients required to write
 tests? So what else should we wait for? Let us write our first test!::
 
   tc1 = gcd(48, 64)
   if tc1 != 16:
-      print "Test failed for the case a=48 and b=64. Expected 16. Obtained
-          %d instead." % tc1
+      print "Test failed for the case a=48 and b=64. Expected 16. Obtained %d instead." % tc1
       exit(1)
   
   tc2 = gcd(44, 19)
   if tc2 != 1:
-      print "Test failed for the case a=44 and b=19. Expected 1. Obtained
-          %d instead." % tc2
+      print "Test failed for the case a=44 and b=19. Expected 1. Obtained %d instead." % tc2
       exit(1)
 
   print "All tests passed!"
 
+Let us put all these in a file and call this file **gcd.py**::
+
+  def gcd(a, b):
+      pass
+
+  if __name__ == '__main__':
+      tc1 = gcd(48, 64)
+      if tc1 != 16:
+          print "Test failed for the case a=48 and b=64. Expected 16. Obtained %d instead." % tc1
+          exit(1)
+
+      tc2 = gcd(44, 19)
+      if tc2 != 1:
+          print "Test failed for the case a=44 and b=19. Expected 1. Obtained %d instead." % tc2
+          exit(1)
+
+      print "All tests passed!"
+
+Note that we have introduced a new semantic which uses two new magic names
+in Python *__name__* and *__main__*. This is a very common idiom used in
+Python. Every Python code in a file can be run in two ways: Either as an
+independent stand-alone script or as a Python module which can be imported
+by other Python scripts or modules. When the idiom::
+
+  if __name__ == '__main__':
+
+is used, the code within this if block is executed first when we run the
+Python file as a stand-alone script. In other words, when we run this
+python file as a stand-alone script the control of the program first starts
+from the code that is within this if block from which the control is
+transferred to other parts of the program or to other modules from
+here. This comes as an extremely handy feature especially when we want to
+test our modules individually. Now let us run our code as a stand-alone
+script.::
+
+  madhu@madhu:~/Desktop$ python gcd.py
+  Traceback (most recent call last):
+    File "gcd.py", line 7, in <module> print "Test failed for the case a=48 and b=64. Expected 16. Obtained %d instead." % tc1
+  TypeError: %d format: a number is required, not NoneType
+
+Now we have our tests, the test cases and the code unit stub at
+hand. We also have the failing test. So we know for sure that we have
+cleared the first check point of TDD where the tests have failed. The
+failing tests also give a green signal for us to go ahead to our next
+check point i.e. to write the actual code in our code unit and make
+the test pass. So let us write the code for the gcd function by
+removing the **pass** control statement which had just created a gcd
+function stub for us.
+
+Most of us have learnt in high school math classes that the best and
+the easiest known algorithm to compute the gcd of two numbers was
+given to us 2300 years ago by a greek mathematician named Euclid. So
+let us use the Euclid's algorithm to compute the gcd of two numbers a
+and b::
+
+  def gcd(a, b):
+      if a == 0:
+          return b
+      while b != 0:
+          if a > b:
+              a = a - b
+          else:
+              b = b - a
+      return a
+
+**Note**: If you are unaware of Euclidean algorithm to compute the gcd
+of two numbers please refer to it on wikipedia. It has a very detailed
+explanation of the algorithm and its proof of validity among other
+things.
+
+Now let us run our script which already has the tests written in it
+and see what happens::
+
+  madhu@madhu:/media/python/sttp/tdd$ python gcd.py
+  All tests passed!
+
+Success! We managed to pass all the tests. But wasn't that code simple
+enough? Indeed it was. If you take a closer look at the code you will
+soon realize that the chain of subtraction operations can be replaced
+by a modulo operation i.e. taking remainders of the division between
+the two numbers since they are equivalent operations. Also modulo
+operation is far better than chain of subtractions because you will
+reduce much faster using modulo operation than the subtraction. For
+example if let us take 25, 5 as a and b in our example. If we write
+down the steps of the algorithm written above we have the following:
+
+Step 1: a = 25 b = 5: Since both a and b are not 0 and b is greater
+than a: b = 25 - 5 = 20
+Step 2: Since b is still not 0 and b is greater than a: b = 20 - 5 =
+15
+Step 3: Since b is still not 0 and b is greater than a: b = 15 - 5 =
+10
+Step 4: Since b is still not 0 and b is greater than a: b = 10 - 5 = 5
+Step 5: Since b is still not 0 and b is equal to a: b = 5 - 5 = 0
+Step 6: Since b is 0 the gcd is a = 5 which is returned
+
+If we adopt the modulo operation instead of subtraction and follow the
+steps:
+
+Step 1: a = 25 b = 5: Since both a and b are not 0 and b is greater
+than a: b = 25 % 5 = 0
+Step 2: Since b is 0 the gcd is a = 5 which is returned
+
+Wow! That was overwhelmingly lesser number of steps! So now we are
+convinced that if we replace the subtraction operation with the modulo
+operation our code performs much better. But if we think carefully we
+know that the modulo of a and b is less than b irrespective of how
+large the value of a is, including the case where a is already less
+than b. So we can eliminate that extra conditional **if** statement by
+just swapping the result of the modulo operation to the position of b
+and b to the position of a. This ensures that a is always greater than
+b and if not the swapping combined with modulo operation takes care of
+it. To exemplify it, if a = 5 and b = 25 then by swapping and
+performing modulo we have a = b = 25 and b = a % b = 5 % 25 = 5 and
+hence we proceed. So let us replace our original code with this new
+improved code we have come up with simple observations::
+
+  def gcd(a, b):
+      while b != 0:
+          a, b = b, a % b
+      return a
+
+Executing our script again we will see that all the tests pass. One
+final improvement we can think of which is not necessary in terms of
+efficiency but is certainly good to do keeping in mind the readability
+is that we can use the concept of recursion for the same
+algorithm. Without going into much detail this is how the code looks
+if we use a recursive approach::
+
+  def gcd(a, b):
+      if b == 0:
+          return a
+      return gcd(b, a%b)
+
+Much shorter and sweeter! And it passes all the tests!
+
+
 More realistic "Tests"
 ======================