# HG changeset patch # User Madhusudan.C.S # Date 1283261250 -19800 # Node ID 4e7b98636b5885be38f4dce5a63aab23bda78821 # Parent 0afd25eadf41d15045593ccf71d9eb0cd7c530ba Improved the initial Introduction on tests content. diff -r 0afd25eadf41 -r 4e7b98636b58 tdd/tdd.rst --- a/tdd/tdd.rst Wed Jul 14 00:19:54 2010 +0530 +++ b/tdd/tdd.rst Tue Aug 31 18:57:30 2010 +0530 @@ -42,41 +42,177 @@ def gcd(a, b): pass -This stub does nothing other than defining a new function called gcd which -takes two parameters a and b for which the GCD must be calculated. The body -of the function just contains pass which means it does nothing, i.e. empty. -We have our stub ready. One important thing we need to keep in mind when -we adopt TDD methodology is that we need to have a clear set of results -defined for our code units. To put it more clearly, for every given set of -inputs as test case we must have, before hand, the exact outputs that are -expected for those input test cases. If we don't have that we have failed -in the first step of the TDD methodology itself. We must never run for -outputs for our test cases after we have the code ready or even while -writing tests. The expected outputs/behaviour must be in our hands before -we start writing tests. Therefore let us define our test cases and the -expected output for those inputs. Let one of our test cases be 48 and 64 -as *a* and *b* respectively. For this test case we know that the GCD is -16. So that is the expected output. Let our second test case be 44 and -19 as *a* and *b* respectively. We know that their GCD is 1 by simple paper -and pen calculation. +This stub does nothing other than defining a new function called gcd +which takes two parameters a and b for which the GCD must be +calculated. The body of the function just contains Python's **pass** +statement which means it does nothing, i.e. empty. We have our stub +ready. One important thing we need to keep in mind when we adopt TDD +methodology is that we need to have a clear set of results defined for +our code units. To put it more clearly, for every given set of inputs +as test case we must have, before hand, the exact outputs that are +expected for those input test cases. If we don't have that we have +failed in the first step of the TDD methodology itself. We must never +run looking for outputs for our test cases after we have the code +ready or even while writing tests. The expected outputs/behaviour must +be in our hands before we start writing tests. Therefore let us define +our test cases and the expected output for those inputs. Let one of +our test cases be 48 and 64 as *a* and *b* respectively. For this test +case we know that the GCD is 16. So that is the expected output. Let +our second test case be 44 and 19 as *a* and *b* respectively. We know +that their GCD is 1 by simple paper and pen calculation. Now we know what a test is? What are the ingredients required to write tests? So what else should we wait for? Let us write our first test!:: tc1 = gcd(48, 64) if tc1 != 16: - print "Test failed for the case a=48 and b=64. Expected 16. Obtained - %d instead." % tc1 + print "Test failed for the case a=48 and b=64. Expected 16. Obtained %d instead." % tc1 exit(1) tc2 = gcd(44, 19) if tc2 != 1: - print "Test failed for the case a=44 and b=19. Expected 1. Obtained - %d instead." % tc2 + print "Test failed for the case a=44 and b=19. Expected 1. Obtained %d instead." % tc2 exit(1) print "All tests passed!" +Let us put all these in a file and call this file **gcd.py**:: + + def gcd(a, b): + pass + + if __name__ == '__main__': + tc1 = gcd(48, 64) + if tc1 != 16: + print "Test failed for the case a=48 and b=64. Expected 16. Obtained %d instead." % tc1 + exit(1) + + tc2 = gcd(44, 19) + if tc2 != 1: + print "Test failed for the case a=44 and b=19. Expected 1. Obtained %d instead." % tc2 + exit(1) + + print "All tests passed!" + +Note that we have introduced a new semantic which uses two new magic names +in Python *__name__* and *__main__*. This is a very common idiom used in +Python. Every Python code in a file can be run in two ways: Either as an +independent stand-alone script or as a Python module which can be imported +by other Python scripts or modules. When the idiom:: + + if __name__ == '__main__': + +is used, the code within this if block is executed first when we run the +Python file as a stand-alone script. In other words, when we run this +python file as a stand-alone script the control of the program first starts +from the code that is within this if block from which the control is +transferred to other parts of the program or to other modules from +here. This comes as an extremely handy feature especially when we want to +test our modules individually. Now let us run our code as a stand-alone +script.:: + + madhu@madhu:~/Desktop$ python gcd.py + Traceback (most recent call last): + File "gcd.py", line 7, in print "Test failed for the case a=48 and b=64. Expected 16. Obtained %d instead." % tc1 + TypeError: %d format: a number is required, not NoneType + +Now we have our tests, the test cases and the code unit stub at +hand. We also have the failing test. So we know for sure that we have +cleared the first check point of TDD where the tests have failed. The +failing tests also give a green signal for us to go ahead to our next +check point i.e. to write the actual code in our code unit and make +the test pass. So let us write the code for the gcd function by +removing the **pass** control statement which had just created a gcd +function stub for us. + +Most of us have learnt in high school math classes that the best and +the easiest known algorithm to compute the gcd of two numbers was +given to us 2300 years ago by a greek mathematician named Euclid. So +let us use the Euclid's algorithm to compute the gcd of two numbers a +and b:: + + def gcd(a, b): + if a == 0: + return b + while b != 0: + if a > b: + a = a - b + else: + b = b - a + return a + +**Note**: If you are unaware of Euclidean algorithm to compute the gcd +of two numbers please refer to it on wikipedia. It has a very detailed +explanation of the algorithm and its proof of validity among other +things. + +Now let us run our script which already has the tests written in it +and see what happens:: + + madhu@madhu:/media/python/sttp/tdd$ python gcd.py + All tests passed! + +Success! We managed to pass all the tests. But wasn't that code simple +enough? Indeed it was. If you take a closer look at the code you will +soon realize that the chain of subtraction operations can be replaced +by a modulo operation i.e. taking remainders of the division between +the two numbers since they are equivalent operations. Also modulo +operation is far better than chain of subtractions because you will +reduce much faster using modulo operation than the subtraction. For +example if let us take 25, 5 as a and b in our example. If we write +down the steps of the algorithm written above we have the following: + +Step 1: a = 25 b = 5: Since both a and b are not 0 and b is greater +than a: b = 25 - 5 = 20 +Step 2: Since b is still not 0 and b is greater than a: b = 20 - 5 = +15 +Step 3: Since b is still not 0 and b is greater than a: b = 15 - 5 = +10 +Step 4: Since b is still not 0 and b is greater than a: b = 10 - 5 = 5 +Step 5: Since b is still not 0 and b is equal to a: b = 5 - 5 = 0 +Step 6: Since b is 0 the gcd is a = 5 which is returned + +If we adopt the modulo operation instead of subtraction and follow the +steps: + +Step 1: a = 25 b = 5: Since both a and b are not 0 and b is greater +than a: b = 25 % 5 = 0 +Step 2: Since b is 0 the gcd is a = 5 which is returned + +Wow! That was overwhelmingly lesser number of steps! So now we are +convinced that if we replace the subtraction operation with the modulo +operation our code performs much better. But if we think carefully we +know that the modulo of a and b is less than b irrespective of how +large the value of a is, including the case where a is already less +than b. So we can eliminate that extra conditional **if** statement by +just swapping the result of the modulo operation to the position of b +and b to the position of a. This ensures that a is always greater than +b and if not the swapping combined with modulo operation takes care of +it. To exemplify it, if a = 5 and b = 25 then by swapping and +performing modulo we have a = b = 25 and b = a % b = 5 % 25 = 5 and +hence we proceed. So let us replace our original code with this new +improved code we have come up with simple observations:: + + def gcd(a, b): + while b != 0: + a, b = b, a % b + return a + +Executing our script again we will see that all the tests pass. One +final improvement we can think of which is not necessary in terms of +efficiency but is certainly good to do keeping in mind the readability +is that we can use the concept of recursion for the same +algorithm. Without going into much detail this is how the code looks +if we use a recursive approach:: + + def gcd(a, b): + if b == 0: + return a + return gcd(b, a%b) + +Much shorter and sweeter! And it passes all the tests! + + More realistic "Tests" ======================