Edsger Dijkstra once said that testing can demonstrate the presence of bugs but not their absence. That is good to keep in mind as you develop code - good code. What is the difference between debugging and testing? You debug when you know or have identified problems in your code. Testing is the art of systematically trying to break code that you think is bug free. In this lecture, we deal with the detective work needed to break code and uncover inconsistencies and problems in your working code.
In the next lecture, we will discuss how to automate unit testing - an extremely important part of the design and development process. In these notes, we discuss neat strategies such as test as you code. Many bugs live at what we call boundaries: a program that reads data up to the end of an allocated buffer; a program expecting a stream of characters that gets a newline or EOF as the very first character it reads; a for loop, not written as the C idiom, that writes one memory location beyond the end of an array (C can be dangerous - it has no subscript checking). Thinking about the common bugs that could creep into code while you are writing it can save you significant time in the development process.
The material used in the lecture is strongly influenced by [KP, 1999]: The Practice of Programming (Addison-Wesley Professional Computing Series) by Brian W. Kernighan and Rob Pike. The notes use a number of programming examples from Chapter 6 on Testing, and the text reflects and cites material from that chapter. It is an outstanding book and strongly recommended if you want to advance your knowledge of good programming practices. We use some code examples from [KP, 1999], cited in the notes.
Another short reference I like is Jeff Canna’s “Testing, Fun, Really”. I recommend that you read Chapter 6 from [KP, 1999] and Jeff Canna’s article.
BTW, you can skip these notes if you write bug free code ;-)
We plan to learn the following from today’s lecture: the difference between testing and debugging, testing as you code, C idioms, boundary condition testing, pre- and post-condition testing with assertions, defensive programming, and checking return status.
Many people have the attitude that testing code is a waste of time and boring - why do it? Consider the software lifecycle discussed in class: procurement, requirements, design, coding, testing, debugging, integration and bake off. I would say that testing and debugging take the largest chunk of time. It is smart to develop tools and code to automate the testing of code. Why? The simple answer is that it saves time and pushes the sometimes tedious task of testing into tools and away from humans. Imagine you design and develop a complex system and sell it to a customer who subsequently finds a serious bug out in the wild - that is, when the product is deployed in a power station, across a country as an air traffic control system, in washing machines, or in an app. You fix the problem, but how do you know you haven’t broken something else while fixing your serious bug? You don’t - unless you have developed a set of tools to systematically retest the code. If you have developed a set of integration tests, sub-system tests and unit tests, then you can quickly rerun them before you reship the product with the bug fix. If all the tests pass, you have some confidence that the change you made did not introduce additional problems. Note, I use the phrase “some confidence” because you are never 100% sure.
Let me restate the difference between debugging and testing - with some philosophical comments. It is good to have this clear in your mind:
Testing is a determined, systematic attempt to break a program that you think is working. As discussed above, putting smarts into a set of tools and a test harness can automate this process. Story: this lecturer started his software career working for a company called Plessey Radar in the UK. My first assignment was testing someone else’s code. Fortunately, this was very cool because the code under test was the kernel of an operating system. I had done applied math at university and knew nothing about software. It might have felt like grunt work to a newbie on the job, but I learnt how to design an operating system from the ground up. Later in my career I worked for a company that brought in a consultant to run integration tests against code - my code. I was not happy dealing with testers until they found bugs in code of mine that was already in beta release. After that I had time and respect for them. Today, I still think producing smart testing tools to automate the testing of your code produces much more robust code.
Debugging is what you do when you know that the program is broken (e.g., segfault), fails (e.g., a DNODE is never linked into the hash table), underperforms (e.g., memory leaks bring the system to a slow but grinding halt) or acts inconsistently (e.g., never terminates when it should). These are all bugs that testing can find - better to find them and fix them. Sometimes bugs cascade: one bug creates others, and so on. These types of bugs are hard to fix. In essence, the “bug chain” needs to be worked through to get to the real culprit at the start of the chain. Many times it is not obvious which problems in a system are linked, so some detective work is required: in the last lecture we forced the bug out into the open before swatting that pesky bug - which, you recall, lurked at the boundary of the for loop logic.
The earlier you find a problem with your code the better; it will save significant time and your company money. You will also get respect from other programmers. Digging bugs out months after you have written and forgotten the code is a serious challenge: test as you go. I know that for most labs you have sat at the terminal and just hacked at the code to reveal and fix bugs - you sit there praying, hack in some printfs, hit make and then run your code, hundreds of times. It is not smart to work like this; it is a dumb brute-force method, and you’ll get no respect in industry for it. So be smart and add a couple of new skills to your cs50 toolbox: “test as you go” is one. Using C idioms is another great way to limit bugs. If you do not blindly hack code but instead sit back for a moment and read that while or for loop logic through carefully, you have already tested your code on one level before you have hit the gcc button. Doesn’t this make good sense? Yes it does - putting your code through inspection before you compile it is great, really great, and it comes for free.
You already know many C idioms even if we haven’t always labeled them that way. But there are many ways to write even a simple loop.
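For example, here are three workable but non-idiomatic versions - a reconstruction in the spirit of [KP, 1999], assuming an int i and a double array[] of n elements:

```c
/* Three ways to set every element of array[] to 1.0 -- all correct,
 * none of them the C idiom. */
i = 0;
while (i <= n - 1)
    array[i++] = 1.0;

for (i = 0; i < n; )
    array[i++] = 1.0;

for (i = n; --i >= 0; )
    array[i] = 1.0;
```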
There are many ways to skin a cat - that is a horrible saying, right, and my cat Tiger would be alarmed. The simple loop above is written in three different ways. But like the English language, the C programming language has idioms: conventional ways that experienced programmers write common pieces of code. A central part of learning any language is developing familiarity with its idioms [KP, 1999].
While all of the loops shown above would work, they are not C idioms. The C idiom is:
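Assuming the same i, array and n as above:

```c
/* The idiom: start at 0, test with <, increment in the update clause. */
for (i = 0; i < n; i++)
    array[i] = 1.0;
```

One glance tells an experienced C programmer exactly what this does: it runs n times, over indices 0 to n-1.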
Being consistent when programming C will help enormously. If you are used to writing idiomatic code, then when you see a piece of code that is not idiomatic you should stop and take a close look at it: maybe a boundary problem is lurking there. If code is written the same way each time, then non-idiomatic code suggests either poor code or some genuine difference that the idiom does not cover. Either way, take a second close look to convince yourself it’s one or the other.
Here are some more examples of C idioms that you should be familiar with:
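These are fragments of the kind [KP, 1999] has in mind (assume the obvious declarations: int c; char *p, *q, *s; size_t n):

```c
while ((c = getchar()) != EOF)     /* read characters until end of file */
    putchar(c);

for (p = s; *p != '\0'; p++)       /* walk a string to its terminating NUL */
    ;

while ((*p++ = *q++) != '\0')      /* the classic string-copy idiom */
    ;

for (;;) {                         /* idiomatic infinite loop */
    /* ... runs until a break ... */
    break;
}

if ((p = malloc(n)) == NULL) {     /* allocate and check, in one step */
    fprintf(stderr, "out of memory\n");
    exit(1);
}
```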
Here are some TinySearch idioms that you should be using in your code; see the solution to lab 4 for many examples.
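As a purely illustrative sketch - the DNODE type and the function name here are hypothetical, not necessarily the exact lab 4 API - this is the kind of idiom meant: the hash-chain walk.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical node type; the real lab 4 structure may differ. */
typedef struct dnode {
    struct dnode *next;    /* next node in the collision chain */
    char *key;             /* the key this node stores */
    void *data;            /* whatever the node carries */
} DNODE;

/* Walk one hash-table slot's collision chain looking for key. */
DNODE *lookup(DNODE *slot, const char *key)
{
    DNODE *d;
    for (d = slot; d != NULL; d = d->next)   /* idiomatic list walk */
        if (strcmp(d->key, key) == 0)
            return d;                        /* found it */
    return NULL;                             /* not in this chain */
}
```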
So write in idioms as much as possible; it limits bugs at boundaries.
In what follows, we use a series of code snippets from [KP, 1999] to illustrate how to code to remove boundary bugs. It is a simple yet elegant example of writing code to handle all the boundary conditions that may present themselves.
Boundary testing assumes you test a small snippet of code at a time - a sort of micro testing of code sequences, of what should be idiomatic code. The example code below does not use idioms and is not written to cater for boundary bugs; it is set up to illustrate exactly that. But we have all written poor code like this. The take-home from this section is how the code evolves once we think about where the boundary bugs are. So, for example, as you write a loop, check there and then that the condition branches the right way and that execution goes through the loop the correct number of times.
The technique is called boundary condition testing because you probe at the natural boundaries of the program, its data and its data structures (if they exist). For example, for the code below we probe for the following boundaries:
1) empty input
2) a single input character
3) an exactly full array - but it could equally be a malloc'ed buffer
and so on. It could have been
4) empty queue
5) no collisions in a hash table
6) collisions in a hash table
7) add to the end of a cluster, etc.
Consider the following code snippet from [KP, 1999]:
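Reconstructed from [KP, 1999] (MAX is the size of the array s), the snippet reads one line of input into a fixed-size array:

```c
int i;
char s[MAX];

/* Read characters until newline, guarding against overflowing s[]. */
for (i = 0; (s[i] = getchar()) != '\n' && i < MAX - 1; ++i)
    ;
s[--i] = '\0';   /* overwrite the newline with the string terminator */
```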
Looking at this code, the first thing that should strike us is that it’s not a C idiom. That should make us look very closely at the loop logic, particularly the loop’s conditional logic. Maybe, after studying the code and thinking about the characters being input, we convince ourselves that the code is non-idiomatic but works. Now start to test boundaries. Consider 1) above, where there is no input but a newline - the user simply types carriage return. The code terminates immediately on the first iteration with i set to zero. The last line, which should replace the newline character with an end-of-string character, writes the NUL terminator to s[i-1] - that is, s[-1], before the start of the array. Not a good idea, hey? Thinking about a boundary test gets the pesky bug out into the open to swat.
Here is the nasty edge case code. Try it out. C code: edgecases.c
OK. The smart thing to do is rewrite this convoluted loop as an idiom and solve the problem; for example:
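After [KP, 1999], with the same s, MAX and i as before:

```c
/* Idiomatic version: the loop bound is up front, and the newline case
 * simply breaks out, so s[i] is always a valid slot for the NUL. */
for (i = 0; i < MAX - 1; i++)
    if ((s[i] = getchar()) == '\n')
        break;
s[i] = '\0';
```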
Inspecting the code above, we can easily see that the previous boundary problem is solved by the new idiomatic code: it handles the case when the input is solely a newline beautifully. If we mentally trace through reading 1, 2 or 3 characters, we see it also works; for example, typing a b c followed by a newline gives:
s[0] is a
s[1] is b
s[2] is c
s[3] is 0
Looks good.
Here is the better code. Try it out. C code: better.c
But what if we get an empty line where the first character is EOF? It breaks. Another pesky bug in our code. More specifically, we do not cater for an action that is unexpected - at least in the mind of the person who coded the loop. Someone hitting control-D (EOF) is a likely occurrence.
OK. Let’s fix this boundary bug.
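A sketch of the fix - the version in evenbetter.c may differ in detail. The key ideas: hold getchar()'s result in an int (EOF is not a valid char value), and stop on either a newline or EOF:

```c
int c, i;
char s[MAX];

for (i = 0; i < MAX - 1; i++) {
    c = getchar();
    if (c == '\n' || c == EOF)   /* stop on end of line OR end of file */
        break;
    s[i] = c;
}
s[i] = '\0';                     /* i is 0 if EOF came first: s is "" */
```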
That works! Nice piece of code now. Looks like we are done.
Here is the even better code. Try it out. C code: evenbetter.c
We have removed and tested all the edge cases. The code now handles each of them nicely:
1) where the first character is a newline
2) where the first character is EOF
3) where there is one character
4) where more characters are input than the max size of the array
5) where characters are entered terminated by a newline
6) where characters are entered terminated by an EOF (control-D)
Hmm... are we sure? There are other boundary problems that could lurk here. What happens if the array is nearly full - does it work? What happens if the array is exactly full, or over full? What happens if any of these conditions occurs followed by a newline? Are these boundary conditions catered for? We will leave you to determine the answer to that question.
You get the idea. Bugs lurk at boundaries. Conversely, if code works at boundaries, it is likely to work elsewhere.
It is always a good idea to test pre- and post-conditions - that is, conditions that should hold before and after, respectively, some piece of code executes. For example, we have already used defensive programming to check that input values are within range - an example of pre-condition testing. Let’s look at another simple example out of [KP, 1999] that computes the average of the n elements of an array a[]. Closer inspection of the code reveals that there is a problem if n is less than or equal to 0.
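Here is the function, after [KP, 1999]:

```c
/* Compute the average of the n elements of a[]. */
double avg(double a[], int n)
{
    int i;
    double sum;

    sum = 0.0;
    for (i = 0; i < n; i++)
        sum += a[i];
    return sum / n;    /* divides by zero when n == 0 */
}
```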
A natural question is what to do if someone calls avg() with n = 0? An array of zero elements does not make much sense, but an average of 0 does. Should our code catch the division by zero with an assert, abort, complain, or stay silent? One reasonable approach is to just return 0 as the average if n is less than or equal to zero. While the code is idiomatic in style, we need to tweak it to test the pre-condition, as shown below - note: return n <= 0 ? 0.0 : sum/n
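A sketch of the tweaked function:

```c
/* Pre-condition tested: an empty (or nonsensical) array averages to 0. */
double avg(double a[], int n)
{
    int i;
    double sum;

    sum = 0.0;
    for (i = 0; i < n; i++)
        sum += a[i];
    return n <= 0 ? 0.0 : sum / n;
}
```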
We have used the MY_ASSERT() macro in the development of TinySearch and SHOULD_BE() in the unit testing code; for example:
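A hypothetical reconstruction - the exact CS50 macros may differ:

```c
#include <stdio.h>
#include <stdlib.h>

/* MY_ASSERT() bails out with a message on failure; SHOULD_BE() is the
 * unit-test flavor: it reports a failure but lets the test run continue. */
#define MY_ASSERT(cond) do {                                          \
    if (!(cond)) {                                                    \
        fprintf(stderr, "Assertion failed: %s, file %s, line %d\n",   \
                #cond, __FILE__, __LINE__);                           \
        exit(EXIT_FAILURE);                                           \
    }                                                                 \
} while (0)

#define SHOULD_BE(cond) do {                                          \
    if (!(cond))                                                      \
        printf("Test FAILED: %s (line %d)\n", #cond, __LINE__);       \
} while (0)
```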
C provides an assertion facility in assert.h, useful for pre- and post-condition testing. Asserts are usually reserved for unexpected failures where there is no clean way to recover control. For example, our avg() function could handle n <= 0 with a different solution, using the assert function:
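Using the standard assert macro (a sketch):

```c
#include <assert.h>

double avg(double a[], int n)
{
    int i;
    double sum;

    assert(n > 0);     /* abort loudly rather than divide by zero */
    sum = 0.0;
    for (i = 0; i < n; i++)
        sum += a[i];
    return sum / n;
}
```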
If the assertion is in fact violated, it will cause an abort and a standard message to be printed out:
Assertion failed: n > 0, file avgtest.c, line 7
Abort(crash)
Assertions are very useful in validating the expected properties of an interface or range of input arguments.
Your crawler and indexer use defensive programming to check that the input arguments are logically correct: e.g., that a path actually exists. A useful technique when coding is to “expect the unexpected” - to code for the unexpected. Adding the check for n <= 0 in avg() is one example. Another is below:
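A sketch in the spirit of the grading example in [KP, 1999] - the exact grade boundaries here are assumptions:

```c
char letter;

if (grade < 0 || grade > 100)   /* "can't happen" -- but code for it */
    letter = '?';
else if (grade >= 90)
    letter = 'A';
else if (grade >= 80)
    letter = 'B';
else if (grade >= 70)
    letter = 'C';
else if (grade >= 60)
    letter = 'D';
else
    letter = 'F';
```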
The CS50 automatic grading program - a snippet is shown above ;-) - handles negative grades (I’ve yet to do that) and very large grades. If the unexpected happens, the code returns a '?'. This is a good example of defensive programming. In essence, our programmer is coding against incorrect use or illegal data. Other examples include:
1) Out of range subscripts
2) NULL pointers
3) Divide by zero
A really good programmer always checks the return status from functions, system calls, and libraries. If you neglect to look at the return status, how do you know that the function really worked? If your code assumes a call did not fail when in fact it did, the resulting segfault or error will be hard to debug. Better to always check the error status returned by functions.
Another example from [KP, 1999]:
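A sketch of the pattern - the function and file names are assumptions. Write the new file, and only trust it (and remove the old one) if every write, and the close that flushes buffered data, succeeded:

```c
#include <stdio.h>

/* Returns 0 on success, -1 if the output could not be written safely. */
int write_output(const char *newfile)
{
    FILE *fp = fopen(newfile, "w");
    if (fp == NULL)
        return -1;                    /* could not open the output file */

    fprintf(fp, "important data\n");  /* ... the real output loop ... */

    if (ferror(fp)) {                 /* did any write fail? */
        fclose(fp);
        remove(newfile);              /* don't leave a truncated file */
        return -1;
    }
    if (fclose(fp) == EOF) {          /* the final flush can fail too */
        remove(newfile);
        return -1;
    }
    return 0;                         /* now it is safe to replace the old file */
}
```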
Output errors can be a serious problem: if writing the file above fails and the error status of fprintf is never checked, file data can be silently lost. The check above saves you from removing the old file when the new one was not written properly.
1) Test incrementally and build confidence in your code.
2) Write unit tests that can be re-run once fixes or changes have been made.
3) Write self-contained unit tests.
3.1) Test inputs and outputs.
3.2) Test the dataflow through the program.
3.3) Test all the execution paths through the program.
3.4) Question: what environment do you need to set up to do 3.1-3.3?
4) Stress test the code; start simple and advance (test crawler at depth 1 .. 10 for example).
5) Don’t implement new features if there are known bugs in the system.
6) Test for portability: run code and tests on multiple machines/OSs.
7) Before shipping code make sure that the test code ifdefs are off.
If you follow at least 50% of the tips in these notes, you will write better code, and it will have considerably fewer bugs than if you did not apply these simple tips and strategies. Or your money back.