Monday, February 20, 2012

The Value of Code Reading versus Testing

This morning's reading is a paper by Basili [1], originally published in 1997, on the value of code reading as a quality improvement technique. It is gratifying to see him make some points that I have felt for a while. My immediate reaction, though, is to ask how it informs my pedagogy.

Of all the things we teach computer science students, the hardest and yet most fundamental is computational thinking. At its most basic, we expect our students to read code and be able to execute it by hand. It occurs to me that the most important computer we build is the one in their heads. Their ability to submit programs to this mental computer improves throughout their careers. For the lower-level classes, I find no value in executing student programs, simply because I catch far more logic errors by reading their code than I would by testing in the limited time I can give each program. It is simply more efficient. My internal computer is good.

But the form of reasoning I do when reading these programs is not strictly computational. It is similar to formal methods in that I am proving program correctness in my reading, not submitting some test data in my head. The ability develops naturally for someone who works with code over a long period of time and develops without much formal training. It is the ability that I believe Basili is exploiting in his studies and one that is far more powerful than testing can ever be. But if a student never learns how to "play computer", this mode of thought is shut off. I believe this is one of the most basic skills we teach.
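A minimal sketch of the kind of fault I have in mind, in Python. The functions and test inputs are hypothetical illustrations of my own, not taken from Basili's paper or from any student's work. A few quick tests pass, yet reading the loop and asking what must be true of `largest` before the first iteration exposes the fault at once.

```python
def largest_buggy(values):
    """Intended to return the largest element of a non-empty list."""
    largest = 0  # fault: silently assumes at least one element is >= 0
    for v in values:
        if v > largest:
            largest = v
    return largest


def largest_fixed(values):
    """Return the largest element of a non-empty list."""
    largest = values[0]  # invariant: largest is the max of the elements seen so far
    for v in values[1:]:
        if v > largest:
            largest = v
    return largest


# The handful of quick tests that limited grading time allows. All of them pass.
quick_tests = [[3, 1, 2], [5], [2, 9, 4, 9]]
assert all(largest_buggy(t) == max(t) for t in quick_tests)

# The input that reading predicts will fail: every element is negative.
assert largest_buggy([-3, -1, -2]) == 0   # wrong answer, 0 is not even in the list
assert largest_fixed([-3, -1, -2]) == -1  # correct
```

The reader never has to invent the all-negative input by luck. It falls out of checking whether the claim "largest is the maximum of what has been seen" holds at the start of the loop.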

"We differentiate a technique from a method, from a life cycle model. A technique is the most primitive. It is an algorithm, a series of steps producing the desired effect, and requires skill. A method is a management procedure for applying techniques, organized by a set of rules stating how and when to apply and when to stop applying the technique (entry and exit criteria), when the technique is appropriate, and how to evaluate it. We will define a technology as a collection of techniques and methods. A life cycle model is a set of methods that covers the entire life cycle of a software product." [1]

From the 1987 study:
"The results were that code reading found more faults than functional testing, and functional testing found more faults than structural testing. Also, code reading found more faults per unit of time spent than either of the other two techniques. Different techniques seemed to be more effective for different classes of faults. For example, code reading was more effective for interface faults and functional testing more effective for control flow faults." [1]
...
"Based upon this study, reading was implemented as part of the ... development process. However, much to our surprise, reading appeared to have very little effect on reducing defects." [1]

Another purpose this article serves is to remind me of the insights gained from my master's project, which in the end proved to be a lesson in reverse engineering. My contribution ultimately proved to be an elaboration of the process followed when an existing system is the input to a re-engineering project and its architecture either is not understood or must be significantly changed to support the new requirements, or both. I don't find the results either surprising or inspired. Yet I am not aware of the steps of this process being documented anywhere else.

What Basili describes in this paper's discussion of the step-wise abstraction technique is no different from what I needed to do for my project, albeit at a different level of abstraction. For Bass et al. [2], the purpose of architecture documentation is to provide exactly the kind of abstraction these readers must construct. I would presume that in Basili's study, even if a design at this level existed, it was not shared with the readers. Perhaps it would be an interesting study to have two groups of readers: one that reads without the design and one that reads with the design available. In the end, I feel I was reading the code base in a step-wise abstraction way to recover the architecture design I needed for this system to support my vision for the product.

For my students, I have a renewed appreciation for what they must learn. When introducing programming, I am going to place greater emphasis than I have in the past on techniques for playing computer. My intent is to be able to give them simple programs on a test and see their scores improve when they are asked for the result of execution. After all, the computer is a machine. Their ability to anticipate the machine's response to its input is only a higher abstraction of asking them to anticipate which way a gear will turn in a mechanism when the user turns a crank.
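As a sketch of the kind of exercise I mean, here is a small Python program a student could be asked to play computer on. The function name `mystery_with_trace` and the trace-table format are my own hypothetical choices for illustration. The function returns the table a student would be expected to build by hand, so a paper trace can be checked row by row against the machine.

```python
def mystery_with_trace(n):
    """Run a small loop and record the state after each iteration."""
    total = 0
    i = 1
    rows = []  # one (i, total) pair per pass through the loop body
    while i <= n:
        if i % 2 == 0:
            total += i
        rows.append((i, total))
        i += 1
    return total, rows


result, rows = mystery_with_trace(5)
for i, total in rows:
    print(f"i={i} total={total}")
print("returns", result)
```

The test question is simply: what does the call with 5 return, and what are `i` and `total` after each pass? A student who can fill in that table without running anything has the mental computer I am after.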



[1] V. Basili, Evolving and Packaging Reading Technologies, in Foundations of Empirical Software Engineering, eds. Boehm, Rombach, Zelkowitz, Springer, 2005

[2] L. Bass et al., Software Architecture in Practice, Addison-Wesley, 2003
