ABSTRACT
Because computer science is taught as an applied mathematics, and because it is the discipline used to train our legions of software engineers, software engineering is deeply wedded to rationalism as its primary mode of thought. But that mode of thinking has limits that are not often recognized and, if taught at all, are poorly taught. This short essay looks at the limits of rationalism as it applies to the development of large-scale commercial information systems by software engineers steeped in this tradition. It concludes with recommendations for adjusting an undergraduate program to address this shortcoming more directly and give the new software engineer a larger mode of reasoning about how to approach the task. It also has implications for graduate programs and for the directions that future PhD students could pursue.
INTRODUCTION
To keep this essay short, it is limited in scope. The focus is on the engineering of large-scale information systems within a commercial context. This is for several reasons. First, large-scale systems cannot be created by a sole contributor. The inability of one person to satisfy the client's need without coordination, communication, and collaboration with peers changes the techniques that can be applied. What was internal thought must now be shared, and what was vague must be made explicit. Since language is the medium for this, the work cannot be done without artifacts and process.
We currently use software-dominated computer systems for many purposes, including entertainment and machine control, to say nothing of research and artificial intelligence. But the historical prominence of information systems gives us a rich history to rely on and a shared foundation from which to begin. This paper draws largely on that history of information systems that could be specified in formal ways.
An immediate application of any contribution toward the tasks of software engineering is the education of new engineers. This piece will not stray from undergraduate material that one might hope is universal in every university computer science program. Instead it will focus on the interdisciplinary application of that material.
We very briefly remind the reader of the philosophical concept of rationalism and its contrasting concept of empiricism. The two most common process models of software methodology are mentioned, waterfall and Agile-oriented. Then the remainder looks at specific ways in which we try to apply rationalism yet find it wanting in practice. The conclusion looks at these shortcomings and possible ways of restating the issue to offer the most promising alternative to a rationalist approach.
BACKGROUND
Rationalism has rich roots in philosophy. While it touches the philosophy of mind, in this essay we look at the question of what can be known a priori versus what must be learned. Constrained in this way, we can apply the concept to many important software engineering areas, including requirements engineering, specification, design, and testing. Rationalism is seen in the rationalization of human process (methodology) and in the intrinsic nature of formal languages and machines, but it is also attempted in specific tasks such as testing and requirements engineering.
Large scale development without some degree of methodology, whether explicit or implicit, is prone to failure and not a good model for engineering. There will always be exceptions but in an educational environment we teach to what can be learned, not what depends upon an exceptional practitioner.
The discussion follows the most canonical progression of software engineering tasks as often taught in the waterfall methodology.
RATIONALISM IN ORGANIZATIONAL CHANGE
A common characterization of the large organizations that consume the majority of commercial software engineering services is that a need or opportunity to improve the organization through some project is recognized. While this alone may be interesting, it occurs before the commencement of any recognized software engineering methodology. However, we want to note that it is rare for a project to commence with a completely blank slate, either in environment or in the pre-formed attitudes of the people who will make the decisions. There is an a priori set of beliefs, processes, constraints, and other limiting factors which will shape the project. Insofar as it is a purely rational process, some idealized process of project management might find the result predictable. The fact that it is not is the very truth we wish to examine.
RATIONALISM IN REQUIREMENTS ENGINEERING
We lie to undergraduates in the computer science program. We lead them to believe that clients give clear requirements (all right, perhaps we help a bit) and that from those we create some high-level design which is reified until we have the specifications for all the modules to be coded. If we are good at our jobs as software engineers, we take those specifications and implement some algorithm that satisfies each specification. In the best of all possible worlds, that satisfaction can be proven using static analysis, giving ultimate certainty that, as long as the hardware properly carries out the logic, valid input will result in the desired output.
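To make this idealized chain concrete, here is a minimal sketch, with a function and conditions invented purely for illustration, of a specification expressed as a precondition and postcondition that an implementation can be checked against:

```python
# An idealized waterfall chain in miniature: a formal specification
# reduced to code. The function and its conditions are illustrative
# inventions, not a real client requirement.

def sqrt_floor(n: int) -> int:
    """Return the largest integer r with r*r <= n.

    Precondition:  n >= 0
    Postcondition: r*r <= n < (r+1)*(r+1)
    """
    assert n >= 0, "precondition violated"
    r = 0
    while (r + 1) * (r + 1) <= n:
        r += 1
    assert r * r <= n < (r + 1) * (r + 1), "postcondition violated"
    return r
```

In the rationalist ideal, static analysis would prove the postcondition for every valid input; the assertions above only check it for the inputs we happen to run, which is exactly the gap between the ideal and practice.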
The failure of rationalism here is rather clear: clients are rarely capable of thinking in the extremely abstract manner needed to reduce a human need to a formal language. Yet they alone possess the domain knowledge and judgment to decide whether the completed project is acceptable. They must be satisfied, and a failure to properly interpret what is needed cannot be the fault of the client alone. The software engineer is responsible for extracting the necessary information.
The primary tool that allows clients and software engineers to communicate is language. This is not just natural language but also the various semi-formal languages of diagrams and models. Large-scale enterprises often have extensive documentation of their processes, which can form at least part of the corpus of support for a requirements document in a waterfall methodology or a reference for an Agile-oriented process. But these are not formal languages, and they all suffer from the common fault of all non-formal languages: potential ambiguity. There is no way to teach how one proceeds with the reduction of natural language into formal specification.
RATIONALISM IN PROJECT MANAGEMENT
Agile methodologies have successfully challenged the command-and-control, linear process that is the waterfall methodology. One reason waterfall is still taught is that it is rational and easy to understand. But the reality is that the pure method was more often unsuccessful than successful in industry. Rather than believe that business analysts have completely captured all the success criteria for the system under development, an Agile process assumes that the requirements are not complete and that only a working prototype will elicit the knowledge of what is lacking in the system. This is a far cry from the methodology of formal methods, which depends upon formal predicates that form the postconditions the system must satisfy.
RATIONALISM IN DESIGN
"In Nicomachean Ethics, Aristotle claims that to discover the human good we must identify the function of a human being. He argues that the human function is rational activity. Our good is therefore rational activity performed well, which Aristotle takes to mean in accordance with virtue." ("Aristotle's Function Argument," in The Constitution of Agency, Christine M. Korsgaard, Oxford University Press). Many attempts have been made to rationalize corporate governance and process. But the reality of management is that it is often necessary to make decisions in the absence of perfect knowledge, and this defines a limit to any attempt at complete rationalization. People will make decisions that might be different if complete knowledge were available. Such is the nature of being human.
While systems are not humans, a similar argument applies: a good system is one which executes its function well. "Function" takes on a distinct mathematical definition in computer science, and it is important to note that the word is qualified by "well." What does this mean?
In an earlier age we talked about functional and non-functional requirements. Those non-functional requirements are better called qualia, in keeping with the work of CMU's Software Engineering Institute. So their argument goes: the structure of function in a large system is not unique. There are a large number of ways to organize the ensemble that still achieve the strictly functional specification in this narrow mathematical sense. But what can change are the other qualia of the system, which can be vital to the ultimate satisfaction of the human need that motivated the project.
For large systems, the client becomes more difficult to define in the traditional engineering sense. It can be the person who pays the invoices, the person who casts the tie-breaking vote when there is a disagreement over requirements, or the person whose business unit will benefit from the new system. Project management texts tend to call all of these stakeholders in the project. There is some function of individual satisfaction that will yield a result that will "satisfice" for this collective client. But that is virtually never a well-defined function. Again, there is no rationalism in the final decision regarding the intended design of the system, merely a large number of possibly contradictory or overstated requirements from many people who may disagree with each other about the true mission of the project and the system to be built. As in many other human affairs, rationalism often fails to bring consensus to a group.
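The point can be made with a toy model, in which all stakeholders, weights, and scores are invented: even if every stakeholder's satisfaction could somehow be scored, the collective verdict depends on which aggregation rule we pick, and nothing in the problem dictates that rule.

```python
# Toy model of a collective "satisficing" decision. The stakeholders,
# weights, scores, and the 0.6 acceptance threshold are all invented;
# the point is that the outcome depends on the aggregation rule,
# which is itself an undetermined design choice.

satisfaction = {             # each stakeholder scores a design, 0..1
    "sponsor":    0.9,
    "end_users":  0.4,
    "operations": 0.7,
}
weights = {"sponsor": 0.5, "end_users": 0.3, "operations": 0.2}

weighted_avg = sum(weights[s] * v for s, v in satisfaction.items())
worst_case = min(satisfaction.values())

# A weighted average (about 0.71) says the design is acceptable.
# A maximin rule, protecting the least satisfied stakeholder (0.4),
# says it is not. Same data, opposite verdicts.
print(weighted_avg >= 0.6)
print(worst_case >= 0.6)
```

Neither rule is "the" rational one; choosing between them is exactly the kind of judgment the essay argues cannot be reduced to a method.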
Even when there is consensus on the various qualia to be achieved by the system, there is no reductionism that gives a linear path to that final design. There are heuristics, expert advice, patterns to follow, and other guides, but none of these represents the methodical approach that is implied by rationalism. The objective is to balance the many controllable factors to create a whole that again can "satisfice" for the problem to be solved. Much of the work of design methodology has been inspired by the auto industry, which deals with a very analogous problem.
Cars must run, steer, stop, and so on. These functional qualities are augmented by qualities like acceleration, comfort, and appeal. Metrics are easily acquired for functional properties but become progressively harder for many other qualities. How does one measure comfort, or appeal? Coming up with the proper collection of properties while remaining within the economic envelope of cost and time has always been a challenge, and more of an art than an engineering science. This too is an area that does not yield to rationalism.
If we had rational specifications for the system to be built, we would also have what is needed to demonstrate at least partial compliance with those specifications. But design is often recursive and exploratory, not a linear process that begins with a complete specification. If a specification for the complete system is ever created, it grows along with the design of the solution.
RATIONALISM IN HUMAN COMPUTER INTERFACE
Individuals are not rational creatures. Predicting how attractive any individual will find a particular automobile is difficult, although statistical techniques may make it possible to predict aggregate answers to that question. Since we left the glass CRT behind and embraced bit-mapped graphics, we have created a progressive series of interface devices ever richer in their visual communication ability, along with the other senses. It is not yet possible to see how far the engineering of human-computer interface design will go. But one thing has become clear: the average computer science student is poorly prepared to create a usable interface without training. And the challenges of creating a linear method for going from the system design to the human interface are far greater. It requires knowledge of the human perceptual system and of the principles of good visual communication, as well as an ability to understand the human context in which the system will be used.
RATIONALISM IN DESIGN REALIZATION
We want to believe that a software engineer is so familiar with the language and the idioms of coding that, once the specification is given, they proceed in some linear way to write the lines of code that will result in that system. We refute this by analogy to two similar tasks: non-fiction writing and theorem proving.
Some writing instructors say that writing is thinking. It is not desirable for a program specification to be so reduced as to leave no room for variation. Such a specification is often difficult to work with, since it hides the higher-level knowledge that informs the software engineer how the module functions within the larger system, information needed to guide the decisions made while coding. Coding proceeds until some question of design arises that the software engineer does not have enough information to answer. The engineer must research and consider the choices and then proceed. This breaks the linear rationalistic conceit.
RATIONALISM IN THE VERIFICATION AND VALIDATION OF THE SYSTEM
As we observed, it is rare to have a specification for the complete system. That is the ideal state of a project that applies formal methods, and when it is achieved, it is possible to methodically prove compliance with some or all of the design even before it is complete. Without that specification, the best that can be achieved is a methodical process of module, system, and acceptance testing, which attempts to present evidence that the system being designed and delivered will perform acceptably. This also recognizes the reality of clients who will often accept less than total perfection, something economists call "satisficing."
Wednesday, May 29, 2019
Tuesday, May 21, 2019
Leibniz, Locke and Wittgenstein: Rationalism versus Empiricism and the Language Game Synthesis
FSE2016 Panel: The State of Software Engineering Research
Prem 3 Lines of Research - Leibniz, Locke, Wittgenstein
https://www.youtube.com/watch?v=sE_jX92jJr8&feature=youtu.be&t=4
28:00-34:00
Rationalism epitomized by Gottfried Leibniz
We locate defects by proving a program can fail
Empiricism epitomized by John Locke
We locate defects by finding patterns of human error
Synthesis
Empirical evaluation of static versus dynamic typing
Cost-effectiveness of dynamic analysis (fault localization)
Ordering static analysis warnings by Naturalness
Exploring statistical language models for type inference
Wittgenstein might question the semantics. What is a defect? A server? And so on. This raises the question of the language game. A language game recognizes zones of agreement.
Prior mentions include Goguen, Zave, Osterweil, Shaw, Garlan, Fielding, Taylor, Medvidovic, Qualitative work
Daniel Jackson, The "broken" concepts & rules of Git, 2016
The conceptual blending of "developer" and "operator" in DevOps
The architecture of IoT
Formulations of Security Agents, Policies, Principles
Are We Neglecting our Concepts?
Empirical and Formal are OK
"Language Game" areas: Architecture, Requirements, Process? Not so much.
Two risks: Fixations on tired old ideas, and missing out on radical new innovations
References:
Joseph Goguen
The Denial of Error, Software Development and Reality Construction, pp 193-202
Four Pieces on Error, Truth and Reality, 1990 accessible at https://www.cs.ox.ac.uk/files/3412/PRG89.pdf
Value-Driven Design with Algebraic Semiotics, draft book with Fox Harrell, 2006, accessible at https://cseweb.ucsd.edu/~goguen/courses/271/book/uibko.pdf
Pamela Zave,
The operational versus the conventional approach to software development, CACM, Feb 1984, Vol 27 Iss 2, pp 104-118
Four dark corners of requirements engineering, Jan 1997, ACM TOSEM vol 6 iss 1, pp 1-30
Leon Osterweil
Software Processes are Software Too, in Engineering of Software: The Continuing Contributions of Leon J. Osterweil, ed. Tarr & Wolf, pp 323-344; ICSE 9; also in ICSE '97, Proceedings of the 19th International Conference on Software Engineering, pp 540-548
Monday, May 13, 2019
The Semantics of "Bug" in Empirical Software Engineering
In my graduate studies I regularly read papers and hear people talk about software bugs. But I often find that the word bug takes on a very specific meaning in each work. This short essay voices some of my concerns about the implications of this for serious software engineering research.
Perhaps the best comment about the meaning of the word bug comes from Dijkstra, who famously railed against its use at all. The term apocryphally came into use because of the physical nature of the post-war computers, whose relays were fouled by insects. It stuck as a way of talking about something that causes an algorithm to behave in an unintended way. But it has the unfortunate connotation of a logic error that is deus in machina, something that was not an error in thought but something that could not be foreseen. It provides far too large a loophole for algorithm designers and distances them from their own lack of rigor.
In contemporary research the most common definition of a bug is a reported defect or error in some bug tracking system. This can be almost anything, depending upon the context of the system and the human process by which those reports are created. It can include such obvious failures as an abnormal end or crash that causes uncontrolled termination by the operating system, or potentially even a crash of the operating system itself. It could be some unquantified quality that is missing, such as too low an MTTF because it is difficult to find the source of the errors. It could be some expected behavior of the system that was not met but on reflection was always an expectation of the client that commissioned the system's creation. But it can also include defects that exist only because of a change in the environment or requirements not anticipated at the time of the original creation. These should never be judged as all being bugs of the same category, but the practicalities of empirical software engineering force them to become equal. I personally find it difficult to do any meta-analysis of the research because of this, and I doubt the results of individual papers that do not properly consider the heterogeneity of the data sources used. Can we find some agreement on the semantics of the word, rather than allow it to mean any observed deviation from what the end user expects? I think the answer is clearly yes. But it opens a Pandora's box of semantics, as it forces us to deal with design and specification, not merely the implementation.
A formal definition of bug would probably be some observable deviation of behavior from what was specified. But outside the realm of formal methods there are almost never well-formed specifications that can support this metric. For functional specifications such metrics are infrequent enough, but for the former "non-functional requirements" (software qualities) they are even rarer outside a few domains such as military or aerospace. Even where they exist, I believe the will to keep records relating a particular code change to the quality specifications within its scope is lacking in most commercial products.
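For illustration only, here is what that formal definition looks like when the rare luxury of an executable specification is available. The sorting example and all names are invented:

```python
# A bug defined formally: observed behavior that deviates from the
# specification. The spec here is an executable predicate, a luxury
# that is rarely available outside formal methods.

def spec_sorted_permutation(inp: list, out: list) -> bool:
    """Specification for a sort: output is a sorted permutation of input."""
    is_permutation = sorted(inp) == sorted(out)
    is_sorted = all(a <= b for a, b in zip(out, out[1:]))
    return is_permutation and is_sorted

def buggy_sort(xs: list) -> list:
    # Deviates from the spec: converting to a set drops duplicates.
    return sorted(set(xs))

observed = buggy_sort([3, 1, 3, 2])
is_bug = not spec_sorted_permutation([3, 1, 3, 2], observed)
```

With such a predicate in hand, "bug" is unambiguous; without it, as the paragraph above notes, the word collapses into whatever a bug tracker happens to contain.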
One technique that offers a different approach is Test Driven Development (TDD), which packages a code unit with a series of tests that become part of the automated regression testing. This addresses the lack of an easily identified specification to match to the code unit. But I have never heard of a regression testing system that includes the more difficult tests such as capacity testing or performance testing. And of course testing for usability or maintainability is so difficult that I doubt anyone tries to include it in regression testing. So while TDD offers some promise, it is far from providing a framework for the kind of empirical study that the industry could benefit from.
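A miniature sketch of the TDD idea, with the function and its tests invented for illustration: the tests travel with the unit and act as its de facto specification, and the sketch also shows what such a suite cannot easily express.

```python
# TDD in miniature: the tests packaged with a code unit act as its
# specification and join the automated regression suite on every
# change. All names are illustrative. Note what the suite checks,
# and what it cannot: capacity, performance, usability, maintainability.

def normalize_whitespace(text: str) -> str:
    """Collapse runs of whitespace into single spaces and trim the ends."""
    return " ".join(text.split())

def test_collapses_runs():
    assert normalize_whitespace("a   b\t\nc") == "a b c"

def test_trims_ends():
    assert normalize_whitespace("  hello  ") == "hello"

def test_empty_input():
    assert normalize_whitespace("") == ""

# These functional checks run automatically on every change. A quality
# budget such as "under 1 ms for 1 MB of text" has no equally
# convenient home in this style of suite.
test_collapses_runs()
test_trims_ends()
test_empty_input()
```

The unit-plus-tests bundle is exactly the "easily identified specification" the paragraph above describes, and its silence on the harder qualities is exactly the limitation.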
In my conversations with Silicon Valley software engineers I have noticed a distinct trend away from anything that even resembles the old waterfall methodologies. TDD may be the last vestige of a process that requires thought about how to articulate the desired behavior. This has been spurred by the strong reaction against waterfall methodologies that consulting practitioners brought to the industry. I have not yet read anything that contrasts the two approaches to system creation in the terms I use, so let me explain the contrast I see.
Waterfall was big design up front. It required deep analysis and frequently caused paralysis by analysis, as everyone became afraid of making an error. Agile broke that logjam by insisting on fast turnaround and the delivery of function in small quanta. It encouraged an organic approach where a prototype was continually reworked and accreted function until the gap between the desired system and the delivered prototype "satisficed" and the system was accepted into production.
But the Agile process has an inherent defect: it discourages deep thought about global issues of system design. For small systems, or systems that do not require integration later, this is perfectly acceptable. However, many large systems that will accrete their functionality over a long period of time in a partly unknown domain will not lend themselves to the kind of immediate insight that allows prescient decision making from the first prototype. Some qualities of a system are evident only in the whole and cannot be found in the constituent parts. These can be emergent, or they can be qualities that are impaired by the failure of one component. We use the term "software architecture" when we try to discuss design issues of the large ensemble of components that comprise a system. A failure to properly appreciate the interconnectedness of these components at the beginning can lead to some very painful refactoring later in the project. It is this trend toward needed refactoring, coupled with management's reluctance to acknowledge and fund the refactoring by denying the technical debt the system has accumulated, that causes projects to founder. Software engineers will call this a management failure, but management will call it an engineering failure, as the incremental functionality and the cost of implementing that functionality diverge because of the refactoring. At its most dysfunctional, an existing system will be abandoned and rebuilt, while a project not yet implemented may be canceled.
So I argue that casual use of the term "bug" can be indicative of a software engineer who lacks the professional maturity that comes after many years of watching these forces play out on real projects. I have a tendency to extrapolate from my experience to see this in current software engineering practice. But I hear enough stories to convince myself that no one has found a magic bullet that lets a software engineer use a process in any way analogous to those used in other engineering fields. The reasons for this are interesting but beside the point; we must accept that real software quality cannot be attained when the only focus on the delivered software product is limited to the most easily or most badly quantified measures, which sweep the subtlety of software defects away in an effort to use an existing dataset and avoid the time and expense of data gathering.