Tuesday, February 28, 2012

Locus of Control in Open Source Software

At last Friday's meeting, we had a brief discussion of what an open source project gives up by not having a project manager. My mind fills with thoughts on this question, and I need to capture them since I intend to dig into it at some point.

For me, a project manager, like any manager, certainly provides some planning and coordination for the team members. But I think the more important contribution is the leadership she provides. The question has a built-in assumption that open source projects do not have project managers. I will comment on this before I consider other aspects of the question.

Any human group will have a leader. As social animals, we will spontaneously put someone into that role, or the group is likely never to be productive. The old chestnut from sociology is forming, storming, norming, and performing (with mourning a newer addition). The group forms around some shared vision, cause, or need. Often, but not always, one of the chief founders will remain as the leader. Sometimes the role is shared or passes between people. This raises new research questions. In an open source software development effort, how can the leader be recognized? Will the topology of communications reveal this relationship, or must more sophisticated analysis be performed? Does the power to make decisions flow to the leader? You might assume it does, but it would be interesting to verify that with research.

In the discussion, my primary reaction was that open source software development is different and limited in some way. My comment was that for open source the consumers of the product and the producers of the product are the same. Others disagreed and suggested that it was an overly broad generalization. I can see their point but I still maintain that there is an inherent difference that pervades these projects.

Most open source projects are for products that only an engineer can love. What end user cares whether their web site is hosted on an Apache server, or cares about Eclipse? Most of these projects focus on products used by developers or IT. The one exception that comes to mind, perhaps, is Linux. But even there, its halting acceptance as a consumer platform reinforces my belief that it is only slightly more sophisticated than a roll-your-own OS that engineers love simply because it is not controlled by a mega-corporation and lets them play with the insides in a way that other OSs will not. This appeal is unique to engineers who can appreciate what this openness brings and have the skills to make those tweaks. In short, I believe that open source projects depend upon a subject matter expertise that limits the products that are developed.

One big advantage of this self-selection is the depth of knowledge that can accumulate in an open-source project. It's as if you were a four-star chef who had to eat at a diner. But instead of being forced to eat what the cook was serving, you were suddenly given kitchen privileges and started making gourmet meals for yourself and your friends. You know what you want; trying to explain that to someone who does not know it themselves, across the barrier of required skills, is too frustrating. Open source eliminates those barriers.

If I investigated the development of an open source project from its inception, I would expect to see some evidence of storming, the inevitable sorting out of mission and purpose that is needed to motivate everyone. While the original vision might be sufficient to attract one or more people into the project, the mission will continue to be elaborated as the project grows. Like Christianity splintering into innumerable denominations, open source projects can potentially fork whenever there is a disagreement about purpose or some other divisive issue like trade-offs or design. What issues precipitate forks? Is there evidence in the project logs that can indicate an increased chance of a fork? How successful are forks? How often do both branches continue long term?

The discussion on Friday touched on the success of an open source project and what it means for an open source project to be successful. Several of my friends immediately took a commercially oriented outlook, focusing on corporate sponsorship, downloads, or some other economic metric. But in our discussion, my colleague suggested that some open source projects are motivated by non-economic interests, such as the exploration of a new design or product idea that has no obvious economic benefit. I must allow that people can be motivated to participate in an academic project that inspires the sense of exploration. After all, we have had explorers throughout time, and the world of cyberspace is certainly rich with unexplored areas.

Young, talented engineers are not necessarily any less altruistic than the people who join the Peace Corps. But unlike engineers who volunteer to install a drinking water system in an African village, a software engineer can make a contribution from the comfort of her easy chair. It lacks the direct reward of seeing the happy people who benefit. Yet there must be intrinsic rewards to the work. Economic theory suggests that if there were no reward for the work, it would not happen.

We briefly touched on intrinsic rewards on Friday, but I would be interested to understand the personal rewards gained by these engineers. It is easy to make up stories that may, or may not, be true. Certainly the Hollywood version is some Cheetos-eating, game-playing, socially awkward post-adolescent with formidable coding skills who is gratified by the online attention his (and let's face it, it's almost always a him) work garners. Among his peer group, he revels in being the best. Competition is fierce among some coders to be the best at what they do. It is reasonable to assume that this is one source of gratification for their work. But would questionnaires support this? How many contributors are supported by corporations like IBM? What influence does this economic support have on the motivations of the other project team members? Does it change the dynamics? Are some of these people working on the project to get hired? Are they doing it to develop new skills? To be mentored by experienced coders?

During norming, the project team must decide pragmatic matters such as how decisions will be made and the nature of the license, or lack thereof, for the final product. How will new members be admitted? What artifacts beyond the code must be produced? These concerns may already have been addressed, but if not, they will surface once the work begins.

My going-in position is that guidance and leadership emerge spontaneously within the group. If the expertise is not there in the beginning, someone will grow into the role, making mistakes as they learn. Someone with prior experience on an open source project may be the natural leader.

If there is to be any vital difference between manager-led groups and self-organizing groups, I believe it will be in the area of coercing team members to take on tasks they are not intrinsically drawn to. I believe I heard D call this out as an issue. Staff engineers are hired to develop systems for someone else, not for themselves. This work for pay requires management to adjust performance and reward to ensure that sufficient labor of the correct kinds is available and properly motivated to do the work. As I suggested above, there is a type of product that seems to naturally gain open source support. If I am right, there are other products that are unlikely to gain this kind of spontaneous support. How could these products be categorized? This seems to be a continuum with many models of organization and support available. Can any conclusions be drawn regarding the type, or level, of support and the type of product? Am I correct when I assert that tools gain the most open source support, with consumer products generally garnering less?

Finally, there is the fact that companies can find ways to contribute to open source projects. Software may, or may not, bring a competitive advantage to a company. Most of the software companies use is commodity software that lends no competitive advantage. A company that insists on having custom software for all automated functions should find itself at a competitive disadvantage, as the costs of these custom systems will exceed commodity software over the long term. This suggests that cooperative efforts, such as open-source software, can lead to lower-cost software. The company could still customize some features off the main trunk of development by lobbying for the sorts of hooks and interfaces that enable their customizations. Having staff members contribute to the project ensures they get the software they want and maintain the skills needed to customize the product for any unique needs they have.

I used the term locus of control in the title. This is a reference to a psychological theory about whether a person feels they control their destiny or feels controlled by someone, or something, else. A commonality I sense in this discussion of open source software is the internal locus of control contributors feel. The employee/employer relationship certainly has its good and bad sides. But it rarely instills an internal locus of control. In the end, perhaps this is the trait that most powerfully motivates the team.



Monday, February 20, 2012

The Value of Code Reading versus Testing

This morning's reading is a paper from Basili, originally published in 1997, on the value of code reading as a quality improvement technique. It is gratifying to see him make some points that I have felt for a while. But my immediate reaction is how it informs my pedagogy.

Of all the things we teach computer science students, the hardest and yet most fundamental is computational thinking. At its most basic, we expect our students to read code and be able to execute it by hand. It occurs to me that the most important computer we build is the one in their heads. Their ability to submit programs to this mental computer improves throughout their careers. I myself find little value in executing student programs for the lower-level classes, simply because I catch far more logic errors by reading their code than I would by testing within the limited time I can give each program. It is simply more efficient. My internal computer is good.

But the form of reasoning I do when reading these programs is not strictly computational. It is similar to formal methods in that I am proving program correctness as I read, not submitting test data in my head. The ability develops naturally for someone who works with code over a long period, and it develops without much formal training. It is the ability I believe Basili is exploiting in his studies, and one that is far more powerful than testing can ever be. But if a student never learns how to "play computer", this mode of thought is shut off. I believe this is one of the most basic skills we teach.
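A concrete example of the kind of "playing computer" I expect might look like the following. This is a hypothetical exercise of my own devising, not one of Basili's materials: the student must trace the loop by hand and predict the printed value.

```python
# A hypothetical exam-style exercise: predict the output without running the code.
def mystery(n):
    """What does this return for n = 472? Trace it by hand."""
    total = 0
    while n > 0:
        total += n % 10   # peel off the rightmost digit
        n //= 10          # discard the rightmost digit
    return total

print(mystery(472))  # a careful hand-trace gives 2 + 7 + 4 = 13
```

A student whose mental computer works will see the digit-sum without ever touching a keyboard; a student who cannot play computer can only guess.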

"We differentiate a technique from a method, from a life cycle model. A technique is the most primitive. It is an algorithm, a series of steps producing the desired effect, and requires skill. A method is a management procedure for applying techniques, organized by a set of rules stating how and when to apply and when to stop applying the technique (entry and exit criteria), when the technique is appropriate, and how to evaluate it. We will define a technology as a collection of techniques and methods. A life cycle model is a set of methods that covers the entire life cycle of a software product." [1]

from the 1987 study...
"The results were that code reading found more faults than functional testing, and functional testing found more faults than structural testing. Also, code reading found more faults per unit of time spent than either of the other two techniques. Different techniques seemed to be more effective for different classes of faults. For example, code reading was more effective for interface faults and functional testing more effective for control flow faults."[1]
...
"Based upon this study, reading was implemented as part of the ... development process. However, much to our surprise, reading appeared to have very little effect on reducing defects." (op. cit.)

Another purpose this article serves for me is to remind me of the insights gained from my master's project. The project proved to be a lesson in reverse engineering in the end. My contribution was ultimately the elaboration of the process followed when an existing system is taken as the input for a re-engineering project and the existing architecture is either not understood or must be significantly changed to support the new requirements. I don't find the results either surprising or inspired. Yet I am not aware of the steps of this process being documented anywhere else.

What Basili discusses in this paper under the step-wise abstraction technique is no different from what I needed to do for my project, albeit at a different level of abstraction. For Bass et al. [2], the purpose of the architecture documentation is to provide exactly the kind of abstraction these readers must construct. I would presume that for Basili's study, even if this design level existed, it would not be shared with the readers. Perhaps it would be an interesting study to have two groups of readers: one who read without the design and one who read with the design available. I feel that, in the end, I was reading the code base in a step-wise abstraction way to recover the architecture design I needed for this system to support my vision for the product.
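As a sketch of what step-wise abstraction looks like in the small (a toy example of my own, not drawn from [1]), the reader works bottom-up, replacing each block of code with a prose abstraction until a specification for the whole unit emerges:

```python
# Toy illustration of step-wise abstraction (example mine, not from Basili [1]).
# The comments record the abstractions a reader derives while working bottom-up.
def process(values):
    result = []
    for v in values:
        if v >= 0:                # abstraction: keep only the non-negative inputs
            result.append(v * v)  # abstraction: square each value that is kept
    return result
    # final abstraction, i.e. the recovered specification:
    # "return the squares of the non-negative inputs, in their original order"

print(process([3, -1, 2]))  # the hand-derived spec predicts [9, 4]
```

Recovering an architecture from a code base is this same move repeated at a much coarser grain: modules abstract to responsibilities, responsibilities to a design.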

For my students, I have a renewed appreciation of what they must learn. When introducing programming, I am going to place greater emphasis on techniques for playing computer than I have in the past, with the intent that when I give them simple programs on a test, their scores for predicting the result of execution will improve. After all, the computer is a machine. Their ability to anticipate the machine's response to its input is only a higher abstraction of asking them to anticipate which way a gear will turn in a mechanism when the user turns a crank.



[1] V. Basili, "Evolving and Packaging Reading Technologies," in Foundations of Empirical Software Engineering, B. Boehm, H. D. Rombach, and M. V. Zelkowitz, Eds., Springer, 2005.

[2] L. Bass et al., Software Architecture in Practice, Addison-Wesley, 2003.

Sunday, February 19, 2012

Autonomic Computing

http://www.research.ibm.com/autonomic/index.html

Oh great, something else to read about...

Thinking outside the box


Today's rant was inspired by an article in the most recent CACM, "Wanton Acts of Debuggery" (Communications of the ACM, February 2012). But the article really just offered me a good jumping-off point. It discusses the habit of some coders of writing really inane messages for debugging or exception conditions. Of course I agree. But I don't think the author gets to the deeper and ultimately more interesting issue. We are so focused on the primary functionality of the module that we are only peripherally aware of the number of important relationships that exist between the module and potentially many other modules in the system.

Below is a brain teaser I first saw more years ago than I'll ever admit. It was used in an innovative series of ads for a company I can no longer remember. The point of the brain teaser is to connect the dots with four straight lines without lifting your pencil from the paper. I'll give the answer later. But join me again below the image.

[Image: the nine-dot brain teaser, a 3×3 grid of dots]
There are nine obvious rectangles that can be formed in this image and one more that is arguable. This brings me to where my mind went and why these two things are related. 

A point that Bass et al. make in Software Architecture in Practice is that UML diagrams often encourage us to downplay the relationships between modules. The single line typically used suggests that the relationship between two modules is somehow simpler than the modules themselves. This may once have been true, but it certainly isn't in more complex systems. The relationships need far more attention than a simple line can convey.

The CACM article focuses on the exception messages from a module, which are directed at the programmer who will use them to diagnose the fault. However, in a fault-tolerant environment, these messages must be received by some module other than the one to which the transformed data is directed. Control and data travel different paths.

A quality that must be exhibited by some modules is testability. Instrumented code creates another stream of messages, distinct from both the control messages and the data messages. None of these channels need be one-way, nor need they implement simple protocols. The relationships between modules, and the interaction of these simultaneously communicating processes, make it challenging to predict all possible states.
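To make the point concrete, here is a minimal sketch (names and structure are mine, not from the CACM article) of a module whose data, control/exception, and instrumentation messages travel on three distinct channels to three different consumers:

```python
import queue

# Three outbound channels with three different receivers (a sketch, not a design).
data_out = queue.Queue()       # transformed data, bound for the next processing module
control_out = queue.Queue()    # fault/exception messages, bound for a supervisor
telemetry_out = queue.Queue()  # instrumentation, bound for a test/monitoring sink

def transform(record):
    telemetry_out.put(("trace", "transform called", repr(record)))
    try:
        data_out.put(record + 1)              # the data path
    except TypeError as exc:
        control_out.put(("fault", str(exc)))  # the control path: a different receiver

transform(41)      # data_out receives 42; telemetry_out receives a trace record
transform("oops")  # control_out receives a fault; the data path stays silent
```

Even in this toy, the single line a UML diagram would draw between two boxes hides three protocols with three audiences.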

The point of the brain teaser above was the necessity of allowing the lines to extend beyond the box implied by the outermost points. The arguable tenth box in the puzzle is the one enclosing the white space that surrounds the points. I think the implied box, which is easily forgotten, represents the broader system requirements. And I think the complexity and importance of the relationships between these modules is overshadowed by the processing modules themselves.

Saturday, February 18, 2012

Google, Google, Google

No, this post is not about Google. My metrics had a blip when I posted about Google research so this is simply a red herring to see if it was mention of Google that made them bounce. However I am going to capture a thought I had after viewing the Bret Victor talk.

The most arresting part of the talk is the demonstration of the symbolic execution for the graphics code. There is no doubt in my mind that this is a very desirable direction in code creation and maintenance. We have virtually free machine cycles available to us, and gamers have benefited for many years now. It is time the software engineering community admitted that the traditional way of executing this code in our heads and writing it in an arcane symbolic language limits what people can create using the machines. Will this be how all code is written in the future? I highly doubt it. But if a significant portion of day-to-day coding can be moved over to this style, it should save many work-years of effort that can be used elsewhere.

The talk also makes me mindful of my going-in position regarding software engineering. One of the things that humans do exceptionally well is recognize patterns. As Gladwell observes, our brains are hardwired to recognize some patterns in a fraction of a second. The way this is done obviously does not use symbolic representation or the other high-level functions of our mind but instead uses something much deeper and closer to our lizard brain. Unless and until we can duplicate this ability with our machines, it makes sense to use it to the extent possible. More visualization of code is one way this can be done.

This is not a novel idea. Fred Brooks advocated for IA instead of AI: Intelligence Amplification, not Artificial Intelligence. This tends to push us toward a more cybernetic view, with our machines acting as means of extending the inherent abilities of humans and not as autonomous agents. To do this most effectively means understanding where we can best extend the natural abilities of the brain.

Our path until now has been to treat the human as the adaptable member of the man/machine pairing. The inherent adaptability of the human has been exploited by machines for centuries. If humans were really in control, you would expect the machines to adapt to the human, not the reverse. I am well aware of the economic issues that have fed this trend. However, it is a trend that will eventually reach some limit beyond which a human being cannot adapt to the machine, and progress will cease. There must be a parallel trend in which the machines adapt to us.

It is not as if there is no indication that this occurs. Clearly our user interface design has turned a corner and is now taken seriously as a design issue. For all of their faults, Apple has demonstrated that design matters. I personally believe that design will be THE single biggest determinant of market success in the next century. Design is pure intellectual property and an endeavor beyond the ability of machine intelligence (imho) and will therefore be where companies can best achieve relative advantage in the marketplace. Creativity is a very human trait and what we seek to do in creating software engineering tools is to harness that creativity and aid it as much as we can.

So Google seems to be on a good track to exploit machine intelligence in their products. But for me, I am interested in figuring out how we can best exploit human creativity.

Friday, February 17, 2012

Bret Victor - Inventing on Principle

Prem recommended this talk (http://vimeo.com/36579366) by Bret Victor. The way code is so easily animated is nothing short of amazing, even though things have been slowly moving that way for many years. Here are my comments about the talk, with their time marks.





[2:10] 
Plus ça change, plus c'est la même chose... (the more things change, the more they stay the same)
"2.1 The Bauhaus experience
The Bauhaus has generally been considered as the leading modern art school to implement new methods of teaching aimed at encouraging creativity and development of personal abilities, with much emphasis laid on practical work in workshops. One of the key courses which left an indelible memory in many of the Bauhaus students undoubtedly was the "preliminary course" (Vorkurs) taught by Josef Albers. This course has been the source and starting point for many subsequent courses in basic design throughout the world. A key point, in the pedagogy of form developed in it, was the research of the relationship between form and material by experimenting with different workshop materials." (http://riunet.upv.es/bitstream/handle/10251/8835/Full%20paper%20J.M.%20Songel.pdf)

[7:54] that is cool

[16:51] animation of binary search

[23:10] Am I a voice in the wilderness when I say that software engineering is about a whole lot more than just coding??

[26:32] Tufte-ism

[27:57] Back to Bauhaus

[36:00] When he talks about how dreadful our current tools are, he is preaching to the choir. Why are UML diagrams so rarely animated by the code they are supposed to represent? Many years ago Visio was doing good work with this but the Visual Studio product has barely moved beyond the most rudimentary capabilities. 

[36:30] OK, this is where he becomes a Zoroastrian...

[38:00] He starts proselytizing about how it is important to work for an idea or principle

[41:55] He claims that Tesler's principle of modelessness is his raison d'être. Yet I can't help feeling that if I were to discover something as fundamental as the observation that modes are a barrier to usability, I might be inclined to push it as far as it goes too. Basili gave us GQM. I don't want to take anything away from either man, but isn't it a bit hyperbolic to mythologize their one big idea? Can't he see that at some point it is no longer a springboard but a brick wall to further progress?

[43:13] A paradigm shift, not a crusade against wrong. I agree there is a qualitative difference here: compared with the tinkering that represents the vast majority of technical innovation, Tesler's contribution was far more inspired and more fundamental to the way we think. Yet I am not buying it as some metaphysical truth.

[47:00] I do not disagree that the people he cites are highly motivated and answer to some higher calling than making money, getting papers published or any venal concern. What I believe motivates them is akin to a religious calling, a motivation that transcends the mechanics of what they are on about. Religious zeal has been highly motivating over human history and led to great things in art and architecture. I'll even grant that the motivation for GNU and the open source movement has a religious zeal that cannot be explained in purely rational terms. But I think he overstates his thesis. For me at least, I see this as the motivation that comes from the joy of creating. I see it as the same force that makes artists create. The difference is that for us we have software as a medium and not paint, steel, or society.

[50:00] This part of the talk sounds more genuine and meaningful, at least to me. What I find interesting is how he glosses over the pragmatics of pursuing self-actualization.

[51:00] It must be actionable. YES!!

Here is a link to his web page http://worrydream.com/
There are some interesting ideas there visually presented.



Thursday, February 16, 2012

The reason I continue to troll the web rather than limit myself to academic papers is that sometimes I find something that captures a simple truth in a few words. This link leads to an article that appeared in eWeek. The simple message is that software testing, while important, is growing less important to achieving software quality as systems keep growing in size. At the risk of stating the obvious, you cannot tack quality onto a completed product.


http://www.eweek.com/index2.php?option=content&do_pdf=1&id=62126

Defining Software Quality

"Software engineering research results that I don't like often suffer from addressing an uninteresting problem or from trying to get people to do things computers are better at, and vice versa." 
— David Notkin, University of Washington


While I know of more scholarly sources for an academic treatment of software quality, I enjoy trolling the web since I am more inclined to find the unexpected there. Today I happened to find the site http://www.robelle.com/library/papers/quality/. It is not a remarkable page, but it does repeat some of my prior observations:

  • quality is measured by a human's perceptions
  • correctness as measured against the specification is not the same as quality
  • different people or the same person at different times will evaluate the quality differently
  • quality can only be judged in context
The prior post does a good job of summarizing some key views of quality and helped me make the point that each stakeholder will evaluate quality differently. This page helps me make the point that there are nested systems, and quality is evaluated at each system boundary by the stakeholder(s) at that boundary. Since this last point is a bit abstract, let me give an example of what I mean.

In a large organization, the person usually called the "user" is the one who directly interacts with the routine transactions of the business system in which the software product is embedded. For a customer relationship management tool, it is the sales and marketing people who frequently enter pipeline and contact information. Since their jobs use the mechanized system as a tool, their roll-up of quality attributes will concern the use of this tool in the context of their jobs. The goal is to close as many deals as possible, as quickly as possible. Their key performance indicator will be something like booked revenue per quarter. Most likely, the majority of the metrics that eventually support that roll-up will relate to how the product enhances or hurts this productivity.

But in the modern organization, there are other users as well. The entire sales and marketing organizational hierarchy will also use this product as they review the collective pipeline for the organization and use the results of this consolidation for financial planning. Their task is not the direct booking of the contracts but the management information reporting and assimilation of the information into the broader context of the business. Whether this is viewed as one additional layer of user, or several, makes no difference once you see that these two important groups of stakeholders view the product with different expectations and interests.

One of the temporal considerations in quality evaluation occurs during the sales cycle. While the end user's evaluation of the product would logically be considered, logistics, economics, and politics often prevent it. Product purchase decisions are usually made by a different set of stakeholders, often the most senior managers the product sales staff can engage and the company's sourcing and purchasing professionals. Their evaluation of the product's quality stems from many sources that have no guaranteed bearing on what the end users will assess once the product is in production. I think this topic could easily become an entire paper, but not a software engineering paper.
 
In this discussion I could have substituted value for quality, and value is certainly one of the key filters someone will use in determining quality. This is quite evident in cars. Clearly a Honda Civic will measure very differently than a Porsche Boxster. Yet many Civic customers feel the Civic is a good quality product for the price. In a reverse fashion, Jaguar once had a dependability problem with their cars. Everyone agreed that it was a high quality product when it ran. But the frequent trips for repair diminished their reputation, forcing them to address the dependability issue. No doubt all automakers needed to face that issue as the marketplace improved, raising the bar for everyone.

But quality can transcend mere value. While the dictionary definitions of quality focus more on the philosophical or aesthetic sense of the word, even the prosaic world of business cannot completely avoid this sense of the word. I think Apple provides a good example of what I intend here. Before Apple became dominant, I heard many product managers moan that computer hardware had become a race to the bottom to see who could make it the cheapest, and they could not see any value in investing in user interface research. But Apple stubbornly stuck to their vision and eventually proved that the marketplace will reward superior design.

My friends in college were design students and I listened to endless pompous discussions of aesthetics and beauty. They frequently chose products that were beautiful and represented a refined sense of taste. However they often chose aesthetics over function despite the Bauhaus tradition of the school. These people were partly frustrated artists who wanted to make beautiful things but had not completely absorbed some of the Bauhaus teaching.

Meanwhile, I was taking classes in the engineering school with students who were only mildly put off when they realized their socks did not match. Pocket protectors were the norm since pens still leaked back then, and engineers can be very cheap when it comes to replacing their short-sleeved white shirts. It was their aesthetic that brought us inscrutable black boxes that would blink 12:00 for eternity, since figuring out how to set the time was almost impossible even with a manual. Of course, other engineers could set it in a heartbeat, but in the marketplace the majority of buyers were not engineers, and people tolerated these machines only because they provided value despite their flaws.

These two very different aesthetics existed side by side and never realized any synergy. Sadly, I don't feel we have gotten much past that, and we are leaving it to the marketplace to slowly sort out how people quantify the various qualities of a complex product like an iPhone. While there is a great deal of elegance in the market-driven approach, it gives product designers little to base their decisions on, absent some model that predicts market success from a set of attributes, most commonly functions and features. Yet how would you go about quantifying design aesthetics so it could be included in a quality calculation?

In the end I don't find anything on this page more hopeful for defining quality than what I have already found in Bass et al's Software Architecture in Practice. Their approach starts with some feature or function expressed as a use case. It then attaches additional information to that use case which supports a defined metric for the behavior of interest. I'll continue my web crawl, but I suspect I may not find anything to extend their concepts of quality decomposition until I mine the academic literature.
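As I read the Bass et al approach, the unit of analysis looks roughly like the following sketch. The field names are my own paraphrase of their idea (a use case plus enough context to support a defined metric), not SEI's exact template:

```python
from dataclasses import dataclass

@dataclass
class QualityScenario:
    """A quality requirement: a use case plus a defined, testable metric."""
    use_case: str          # the feature or function of interest
    stimulus: str          # what triggers the behavior of interest
    environment: str       # the conditions under which it occurs
    response_measure: str  # the metric that makes the requirement testable

# A hypothetical example, not drawn from the book:
latency = QualityScenario(
    use_case="submit order",
    stimulus="1,000 concurrent users submit orders",
    environment="normal operation, warm caches",
    response_measure="95th-percentile response time under 2 seconds",
)
print(latency.response_measure)
```

The point of the structure is that the taxonomy label matters far less than the response measure, which is what makes the requirement checkable.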

On Possibility and Choice

An acquaintance of mine is pursuing a PhD in the Information Systems and Management group at the Warwick Business School at the University of Warwick. His research journal is posted at http://www.luminousgroup.net/. When I reach a blank slate while writing in my own journal, I pop over to his for a break. The overlap in our interests makes his journal interesting, and his outlook is sufficiently different that I can see the same thing in a different way.

Earlier this week he made a posting (http://www.luminousgroup.net/2012/02/possibility-and-choice.html) that includes a quote from Zuboff (Zuboff, S., 1988. In the Age of the Smart Machine: The Future of Work and Power, Basic Books, pp. 387-388). The quote pertained to the kaleidoscopic way in which our machines present the world to us, but how in the end we are still faced with choices. I commented that the quote brought the cybernetic view to mind: the machines enhance or augment our perceptual system and change us in subtle ways. He responded that he agrees with the augmented perception, "but I would question whether it has had the effect on the general population of extending mind." I can understand his point of view, but as regards how humanity has adapted, I ultimately disagree. I am choosing to expand on my idea here rather than continuing the discussion in comments to the original post.

I am currently teaching an intro to CS. Since the class does not articulate to any CS class and is nominally intended for non-majors as well as majors, I take a broad sweep over the material and place a heavy emphasis on the uses of information technology over human history. One thing I discovered while gathering material for the presentation on the abacus was a YouTube video of Japanese students preparing for competition. I expected them to be fast with the abacus. What I had not previously known was that the best students no longer bother using the physical device; they merely manipulate virtual beads with their fingers. This is the most dramatic demonstration I can think of for the potential of the human mind to develop computational skills.

We all know that with practice we can add long columns of numbers without mechanical aid. What these Japanese students demonstrate is how far the limits of computation can be pushed. Savants are known for their ability to multiply large numbers with incomprehensible speed. All of this makes the current state of computation skills at the college entrance level shocking to someone of an earlier, pre-computer generation. An extreme case was one student I was tutoring who could not give an answer when I asked him to multiply a number by 10. Students are learning computation differently now.

I have been careful to call it computation and not math. While computation is certainly needed at some point when learning math, the inexpensive electronic calculator has made advanced human computation skills irrelevant to society. But what is also evident today is how innumerate society seems to be at times. Take the ongoing Washington debate over the national debt. Millions, billions, and trillions are sprinkled throughout conversation, and it is not uncommon to hear someone confuse them. Yet each of these figures differs from the next by three orders of magnitude. For the engineering student whose best computational tool was a slide rule, keeping track of the decimal place was second nature. In the public sphere, though, the average person seems less capable of keeping track of the relative size of these numbers than they were several generations ago. Correlation is not causation, but it is not unreasonable to posit a causal connection. So I'll simply assert my belief that ubiquitous computational ability has caused our education system to neglect the old-fashioned drill on computation in favor of an increasing dependence upon mechanical computation. The change takes the form of atrophy of skills possessed by prior generations, and that atrophy is made possible by the augmentation of humans with ubiquitous computation. Perhaps not what Barton thought I was going for when I claimed an enhancement in people, but a change nonetheless.
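The three-orders-of-magnitude point is easy to make concrete with the old classroom illustration, counting one unit per second:

```python
# How long is a million, a billion, a trillion seconds? Each jump is a
# factor of 1,000 -- the difference routinely blurred in public debate.
SECONDS_PER_DAY = 86_400
SECONDS_PER_YEAR = 86_400 * 365.25

print(f"a million seconds  = {1e6 / SECONDS_PER_DAY:.1f} days")     # ~11.6 days
print(f"a billion seconds  = {1e9 / SECONDS_PER_YEAR:.1f} years")   # ~31.7 years
print(f"a trillion seconds = {1e12 / SECONDS_PER_YEAR:,.0f} years")
```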

Barton is ultimately more concerned with organizations than with individuals. Yet even there I see some profound shifts in management decision making over my lifetime, and they are largely due to the reduced cost of computation. Prior to WWII, organizations were either limited in size or limited in their ability to act in a command-and-control structure. Mid-level managers were given great latitude simply because the ability to micro-manage did not exist. But with the growth of information technology (including communications) and the reduced cost of computation, a new model of management grew out of the 1960s. I am not a fan, so you'll need to excuse the fact that I derisively call this "spreadsheet management." Economic models began first to augment, and then in some cases replace, human judgement in the organization. The link between management and the functions being managed has grown so tenuous that a CEO from one industry can become the CEO in another industry without anyone thinking there is anything odd about it. The CEO's job, after all, is to create shareholder wealth, and what does it matter what the line is doing when that wealth is a function of short-term financial decisions rather than the result of long-term strategic vision?

Given this view of contemporary management in large multi-national organizations, I see the growth of IT as having had an enormous impact on management decision making. The financial models and projections envisioned in the 60s are now easily realizable by our machines at almost no cost. Management decision making has shifted from a human-centered activity to a data-driven activity. To me this is profound and part of the source of the economic meltdown.

One of the many factors in the recent economic turmoil was the failure of risk assessment organizations to properly quantify the risk in the marketplace. Real-estate wealth drove the economy to uncharted levels, but the knowledge of how it could all come unraveled after a turn in the market was willfully ignored. A relatively small event cascaded through the economy like the proverbial butterfly wing causing a hurricane. Our machines are capable of great feats of computation, but the models underlying the meaning we attribute to those numbers are still human, as are greed and optimism.

In short, what I see as the effect of this ubiquitous computing in organizations is a reliance upon those models that are efficiently executed by the machine and a dulling of the management skills in the organization. When you are rewarded or punished because of numbers you are only nominally in control of, what incentive is there to understand the complex model that creates those numbers, or to challenge them if you do? The skills a mid-level manager needs now are more political than analytic, since it is likely her boss does not understand these models either.

Given this cynical attitude toward numbers, it is almost humorous that I am choosing to pursue the empirical route, which stresses numbers over persuasive rhetoric. But it is precisely because of this cynicism that I am choosing this path. It is not that I have lost faith in what the numbers can do for us; there is ample evidence that they can do great things. What I rail against is the loss of humanity, and the loss of what I can only describe for now as common sense, when it comes to using these numbers. As with individuals, it looks like the atrophy of skills that were more broadly developed by prior generations of managers. I see too many otherwise bright people think that the only thing that exists is what has been measured and quantified. Mind you, not what COULD be quantified, but only what has. I believe that even the universe of what could be quantified is insufficient for the best performance at a management level. Qualities like persuasion, leadership, and inspiration will never be delivered in a spreadsheet but only through direct human contact. Yet a modern manager must also be able to use numbers effectively to do these things in our world. Numbers and their appropriate use will be an important skill in management as well as in engineering. I am no longer interested in the management route, since I feel there are too many people as good as me or better who want it more. It is the engineers who will explore the quantitative world and find the techniques and models that will influence the managers of future generations. Whether those managers use the knowledge properly is not my concern. But I am motivated to ensure these workers have the best tools available to understand how machine processes can be crafted as deliberately as we manufacture hardware today.




Wednesday, February 15, 2012

I had previously ranted about a website that rubbed me the wrong way when it discussed software quality (Towards an Improved Definition of Software Quality). I apologize for being hot-headed about the topic; here I will try to take a more measured approach and begin to lay the foundation for a deeper study.

Historical Perspective
I found a paper by Boehm from 1976 that addressed the topic [1]. It is very typical of the way people historically talked about software quality.

Kitchenham & Pfleeger offer five views of quality [2]:

  • transcendental
  • user
  • manufacturing
  • product
  • value-based

To paraphrase their definitions, the transcendental view is an "I know it when I see it" kind of definition, which is not helpful for detailed analysis. The user view sees quality as fitness for purpose. The manufacturing view defines quality as conformance to specification. The product view ties quality to intrinsic characteristics of the product. The value-based view should really be called the market view of quality: high-quality products command a premium in the marketplace.

They observe that people who focus on metrics tend toward the product view of software quality. Their orientation is to measure the intrinsic qualities so as to predict something about the quality-in-use that will be observed.

An important opinion expressed in this article is that "there is little evidence that conformance to process standards is a guarantee of quality software." The components of quality in the user view are the proper mix of functionality, the non-functional behavior of the software, and external factors that affect the use of the software. They quote the ISO definition of quality as "the totality of characteristics of an entity that bear on its ability to satisfy stated and implied needs." They go on to say that Tom Gilb proposes a method for measuring the non-functional characteristics that will affect the user's composite perception of quality. The ultimate goal is to have a metric for each characteristic.

In the Kitchenham & Pfleeger article, I note that they implicitly define quality as a stakeholder perception and not some absolute measure of anything fixed and unchangeable. This impression is made explicit in the article's callout to David Garvin's "What Does Product Quality Really Mean?" (Sloan Management Review, 1984), where it is explicitly acknowledged that there are as many measures of quality as there are stakeholders. The article then goes on to introduce the various papers in this publication. What can safely be taken from this paper is that the balancing of these qualities is a shared responsibility between technical project management, product management, and senior management.

Herbsleb et al. published an article called "Software Quality and the Capability Maturity Model" [4]. It is a straightforward presentation of SEI's CMM and their justification for it as a basis for a metrics program. They take on and address the criticisms of the CMM. Probably the most difficult challenge is the charge that a CMM organization will become rigid and bureaucratic. After all, the organization will focus on the measures that are chosen, and the things that are not measured may change, often for the worse. Measuring everything creates a very data-driven organization, and without proper tool support, an excessive amount of time and energy is spent on gathering, analyzing, and challenging the data, to the exclusion of ultimate product quality. Yet their data support their belief that this charge is not borne out among the customers who adopt their process model.

(to be continued)

REFERENCES
[1] B. W. Boehm, J. R. Brown, and M. Lipow. 1976. Quantitative evaluation of software quality. In Proceedings of the 2nd International Conference on Software Engineering (ICSE '76). IEEE Computer Society Press, Los Alamitos, CA, USA, 592-605.


[2] Kitchenham, B.; Pfleeger, S.L., "Software quality: the elusive target [special issues section]," IEEE Software, vol. 13, no. 1, pp. 12-21, Jan 1996
doi: 10.1109/52.476281
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=476281&isnumber=10198

[3] T. Gilb, Principles of Software Engineering Management, Addison-Wesley, Reading, Mass., 1987

[4] James Herbsleb, David Zubrow, Dennis Goldenson, Will Hayes, and Mark Paulk. 1997. Software quality and the Capability Maturity Model. Commun. ACM 40, 6 (June 1997), 30-40. DOI=10.1145/255656.255692 http://doi.acm.org.proxy.lib.csus.edu/10.1145/255656.255692



Risk Estimation for Large Software Development Projects

I don't have anything substantive to say about this. I just wanted to drop a placeholder since this is an interesting topic that should lend itself to quantitative analysis, be meaningful and be interesting to me. But would this be too far into economics instead of CS?

Tuesday, February 14, 2012

software engineering data links

http://ase.csc.ncsu.edu/dmse/

"The goal of this thesis is to identify the data sources that libre software projects offer publicly, to present and display some methodologies for the analysis of these sources and the data that we can extract from them, and to show the results that have been obtained from applying these methodologies."

The Data and Analysis Center for Software

Repositories with Public Data about Software Development


On the efficacy of software quality metrics

For the past decade at least, the dialog about software quality has been intense. Since one way of looking at a problem says that if you can't measure it, you can't be sure it exists, there has also been an inquiry into software quality metrics and the decomposition of software quality into components. It has now been several generations since the Japanese clearly demonstrated that the quality of automobiles can be analyzed, understood, and turned into actionable management strategies that result in a product that both scores high on quality metrics and is judged a quality product in the marketplace. At least for the next few weeks, this is the topic I will be looking at.

To be honest, I eventually found the Basili book a bit tiresome. In my metrics class we had already covered his GQM methodology, and I get it. The remaining essays seem to offer a historical perspective on how he arrived at that formulation. The book does end with an essay that hints at things beyond this idea, but I think those thoughts will naturally work into this topic.

As I imagine is the norm for contemporary researchers these days, my first stop this morning was Google to see what pops up for the term "software quality metrics." To no one's surprise, Wikipedia occupied the first few results, which I promptly ignored. The first non-trivial result was from the site www.developer.com, specifically a post called Software Quality Metrics [1]. Since it acknowledges the work of SEI and Watts Humphrey as well as the work on TQM, it passes my filter for serious consideration.

They quickly acknowledge that quality is a multi-dimensional attribute. A good start. They claim that IBM measures customer satisfaction along 8 dimensions:

  1. capability or functionality, 
  2. usability, 
  3. performance, 
  4. reliability, 
  5. installability, 
  6. maintainability, 
  7. documentation, and 
  8. availability.

While the authors are comfortable with this immediate link between software quality metrics and the dimensions, I want to take a moment to deconstruct this for myself.

There is no doubt that in the marketplace, customer satisfaction is an important metric. You'll need to pardon my cynicism on the topic, however, since I know how shamelessly companies manipulate this metric in the commercial setting. Yet at a conceptual level I concede that any company that delivers a product judged to be of inferior quality (whatever that means at this point) cannot maintain a high customer satisfaction rating for long.

But customer satisfaction is itself a multi-faceted attribute, and some of its components have little to do with product quality, depending upon how you define product. Sales-force honesty and integrity, post-sales support, and the cultural fit between the company and the customer can all affect reported satisfaction yet have nothing to do with the software artifact. Since the given taxonomy omits those things and instead focuses on most of the same qualities as the SEI SAPP taxonomy, I'll assume this was handled elsewhere. I am still dropping a red flag on this topic, since I will want to understand how we connect customer satisfaction to perceived product quality. Indeed, I will want to see how they even define product.

The authors do take the time to discuss fitness for use as an important dimension. What I don't see in this article on first read is a good discussion of the various stakeholders and their unique evaluations of the quality of the product. An end user will be the single most important stakeholder to a packaged software company selling a mass-produced shrinkwrap product. But even this is moderated by other qualities that are of no interest to the customer, such as profitability, maintainability, supportability, testability, or extendability. These qualities matter to the ownership stakeholders: shareholders, management, and staff. Product management is ideally centrally involved with the design tradeoffs needed to balance these two different sets of qualities. Time-to-market and unity of design can easily be in opposition, and a reasonable decision must be made to strike the best balance for that organization with that product at that point in time.

I want to take a moment to drill into how customer-perceived satisfaction changes over time. Complex products are not mastered in the first few hours of use but create a curve of frustration and satisfaction. The IBM taxonomy uses documentation as a dimension. At Andersen 20 years ago we stopped talking about documentation and instead stressed integrated performance support. To speak of documentation is to assume an artifact separate from the product itself. My position is that whatever documentation must be provided should be delivered in the most integrated manner possible, so as to minimize any difference between the two artifacts. At its best, the user does not perceive a separation between the two.

What I'm also mindful of in reading this article is how the authors do not challenge the taxonomies of quality. A central point of SEI's treatment of quality requirements is how ultimately meaningless the taxonomy becomes. The ideal statement of a quality property is a metric against some use case. When you accept this as the ultimate statement of the quality requirement, the hierarchy into which these are aggregated becomes less meaningful for a researcher. In fact, you can have several different taxonomies, in addition to other tree structures, to organize any number of metrics. Unless I am driven away from this approach, I will cling to SEI's way of looking at this, since it naturally places the emphasis on the metrics and defers the sometimes endless semantic discussions of which category a particular metric belongs in.

The authors suggest that IEEE's formulation of quality [2] is derivative of VOC and QFD. While I also studied these in my metrics class, I have not looked at IEEE's interpretation. This standard should provide good guidance on how to roll the metrics up to a projected customer-satisfaction number, or drill down from customer statements into the specific metrics that drive the process, but I'll reserve judgement until I get a chance to review it. As they say, "TQM methodology is based on the teachings of such quality gurus as Philip B. Crosby, W. Edwards Deming, Armand V. Feigenbaum, Kaoru Ishikawa, and Joseph M. Juran."

The authors present some classic material on defect correction before a section they title Software Science. I am intrigued. They say, "In 1977 Professor Maurice H. Halstead distinguished software science from computer science by describing programming as a process of collecting and arranging software tokens, which are either operands or operators." I am embarrassed to admit I have not yet studied Halstead's work. The authors give a high-level review of some of his metrics. What I note in passing is that I have not yet heard of anyone who has attempted to apply these concepts to the requirements engineering phase. I am inclined to think that a simple count of the tokens in a requirements document may be a reasonable place to begin with requirements metrics. Function point analysis has a long history, but it was also reviled in the communities I worked in. Now, as an academic, I need to reopen my mind to this material and see what I think.
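Since I have the formulas in front of me, here is a quick sketch of the core Halstead measures. The token counts are invented for a hypothetical small routine; in practice they would come from a lexer pass over real source:

```python
import math

# Invented token counts for a hypothetical small routine:
n1, n2 = 8, 5    # distinct operators, distinct operands
N1, N2 = 20, 15  # total operator occurrences, total operand occurrences

vocabulary = n1 + n2                      # n: number of distinct tokens
length = N1 + N2                          # N: total number of tokens
volume = length * math.log2(vocabulary)   # V = N * log2(n), in bits
difficulty = (n1 / 2) * (N2 / n2)         # D: proxy for error-proneness
effort = difficulty * volume              # E = D * V

print(f"n={vocabulary} N={length} V={volume:.1f} D={difficulty:.1f} E={effort:.1f}")
```

Applied to a requirements document, the analogous move would be counting distinct and total terms, which is the simple token count I speculate about above.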

The authors give a brief overview of cyclomatic complexity.
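For my own notes, McCabe's metric is V(G) = E - N + 2P over the control-flow graph, which for structured code reduces to one plus the number of decision points. A rough approximation over Python source follows; this is my own simplification counting only the common branching constructs, not a full implementation:

```python
import ast

def cyclomatic_complexity(source: str) -> int:
    """Approximate McCabe's V(G) as 1 + number of decision points."""
    decisions = 0
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.If, ast.For, ast.While, ast.IfExp,
                             ast.ExceptHandler)):
            decisions += 1
        elif isinstance(node, ast.BoolOp):
            # each chained `and`/`or` adds a short-circuit branch
            decisions += len(node.values) - 1
    return decisions + 1

src = """
def classify(x):
    if x < 0:
        return "negative"
    elif x == 0:
        return "zero"
    return "positive"
"""
print(cyclomatic_complexity(src))  # → 3: the if, the elif, plus the base path
```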

The authors have one paragraph that I've read three times without feeling certain I have a complete grasp of the point they are making:

Availability and Customer Satisfaction Metrics
To the end user of an application, the only measures of quality are in the performance, reliability, and stability of the application or system in everyday use. This is "where the rubber meets the road," as users often say. Developer quality metrics and their assessment are often referred to as "where the rubber meets the sky." This article is dedicated to the proposition that we can arrive at a priori user-defined metrics that can be used to guide and assess development at all stages, from functional specification through installation and use. These metrics also can meet the road a posteriori to guide modification and enhancement of the software to meet the user's changing needs. Caution is advised here, because software problems are not, for the most part, valid defects, but rather are due to individual user and organizational learning curves. The latter class of problem places an enormous burden on user support during the early days of a new release. The catch here is that neither alpha testing (initial testing of a new release by the developer) nor beta testing (initial testing of a new release by advanced or experienced users) of a new release with current users identifies these problems. The purpose of a new release is to add functionality and performance to attract new users, who initially are bound to be disappointed, perhaps unfairly, with the software's quality. The DFTS approach we advocate in this article is intended to handle both valid and perceived software problems.

I am inclined to agree that it should be possible to develop customer-sat targets a priori and use them to guide development through the creation of various metrics. This is a tall order. The paragraph also makes the point I observed earlier: product evaluation is not immediate but is best represented as a time series.

Their next section talks about the current state of metrics. They cite this book as their primary source:

About the Source of the Material

Design for Trustworthy Software: Tools, Techniques, and Methodology of Developing Robust Software
By Bijay Jayaswal, Peter Patton


Published: Aug 31, 2006, Hardcover: 840 pages 
Copyright 2007 Pearson Education, Inc.
ISBN: 0131872508
Retail price: $64.99

I found it used on Amazon for $10 so I'll have it next week. I'll save my review for this material until after I've read their source material.

REFERENCES
[1] http://www.developer.com/tech/article.php/3644656/Software-Quality-Metrics.htm
[2] IEEE, Standard for a Software Quality Metrics Methodology (New York: IEEE, Inc., 1993)

MY TODO LIST
Review VOC, QFD, and the IEEE metrics standard
Study Halstead; function point analysis (IFPUG, International Function Point Users Group Standard, 1999)


Sunday, February 12, 2012

My First Look at Google Research

There was another article in the NYT today about the growing importance of quant jocks in analyzing the blizzard of data the cloud is generating. Apparently data analyst is now the hot new job title. It makes me feel good that I'm going back to my quant roots.

For a quant, it's all about the data. Given the cost of getting new data, if there are existing stores I can farm, I get a jump start on refining my thesis. Tonight I am looking at Google Research to see what I find. So far I have not found references to data sets, but in a slide presentation (http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/people/jeff/stanford-295-talk.pdf) I see a reference to the system qualities they value:
– Simplicity
– Scalability
– Performance
– Reliability
– Generality
– Features

This is as good a list as any of the attributes people value that are divorced from the functional aspect of the product (although I must confess I'd be guessing at what they mean by Generality, and of course features is another word for functionality, IMHO).

This looks like something that could be helpful...


  • L1 cache reference 0.5 ns
  • Branch mispredict 5 ns
  • L2 cache reference 7 ns
  • Mutex lock/unlock 100 ns
  • Main memory reference 100 ns
  • Compress 1K bytes with Zippy 10,000 ns
  • Send 2K bytes over 1 Gbps network 20,000 ns
  • Read 1 MB sequentially from memory 250,000 ns
  • Round trip within same datacenter 500,000 ns
  • Disk seek 10,000,000 ns
  • Read 1 MB sequentially from network 10,000,000 ns
  • Read 1 MB sequentially from disk 30,000,000 ns
  • Send packet CA->Netherlands->CA 150,000,000 ns
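These numbers invite exactly the back-of-the-envelope arithmetic the slides are known for. A small sketch using three of the figures above:

```python
# Back-of-the-envelope math from the latency table above (values in ns).
ns = {
    "read_1mb_memory": 250_000,
    "read_1mb_disk": 30_000_000,
    "disk_seek": 10_000_000,
}

# Reading 1 GB sequentially, memory vs. disk:
gb_memory_ms = 1024 * ns["read_1mb_memory"] / 1e6
gb_disk_ms = 1024 * ns["read_1mb_disk"] / 1e6
print(f"1 GB from memory: {gb_memory_ms:.0f} ms")   # → 256 ms
print(f"1 GB from disk:   {gb_disk_ms:.0f} ms")     # → 30720 ms, about 31 s
print(f"disk is {gb_disk_ms / gb_memory_ms:.0f}x slower sequentially")  # → 120x
```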
I find these slides interesting...

Source Code Philosophy
• Google has one large shared source base
– lots of lower-level libraries used by almost everything
– higher-level app or domain-specific libraries
– application specific code
• Many benefits:
– improvements in core libraries benefit everyone
– easy to reuse code that someone else has written in another context
• Drawbacks:
– reuse sometimes leads to tangled dependencies
• Essential to be able to easily search whole source base
– gsearch: internal tool for fast searching of source code
– huge productivity boost: easy to find uses, defs, examples, etc.
– makes large-scale refactoring or renaming easier


Software Engineering Hygiene
• Code reviews
• Design reviews
• Lots of testing
– unittests for individual modules
– larger tests for whole systems
– continuous testing system
• Most development done in C++, Java, & Python
– C++: performance critical systems (e.g. everything for a web query)
– Java: lower volume apps (advertising front end, parts of gmail, etc.)
– Python: configuration tools, etc.

Multi-Site Software Engineering
• Google has moved from one to a handful to 20+ engineering sites around the world in the last few years
• Motivation:
– hire best candidates, regardless of their geographic location
• Issues:
– more coordination needed
– communication somewhat harder (no hallway conversations, time zone issues)
– establishing trust between remote teams important
• Techniques:
– online documentation, e-mail, video conferencing, careful choice of interfaces/project decomposition
– BigTable: split across three sites



Something else I found at Google Research was Google Correlate. Type in a term and it will find other terms whose search pattern matches. Try to correlate by time-series or geography. Kinda cool...

  • "thesis" has a clear seasonal pattern peaking in fall and spring, matching the term "factors" (United States web search activity for thesis and factors, r = 0.9628)
  • The term "data analyst" has been trending up since 2008 after having been level from 2004 to 2008. It correlates with these other terms, with r ranging from 0.9390 to 0.9218:
    • pain management
    • biotin
    • ignore
    • hiring manager
    • coordinator salary
    • spondylosis
    • psychiatric nurse
    • how to answer
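Google Correlate is presumably matching time series by correlation coefficient. For my own reference, the r it reports is the ordinary Pearson coefficient, which is easy to compute from scratch (the two series below are toy data, not actual search volumes):

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Toy weekly "search volume" series sharing a seasonal shape:
thesis  = [10, 30, 25, 8, 12, 32, 27, 9]
factors = [11, 28, 26, 9, 13, 30, 28, 10]
print(round(pearson_r(thesis, factors), 3))  # close to 1, like Correlate's matches
```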
 

Mental Maps

This link is to a NYT article about mental maps as they relate to GPS devices. I do not believe this observation is limited to physical maps.

http://nyti.ms/wNnJW5

Note to readers

I had a previous blog called Dale's Dilenttantic Deliberations, which was more tongue-in-cheek than this one will be. However, I posted some pieces there that are more appropriate here, and I have reposted them.

Saturday, February 11, 2012

Understanding and Documenting Programs

This essay is from the book I'm reading, Foundations of Empirical Software Engineering: The Legacy of Victor Basili, and was originally published in 1980. It gives me a platform for some of the thoughts I have had about programs/systems and what is really meant by documenting.

The essay discusses an experiment where he sought to answer questions about a math routine having only some complex code and a gross statement of the routine's semantics. It brings to mind a discussion I had with my adviser about a project I was working on, where I had initially titled the work as the documentation of the system I intended to extend. She pointed out the negative connotation of the word "document" and persuaded me it understated the work I had done. I immediately understood her comment, but I am still coming to grips with its implications years later. What was I doing if not documenting? I was clearly engaged in a form of reverse engineering, attempting to do exactly what Basili describes here. The code by itself certainly worked; that much I had demonstrated. But to extend the system required a deeper understanding of the structure and the decomposed semantics of the parts. That required significant work (it was over 100 KSLOC of Python by sophisticated developers), and the most visible artifact of that work was the documentation. In the end I called it architectural reconstruction, and that still seems the best description of what I achieved in that project. What Basili's paper shows is the painful reasoning that must be done to recover knowledge that existed at a prior point in the development of the code artifact. It clearly shows that the design artifacts that preceded the code can have real economic value to the maintenance coder who must perform repair and maintenance work. I also find this essay instructive for demonstrating what SEI maintains: that the structure of the program is driven less by the functional requirements than by the non-functional, or quality, requirements.

In my professional practice, I was frequently involved in projects to upgrade systems for a client. While the reasons varied, the one constant was the question, "Why do you have to gather requirements again? We already have a system that does all those things. Why can't you just look at that to get the requirements?" While the practitioners reading this may cringe, it is an apt question that I believe is still important today. It has to do with both the entropy of the development process and the weighing of the importance of the runnable artifact against its antecedents. After all, if the requirements document from the prior project was available and sufficiently self-explanatory, a new requirements phase would not be required except to discover any delta from that prior effort.

One obvious reason why a new requirements elicitation phase was required was the simple absence of the document. The executable is well protected to ensure integrity, as (usually) is the source code that created it. But the farther removed from the source code you get, the less likely it is that good artifacts will exist. By the time you get to artifacts like the charter or the early design documents, you will likely need to engage in a fair amount of modern archaeology to find them. Once found, they still need to be verified, comprehended and extended to include the newest concerns of the organization. What is remarkable is that no matter how extensive the document is, there is inevitably information that just never made it into it. Even in the most document-driven organization, a significant amount of oral tradition exists. To call the recovery of this requirements documentation a documentation task is indeed condescending given the skills required.

When confronted with dense code, a software engineer doesn't so much document what is found as use documentation to record the results of his reasoning so as to supplement his memory. As the essay explains, the reasoning is hardly trivial. Even if it is possible for the engineer to remember the key structures and embedded semantics of the program, they only exist in that engineer's mind and are not readily available to other engineers.

I can't help but make a small digression. My generation of software engineers fell prey to a social pathology that may or may not still exist in practice today. Management at the time was clueless about non-executable project artifacts (with the possible exception of the military-industrial complex) and hence required little beyond properly functioning code. The result was that maintenance software engineers would need to consult with the more senior engineers who first wrote the code. There was obvious pride in authorship, which is good, but also a clear sense of superiority and of power that these more senior engineers felt in being sought out. I would like to say they were all paragons of maturity, but the industry seemed to attract a somewhat emotionally stunted type of person who did not always hold the organization's best interests as their greatest imperative. I'm not exaggerating when I say some were pompous asses who would lord their knowledge over the newbie and exact great stress as a form of rite of passage. It is little wonder that maintenance programming was a phase of their careers that engineers sought to end as soon as possible.

My ultimate interest in this essay, though, is not the reverse engineering process but how it brings out SEI's thesis. The code in question clearly did not exhibit high modifiability if it required this much analysis to answer a straightforward question. This code was highly tuned to perform well. The quality of performance was far more important than the quality of modifiability, and the design choices clearly showed that. This is largely self-evident to anyone who understands the context of a math routine in the larger system. It will be very stable once it functions properly and exhibits the proper blend of qualities. Therefore the increased maintenance cost for those few changes or enhancements that future generations may find for this code can be accepted as a trade-off for the increased speed and efficiency of the more complicated algorithm. The structure of this code was obviously highly influenced by this quality requirement.

I had first encountered Basili in my software metrics class and I had misunderstood his oeuvre from that context. I am now understanding his contribution to software engineering and am beginning to see him in the same class as Parnas. Why did it take me so long to discover him?

Friday, February 10, 2012

Switching to New Blogger Look

I'll be switching from the traditional Blogger interface to Google's new one. The implication for me is that I will lose my old nom de blog, Miss And Thrope. It had been Miss Ann Thrope, but I am slowly moving away from the tongue-in-cheek tone toward a more serious and scholarly one...but ever so slowly. The new interface merges the profile with my Google+ profile and the old identities will die. So here's a so long to Miss.

The Origins of Agile Programming

The Agile Manifesto is common knowledge now. I had always assumed that its origin story lay somewhere in the late 80s or early 90s. But I now see that it started far earlier. In the Basili book I'm reading, this comes from page 30, in a paper from 1975:

  1. Any difficulty in design, coding, or debugging a modification should signal the need for redesign or recoding of existing components.
  2. Modifications should fit easily into isolated and easy-to-find modules. If not, then some redesign is needed.
  3. Modifications to tables should be especially easy to make. If any table modification is not quickly and easily done, then a redesign is indicated.
  4. Modifications should become easier to make as the iterations progress. If not, then there is a basic problem such as a design flow (sic I suspect it should read flaw) or a proliferation of 'patches.'
  5. 'Patches' should normally be allowed to exist for only one or two iterations. Patches should be allowed, however, in order to avoid redesigning during an implementation phase.
  6. The existing implementation should be analyzed frequently to determine how well it measures up to the project goals.
  7. Program analysis facilities should be used whenever available to aid in the analysis of the partial implementations.
  8. User reaction should always be solicited and analyzed for indications of deficiencies in the existing implementation.

While there is much to fault in this model, it clearly emphasizes the need for frequent refactoring of code as design flaws are recognized, and an iterative cycle with far shorter sprints than were common at the time it was written. Elsewhere he talks about decomposing the requirements into a list that can be implemented incrementally. The domain of this paper was the creation of a compiler. The increments were the specific language features. So each release of the compiler would be complete for that set of language features.

Clearly the years and our personal experiences have brought us to this belief. I'll be curious to find the empirical evidence that supports this belief.

Wednesday, February 8, 2012

Starting a New Book: Foundations of Empirical Software Engineering: The Legacy of Victor R. Basili, eds. Boehm, Rombach, Zelkowitz

I expect this book will represent a form of manifesto for the research group I hope to work with during my PhD studies. The basic outline of the study of Empirical Software Engineering is laid out in a paper Basili wrote in 1996 and is the first one presented in the book. Here is a key point:

"First we define some terms for discussing experimentation. A hypothesis is a tentative assumption made in order to draw out and test its logical or empirical consequence. We define study broadly, as an act or operation for the purpose of discovering something unknown or of testing a hypothesis. We will include various forms of experimental, empirical and qualitative studies under this heading. We will use the term experiment to mean a study undertaken in which the researcher has control over some of the conditions in which the study takes place and control over (some aspects of) the independent variables being studied. We will use the term controlled experiment to mean an experiment in which the subjects are randomly assigned to experimental conditions, the researcher manipulates an independent variable, and the subjects in different experimental conditions are treated similarly with regard to all variables except the independent variable." (p4)

Of course he is describing basic empirical study no different from that of any other scientific discipline. What is remarkable is that it is directed at software engineering, a field that traditionally was led by ideas that sounded reasonable but were never subjected to any form of empirical study.

Tuesday, February 7, 2012

On Non-Functional Requirements, Martin Glinz, 15th IEEE Intl Req Eng Conf

This small paper tries to tease apart a good definition for non-functional requirements. He makes a good point about the dependence of classification on how the requirement is represented. For example, there is a difference between stating "The system shall prevent any unauthorized access to customer data" and "The database shall grant access to the customer data only to those who have been authorized by their user name and password". Note that the second requirement is more restrictive than the first and includes an implementation detail that was not present in the first: username and password. There was an implied design step between the two that introduced functional requirements to meet the original non-functional requirement. When authorized by the username and password, the system grants access. The first requirement leaves that detail unspecified, allowing the designer latitude in how the requirement is met. The first requirement could be satisfied with retinal scans or other biometric data that could no longer be part of the system specified by the second requirement. It seems that all non-functional requirements ultimately reduce to functional requirements, and this makes sense since they must be implemented using the formalism of functional design. What makes the study interesting is doing this decomposition of non-functional requirements into functional specifications in such a way as to aid in the construction of test cases that determine whether or not the requirement was met.
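The designer latitude above can be made concrete with a small sketch. This is my own hypothetical illustration, not code from the paper: the non-functional requirement only demands that unauthorized access be prevented, so any mechanism behind an `Authorizer` interface satisfies it, while the second requirement would pin the design to the username/password implementation alone.

```python
from abc import ABC, abstractmethod

class Authorizer(ABC):
    """The latitude left by the non-functional requirement: any mechanism
    that prevents unauthorized access is acceptable."""
    @abstractmethod
    def is_authorized(self, credentials) -> bool: ...

class PasswordAuthorizer(Authorizer):
    """The design the second, more restrictive requirement commits to."""
    def __init__(self, accounts):          # accounts: {username: password}
        self.accounts = accounts
    def is_authorized(self, credentials) -> bool:
        user, password = credentials
        return self.accounts.get(user) == password

class RetinalScanAuthorizer(Authorizer):
    """Equally valid under the first requirement, excluded by the second."""
    def __init__(self, enrolled_scans):    # set of enrolled scan signatures
        self.enrolled = enrolled_scans
    def is_authorized(self, credentials) -> bool:
        return credentials in self.enrolled

def read_customer_data(authorizer: Authorizer, credentials, data):
    # The functional residue of the non-functional requirement:
    # grant access only when the chosen mechanism authorizes the caller.
    if not authorizer.is_authorized(credentials):
        raise PermissionError("unauthorized access to customer data")
    return data
```

Either implementation can be dropped in without touching `read_customer_data`, which is exactly the latitude the first requirement preserves and the second forecloses.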

Where I seem to be going in my interest is being rigorous about the input to the design process so as to provide a solid base for a more rigorous look at the design process itself. The outcome of the design process should ultimately be judged by the validation of the product against the specification and the verification that it behaves as intended.

I had a conversation with Earl in which he agreed that functional versus non-functional seems to make little sense in most usage. But he resisted my assertion that there is any relationship to any notion of a mathematical function at its core. His resistance is fueling my desire to get to the bottom of this so I can turn back to a more rigorous understanding of what is meant by software "quality."

In section 4.2 this paper offers a definition that embodies the point I was trying to make with Earl: "A concern is a matter of interest in a system. A concern is a functional or behavioral concern if its matter of interest is primarily the expected behavior of a system or system component in terms of its reaction to given stimuli and the functions and data required for processing the stimuli and producing the reaction." Where I think this moves the conversation forward is the way it brings attention to the stimulus/response of the system as the primary differentiator. After hearing Earl's comments I feel I need to gain a better grasp of the formal definitions, but informally in the undergrad curriculum a function is always defined in terms of a component that accepts multiple input variables and returns a single result. It is defined by its signature and is the unit of abstraction for design. If I'm allowed this definition of function for the moment, a functional requirement is one that specifies the response given the state and input to the system. Simply stated, then, the measure of conformance to a functional specification is whether, when the system is in the specified state and presented with the specified input, it produces the specified response.
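That last sentence can be rendered almost literally in code. A minimal sketch of my own (the spec-as-table and the turnstile example are both hypothetical, not from the paper): a functional specification is a set of (state, input, expected response) triples, and conformance is exactly the question of whether the system reproduces each triple.

```python
def conforms(system, spec):
    """system: callable (state, stimulus) -> response.
    spec: iterable of (state, stimulus, expected_response) triples.
    Conformance is the question stated in the text: in the specified
    state, given the specified input, is the specified response produced?"""
    return all(system(state, stimulus) == expected
               for state, stimulus, expected in spec)

# A toy system under test: a turnstile that unlocks on a coin
# and locks again after a push.
def turnstile(state, event):
    if state == "locked" and event == "coin":
        return "unlocked"
    if state == "unlocked" and event == "push":
        return "locked"
    return state  # any other stimulus leaves the state unchanged

spec = [
    ("locked", "coin", "unlocked"),
    ("unlocked", "push", "locked"),
    ("locked", "push", "locked"),
]
```

The appeal of this framing is that the conformance check is mechanical once the spec is written down; all the engineering judgment lives in choosing the triples.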

Of course this type of functional specification is one that the paper refers to (I believe) as a "hard" requirement: one that is either met or judged not to meet the specification. I infer that what the paper means by a "soft" requirement is one where some probability function is specified that defines the tolerance allowed for the behavior.

This leads me to two points that I believe need to be explored. The first point is the temporal aspect of behavior and the second is how systems and requirements are decomposed or aggregated in a complex system.

There are advanced logics that address the temporal component, and I had briefly studied one of the specification languages that address these temporal concerns. However, it is clearly understood in the literature that these languages stand apart from the more common ones, which do not. To be able to distinguish between a deadlock and an underpowered processor is vital in the analysis of a design. But at the level of specification or requirements it is expressed as a constraint on the functionality at some level. Perhaps an interrupt is only presented for a fraction of a second. The system must recognize and remember the interrupt within that time parameter or the system will not function as intended. OCL and other specification languages allow this additional information to be captured in the specification model. But I think it is important that it supplements, not replaces, the stimulus/response aspect of the system.
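The interrupt example can be sketched in discrete simulated time. This is my own toy model, not anything from the literature I cite: the line is asserted only during a short window, and the temporal requirement becomes a constraint on the functionality, namely that the system must sample fast enough to latch the pulse before it disappears.

```python
# The interrupt line is asserted only for ticks in [PULSE_START, PULSE_END).
PULSE_START, PULSE_END = 3, 4

def line_asserted(tick):
    return PULSE_START <= tick < PULSE_END

def run(sample_period, total_ticks=20):
    """Poll the line every `sample_period` ticks and return whether the
    pulse was latched. A latch, once set, remembers the interrupt --
    the 'recognize and remember' part of the requirement."""
    latched = False
    for tick in range(0, total_ticks, sample_period):
        if line_asserted(tick):
            latched = True
    return latched
```

Polling every tick catches the one-tick pulse; polling every other tick can sail right past it, even though both systems are functionally identical in their stimulus/response table. That gap is exactly what the temporal constraint supplements.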

The other aspect of the specification is how it is expressed at different levels of decomposition. Something like the overall usability of the system has a cascade of specifications as it is decomposed into the design. When the requirements include the ability to perform an arbitrary number of "undo" steps, they begin to constrain various sub-specifications of the parts of the design. This includes both behavior and information representation choices, and even has business implications that the user must be made aware of, as the ability to back out a transaction may entail the "undoing" of later transactions that depended upon that state of the system.

I think the levels of specification and behavior that I see can most easily be seen in the difference between a requirement-in-use, like the usability requirement above, and a requirement-of-ownership, like maintainability.

The system that is being specified for a requirement-in-use is not just the software but the software plus the hardware and human environment in which the software is deployed. There is continual surprise when testing fails to uncover an error in a common business transaction, due to a lack of sufficient understanding of the domain compounded by the lack of end-user involvement in the specification and test preparation for the system. To think of this as only a software requirement will perpetuate this mismatch and ensure we do not find a solid foundation. But most important is to note how this differs from a requirement-of-ownership like maintainability.

In use, the system of interest is the production software, hardware and business environment. This environment includes neither the human organizations concerned with future versions nor the planning for that change. It does not include the development environment. Even more dramatically, for a compiled language the system-in-use does not even require the source code. These are two distinct systems that overlap in the production software code and some interfacing organizational units. At the macro level, the metric for a software change includes variables for the size or complexity of the change, the skill set of the resources to be allocated to the change, the development environment deployed to support the maintenance, and the required time and cost for that change. The response would be quantified by the achievement of some level of MTTF or defects detected against some standardized test bed.

These definitions are not just academic exercises. They become the foundation that supports the engineering of the system and the basis for an unbiased and objective assessment of whether the system met its specifications. It is these specifications, and the attributes of the product produced from them, that provide the basis for the assessment of the product's composite quality.