Saturday, January 21, 2012

Reflections on the SEI Seminar on Documenting Software Architectures

I just completed the SEI training course on documenting software architectures and I have some thoughts about how architecture is sold. I use the term "sold" very deliberately, since I believe that architecture always happens within the context of an organization, and largely within its management context. Since a recurring theme in these seminars, and in the pragmatic side of software architecture, is the conflict between management and software architects, I cannot help but ponder how this tension might be resolved.

The management of any large project is largely shaped by that organization's structure, culture, policies, practices and procedures. The architect can change this within the context of a project in only the most constrained ways. Since what the architect suggests may conflict with the path of least resistance for project management, the architect must be prepared not only to justify the proposal with a logical argument but also to sell the idea by extra-logical means. I think SEI does as good a job as can be done in making the logical arguments in favor of an architecture-centric approach. But the extra-logical methods are outside their scope. Selling is the term I use to encompass these extra-logical methods and tactics.

Managers, particularly senior managers, respond favorably to the idea of strategy. By rebranding the effort of software architecture as project strategic design, I think it more nearly hits the mark in what motivates the effort in the mind of the manager. To fail strategically suggests that some battles may have been won but in the end the war was lost. Preventing exactly this is what an architecture aims to achieve. There is a set of quality and functional requirements that define the risk points for a project. The architecture and its creation seek to identify these concerns, capture them in a way that can be shared and certified, and then use them to methodically drive a planned design that demonstrably addresses those risks. I want to believe that this message will resonate more effectively in the mind of the decision maker and motivate at least an open mind, if not an open wallet.

I see that this recast of the architecture as a strategic document also helps to define a clearer scope. Once the strategic risks have been dealt with to the satisfaction of the most senior members of the project team, it is time to activate the implied (or explicit?) plan that is the architecture document. If parallelism in the development is a concern at the outset, the major modules will have been designated and designed in such a way as to provide the input to the initiation of those phases of the project. If security is a concern, there must be sufficient design to support a reasonable argument that the concern has been mitigated.

It is interesting to note that this is where the boundary between what used to be called high-level and low-level design breaks down. If the original concept of operations will use hardware or external components that are believed to represent some risk to the project's mission, the project architect will need to continue the decomposition of those aspects of the design until there is persuasive evidence that the quality in question can be achieved (though not necessarily achieved yet) by the design so far elaborated. But once the decision has been made that the risk of not achieving that quality has been addressed, further elaboration of the architecture must stop.

Again, one aspect of the architect's job is to help project management with this decision and provide the leadership they will look for. If project management continues to doubt the achievability of some quality, they may continue to ask for further elaboration of the design. At some point, this design can over-specify the system to be constructed. This in itself creates a new risk to the project. Over-specification reduces the solution space for the developer charged with the design and construction. If the architect has expertise equal to or greater than the development organization to which the work is delegated, this may be good. But the nature of architecture, and the inherent role of the architect, is to be somewhat removed from all the various technologies that will be brought to bear within the consolidated solution space. Therefore it is just as likely that the delegated organization will possess technical skills in the technology that the architect does not, and can therefore create a design superior to any from the architecture. Again the architect must possess some skill in leading a management group through a decision-making process. The parallel between strategic and tactical decisions should be familiar language to this group, and hopefully persuade them not to belabor the point beyond its management purpose.

There are some ways in which a re-branding of the effort as strategic design can be challenging. The term would be unfamiliar to team members, and some time will need to be spent explaining it. A natural confusion will arise in differentiating what I am calling strategic design from strategic planning. I am not aware of any use of strategic planning as an activity within the context of a project, and of course its purpose is very different in the context of organizational planning. However, the connotation of risks, opportunities and constraints used in strategic planning provides the architect with some handy cognitive landmarks with which to broach the themes of the software architecture.

Another issue with the branding as strategic design is the loss of distinction between software architecture and system architecture. SEI's work is always identified as software architecture. However, some reports have begun to talk about software-intensive systems architecture rather than software architecture. Even within this course there was an acknowledgement of the necessity of considering hardware as well as software. Perhaps it would be a good thing to let the term shift to suggest that both are considered when making the strategic decisions that could make or break the project.

I hope I have provided the outline for what I believe is a shift in labeling that could empower the architect to more effectively negotiate with project management and increase the use of architecture-centric techniques in industry.

Tuesday, January 17, 2012

Too Much Programming Too Soon? by Mark Guzdial and Judy Robertson

OK, I'm a little behind in my reading. But this is just as important as the day I read it. I'm talking about the article of this name in the March 2010 CACM where the authors discuss Mark's blog post "How We Teach Introductory Computer Science is Wrong". (http://cacm.acm.org/blogs/blog-cacm/45725-how-we-teach-introductory-computer-science-is-wrong/fulltext) Both the article and the post are an indictment of the practice of minimal instruction in the teaching of programming. The research cited suggests that showing fully worked examples of programs significantly improves the speed and quality with which students learn how to program.

At the risk of reading too much into this, I have always had the feeling that the teaching of programming has a storied history as a form of hazing. Perhaps because no one really knows how best to teach it, or we all think that somehow we can teach it by lecturing and talking about the atomic syntax. But the reality is that students are always flummoxed at figuring out how to assemble the pieces. This is the "minimally guided instruction" the research warns about. You would think that the textbooks would do the job of providing the worked examples that this article suggests, but I have been disappointed in most of the texts I've looked at. Ironically, this is exactly the same suggestion that I have gotten from advanced students. So what could be going on in the heads of these students that makes showing them worked examples help them learn? I have an idea.

Since the 1980s there has been way too much ink spilt over the discussion of patterns. While I think some of the discussion of pattern languages is a bit over the top, there can be no doubt that there is something very important in patterns. Instead of teaching the core instructions of the language with their formal syntax, what we really need to do is teach basic problems solved by computers and the patterns of language that solve those problems. I believe what this does is allow the student to learn by induction. By showing how to solve one problem we reasonably expect the student to learn how to solve a similar problem. If we show how to read a number from the keyboard, we certainly expect them to follow that pattern when they must read another number, as in the sketch below. For the gifted students, we expect them to figure out how other data types are read. This sounds very close to the thesis of one of my favorite books, Metaphors We Live By.
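To make the idea concrete, here is the kind of fully worked example I have in mind, in Python (my own illustration, not from the article): a complete, runnable "read a number" pattern that a student can imitate and then vary.

```python
# A fully worked example: read a number from the keyboard, with validation.
# The student first runs it as-is, then adapts the pattern to a new problem.

def read_int(prompt):
    """Keep asking until the user types a valid integer."""
    while True:
        text = input(prompt)
        try:
            return int(text)
        except ValueError:
            print(f"'{text}' is not a whole number, try again.")

age = read_int("How old are you? ")
print(f"Next year you will be {age + 1}.")

# Induction step: the student is asked to read a second number (reuse the
# pattern verbatim), then a decimal number (vary the pattern: int -> float).
```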

Metaphors We Live By, by Lakoff and Johnson, has gone through many printings since it was published in 1980 and is, imho, a book everyone should read. The central point is that metaphor is not merely a poetic device but a key insight into how we learn new things. This fits perfectly with Mark's suggestion that we should show a complete working program first, ask the student to make a small change, and then steadily increase the distance between the example and the problem.

But then that's just me...

Overspecification

One of the key concepts in software engineering is the need to avoid over-specification. It is natural that at some point in the decomposition of the specification of a system to be created, the specifier resorts to a description of how to do it instead of a statement of what must be done. When you drill down into this, the problem is the difference between denotational and axiomatic semantics. In the first, the problem to be decomposed is given by the specifier in what is hopefully the most abstract algorithmic way possible. The problem is that this assumes there is only one algorithm possible for the solution, and it locks the implementor into that algorithm.

The alternative is the axiomatic semantics of stating the pre- and post-conditions, as well as any invariants, in the required solution. This at once gives the implementer the full choice of algorithms and implementations in the solution set. But at the same time it gives the implementer no direction as to how the result can be achieved. Traditionally, the axiomatic method of specification has not been used in commercial work largely because of the difficulty of making these statements about the required implementation. They are seen in formal methods, but the difficulty of applying formal methods is well known.
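A minimal sketch of the contrast, in Python (my own illustration; the names are hypothetical): an axiomatic specification of sorting states only the post-conditions, leaving the choice of algorithm entirely to the implementer.

```python
# Axiomatic specification of sorting: the output must be ordered and must be
# a permutation of the input. Nothing here prescribes HOW to sort.

from collections import Counter

def satisfies_sort_spec(inp, out):
    """Post-conditions only: ordered output, same multiset of elements."""
    is_ordered = all(out[i] <= out[i + 1] for i in range(len(out) - 1))
    is_permutation = Counter(inp) == Counter(out)
    return is_ordered and is_permutation

# Any implementation that meets the spec is acceptable; a denotational-style
# spec would instead have dictated one particular algorithm.
def my_sort(xs):
    return sorted(xs)  # the implementer's free choice: quicksort, mergesort, ...

data = [3, 1, 2]
assert satisfies_sort_spec(data, my_sort(data))
```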

What is a Software Product?

I have just completed another seminar at SEI and I want to return to the subject of the prior post. However, I am going to recast it as an answer to the question "What is a software product?"

I think most people are likely to take the question to refer to a consumer product like Word or Windows 7. In this case, the product primarily consists of an executable module but no source code. In addition, to make it fit for its intended use, there are several other assumed artifacts associated with or embedded into the product. One is either an installer application or instructions on how the product is to be placed into the intended environment. In addition, there is an implied deliverable which will instruct the end-user how to perform, using the product, the tasks for which it is intended. These days that is more often left implicit than made explicit, as many products are designed to be self-evident. This means that the intended end-user has sufficient experience with similar user interfaces that the use can be learned with some exploration and no explicit training. Of course this is not always the case, and products may either embed the user manual into the product or provide it as a separate artifact.

There is another sense of software product, though. If you consider the automobile factory that turns out cars, the cars are products to be sold to consumers. However, the factory itself is an asset that can be sold to another auto maker and retooled for their use. In a similar way, source code is a major component of the asset that allows a software vendor to create software products. This "software factory" is a tangible asset and is recognized as such through intellectual property laws and common sense. (To avoid some awkward language, from here on, when I refer to a software product, I am using this second sense of the term and will ignore products that only include executables.) But is the source code the entire asset? If we were to sell our intellectual property to another, what can reasonably be considered part of the asset?

In the rush to market, many software vendors are willing to sacrifice the quality of the allied artifacts of the complete system. While this helps to achieve the end goal of creating a working system, it creates technical debt in the deferred maintenance on those artifacts, with a measurable increase in the total cost of ownership. If the product will never be modified, the complete lack of supporting documentation for the product is reasonable. Even if some was produced during the initial construction of the product, there is no reason to keep it since it will never be read. But how many software products are created that are never intended to be modified? The correction of latent defects, changes in the environment, and new requirements are just a few of the many reasons why the maintenance phase of a software product life cycle is typically the costliest. To ignore the needs of this phase of the product life-cycle is foolish and self-defeating. Therefore any technical debt incurred from the initial development will eventually need to be paid if the consequences of that debt are to be avoided.

The platonic ideal that is sought is some form of self-documenting code. While local comments, when done well, or even well-written code without comments, can often be understood without additional commentary, large software systems cannot be self-documenting, since many issues transcend individual modules. A software engineer may be able to reproduce the needed information from the source code alone, or at least find a way to insert the modification without it, but this is most likely a more expensive fix than it would have been had the recreation work not been needed. The lack of adequate support for the code base increases maintenance cost. Further, until a code base automatically includes self-documenting ways to explain design choices that transcend individual modules, this kind of documentation will continue to exist outside of the source code itself.
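A contrived sketch of what I mean (module and function names are hypothetical, purely for illustration): each module below is locally clear, yet the design decision that binds them is stated in neither.

```python
# --- writer.py ---
def encode_record(name, age):
    # Locally self-documenting: fields joined by '|' in a fixed order.
    return f"{name}|{age}"

# --- reader.py ---
def decode_record(line):
    # Also locally clear, but nothing in THIS module records the design
    # decision that its field order and delimiter must match encode_record().
    name, age = line.split("|")
    return name, int(age)

# The invariant "writer and reader agree on the format" transcends both
# modules; only external design documentation (or a shared schema) states it.
assert decode_record(encode_record("Ada", 36)) == ("Ada", 36)
```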

This highlights a weakness in the current practice of software engineering management. The true value of this artifact is discounted by management. Since the increased support cost is never quantified, there is no self-correction of management attitudes, and the ratio of maintenance cost to total life-cycle cost will remain unchanged. Worse, there are two compounding effects. First, the best information about the design of the system often exists only in the minds of the developers. When they leave the project, it is usually lost for good. Even if they remain in the same organization, the fidelity of the information decays over time. Second, without an adequate understanding of the design principles used in constructing the product, a maintainer is likely, out of ignorance, to undermine design clarity that depends on restraint from the coder. It is as if the architect leaves no plans for a building behind and the maintenance engineer tries to knock out a wall only to discover an unexpected support post or beam. Even worse would be to not recognize the structural nature of the element during the remodeling and remove it, thus weakening the structure and increasing the risk of structural failure.
(originally posted at Dale's Dilettantic Deliberations Aug 12, 2011)

As Agile methods gain wider use in industry, some inexperienced developers are likely to believe that producing good design documentation is just a bother. A lack of professional development, the desire to just get on with their careers, an ignorance of how best to document the design, or some combination of all three likely contributes to the poor quality of this product artifact. Often professionals do not even consider the design artifact to be part of the product itself but view it as some form of construction artifact that serves no purpose after delivery. Certainly many management teams do not appreciate, nor know how to evaluate, the quality of these allied artifacts.

The lack of attention given to the maintenance phase's infrastructure and staffing needs is a significant blind spot. It has always been a function that has suffered from lower status than the initial construction. As the life cycles of these software products have gotten longer, there are likely to be at least some far-sighted management teams who will eventually realize that long-term profits can be improved through reduced maintenance costs in their software assets. Once that begins to happen, the types of design documents that most efficiently support the maintenance needs will become the object of study. Until then, these support staffs will continue to be expected to do their jobs with less than ideal knowledge transfer and the need to continually read the minds of developers who are no longer around to ask.

System Documentation

(originally posted at Dale's Dilettantic Deliberations Aug 2, 2011)

A friend wrote me about the frustration of performing software maintenance in an environment largely devoid of good program documentation. He finds that he must spend a great deal of time just trying to understand how the various classes relate to each other before he can begin to focus on finding the source of a bug or deciding how to add a piece of functionality. I feel for him. I have been there, and I suspect most programmers at some point in their careers have as well. It is an interesting way to peel back some truths about the software development life cycle.

The first truth is that modern languages do a poor job of capturing the larger abstractions of their design in a way that is self-maintaining. It is possible to create a code base that has good supporting documentation of the design underlying the code. However this documentation is separate from the code. Without proper management discipline, the supporting documentation will deviate from the as-built system, if it ever matched in the first place. At the higher levels of abstraction in the design, this is exactly what is described as the system architecture.

What a maintenance programmer wants in the software system is the quality of maintainability. This is greatly enhanced by the presence of good supporting design documentation. One of the key attributes desired in this documentation is the traceability of a requirement to the specification and to the code which implements that quality. When the only artifact available is the source code, this is rarely possible, since the relationship is never one-to-one and usually not even many-to-one. Instead it is a many-to-many relationship between requirements and code, as sketched below. The consequence is unintended side effects, which have cost many programmers sleepless nights searching for ways to undo the unintended consequences of their fix or enhancement. With current, accurate and complete documentation, the maintenance programmer has a far greater chance of quickly understanding how the code implements the function (or non-functional quality) and making informed decisions about how best to make the change or fix the bug consistent with the original design and without introducing a new bug.
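Here is a minimal sketch of that many-to-many relationship (the requirement IDs and module names are hypothetical): a tiny traceability matrix and its inversion, the view a maintainer actually needs before touching a module.

```python
# Requirements map to many modules; modules serve many requirements.
trace = {
    "REQ-01 audit logging":    ["auth.py", "orders.py", "billing.py"],
    "REQ-02 failover":         ["db_pool.py", "orders.py"],
    "REQ-03 input validation": ["auth.py", "api_gateway.py"],
}

# Invert the matrix: which requirements does each module help implement?
modules = {}
for req, mods in trace.items():
    for mod in mods:
        modules.setdefault(mod, []).append(req)

for mod, reqs in sorted(modules.items()):
    print(f"{mod}: {reqs}")
# orders.py serves both REQ-01 and REQ-02; without this external matrix,
# nothing in the source alone warns the maintainer that a change there can
# ripple into two distinct requirements.
```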

The sad fact is that the majority of shops do not have even inaccurate design documentation reflecting their production systems. There are a few reasons for this, but in the end it is always indicative of poor management. First, there must be management discipline to enforce the delivery of design documentation with a system. Second, the professional staff must have the skills and discipline to create usable documentation. Third, the tradition of looking upon maintenance programmers as less than developers is short-sighted when viewed from the perspective of the entire system life-cycle.

If the development and maintenance staffs are part of the same organization, management has an excellent opportunity to evolve a site standard for the design documentation that will be most helpful to the maintenance organization. When they are within the same larger organization, this is a straightforward management task. Peer reviews by the maintenance staff of the documentation to be turned over, and the empowerment of the maintenance organization to resist turnovers that are incomplete or inaccurate, will allow the maintenance organization to provide the kind of efficient, quality service that is expected of them.

If the development organization is not within the same organization, many other problems occur. Often development is out-sourced to a professional organization. While they will offer a high-value service, they will be constrained by the terms of their contract. Sadly, the needs of the maintenance and operational organizations are often not given proper thought, if they are considered at all. Yet the statistics show that many systems have extended production and maintenance periods and that the money spent in those periods far exceeds what is spent in the initial development. Providing artifacts that reduce the cost and increase the efficiency of these processes is enlightened self-interest. Assuming the maintenance organization has a standard way of documenting their system design, the development organization should be contracted to provide an acceptable product at delivery.

Since the beginning of system development, the emphasis has always been on the finished, working product with little regard for the artifacts associated with it. Few shops even have a filing system in place to keep the products of the system development in a way that allows their review. More often than not, those products are boxed and remain in the project manager's office until he leaves the organization, and are then discarded with no review.

This emphasis on the end-product alone extends to management decisions regarding what is important when a project inevitably is pressed to deliver and the schedule has slipped. Rarely will the staff be driven to deliver documentation contemporary with the product. This separation of the product and the documentation subverts the review process (if it exists) and diminishes the quality of that product. Often errors in the documentation are only caught much later when the maintenance staff must use it to perform their work.

Since the design artifacts are never directly delivered, they often depart from the as-built system. Without professional and management commitment to create and preserve high-quality design documentation, it will not happen. This is as much due to the lack of professionalism within the management of the maintenance staff. A seasoned maintenance staff will push for, and receive, good documentation that supports their job function.

Availability and Fault-Tolerance

I got this discussion of the difference between availability and fault-tolerance from Quora...



Edmond Lau, Quora Engineer:
While availability and fault tolerance are sometimes conflated to mean the same concept, the two terms actually refer to different requirements. Designing for high availability is a stricter requirement than designing for high fault tolerance.

Availability is a measure of a system's uptime -- the percentage of time that a system is actually operational and providing its intended service. Service companies, when offering service level agreements (SLAs) to their customers, usually quantify their availability in nines of availability. Carrier-grade telecommunication networks claim "five nines" of availability [1, 2], meaning that the network should be up 99.999% of time and experience no more than 5.26 minutes of downtime per year. Amazon's S3 covers three nines of availability (99.9% uptime) in its SLA [3] and offers a service credit if it is down for more than 43.2 minutes per month.

Fault tolerance refers to a system's ability to continue operating, perhaps gracefully degrading in performance when components of the system fail. RAID 1, for example, by mirroring data across multiple disks, provides fault tolerance from disk failures [4]. Running a hot MySQL slave that can be promoted to a master if the master fails, or eliminating Hadoop's NameNode as a single point of failure [5] are other examples of making a system more fault tolerant.

Making individual components more reliable and more fault tolerant are steps toward making an overall system more highly available; however, a system can be fault tolerant and not be highly available. An analytics system based on Cassandra, for example, where there are no single points of failure might be considered fault tolerant, but if application-level data migrations, software upgrades, or configuration changes take an hour or more of downtime to complete, then the system is not highly available.

--------
[1] http://en.wikipedia.org/wiki/Car...
[2] http://www.windriver.com/announc...
[3] http://aws.amazon.com/s3-sla/
[4] http://en.wikipedia.org/wiki/RAID
[5] http://www.cloudera.com/blog/201...
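The nines figures above reduce to simple arithmetic. Here is a quick Python sketch (my own, not from the answer) that reproduces the 5.26 minutes/year and 43.2 minutes/month numbers:

```python
# Allowed downtime for a given number of "nines" of availability.

def max_downtime(nines, period_minutes):
    availability = 1 - 10 ** (-nines)      # e.g. 5 nines -> 0.99999
    return period_minutes * (1 - availability)

MINUTES_PER_YEAR = 365.25 * 24 * 60
MINUTES_PER_MONTH = 30 * 24 * 60           # the S3 figure assumes a 30-day month

print(f"five nines:  {max_downtime(5, MINUTES_PER_YEAR):.2f} min/year")   # ~5.26
print(f"three nines: {max_downtime(3, MINUTES_PER_MONTH):.1f} min/month") # 43.2
```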

My Conversation with Tim


From Tim Bender, Sep 10:
In discussing a large software project with an artist, I conjured the analogy that a software product is like a painting. Each engineer is an artist with their own style and they must paint a small portion of a large masterpiece, often without ever being able to see the whole thing. Recently, I watched a movie which made me recall this analogy and I pondered it further as a way of explaining in quite a simple way some of the rather complex interactions that occur in software engineering.

The idea centers around giving groups the common task of creating the simple image of a house with a lawn, a small family, the sun, a bird, and a flower. The image would need to be simple enough to be easily reproduced by an individual, but complex enough to offer varying entry points for concept learning opportunities.

Some of these concepts are weak and need some fleshing out.
1. Requirement solicitation:
Scenario: Give a team a sheet of paper and some coloring pencils. Express to them the importance of this drawing and that it must look exactly like what is being requested. Tell them to draw "a house with a lawn, a small family, the sun, and perhaps a bird and flower or something". Be purposefully vague, leaving them either to work from the initial description alone or to come ask follow-up questions.
Challenge: Requesting help and/or more information.

2. Specifying interfaces:
Scenario: Give a team the exact image they are to produce and a collection of transparencies attached to construction paper (making them non-transparent). Tell the team that each person must draw something on their transparency. Inform them that the transparencies will be stacked to create the complete image.
Challenge: Specifying interfaces clearly to minimize integration failures.

3. Organizing for a task:
Scenario: Give a team the exact image they are to produce and a blank canvas with some pens. Tell them that each of them must contribute something and that they will be asked to state their contribution. From there, let them self-organize.
Challenge: Skill auditing. Some members will be excellent at sketching/drawing. Some may be terrible. Those that are not gifted in this area could take on a small portion of the image or provide administrative support. A variation might include purposefully leaving out supplies that a team member must requisition.

I was curious if you would have any thoughts on this.

***********************
Me? Have opinions? Perish the thought!

This is an interesting metaphor but I probably don't see it the same way you do.

On my visit to Vienna I had a curated tour of the Kunsthistorisches (Art History) Museum. In the pre-20th-century tradition of large art studios, it was common for the studio to accept commissions in very much the way you describe, and in a way that does have some interesting parallels to software engineering. Each studio was headed by a master who was the creative genius that gave the studio its fame and provided the marketing effort to keep it employed. But given the output of the studio, it was not possible for the master to personally paint all the canvases. Raphael ran such a studio, and we spent time talking about how to tell a Raphael from his hand versus one that he may have never touched.

In the studio system the various artists had their individual talents. The less talented might be relegated to painting landscape backgrounds, another buildings, etc. Only the most talented would paint the main subject, which was always a human being. The ability to achieve life-like form and tone was prized, and only the very best could capture the "essence" of a human subject. For historical paintings that had many figures, the master might paint a few of the figures but would allow the studio to fill in the remainder. This form of organization seems very similar to how a software project develops. There is someone with experience who is responsible for the early launch and concept. But what is different is that unless it is an architecture-driven development, there is no chief designer who will oversee the design from start to finish. With each phase transition it is the project manager who oversees the progress toward the goal, not an architect. A project manager is usually more focused on the mundane aspects of the project, such as time and budget, than on the less externally visible properties of the system under development.

Art and software are alike in one respect: the effect of the completed product is what is valued, not the qualities of the individual pieces, no matter how good. So it is with a software system. A property like security is not solely dependent upon any single component, although that cross-cutting concern will be seen in many of the modules. The lack of attention to even one point in the design can doom the security of the entire system. To one extent or another, this can be true of all the quality (non-functional) requirements placed upon the system. Performance is famously lost through a weak link in the processing, as is availability; modifiability through the inappropriate decomposition of the modules; usability through the lack of an undo facility in the transactions. But here is the fundamental difference: these qualities can be reduced to an engineering model that provides quantitative, or at least testable, properties. This is not true of art, which is often said to lie in the eye of the beholder; not a good property in an engineering project.

Ironically, I have seen many projects proceed as if they were art projects. The project management, the client, the business analysts, and the coders all proceed with the functional requirements only and wing it with the crap that is often offered as "non-functional requirements." This forces the team to make decisions for the client as to what qualities, and what relative priorities, to give the product. The client then does not perceive the lack of some quality until they see an early prototype of the product, or worse, the finished product. Then, and only then, does the performance/availability/fault tolerance/usability/security get the attention it deserved, often with disastrous consequences to the design concepts that had guided the many hands that crafted the code.

The number 2 scenario sounds exactly like the way cel animation is done. I think the key difference here is that the transparencies are opaque and therefore block what is behind them. This makes the "interface" irrelevant since there is no need for coordination between Bambi and the forest trees behind her. As long as there is no interleaving between two transparencies, there is no need for the careful coordination that is suggested by interfaces. Software, of course, is very different, since almost any decomposition of a function into smaller modules will create some form of interface that must be spec'd.

For any product that will be created by many hands, I cannot believe that self-organization, as it is often understood, can work. I think most people take self-organization to be some form of egalitarian 'let's all get along.' I do not believe this is what the Agile manifesto suggests. Rather, I believe it suggests that experienced and skilled professionals are more effective at organizing themselves for the task than an explicit plan from some "manager" who is not as experienced with the technology. However, all the same roles will be filled. Group dynamics ensures that a leader will be selected and that the group will go through the steps of forming, storming, norming and performing. Self-organization will also do nothing to prevent the group dysfunctions that afflict traditional forms of management.

Going back to your example, if a group of artists is given a commission and left to self-organize, the leader who emerges is not guaranteed to be the best artist. The same is often true in engineering circles. Arrogance, self-confidence, and a domineering personality are what most often make a group leader. When that person is also the most talented individual and has the leadership skills to bring the team along, the results can be striking. But it can just as easily lead to disaster or a mediocre product.

Are They Really Requirements?

The paradigm we get from the waterfall model is that a client states their requirements and these lead unidirectionally to the specification for the system. We already know from the popularity of the Agile methods that waterfall is not the correct model; for many projects it is more productive to approach the system as an artistic product and make many iterations that can be shared with the client. The benefits of this approach have been well documented, but it still leaves the idea that there is some set of requirements that just needs to be uncovered and refined. I now believe this is a fatal flaw in this part of even the Agile methods.

Iterative methods gain their power from the inclusion of the client in the design process. In the older waterfall approaches, after the requirements were documented and signed off, the client would often see little progress (except bills and excuses why the project was behind schedule) until an advanced level of testing allowed the client to see the complete system as it was intended to work. The pain this caused, when it was realized that what was built was not an acceptable solution to the client's problem, was always significant, since so much had already been sunk into the wrong solution. Agile at least boxes the risk into a single iteration, and if the iterations are small enough, the damage can be limited. One good trait of all these Agile methods is that they support faster failure of dysfunctional projects.

Clients have always had misgivings about requirements signoffs. While they could not articulate it, they felt they were being set up for failure. Their business expertise did not prepare them for the task of directly specifying the product they needed, nor is the average business analyst prepared to completely understand the business function that is within the scope of development. Neither is generally prepared for the design tradeoffs that are often made without any direct client involvement, even if the team is articulate enough to explain how a particular design decision impacts the various emergent properties of the system under construction. The current practices simply don't work during the requirements elicitation and analysis phases of even the most Agile methods. Instead they substitute what I like to call the waving-of-the-hands form of requirements, where everyone simply talks about what is needed, very often in "I'll recognize it when I see it" kind of terms. The design team does its best to interpret these statements, goes away, and comes back with something that can be critiqued. The progressive elaboration of models and prototypes will allow an experienced team to eventually drive to a solution.

So what is wrong with this approach? After all, it does produce a solution. Is it optimal? Who knows? Is there any traceability to the design decisions made? Probably not. Does this sound like engineering? Emphatically no. We can do better.

What this approach lacks is the top-sight and planning that enable an architecture-centric approach. Many solutions do not depend upon an architecture-centric approach to achieve business success. When the solution is self-contained, highly derivative of a prior effort, or does not have exceptional quality requirements, the emergent properties of the system will probably not be difficult to achieve with little or no "big analysis up front." But this is not the forefront of software engineering today, and the solution to these challenges has not yet been found.

Emergent properties of a software system do not derive from one or even a small number of design decisions made when designing the product. Rather, they emerge when a collection of many, most, or sometimes all of the elements that comprise the system have their cross-cutting concerns addressed in a manner that allows the ensemble to achieve the collective goal. When the top-level decomposition of responsibilities has been properly done, these emergent properties can be achieved by independent developers working on sub-systems within the larger ensemble. But too often poor choices made in the first few design decisions can block the achievement of an emergent behavior even when all purely functional properties exist in the finished product. If the emergent properties that exist do not satisfy the business need, the product is rejected. But if the tradeoffs provide a product that "satisfices", the product will be accepted, and since the knowledge of what might have been is at best speculative (at worst, vindictive), the missed opportunity will never be known.

This can be improved by recognizing the key decision-making role the client plays throughout the design process. The key challenge in trying to include the client, though, can be seen in a near universal scenario. The client has established a timeline and budget for the project. Then the requirements gathering begins. At some point, an astute engineer will observe that it is only possible to get the needed functions and qualities a, b, and c at levels x, y, and z, and this combination is judged unacceptable. The engineer turns to the client and asks which they are willing to sacrifice. The client says "none. I want them all." What is really being said here is that the client is abrogating their responsibility to be a decision maker in the project. If the engineer is correct that the qualities cannot all be achieved simultaneously (time and budget being two of those qualities), then a team member is forced to make the decision for the client. Just as the client felt cornered when asked to sign off on the requirements document, so too does the designer feel abandoned just when he needs the client the most. This is, after all, a business decision, one that can have real business consequences. What is needed is for the client to engage in a form of negotiation to explore the possibilities, ensure that it really is impossible to achieve all goals simultaneously, and then take ownership of the ultimate decision with the help of the business analyst, architect, or whoever is in the lead design role.

So you can see that it is not as if requirements are completely elaborated at the time the project team ordinarily asks the business to sign off on them. This is an important first step and one that cannot be taken lightly. But neither can it be seen as the final word for business input. For an architecture-centric approach, it must include sufficient elaboration of the qualities that can only be achieved through cross-cutting concerns spanning a large portion of the code base to be developed. For any development it must at least be accurate even if it is not complete. But the impact of the aspirational goals cannot be known until the design progresses to a lower level; only then can the implications of that set of goals be understood. As soon as possible thereafter, the client must be brought back into the design process to negotiate the tradeoffs that should be made.

This process is much closer to advocacy than it is to specification. As leverage points appear, there is someone to argue for the most advantageous tradeoffs that achieve the client's goals. For the client representative(s), the job is to collaborate with the chief designer. For the project team, it is to do the most professional job possible in predicting the properties that will be present in the final product extended from this design.

(originally posted to Dale's Dilettantic Deliberations Sept 27, 2011)

The Failure of Reductionist Thinking in the Creation of Software Intensive Systems

My career was dedicated to helping large organizations build software-intensive systems that would solve their business needs. This was an honorable calling and one that had many moments of intellectual joy. With each level achieved, I perceived that the projects were often doomed to sub-optimal achievement by decisions made earlier in the process. I was drawn inexorably to these higher levels like a moth to a light. Since my capabilities allowed me to ascend toward the management ranks, I was allowed to see how these decisions were arrived at and to make my attempt to avoid the pitfalls I had witnessed in earlier projects. The culmination for me was to be project manager for several large, complex, high-risk projects, work that I eventually found soul-crushing and that caused an existential crisis I am only now recovering from. But this experience has left me with a clarity of vision on how management and the engineering staffs they employ come from such different cultures that the fact they achieve anything at all is a testament to the underlying unity of human society.

Projects are funded to solve business problems. The goal-directed nature of these efforts gives them a vitality that is lost if the goals cannot be succinctly expressed or are not shared by the stakeholders. This deep understanding of the goals is the sole province of the leaders of the organization. Many projects fail at this level due to the enunciation of commandments or the articulation of goals so laudable, but ultimately vacuous, as to be useless for guiding the project to a satisfactory conclusion. At this level, at a minimum, a project must be able to articulate how a successful conclusion will be recognized. There is no great harm in allowing untestable or abstract statements. However, management must recognize that their job is not complete until the project can articulate a set of concrete business objectives that will justify the capital and expense of the project. It must also accept that this is a contract with the project team and the project stakeholders that constrains management to accept this as the stated end position and to change it only with the full cooperation of the project team. After all, things change and knowledge is gained during the articulation of the goals. But too many projects lose focus and support when the vague goals they started with are never reified into something tangible enough to drive decisions, and the most energetic members of management move on to other more alluring goals, leaving the project with the heavy lifting of ensuring at least some reasonable goal is achieved with the resources expended.

Traditional waterfall methodologies, inherited from the success of very large, well-disciplined organizations at managing large complex projects, may never have worked very well; but where they were employed with thought and consistency, by well-trained and stable staffs, they worked better than anything else that had previously been tried. The dominance of this model of project process led most managements to view the creation of software systems as more akin to the creation of an automobile on the assembly line than to the creation of a work of art with inscrutable processes and unpredictable outcomes. Yet the experience of the past 40 years suggests that the latter may be the closer model of systems creation.

The most recent understandings of project failures involve the consequences of the earliest design decisions on some of the most valuable properties of the system to be created: the emergent qualities of the complete ensemble. This view is supported by the thesis of SEI, which posits that most, if not all, "non-functional" qualities of a software system derive from the way the system is decomposed and constituted during the design process. It is oft stated that you can't tack (take your pick here: usability, security, performance, etc.) onto the end of an already existing product. It must be a design goal from the beginning because of the cross-cutting concerns. That is to say, qualities are orthogonal to the functional needs. If you think about it, it cannot be otherwise, since we know we can refactor code in any number of ways without making any change in its dynamic behavior. Yet one or two inappropriate decompositions can make the achievement of some desirable qualities difficult or impossible. And these earliest design decisions are made long before there is any significant understanding of even the problem-domain, let alone the solution-domain, for the project. This reality puts architecture-centric techniques in tension with Agile techniques, which are embraced by managements for their fast delivery of tangible results. To balance these two competing forces requires a management that can weigh short-term gain against long-term investment; that can participate in the technological vision of an architect and create a safe space in which difficult concepts can be given enough time to mature, without engaging in a blind trust that time and money will necessarily result in a better product.

Another danger faced by management at these early stages of the project is the lack of maturity, both within management and within the engineering community, to properly analyze the problem-space. From an engineering perspective, the requirements engineering phase is at its best if it results in a set of models which express views of the problem-space in a way that enables the designer to project a new and different version that can be understood by both the business and the development organization(s). Yet no matter how powerful, these models are often the source of a great deal of damage. At this point I always have Magritte's famous "pipe" painting in mind, which proclaims "this is not a pipe." The point is that the painting is a depiction of a pipe, not an actual pipe. In software even more than in art, the error of reification is so easy to make that it is almost impossible to avoid. No model of human or organizational behavior can ever capture the true richness of the behavior it is capable of. But rather than embrace, or even acknowledge, this truth, management will attempt to use a model to constrain behavior in an attempt to achieve uniformity; uniformity that may be driven by a desire to reduce the skill set needed by the worker, to ensure that different organizations create an identical product, or merely by a belief that they are achieving some business goal that is never expressed. This reduction of human work to a standardized process has been studied since the dawn of the assembly line and doesn't need to be rehashed. But what is important here is how this mentality can cause the project to mistakenly believe that the model of the organization into which a system will be crafted really expresses the behavior of the people and the jobs they do, rather than allowing for the extra-system work that in most cases lets the system adapt and remain resilient to unexpected input. The model is not the process.

The biggest dangers in systems development are seen when the team believes they understand the problem in sufficient detail to fully articulate the solution specification. This is the realm of design. Design exists on a continuum, from the selection of previous designs that satisfy the requirements, as if the problem were a burnt-out bulb and the solution its replacement with a functionally equivalent one, to the creation of a novel and innovative solution to a problem that has never been tackled before, such as the creation of a spacecraft to support colonization on Mars. Too often a project is funded as if the problem space were a burnt-out light bulb while the stakeholders want to go on a mission to Mars. Project management, with support from senior management, must ensure that everyone on the project is guided to the same place on this continuum and that this scope is normalized to the budget it is given.

Another pitfall during this phase of the project is the abdication of management decision making. Design is all about tradeoffs. I haven't met a business manager yet who, when asked whether they want it fast or cheap, will not answer "yes." It is not unreasonable for business managers to seek a quality product that is instantly available, does everything they could ever think of, and is free. But we all know that is impossible. Decisions must be made, and if the business will not make them, the designers will. For the business not to engage in this decision making (assuming the designers do not usurp that responsibility) is a clear abdication of their responsibility. While a talented designer may actually make more astute decisions than the business, the business is left poorer for not owning those decisions, and it leaves a flaw in the crystalline connections of forces that link the problem to the solution via a line of empowered managers. What is sought is a product that, once delivered, is known to represent the best decisions possible from this organization, and one that cannot be disowned by those managers.

This brings me to what I believe is perhaps the most intractable danger in systems development: the achievement of emergent properties in the system. The drumbeat for "quality software" has been growing over the years and will surely increase as we attempt ever more complex systems in ever more demanding areas like health and transportation. Yet what we are grappling with is an engineering approach to achieving properties that have more in common with chaos and stochastic processes than with Newtonian mechanics. The Alexandrian pattern languages will enable us to evolve solutions through trial and error. Yet to an engineer this is an unsatisfactory place to be. By what principles, by what mathematics, can we envision some emergent property and then back into the simple rules that will allow this property to be expressed? Kurzweil suggests we will see the singularity before 2050, but even if we do, I don't think it will do us much good. Reaching the tipping point where our manufactured logic machines have comparable statistics to the wetware in our heads does not guarantee that they will become HAL-like. Unless and until we know more about the brain than the number of neurons, dendrites, and axons, or even their wiring pattern, I don't think the hardware achievement will mean much beyond the new economics of computation. There is still a new mathematics waiting to be created out there, one that does not fall prey to reductionist thinking.

(originally posted to Dale's Dilettantic Deliberations Oct 9, 2011)

Note to readers

I had a previous blog called Dale's Dilettantic Deliberations which was more tongue-in-cheek than this one will be. However, I did post some pieces there that are more appropriate here. I am reposting them here.

Sunday, January 15, 2012

Why Does the Module Reign in Software Engineering?

Time and again I am confronted by a prejudice in favor of the module as the primary object of study in software engineering. Yet the reality is that the module has shrunk over time while the overall size of applications has grown (ok, I don't have any proof of this yet). So why does the module continue its reign?

There is no doubt that code and programming languages are the bedrock foundation of a software system. But people do not fund building projects for their foundations, and architectural criticism is never about the infrastructure. We build software systems for behaviors and artifacts that are enabled by the ensemble of modules. Sure, the individual modules must do their part to support the collective good of the application. But like an individual cell of the body, the part cannot comprehend the good done by the whole.

The creation of the module, and the productivity with which it is created, are important subjects of study. But even more interesting to me is the study of how the overarching design that resulted in the specification for the module came to be. The programming languages used for the module do not seem to be the language with which we understand the larger design. Even pseudocode does not seem to capture the essence that must be comprehended before we can deconstruct it to the level of module specification. IMHO, it was the introduction of object-oriented design and programming that helped us make a big leap forward. But now we seem to be reaching the limits of what that paradigm shift offered and must find a new conceptualization of how we methodically go about designing and building these ever-growing software systems.

A case in point comes from a Microsoft research paper I saw from Bird, talking about the growing importance of predicting the power consumption of software. In the bad old days of programming, it was necessary to be continually mindful of the resource demands of the program under development. If you wanted something to be memory resident, you needed to carefully plan how big that item could be. Now the hardware resources are so inexpensive that it is unusual to need to give them more than a moment's thought. Further, the sophistication of systems now provides mechanical assistance in the form of code optimization, caching and other techniques that find optimization where none had been planned. This allows the coder to do what the machine cannot: creatively interpret the specification and design the algorithms that will implement the desired behavior.

In this paper, Bird points out that the days of obliviousness toward resource consumption may be numbered. As the world's platform shifts from the desktop to the palm, battery life is everything. And we are unlikely to see anything like Moore's law in battery technology. The best we can hope for is continued improvement in the power consumed per computation.

What I think is even more salient is that we are moving away from homogeneity in the qualities we are looking for in software products. Where once we only needed a functional implementation of some behavior, now we might need an energy-efficient implementation for a proprietary application, but an easy-to-extend one for an open source project. If we can make one application that satisfies both needs, great. But having two versions is entirely acceptable.

SEI seems to be the continued champion of viewing software not as a bundling of functional behavior but as an ensemble that possesses emergent behavior that can sometimes be difficult to coax out of the interconnection of the modules. As the interconnections between and among the modules grow exponentially, we approach a point where deterministic analysis becomes humanly impossible. We are faced with two options: use machines to do what humans can no longer do, or resort to the kind of analysis and measurement done in thermodynamics. As the numbers grow larger we stop caring about the individual elements and instead depend upon the aggregates of their behavior. We don't talk about the kinetic energy of the molecules; we talk about the heat and pressure of the gas. We'll try to kick this can down the road a bit, but I will be looking for the first signs that we might more productively discuss the measurements of these qualities rather than the metrics of the modules that comprise the whole.

Saturday, January 14, 2012

First reading

The empirical software engineering group is headed by Premkumar Thomas Devanbu. He invited me to sit in on their weekly readings. The paper we discussed yesterday was An experiment about static and dynamic type systems: doubts about the positive impact of static type systems on development time (http://dl.acm.org/citation.cfm?id=1869462). It was fun to participate in the discussion, but I found the paper frustrating since I find its conclusion utterly unremarkable.

The experiment had two groups of students create a scanner and parser. One group used a statically typed language while the other used a dynamically typed language. Given the anecdotal evidence, the expectation is that the statically typed language group would achieve superior results.

Quoting the paper: "The starting point of the experiment is the hypothesis underlying the work in [23, 7]: the application of static type systems reduces the development time of programming tasks."
The raw result was that the dynamically typed group produced the better outcomes; the paper's stated conclusion is that static typing had neither a positive nor a negative effect on development time.

My first issue with the paper is the underlying motivation for the experiment. The experiment had a single student produce a single result over a period of about 27 hours of development time. The author states that the motivation is to provide empirical evidence for decision makers who must choose between statically and dynamically typed languages. I fail to see how the results of this experiment can seriously be argued to give the evidence needed for that decision, except insofar as the paper opens a discussion of the methodology by which one might gather such evidence, which is exactly what our group did.

First, my intuition tells me that static typing helps the programmer when the type of a variable cannot be known from its name alone and a design choice is made that is incompatible with the variable's actual type. As I think Earl pointed out (paraphrasing and elaborating), the greater the distance between the design decision that introduces a variable and a later use of that variable, the more room there is for cognitive slip. If the variable is intended as one thing (say an integer) but later interpreted as another (say a pointer, assuming that is possible), there will be a run-time error in a dynamically typed language but a compile-time error in a statically typed one. The greater the distance, the higher the likelihood that such a type error occurs.
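To make the slip concrete, here is a minimal sketch in Python, a dynamically typed language; the function and variable names are invented for illustration. The misuse goes unnoticed until the offending line actually executes, whereas a statically typed compiler would reject the equivalent program before it ever ran.

    def start_session():
        session = 42                  # conceived early on as a numeric id
        # ... imagine many screens of code, or several days, in between ...
        session.append("user1")       # later misremembered as a list of users

    try:
        start_session()
    except AttributeError as err:     # an int has no append method
        print("caught only at run time:", err)

    # In a statically typed language the same slip, e.g.
    #     int session = 42;  session.append("user1");
    # would fail to compile, regardless of which paths are exercised.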

How much distance can be created between the introduction and use of a variable in this experiment? Distance comes in several forms. First, there is lexical distance (Earl's point). If you have a module of 6 kSLOC (they exist), the first use can be many screens away from the new use. The harder it is to see all uses of a variable, the easier it is to depend on memory, instinct or some other clue. The worst case is when the immediate need overwhelms or substitutes for the other evidence and causes the programmer to falsely remember the variable as one type when it is in fact another. For those who practice a form of bottom-up design, this shift in thinking is likely as their understanding of the problem grows. While this is possible in the experiment, how large could the source code have become? If we are generous and assume a productivity of 100 SLOC per hour, 27 hours yields only about 2.7 kSLOC; hardly an overwhelming burden.

Another source of distance occurs when the person who wishes to use the variable is different from the one who conceived it. This is the norm for maintenance programming. It is also true when development is a team effort and one member creates modules and publishes an API without sufficient (or unread) documentation. It is even possible when the same person returns to a task after a break, having decided on a design change, and does not completely refactor the code to reflect it. Here the distance is perceptual rather than lexical: the shift comes from a change of programmer, or from a change in one programmer's thinking with the passage of time.

Again, how likely is this to occur in this experiment? A single student working over a relatively short period cannot create much distance. The most likely cause would be the bottom-up method of design the student is likely to exhibit, for which static typing is helpful. Yet the size of the task makes it easy for the student to remain aware of most of the variables in the design without looking them up; the entire design fits in their head.

Another way in which dynamic typing can introduce type errors is when a large number of types is in use. In an object-oriented language, each class should be thought of as a type. The larger the number of classes, the higher the likelihood the programmer will misremember the correct class for a variable.
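As a small illustration of that point (the classes here are invented): with several similar classes in play, a dynamically typed language happily accepts the wrong one, and the misremembered method call fails only when it is actually executed.

    class Invoice:
        def total(self):
            return 100

    class Receipt:
        def amount(self):
            return 100

    def close_out(doc):
        return doc.total()       # correct for Invoice, misremembered for Receipt

    close_out(Invoice())         # fine
    try:
        close_out(Receipt())     # the wrong class slips through until run time
    except AttributeError as err:
        print("caught only at run time:", err)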

I can also see how even a shift in data can result in latent run-time errors. Consider a program that ordinarily expects a string but instead finds a number: the chain of inferred uses leads everything downstream to treat the value as numeric rather than string, so the failure, if any, surfaces far from its source.
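Here is a minimal Python sketch of that failure mode (the payloads and names are invented): once a number arrives where a string was expected, the downstream operation still "works", and the wrong value propagates silently instead of failing fast.

    import json

    def scaled(payload):
        qty = json.loads(payload)["qty"]
        return qty * 3                # legal for both str and int, with different meanings

    print(scaled('{"qty": 5}'))       # 15, as designed
    print(scaled('{"qty": "5"}'))     # "555" -- no error raised, just a latent wrong value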

Statement of Purpose

Scientific theories come from intuition honed by experience. I created many large, complex software systems for Fortune 100 companies and depended upon my intuition in addition to state-of-the-art methods and tools when building these systems. I hoped a master’s degree in software engineering would validate these intuitions. That degree now looks like the start and not the end of this hope. I wish to continue research into the methods and tools that improve the quality of software and the efficiency of its development.

My master’s project started as a design for the automated assessment of website usability. That changed once I found a European research group that had already created one. In trying to extend their work, the project ultimately became a study of how one approaches the reverse engineering of a moderately large system (130 KLOC) rather than of the results that can be gathered by using that system. Though small, my project demonstrated to me that I could successfully adapt from the commercial world of development to academic research.

Many things have prepared me for this new challenge: my undergraduate degree in computer science from a well-respected engineering school, my recent master’s degree in software engineering, the education I pursued outside of any program, and my career. Of that preparation, what I most prize is the discipline of detailed thinking and self-education I learned in my career. Extensive testing always indicated that I have the aptitude for this kind of rigorous abstract reasoning, but it was the encouragement I received from my teachers that made me realize I was prepared to tackle a thesis.

At UC Davis, I discovered the empirical software engineering group. I find overlap between the orientation of the group and my own instinct for investigation. My early training as a scientist left me aware of the paucity of hard evidence proving the benefit of one software engineering technique over another. Most new techniques are fads with anecdotal support at best. I share the group's belief that software engineering will be improved through rigorous empirical investigation rather than this kind of anecdotal or qualitative evidence. I also firmly believe that methodologies from the social sciences, rather than the hard sciences, are the more appropriate ones for software engineering.

Given the opportunity, I would choose to research either requirements engineering or code comprehension. We know that the cost of correcting an error grows exponentially the later in the process the correction is made. Because of that, errors in requirements engineering are the most expensive, and yet they are still quite common. Based on what I have read so far, there is still much to be learned.

I also believe that peer review of code is an underused and powerful way of improving quality. Code written solely for the benefit of mechanical functionality without regard to human comprehension is rarely acceptable. Evidence suggests that human review of code is more effective for verification and validation than testing. Beyond the question of why this technique is not used is the question of what makes code comprehensible to non-authors.

After a great deal of reflection on my future path, I feel I am ready, willing, and able to tackle this new challenge. It will be the challenge of taking some intuitive, novel idea and developing it into a proven statement about software engineering and a lasting addition to the field. Regardless of what research or teaching opportunities are enabled by this degree, this is something I feel I am meant to do.