Lean Bureaucracy

Who’s overseeing the overseers?

The government provides oversight of projects and programs. Interestingly, this oversight often happens outside the normal reporting structure of the government agencies. It is considered important for these overseers to be independent – not part of the organization that is sponsoring or administering the project. While this allows for some objectivity, it also means that the overseers have little “skin in the game” – they do not have to live with the consequences of their decisions. The team running the program does.

Now suppose – this is theoretical, of course, and would never happen in any situation I am familiar with, ahem – suppose that the oversight body imposed substantial burdens on the programs it oversaw. Suppose that it demanded extensive documentation that no one ever read, nit-picked the format of that documentation, imposed supposed “best practices” that were not actually best practices, and frequently asked for data or status updates that distracted those managing the program. Suppose further that the overseers themselves were not always efficient: they held up programs while trying to schedule review meetings, gave the programs contradictory direction, and argued amongst themselves or prepared inadequately for review meetings. The problem could be exacerbated if the overseers did not themselves have the experience of running programs, so that their understanding of best practices was at best theoretical, at worst superstitious.

If that – ahem – ever happened, then given the power oversight bodies have, they would essentially be ordering the programs to waste money. They might also add risk to programs. Since the job of the oversight body is ostensibly the opposite – to prevent waste and mismanagement – this could be a critical issue. What controls are in place to prevent this? Is the oversight body perhaps incentivized to make “corrections” to the programs to demonstrate its own usefulness?

Because the oversight bodies are not “inline” with the management structure over the program, they have no obligation to cultivate the program team as employees. They do not need to encourage program staff, deal with issues of demoralization, provide positive feedback, or create a comfortable work environment that will attract more great performers into program management. Oversight in such an environment runs the danger of focusing on negativity and control rather than on successful execution.

How can we improve this? Oversight bodies must be measured by the success of the programs they oversee, not by their willingness to cancel failing programs. They must be composed of people who are experts – not in overseeing programs, but in executing them. They must work to optimize their own processes and minimize the waste they add to programs, and must solicit feedback from programs to understand what waste they are causing. What I am saying is that oversight must create value for programs – and only the programs can judge whether it does.

Cancel Successful Projects, Not Failing Ones

Government oversight bodies take great pride in canceling failing projects. They consider it a measure of success. What are oversight bodies for? Eliminating wasteful investments, of course. At first glance this might seem consistent with an agile mindset. Among the advantages of an agile approach are the transparency given to stakeholders and the ability to manage risk by working in increments. Only the current increment is at risk, since the project can be stopped before the next increment begins. Problems are exposed through transparency, and oversight can take advantage of the incremental approach to stop a failing project.

This way of thinking is a mistake. The project was started because of a mission need. Canceling the project leaves that mission need unmet. Oversight has failed in two senses: (1) it failed to make the project successful, and (2) it did not allow a mission need to be met, one that was important enough to have invested in. If, in fact, the mission need is important, then a new project will have to be started to address that same need. The new project will have the overhead of starting up a new program, thereby wasting more money. Instead of canceling the program, the oversight body should shift the course of the current program to make it more successful.

But isn’t that throwing more money into a wasteful program? No – what has been spent so far is a sunk cost, and there may be salvageable assets. Doesn’t the program’s failure to date mean that it is poorly managed? Not necessarily – and if it is, the oversight body should simply force a change in program management, not cancel the program. In many cases the problem is not management but circumstances outside its control. The oversight body should help the program overcome those outside forces. Terminating the program is just a way to punish the program’s management, and it adds waste for the government as a whole.

Why cancel a successful program? If a program has delivered substantial value to date, then the oversight body should consider whether the remaining work of the program is still necessary. If the program’s work was appropriately prioritized, then there should be diminishing returns to continuing it. Oversight should constantly reassess the value of the remaining work and ask whether the agency’s needs have changed such that it is no longer worth the investment. If the oversight body decides that this is the case, it should cancel the remainder of the program and rejoice – for this it can legitimately claim an oversight success!

The problematic Paperwork Reduction Act

The Paperwork Reduction Act (1980, 1995) and the Government Paperwork Elimination Act (1998) together suggest that the government wants to move away from burdensome, paper-only interactions with the public toward a 21st-century approach that takes advantage of the online world. The Government Paperwork Elimination Act (GPEA) mandates that government agencies treat electronically submitted information the same as a paper version – even to the extent of recognizing electronic signatures – so that individuals can transact with the government electronically. The Paperwork Reduction Act (PRA) is intended to reduce the burden on the public resulting from information collections. Simply put, agencies should not require unnecessary information from the public and should make the best use of the information they have collected.

These goals are the right ones. As someone who has applied for visas for foreign countries and had to provide odd pieces of information that were clearly irrelevant, I am happy that the US has a mechanism to avoid such a thing. Unfortunately, the details of the legislation and its implementation are interfering with the goal, despite what are clearly the best intentions of all concerned.

One problem is process-related. The PRA sets up a process, for both new forms and changes to existing forms, that requires a 60-day public comment period followed by a second 30-day public comment period once feedback from the initial period has been incorporated. The form must then be approved by the chronically understaffed Office of Information and Regulatory Affairs (OIRA) at OMB. With the time required to prepare the documents OIRA requires, the process can take one to two years for a change to an existing form.

The result is that agencies are discouraged from making improvements to their forms. Planning within agencies centers on how to avoid making changes that will trigger a PRA review. In an era when tech-savvy companies make continuous improvements to their user interactions, often testing two versions of the user interface at the same time (A/B testing), this process interferes with the government’s ability to reduce burden and improve the public’s experience when transacting with the government.

A second issue is the existence of loopholes in the legislation. Government agencies are instructed to accept electronic signatures “where practicable.” In many cases the Department of Justice believes that such signatures are not “practicable” and agencies must require “wet” signatures even if a form is submitted electronically.

Perhaps the biggest issue, though, is the equating of paper and electronic versions of forms. OIRA requires parity between forms that are available both electronically and in print. This means that many of the features of electronic customer interaction are not allowed, since they would create a disparity between the channels. For example, online forms typically “validate” information as it is entered, flagging errors in the user’s input. Since paper allows the user to write anything they want, agencies are not allowed to stop an applicant from electronically submitting information that is clearly wrong. This denies agencies and the public one of the greatest benefits of electronic interactions.

There is a more subtle and insidious problem with this requirement. Electronic applications are generally – outside of the government – interactive; that is, as the user enters information the computer responds by providing related information. For example, once the applicant has been identified, the system can look up information it already has on the applicant and provide it as a “default” to reduce the burden on the applicant. But this would diverge from what is available on a paper application.
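To make the contrast concrete, here is a minimal sketch of the two capabilities described above – validating input as it is entered, and prefilling defaults from information the agency already holds. Everything in it (the applicant lookup, the field names, the business rule) is hypothetical, invented purely for illustration:

```python
# Minimal sketch of interactive electronic forms (all names hypothetical).
from datetime import date

# Hypothetical stand-in for records the agency already has on file.
KNOWN_APPLICANTS = {
    "A123456": {"name": "Jane Doe", "mailing_address": "123 Main St"},
}

def validate_date_of_birth(value: str) -> str | None:
    """Flag an obviously wrong entry at submission time, instead of
    accepting it the way a paper form would."""
    try:
        dob = date.fromisoformat(value)
    except ValueError:
        return "Date of birth must be in YYYY-MM-DD format."
    if dob >= date.today():
        return "Date of birth cannot be in the future."
    return None  # no error

def prefill(applicant_id: str) -> dict:
    """Default fields from information already on file, reducing the
    burden on the applicant."""
    return dict(KNOWN_APPLICANTS.get(applicant_id, {}))

print(validate_date_of_birth("2999-01-01"))  # caught as the user types
print(prefill("A123456"))                    # defaults offered to the user
```

Neither behavior has a paper equivalent – which is exactly why a parity requirement rules both out.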

As a result, the government’s electronic applications are static, treated as mere equivalents of the paper application. As with paper, the applicant is expected to fill out a static page and submit it before the government can provide any help. The paperwork burden on the public is not reduced, and the agency receives bad data, which makes its processing less efficient.

The PRA requires that an agency, “to the maximum extent practicable, uses information technology to reduce burden and improve data quality, agency efficiency and responsiveness to the public.” The Open Government Directive further requires that OIRA review the PRA for impediments to the use of new technologies. In my view, that means we cannot treat electronic forms as if they were paper forms; rather, we must take advantage of everything electronic interaction allows. Doing so would realize the spirit of the PRA and GPEA better than today’s process does.

Spend more on delivery, less on risk mitigation

Let’s do a simple Lean analysis of government IT system delivery projects. How much of our spend goes to activities that directly create value, and how much is overhead?

The value-creating part of a software development project is primarily the actual development and testing of the software. Add to that the cost of the infrastructure on which it is run, the cost of designing and building that infrastructure, and perhaps the cost of any software components from which it is built. I mean to include in these costs the salaries of everyone who is a hands-on contributor to those activities.

The non-direct-value-creating part is primarily management overhead and risk mitigation activities. Add to these the costs of the contracting process, documentation, and a number of other activities. Let’s call all of this overhead. A great deal of it is risk mitigation: oversight to make sure the project is under control; management to ensure that developers are doing a good job; contract terms to protect the government against non-performance.

No one would claim that these overhead categories are bad things to spend money on. The real question is what a reasonable ratio between the two would be. Let’s try a few scenarios. An overhead:value ratio of 1:1 would mean that for every $10 we spend creating our product, we spend an additional $10 to make sure the original $10 was well spent. Sounds wrong. How about 3:1? For every $10 we spend, we spend $30 to make sure it is well spent? Unfortunately – admittedly without much concrete evidence – I think 3:1 is actually pretty close to the truth.

Why would the ratio be so lopsided? One reason is that we tend to outsource most of the value-adding work; the government’s own role is reduced to management overhead and the transactional costs of contracting. Management overhead is also duplicative: the contractor manages the project and charges the government for it, and the government provides program management on top of that. Another reason is the many layers of oversight and the diverse stakeholders involved. Oversight has a cost, as does all the documentation and risk mitigation activity tied to it. And when something goes wrong, our tendency is to add still more overhead to future projects.

A thought exercise. Let’s start with the amount we are currently spending on value-creating activity and $0 for overhead. Now let’s add incremental dollars. For each marginal dollar, let’s decide whether it should be spent on overhead or on additional value creation (that is, programmers and testers). Clearly we will get benefit from directing some of those marginal dollars to overhead. But very soon we will face a difficult choice: investing in more programmers will allow us to produce more. Isn’t that better than adding more management or oversight?
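The exercise can be made mechanical. In the toy version below, each marginal dollar goes to whichever category offers the larger marginal benefit. The benefit curves are pure invention – the only assumption baked in is that overhead has steeply diminishing returns while delivery keeps producing – but that assumption is enough to show how far the allocation lands from 3:1:

```python
# Toy marginal-dollar allocation (benefit curves invented for illustration).

def marginal_benefit_delivery(spent: float) -> float:
    # Assume each delivery dollar keeps producing roughly a dollar of value.
    return 1.0

def marginal_benefit_overhead(spent: float) -> float:
    # Assume overhead is very valuable at first, with steep diminishing returns.
    return 3.0 / (1.0 + spent / 10.0)

delivery, overhead = 0.0, 0.0
for _ in range(100):  # allocate 100 marginal dollars, one at a time
    if marginal_benefit_overhead(overhead) > marginal_benefit_delivery(delivery):
        overhead += 1
    else:
        delivery += 1

print(f"delivery: ${delivery:.0f}, overhead: ${overhead:.0f}, "
      f"ratio {overhead / delivery:.2f}:1")
# Under these assumptions: delivery gets $80, overhead $20 -- a 0.25:1 ratio.
```

Under any plausible curves of this shape, the rational allocation stops feeding overhead long before it reaches parity with delivery, let alone triple it.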

To produce better results, we need to maintain a strong focus on the value-creating activities – delivery, delivery, delivery.

Good technical practices are critical for government contracting

Good technical practices (such as those typical in DevOps environments) can help the government in contracting for information technology services. We should require these technical practices in our IT services contracts, and if we are investing in QA and independent verification, we should invest first in validating good technical practices. Let me give a few examples. Readers without a technical background should be able to find more information about these practices online.

Good, state-of-the-art testing practices are important for more than the obvious reasons. Most tests should be automated and should follow the classic “testing pyramid”: many unit tests, somewhat fewer integration tests, and fewer still at the user interface level. The automated tests themselves are just as important a deliverable from the contractor as the code itself.

There are many reasons why such automated tests are important in our contracting environment. The automated tests serve as regression tests that will speed later work on the system. If a second contractor does something that “breaks” the first contractor’s code, it will be spotted immediately; in essence, the tests “protect” the first contractor’s code. If a new contractor is brought in for O&M or future development, the automated tests serve as documentation of the requirements and allow the new contractor to make changes or refactor with confidence – the work is sound as long as the regression tests continue to pass.
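A minimal sketch of what such a deliverable looks like (the function and the fee rule are hypothetical). The test documents the requirement in executable form, and any later change that breaks the behavior fails the build immediately:

```python
# Hypothetical business rule delivered by the first contractor.
def fee_for_application(pages: int) -> int:
    """$50 base fee plus $10 per page (invented rule, for illustration)."""
    return 50 + 10 * pages

# Unit test delivered alongside the code; runnable with pytest.
def test_fee_for_application():
    # These assertions double as executable documentation of the requirement,
    # and as a regression guard protecting the original contractor's work.
    assert fee_for_application(0) == 50
    assert fee_for_application(3) == 80
```

Run under pytest on every build, a suite of such tests is what lets a successor contractor refactor without fear.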

Scripted deployments and “infrastructure as code” serve a similar function. By providing automated scripts to set up the production environment and deploy code, the contractor is documenting the deployment process (and reducing the amount and cost of paper documentation!). No longer is the knowledge just in their heads (making it costly to replace the contractor). Deployment scripts can be tested, making them an even more valuable form of documentation. They can be placed under version control and audited, increasing security.
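As a sketch of what “the deployment process as code” can mean at its simplest, consider the script below. The host, paths, and service name are hypothetical, and a real pipeline would do far more, but even this much is version-controllable, auditable, and testable:

```python
# Minimal scripted-deployment sketch (host, paths, service name hypothetical).
import subprocess

def run(cmd: list[str]) -> None:
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)  # raise and halt the deployment on any failure

def deploy() -> None:
    # Copy the release artifact to the (hypothetical) application server...
    run(["rsync", "-az", "build/", "deploy@app-server:/opt/myapp/"])
    # ...then restart the (hypothetical) service so the new code takes effect.
    run(["ssh", "deploy@app-server", "sudo systemctl restart myapp"])

if __name__ == "__main__":
    deploy()
```

Because every step is explicit, the script itself is the documentation of how the system reaches production – and replacing the contractor does not mean losing that knowledge.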

Continuous integration increases our ability to work with multiple contractors and gives us more confidence in a contractor’s status reports. By continuously integrating code we ensure that code from multiple contractors will interoperate, and we avoid last-minute surprises when a contractor’s “100% finished” work fails to integrate.
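The heart of a CI gate is small; the sketch below shows the idea (the repository setup and test command are assumptions, and real CI servers add scheduling, notification, and history on top). Pull everyone’s latest code, run the full automated suite, and fail loudly – so integration problems surface within hours of the commit that caused them:

```python
# Sketch of a continuous-integration gate (repo and test command assumed).
import subprocess
import sys

def step(cmd: list[str]) -> bool:
    return subprocess.run(cmd).returncode == 0

def ci_build() -> bool:
    return (
        step(["git", "pull", "--ff-only"])          # integrate the latest code
        and step(["python", "-m", "pytest", "-q"])  # run the whole automated suite
    )

if __name__ == "__main__":
    ok = ci_build()
    print("build", "passed" if ok else "FAILED")
    sys.exit(0 if ok else 1)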

A zero-defect mentality – where user stories are tested immediately and defects are remediated immediately – ensures that code the contractor says is finished really is finished. It avoids passing defective code from one contractor to another, reduces finger-pointing, and makes integrating code simpler. It also serves as an equalizer when comparing contractor performance: if one contractor finishes 10 stories and leaves 15 defects while another finishes 8 similarly sized stories and leaves only 12 defects, which has performed better? We can’t know. Zero known defects should be our expectation.

The last practice I will mention is the use of good design patterns and architectures that feature loose coupling. Good use of design patterns makes it easier for a new contractor to understand the code they inherit. Encapsulating pieces of the system makes it easier for multiple contractors to work in parallel, even at different paces.
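A minimal sketch of loose coupling through an explicit interface (all names here are hypothetical). One contractor codes against the interface; another implements it; either side can be replaced without touching the other:

```python
# Loose coupling via an explicit interface (all names hypothetical).
from abc import ABC, abstractmethod

class CaseStore(ABC):
    """The seam between the two contractors' work."""

    @abstractmethod
    def save(self, case_id: str, data: dict) -> None: ...

    @abstractmethod
    def load(self, case_id: str) -> dict: ...

class InMemoryCaseStore(CaseStore):
    """Contractor B's implementation (a trivial stand-in here)."""

    def __init__(self) -> None:
        self._cases: dict[str, dict] = {}

    def save(self, case_id: str, data: dict) -> None:
        self._cases[case_id] = dict(data)

    def load(self, case_id: str) -> dict:
        return dict(self._cases[case_id])

def process_application(store: CaseStore, case_id: str) -> dict:
    """Contractor A's code depends only on the interface, never the implementation."""
    case = store.load(case_id)
    case["status"] = "processed"
    store.save(case_id, case)
    return case
```

The interface is the contract, in both the software and the procurement sense: as long as it holds, the two teams can proceed independently.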

Together, these practices make it easier to judge contractor performance, allow us to partition work among a number of contractors, and make it easier to switch contractors over time.

(thanks to Robert Read at 18F for some of these ideas)

The “business value” of government

Agile delivery approaches focus on maximizing business value rather than blindly adhering to pre-determined schedule and scope milestones. On the definition of “business value” the agile literature is appropriately vague, for business value is defined differently in different types of organizations. I would even argue that it is necessarily different in every organization – each company, for example, is trying to build a unique competitive advantage, and results that contribute to that advantage can be valuable (“net” value, of course, would have to consider other factors as well). A publicly held company needs to maximize shareholder value; a closely-held private company values … well, whatever the owners value. A nonprofit values mission accomplishment. What does the government value and how does it measure value?

The answer is not obvious. Mission accomplishment is certainly valued. But different agencies have different missions, and for some agencies measuring mission accomplishment is difficult (James Q. Wilson’s book Bureaucracy is great reading on the topic of agency missions). If the Department of Homeland Security values keeping Americans safe, how can it measure how many Americans were not killed because of its actions? In an agile software development project, how can we weigh cost against that sort of counterfactual value to determine which features are important to build?

To make matters more complicated, the government values many things besides mission accomplishment. Controlling costs, obviously. Transparency to the public and to oversight bodies. Implementation of social or economic goals (small business preferences, veterans preferences, etc.). Auditability – evidence that projects are following policies. Fairness to any business that wants to bid on a project. Security, which in the government IT context can extend to keeping the entire country safe. And through appointed political agency leadership, political goals can also be a source of value. Each of these values may add cost and effort to a project.

To maximize business value, we must consider all of these sources of value. If we limit ourselves to the value of particular features of our software, we are missing the point. Rather, as IT organizations in the government, we need to self-organize to deliver the most value possible, given all of these sources of value. The government context determines what is valuable. What we must do is find the leanest, most effective way to deliver this value. This is no different from the commercial sector – only the values are different.

Government as a low-trust environment

The US government is, deliberately and structurally, a low-trust environment. Think about why we have a “system of checks and balances.” We have proudly created a government structure that is self-correcting and that embodies our distrust of each branch of government. Why is freedom of the press such an important value to us? Because we all want transparency into the government’s actions – not to celebrate its fine management practices, but to know when it is doing something wrong. Within the government, we have Inspectors General to investigate misbehavior, Ombudsmen to make sure we are serving the public, and a Government Accountability Office. To work in the government is to work in an environment where people are watching to make sure you do the right thing. It is a culture of mistrust.

That sounds horrible, and from the standpoint of classic agile software development thinking, it is unworkable. But take a step back – don’t we sort of like this about the government? “Distrust” has unpleasant connotations, but as a systematic way of setting up a government, there is a lot to be said for it. It is another way of saying that the government is accountable to the people. You could almost say – you might want to hold on to your seats here, agile thinkers – that mistrust is actually a value in the government context. So where does that leave us, if agile thinking wants us to deliver as much value as possible but believes that agile approaches require trust?

It might sound academic, but I think solving this dilemma is critical to finding ways to bring agile thinking into the federal government. A typical IT project experiences this structural distrust over and over: in the reams of documentation it is required to produce, in the layers of oversight and reviews it must face, and in the constraints imposed on it.

I will argue that even in a low trust environment, agile approaches are still the best way to deliver IT systems. And that certain tools – borrowed primarily from DevOps – actually help us resolve the dilemma. Waterfall approaches fit well with mistrustful environments by holding out the promise of accountability and control – but they just don’t work. So how can we bring agile, lean, team-based processes into an environment that is structurally mistrustful, and realize our goal of a lean bureaucracy?