
DevOps and FISMA, part 2

In my last post I discussed how rapid feedback cycles from production can support FISMA goals of continuous monitoring and ongoing authorization. Today I’d like to discuss FISMA compliance and DevOps from another perspective.

In order to support frequent, rapid, small deployments to production, we must ensure – no surprise – that our system is always deployable, or “potentially shippable.” That means that our system must always be secure, not just in production but also in the development pipeline. With a bit of effort, the DevOps pipeline can be set up to achieve this.

I find it helpful to think of security vulnerabilities or flaws as simply a particular kind of defect. I would treat privacy flaws, accessibility flaws (“Section 508 compliance”), and other non-functional flaws the same way. I believe this is consistent with the ideas behind the Rugged DevOps movement. We want to move to a zero-defect mentality, and that includes all of these non-functional types of defects.

Clearly, then, we need to start development with a hardened system, and keep it hardened – that way it is always deployable and FISMA compliant. This, in turn, requires an automated suite of security tests (and privacy, accessibility, etc.). We can start by using a combination of automated functional tests and static code analysis that can check for typical programming errors. We can then use threat modeling and “abuser stories” to generate additional tests, perhaps adding infrastructure and network tests as well. This suite of security tests can be run as part of the build pipeline to prevent regressions and ensure deployability.
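To make that concrete, here is a minimal sketch of what a few automated security tests in the build pipeline might look like, assuming Python with pytest and the requests library. The staging URL, endpoints, and header expectations below are hypothetical, not a prescribed standard.

```python
# Illustrative security regression tests, run in the build pipeline on every commit.
# The staging URL and protected endpoint are hypothetical placeholders.
import requests

STAGING_URL = "https://staging.example.gov"  # hypothetical deployment

def test_https_is_enforced():
    # Plain-HTTP requests should be redirected to HTTPS, never served directly.
    resp = requests.get(STAGING_URL.replace("https://", "http://"),
                        allow_redirects=False, timeout=10)
    assert resp.status_code in (301, 302, 307, 308)
    assert resp.headers["Location"].startswith("https://")

def test_security_headers_present():
    # Catch regressions in basic hardening headers.
    resp = requests.get(STAGING_URL, timeout=10)
    assert "Strict-Transport-Security" in resp.headers
    assert resp.headers.get("X-Content-Type-Options") == "nosniff"

def test_unauthenticated_access_is_denied():
    # An "abuser story": an anonymous user must not reach a protected resource.
    resp = requests.get(f"{STAGING_URL}/admin/reports", timeout=10)
    assert resp.status_code in (401, 403)
```

Run on every commit, checks like these turn “stay hardened” into a regression gate rather than a periodic audit.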

How can we start with a hardened system, when we almost always need to develop security controls, and that takes time and effort? I don’t have a perfect answer, but our general strategy should be to use inherited controls – by definition, controls that are already in place when we start development. These controls may be inherited from a secure cloud environment, an ICAM system (Identity, Credential, and Access Management) that is already in place, libraries for error logging and pre-existing log analysis tools, and so on. These “plug and play” controls can be made to cover entire families of the controls described in NIST Special Publication 800-53.

Start hardened. Stay hardened. Build rugged.

How DevOps supports FISMA (the Federal Information Security Management Act)

The DevOps model is based on rapid and constant feedback, both from the development process and from the system in production. Continuous integration, user review, and automated testing provide feedback during development; production monitoring, alerting, and user behavior provide feedback in production.

The Federal Government has been moving toward an interpretation of FISMA (the Federal Information Security Management Act) that is very much consistent with this feedback-based approach. The National Institute of Standards and Technology (NIST) publishes guidance on how agencies should implement FISMA. Its Special Publication 800-137 promotes the use of Information Security Continuous Monitoring (ISCM) and makes it the cornerstone of a new Ongoing Authorization (OA) program. A later NIST publication (June 2014) titled “Supplemental Guidance on Ongoing Authorization: Transitioning to Near Real-Time Risk Management” provides additional details. DHS and GSA have worked to create a Continuous Diagnostics and Mitigation (CDM) framework and a contract vehicle through which agencies can procure CDM services.

The core idea is that federal information systems should be continuously monitored for vulnerabilities while in production. Those vulnerabilities should be rapidly remediated and can be used to “trigger” security reviews based on the agency’s risk posture. In other words, we are moving from a process where security is tested and documented every few years to a process based on continuous feedback from production to a team charged with remediating and optimizing. It is, in short, a DevOps system.

The title of the NIST publication indicates that there is more here than meets the eye. The intention is to move to a “near real-time risk management” approach that is based on frequent reassessments of risks, threats, and vulnerabilities. It moves the focus of security activities from documenting that required controls have been implemented (a compliance focus) to one of responding to a changing landscape of real, emerging threats (a risk-based, dynamic focus).

DevOps provides an ideal way to implement this new security approach. In the DevOps world, continuous monitoring for security vulnerabilities is just another type of production monitoring. A rapid feedback cycle lets the DevOps team respond quickly to newly discovered vulnerabilities, and because the team has already shortened cycle time and automated its deployments, fixes reach production as quickly as possible. As an added bonus, the system in production doesn’t need to be patched in place; instead, the source can be modified, the entire system rebuilt and deployed to a new set of VMs, and the old ones torn down.
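Here is a minimal sketch of that “rebuild and replace” idea, assuming AWS EC2 and the boto3 library; the AMI ID, instance type, and tags are placeholders, and a real pipeline would wait for health checks before tearing anything down.

```python
# Sketch of replacing VMs with freshly built ones instead of patching in place.
# Assumes AWS EC2 via boto3; the AMI ID, instance type, and tag values are placeholders.
import boto3

ec2 = boto3.client("ec2")

def replace_instances(new_ami_id, app_tag="my-app"):
    # Find the currently running instances for the application.
    old = ec2.describe_instances(Filters=[
        {"Name": "tag:app", "Values": [app_tag]},
        {"Name": "instance-state-name", "Values": ["running"]},
    ])
    old_ids = [i["InstanceId"]
               for r in old["Reservations"] for i in r["Instances"]]

    # Launch replacements from the freshly built, already-hardened image.
    ec2.run_instances(
        ImageId=new_ami_id, InstanceType="t3.small",
        MinCount=len(old_ids) or 1, MaxCount=len(old_ids) or 1,
        TagSpecifications=[{"ResourceType": "instance",
                            "Tags": [{"Key": "app", "Value": app_tag}]}],
    )

    # Once the new instances pass health checks (omitted here), tear down the old ones.
    if old_ids:
        ec2.terminate_instances(InstanceIds=old_ids)
```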

The influence can go both ways: by incorporating the ideas of triggers and business-based risk assessments, DevOps can be extended to include risk-based decision making.

Good technical practices are critical for government contracting

Good technical practices (such as those typical in DevOps environments) can help the government in contracting for information technology services. We should require these technical practices in our IT services contracts, and if we are investing in QA and independent verification, we should invest first in validating good technical practices. Let me give a few examples; readers without a technical background should be able to find more information about these practices online.

Good, state-of-the-art testing practices are important for more than the obvious reasons. Most tests should be automated and should follow the classic “testing pyramid” (many unit tests, somewhat fewer integration tests, and fewer still at the user interface level). The automated tests themselves are just as important a deliverable from the contractor as the code itself.
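As a concrete illustration of the base of that pyramid, here is a hypothetical unit test file in Python (pytest style). The eligibility rule it exercises is invented purely for the example.

```python
# A hypothetical unit test at the base of the testing pyramid.
# The function under test (a toy eligibility rule) is invented for illustration.
def is_eligible(age, household_income, income_limit=30000):
    """Toy business rule: applicants 18 or older under the income limit are eligible."""
    return age >= 18 and household_income <= income_limit

def test_adult_under_limit_is_eligible():
    assert is_eligible(age=35, household_income=25000)

def test_minor_is_not_eligible():
    assert not is_eligible(age=17, household_income=10000)

def test_income_at_limit_is_still_eligible():
    # Boundary case: exactly at the limit should pass.
    assert is_eligible(age=40, household_income=30000)
```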

There are many reasons why such automated tests are important in our contracting environment. The automated tests serve as regression tests that will speed later work on the system. If a second contractor does something that “breaks” the first contractor’s code, it will immediately be spotted; in essence, the tests can be said to “protect” the first contractor’s code. If a new contractor is brought in for O&M or future development, the automated tests serve as documentation of the requirements and allow the new contractor to make changes or refactor with confidence – the changes are safe as long as the regression tests continue to pass.

Scripted deployments and “infrastructure as code” serve a similar function. By providing automated scripts to set up the production environment and deploy code, the contractor is documenting the deployment process (and reducing the amount and cost of paper documentation!). No longer is the knowledge just in their heads (making it costly to replace the contractor). Deployment scripts can be tested, making them an even more valuable form of documentation. They can be placed under version control and audited, increasing security.
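As a rough sketch of what a scripted deployment can look like, here is a short Python example using the Fabric library; the host name, paths, and service name are placeholders, and a production script would add health checks and a rollback path.

```python
# Minimal scripted deployment sketch (assumes the Fabric library).
# The host, release path, and service name are hypothetical placeholders.
# Because this script lives in version control, the deployment process is documented and auditable.
from fabric import Connection

HOST = "app1.example.gov"                    # hypothetical target host
RELEASE = "/opt/myapp/releases/2024-06-01"   # hypothetical release directory

def deploy(artifact="myapp.tar.gz"):
    with Connection(HOST) as c:
        c.run(f"mkdir -p {RELEASE}")
        c.put(artifact, f"{RELEASE}/{artifact}")            # copy the build artifact
        c.run(f"tar -xzf {RELEASE}/{artifact} -C {RELEASE}")
        c.run(f"ln -sfn {RELEASE} /opt/myapp/current")      # switch the "current" symlink
        c.run("sudo systemctl restart myapp")               # restart the service

if __name__ == "__main__":
    deploy()
```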

Continuous integration increases our ability to work with multiple contractors and gives us more confidence in a contractor’s status reports. By continuously integrating code we ensure that code from multiple contractors will interoperate, and we avoid last-minute surprises when a contractor’s “100% finished” work fails to integrate.

A zero-defect mentality, in which user stories are tested immediately and defects are remediated immediately, ensures that code the contractor says is finished really is finished. It avoids passing defective code from one contractor to another, reduces finger-pointing, and makes integrating code simpler. If we are comparing contractor performance, it also serves as an equalizer: if one contractor finishes 10 stories and leaves 15 defects while another contractor finishes 8 similarly sized stories and leaves only 12 defects, which has performed better? We can’t know. Zero known defects should be our expectation.

The last practice I will mention is the use of good design patterns and architectures that feature loose coupling. Good use of design patterns makes it easier for a new contractor to understand the code they inherit, and encapsulating pieces of the system makes it easier for multiple contractors to work in parallel, even at different paces.
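A small, hypothetical illustration of what loose coupling buys us: if two contractors agree on a narrow interface, one can build the real service behind it while the other codes against a stub. All of the names below are invented for the example.

```python
# Hypothetical illustration of loose coupling: two contractors work against one agreed interface.
# All names here are invented for the example.
from abc import ABC, abstractmethod

class CaseLookup(ABC):
    """Interface both contractors agree on up front."""
    @abstractmethod
    def find_case(self, case_id: str) -> dict: ...

class StubCaseLookup(CaseLookup):
    """Contractor B codes against this stub while Contractor A builds the real service."""
    def find_case(self, case_id: str) -> dict:
        return {"id": case_id, "status": "OPEN"}

def render_status_page(lookup: CaseLookup, case_id: str) -> str:
    # Depends only on the interface, so the real implementation can be swapped in later.
    case = lookup.find_case(case_id)
    return f"Case {case['id']}: {case['status']}"

print(render_status_page(StubCaseLookup(), "A-123"))
```

When the real implementation is delivered, it simply replaces the stub; nothing that depends only on the interface has to change.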

Together, these practices can make it easier to judge contractor performance, allow us to partition work between a number of contractors, and make it easy to switch contractors over time.

(thanks to Robert Read at 18F for some of these ideas)

Why DevOps will revolutionize Federal IT

DevOps, more than any other agile way of thinking, can cause dramatic change in how the government does IT. There is a lot to talk about here, but in this post I’ll try to present a simple line of thought that will give you an idea of what I mean.

First, we need a hypothesis on what the critical problem is in government IT. I have claimed in many of my speaking engagements that the problem we should be focused on is that of cycle time (technically lead time), by which I mean the time from recognizing a mission need to deploying an IT capability to address that need. Cycle times can be as long as ten years in the government, though I can’t give a meaningful statistic because as far as I know this is not something that is measured. What is reasonable? I would say something more on the order of a week or so. There is clearly room for improvement.

You might think that a more important problem to solve is waste, or possibly overspending. I propose cycle time instead because it is measurable, actionable, and includes these other problems. Cycle times are long because of waste; if we reduce cycle time, it will be by eliminating waste. Is the government overspending on IT? It is hard to say, since we don’t know what the right level of spending should be. But everyone can agree that our spending should be as little as possible for the value we receive. Reducing cycle time will help achieve that goal. It also increases the government’s responsiveness and shortens feedback cycles, which in turn lead to better products.

Good, so what does DevOps contribute to achieving shorter cycle times? To keep things simple, let’s think of DevOps as a combination of three things: Lean Software Development, Continuous Delivery, and teaming or integration across development, operations, security, and other functions. Lean Software Development provides the framework and tools for reducing cycle times. Continuous Delivery is the best way we know to structure a software lifecycle to minimize waste. And cross-functional integration further reduces waste, addressing the costly “hidden factories” created by handoffs between development, operations, and security.

There are many more reasons why I believe that DevOps is the key to reforming government IT. I will address them in future posts.