Plan to Fail in your DevOps Iterations… and Win Overall
DevOps transformations are not easy. Organizational culture, people, processes, collaboration, automation and continuous improvement can act as key enablers. By paying attention to failures, having open communication, piloting, limiting impacts with mature deployment practices and building newer tests, you can continuously make your systems and processes ever healthier. Having a fluid framework and measurable metrics with high transparency is a suggested best practice. For info on how to go about doing this, you can follow my DevOps series.
Technologies are rapidly evolving in development and infrastructure operations. While the innovation is exciting, the end mission of delivering business value and rapidly producing incremental product features is paramount.
How can we support this pace of innovation, while improving quality and governance?
Enter DevOps. It’s a combination of culture and technology practices that meet the need for rapid, continuous delivery and improvement.
In the first part of this DevOps series, we will focus on some of the challenges you might encounter in your journey and recommendations from the trenches on navigating and succeeding.
Plan for Failure
DevOps transformations are hard and if you have been witness to a few of these, you know what I mean. The culture, skills and process hurdles in organizations are many. In fact, for your overall strategy to succeed, you actually need your DevOps activities to fail.
Yes, they need to fail, but the intent is not to focus on failure. Instead, in order to succeed in your overall DevOps strategy, you should plan for failures (in test or production) and celebrate them along the way. Let me elaborate…
Creating the DevOps Organization
From an organizational perspective, DevOps is about breaking down barriers between traditional Dev & Ops teams. To succeed, you must understand the DevOps principles (no silos, quick feedback, automation and system thinking) and create a culture from the top-down that is supportive of those ideals.
Traditionally, developers tend to think that Ops teams slow things down – they won’t let us deploy, or won’t let us touch servers, and they have long wait times, etc. Ops folks, on the other hand, tend to think that developers are going to break stuff in production, or that they don’t really understand operations, system dependencies, etc. Even with the existing tensions between Dev and Ops, you should by no means create another department for DevOps. If anything, this will probably add more process overhead and red tape. You should try to implement DevOps as a framework.
DevOps is often treated as synonymous with CICD (Continuous Integration & Continuous Deployment). However, for truly successful DevOps, you must strive to get better at continuous testing and continuous improvement as well. We will look at metrics later in the article to help you gain an objective understanding of the changes and their effects.
No Blame Policy
When handling failures (or even looking forward to them), open communication & collaboration is the biggest factor in determining a successful transformation.
Strive towards a “No Blame Policy.” Reward the team for a product’s awesome features, not for its lines of code. Do not punish failures in production; instead, treat them as a learning opportunity for the whole team. When production failures occur, everyone should be open and candid with their input, focusing on determining the root cause and then building new tests around it. Following these steps as part of a “No Blame Policy” helps ensure that the same failure is caught up front in the automated testing phase the next time. If it’s an environment-related issue, focus on understanding the difference between the two environments, then implement a fix and challenge yourself and the team to build an automated test so that the issue is caught before it gets to production.
Technologies like automated configuration management, containers, orchestration, etc., can help minimize these kinds of environment discrepancies. Also, planning for maturity in your deployments using dark launch, canary, or blue-green releases will help ensure that your code is tested, and any failures are caught, in a controlled production environment before being rolled out to the larger set of users.
The above few scenarios evoke Bruce Lee’s oft-quoted “Be water, my friend.” You must have a fluid framework, and have open communication and feedback mechanisms, in order to continuously improve.
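To make the canary idea a little more concrete, here is a minimal Python sketch of how a small, stable slice of users could be routed to a new release. It is an illustration under assumptions, not a description of any particular tool: the function names `canary_bucket` and `pick_release`, the user ID format, and the choice of hashing are all hypothetical.

```python
import hashlib


def canary_bucket(user_id: str) -> int:
    """Map a user ID to a stable bucket in the range 0..99 (hypothetical helper)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def pick_release(user_id: str, canary_percent: int) -> str:
    """Return "canary" for roughly canary_percent of users, "stable" otherwise."""
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    return "canary" if canary_bucket(user_id) < canary_percent else "stable"


if __name__ == "__main__":
    # Made-up user IDs, purely for illustration.
    users = [f"user-{i}" for i in range(1000)]
    share = sum(pick_release(u, 10) == "canary" for u in users) / len(users)
    print(f"share of users on the canary at 10%: {share:.1%}")
```

Because the bucket is derived from the user ID, a given user stays on the same release between requests, and raising the percentage only adds users to the canary group. Setting the percentage back to 0 acts as a simple rollback.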
How to Pilot a DevOps Organization
Here are some tips and tricks that will allow you to successfully pilot a DevOps team.
- Collaboration: Pick a group that is co-located. If that’s not possible, at least pick a team that has a strong history of virtual collaboration. This will reduce the hurdles in communication, and improve your chances to fine-tune DevOps processes for your org.
- Size of Service or Application: Pick a service that is neither too central nor too small in order to give you breathing room while figuring out which tools and workflows work best for your org.
- Meet Often: Establish a daily stand-up with the developers, testers, Ops/System Engineers, and Product Managers to foster an open environment.
- Keep the Team Small: The size of teams is usually defined by the two-pizza rule: if you order two pizzas, the entire team will be fed. This creates teams with end-to-end accountability for development, deployment & operations support.
- Having a person or two on hand to help as a process SME, tools expert, and evangelist will be very valuable. This should be someone who is good to work with and can act as a reassuring agent with experience in the cultural, tech tooling and process aspects of DevOps transformations.
- Be sure to schedule fun lunches/outings with the team after reaching sizeable milestones, so that everyone can rejuvenate.
Don’t Underestimate the Culture Change
Many times it’s easier to address tools and how to use them, and to forget about the people actually using them. But people come with preconceived notions, and when these are at odds with the culture that we want built into the new org, adequate measures must be taken to address and alleviate their stress. If these efforts to build respect and trust fail consistently with certain individuals, it is an indication that the person may need to be replaced. Empowerment, respect and trust among team members will take you a long way in adopting a successful DevOps culture.
Interactions and discussions among Development, Operations, Product and Release folks should focus on the product: its features, schedules, resources, testing, etc. These conversations should not happen via complex change control and ticket management systems, but instead organically among the team members. You must also remember that in order to be successful, at the org leadership level the focus cannot be on growing fiefdoms or increasing power.
In mature DevOps orgs, engineers wear many different hats, but in reality you can’t train everybody in everything. Having automation helps, with the right architecture, environments, CICD, continuous testing and continuous improvement. Specifically around automation, everything should be on the table–integration, testing, code check-in policies, server builds, deployments, configuration management, monitoring, etc.
Product Management Evolution
The product team must be able to decompose products and features into small increments, strive to provide complete visibility of the flow of work from idea to production, gather user feedback and then incorporate it (plus any lessons learned or new tests needed from deployment issues) into the next iteration. This will empower agile execution with deep product and sprint backlogs.
The goal is to improve overall performance of the IT org by having sizeable releases every 2-4 weeks. With DevOps delivery maturity, the team can strive towards minor updates and releases every day or even every hour, using the microservices design approach for their applications. It is also imperative to display build and deploy metrics prominently.
DevOps Passes Audit
Typically, one of the biggest areas of concern with DevOps is audit, compliance and change control. You must dig into the reasons why these existing, legacy change control processes were put in place originally: to keep track of changes to the environment across multiple systems and manage risk. The evolution of infrastructure-as-code and automation has helped address the audit problem by leveraging SCM (Source Control Management) and TDD (Test Driven Development). You now have tested infrastructure with a complete audit trail of what is changing in the environment, who is changing it and when, all while notifying appropriate resources, as needed.
This can be achieved with great speed using integrations into existing ITIL change control or CMDB systems. For example, a build triggers an auto deploy in the test environment with pre- and post-deployment tests executed, which then triggers an approval workflow in the change control system for the deployment to production. Understanding and communicating this effectively can allay some of the audit and compliance resistance you might face while piloting the DevOps continuous delivery methods.
Measuring Your DevOps Success
Sharing the DevOps goals very openly when rolling out changes, like restructuring teams or learning new tools and techniques, is very important to achieve transparency and adoption. Some of these goals and metrics have bottom-line financial benefits, and should be tracked over time with transparent dashboards:
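The flow described above (build, auto deploy to test, pre- and post-deployment tests, then an approval request) can be sketched roughly as follows in Python. This is only an illustration: `ChangeRecord`, `run_pipeline`, and the callables passed in are hypothetical stand-ins, and a real setup would call your actual CI server and change control system instead.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ChangeRecord:
    """Hypothetical audit record for one build moving through the pipeline."""
    build_id: str
    events: List[str] = field(default_factory=list)
    status: str = "pending"


def run_pipeline(
    build_id: str,
    pre_tests: List[Callable[[], bool]],
    deploy_to_test: Callable[[str], None],
    post_tests: List[Callable[[], bool]],
    request_approval: Callable[[ChangeRecord], bool],
) -> ChangeRecord:
    """Run the gates in order and record every step for the audit trail."""
    record = ChangeRecord(build_id=build_id)
    record.events.append("build received")

    if not all(test() for test in pre_tests):
        record.events.append("pre-deployment tests failed")
        record.status = "rejected"
        return record
    record.events.append("pre-deployment tests passed")

    deploy_to_test(build_id)
    record.events.append("deployed to test environment")

    if not all(test() for test in post_tests):
        record.events.append("post-deployment tests failed")
        record.status = "rejected"
        return record
    record.events.append("post-deployment tests passed")

    # Only builds that passed every automated gate reach the approval step.
    approved = request_approval(record)
    record.events.append("production approval granted" if approved
                         else "production approval pending")
    record.status = "approved_for_production" if approved else "awaiting_approval"
    return record


if __name__ == "__main__":
    result = run_pipeline(
        build_id="build-001",                # made-up identifier
        pre_tests=[lambda: True],
        deploy_to_test=lambda build: None,   # stand-in for a real deploy step
        post_tests=[lambda: True],
        request_approval=lambda rec: True,   # stand-in for the change control system
    )
    print(result.status)
    for event in result.events:
        print(" -", event)
```

The point of the sketch is that the list of events doubles as the audit trail: every gate leaves a record, and the approval request is never raised for a build that failed its tests.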
- Deployment time (decreased) & Deployment frequency (increased)
- Feature releases (increased) and TTM, Time To Market (lower)
- Quicker feedback (system/user) to teams & product manager
- System Availability (higher)
- Greater % of defects detected in testing (yes, you read that right)
- Ticket volume (reduction)
- Infrastructure workload density/efficiency (increase)
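A few of the metrics above are simple to compute once deployments and defects are recorded somewhere. The Python sketch below shows one possible way to do so. The `Deployment` record, the function names, and all the numbers are hypothetical and exist only to illustrate the arithmetic; they are not measurements from a real team.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class Deployment:
    """Hypothetical record of a single deployment."""
    day: date
    duration_minutes: float


def deployments_per_week(deployments: List[Deployment], period_days: int) -> float:
    """Deployment frequency over a reporting period, expressed per week."""
    if period_days <= 0:
        raise ValueError("period_days must be positive")
    return len(deployments) * 7 / period_days


def average_deployment_minutes(deployments: List[Deployment]) -> float:
    """Mean deployment time; 0.0 when there were no deployments."""
    if not deployments:
        return 0.0
    return sum(d.duration_minutes for d in deployments) / len(deployments)


def percent_defects_found_in_testing(found_in_testing: int, found_in_production: int) -> float:
    """Share of all known defects that were caught before production."""
    total = found_in_testing + found_in_production
    if total == 0:
        return 0.0
    return 100 * found_in_testing / total


if __name__ == "__main__":
    # Made-up sample data for a two-week reporting period.
    sample = [
        Deployment(date(2024, 1, 2), 10),
        Deployment(date(2024, 1, 5), 20),
        Deployment(date(2024, 1, 9), 30),
        Deployment(date(2024, 1, 12), 20),
    ]
    print("deployments per week:", deployments_per_week(sample, period_days=14))
    print("average deployment minutes:", average_deployment_minutes(sample))
    print("% defects found in testing:", percent_defects_found_in_testing(45, 5))
```

Tracking the same calculations period over period, rather than looking at a single snapshot, is what shows whether deployment time is trending down and the share of defects caught in testing is trending up.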
Congrats! You’ve Built a Successful DevOps Team
Phew, we covered a lot of ground. We touched upon the organizational, cultural and technical aspects involved in adopting a DevOps philosophy and places where you can afford to fail. Remember, the mission is to streamline the process of development, deployment and release, which means the focus must be on failing fast and in a controlled fashion in order to continuously improve. The key strategy is to adopt the new DevOps processes (CICD, continuous testing, continuous improvement and automation) and motivate the people and teams who are part of making this change a success, while enjoying the journey along the way.
Coming up in the DevOps series, we will delve into different workflows and tool stacks (while getting our hands dirty) so that we can move towards DevOps nirvana.
Kiran Chitturi is a technology leader on the Sungard AS emerging tech team. With 18+ years of experience, Chitturi was instrumental in Capital One’s cloud transformation journey and in setting up AIG’s next-generation incident response automation architecture. He has a master’s in Computer Science