DevOps - Containing Complexity and Delivering Continuous Deployment
Jeff Smith, Process & Tools Lead, Development Operations, Vertafore
visionary1usa@gmail.com
May 5, 2013
Quick Intro
Jeff Smith, BSEE, CSM
visionary1usa@gmail.com
Companies: Dell, Boeing, LexisNexis, Vertafore, NetObjectives, Northwest Cadence, LTV Aerospace & Defense, IBM, Recognition Equipment, Mizar, Journee, ClearOrbit
Certifications: Certified ScrumMaster, Six Sigma Greenbelt, Lean+, Rational SME, ITIL Foundations, ATSQB Test Foundations, Cognitive Edge Advanced Practitioner
Roles: Senior Hardware Developer, Lead Software Developer, QA Lead, Process Advisor, Solutions Manager, Consultant, Productivity Coach, ALM Consultant, Process & Tools Lead
Quick Poll
How Many Developers (Testers, Coders, BAs, Scrum Masters) & Dev Teams? (#P/#T)
How Many Applications Total for the Organization?
How Many People in IT/Data Center Operations? (Internal Apps & Customer-facing)
On Your Application - How Often is a Major Release? Minor Release? Hot-fix? Emergency Fix?
Where are we and what are we trying to do? INTRO
The Overall Business Goal
Minimize Lead Time by Focusing on Flow Throughout Process... Getting the Most Important Work Done First
Agile Suggests Regular Creation of Deployable Releases - Lean Suggests Actually Delivering
Incremental Change is Faster to Develop & Deliver with Lower Risk
Continuous Value Delivered Grows Customer Trust and is Actually Cheaper
Smaller, Prioritized Increments Let You Change Direction or Stop When Feedback Dictates
Better Customer & Business Results; The Best Feedback is Customer Feedback
Improved Clarity through Collaboration, Innovation, Knowledge Transfer, Visibility, Recognition of Value Throughout
Divided Commitments
Development - We are awesome - every feature should be deployed immediately
Operations - We are penalized when things go wrong; stability is our goal
  Also - we, like our environments, are not infinite resources simply waiting for the next magical binary
IT/Operations vs Development
Don't Overstep on Authority or Strengths
Development is App-Focused; the best thing for Dev is usually addressing quality issues and relieving the immense technical debt and deployment pain in the application (from legacy issues of mismatch with new tech)
IT/Operations is Often Everything Else: 100s or 1000s of Machines, Multiple Isolated Networks, Domains, Firewalls, Load Balancers, SAN/NAS, CDN, Monitoring, Upgrades, Updates, Deployments, License Management, Availability/Resources, Access Management, SLAs, Economics
What is it? COMPLEXITY
What is this Complexity Thing?
Things unnecessary, extra, overhead
Things obscure, ambiguous, misunderstood
Most sources of variation, even the necessary elements of the work
The interplay (e.g. dependencies) between elements
The inherent delays that further obscure cause & effect
Things non-linear and difficult to understand or correlate
The Cynefin Framework
How do we see, decide, manage complexity?
Is reducing complexity in the cards?
Cost of Complexity
How does it occur?
  Complexity is all over software development: Architectures, Designs, Human Systems, Processes, Domain Complexity, Requirements, Implementations, Regulatory, Technology, Tools, Communication, and the Ongoing Change in all of these.
  We tend to accumulate complexity, rarely seeking to do less or sever ties with legacy.
What is the Impact?
  $E_1, \dots, E_n$ = Effort for each aspect of the software delivery effort (normalized according to effort size; aspects are each of those things listed above)
  $W_1, \dots, W_n$ = Associated Waste of each aspect [anything extra, complexity included] (normalized as a percentage of effort $E$)
  $$\text{Complexity Waste} = \text{Size of effort} \times \bigl( (1 + W_1)(1 + W_2) \cdots (1 + W_n) - 1 \bigr)$$
  Common Cause Waste tends to be a function of Complexity. It accumulates as multiplication, not addition, because efforts are interrelated.
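As a rough illustration of the compounding formula on this slide, the Python sketch below computes Complexity Waste and compares it with what simple addition of the same waste fractions would give. The function name, the effort size of 100, and the three waste fractions are assumptions made up for the example, not figures from the talk.

```python
from math import prod  # available in Python 3.8+


def complexity_waste(effort_size, waste_fractions):
    """Return effort_size * ((1 + W_1)(1 + W_2)...(1 + W_n) - 1)."""
    return effort_size * (prod(1 + w for w in waste_fractions) - 1)


# Hypothetical inputs: an effort of size 100 with three interrelated
# aspects wasting 10%, 20%, and 25% of effort respectively.
effort = 100.0
wastes = [0.10, 0.20, 0.25]

compounded = complexity_waste(effort, wastes)
additive = effort * sum(wastes)  # what the waste would be if it merely added up

print(f"compounded waste: {compounded:.1f}")
print(f"additive waste:   {additive:.1f}")
```

With these made-up numbers the compounded waste comes to about 65 units against 55 if the fractions were simply added, and the gap widens as more aspects are included.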
Sources of Risk/Cost/Complexity
Goodwill - Reputation, Commitments, Past History, NPS (Net Promoter Score)
Regulatory - SOX, PCI/CISP, SAS70, HIPAA, ... SEC, FAA, FDA, ... other agencies, audits/assessments, legal fees/penalties
Market Share & Financial - Delivery (Poor, Missed, Slow, On time); Customer mismatch; Related Lost Sales, Transactions, Licensing Fees; SLA Costs
Technical - Labor, Environments (machines, network, storage, power), Licenses, Headroom, Time, Single Points of Failure
All of this stuff brings complexity
Do we know what not to do, to get rid of, to simplify? NECESSITIES?
Technical Elements & Better Practices
CM is a Big Deal - Everything under Source Control
  Deployment Scripts
  Build Scripts
  Tests & Test Plans
  Documentation?
  Environment Descriptions
  Network Architecture
  Database Scripts
  Seed Data
  Code Images
  Coordination Steps/Discussion
Technical Elements & Better Practices
Design Reviews & Upfront Test Techniques (BDD/ATDD/Spec-Driven Development)
TDD, Pair Programming, Code Reviews, Developer Unit & Functional Test
Build Automation - Multiple Levels (e.g. CI/BVT, Full Build & Deployment with Functional/Regression Tests)
Test Automation - Functional, Regression Tests, Scalability - Stress/Load
Better Architectures & Adherence
Agile / Lean Techniques, Fast Visualization & Feedback
Intelligent Code Management (Branching, Merging, regular check-ins, labeling, other working agreements)
Other Things to Pay Attention To
Binary Library Mechanism - Artifactory, NuGet, Nexus Repository
Virtuals, Cloud
Deployment & Rollback Plans (Scripts?)
Time (=$$$$$)
Technical Debt - Stay Ahead of It
Simple, Standard Architectures are a Goal
Synchronization of Environments
Infrastructure Updates Gate Application Flow
Get a Handle on Deployment Types
Technical Types:
  Rolling Deployments
  Staged / Production Swaps
  Hot-fixes
  Content & Other Low-risk Changes
  Pre-window Changes
Change Types:
  Emergencies, Escalations
  Ongoing Maintenance: Infrastructure Growth, Tech Refresh, Security Updates, Patches
It's not just about Product, Tools, and Technology PROCESS ELEMENTS
Better, Faster, Cheaper
Problems first; merely adding tools or investing energy for local improvement without having recognized the priority and reality of an issue is a mistake.
Limited resources mean Doing the Right Things
There is a right order when looking at improvement opportunities
  Quality reduces rework - always a goal (especially if the change is free)
  Delays and activity times are often easy gains - but maintain quality
  Lower costs are a side effect; focus on cost usually gets you in trouble
Lean Concepts
Transparency (Kanban Boards vs Sprint Task Boards, CFD vs BurnUp-BurnDown, Test Results - Multi-Level)
Limit Work in Process by Setting Limits and Using Pull (Standups - Lean vs Scrum)
One Piece Flow (or at Least Small Batches - e.g. Sprint)
Reduce Waste (Rework, Cycle Time as Primary Elements)
Prioritize based on Economics (Specifically Cost of Delay, MMF)
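Two of the Lean concepts above, limiting work in process through pull and prioritizing by Cost of Delay, can be sketched together in a few lines of Python. This is a minimal illustration under stated assumptions: the WorkItem class, the pull_next helper, the WIP limit of 2, and the item names and cost figures are all hypothetical and not taken from the talk.

```python
from dataclasses import dataclass


@dataclass
class WorkItem:
    name: str
    cost_of_delay: float  # hypothetical unit: money lost per week of delay


def pull_next(backlog, in_progress, wip_limit):
    """Pull the highest cost-of-delay item, but only if the WIP limit allows.

    Returns the pulled item, or None when the limit is reached or the
    backlog is empty. Work is pulled by the team, never pushed onto it.
    """
    if len(in_progress) >= wip_limit or not backlog:
        return None
    item = max(backlog, key=lambda w: w.cost_of_delay)
    backlog.remove(item)
    in_progress.append(item)
    return item


# Made-up backlog for illustration only.
backlog = [
    WorkItem("report export", 2_000),
    WorkItem("checkout fix", 15_000),
    WorkItem("new theme", 500),
]
in_progress = []
WIP_LIMIT = 2

# Keep pulling until the WIP limit (or an empty backlog) stops us.
while pull_next(backlog, in_progress, WIP_LIMIT) is not None:
    pass

print("in progress:", [w.name for w in in_progress])
print("waiting:    ", [w.name for w in backlog])
```

The lowest-value item stays in the backlog until something in progress finishes, which is the point of a pull system: new work starts only when capacity frees up.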
Kanban Board
Critical Process Principle
You cannot release any faster than your slowest activity
Critical Questions about Automation
What are the costs of an Activity? (Cost of Delay, Cost of actual Work, Cost of Automating Before Simplifying or Understanding)
Is this Activity really one of your top priorities?
Is Automation actually your best first step?
What are Possible Shortcuts, Risks?
Activities are not set in stone - is this the best way to deliver on the need?
How will you keep it visible for understanding, troubleshooting, verification?
Understand the Work
  Maintenance
  Planned Work (App & Infrastructure)
  Unplanned Work
Deal with It!
  Limit the WIP
  Keep It Visible
  Address the Critical Issues, Constraints
  Clear the Congestion & Make it Flow
  Staff IT Sufficiently
Use ToC to Address Bottlenecks
Theory of Constraints - Five Focusing Steps
1) Identify the system constraint(s)
2) Figure out how to exploit (get the most out of) the constraint
3) Subordinate as much as possible to the constraint(s)
4) Elevate the constraint (create more capacity at the constraint)
5) Repeat
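The first, fourth, and fifth focusing steps can be illustrated with a small Python sketch that treats delivery as a chain of stages whose overall throughput is capped by the slowest one. The stage names, the weekly capacities, and the find_constraint helper are hypothetical, chosen only to show the mechanics.

```python
def find_constraint(capacities):
    """Step 1: return (stage, capacity) for the lowest-capacity stage.

    The whole pipeline cannot deliver faster than this stage.
    """
    stage = min(capacities, key=capacities.get)
    return stage, capacities[stage]


# Hypothetical capacities: items each stage can finish per week.
pipeline = {"develop": 12, "test": 5, "deploy": 8}

stage, throughput = find_constraint(pipeline)
print(f"constraint: {stage}, system throughput: {throughput}/week")

# Step 4: elevate the constraint by adding capacity there
# (an assumed gain of 6 items per week, e.g. from test automation).
pipeline[stage] += 6

# Step 5: repeat. The constraint has moved to a different stage.
new_stage, new_throughput = find_constraint(pipeline)
print(f"constraint: {new_stage}, system throughput: {new_throughput}/week")
```

In this made-up pipeline, adding capacity anywhere other than the constraint would not change throughput at all, and once the constraint is elevated a different stage becomes the limit, which is why the last step is to repeat.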
Get Human Constraints Handled
Kanban in front of Constraint
Understand & Manage Workload
Brown Bags
Camtasia or other Recording/Capture Devices
Learner wears keyboard
Learn How People Share - Storytellers, Scribes, Q&A
Create Solid Actionable Knowledge Stores, Documentation, READABLE scripts & logging
Basic Process Concepts still Apply
Better - then Faster - then Cheaper (In that Order)
In Software, the most important variability to contain is Quality
Do Root Cause Analysis - Focus on the Real Problems
The Best Process is one Easier to Follow than Circumvent
Automate, Simplify, Improve/Tune, Review
Reduce Single-Points-of-Failure
Tools alone don't Solve Problems & are not the first Answer for Problems
Complexity issues do not add costs - they multiply them
Improvement demands Budget & Coherent Direction
Too Many Meetings means You are Doing them Wrong or Doing the Wrong Meetings
Focus on the System, Not the Individual
CURRENT THINKING
Scaled Agile Framework
Disciplined Agile Delivery
Concepts from The Phoenix Project
The First Way emphasizes the performance of the entire system
The Second Way is about creating, shortening, amplifying feedback loops
The Third Way is about culture, continual experimentation and problem-solving, growing technical mastery through practice
Starting Point: Medium-to-Small Outfits
All Dev Teams Should Have Local Team Boards - Delivery Pace Locally Defined
IT/Operations Should Have a Task Board and/or Tool to Manage the Work
Architecture Team & Product Tech Leads Periodically Map Upcoming & In-flight Infrastructure & App Projects
Product Management - Builds Near-term Backlog (Epics fine for long-view, JIT Grooming as it enters Dev Teams) and Roadmap
Production Control (aka Agile Release Train) Meetings and Guideposts
  Two meetings at first, Onramp & Final; Reports, Deployment Types, other conventions established
The Goal
Normal, Regular, Frequent Release Cadence (weekly probably good enough for production)
Earlier to Market with the Right Features
Slack - Reduced Pressure to Work Overtime
Production Deployments as Non-events
Better Technical Discussions
Full Cross-functional Teams - More Trust, Less Fear, Working Collaboration, Empowered Craftsmen
Supporting Rituals & Mechanisms
The Guardrails - How we Manage the Work; This involves Business, Development, and IT/Operations (Key Meetings, Reports, Policies, Deployment Types)
Periodic Operations Review & Report (Monthly, Quarterly) - Includes Review of Outages/Impairments, Severity 1&2 Issues, Capacity Planning and Issues, ITSM/ITIL measures, Unplanned Changes, Risks Taken, Contrast with Goals/Priors (Good/Bad/Ugly/Embarrassing), Incidents/Availability against Environments (Outages/Errors/Failures), Trends
Architecture Committee - Infrastructure Plans around economics, technology refresh plans, storage and capacity growth, new applications, reduced costs, consolidations, improvements
Release Planning - There is no way you avoid an ongoing long-view meeting coordinating across programs
The Challenge
No Shortcuts - This can take a While to Emerge; The framework is most key - Rituals and Guardrails should be stable
KILL REWORK - Root-cause, quality, and constraints focus
Don't bite off too much - Manage WIP, Multitasking
Make it Visible so You Can See and Address the Real Problems
It's not about Speed; It's about Customer Value, Quality, Balance, and Economics
Do this as continuous improvement steps, not a project plan (Create some slack for this - limit WIP, invest now for later)
Understanding through Broader & Deeper Communication - Not just Development, but also ITSM/ITIL, Infrastructure & Deployment Mechanics, Program/Portfolio, Customer Needs
Interesting Books & References
The Phoenix Project
Lean IT
Continuous Delivery
Continuous Integration
Architecture & Patterns for IT Service Management
Pro Website Development and Operations
The Visible Ops Handbook (A great intro for developers to ITSM/ITIL Concerns)
Web Operations
Disciplined Agile Delivery (& LinkedIn group)
The Scaled Agile Framework (www.scaledagileframework.com)
LOTS of technical & tool references - Ant, TFS, Maven, Hudson/Jenkins, Puppet, Chef, VMWare, HyperV, CFEngine, ITIL/ITSM, CI, SAN/NAS, Data Center Ops, Scalability, ...
Still waiting on solid practice & knowledge convergence for DevOps, Continuous Deployment, true Lean at enterprise level (full value stream) - Early stage, competing beliefs, emerging knowledge
Questions?