Agility & Reliability: principles and practices for Lean IT Service Management
Marcel van Hal, WBS/Lending & Trade
1-3-2016
The challenge: balancing act?
Reliability / Agility
1. DEVops DevOps
2. Lean IT Service Delivery
3. Prioritize features: functional increments versus technical improvements
Distribution of P1s and P2s* (2015/YtD**)
[Chart not reproduced]
*) only BT and TI+
**) Lending Q1+Q2
DevOps & Lean Service Delivery
Starting with ASL
[Diagram not reproduced; labels: Agile enterprise, Team support, CD]
... And checking ITIL
[Diagram not reproduced; label: Cross team]
Operational IT service management
Columns: Activity, Character, Status, DevOps objective
Activities:
- Service requests
- Incident management
- Availability management
- Capacity management
- Continuity management
- Configuration management
- Info Security management
- Problem management
- Change management
- Service level management
Operational IT service management
Activity | Character | Status | DevOps objective
Service requests | Unpredictable | Too much effort? | Self service & automation
Incident management | Reactive | Manual | Prevention & automation
Availability management | The (lagging!) KPI | Manual | Monitoring & measurement
Capacity management | Proactive | Reactive, manual | Measurement & automation
Continuity management | Licence to operate | Tedious | Leaning & automation
Configuration management | Basic condition | Dev Ops | Integration & automation
Info Security management | Licence to operate | Continuity management?? |
Problem management | Reactive | Ineffective | Spikes in backlog CD
Change management | Alignment | Old school? | Continuous integration?
Service level management | Cross team | Not actionable | Display real-time use feedback
Key activities for lean IT service management
Activity | DevOps objective | Examples
Service requests | Self service & automation | Self service for UAM
Incident management | Prevention and automation | Automate restarts
Monitoring & measurement | Automation | EoDs, queues, CPU, logfiles
Capacity management | Measurement & automation | Memory allocation, table space
Continuity management | Leaning & automation | Soll/Ist (target versus actual) comparison, DR
Configuration management | DevOps integration & automation | Automated updates of UCMDB
Service level management | Displaying real-time use(r) feedback | Use dashboard; invite the user
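The "Automate restarts" example for incident management could look roughly like the sketch below. It is an illustration only: the function name `watchdog_step`, the threshold of three consecutive failures, and the injected `is_healthy` and `restart` callables are assumptions, not something the slides prescribe.

```python
from typing import Callable


def watchdog_step(is_healthy: Callable[[], bool],
                  restart: Callable[[], None],
                  failures: int,
                  threshold: int = 3) -> int:
    """Run one health check and restart after `threshold` consecutive failures.

    Returns the updated count of consecutive failures, to be passed
    into the next call.
    """
    if is_healthy():
        return 0  # a healthy check resets the failure streak
    failures += 1
    if failures >= threshold:
        restart()  # hypothetical restart action supplied by the caller
        return 0   # start counting again after the restart
    return failures
```

Calling this from a scheduler once per interval keeps the restart decision in one small, testable place; the health check and the restart command stay specific to each system.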
DevOps principles
1. DevOps teams are collectively responsible for quality
2. Quality = Functionality (what the system does) x System Qualities (how well it does it)
3. Develop and test against production-like systems (CI)
4. Deploy small increments with repeatable, reliable processes
5. Monitor and validate operational status continuously
6. Amplify feedback loops: measure user experience and the system's behaviour
7. Visual management: display data real-time
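Principle 2 is multiplicative, which means a weak score on either factor drags the whole result down. A minimal sketch, assuming both factors are scored on a 0 to 1 scale (the scale and the function name `quality` are assumptions for illustration):

```python
def quality(functionality: float, system_qualities: float) -> float:
    """Quality = Functionality x System Qualities, both scored from 0 to 1."""
    for value in (functionality, system_qualities):
        if not 0.0 <= value <= 1.0:
            raise ValueError("scores must be between 0 and 1")
    return functionality * system_qualities
```

With this model a feature-complete system that is unavailable (`quality(1.0, 0.0)`) scores zero, which is the argument for prioritizing technical improvements alongside functional increments.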
DevOps practices (Ops view)
1. Define (DoGD) and discuss quality during stand-ups, reviews, and retrospectives (apply rhythm to IT Ops as well)
2. Add problems (spikes) and LCM to the product backlog and have them prioritized
3. Ops engineers to write scripts for SQ tests
4. Ops engineers to write deployment scripts
5. Measure and display status real-time: load, resource utilization, latency, connections, etc.
6. Collect feedback and learn: blameless post mortems, test and measure the system's behaviour, and invite users to your reviews
7. And display and demo what you do for anyone to see: e.g. (Do) Green Days, quality statistics, OCD, Ops Calendar, Problems top ten, and ...
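Practice 5 (measure and display status real-time) can be sketched for the latency case as follows. This is a hedged illustration: the function names, the percentile choice (p50 and p95), and the one-line display format are assumptions; only Python's standard `statistics.quantiles` is used.

```python
import statistics
from typing import Dict, List


def latency_summary(samples_ms: List[float]) -> Dict[str, float]:
    """Summarise one window of latency samples (in milliseconds)."""
    if len(samples_ms) < 2:
        raise ValueError("need at least two samples")
    # n=100 yields 99 cut points; index 49 is the median, index 94 the 95th percentile.
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "max": max(samples_ms)}


def status_line(name: str, samples_ms: List[float]) -> str:
    """Format the summary as a single line for a team dashboard."""
    s = latency_summary(samples_ms)
    return f"{name}: p50={s['p50']:.0f}ms p95={s['p95']:.0f}ms max={s['max']:.0f}ms"
```

Percentiles are used here instead of an average because a few slow requests, which users notice most, disappear in a mean.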
Definition of Done or Definition of Green Days
Shift the DoD to the right, to capture DevOps, or develop a Definition of Green, to guide the one-minute checks, e.g.
- system measurements complete?
- all functions and connections available?
- performance OK?
- load <-> system resources OK?
- system security OK?
- EOD and data integrity OK?
- User feedback
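The one-minute checks above can be expressed as a set of named yes/no checks, where the day is green only if every check passes. A minimal sketch, assuming each check is a callable returning a boolean; the function name `evaluate_green` and the example check names are hypothetical:

```python
from typing import Callable, Dict, List, Tuple


def evaluate_green(checks: Dict[str, Callable[[], bool]]) -> Tuple[bool, List[str]]:
    """Run every check; return (is_green, names of failed checks)."""
    failed = []
    for name, check in checks.items():
        try:
            ok = check()
        except Exception:
            ok = False  # a check that cannot run counts as not green
        if not ok:
            failed.append(name)
    return (not failed, failed)


# Hypothetical checks; real ones would query monitoring, EOD logs, and so on.
example_checks = {
    "system measurements complete": lambda: True,
    "all functions and connections available": lambda: True,
    "performance OK": lambda: False,
}
is_green, failed_checks = evaluate_green(example_checks)
```

Returning the failed check names, and not just a colour, gives the stand-up something concrete to discuss.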
OpsGuild TFS
- Safeguards uninterrupted IT service provisioning
- Understands the user experience and the value of our assets to our customers and users
- Systematically improves the reliability and resilience of our IT assets
- Improves the profile and Body of Knowledge of the Ops community
- Understands and performs middleware engineering
- Automates repeated IT operations for Use support and Continuity management
- Zero Repeat
- Helps other Ops Engineers with good practices (and learns from them as well)
Inspired by agile software delivery...
We're also uncovering better ways of running software
- Shared team responsibility over individual process roles
- Availability and performance over fancy features
- User experience over SLAs and service reports
- Anticipating the system's behaviour over following procedures
Journey map for Reliability
Level | Stability | Operational control | IT risk profile & LCM
5 | Resilience (advanced analytics) | Continuous compliance | Assembly platform
4 | User/customer drive | Auditable (sufficient) | Standard stack (DCR/IPC)
3 | Continuous improvement | Continuous improvement | Continuous improvement
2 | In control: monitoring | In control: OCD 100% | In control: no over-dues
1 | Transparency (measurements!) | Transparency (stories planned) | Transparency (stories planned)
These are a path to perfection, or levels of ambition if you will, rather than maturity levels or fixed goals. Efforts are aimed towards these emerging goals and weighed against other priorities, based on available capacity (WIP limits!), changing objectives, and critical timelines.
Current Change situation is fragmented, not aligned and brings administrative overhead
[Diagram not reproduced; labels: Change, ORM minimum Risk Controls, Process standards, Changes, KPMG Audit, SOX Audit, OCD / KCT Audit]
Zero touch evidence for the change process, change and release with Continuous Compliance
Change Process
- In JIRA / ServiceNow
- Including approvals
Individual Changes
- Including documentation
- Transparency with business
Releases
- Automated version control
- Fully automated release pipeline: automated security checks, automated testing, automated deployment, automated everything
Continuous compliance with XLRelease
- Testing successful?
- Increase in incidents?
- All changes approved?
- Segregation of duties applied?
- 4-eye principle followed?
- Release success, or rollback?
- Related story available?
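Several of the compliance questions above can be evaluated automatically per change. The sketch below is an assumption-laden illustration, not the XLRelease API: the `Change` record, its field names, and the rule that the 4-eye principle requires at least one approver other than the author are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Change:
    """Hypothetical record of one released change."""
    change_id: str
    author: str
    approvers: List[str] = field(default_factory=list)
    story_id: Optional[str] = None
    tests_passed: bool = False


def compliance_issues(change: Change) -> List[str]:
    """Return the deviations found for one change (empty list = compliant)."""
    issues = []
    if not change.tests_passed:
        issues.append("testing not successful")
    if not change.approvers:
        issues.append("change not approved")
    elif all(approver == change.author for approver in change.approvers):
        # Assumed rule: at least one approver must differ from the author.
        issues.append("4-eye principle not followed")
    if change.story_id is None:
        issues.append("no related story")
    return issues
```

Because every rule yields a named deviation, the same function can feed both a release gate and the audit evidence.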
Continuous Delivery with Continuous Compliance

The CD Pipeline
The CD pipeline guides changes through a completely automated process towards production. Make this process inherently compliant with Risk & Compliance requirements, so that malicious or insecure software is prevented from being released to production.

Continuous Compliance
Continuous compliance gathers all information about released changes. Verify the process rather than the standard evidence, and focus on deviations. Provide KPMG with evidence at the push of a button. Do not report on the standard process; prevent deviations and report on exceptions.
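Reporting on exceptions only can be sketched as a filter over per-change check results. The input shape (a mapping from change ID to a list of deviation descriptions) and the function name `exception_report` are assumptions made for illustration:

```python
from typing import Dict, List


def exception_report(results: Dict[str, List[str]]) -> str:
    """Build a report that lists only the changes with deviations."""
    deviating = {cid: issues for cid, issues in results.items() if issues}
    lines = [f"{len(results)} changes checked, {len(deviating)} with deviations"]
    for cid in sorted(deviating):
        lines.append(f"{cid}: " + "; ".join(deviating[cid]))
    return "\n".join(lines)
```

Compliant changes appear only in the total on the first line, so the report stays short when the standard process is followed.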
Continuous Compliance with XLRelease, examples: