Hosting a private cloud
Sven Vermeulen, IT Architect, KBC Group
Let's stage a server
  Request inventory number
  Document server in asset management
  Request IP address (needs inventory number)
  Request hostname (needs IP address)
  Feed server configuration system
  Feed active directory
  Trigger bare-metal staging (OS)
  Trigger software deployments
  Register server in monitoring system
  Register server in capacity management system
  Register middleware, database, messaging, scheduling, file transfer, antivirus, ...
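The staging steps listed above form a dependency chain, which is what makes manual staging slow. A minimal sketch of how such an ordering could be derived automatically is shown here. Only the two "needs" relations come from the slide; every other dependency, and all step names, are assumptions made for illustration and do not describe the actual HPC Console logic.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each step maps to the set of steps that must be finished first.
# Step names are hypothetical identifiers for the slide's bullet points.
STEPS = {
    "request_inventory_number": set(),
    "document_in_asset_management": {"request_inventory_number"},       # assumption
    "request_ip_address": {"request_inventory_number"},                 # from the slide
    "request_hostname": {"request_ip_address"},                         # from the slide
    "feed_server_configuration": {"request_hostname"},                  # assumption
    "feed_active_directory": {"request_hostname"},                      # assumption
    "trigger_bare_metal_staging": {"feed_server_configuration"},        # assumption
    "trigger_software_deployments": {"trigger_bare_metal_staging"},     # assumption
    "register_in_monitoring": {"trigger_bare_metal_staging"},           # assumption
    "register_in_capacity_management": {"trigger_bare_metal_staging"},  # assumption
}


def staging_order(steps):
    """Return one valid execution order; raises graphlib.CycleError on a cycle."""
    return list(TopologicalSorter(steps).static_order())


order = staging_order(STEPS)
for position, step in enumerate(order, start=1):
    print(position, step)
```

Steps without a dependency between them (for example the two registrations) may come out in either order, which is also where an orchestrator could run them in parallel.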
Drivers
1. Faster staging of systems
  End-to-end staging of a server took up to 3 weeks
2. Simpler management
  Strategic choice to evolve towards industrialized software life cycle management (cost reduction)
3. Immediately correct chargeback
  The decision on which business unit to charge a system to was often taken (too) late
Constraints
1. Financial
  Money had to be spent wisely and with a short ROI
2. Short time-to-market
  No multi-quarter project before the first value was seen
Decisions
No, we did not pick a commercial software solution
  Expected time to market would be too high
    - Over a year to get into production in a reasonable state
    - Need for customizations was high
  Financial implications of the project were significant
    - Software license alone was significant
    - Customization cost (man-days of development) was high
Integrations offered by commercial solutions were not within expectations
  BMC Server Automation for software deployments
  HP Asset Manager (later ServiceNOW) for asset management
  Infoblox for DNS and DHCP
  VMWare Virtual Center
  Other integrations were already expected to follow closely
    - SQL Server & Oracle central management infrastructure
    - Sterling (IBM) Connect:Direct node configuration
    - IBM TWS scheduling registration
    - ...
Yet doing it ourselves would result in a setback in the long run
  We are not a software engineering company
    - And with the financial market spiraling downwards
  Every new technology would require customizations
    - Customization could only start after the technology was on-boarded
  No view on the integration requirements with 3rd-party cloud solutions
    - How would Hybrid Cloud solutions evolve?
    - What requirements would be imposed by NBB/ECB?
Decision: Do It Ourselves, tactical solution
  Agile development of our own portal & orchestration logic
    - Sprints of 3 months
    - Production result after every sprint
  Close integration with existing systems
    - Portal & logic had to steer and decide on every step
  Leverage benefits from standardized operating system platforms (which we call STACS)
    - Indirectly pushing more internal customers to this STACS concept
    - Further standardization of these platforms
The Result
Sprint 1: Staging Linux/Windows
  In less than 200 md and with no additional software expenses
  Portal usable for project leads & architects
    - Integrated with security requirements (TAMeB, IDM solution, ...)
    - Standard look and feel (.NET / LightSwitch)
  Virtual Linux/Windows server staged in less than a day
    - Still using staging, not imaging technology
  Server immediately assigned (charging) to the right business unit
    - No server gets staged without a business service
Charging (back then)
  Charging was based on a multitude of parameters
    - Type of server
    - Number of CPUs
    - Deployed software (middleware, integration, database, ...)
    - Storage assignment
    - Backup assignment
    - Network definitions & integration
    - OS choice
  Yet it still did not resemble reality
Sprint 2: Decommission server
  Update server resources
    - Change CPU and memory (sounds simple, but isn't)
  Introduce deactivation and then decommissioning
    - And not only for systems staged through HPC Console
    - Enables quick provisioning of test systems
  Update integrations due to changing environments (sigh)
    - Move to different backup solution (Symantec NetBackup)
    - Move to different asset management (ServiceNOW)
    - Redesign of managed file transfer environment
Sprint 3: Sandbox system
  Deploy VMWare images as isolated systems for proofs of concept
  No data flows towards internal data center
    - Security requirement from sandbox perspective
  No KBC-specific deployment requirements
    - Number one complaint when doing functional tests (pre-sales)
  Simple user management for sandbox systems
    - Main concern with stakeholders
Final Sprint: Pause server
  Reduce CPU and memory consumption on VMWare clusters
    - Better control of available resources (and cost)
  No charging of paused systems to end customer
    - No real confirmed business case though, so focus on fast, low-cost implementation
  Some restrictions still apply
    - If paused for longer than the backup retention period, a snapshot backup is needed (manual for now)
    - Reactivation requires applying all outstanding security patches (we do monthly patching, so this can be quite a queue)
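The two restrictions on paused servers can be expressed as simple checks. The sketch below is illustrative only: the 30-day retention value, the function names and the way patch rounds are counted (calendar-month boundaries crossed) are assumptions, not values or logic taken from the HPC Console.

```python
from datetime import date

# Hypothetical value: the slides do not state the actual backup retention.
BACKUP_RETENTION_DAYS = 30


def needs_snapshot_backup(paused_on, today, retention_days=BACKUP_RETENTION_DAYS):
    """True when the pause outlasts backup retention, so a snapshot backup is needed."""
    return (today - paused_on).days > retention_days


def missed_patch_rounds(paused_on, today):
    """Approximate the monthly patch rounds missed while paused.

    Assumption: one patch round per calendar month, counted as the
    number of month boundaries crossed during the pause.
    """
    return (today.year - paused_on.year) * 12 + (today.month - paused_on.month)


paused_on = date(2013, 1, 15)
today = date(2013, 4, 20)
snapshot_needed = needs_snapshot_backup(paused_on, today)
patch_queue = missed_patch_rounds(paused_on, today)
print(f"snapshot backup needed: {snapshot_needed}, patch rounds to apply: {patch_queue}")
```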
Obligatory screenshot
Did we reach our goals?
Results
  540 md savings per year
    - 300 servers staged per year (mid-2013)
    - 1.8 md less work per staging
  Increased quality and security posture through industrialization
  Delivery time decimated
    - Includes middleware and database initiation & integration
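The savings figure above follows directly from the two numbers beneath it. As a quick check of the arithmetic (variable names are illustrative; `Decimal` is used only to avoid floating-point rounding in the multiplication):

```python
from decimal import Decimal

servers_staged_per_year = 300           # from the slide (mid-2013)
md_saved_per_staging = Decimal("1.8")   # from the slide, in man-days

yearly_savings_md = servers_staged_per_year * md_saved_per_staging
print(f"Yearly savings: {yearly_savings_md} md")
```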
This is (not) the end
  HPC Console itself was a tactical choice
  Chargeback system has been refactored since then
    - Much simpler (definitely, but now also obviously not realistic)
  We know the market has matured further since the inception
  Experience with the tool has taught us the ropes of
    - Building custom integrations (and what to look out for)
    - Functional and non-functional requirements (& pitfalls)
  Updated cloud strategy at KBC Group gives us plenty to think about for the (already short-term) future
?
Th-th-th-that's all folks