G-Cloud Service Definition Canopy Big Data proof of concept Service SCS
Canopy Big Data Proof of Concept (POC) Service is a consulting service that helps the Customer to explore opportunities provided by Big Data with minimal upfront investment. The end result is a full report of the findings from the POC and an interactive presentation of the insight found.

The key benefits of this service are:
- It demonstrates the potential value of Big Data through a Proof of Concept approach
- It has a low cost of entry
- It shows the Customer how to make better, more informed business decisions, using sophisticated analysis of multiple data sources
- It offers the ability to add external data sources (e.g. social media) to enrich the insights available from internal data sources alone

What is it?

This service provides a cloud-based Big Data platform to allow clients to explore the opportunities provided by Big Data with minimal upfront investment. The Big Data Proof of Concept (POC) approach provides clients with the Big Data expertise and platform to test a particular business use case within an agreed scope. It's a modular engagement that lasts 6-8 weeks depending on the scale, complexity and goals of the POC.

Canopy's experienced information specialists and in-house Big Data platform allow the Customer to test a particular business use case or hypothesis. On completion of the client-specific POC, the environment will be made available for the client to view and share with other individuals in their organisation for a month, after which the POC environment will be switched off.
The POC boundaries and restrictions are as follows:
- The POC will be developed for a single business use case
- The POC will include a maximum number of data sources:
  - Structured = 2
  - Semi-structured/Unstructured = 1
- The data volumes of the POC cannot exceed 320 GB
- The POC environment is developed for batch processing of data and will not be suitable for use cases that require real-time processing

The Customer will be responsible for the following:
- Ensuring key individuals (client SMEs) are available for the period of the POC development activity
- Ensuring the right individuals, at the right level of seniority, are invited to the POC and Insight presentation
- Providing the Canopy Big Data Consultant and Technical Specialist with access to the systems and/or data sources relevant to the chosen Big Data use case
- Providing any technical documentation on the relevant data sources

In addition to a more traditional waterfall approach, Canopy supports both Agile and Iterative lifecycle models. The Canopy Analytics Agile Methodology is described in terms of a hierarchical process model, consisting of sets of tasks described at two levels of abstraction (from general to specific): six main phases, and specific steps in each phase. Each step is described by four categories:
- Key Activities
- Outputs
- Resources
- Recommendations / Errors to Avoid

Each step is illustrated further by specific examples of how it has been implemented by individual use case streams of the Canopy Analytics POCs project.

What makes us unique?

Everybody is talking about Big Data. The client's journey into the world of Big Data is driven by the promise of new insight, of new and actionable intelligence revealed. For Canopy, Big Data is the same but different. What does this mean? In many ways, it is a straight continuation of what we have been doing for years: helping our clients maximise the operational and business advantage gained from their digital assets. But it is different too. The scale of the data explosion, the growing emphasis on deriving value from unstructured data, and the massive increases in connectivity in the internet of things all help to form a new data landscape. In this new landscape, those who are best able to generate differentiating intelligence, often in real time, will be the ones who achieve and sustain success.

Canopy offers the relevant world-class competencies and customised solutions necessary to handle Big Data and get more value out of it. Canopy is a key supplier to the Public Sector, delivering wide-ranging and secure systems for major customers such as DWP, HMRC, MOJ and DoH. As such, we are well placed to understand the requirements of the Public Sector market and to recommend the most appropriate methods and tools.
Canopy recognises that many organisations are wondering how Big Data could benefit their business. They are finding that data growth and diversity are outpacing traditional management information technologies, but there is still a worry about the implications of Big Data: what is it, and do I really need to embrace it? If the Customer wants to see first-hand the kinds of insights and predictive models that Big Data can produce, then the Canopy Big Data POC can help the Customer explore the possibilities.

The Big Data POC Service can include, but is not limited to, the following activities:
- Collaboratively examine the use case and define the sources of data required
- Work with the Customer to extract the data
- Integrate the data into the Canopy POC environment
- Define and execute MapReduce algorithms based on the use case
- Provide analysis and visualisation of findings

This service provides a cloud-based Big Data platform to allow clients to explore the opportunities provided by Big Data with minimal upfront investment. The Big Data Proof of Concept (POC) approach provides clients with the Big Data expertise and platform to test a particular business use case within an agreed scope.

The POC service includes the following high-level activities:
- Business assessment: understanding the business needs and objectives to define the Big Data POC use case or hypothesis
- Data assessment: auditing the real-world information options available, both internally and externally, to define which sets of information are relevant to the POC challenge
- Solution development: loading the identified data sets and analysing the information with respect to the use case or hypothesis defined for the POC
- Insight: creating visualisations that demonstrate the value of the Big Data POC on real business data using analytical models

What benefits can a Big Data POC Service deliver?
- Provide a Big Data POC specific to the Customer's requirements
- Develop and prove a use case or hypothesis of the Customer's choice
- Test and verify the validity of the data sources
- Help the Customer better understand the risks and benefits of implementing a Big Data solution
- Provide the basis for a decision on whether to implement a Big Data solution
Contents
1 Introduction
  1.1 Service summary
  1.2 How this product can be used
2 Service overview
  2.1 Enabling Strategy
  2.2 Business & User Needs
  2.3 Capabilities
  2.4 Agile & Iterative
  2.5 Canopy Big Data Platform
  2.6 Service Roadmap
3 Information Assurance
4 Backup/Restore and Disaster Recovery
5 On-boarding and off-boarding
6 Pricing
  6.1 Other Professional services
  6.2 Termination terms
7 Service management
8 Service constraints
9 Service levels
10 Financial recompense
11 Training
12 Ordering and invoicing process
13 Termination Terms
  13.1 By consumers (i.e. consumption)
  13.2 By the Supplier (removal of the G-Cloud Service)
14 Data restoration / service migration
15 Consumer responsibilities
16 Technical Requirements
17 Trial service
1 Introduction

1.1 Service summary

Canopy Big Data Proof of Concept (POC) Service is a consulting service that helps the Customer to explore opportunities provided by Big Data with minimal upfront investment. The end result is a full report of the findings from the POC and an interactive presentation of the insight found.

The key benefits of this service are:
- It demonstrates the potential value of Big Data through a proof of concept
- It has a low cost of entry
- It shows the Customer how to make better, more informed business decisions, using sophisticated analysis of multiple data sources
- It offers the ability to add external data sources (e.g. social media) to enrich the insights available from internal data sources alone

Canopy recognises that many organisations are wondering how Big Data could benefit their business. They are finding that data growth and diversity are outpacing traditional management information technologies, but there is still a worry about the implications of Big Data: what is it, and do I really need to embrace it? If the Customer wants to see first-hand the kinds of insights and predictive models that Big Data can produce, then the Canopy Big Data POC can help the Customer explore the possibilities.

The Big Data POC Service can include, but is not limited to, the following activities:
- Collaboratively examine the use case and define the sources of data required
- Work with the Customer to extract the data
- Integrate the data into the Atos Canopy POC environment
- Define and execute MapReduce algorithms based on the use case
- Provide analysis and visualisation of findings

1.2 How this product can be used

This service provides a cloud-based Big Data platform to allow clients to explore the opportunities provided by Big Data with minimal upfront investment. The Big Data Proof of Concept (POC) approach provides clients with the Big Data expertise and platform to test a particular business use case within an agreed scope.
The POC service includes the following high-level activities:
- Business assessment: understanding the business needs and objectives to define the Big Data POC use case or hypothesis
- Data assessment: auditing the real-world information options available, both internally and externally, to define which sets of information are relevant to the POC challenge
- Solution development: loading the identified data sets and analysing the information with respect to the use case or hypothesis defined for the POC
- Insight: creating visualisations that demonstrate the value of the Big Data POC on real business data using analytical models

What benefits can a Big Data POC Service deliver?
- Provide a Big Data POC specific to the Customer's requirements
- Develop and prove a use case or hypothesis of the Customer's choice
- Test and verify the validity of the data sources
- Help the Customer better understand the risks and benefits of implementing a Big Data solution
- Provide the basis for a decision on whether to implement a Big Data solution
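For readers unfamiliar with the MapReduce algorithms mentioned among the POC activities in section 1.1, the following is a minimal, single-process sketch of the pattern in Python. It is illustrative only and does not represent the scripts or tooling Canopy uses on its platform; the names map_reduce, count_words, total and sample_lines are hypothetical, and a real POC would run equivalent logic on a distributed framework such as Hadoop.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def map_reduce(
    records: Iterable[str],
    mapper: Callable[[str], Iterator[Tuple[str, int]]],
    reducer: Callable[[str, List[int]], int],
) -> Dict[str, int]:
    """Run the map, shuffle and reduce phases in memory (illustration only)."""
    groups: Dict[str, List[int]] = defaultdict(list)
    # Map phase: each record yields zero or more (key, value) pairs.
    for record in records:
        for key, value in mapper(record):
            # Shuffle phase: values are grouped by key.
            groups[key].append(value)
    # Reduce phase: each key's values are collapsed into one result.
    return {key: reducer(key, values) for key, values in groups.items()}


def count_words(line: str) -> Iterator[Tuple[str, int]]:
    """Hypothetical mapper: emit (word, 1) for every word in a line of text."""
    for word in line.lower().split():
        yield word, 1


def total(key: str, values: List[int]) -> int:
    """Hypothetical reducer: sum the counts emitted for one word."""
    return sum(values)


# Made-up sample input standing in for an unstructured data source.
sample_lines = ["big data proof of concept", "big data platform"]
word_counts = map_reduce(sample_lines, count_words, total)
```

The mapper and reducer are the only parts that change from one use case to another, which is why the pattern suits the batch-oriented analysis described for the POC.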
2 Service overview

2.1 Enabling Strategy

This service provides a cloud-based Big Data platform to allow clients to explore the opportunities provided by Big Data with minimal upfront investment. The Big Data Proof of Concept (POC) approach provides clients with the Big Data expertise and platform to test a particular business use case within an agreed scope. It's a modular engagement that lasts 6-8 weeks depending on the scale, complexity and goals of the POC.

Canopy's experienced information specialists and in-house Big Data platform allow the Customer to test a particular business use case or hypothesis. On completion of the client-specific POC, the environment will be made available for the client to view and share with other individuals in their organisation for a month, after which the POC environment will be switched off.

2.2 Business & User Needs

The POC boundaries and restrictions are as follows:
- The POC will be developed for a single business use case
- The POC will include a maximum number of data sources:
  - Structured = 2
  - Semi-structured/Unstructured = 1
- The data volumes of the POC cannot exceed 320 GB
- The POC environment is developed for batch processing of data and will not be suitable for use cases that require real-time processing

The Customer will be responsible for the following:
- Ensuring key individuals (client SMEs) are available for the period of the POC development activity
- Ensuring the right individuals, at the right level of seniority, are invited to the POC and Insight presentation
- Providing the Canopy Big Data Consultant and Technical Specialist with access to the systems and/or data sources relevant to the chosen Big Data use case
- Providing any technical documentation on the relevant data sources

2.3 Capabilities

The service is delivered by experienced, professionally accredited consultants and technologists in the following disciplines:
- Big Data Consultant: will be the single point of contact for the client during the POC activity. The consultant will oversee the whole activity and be actively involved in gathering the requirements and delivering the POC demonstration and insight presentation.
- Big Data Technical Specialist: will lead the development of the POC. The technical specialist will be responsible for loading the client's data into the Canopy POC environment and providing MapReduce scripts to analyse the data as per the business use case.
- Big Data Scientist: will develop algorithms and statistical analysis techniques to extract value and insight from the data.
- Big Data Visualisation Specialist: will lead the visualisation element of the POC. The visualisation specialist will work alongside the technical specialist, interpret the analyses performed on the data and present the insights in a visual form.

2.4 Agile & Iterative

In addition to a more traditional waterfall approach, Canopy supports both Agile and Iterative lifecycle models. The Canopy Analytics Agile Methodology is described in terms of a hierarchical process model, consisting of sets of tasks described at two levels of abstraction (from general to specific): six main phases, and specific steps in each phase. Each step is described by four categories:
- Key Activities
- Outputs
- Resources
- Recommendations / Errors to Avoid

Each step is illustrated further by specific examples of how it has been implemented by individual use case streams of the Canopy Analytics POCs project.

2.5 Canopy Big Data Platform

The Big Data Platform is made up of 3 layers:

1. Data Integration Layer

Enables data movements from the source systems and between platforms as part of an analytical process:
- Moving master data from an MDM system into a data warehouse, an analytical DBMS, or Hadoop
- Moving derived structured data from Hive to a data warehouse
- Moving filtered event data into Hadoop or an analytical RDBMS
- Moving dimension data from a data warehouse to Hadoop
- Moving social graph data from Hadoop to a graph datastore
- Moving data from a graph datastore to a data warehouse
- Capturing real-time and near-real-time data via a variety of mechanisms, including web services and messaging

2. Data Storage and Pre-Processing Layer

Looks at:
- Source data processing to the appropriate output format required by the analytic layer
- Description of coherence mechanisms compliant with EDW and MDM
- Best practices for designing and building pre-processing solutions
3. Analytics Layer

In a sense, this is the most crucial part of the platform: it is where all the analytic processes reside. It can be broken down further into two main elements:
- The Analytic Services, which are well-defined components that implement concrete data analysis algorithms. These can be quite varied and could be implemented using different base technologies, but they all have a defined, concrete scope related to a specific data analysis technique over specific dataset classes. Many of them will be based on machine-learning techniques, using both supervised and unsupervised approaches.
- An Analytic Framework, which acts as the glue that brings together different Analytic Services in order to achieve a concrete business outcome. A specialised programmer could use this framework to implement a complete analytic functionality.

Analytic Apps: combining different Analytic Services and using the capabilities of the Analytic Framework allows the creation of complete analytic functionality, known as Analytic Apps. These Analytic Apps are the elements that expose the final, business-oriented capability to the end user.
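The relationship between Analytic Services, the Analytic Framework and Analytic Apps described above can be pictured with a small Python sketch. This is an assumption-laden illustration, not Canopy's actual framework: the names AnalyticService, summary_service, outlier_service and AnalyticApp are hypothetical, the sample values are made up, and the two "services" are deliberately trivial stand-ins for real analysis algorithms.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Dict, List

# Assumption: a service is any function from a dataset to named results.
AnalyticService = Callable[[List[float]], Dict[str, float]]


def summary_service(values: List[float]) -> Dict[str, float]:
    """Hypothetical service with a narrow scope: basic descriptive statistics."""
    return {"mean": mean(values), "max": max(values)}


def outlier_service(values: List[float]) -> Dict[str, float]:
    """Hypothetical service: count values above twice the mean."""
    average = mean(values)
    return {"outliers": float(sum(1 for v in values if v > 2 * average))}


@dataclass
class AnalyticApp:
    """Plays the 'glue' role: combines services into one business-facing result."""

    name: str
    services: List[AnalyticService]

    def run(self, values: List[float]) -> Dict[str, float]:
        result: Dict[str, float] = {}
        for service in self.services:
            # Each service contributes its own named outputs to the report.
            result.update(service(values))
        return result


app = AnalyticApp("demo", [summary_service, outlier_service])
report = app.run([10.0, 12.0, 11.0, 100.0])  # made-up sample values
```

Keeping each service narrow and letting the app decide which ones to combine mirrors the separation the text draws between concrete algorithms and the business-oriented capability exposed to the end user.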
2.6 Service Roadmap
3 Information Assurance

This product is currently available at Impact Level 0 (IL0). The service can be run at higher Impact Levels, including IL2 and IL3.

Canopy has considerable experience of providing services at different levels of assurance. Canopy, together with parent company Atos, currently has a number of products on G-Cloud that have received Pan Government Accreditation (PGA). Details can be found on the Cabinet Office website at: http://gcloud.civilservice.gov.uk/customer-zone/accreditation-status/
4 Backup/Restore and Disaster Recovery

Not applicable.

5 On-boarding and off-boarding

Not applicable.
6 Pricing

The service is delivered according to the specific needs of each engagement. Canopy pricing is based on per-day rate cards aligned to the SFIA job definitions, with separate rates for onshore and offshore resources. Please refer to the SFIA rate card - Canopy (On & Offshore) for the standard rates for this service.

The primary roles engaged in delivering this service are:

Service item                         SFIA Level(s)
Big Data Visualisation Specialist    4-5
Big Data Technical Specialist        4-5
Big Data Scientist                   5-6

Standards for Consultancy Day Rate:

Consultant's Working Day             8 hours exclusive of travel and lunch
Working Week                         Monday to Friday excluding national holidays
Office Hours                         09:00 to 17:00 Monday to Friday
Travel and Subsistence               Included in day rate within M25. Payable at the department's standard T&S rate outside the M25.
Mileage                              As above
Professional Indemnity Insurance     Included in day rate

6.1 Other Professional services

Available as per the SFIA rate card - Canopy (On & Offshore).

6.2 Termination terms

Please refer directly to Canopy standard Terms and Conditions.
7 Service management

Where this service is purchased free-standing, a project/service management wrapper will be built in to workplans and effort estimates. This will normally include status/progress reporting.
8 Service constraints

The service boundaries and restrictions are as follows:
- The POC will be developed for a single business use case
- The POC will include a maximum number of data sources:
  - Structured = 2
  - Semi-structured/Unstructured = 1
- The data volumes of the POC cannot exceed 320 GB
- The POC environment is developed for batch processing of data and will not be suitable for use cases that require real-time processing
- The POC will be made available to the customer for a month, after which it will be removed
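As an aid to scoping, the constraints above can be expressed as a simple pre-engagement check. The Python sketch below is illustrative only and is not a Canopy tool: the limits are taken from this section, while the names PocRequest and scope_issues, the field names and the sample requests are hypothetical.

```python
from dataclasses import dataclass
from typing import List

# Limits as stated in the service constraints.
MAX_STRUCTURED_SOURCES = 2
MAX_SEMI_OR_UNSTRUCTURED_SOURCES = 1
MAX_DATA_VOLUME_GB = 320


@dataclass
class PocRequest:
    """Hypothetical description of a candidate POC."""

    use_cases: int
    structured_sources: int
    semi_or_unstructured_sources: int
    data_volume_gb: float
    needs_real_time: bool


def scope_issues(request: PocRequest) -> List[str]:
    """Return a list of constraint violations; an empty list means in scope."""
    issues: List[str] = []
    if request.use_cases != 1:
        issues.append("POC covers a single business use case")
    if request.structured_sources > MAX_STRUCTURED_SOURCES:
        issues.append("too many structured data sources")
    if request.semi_or_unstructured_sources > MAX_SEMI_OR_UNSTRUCTURED_SOURCES:
        issues.append("too many semi-structured/unstructured data sources")
    if request.data_volume_gb > MAX_DATA_VOLUME_GB:
        issues.append("data volume exceeds 320 GB")
    if request.needs_real_time:
        issues.append("POC environment is batch only")
    return issues


# Made-up examples: one request within the limits, one outside them.
in_scope = scope_issues(PocRequest(1, 2, 1, 250.0, False))
out_of_scope = scope_issues(PocRequest(1, 3, 1, 400.0, True))
```

A check of this kind only covers the quantitative limits; the choice of use case and data sources would still be agreed with the Canopy Big Data Consultant during the business and data assessments.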
9 Service levels

Canopy provides suitably trained individuals to complete the tasks necessary for this service. They can work at the client site or remotely, depending on the need for access to:
- Client staff
- Client applications & network

Canopy's standard working hours / days are 09:00 to 17:00 Monday to Friday, excluding public & regional holidays.
10 Financial recompense

To minimise the cost to users, Canopy does not provide service credits for use of the service. All Canopy services are provided on a reasonable endeavours basis. Please refer to the G-Cloud terms and conditions.

In accordance with the guidance within the GPS G-Cloud Framework Terms and Conditions, the Customer may terminate the contract at any time, without cause, by giving at least thirty (30) Working Days' prior notice in writing. The Call Off Contract terms and conditions and the Canopy terms will define the circumstances where a refund of any pre-paid service charges may be available.
11 Training

Canopy offers a wide range of training services, from formal classroom-based training to online webinar-style training. In addition, Canopy will develop training materials that elaborate on the Customer's specific business process, rather than generic training based on the product only. Training material can be provided in a format suitable for use via a web browser.
12 Ordering and invoicing process

Ordering this product is a straightforward process. Please forward your requirements to the email address GCloud@canopycloud.com.

Canopy will prepare a quotation and agree that quotation with you, including any volume discounts that may be applicable. Once the quotation is agreed, Canopy will issue the customer with the necessary documentation (as required by the G-Cloud Framework) and ask the customer to provide Canopy with a purchase order. Once the purchase order is received, the customer's services will be configured to the requirements set out in the original quotation. For new customers, additional new supplier forms may need to be completed.

Invoices will be issued to the customer and Shared Services (quoting the purchase order number) for the services procured. On a monthly basis, Canopy will also complete the mandated management information reports to Government Procurement Services detailing the spend that the customer has placed with us. The Cabinet Office publishes a summary of this monthly management information at: http://gcloud.civilservice.gov.uk/about/sales-information/.
13 Termination Terms

13.1 By consumers (i.e. consumption)

Termination shall be in accordance with:
- The G-Cloud Framework terms and conditions
- Any terms agreed within the Call Off Contract under section 10.2 of the Order Form (termination without cause), where the Government Procurement Service (GPS) guidance states "At least thirty (30) Working Days in accordance with Clause CO-9.2 of the Call-Off Contract"
- Canopy Supplier Terms for this Service as listed on the G-Cloud CloudStore

For this specific service, by default Canopy asks for at least thirty (30) Working Days' prior written notice of termination, as per the guidance within the GPS G-Cloud Framework Terms and Conditions.

13.2 By the Supplier (removal of the G-Cloud Service)

Canopy commits to continue to provide the service for the duration of the Call Off Contract, subject to the terms and conditions of the G-Cloud Framework and Canopy Supplier Terms.
14 Data restoration / service migration

Not applicable.
15 Consumer responsibilities

Efficient operation of an applications development project requires timely provision of a range of information and services by the users and client organisation. These will be outlined in the workplan and usually include:
- Context: access to strategies and business plans
- Expertise: access to users for interviews & workshops
- Knowledge: access to current documentation and applications
- Data: access to the data required to undertake the POC
- Facilities: work space, physical access & network/internet access (if on client site)
16 Technical Requirements

Efficient project delivery, during the requirements management & specification stages, usually requires consideration of access to:
- The Canopy network (usually via the Internet)
- The Internet/WWW
- Any shared project areas (e.g. on client or Canopy network or cloud-based)
- Development and test environments
- Client legacy systems
17 Trial service

This service is not available on a trial basis.