Big Data: Next Big Thing or Big Distraction?
Dirk Quartel and Richard Tweedie
June 14
What is Big Data?
- Data whose size is beyond the ability of typical database and software tools to capture, store and analyse
- This is a moving definition as tools and amounts of data increase
  - As technology improves, the size threshold for "big data" will rise
  - The definition is relative to the current status of the user: CCR would be considered Big Data by some
  - The definition varies by industry
- Generated by a variety of sources and activities:
  - Communicating
  - Browsing
  - Buying
  - Sharing
  - Doing (video)
Types of Big Data
- Traditional enterprise data: includes customer and account management information, transactional data, access logs, and general ledger data
- Machine-generated/sensor data: includes images, sound recordings, smart meters, manufacturing sensors, system logs, trading systems data
- Social data (unstructured): includes customer feedback streams, micro-blogging sites like Twitter, social media platforms like Facebook
Concept changes
- Volume
  - 30 million networked sensor nodes and growing
  - 1 second of video is 20,000 times the information of a page of text
  - Growing at 40% per year
  - Nobody seems willing to estimate how much data there actually is
- Variety
  - Mobile phones
  - Internet use
  - Purchasing information
  - Video files
- Velocity
  - Data is now collected, processed and used in real time
  - Processes that were once quarterly or monthly are now completed when new data arrives
  - Similar to manual, but data is collected at many locations by many sources
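As a rough check on what the 40% per year figure implies, the compounding can be worked through directly. This is a minimal sketch, assuming the 40% refers to total data volume and stays constant from year to year; the function names are illustrative and not from the presentation.

```python
import math

ANNUAL_GROWTH = 0.40  # the "40% per year" figure from the slide


def growth_factor(years: float, rate: float = ANNUAL_GROWTH) -> float:
    """Multiple of today's volume after `years` of compound growth."""
    return (1.0 + rate) ** years


def doubling_time(rate: float = ANNUAL_GROWTH) -> float:
    """Years needed for the volume to double at a constant annual rate."""
    return math.log(2.0) / math.log(1.0 + rate)


if __name__ == "__main__":
    print(f"Doubling time: {doubling_time():.2f} years")
    print(f"Volume after 5 years: {growth_factor(5):.2f}x today's volume")
```

Under these assumptions the volume doubles roughly every two years, so storage and processing capacity planned for today's volume would be outgrown quickly.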
Example Use: ATM Fraud
- Goal: use the camera on the ATM to identify the customer through facial recognition, reducing fraud
- Whenever a customer makes a withdrawal, take and store an image of the customer
- As the number of images increases, and once a reliable recognition model is viable, use it as a secondary authentication process
- If the person fails identification, a third authentication method, such as a code sent to the registered mobile, could be used to authenticate
- If authentication still fails, a transaction limit would be applied
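The escalation steps in the ATM example can be sketched as a small decision function. This is an illustrative sketch only: the match threshold, the restricted limit, and all names below are assumptions rather than anything specified in the presentation, and the facial recognition model itself is represented simply by a score passed in.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed values for illustration; a real deployment would set these from policy.
FACE_MATCH_THRESHOLD = 0.90   # minimum score to accept the facial match
RESTRICTED_LIMIT = 100.0      # transaction limit applied when all checks fail


@dataclass
class Decision:
    approved_amount: float
    method: str


def authorise_withdrawal(requested: float,
                         face_score: Optional[float],
                         code_entered: Optional[str],
                         code_sent: str) -> Decision:
    """Apply the secondary and third authentication steps in order.

    face_score is None while no reliable recognition model exists yet,
    in which case only the primary (card and PIN) check applies.
    """
    if face_score is None:
        return Decision(requested, "card_and_pin_only")
    # Secondary authentication: facial recognition.
    if face_score >= FACE_MATCH_THRESHOLD:
        return Decision(requested, "face")
    # Third authentication: code sent to the registered mobile.
    if code_entered is not None and code_entered == code_sent:
        return Decision(requested, "mobile_code")
    # Authentication still fails: apply a transaction limit.
    return Decision(min(requested, RESTRICTED_LIMIT), "limited")
```

For example, `authorise_withdrawal(500.0, 0.40, "000000", "123456")` fails both the facial match and the mobile code, so the withdrawal is capped at the assumed restricted limit rather than declined outright.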
Big Data Concepts and Definitions - the foundations are still the same
- We see the same foundations in Big Data that we also see in Little Data
  - Aggregating data
  - Combining data
  - Accessing data
  - Sharing data
  - Storing data
  - Insights from data
- Big data brings new analytical techniques and data processing frameworks into play; these would also be helpful for current data
The more things change, the more they stay the same...
- Despite the changes, without analytical insight and management action there is no value added:
  - Need to communicate the meaning of data
  - Need to educate senior management on the benefits of a new process
  - Insight needs to drive action and change via business decisions
- To deliver this, a high level of trust will need to be developed in these new data sources if they are to be used
  - What is the current level of trust in traditional data sources in your organisation?
  - Can this be achieved for the new data sources?
Are we ready for the added complexity?
- Analytical eco-systems will need to be developed to handle new levels of data variety, volume and velocity
- Will stakeholders understand the new concepts?
  - Senior Management
  - Regulators
  - Non-analysts
  - Consumer Law Centres
- Privacy issues?
- Is your small data all under control?
- There is an estimated skills shortage of 50% in 5 years' time
- Only 15% of companies are ready to take competitive advantage of Big Data
ANALYTIC ECOSYSTEM SUSTAINABLE VALUE
What is the Analytic Ecosystem?
- A group of systems that, while having their own agendas, collaborate to provide a benefit that no individual system could provide on its own
Typical Evolution in an Organisation
- Reactionary
  - Business requests dealt with on an individual basis, no long term vision
- Ad-hoc
  - No long term reporting or analytics roadmap
  - Workload increase without budget
- Innovative chaos
  - Lack of funding and strategic guidance
  - Just get it done!
Reaching the Choke Point
- Business as Usual Cost
  - Increase in FTE becomes more and more difficult to justify
- Time to deliver
  - Growth in environment complexity increases the time it takes to deliver
- Stability
  - Patchwork systems regularly fail
  - Regular issues in outputs erode trust
- Project overrun
  - Unexpected complexities in the data and system make estimation difficult
Analysis of Choke Point: Why Doesn't It Scale?
- Data Management
  - No central data management
  - No data quality standards
  - No data delivery standards
- Quality of code
  - Little re-usability of processes, and conflicting business definitions
  - Lack of coding standards produces artefacts that are difficult to maintain and test
  - No defensive programming
  - No peer reviews
  - No automated test beds
- Operational Issues
  - Low development standards impair operations
  - Increasing amounts of time spent on investigation
Components of a Sustainable Ecosystem
- Management
  - Governance
  - Internal promotion of activities and benefits
- Business Value
  - Reporting
  - Performance Monitoring
  - Decision Automation
  - Insight Discovery
  - Innovation
  - Industry Pollination
- Structure
  - Training
  - Data Management
  - Code Assets
- Foundation
  - Tools
  - Storage/Data Environment
  - Coding Standards
Special Considerations for Big Data
- IT infrastructure is key
  - Volume, velocity and variety
- Methodology considerations
  - How should the analytic approach change to make use of the data?
- Application considerations
  - How can the output be integrated back in with the Enterprise data?
  - How can benefit be gained and tracked?
Big Data and Risk Analytics
- Data would need to come from reliable and repeatable sources
  - Does Big Data equal Big Data Issues?
- What would the sources of Big Data be in banking?
  - Transaction data, real time market feeds, customer service data, correspondence, response to social media activity
- What types of analysis/models could use Big Data?
  - NO
    - Decisioning models?
    - Basel/provisioning models?
  - MAYBE
    - Collections
    - Counterparty Risk evaluation
  - YES
    - Customer Insight analysis / Understanding of Customer Portfolio
    - Profiling: what's happening?
    - Fraud
    - Marketing
Summary
- There are many interesting aspects and possibilities surrounding Big Data
- Big Data is talked about by executives - the message has been delivered
  - This means that analytics has a seat at the executive table
  - The discussion is about value-add, not cost
- Fewer than 15% of banks have a Big Data initiative
  - Late starters will join in a "me too" approach
- Is CCR the first "big data" battle ground for credit risk?
- Make sure current little data eco-systems are in good shape - need to show we can walk before we try to run
THANK YOU FOR YOUR TIME AND ATTENTION
ABOUT CONNECTED ANALYTICS OUR STORY, OUR PRINCIPALS
Be Better Connected: Our Story

Professional Services
Connected Analytics is an analytics professional services organisation. We provide services and resources to help our clients achieve their analytic goals. We are absolutely committed to delivery excellence and to maintaining an employment proposition that gives our clients access to some of the very best talent our industry has to offer. As a privately owned and locally based business, we can maintain our delivery focus and provide a more commercial proposition to our clients by operating beyond the revenue and margin targets that dominate many professional services and software companies.

Talent
With our delivery excellence and talent focussed approach, Connected Analytics is developing a community of analytics professionals who can help with any stage of your analytics journey. From data sourcing and extraction, data transformation and governance, model build and implementation, and monitoring and validation, to architecture and future state, we work in all components of the analytics eco-system. We aim to achieve a reputation for delivery that allows us to partner with our clients for the longer term, create lasting insight and continually help clients increase their analytic maturity.

Delivery Flexibility
Whether you want us to manage a project and deliver outcomes for you, provide the services of a consultant from our community or refer a person we trust to join your team, we know the right people and have access to the analytic talent for the job.
Our Principals at Conference
- Dirk Quartel, Senior Analytics Specialist
  dirk@connectedanalytics.com.au
  +61 487 334 064
- Richard Tweedie, Analytic Architect
  rich@connectedanalytics.com.au
  +61 434 12 61 94
info@connectedanalytics.com.au
Connected Analytics 2013
Level 2 Riverside Quay, 1 Southbank Boulevard, Melbourne, VIC 3006