Big Data and Analytics at the IRS: Perspectives and Initiatives
Transcription
1 Big Data and Analytics at the IRS: Perspectives and Initiatives
Government Big Data Symposium, March 5-6, 2013
Jeff Butler, Director, Research Databases, IRS Research, Analysis, and Statistics
2 Background
The Internal Revenue Service (IRS) has a large service and enforcement footprint. The table below is from FY
Table categories: Tax Return Processing; Account Management; Customer Service; Enforcement
  234 million tax returns filed
  1.8 billion third-party information returns
  $2.4 trillion in gross receipts
  122 million refunds totaling $415 billion
  319 million visits to IRS website
  83 million toll-free telephone calls
  223 million letters or notices sent to taxpayers
  $116 billion in accounts receivable
3 Types of Research and Analysis
Taxpayer Behavior: failure to file or pay; abusive tax shelters; identity theft; return preparer compliance; misreporting income or deductions; refund fraud; off-shore transactions; financial crimes
Analytic Initiatives: identify patterns of filing and payment non-compliance; predict and prevent ID theft and refund fraud; estimate U.S. tax gap; measure taxpayer burden; optimize case inventories and treatment strategies; simulate effects of tax changes; analyze criminal networks
4 Analytic Data Environment in IRS
IRS enterprise IT manages hundreds of transactional systems and applications
Research organization integrates legacy and third-party data into the Compliance Data Warehouse (CDW)
Compliance Data Warehouse (CDW) Selected Metrics:
  Total data size: ~1.3 PB
  Number of database tables: ~3,100
  Number of unique columns: ~52,500
  Number of searchable metadata attributes: > 1 million
  Number of users: ~1,020
  Average daily queries: ~6,500
5 IRS Analytic Data Environment
[Diagram: Compliance Data Warehouse (CDW)]
Analytic Sandboxes (Examples): Case Optimization; Predictive Modeling; Text Analytics; Simulation
Data Integration Layer; Core Analytic Database; Enterprise Data Integration Layer
Statistical & Mathematical Analysis; Ad-Hoc Query and Reporting
Infrastructure and Services
Other diagram labels: Storage Mgmt; Security/Audit; Monitoring; System Admin; Software Config; Accounts; Metadata; Data Profiling; Data Extracts, Matching; Web Services; Training & Support
6 IRS Analytic Data Environment
[Diagram: Compliance Data Warehouse (CDW)]
Core Database Servers (Sybase IQ, Oracle, SQL Server)
Shared Storage (>2 PB) (DB, Backup, Staging, User)
Application/Web Servers (SAS, R, Hyperion)
IRS Network
Users & Projects; Systems & Applications; Analytic Sandboxes; Other Tools
7 Scale (Volume)
[Charts: Data Size (Terabytes); Average Daily Queries (Third-Party Tools, Web-Based)]
Not all infrastructure/service costs are constant in scale
Massively large environments can have asymmetric challenges
Infrastructure/service areas: Systems & Storage Management; ETL & Database Administration; Metadata & Web Services; Security Audit and Monitoring; Tools, Training, & Support; Analytic Sandboxes
8 Challenges with Scale
I/O bottlenecks when data are off-loaded for analytics
  Single biggest problem for users in massively large environments
  Strategy: Maximize in-database analytics where possible
Finding the optimal mix of ETL tools and techniques
  This is still where data warehousing costs are highest
  Strategy: Stay nimble and avoid one-size-fits-all solutions
Choosing the right database technology
  Is it performance or scale that's really needed?
  CDW is the largest database in the IRS and still uses a columnar DB
  Strategy: Maximize performance for users at smallest O&M cost
Storage management
  Different approach needed in user-based analytic environment
  Strategy: Partition file systems based on user intensity
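The "maximize in-database analytics" strategy can be illustrated with a small sketch. It is not IRS code: SQLite stands in for the warehouse engine, and the `returns` table, its columns, and the sample rows are hypothetical. The first function pulls every row to the client before aggregating (the off-loading pattern the slide warns about); the second lets the database compute the same result and returns only the summary rows.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical 'returns' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE returns (tax_year INTEGER, filing_status TEXT, refund REAL)"
    )
    conn.executemany(
        "INSERT INTO returns VALUES (?, ?, ?)",
        [
            (2011, "single", 100.0),
            (2011, "single", 300.0),
            (2011, "joint", 500.0),
            (2012, "single", 250.0),
        ],
    )
    return conn


def offloaded_mean_refund(conn):
    """Off-loading pattern: every row crosses the wire, then is aggregated client-side."""
    rows = conn.execute(
        "SELECT tax_year, filing_status, refund FROM returns"
    ).fetchall()
    totals = {}
    for year, status, refund in rows:
        running_sum, count = totals.get((year, status), (0.0, 0))
        totals[(year, status)] = (running_sum + refund, count + 1)
    return {key: s / n for key, (s, n) in totals.items()}


def in_database_mean_refund(conn):
    """In-database pattern: the engine aggregates; only one row per group is returned."""
    rows = conn.execute(
        "SELECT tax_year, filing_status, AVG(refund) "
        "FROM returns GROUP BY tax_year, filing_status"
    ).fetchall()
    return {(year, status): avg for year, status, avg in rows}
```

Both functions give the same answer on the sample data. The difference is how much data leaves the database: all rows in the first case, one row per group in the second.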
9 Timeliness (Velocity)
[Chart: Data Arrival Rate (Annual, Quarterly, Monthly, Weekly, Daily) and Ingest-Release Latency]
Data arrival rates are different from data delivery rates
Minimizing this difference is inherently an ETL problem
Stages: Data Extract/Feed; Validation/Preprocessing; Integration/Post-processing; Analysis/Modeling; Interpretation/Action
10 Challenges with Velocity
The larger the data size, the longer the processing time
  Let $P_{ij}$ and $S_{ij}$ = processing time and size of data set $i$ with frequency $j$; $i, j = 1, 2, \ldots, n$
  The problem is $\arg\min_{\theta_{ij}} (P\ S)_{ij} + \varepsilon_{ij}$
  Processing time varies with scale (and complexity)
  Disturbances $\varepsilon_{ij}$ are unavoidable (e.g., server maintenance)
Data may require validation, standardization, and cleaning
  No two data sets are the same
Structured vs. unstructured data
  What is the impact of frequent schema changes on data delivery times for structured data?
  Do skills exist for processing unstructured data at any speed?
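One way to make the gap between arrival and delivery concrete is to record, for each delivery, when the feed arrived and when it was released to analysts, then summarize the latency by arrival frequency. The sketch below assumes that definition of ingest-release latency. The `Delivery` record, its field names, and the sample dates are hypothetical, not taken from the presentation.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from statistics import mean


@dataclass(frozen=True)
class Delivery:
    dataset: str      # hypothetical data set name
    frequency: str    # arrival frequency, e.g. "weekly" or "annual"
    arrived: date     # when the extract/feed landed
    released: date    # when the data became available to analysts


def latency_days(delivery: Delivery) -> int:
    """Ingest-release latency in whole days for one delivery."""
    return (delivery.released - delivery.arrived).days


def mean_latency_by_frequency(deliveries):
    """Average latency in days for each arrival frequency."""
    buckets = defaultdict(list)
    for d in deliveries:
        buckets[d.frequency].append(latency_days(d))
    return {freq: mean(values) for freq, values in buckets.items()}


# Made-up deliveries for illustration only.
SAMPLE = [
    Delivery("feed_a", "weekly", date(2013, 1, 7), date(2013, 1, 9)),
    Delivery("feed_a", "weekly", date(2013, 1, 14), date(2013, 1, 18)),
    Delivery("feed_b", "annual", date(2013, 2, 1), date(2013, 2, 21)),
]
```

Tracking this number per data set over time shows whether ETL changes are actually narrowing the difference between arrival and delivery.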
11 Heterogeneity (Variety)
Sources of IRS Data: Taxpayers; Employers; Preparers; Banks; Brokers; Non-Profits; Interagency; Fed/State; Treaty Partners; Intermediaries
Types of IRS Data: Forms; Schedules; Worksheets; Attachments; Images; Correspondence; Transactions; Phone Calls; Notices; Transcripts
Source Systems and Data Formats: Mainframe; Unix/Linux; Windows; Databases; VSAM; Flat Files; Applications; DB tables; Fixed format; Hierarchical; Delimited; Packed decimal; XML; Plain text
Overwhelming majority of IRS data are still structured
Most transaction systems are still file-based
Challenge: skills needed to parse and analyze text
  Information extraction and entity resolution techniques (NLP)
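Two of the formats listed above, fixed format and packed decimal, often appear together in mainframe extracts. The sketch below shows how such a record might be decoded. The record layout (`LAYOUT`), field names, two-decimal scale, and sample bytes are all hypothetical, and the text fields are assumed to be EBCDIC (Python's `cp037` codec). Packed decimal stores two digits per byte, with the low nibble of the last byte as the sign (0xC or 0xF positive, 0xD negative).

```python
from decimal import Decimal


def unpack_comp3(raw: bytes, scale: int = 0) -> Decimal:
    """Decode a packed-decimal field: two BCD digits per byte, sign in the last nibble."""
    if not raw:
        raise ValueError("empty packed-decimal field")
    nibbles = []
    for byte in raw:
        nibbles.append(byte >> 4)
        nibbles.append(byte & 0x0F)
    *digits, sign = nibbles
    if any(d > 9 for d in digits):
        raise ValueError("invalid digit nibble")
    if sign not in (0x0C, 0x0D, 0x0F):
        raise ValueError("unexpected sign nibble")
    value = int("".join(str(d) for d in digits))
    if sign == 0x0D:
        value = -value
    # Shift the decimal point left by `scale` places.
    return Decimal(value).scaleb(-scale)


# Hypothetical layout: (field name, start offset, end offset, kind).
LAYOUT = [
    ("record_id", 0, 9, "text"),
    ("tax_year", 9, 13, "int"),
    ("amount", 13, 17, "comp3"),
]


def parse_record(record: bytes) -> dict:
    """Slice a fixed-format record by byte offsets and decode each field."""
    parsed = {}
    for name, start, end, kind in LAYOUT:
        chunk = record[start:end]
        if kind == "comp3":
            parsed[name] = unpack_comp3(chunk, scale=2)
        else:
            text = chunk.decode("cp037").strip()  # assumed EBCDIC text
            parsed[name] = int(text) if kind == "int" else text
    return parsed


# Made-up record: id "000000001", year 2012, amount -123.45.
SAMPLE_RECORD = (
    "000000001".encode("cp037")
    + "2012".encode("cp037")
    + bytes([0x00, 0x12, 0x34, 0x5D])
)
```

Calling `parse_record(SAMPLE_RECORD)` yields a plain dictionary that can be loaded into a database table. Real extracts would need the layout from the source system's record definition.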
12 Metadata and Information Quality
Searchable Metadata Framework and Strategy
  Simple reference model is used to guide consistency of searchable artifacts
  Combination of system, contextual, and application attributes
  Controlled vocabulary for key descriptive elements
[Chart: Columns; Columns with Metadata]
Strategy favors basic discoverability rather than systematized collections
Data for analytics must be searchable, understandable, and semantically consistent
Metadata is the nucleus of any data quality strategy
Trust and confidence in data should be invariant to scale
13 Metadata and Information Quality
Stages of Metadata Collection
[Diagram: Source Systems (Database, Flat File, VSAM); Extract, Transform, Load, Validate; Staging; DW; Roll-Ups; Query, Analysis, Reporting]
Source Metadata; ETL/T Metadata; Data Model Metadata; Report Metadata; Central Metadata Repository
Metadata are collected at each stage of the data supply chain
14 Metadata and Information Quality
System Metadata: Physical properties, data movement, ETL/T, and workflow artifacts
Contextual Metadata: Attributes, references, and other searchable content
Application Metadata: Context-dependent logic, conditional rules, and dynamic processing
Source System Characteristics: System properties; File or table names; Data element names and definitions; Data types; Transformation rules; Cross-references
Target System Properties: Table names; Column names; Data types; Indexes; Partitions or table spaces
Data Attributes: Authoritative system; Data element name and definition; Availability; Data type; Join paths; Legacy source reference; User reviews; Links to context-dependent data
Publishing Standards: Web-based; Standard format; Hierarchical and free-form search
Web-Based Logic: Reports and roll-ups; Lookup tables; URLs and other links; External communication
Profiling: Frequencies; Statistical distributions; Trend analysis; Geographic maps
Reviews: User ID; Table/column reference; Feedback
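The three metadata categories above can be pictured as one record per column, with a free-form search over the descriptive attributes and a simple frequency profile. This is an illustrative sketch only. The `ColumnMetadata` class, its attribute keys, and the catalog entries are hypothetical and do not describe the actual CDW metadata repository.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ColumnMetadata:
    table: str
    column: str
    system: dict = field(default_factory=dict)       # physical properties, ETL artifacts
    contextual: dict = field(default_factory=dict)   # definitions and other searchable content
    application: dict = field(default_factory=dict)  # context-dependent logic and rules

    def searchable_text(self) -> str:
        """Flatten the names and contextual attributes into one lowercase string."""
        parts = [self.table, self.column, *map(str, self.contextual.values())]
        return " ".join(parts).lower()


def search(catalog, term):
    """Free-form search: entries whose searchable text contains the term."""
    needle = term.lower()
    return [entry for entry in catalog if needle in entry.searchable_text()]


def profile_frequencies(values):
    """Basic profiling: value frequencies, most common first."""
    return Counter(values).most_common()


# Made-up catalog entries for illustration only.
CATALOG = [
    ColumnMetadata(
        table="returns",
        column="fs_cd",
        system={"data_type": "CHAR(1)"},
        contextual={"definition": "Filing status code reported on the return"},
        application={"lookup_table": "filing_status_lookup"},
    ),
    ColumnMetadata(
        table="returns",
        column="refund_amt",
        system={"data_type": "DECIMAL(15,2)"},
        contextual={"definition": "Refund amount issued"},
    ),
]
```

For example, `search(CATALOG, "filing status")` finds the `fs_cd` entry even though the column name itself is cryptic, which is the kind of basic discoverability the strategy favors.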
15 Workforce Skills
Techniques used by IRS analysts:
  Regression-based methods (GLM, logistic, quantile, non-linear, proportional hazards)
  Social network analysis, graph theory
  Machine learning (neural networks, SVMs, genetic algorithms)
  Multivariate statistical methods (discriminant analysis, clustering, density estimation, factor analysis)
  Simulation (Monte Carlo, MCMC, agent-based modeling)
  Decision trees (CART, CHAID, C5, hybrids)
  Bayes rules and other classifiers
  Variance estimation with complex samples
16 Workforce Skills
Analysts:
  Use of advanced SQL techniques to avoid off-loading data for analytics (in-database computing)
  Understanding and leveraging Open Source tools
IT Staff:
  Literacy in non-traditional computing architectures
  Support for Open Source tools and analytic databases
  Ability to quickly build and deploy analytic sandboxes
This is different from typical BI/report/dashboard environments
  Emphasis on algorithms, not just information distribution
Key is multi-disciplinary skills
  Nexus of statistics, computer science, economics, IT
17 Data Privacy and Security
IRS analytics are done behind the firewall but data still moves
  Data off-loaded to laptops, servers, sandboxes
  External access (Treasury, Congress, universities)
Permissions management in shared disk environment
  Gets more complex with more users and data
Security trade-offs and challenges
  Impact of system- and application-level policy changes
  How much continuous monitoring and auditing?
  FISMA and the documentation dilemma
  Relationship between encryption and performance
