White Paper: Big Data Analytics Pilot, Internet of Things

Optimizing with the Internet of Things

Intel manufacturing advances operational efficiencies and boosts the bottom line with an IoT and big data analytics pilot

Introduction

Intel proves IoT success by using big data analytics to bring cost savings, predictive maintenance, and higher product yields to its own manufacturing processes. The volume, variety, and velocity of data1 being produced in manufacturing are growing exponentially, creating opportunities for those diligently analyzing this data to gain a competitive edge, respond to changing market dynamics, and increase manufacturing margins, productivity, and efficiency. Equipment across the factory floor is generating thousands of different data types, such as unit production data at multiple levels, equipment operation data, process data, and human operator data, which can be stored and made available for analysis. Although large manufacturers have been using statistical process control and statistical data analysis to optimize production for years, the extreme volume and variety of today's data provide opportunities to deploy new approaches, infrastructure, and tools. Industries are ready to embrace the use of big data, supported by higher compute performance, open standards, the availability of industry know-how, and a growing pool of highly skilled data statisticians coming out of academia. With access to new manufacturing intelligence, manufacturers will improve quality, increase manufacturing throughput, gain better insight into the root causes of manufacturing issues, and reduce machine failure and downtime.
With these new business values and technology capabilities, manufacturers will be able to change business models and practices in order to optimize designs for manufacturability, thereby improving supply chain management and introducing customized manufacturing services that shorten time to market for products tailored to smart consumers across various geographies. This paper outlines an Internet of Things (IoT) pilot in one of Intel's manufacturing facilities to show how data analytics applied to factory equipment and sensors can bring operational efficiencies and cost savings to manufacturing processes. With industry collaboration from Cloudera, Dell, Mitsubishi Electric, and Revolution Analytics, this IoT big data analytics project is forecasted to save millions of dollars annually, along with additional return-on-investment business value.
Challenge: How to extract value from all of the data in manufacturing?

Big data is characterized by huge data sets with varied data types, which can be classified as structured, semi-structured, or unstructured, as shown in Table 1. Structured data fits nicely into neatly formatted tables, making it relatively easy to manage and process; it has the advantage of being easily entered, stored, queried, and analyzed. Examples include manufacturing data stored in relational databases and data from manufacturing execution systems and enterprise systems. On the other hand, unstructured data such as images, text, machine log files, human-operator-generated shift reports, and manufacturing social collaboration platform texts may be in a raw format that requires decoding before data values can be extracted. Semi-structured data is a form of structured data that does not conform to the formal structure of data models associated with relational databases or other forms of data tables, but nonetheless contains tags or other markers to separate semantic elements and enforce hierarchies of records and fields within the data.

Table 1. Data examples

Manufacturing industries examples: Semiconductor; Electronics; Solar; Machinery; Energy; Automobiles; Aerospace; Chemical and Pharma; Metalworking; Food and Beverage; Pulp and Paper; Clothing and Textiles; Furniture

Real-time, semi-structured: Machine builder standards like SECS/GEM, EDA, or custom based on COM, XML; Sensors (vibration, pressure, valve, and acoustics); Relays; RFID; Direct from PLCs; Motors and drives; Direct from motion controllers, robotic arms; Historians (time-series data structures)

Unstructured: Operator shift reports; Machine logs; Error logs; Texts; Vision images; Audio/Video; Collaboration social platforms

Structured: RDBMS databases; NoSQL*; Enterprise data warehouse; Files stored in manufacturing PCs; Spreadsheets
In manufacturing, the power of big data technology stems from the ability to merge and correlate these data set types to create business value through newfound insights. The other value proposition of big data technology is that it allows manufacturers to aggregate and centralize various types of data in a cost-effective, scalable manner. Process variability stemming from factors like material, process recipes and methods, and equipment differences drives a real business need for manufacturers to turn to a big data solution based on a scalable platform that can grow with their businesses and manufacturing requirements. Machine data is strongly correlated to yield, quality, and output, thereby providing valuable information to proactively detect processes that are getting out of control. However, some types of manufacturing generate massive data files (gigabytes in a few days per tool type, as shown in Table 2), limiting the ability to store, analyze, and extract useful information from them using conventional methods. Without the use of big data technologies, it is extremely hard to even visualize the information in large data sets from various sources.

Table 2. Data size examples

Machine parameters and error logs: ~5 GB per machine per week. Used to monitor machine performance: dispense height, placement (x, y, z), belt speed, flow rate, oven temperature, laser power, etc.

Machine events: ~10 GB per machine per week. Used to measure process time: start dispense, end dispense, start setup, and end setup.

Defect images from vision equipment: ~50 MB per unit or 750 GB per lot. Used to identify root cause of failure modes, defect commonality, defect mapping.
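The merge-and-correlate idea above can be sketched in a few lines. This is a minimal, hypothetical illustration, not Intel's actual schema: the unit identifiers, the oven-temperature parametric, and the spec window are all invented for the example.

```python
# Illustrative sketch: joining per-unit parametric readings from the equipment
# with pass/fail outcomes from the execution system, then flagging units whose
# failure coincides with an out-of-spec process parameter. All names and
# numbers here are hypothetical.

params = {  # unit_id -> oven temperature logged by the machine (deg C)
    1: 240.1, 2: 239.8, 3: 251.7, 4: 240.3,
}
outcomes = {  # unit_id -> did the unit pass test?
    1: True, 2: True, 3: False, 4: True,
}

SPEC_LO, SPEC_HI = 235.0, 245.0  # hypothetical process window

def suspect_units(params, outcomes):
    """Units that both failed test and saw an out-of-spec temperature."""
    return [u for u, t in sorted(params.items())
            if not (SPEC_LO <= t <= SPEC_HI) and not outcomes[u]]

print(suspect_units(params, outcomes))  # -> [3]
```

At production scale the same join runs over the merged structured and semi-structured stores rather than in-memory dictionaries, but the correlation logic is the same.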
Addressing the challenge: IoT pilot using big data analytics server and IoT gateway in Intel manufacturing

Figure 1. Building blocks for end-to-end infrastructure enabling manufacturing intelligence from the factory floor to the datacenter (acquisition from tools and testers; ingestion through gateways on the manufacturing subnets and across the firewall; data storage, retention, and management on private cloud infrastructure; data distillation (ETL); and business intelligence, visualization, monitoring, and statistical analytics reached over corporate intranets and VPN)

Figure 1 shows a high-level IoT manufacturing architecture for small to large data sets. It covers data acquisition and aggregation of various types of data from the manufacturing shop floor and the manufacturing network, which opens up the possibility of visualizing, monitoring, and data mining for new business intelligence. As an example, the architecture can clean, extract, transform, and consolidate structured data from existing databases, together with unstructured data from tool sensors and log files from legacy equipment, into a data store platform (i.e., Hadoop*). The data can then be visualized and analyzed by high-level factory applications running in different virtual machines on the same on-premise server as the data store server. Alternatively, the data can be accessed with other analytical or monitoring applications on the network. Other enhanced capabilities could include running analytics in Hadoop or other types of file systems, or running analytics in memory for faster performance. The results of the analysis can be presented to users via intuitive visualization capabilities in the business intelligence layer of the network.
Big data analytics IoT server setup in Intel manufacturing pilot

The big data analytics server used by Intel manufacturing in its pilot deployment is shown in Figure 2. A compact system from Dell, the PowerEdge* VRTX,2 was selected to host the big data and analytics software in a private cloud setting as an on-premise server. The system consists of a Dell PowerEdge VRTX chassis with twenty-five 900GB hard disks and two Dell PowerEdge M820 blade servers, each equipped with four Intel Xeon processors from the E product family. The Intel Xeon processor E product family provides a dense, cost-optimized, four-socket processor solution with up to eight cores, up to 20MB of last-level (L3) cache, and up to 1.5TB maximum memory capacity, along with communication pathways to move data more quickly. The two M820 servers host the analytics and application software and the Hadoop nodes, which run in multiple virtual machines. The Red Hat Enterprise Linux* for Virtual Datacenters operating system provides a complete server virtualization software solution designed for a scalable and fully virtualized datacenter.

Figure 2. Big data analytics server (business intelligence and data analytics, ETL, data store and management, and OS virtualization layers: Revolution Analytics DeployR* Web Services with R-based analytics templates 1, 2, and 3; Cloudera batch processing, interactive SQL, machine learning, search, metadata, resource management, and storage integration; existing factory applications consolidated into virtual machines, including AquaFold*, MonetDB*, and PostgreSQL*; all on Red Hat Enterprise Linux* for Virtual Datacenters and Dell PowerEdge* VRTX)

Analytics and Application node

Figure 3 shows how the software is allocated to different virtual machines (VMs). The node hosts five VMs that run the various analytics and application workloads.
This includes:

Revolution R Enterprise* from Revolution Analytics: analytics software built upon the powerful open source R statistics language. The software provides a seamless, secure data bridge between analytics solutions and enterprise software, thereby solving a key integration problem faced by businesses adopting R-based analytics alongside existing IT infrastructure.3 By linking with high-performance Intel MKL multithreaded math libraries, Revolution R Enterprise uses the power of multiple processors to accelerate common matrix calculations. In addition, its parallel, distributed algorithms enable statistical analysis of big data to scale linearly with multicore servers, high-performance computing clusters, big data appliances, and Hadoop*.

MonetDB*: an open source column-oriented database management system designed to provide high performance on complex queries against large databases, such as combining tables with hundreds of columns and multimillion rows. MonetDB has been applied in high-performance applications for data mining, online analytical processing (OLAP), geographic information systems, and streaming data processing.4

PostgreSQL*: a powerful, open source object-relational database system used for online transaction processing (OLTP).5
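As a rough illustration of the OLAP-style workload a column store like MonetDB serves here, the query below rolls parametric readings up per machine. The table and column names are hypothetical, and Python's built-in sqlite3 stands in for the column store only so the sketch runs anywhere; the shape of the SQL is what matters, not the engine.

```python
# Hypothetical OLAP-style rollup over machine parametrics. sqlite3 is used as
# a stand-in engine; on the pilot server a query of this shape would target
# MonetDB. Table/column names are illustrative.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE readings (machine TEXT, param TEXT, value REAL)")
con.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [("M1", "laser_power", 10.0), ("M1", "laser_power", 12.0),
     ("M2", "laser_power", 11.0), ("M2", "belt_speed", 3.0)],
)

# Aggregate one parametric per machine: the analytical (read-mostly) access
# pattern column stores are built for, as opposed to PostgreSQL's OLTP role.
rows = con.execute(
    "SELECT machine, AVG(value) FROM readings "
    "WHERE param = 'laser_power' GROUP BY machine ORDER BY machine"
).fetchall()
print(rows)  # -> [('M1', 11.0), ('M2', 11.0)]
```

The same schema served by PostgreSQL would instead see many small single-row inserts and lookups, which is why the pilot keeps both engines side by side.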
Figure 3. Software allocations to virtual machines on the big data analytics server (Analytics and Application node on a Dell M820 with four 8-core processors and 256GB RAM, running Revolution Analytics DeployR* Web Services, analytics templates 1, 2, and 3, a column-store database for fast analytics workloads, and an RDBMS/row-store database for OLTP workloads; Hadoop* node on a Dell M820 with four 8-core processors, 512GB RAM, and 900GB 10K RPM 2.5-inch SAS 6Gbps drives, 10 x 900GB (~6TB) plus 15 x 900GB (~10TB usable), running the Cloudera Enterprise* hub: batch processing, analytic SQL, search engine, machine learning, stream processing, online NoSQL, third-party apps, workload and system management, and unified, elastic, resilient, secure storage for any type of data)

AquaFold*: an application server used to build and deploy production-quality database and reporting applications quickly. It includes capabilities like multi-source data synchronization, cross-database data migration, data warehousing extract-transform-load, scheduled data exports/imports, and custom business intelligence dashboards. It supports communication with various production databases on the network, like PostgreSQL, Microsoft SQL*, Oracle*, and IBM DB2*, over ODBC* and JDBC* connections.6

Hadoop* node

Four virtual machines are provisioned to run a basic, four-node Cloudera Hadoop cluster, consisting of one head node and three worker nodes. Apache Hadoop is an open source distributed software platform for scalable distributed computing. Written in Java, it runs on a cluster of industry-standard servers configured with direct-attached storage, and it cost-effectively scales performance by adding economical nodes to the cluster. Cloudera Enterprise Data Hub* (CDH) offers a unified platform for big data by providing one place to store, process, and analyze all data, extending the value of existing investments and enabling fundamental new ways to derive value from data.
CDH is 100 percent Apache-licensed open source and is unique in offering unified batch processing, interactive SQL, interactive search, and role-based access controls.7
A closer look at the Cloudera stack

Cloudera Enterprise Data Hub* (CDH) delivers the core elements of Hadoop*, scalable storage and distributed computing, along with additional components such as a user interface, necessary enterprise capabilities such as security, and integration with a broad range of hardware and software solutions. CDH can help businesses transform and accelerate the way big data is used. When combined with datacenter architecture based on Intel Xeon processors, Cloudera and Intel provide a comprehensive approach for industrial and manufacturing businesses seeking to analyze data to make factories run more efficiently.

Internet of Things (IoT) Gateway

An Intel Atom processor-based gateway by Mitsubishi Electric, called the Mitsubishi Electric C Language Controller of MELSEC-Q series*,8 is used to aggregate and securely ingest data into the big data analytics server. Data ingestion is the process of validating, filtering, and reformatting the data to make it easier for the big data analytics software to work on. The Mitsubishi Electric C language controllers of MELSEC-Q series are embedded solutions equipped with numerous features characteristic of intelligent systems, including robust network connectivity and the high computational performance needed to process large amounts of data collected from sensors or via the network while supporting sophisticated system control and operations. At the core of this controller is a hardware platform based on Intel architecture and the Wind River VxWorks* real-time operating system. Mitsubishi Electric developed the C language controllers of MELSEC-Q series to satisfy the diverse requirements of factory automation, including excellent reliability, tolerance of harsh environments, and long-term availability. These features make it a robust and reliable product that requires little maintenance in IoT manufacturing applications.
In place of the ladder logic used in conventional programmable logic controllers, the C language controllers of MELSEC-Q series use the international standard C languages (C and C++) for greater programming flexibility. This allows users to take full advantage of their existing C language software and development investments. CIMSNIPER* is a data acquisition and processing software package for the Mitsubishi Electric C Language Controller of MELSEC-Q series. It can collect process data (including SECS messages) and manufacturing equipment errors without modifying existing systems.

Big data analytics cases at Intel manufacturing

Over the past two years, Intel has developed more than a dozen big data projects that have bolstered both operational efficiency and the bottom line. Here are a couple of examples:

Reduced product test times: Every Intel chip produced undergoes a thorough quality check involving a complex series of tests. Intel found that by using historical information gathered during manufacturing, the number of tests required could be reduced, resulting in decreased test time. Implemented as a proof of concept, this solution avoided test costs of $3 million in 2012 for one series of Intel Core processors. By extending this solution to more products, Intel expects to realize an additional $30 million in cost avoidance.9

Improved manufacturing monitoring: Data-intensive processes also help Intel detect failures in its manufacturing line, a highly automated environment. Intel pulls log files out of manufacturing tools and testers across the entire factory network, amounting to as much as 5 terabytes an hour. By capturing and analyzing this information, Intel can determine when a specific step in one of its manufacturing processes starts to deviate from normal tolerances.
Specific to the current end-to-end platform pilot deployment discussed here, Intel, in collaboration with Mitsubishi Electric, Cloudera, Revolution Analytics, and Dell, successfully pioneered capabilities that have made tremendous headway in the use of data mining sciences to solve practical manufacturing issues, saving Intel millions of dollars through cost avoidance and improved decision making. The major goal of this project was to extract the value propositions of data and data analytics: to gain better insight for predictive manufacturing and to reduce manufacturing costs without sacrificing throughput or quality. The following details some of the groundbreaking work and discoveries Intel accomplished through the integration of big data analytics and technologies for the IoT in manufacturing.
Pilot Results

Use Case 1: Reducing non-genuine production yield loss through the monitoring and analysis of machine parametric values and timely replacement of parts before they fail

Background: Automated Test Equipment (ATE) is a machine designed to perform tests on different devices, referred to as devices under test (DUTs). An ATE uses control systems and automated information technology to rapidly perform tests that measure and evaluate a DUT.11 The ATE systems interface with an automated placement tool, called a handler, that physically places the DUT on a Test Interface Unit (TIU) so that it can be measured by the equipment.

Problem Statement: Defective TIUs can cause DUTs to be wrongly categorized, including classifying good units as bad, which negatively impacts Intel manufacturing operation costs: when a faulty TIU wrongly categorizes good units as bad, those units are scrapped. To avoid such problems, some components were replaced with spare parts during regular preventive maintenance even when they were still operating properly. Intel manufacturing's objective was to detect TIU defects before they happen, so TIUs could be repaired or replaced before units are wrongly categorized.

Results and Benefits: The analytics capability predicted up to 90 percent of potential TIU failures before they were flagged by the existing factory's online process control system. In this case, that made it possible to replace defective TIUs before they caused over-rejection of good units, reducing yield losses by up to 25 percent.12 In addition, Intel was able to reduce the need to replace spare parts before they fail during preventive maintenance, resulting in an estimated 20 percent reduction in spares cost.
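The kind of parametric monitoring behind Use Case 1 can be sketched with a simple one-sided control rule: flag a reading that drifts well above the recent baseline before the fixed process-control limit trips. This is a hedged illustration only; the signal name (TIU contact resistance), the window, and the threshold multiplier are invented, not Intel's actual model.

```python
# Illustrative drift detection on a hypothetical TIU parametric (contact
# resistance in ohms). A reading is flagged when it exceeds the mean of the
# preceding `window` readings by more than k sample standard deviations.
from statistics import mean, stdev

def drifting(readings, window=5, k=3.0):
    """Return indices where a reading breaks the rolling mean + k*stdev rule."""
    alerts = []
    for i in range(window, len(readings)):
        base = readings[i - window:i]        # recent baseline
        m, s = mean(base), stdev(base)
        if s > 0 and readings[i] > m + k * s:
            alerts.append(i)
    return alerts

# Stable contact resistance, then a creeping defect on the final probe
tiu_ohms = [1.00, 1.02, 0.99, 1.01, 1.00, 1.01, 1.45]
print(drifting(tiu_ohms))  # -> [6]
```

A production model would combine many parametrics and learned thresholds, but the payoff is the same: the alert arrives while the TIU can still be swapped, before good units start being rejected.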
Use Case 2: Reducing yield losses by eliminating and minimizing incorrect ball assembly in ball attach equipment

Background: The ball attach module is where solder paste is printed onto the lands on the substrates. Solder balls are placed into the ball attach lands, and the paste holds them in place. The entire package goes through a reflow oven, which melts the paste and balls onto the substrate lands. Solder balls are vacuumed onto tiny holes of the placement head, and the head is inspected for excess or missing balls. Once the head is aligned to the substrates, the balls are placed in the solder paste on the substrates. After releasing the balls, the placement head is inspected for any remaining balls. Finally, the substrates are inspected by a camera vision system for any missing or shifted balls.

Problem Statement: Units with missing balls are faulty material and a yield loss. There are numerous scenarios in which balls go missing from units, including inadequate vacuum pressure.

Results and Benefits: By visualizing and correlating sensor readings with various machine data and execution system data, Intel was able to reduce yield losses, optimize maintenance cost, and avoid sudden equipment downtime.12 This enabled technicians to proactively fix problems, a step on the journey toward a predictive maintenance capability.

Use Case 3: Using image analytics to identify good or defective units

Background: The machine vision equipment module screens units and categorizes them as good or marginal. The good units are sent forward to be processed, while the marginal ones are inspected by a manufacturing specialist and determined to be either good or defective. This manual process takes time.

Problem Statement: The manual process to inspect and categorize marginal units is cumbersome and sometimes takes about 8 hours to successfully segregate the true reject units from the marginal ones.
This is caused by the time it takes for the units to reach the operator, flow to a segregation module, and ultimately be segregated. Image analytics enables identification of reject units moments after they are inspected by the inspection module.

Results and Benefits: The marginal images recorded at the machine vision equipment module are preprocessed. Each image, which is unstructured data, is resized, cropped, and converted into grayscale before each pixel is converted to binary. The next stage of the process involves feature selection, where the unstructured image is reduced to a set of distinct values. These values are then fed into various machine learning algorithms to distinguish true rejects from marginal rejects. Image analytics shortens the time needed to pull true rejects from a pool of marginal units, identifying defects roughly 10 times faster than the manual method.12
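The binarize, feature-select, classify pipeline described above can be sketched end to end on a toy scale. Everything here is illustrative: the 4x4 "grayscale image", the row-density feature, and the nearest-centroid rule stand in for the real resize/crop preprocessing and the machine learning algorithms the pilot actually used.

```python
# Toy sketch of the Use Case 3 image pipeline: threshold a grayscale pixel
# grid to binary, reduce it to a small feature vector, and score it against
# per-class centroids. All data and templates are hypothetical.

def binarize(gray, threshold=128):
    """Convert a grayscale pixel grid (0-255) to 0/1 pixels."""
    return [[1 if px >= threshold else 0 for px in row] for row in gray]

def features(binary):
    """Feature selection: fraction of lit pixels per row."""
    return [sum(row) / len(row) for row in binary]

def classify(vec, centroids):
    """Nearest-centroid decision between the known classes."""
    def dist(label):
        return sum((x - y) ** 2 for x, y in zip(vec, centroids[label]))
    return min(centroids, key=dist)

centroids = {                         # hypothetical class templates
    "good":   [1.0, 1.0, 1.0, 1.0],  # fully populated ball grid
    "reject": [1.0, 0.5, 1.0, 1.0],  # a row with missing balls
}

marginal = [[200, 210, 190, 205],    # 4x4 grayscale unit image;
            [200,  40, 210,  35],    # two dark spots: missing balls
            [195, 205, 200, 210],
            [190, 200, 210, 205]]

vec = features(binarize(marginal))
print(classify(vec, centroids))  # prints "reject"
```

The value of even this crude version is latency: the decision is available the moment the inspection image lands, instead of hours later at a manual segregation step.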
Figure 4. High-level data flow (sources on equipment: sensors and operational data; filtering and transport over FTP/MQTT/RESTful interfaces; preparation: extract-transform-load, database extraction tools, SQL, Python, Perl, Sqoop; storage/warehouse: Cloudera Hadoop* with NoSQL (Hadoop), RDBMS/OLTP, and column-store/OLAP databases; distillation: Python, R, Perl, Excel, and various libraries (MapReduce and HDFS); analytics: Revolution DeployR Web Services with R-based analytics templates 1, 2, and 3; visualization: business intelligence, visualization tools and apps; value: predictive maintenance, reduced MTTR, improved cycle time, reduced cost)

Data Flow

Figure 4 shows how data flows through the previously discussed use cases. Machine data and sensor data are sent by the Intel Atom processor-based gateway to the Big Data Analytics Server (BDAS) as they are acquired in real time. For instance, machine data from the ball attach module, acquired over machine interface ports and from analog sensors, is streamed at a velocity of a few MBs every few seconds. All incoming factory data is stored in Hadoop*. The following non-exhaustive list shows some of the ingestion possibilities readily available in the Hadoop ecosystem:

- HTTP: The Big Data Analytics Server exposes an authenticated REST HTTP endpoint that supports operations into HDFS.
- Apache Sqoop* provides the connectivity tool for moving data from non-Hadoop data stores, such as relational databases and data warehouses, into Hadoop.
- Flume* can receive a stream of continuous log data.

The data specific to the use cases described in the section above is in the form of Comma-Separated Value (CSV) files or raw images.
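The REST/HTTP path listed above maps onto the standard WebHDFS API, in which a file CREATE is a two-step protocol: the client PUTs to the NameNode, receives a redirect to a DataNode, and sends the file body to that second URL. The sketch below only builds the step-1 URL; the host name, port, HDFS path, and user are hypothetical, and no network call is made.

```python
# Hedged sketch of WebHDFS ingestion from the gateway. Only the step-1
# (NameNode) CREATE URL is constructed here; a real client would PUT to it
# with no body, follow the redirect, and PUT the CSV payload to the DataNode.
from urllib.parse import urlencode

def webhdfs_create_url(host, port, hdfs_path, user):
    """Build the step-1 WebHDFS CREATE URL sent to the NameNode."""
    query = urlencode({"op": "CREATE", "user.name": user, "overwrite": "true"})
    return f"http://{host}:{port}/webhdfs/v1{hdfs_path}?{query}"

url = webhdfs_create_url("bdas.example", 50070,
                         "/factory/ballattach/2014-09-01.csv", "gateway")
print(url)
```

Because the protocol is plain HTTP, even a constrained gateway with only an HTTP client library can push files into HDFS without Hadoop client software installed.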
While the Hadoop ecosystem includes plenty of ways to ingest data (as described in the list above), some of the machines in the factory have limited network transfer capabilities, which required custom engineering to deliver data into HDFS:
- FTP: The IoT gateway has an FTP client that periodically connects to the Big Data Analytics Server and transfers the latest acquired data directly into HDFS. Other streaming protocols, like MQTT and REST, can be used depending on the real-time streaming and analytic requirements.
- CIFS share (Windows share): The Big Data Analytics Server provides a Windows/CIFS share directory that the gateway can copy files into.

The CSV files are directly imported into HBase* using Pig, while the raw images first go through preprocessing via a map-reduce job that applies computer vision techniques to produce a text data representation of each image. Depending on the operational requirements, the data is stored in one of three databases: NoSQL (Hadoop), RDBMS/SQL, or column-store/OLAP. The databases are accessed and processed using various tools, including AquaFold, for ad hoc reporting, workflow scheduling, ETL, and database integration. At the same time, the Cloudera Distribution for Hadoop performs various operations on the data. The processed data is further analyzed using Revolution Analytics tools designed for factory applications and presented to users in easy-to-understand dashboards.

Summary of key learnings and conclusions

Intel integrated and validated a big data analytics on-premise server solution with data extracted from Intel's own manufacturing network and from equipment and sensors through an Internet of Things gateway, in order to validate the business value of the Internet of Things in manufacturing. The Mitsubishi Electric C language controllers of MELSEC-Q series were used as the gateway. The pilot involved close collaboration between the factory engineers, the IT department, and industry experts from Cloudera, Dell, Mitsubishi Electric, and Revolution Analytics. The team started by leveraging existing machine performance and monitoring data, then applied big data analytics and modeling to the ingested data to predict potential excursions and failures.
Being able to predict machine component failures allows engineers to repair equipment in advance and prevent excursions, reaping tremendous savings in wasted production units, time to repair, and machine components. An integrated suite of software building blocks on the Big Data Analytics Server and IoT gateway was used. This framework can be applied and implemented by manufacturers that have not yet started leveraging the intelligence contained in manufacturing data. Manufacturers that already use data to improve their efficiency can incrementally add to their existing capabilities to evolve their data mining and analytics capability to the next level. Big data analytics and the Internet of Things in manufacturing, as an end-to-end platform, form the critical backbone enabling the vision of smart manufacturing. This platform is scalable and available in various configurations using currently available industry-standard building blocks. For Intel, this pilot is forecasted to save millions of dollars (USD) annually, with additional return-on-investment business value that Intel is still realizing. Benefits include improving equipment component uptime, minimizing misclassification of good units as bad (hence increasing yield and productivity), enabling predictive maintenance, and reducing component failures. There are many other types of parametric, metrology, product, and equipment data, both structured and unstructured, within the Intel manufacturing environment and machines that could be mined and analyzed to extract new business value. Capitalizing on this opportunity will enable further efficiency and productivity enhancements in the factory and ultimately create a competitive advantage.
1. Gartner analyst Doug Laney introduced the 3Vs concept in a 2001 MetaGroup research publication, "3D data management: Controlling data volume, variety and velocity."
4. en.wikipedia.org/wiki/monetdb
9. Source: Intel IT Annual Report.
10. Source: "Intel Cuts Costs With Big Data," InformationWeek, March 18, 2013. The TCO or other cost reduction scenarios described in this document are intended to enable you to get a better understanding of how the purchase of a given Intel product, combined with a number of situation-specific variables, might affect your future costs and savings. Nothing in this document should be interpreted as either a promise of or contract for a given level of costs.
11. Source:
12. Results might vary depending on package size, process, and equipment used in the manufacturing process.

INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. UNLESS OTHERWISE AGREED IN WRITING BY INTEL, THE INTEL PRODUCTS ARE NOT DESIGNED NOR INTENDED FOR ANY APPLICATION IN WHICH THE FAILURE OF THE INTEL PRODUCT COULD CREATE A SITUATION WHERE PERSONAL INJURY OR DEATH MAY OCCUR. Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked reserved or undefined. Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them.
The information here is subject to change without notice. Do not finalize a design with this information. The products described in this document may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request. Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order. Copies of documents which have an order number and are referenced in this document, or other Intel literature, may be obtained by calling , or by visiting Intel's Web site at Copyright 2014 Intel Corporation. All rights reserved. Intel, the Intel logo, Intel Atom and Xeon are trademarks of Intel Corporation in the U.S. and/or other countries. *Other names and brands may be claimed as the property of others. 0914/KW/CMD/PDF Please Recycle US