Axibase Time Series Database




Axibase Time-Series Database (ATSD) is a clustered non-relational database for storing data produced by IT infrastructure. ATSD is specifically designed to store and analyze large amounts of statistical data collected at high frequency.
Prepared by Axibase

Database History
1970: IBM introduced relational algebra for data processing, followed by a Cambrian explosion of relational database management systems.
2000: The first large-scale applications emerge, such as Google Search.
2004: Google Bigtable, the first non-relational database using a distributed file system.
Today we are experiencing a Cambrian explosion of non-relational (a.k.a. NoSQL) databases.

Key Differences Between SQL and NoSQL
Features compared across SQL and NoSQL databases: high-level programming language (SQL), transactions, query optimizer, and non-key indexes.

Key Differences Between SQL and NoSQL
Property | SQL | NoSQL
Scalability | TB | PB
Maximum cluster size | 48 (Oracle RAC) | 1000+
Distributed | |
Read time | Depends on table size and indexes | Linear
Write time | Depends on table size and indexes | Linear
Table schema (column names, data types) | Predetermined | Raw bytes; schema determined by application

How Proven Is NoSQL Technology?
NoSQL is the leading technology behind big data applications:
Google: search, Gmail, App Engine
Yahoo/Microsoft: search
Amazon: e-commerce, search, cloud computing (AWS DynamoDB)
IBM BigInsights, Microsoft Azure HDInsight

Big Data Adoption
HBase behind Facebook Messages:
6+ billion messages per day
75+ billion R/W operations per day
Peak throughput: 1.5 million R/W operations per second
2+ petabytes of data (6+ PB including replicas), growing by more than 8 TB per day

Big Data Adoption
IBM BigInsights behind Vestas: the Danish wind energy company is reducing the time needed to analyze petabytes of data from several weeks to 15 minutes in order to place wind turbines more accurately.
Stores 2.8 PB of company historical data together with over 178 external parameters: temperature, barometric pressure, humidity, precipitation, wind direction, wind velocity, etc.
Stores precise weather data for the past 11 years.
Collects data from over 35,000 meteorological stations.

Big Data Adoption
HBase behind Explorys: Explorys uses HBase to enable search and analysis of patient populations, treatment protocols, and clinical outcomes.
Stores over 275 billion clinical, financial, and operational data elements.
Holds 48 million unique patient files.
Collects data from over 340 hospitals and 300,000 healthcare providers.
Pulls data from 22 integrated major healthcare systems.

Axibase Time Series Database
Scalability & Speed: Collects billions of samples per day. Retains detailed data forever.
Features: Combines database, rule engine, and visualization in one product.
Analytical Rule Engine: Applies aggregate functions and filters to streaming data.
Integration: Accepts data from any source based on industry-standard protocols.
Visualization: Built-in portals with smart widgets.


Big Data for IT Monitoring
Retain detailed data forever.
Collect statistics at high frequency, for example every 15 seconds.
Consolidate performance statistics from all systems into one database: facilities, network, storage, servers, applications, databases, transactions, service providers, user activity, etc.
Monitor infrastructure based on abnormal deviations instead of manual thresholds.
Apply statistical formulas to predict outages.
Take advantage of a schema-less database to collect data from any source.
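One way to read "abnormal deviations instead of manual thresholds" is a z-score check against recent history. The following is a minimal sketch under that assumption; the function name `is_abnormal`, the 3-sigma cutoff, and the sample values are illustrative and are not taken from ATSD.

```python
from statistics import mean, stdev

def is_abnormal(history, latest, max_sigma=3.0):
    """Return True if `latest` is more than `max_sigma` standard deviations
    away from the mean of `history` (hypothetical helper, not an ATSD API)."""
    if len(history) < 2:
        return False  # not enough data to estimate the spread
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return latest != mu  # flat history: any change is a deviation
    return abs(latest - mu) / sigma > max_sigma

# Illustrative CPU utilization samples (percent), not real measurements.
cpu_history = [41.0, 43.5, 40.2, 42.8, 44.1, 39.9, 42.0, 43.3]
print(is_abnormal(cpu_history, 44.0))  # False: within normal variation
print(is_abnormal(cpu_history, 71.0))  # True: far outside the usual range
```

The baseline here adapts to each series, so the same rule can cover hosts that normally run at very different utilization levels.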

Big Data for Developers
Support for annotation-style instrumentation as an alternative to byte-code instrumentation and file logging.
Collect detailed performance and usage statistics for reporting and analytics, without writing custom monitors.
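The slide does not show what annotation-style instrumentation looks like in practice. As a rough analogy only, a Python decorator can mark a function for monitoring without touching its body. The names `monitored` and `STATS` are hypothetical, and the in-memory registry stands in for whatever would actually forward the statistics to a database.

```python
import time
from collections import defaultdict
from functools import wraps

# In-memory call statistics; a real setup would stream these to a collector.
STATS = defaultdict(lambda: {"count": 0, "total_seconds": 0.0})

def monitored(metric_name):
    """Decorator that records call count and cumulative duration."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            started = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Recorded even when the wrapped function raises.
                entry = STATS[metric_name]
                entry["count"] += 1
                entry["total_seconds"] += time.perf_counter() - started
        return wrapper
    return decorator

@monitored("orders.lookup")
def lookup_order(order_id):
    return {"id": order_id, "status": "shipped"}

lookup_order(1)
lookup_order(2)
print(STATS["orders.lookup"]["count"])  # 2
```

The business logic stays free of logging calls; adding or removing the one-line marker is the only change needed to start or stop collecting statistics.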

Big Data for Operations
Gather and analyze statistical data generated by various systems and sensors.
Provide analytics that can support decision control systems.
Enable better real-time operational decision support.
Generate accurate forecasts of upcoming issues, such as delays.
Schedule maintenance based on product usage and sensor data instead of warranty periods.
Improve customer service times and standards.

ATSD Architecture
The ATSD architecture combines database, analytics, and reporting tools into one complete product.
Data locality makes analytics run faster.
The application server layer is simplified to provide core shared services.

ATSD Components
A pluggable driver provides support for different storage engines.
The compute, persistence, and data collection layers are scaled independently.

Fault Tolerance
ATSD is a distributed system with high fault tolerance. Each data sample is automatically replicated 3 times for recovery.

ATSD Scalability
ATSD is a distributed, non-relational database with high throughput, fault tolerance, and read speed.
ATSD can collect billions of metrics per day and store petabytes of data.
ATSD supports millisecond resolution and sampling rates of up to several measurements per second. The data is stored without losing accuracy.
Additional nodes can be added at runtime to handle increasing volumes. ATSD automatically distributes the table across active nodes.
New nodes can be added in remote data centers to minimize network traffic.

Supported Data Types
Two types of data ingestion: push and pull.
ATSD supports numeric values, log messages, and properties (collections of key-value pairs).
ATSD uses collectors to retrieve structured and unstructured data from remote sources.
Support for standard protocols: Telnet, ICMP, CSV/TSV, FILE, JMX, HTTP, and JSON.

Data Collection
Collection is agentless; data is pushed by external systems into ATSD.
New metrics are auto-registered. There is no need to update a schema or restart any server components.
Existing monitoring tools can be instrumented to stream data into ATSD.
Each data sample can be tagged (key = value) at the source for subsequent querying, aggregations, and roll-ups.
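To illustrate how key = value tags attached at the source enable later roll-ups, here is a small sketch. The `Sample` structure, the field names, and the `rollup_avg` helper are assumptions made for this example and do not describe ATSD's internal data model.

```python
from collections import defaultdict
from dataclasses import dataclass, field

@dataclass
class Sample:
    """One tagged measurement (illustrative structure, not ATSD's schema)."""
    entity: str
    metric: str
    timestamp_ms: int
    value: float
    tags: dict = field(default_factory=dict)

def rollup_avg(samples, metric, tag_key):
    """Average the values of `metric`, grouped by the value of one tag."""
    groups = defaultdict(list)
    for s in samples:
        if s.metric == metric and tag_key in s.tags:
            groups[s.tags[tag_key]].append(s.value)
    return {tag_value: sum(vals) / len(vals) for tag_value, vals in groups.items()}

samples = [
    Sample("host1", "disk_used_pct", 1_400_000_000_000, 60.0, {"mount": "/", "site": "east"}),
    Sample("host2", "disk_used_pct", 1_400_000_000_000, 80.0, {"mount": "/", "site": "east"}),
    Sample("host3", "disk_used_pct", 1_400_000_000_000, 30.0, {"mount": "/", "site": "west"}),
]
print(rollup_avg(samples, "disk_used_pct", "site"))  # {'east': 70.0, 'west': 30.0}
```

Because tags travel with each sample, a new grouping dimension needs no schema change: the sender adds a tag and the roll-up can use it immediately.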

Data Storage
Built-in data compression provides 70%-80% disk space savings over raw data. No data needs to be deleted.
Seek time is almost linear regardless of the dataset size.
Data storage is sparse and efficient. ATSD stores only what is collected instead of long rows with NULLs or zeros, as is the case in the relational model.
VMware VMFS-attached disks are sufficient for small to medium clusters.
Direct-attached disks with JBOD are recommended for larger clusters. JBOD alternatives that minimize node recovery time are available from leading storage vendors, such as NetApp E-Series.
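A back-of-the-envelope calculation shows what the stated savings could mean for capacity planning. Every input below is an assumption chosen for illustration: 10,000 series, a 15-second interval, 16 raw bytes per sample (8-byte timestamp plus 8-byte value), and 75% savings as the midpoint of the 70%-80% range. Only the compression range and the three-way replication come from the slides.

```python
SECONDS_PER_DAY = 86_400

def daily_storage_bytes(series_count, interval_seconds, bytes_per_sample,
                        compression_savings, replicas=1):
    """Estimate bytes written per day after compression and replication."""
    samples_per_day = series_count * (SECONDS_PER_DAY // interval_seconds)
    raw_bytes = samples_per_day * bytes_per_sample
    return raw_bytes * (1 - compression_savings) * replicas

one_copy = daily_storage_bytes(10_000, 15, 16, 0.75)
three_copies = daily_storage_bytes(10_000, 15, 16, 0.75, replicas=3)
print(one_copy / 1e6)      # 230.4 (MB per day, single copy)
print(three_copies / 1e6)  # 691.2 (MB per day with 3 replicas)
```

Under these assumptions the raw volume is about 921.6 MB per day, so compression offsets most of the cost of keeping three replicas.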

Built-in Instruments
Unlike conventional data warehouses, ATSD comes with a set of built-in tools for data analysis: Analytical Rule Engine, Forecasting, and Visualization.

Analytical Rule Engine
Evaluates incoming data in memory based on statistical rules.
Statistical rules are applied to the incoming data stream before data is stored on disk.
As data is ingested by the ATSD server, a subset of samples that match rule queries is routed to the rule engine for processing.
The rule engine supports both time- and count-based data windows.
Rule expressions and filters can reference not just numeric values but also tags such as system type, location, and priority to ensure that alerts are raised only for critical issues.
Multiple metrics and entities can be correlated within the same rule.
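The difference between count-based and time-based windows can be sketched as follows. The classes `CountWindow` and `TimeWindow` are hypothetical and assume samples arrive in timestamp order; they illustrate the concept and are not the rule engine's implementation.

```python
from collections import deque

class CountWindow:
    """Keeps the last `size` values, regardless of when they arrived."""
    def __init__(self, size):
        self.values = deque(maxlen=size)  # oldest value drops automatically

    def add(self, timestamp, value):
        self.values.append(value)

    def avg(self):
        return sum(self.values) / len(self.values) if self.values else None

class TimeWindow:
    """Keeps values received within the last `span_seconds`."""
    def __init__(self, span_seconds):
        self.span = span_seconds
        self.items = deque()  # (timestamp, value) pairs, oldest first

    def add(self, timestamp, value):
        self.items.append((timestamp, value))
        # Evict samples at or before the start of the window.
        while self.items and self.items[0][0] <= timestamp - self.span:
            self.items.popleft()

    def avg(self):
        return sum(v for _, v in self.items) / len(self.items) if self.items else None

count_w, time_w = CountWindow(3), TimeWindow(120)
for ts, val in [(0, 70.0), (60, 72.0), (120, 81.0), (180, 90.0)]:
    count_w.add(ts, val)
    time_w.add(ts, val)

print(count_w.avg())  # 81.0 (last 3 samples: 72, 81, 90)
print(time_w.avg())   # 85.5 (samples from the last 120 seconds: 81, 90)
```

A rule such as `avg(value) > 75` would fire on both windows here, but the two can diverge when samples arrive irregularly: a count window always holds the same number of samples, while a time window may hold many or none.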

Analytical Rule Engine: Rule Examples
Type | Window | Example | Description
threshold | none | value > 75 | Raise an alert if the last metric value exceeds the threshold
range | none | value > 50 AND value <= 75 | Raise an alert if the value falls within the specified range
statistical-count | count(10) | avg(value) > 75 | Raise an alert if the average value of the last 10 samples exceeds the threshold
statistical-time | time('15 min') | avg(value) > 75 | Raise an alert if the average value for the last 15 minutes exceeds the threshold
statistical-deviation | time('15 min') | avg(value) / avg(value(time: '1 hour')) > 1.25 | Raise an alert if the 15-minute average exceeds the 1-hour average by more than 25%
statistical-ungrouped | time('15 min') | avg(value) > 75 | Raise an alert if the 15-minute average of values for all entities in the group exceeds the threshold
metric correlation | time('15 min') | avg(value) > 75 AND avg(value(metric: 'loadavg.1m')) > 0.5 | Raise an alert if the average values of two separate metrics for the last 15 minutes exceed predefined thresholds
entity correlation | time('15 min') | avg(value) > 75 AND avg(value(entity: 'host2')) > 75 | Raise an alert if the average values for two entities for the last 15 minutes exceed thresholds
threshold override | time('15 min') | avg(value) >= entity.grouptag('cpu_avg').min() | Raise an alert if the 15-minute average value exceeds the minimum threshold specified for the groups to which the entity belongs
cpu forecast deviation | time('5 min') | abs(forecast_deviation(wavg())) > 2 | Raise an alert if the 5-minute average deviates from the forecast by more than two standard deviations
cpu forecast diff | time('10 min') | abs(wavg() - forecast()) > 25 | Raise an alert if the average deviates from the forecast by more than the specified absolute value
disk threshold | time('15 min') | new_maximum() && threshold_linear_time(99) < 120 | Raise an alert if the last value is the highest observed and a linear extrapolation is expected to violate the 99% threshold in less than 120 minutes
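The disk threshold rule in the examples above relies on a linear extrapolation. The sketch below shows one plausible way to compute such an estimate with an ordinary least-squares line; it is an assumption about the general technique, not the actual implementation of `threshold_linear_time`, and the helper name and sample values are illustrative.

```python
def minutes_until_threshold(samples, threshold):
    """Fit a least-squares line to (minute, value) pairs and return the number
    of minutes after the last sample at which the line reaches `threshold`.
    Returns None if the trend is flat or falling."""
    n = len(samples)
    if n < 2:
        return None
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        return None  # all samples share one timestamp
    slope = sum((t - mean_t) * (v - mean_v) for t, v in samples) / var_t
    if slope <= 0:
        return None  # not growing, so the threshold is never reached
    intercept = mean_v - slope * mean_t
    crossing_minute = (threshold - intercept) / slope
    return max(0.0, crossing_minute - samples[-1][0])

# Disk usage (percent) sampled every 5 minutes; grows 0.2 points per minute.
disk = [(0, 90.0), (5, 91.0), (10, 92.0), (15, 93.0)]
eta = minutes_until_threshold(disk, 99.0)
is_new_max = disk[-1][1] == max(v for _, v in disk)
print(round(eta, 3), is_new_max and eta < 120)
```

With these values the fitted line reaches 99% roughly 30 minutes after the last sample, so a rule of the form "new maximum and less than 120 minutes to threshold" would raise an alert.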


Forecasting
Customers have a growing need to predict problems before they occur.
The accuracy of predictions and the percentage of false positives and negatives depend heavily on the frequency of data collection, the retention interval, and the algorithms used.
The built-in autoregressive time-series extrapolation algorithms (Holt-Winters, ARIMA, etc.) in ATSD allow system failures to be predicted at early stages.
The forecasting process is resource intensive and is most effective in a clustered system with data locality, such as ATSD.
Dynamic predictions eliminate the need to set manual thresholds.
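To give a feel for exponential-smoothing forecasts, here is a sketch of Holt's linear (double exponential) smoothing, a simplified relative of Holt-Winters without the seasonal component. The smoothing parameters and the sample series are arbitrary assumptions, and this is not a description of how ATSD implements its forecasting.

```python
def holt_forecast(series, horizon, alpha=0.5, beta=0.3):
    """Forecast `horizon` future points with Holt's linear smoothing.
    `alpha` smooths the level, `beta` smooths the trend."""
    if len(series) < 2:
        raise ValueError("need at least two observations")
    level = series[0]
    trend = series[1] - series[0]
    for value in series[1:]:
        previous_level = level
        # Blend the new observation with the previous one-step prediction.
        level = alpha * value + (1 - alpha) * (level + trend)
        # Blend the latest change in level with the previous trend estimate.
        trend = beta * (level - previous_level) + (1 - beta) * trend
    return [level + step * trend for step in range(1, horizon + 1)]

# A steadily growing series: the forecast should continue the same slope.
usage = [10.0, 12.0, 14.0, 16.0, 18.0]
print([round(x, 3) for x in holt_forecast(usage, 3)])
```

For this perfectly linear input the forecast continues the line at about 20, 22, and 24. Real data with daily or weekly cycles would need the seasonal term that Holt-Winters adds.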

[Figure: forecasting example]

Forecast Settings
ATSD selects the most accurate forecasting algorithm for each time series separately, based on a ranking system.
The winning algorithm is used to compute the forecast for the next day, week, or month.
Pre-computed forecasts can be used in the rule engine.
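The slide does not say how the ranking works. A common approach, shown here purely as an assumption, is to hold back the most recent points, let each candidate forecast them, and pick the candidate with the lowest error. The three candidate forecasters are deliberately simple stand-ins, not the algorithms shipped with ATSD.

```python
def naive_last(history, horizon):
    """Repeat the last observed value."""
    return [history[-1]] * horizon

def mean_forecast(history, horizon):
    """Repeat the historical average."""
    avg = sum(history) / len(history)
    return [avg] * horizon

def drift(history, horizon):
    """Extend the straight line between the first and last observation."""
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return [history[-1] + slope * step for step in range(1, horizon + 1)]

def mean_abs_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def pick_best(series, candidates, holdout):
    """Score each candidate on the last `holdout` points; lowest error wins."""
    train, test = series[:-holdout], series[-holdout:]
    scores = {name: mean_abs_error(test, fn(train, holdout))
              for name, fn in candidates.items()}
    return min(scores, key=scores.get), scores

candidates = {"naive_last": naive_last, "mean": mean_forecast, "drift": drift}
series = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0, 22.0, 24.0]
best, scores = pick_best(series, candidates, holdout=3)
print(best)    # drift
print(scores)  # {'naive_last': 4.0, 'mean': 8.0, 'drift': 0.0}
```

Running the selection per series matters because a metric with a steady trend and a metric that hovers around a constant level are best served by different models.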


Visualization
ATSD can be integrated with Axibase Enterprise Reporting using the ATSD adapter.
ATSD comes with a wide variety of widgets for creating interactive portals directly in ATSD.
ATSD widgets are designed from the ground up to handle large data sets and calculations on the client.
ATSD visualization is supported on mobile devices and Smart TVs.


Search
ATSD implements a log file search system for detecting problems in distributed systems for the purposes of security, audit, and change control.

Notifications
Supports standard notification mechanisms: email, console, web service, and notification in the environment. One example is the Axibase LED lighting system, the "Data Cube", which changes colors depending on the status of IT services.

ATSD Benefits
Enables customers to extract value from data that already exists in their operational and IT infrastructures.
Delivers preemptive monitoring through identification of abnormal behaviors in production systems.
Eliminates most manually defined rules from the customer's monitoring catalog.
Serves as a centralized repository for historical data.
Directly supported by AER for dashboards, reports, and capacity planning.

System Requirements
Operating systems: Red Hat Enterprise Linux 5.6+, Ubuntu 12.04+, SUSE Linux Enterprise Server 10+
Computing hardware:
Edition | Community (free) | Standard | Enterprise
ATSD nodes | 1 | 1 + 1 | > 5
Processors | 2 vCPU, 2+ GHz | 4 vCPU, 2+ GHz | 4 vCPU, 2+ GHz
Memory | 4 GB (2 GB for JVM) | 16 GB (8 GB for JVM) | 16 GB (8 GB for JVM)

Use Cases
ITM long-term history extension
nmon reporting for AIX, Linux, and Solaris
Minimize exceptions in the monitoring catalog
Collect environmental data from SCADA
Predictive maintenance based on sensors

ITM History Extension
ITM can be instrumented to write streaming data into CSV files.
The CSV files can be instantly uploaded into ATSD using the inotify utility and wget.
Example: private history streaming in ITM
KHD_CSV_OUTPUT_ACTIVATE = Y

ITM History Extension
The Warehouse Proxy Agent is set up to save history data to a CSV file on the local machine.
ATSD ingests the CSV files for analytics and long-term storage.
ATSD converts the data using built-in parsers.
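The conversion step can be pictured as turning each wide CSV row into individual time-series samples. The sketch below assumes a hypothetical layout with `hostname` and `timestamp` columns followed by one column per metric; the real ITM export format and the ATSD parser configuration are not shown in the slides and may differ.

```python
import csv
import io
from datetime import datetime, timezone

def parse_history_csv(text, entity_col="hostname", time_col="timestamp"):
    """Turn wide CSV rows into (entity, metric, epoch_ms, value) tuples.
    Column names and the timestamp format are assumptions for this example."""
    samples = []
    for row in csv.DictReader(io.StringIO(text)):
        entity = row[entity_col]
        moment = datetime.strptime(row[time_col], "%Y-%m-%d %H:%M:%S")
        epoch_ms = int(moment.replace(tzinfo=timezone.utc).timestamp() * 1000)
        for column, raw in row.items():
            # Skip key columns and empty cells: only collected values are kept.
            if column is None or column in (entity_col, time_col) or not raw:
                continue
            samples.append((entity, column, epoch_ms, float(raw)))
    return samples

example = """hostname,timestamp,cpu_busy_pct,mem_used_pct
host1,2014-06-01 00:00:00,12.5,61.0
host1,2014-06-01 00:00:15,14.0,
"""
parsed = parse_history_csv(example)
print(len(parsed))  # 3
for sample in parsed:
    print(sample)
```

Each metric column becomes its own series keyed by entity and metric name, and the empty cell in the second row produces no sample at all.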

nmon Reporting
Consolidate trusted statistics from UNIX systems in one database.
ATSD is able to collect, parse, and analyze nmon files.
Analyze nmon data with forecasting algorithms.
Capitalize on nmon data with two predefined visualization portals, or easily create your own portals using built-in HTML5 widgets.

[Figures: nmon predefined portals for AIX and Linux]

Contact Axibase
General: 408.973.7897
Fax: 408.725.8885
Email: sales@axibase.com
Headquarters (Cupertino, Silicon Valley): 19925 Stevens Creek Blvd., Cupertino, CA 95014, USA