In-Memory BigData Summer 2012, Technology Overview
Company Vision
In-Memory Data Processing Leader:
> 5 years in production
> 100s of customers
> Starts every 10 secs worldwide
> Over 10,000,000 starts globally
> Unique in-memory compute + data grid technology
In-Memory Processing Facts
> 64-bit CPUs can address 16 exabytes
> Disk up to 10^7 times slower than RAM
> RAM prices drop 30% every 18 months
> 1GB costs < $1
> 1TB RAM & 48 cores cluster ~ $40K
> Multicore CPUs ideal for in-memory parallelization
> Speed matters
  > Citi: 100ms == $1M
  > Google: 500ms == 20% traffic drop
In-memory will have an industry impact comparable to web and cloud. RAM is a new disk, and disk is a new tape.
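The first and third facts above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the class and method names are made up for this example, and the price projection assumes the 30% drop compounds smoothly over each 18-month period, which the slide does not state.

```java
import java.math.BigInteger;

// Illustrative helper, not part of any GridGain API.
class InMemoryFacts {
    // Number of distinct byte addresses available with the given pointer width.
    static BigInteger addressableBytes(int bits) {
        return BigInteger.ONE.shiftLeft(bits);
    }

    // Whole binary exabytes (2^60 bytes each) contained in the given byte count.
    static BigInteger toExabytes(BigInteger bytes) {
        return bytes.shiftRight(60);
    }

    // Projected price after the given number of months, assuming a 30% drop
    // per 18 months that compounds smoothly (an assumption of this sketch).
    static double projectedPrice(double priceToday, int months) {
        return priceToday * Math.pow(0.7, months / 18.0);
    }

    public static void main(String[] args) {
        BigInteger bytes = addressableBytes(64);
        System.out.println(bytes + " bytes = " + toExabytes(bytes) + " exabytes");
        System.out.println("Price of $1 of RAM after 36 months: " + projectedPrice(1.0, 36));
    }
}
```

Under these assumptions, two 18-month periods cut the price to roughly half (0.7 × 0.7 = 0.49) of the starting value.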
GridGain 4: Three Editions
> Different markets, customers, messages, needs:
  > Compute Grid Edition
  > Data Grid Edition
  > Big Data Edition
GridGain 4: At a Glance
> Scalable In-Memory Data Platform
> Compute Grid + In-Memory Data Grid
  Real Time & Streaming MapReduce, CEP
> TBs of data and 1000s of nodes
  Typical: 10s of TBs and 100s of nodes
> In-Memory Speed, Database Reliability
> Native: Java, Scala and Groovy DSLs
> Clients: C++, .NET, iOS, Android, PHP, REST
> Distributed in-memory object store
GridGain 4: New Features
1. In-Memory Data Grid
2. In-Memory Compute Grid
3. Streaming MapReduce
4. Clustering
5. Messaging
6. Advanced Security
7. DevOps GUI Console
8. SPI Architecture
9. Zero Deployment
10. Native Client APIs
11. Java, Scala, Groovy
12. Advanced Load Balancing
13. Pluggable Fault Tolerance
14. Hadoop Integration
Clustering GridGain 4
Sophisticated clustering capabilities for the JVM, with the ability to connect and manage a heterogeneous set of computing devices
> Pluggable cluster topology management & various consistency strategies
> Pluggable automatic discovery on LAN, WAN, and AWS
> Pluggable split-brain cluster segmentation resolution
> Unicast, broadcast, and Actor-based cluster-wide message exchange
> Pluggable event storage and propagation
> Versioning
> Support for complex leader election algorithms
> On-demand and direct deployment
> Support for virtual clusters and grouping
> Integration with Hadoop ZooKeeper
Advanced Security GridGain 4
> Cluster Security
> Client Security
> JAAS-based
> Authentication
> Secure Session
SPI Architecture GridGain 4
Fourteen SPIs provide plug-and-play capabilities to replace and customize every significant subsystem of the GridGain runtime.
1. Checkpoint SPI
2. Collision SPI
3. Authentication SPI
4. Secure Session SPI
5. Indexing SPI
6. Load Balancing SPI
7. Communication SPI
8. Deployment SPI
9. Swap Space SPI
10. Metrics SPI
11. Discovery SPI
12. Failover SPI
13. Topology SPI
14. Event Storage SPI
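The plug-and-play idea behind an SPI can be sketched in plain Java: the runtime depends only on an interface, and the implementation is chosen at configuration time. Everything below (`LoadBalancer`, `RoundRobinBalancer`, `FirstNodeBalancer`, `MiniRuntime`) is hypothetical and invented for illustration; it is not the GridGain SPI API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical SPI contract: choose the node that runs the next job.
interface LoadBalancer {
    String pickNode(List<String> nodes);
}

// One pluggable implementation: cycle through the nodes in order.
class RoundRobinBalancer implements LoadBalancer {
    private final AtomicInteger next = new AtomicInteger();

    public String pickNode(List<String> nodes) {
        return nodes.get(Math.floorMod(next.getAndIncrement(), nodes.size()));
    }
}

// Another pluggable implementation: always use the first node.
class FirstNodeBalancer implements LoadBalancer {
    public String pickNode(List<String> nodes) {
        return nodes.get(0);
    }
}

// Hypothetical runtime that only knows the interface, never a concrete class.
class MiniRuntime {
    private final LoadBalancer balancer;
    private final List<String> nodes;

    MiniRuntime(LoadBalancer balancer, List<String> nodes) {
        this.balancer = balancer;
        this.nodes = nodes;
    }

    // Returns "job@node" to show where the job would be sent.
    String route(String job) {
        return job + "@" + balancer.pickNode(nodes);
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList("node-a", "node-b");
        MiniRuntime runtime = new MiniRuntime(new RoundRobinBalancer(), nodes);
        System.out.println(runtime.route("job-1"));
        System.out.println(runtime.route("job-2"));
    }
}
```

Swapping `RoundRobinBalancer` for `FirstNodeBalancer` changes routing behavior without touching `MiniRuntime`, which is the property the slide attributes to each subsystem.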
Native Clients GridGain 4
> Java (EE & Android)
> C++
> .NET C#
> Objective C
> REST
> Memcache
Java, Scala, Groovy GridGain 4
> Java 6
> Scala 2.9
> Groovy 1.8 and Groovy++
> Scalar - Scala DSL for GridGain
> Grover - Groovy++ DSL for GridGain
Hadoop Integration GridGain 4
> HBase cache store
> ZooKeeper discovery integration
> Distributed bulk data loader
> Hadoop-compatible Distributed File System
> In-memory & high performance alternative to HDFS
DevOps Console GridGain 4
Success Stories
> Trading Systems: Handle large volumes of transactions
> Real-time Risk Analysis: Analysis of trading positions & risk
> Online Gaming: Online real-time backbone for gaming
> Actuarial Analysis: Insurance rating and modeling
> Geo Mapping: Real-time geographical route and traffic information
> Bioinformatics: Real-time DNA sequencing and matching
GridGain Customers
In-Memory Data Grid Features 1
> Java-based distributed in-memory store
> Zero deployment for data
> Local, fully replicated and partitioned cache types
> Pluggable expiration policies (LRU, LFU, FIFO, time-based and random)
> Read-through and write-through
> Pluggable cache store (SQL, ERP, Hadoop)
> Synchronous & asynchronous cache operations
> MVCC-based concurrency
> Pluggable data overflow storage
> PESSIMISTIC & OPTIMISTIC ACID transactions
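Three of the features above (LRU expiration, read-through, write-through) fit together in a small single-JVM sketch. The class `ReadThroughLruCache` is hypothetical and is not GridGain code; a plain `Map` stands in for the pluggable cache store, and the JDK's access-ordered `LinkedHashMap` provides the LRU behavior.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical single-node cache illustrating LRU + read-through + write-through.
class ReadThroughLruCache<K, V> {
    private final Map<K, V> store;            // stand-in for a pluggable cache store
    private final LinkedHashMap<K, V> memory; // in-memory tier, kept in access order
    int storeReads = 0;                       // how often a miss fell through to the store

    ReadThroughLruCache(final int capacity, Map<K, V> store) {
        this.store = store;
        // accessOrder = true makes iteration order least-recently-used first.
        this.memory = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity; // evict the LRU entry when over capacity
            }
        };
    }

    // Read-through: on a miss, load from the store and keep the value in memory.
    V get(K key) {
        V value = memory.get(key);
        if (value == null) {
            storeReads++;
            value = store.get(key);
            if (value != null) {
                memory.put(key, value);
            }
        }
        return value;
    }

    // Write-through: update the store synchronously, then the in-memory tier.
    void put(K key, V value) {
        store.put(key, value);
        memory.put(key, value);
    }

    // containsKey does not count as an access, so it leaves the LRU order alone.
    boolean inMemory(K key) {
        return memory.containsKey(key);
    }

    public static void main(String[] args) {
        ReadThroughLruCache<String, String> cache =
            new ReadThroughLruCache<>(2, new HashMap<String, String>());
        cache.put("a", "1");
        cache.put("b", "2");
        cache.get("a");      // "a" becomes the most recently used entry
        cache.put("c", "3"); // capacity exceeded, so "b" is evicted from memory
        System.out.println("b in memory: " + cache.inMemory("b"));
        System.out.println("b via read-through: " + cache.get("b"));
    }
}
```

Eviction only removes an entry from the in-memory tier; because writes go through to the store, the evicted value can still be recovered by a later read.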
In-Memory Data Grid Features 2
> JTA/JTS integration
> Master/master data replication
> Master/master data invalidation
> Replication/invalidation in async/sync modes
> Write-behind cache store support
> Concurrent/Delayed transactional preloading
> Affinity routing with compute grid
> Partitioned cache with active backups (replicas)
> Structured and unstructured data
> Transactional datacenter replication
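Affinity routing and a partitioned cache with backups both rest on a deterministic mapping from key to owning nodes. The sketch below shows one simple way to build such a mapping; `AffinityMapper` is a hypothetical name, and the modulo scheme is an assumption chosen for brevity, not the function GridGain uses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical key-to-node mapping for a partitioned cache with backups.
class AffinityMapper {
    private final List<String> nodes;
    private final int partitions;

    AffinityMapper(List<String> nodes, int partitions) {
        if (nodes.isEmpty() || partitions <= 0) {
            throw new IllegalArgumentException("need at least one node and one partition");
        }
        this.nodes = nodes;
        this.partitions = partitions;
    }

    // Every key lands in exactly one partition; floorMod keeps the result non-negative.
    int partition(Object key) {
        return Math.floorMod(key.hashCode(), partitions);
    }

    // Primary owner first, followed by up to `backups` distinct backup nodes.
    List<String> nodesFor(Object key, int backups) {
        int p = partition(key);
        int copies = Math.min(backups + 1, nodes.size());
        List<String> owners = new ArrayList<>();
        for (int i = 0; i < copies; i++) {
            owners.add(nodes.get((p + i) % nodes.size()));
        }
        return owners;
    }

    public static void main(String[] args) {
        AffinityMapper mapper = new AffinityMapper(Arrays.asList("n0", "n1", "n2"), 4);
        // A job that needs key 7 would be routed to the first node in this list.
        System.out.println(mapper.nodesFor(7, 1));
    }
}
```

Because the mapping depends only on the key and the node list, a compute job can be sent to the node that already holds its data instead of pulling the data across the network.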
In-Memory Data Grid Features 3
> Customizable/pluggable data indexing
> JDBC driver for in-memory data
> Co-located cache mode
> BigMemory (off-heap allocation) support
> Tiered storage with on-heap, off-heap, swap, SQL and Hadoop
> Distributed in-memory query support
> SQL-based affinity co-located queries
> Lucene-based text affinity co-located queries
> H2-based text affinity co-located queries
> Predicate-based full scan queries
> Support for pagination
> Local & remote filtering, transformation and reduction for execution plan
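A predicate-based full scan with pagination can be sketched over a plain `Map` standing in for one node's cache contents. `ScanQuery.page` is a hypothetical helper for this example, not a GridGain API; it assumes Java 8 streams and sorts by key so that page boundaries are stable between calls.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical full-scan query helper over a local map.
class ScanQuery {
    // Scans every entry, keeps values matching the filter, and returns one page.
    static <K extends Comparable<? super K>, V> List<V> page(
            Map<K, V> cache, Predicate<V> filter, int pageIndex, int pageSize) {
        return cache.entrySet().stream()
            .sorted(Map.Entry.comparingByKey())      // stable order across pages
            .filter(e -> filter.test(e.getValue()))  // the predicate sees every entry
            .skip((long) pageIndex * pageSize)       // drop earlier pages
            .limit(pageSize)                         // keep only the current page
            .map(Map.Entry::getValue)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<Integer, String> cache = new TreeMap<>();
        cache.put(1, "apple");
        cache.put(2, "bob");
        cache.put(3, "avocado");
        cache.put(4, "apricot");
        System.out.println(page(cache, v -> v.startsWith("a"), 0, 2));
        System.out.println(page(cache, v -> v.startsWith("a"), 1, 2));
    }
}
```

In a distributed setting the same filter would run on each node against its local partition, with the pages merged on the caller; that step is left out of this sketch.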
In-Memory Compute Grid Features 1
> Direct API for map/split and reduce/aggregate
> Pluggable failover management
> Pluggable topology resolution
> Pluggable collision resolution
> Distributed task session
> Distributed continuations and recursive split
> Streaming MapReduce
> Complex Event Processing (CEP)
> Node-local cache
> AOP-based, OOP/FP-based, sync/async execution modes
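The map/split and reduce/aggregate pattern in the first bullet can be illustrated on a single machine, with a thread pool standing in for grid nodes. `SplitAggregate` is a hypothetical class written for this example using only `java.util.concurrent`; it does not use or imitate the GridGain task API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical split/aggregate task: total number of letters in a phrase.
class SplitAggregate {
    static int totalLength(String phrase, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            // Split: one independent job per word.
            List<Callable<Integer>> jobs = new ArrayList<>();
            for (final String word : phrase.trim().split("\\s+")) {
                jobs.add(() -> word.length());
            }
            // Aggregate: sum the partial results once all jobs have finished.
            int total = 0;
            for (Future<Integer> result : pool.invokeAll(jobs)) {
                total += result.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for jobs", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a job failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(totalLength("in memory data grid", 2));
    }
}
```

On a real grid the jobs would be shipped to remote nodes and a failed job could be retried elsewhere; here a failure simply surfaces as an exception.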
In-Memory Compute Grid Features 2
> Direct closure distribution in Java, Scala and Groovy
> Cron-based task scheduling
> Direct redundant mapping support
> Zero deployment with P2P on-demand distributed class loading
> Partial asynchronous reduction
> Weighted and dynamic adaptive mapping
> State checkpoints for long running tasks
> Early and late load balancing
> Affinity routing with data grid
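Weighted mapping can be read as "nodes with a higher weight receive proportionally more jobs". The sketch below is one possible interpretation under that assumption: each job goes to the node whose load-to-weight ratio would be lowest after taking it. `WeightedMapper` is a hypothetical name, and this is not presented as GridGain's algorithm.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical weighted job mapper; weights must be positive integers.
class WeightedMapper {
    // Returns how many of the given jobs each node receives.
    static Map<String, Integer> assign(Map<String, Integer> weights, int jobs) {
        if (weights.isEmpty()) {
            throw new IllegalArgumentException("at least one node is required");
        }
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String node : weights.keySet()) {
            counts.put(node, 0);
        }
        for (int j = 0; j < jobs; j++) {
            String best = null;
            for (String node : weights.keySet()) {
                // Compare (count + 1) / weight without division by cross-multiplying.
                // Strict "<" means ties go to the node listed first.
                if (best == null
                        || (counts.get(node) + 1L) * weights.get(best)
                           < (counts.get(best) + 1L) * weights.get(node)) {
                    best = node;
                }
            }
            counts.put(best, counts.get(best) + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> weights = new LinkedHashMap<>();
        weights.put("big-node", 3);
        weights.put("small-node", 1);
        System.out.println(assign(weights, 8));
    }
}
```

A dynamic adaptive variant would recompute the weights from live metrics such as CPU load before each assignment; this sketch keeps them fixed.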
GridGain Systems
1065 East Hillsdale Blvd., Suite 230
Foster City, CA 94404
Web: www.gridgain.com
Email: info@gridgain.com
Twitter: @gridgain