Big Data. Without a Big Database. Kate Matsudaira, popforms, @katemats

1 Big Data Without a Big Database Kate Matsudaira

2 Two kinds of data
nicknames: user, transactional | reference, non-transactional
examples: user accounts, shopping cart/orders, user messages | product/offer catalogs, catalogs, static geolocation data, dictionaries
created/modified by: users | business (you)
sensitivity to staleness: high | low
plan for growth: hard | easy
access optimization: read/write | mostly read

4-9 (diagram slides) reference data; user data

10 Performance Reminder
main memory read: 0.0001 ms (100 ns)
network round trip: 0.5 ms (500,000 ns)
disk seek: 10 ms (10,000,000 ns)
source:

14-17 The Beginning
Diagram components: load balancer, webapp, webapp, load balancer, BIG DATABASE, data loader
Problems: availability problems, performance problems, scalability problems

18-22 Replication
Diagram components: REPLICA, load balancer, webapp, webapp, load balancer, BIG DATABASE, data loader
Problems: scalability problems, performance problems, operational overhead

23-29 Local Caching
Diagram components: REPLICA, load balancer, webapp, webapp, load balancer, BIG DATABASE, data loader, cache (x3)
Problems: scalability problems, operational overhead, performance problems, consistency problems, long tail performance problems

30 The Long Tail Problem
80% of requests query 10% of entries (head)
20% of requests query remaining 90% of entries (tail)

31-37 Big Cache
Diagram components: replica, load balancer, webapp, webapp, load balancer, BIG CACHE, database, data loader, preload
Problems: operational overhead, performance problems, scalability problems, consistency problems, long tail performance problems

38-41 Big Cache Technologies
memcached(b)
ElastiCache (AWS)
Oracle Coherence

43-48
Targeted generic data/use cases
Dynamically assign keys to the nodes
Scales horizontally
Dynamically rebalances data
Poor performance on cold starts
No assumptions about loading/updating data

49-51 Big Cache Technologies
Operational overhead: additional hardware, additional configuration, additional monitoring
Performance: extra network hop, slow scanning, additional deserialization

52-55 NoSQL to The Rescue?
Diagram components: NoSQL Replica, load balancer, webapp, webapp, load balancer, NoSQL Database, data loader
Problems: some performance problems, some scalability problems, some operational overhead

56-58 Remote Store Retrieval Latency
Diagram: client, network, remote store
TCP request: 0.5 ms
Lookup/write response: 0.5 ms
TCP response: 0.5 ms
read/parse response: 0.25 ms
Total time to retrieve single value: 1.75 ms

59-60 Total time to retrieve
A single value
from remote store: 1.75 ms
from memory: 0.001 ms (10 main memory reads)
Sequential access of 1 million random keys
from remote store: 30 minutes
from memory: 1 second
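
The figures on slides 57-60 can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: the class and method names are hypothetical, and the 10 memory reads per lookup is the assumption stated on slide 59.

```java
import java.util.concurrent.TimeUnit;

final class RetrievalLatency {
    // Per-step costs from slide 57, in nanoseconds.
    static final long TCP_REQUEST_NS = 500_000;   // 0.5 ms
    static final long LOOKUP_NS = 500_000;        // 0.5 ms
    static final long TCP_RESPONSE_NS = 500_000;  // 0.5 ms
    static final long PARSE_NS = 250_000;         // 0.25 ms

    // From slides 10 and 59: one main memory read is 100 ns,
    // and a lookup is assumed to cost 10 such reads.
    static final long MEMORY_READ_NS = 100;
    static final int READS_PER_LOOKUP = 10;

    static long remoteLookupNs() {
        return TCP_REQUEST_NS + LOOKUP_NS + TCP_RESPONSE_NS + PARSE_NS;
    }

    static long memoryLookupNs() {
        return MEMORY_READ_NS * READS_PER_LOOKUP;
    }

    // Total wall-clock seconds for strictly sequential lookups.
    static long totalSeconds(long perLookupNs, long lookups) {
        return TimeUnit.NANOSECONDS.toSeconds(perLookupNs * lookups);
    }

    public static void main(String[] args) {
        long lookups = 1_000_000;
        // 1,750 s is roughly 29 minutes, which the slide rounds to 30.
        System.out.println("remote: " + totalSeconds(remoteLookupNs(), lookups) + " s");
        System.out.println("memory: " + totalSeconds(memoryLookupNs(), lookups) + " s");
    }
}
```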

61 The Truth About Databases

62 "What I'm going to call as the hot data cliff: As the size of your hot data set (data frequently read at sustained rates above disk I/O capacity) approaches available memory, write operation bursts that exceeds disk write I/O capacity can create a thrashing death spiral where hot disk pages that MongoDB desperately needs are evicted from disk cache by the OS as it consumes more buffer space to hold the writes in memory."
MongoDB
Source:

63 Redis is an in-memory but persistent on disk database, so it represents a different trade off where very high write and read speed is achieved with the limitation of data sets that can't be larger than memory. Redis source:

64 They are fast if everything fits into memory.

66-70 Can you keep it in memory yourself?
Diagram components: load balancer, webapp, webapp, load balancer, full cache (x3), data loader (x3), BIG DATABASE
Outcomes: operational relief, scales infinitely, performance gain, consistency problems

71-72 Fixing Consistency
1. Deployment Cells
2. Sticky user sessions
Diagram components: deployment cell, load balancer, webapp + full cache + data loader (x3)

73 How do you fit all of that data into memory? credit:

74 "Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%." Donald Knuth

75 How do you fit all that data in memory?

76 The Answer

77 Domain Model Design
"Domain Layer (or Model Layer): Responsible for representing concepts of the business, information about the business situation, and business rules. State that reflects the business situation is controlled and used here, even though the technical details of storing it are delegated to the infrastructure. This layer is the heart of business software." Eric Evans, Domain-Driven Design

78-81 Domain Model Design Guidelines
#1 Keep it immutable
#2 Use independent hierarchies
#3 Optimize Data

82 intern() your immutables
(diagram: entries K1/V1 and K2/V2 whose value graphs contain repeated nodes A to F, shown before and after equal nodes are shared)

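83
    // Requires java.lang.ref.WeakReference, java.util.Collections, java.util.Map,
    // java.util.WeakHashMap and java.util.concurrent.ConcurrentHashMap.
    private final Map<Class<?>, Map<Object, WeakReference<Object>>> cache =
        new ConcurrentHashMap<Class<?>, Map<Object, WeakReference<Object>>>();

    @SuppressWarnings("unchecked")
    public <T> T intern(T o) {
        if (o == null) return null;
        Class<?> c = o.getClass();
        // One weak map per class, so entries vanish once nothing else references them.
        Map<Object, WeakReference<Object>> m = cache.get(c);
        if (m == null)
            cache.put(c, m = Collections.synchronizedMap(new WeakHashMap<Object, WeakReference<Object>>()));
        // Look up an equal instance that was interned earlier.
        WeakReference<Object> r = m.get(o);
        T v = (r == null) ? null : (T) r.get();
        // First time this value is seen: it becomes the canonical instance.
        if (v == null) { v = o; m.put(v, new WeakReference<Object>(v)); }
        return v;
    }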

84 Use Independent Hierarchies
(diagram: a single Product (id, title) containing Offers, Specifications, Description, Reviews, Rumors and Model History, contrasted with separate Product Summary / Product Info, Offers, Specifications, Description, Reviews, Rumors and Model History hierarchies, each keyed by productid)

85 Collection Optimization

86 Leverage Primitive Keys/Values
Collection with 10,000 elements [0..9,999], size in memory:
java.util.ArrayList<Integer>: 200K
java.util.HashSet<Integer>: 546K
gnu.trove.list.array.TIntArrayList: 40K
gnu.trove.set.hash.TIntHashSet: 102K
Trove ("High Performance Collections for Java")

87 Optimize small immutable collections
Collections with small number of entries (up to ~20):

    class ImmutableMap<K, V> implements Map<K, V>, Serializable { ... }

    class MapN<K, V> extends ImmutableMap<K, V> {
        final K k1, k2, ..., kn;
        final V v1, v2, ..., vn;

        @Override
        public boolean containsKey(Object key) {
            // Linear scan over the fields; cheap for a handful of entries.
            if (eq(key, k1)) return true;
            if (eq(key, k2)) return true;
            ...
            return false;
        }
        ...
    }
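
As a concrete instance of the MapN idea, here is a minimal sketch of a fixed two-entry map. It is not the implementation from the talk: the class name Map2 is hypothetical, and it extends java.util.AbstractMap (rather than the slide's ImmutableMap base class) purely to keep the example short.

```java
import java.util.AbstractMap;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

// Hypothetical two-entry immutable map: four fields, no hash table, no entry objects.
final class Map2<K, V> extends AbstractMap<K, V> {
    private final K k1, k2;
    private final V v1, v2;

    Map2(K k1, V v1, K k2, V v2) {
        if (Objects.equals(k1, k2)) throw new IllegalArgumentException("duplicate key");
        this.k1 = k1; this.v1 = v1;
        this.k2 = k2; this.v2 = v2;
    }

    @Override public int size() { return 2; }

    @Override public boolean containsKey(Object key) {
        return Objects.equals(key, k1) || Objects.equals(key, k2);
    }

    @Override public V get(Object key) {
        if (Objects.equals(key, k1)) return v1;
        if (Objects.equals(key, k2)) return v2;
        return null;
    }

    // Entries are materialized only when someone iterates the map.
    @Override public Set<Map.Entry<K, V>> entrySet() {
        Set<Map.Entry<K, V>> entries = new LinkedHashSet<>();
        entries.add(new SimpleImmutableEntry<>(k1, v1));
        entries.add(new SimpleImmutableEntry<>(k2, v2));
        return Collections.unmodifiableSet(entries);
    }
}
```

Mutators such as put are inherited from AbstractMap and throw UnsupportedOperationException, so the map stays immutable without extra code.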

88 Space Savings
java.util.HashMap: 128 bytes + 32 bytes per entry
compact immutable map: 24 bytes + 8 bytes per entry

89 Numeric Data Optimization

90 Price History Example

91 Example: Price History
Problem: Store daily prices for 1M products, 2 offers per product
Average price history length per product ~2 years
Total price points: (1M + 2M) * 730 = ~2 billion

92 Price History: First attempt
TreeMap<Date, Double>
88 bytes per entry * 2 billion = ~180 GB

93-95 Typical Shopping Price History (chart: price $ against days)

96 Run Length Encoding
a a a a a a b b b c c c c c c
6 a 3 b 6 c

97 Price History Optimization
Encoding of each value:
positive: price (adjusted to scale)
negative: run length (precedes price)
zero: unavailable
Drop pennies
Store prices in primitive short (use scale factor to represent prices greater than Short.MAX_VALUE)
Memory: 15 * 2 + 16 (array) + 24 (start date) + 4 (scale factor) = 74 bytes
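
The encoding on slide 97 could be sketched as follows. This is an illustration under assumptions, not the talk's code: the class and method names are hypothetical, prices are taken as whole currency units (pennies already dropped), a run of length 1 is assumed to be stored without a run-length prefix, and runs of "unavailable" days (zero) are assumed to be run-length encoded like any other value.

```java
import java.util.Arrays;

final class PriceRle {
    // Scales a whole-unit price down so that it fits in a short.
    private static short scaled(int price, int scaleFactor) {
        int v = price / scaleFactor;
        if (price < 0 || v > Short.MAX_VALUE)
            throw new IllegalArgumentException("price out of range: " + price);
        return (short) v;
    }

    /** One price per day, 0 = unavailable. Returns the run-length encoded form. */
    static short[] encode(int[] dailyPrices, int scaleFactor) {
        // A run of r days emits at most min(r, 2) shorts, so the input length is a safe bound.
        short[] out = new short[dailyPrices.length];
        int n = 0;
        int i = 0;
        while (i < dailyPrices.length) {
            short value = scaled(dailyPrices[i], scaleFactor);
            int run = 1;
            while (i + run < dailyPrices.length
                    && run < Short.MAX_VALUE
                    && scaled(dailyPrices[i + run], scaleFactor) == value) {
                run++;
            }
            if (run > 1) out[n++] = (short) -run; // negative: run length, precedes the price
            out[n++] = value;                     // positive: price, zero: unavailable
            i += run;
        }
        return Arrays.copyOf(out, n);
    }

    /** Expands the encoded form back into one price per day. */
    static int[] decode(short[] encoded, int scaleFactor) {
        int days = 0;
        int run = 1;
        for (short s : encoded) {
            if (s < 0) { run = -s; } else { days += run; run = 1; }
        }
        int[] prices = new int[days];
        int day = 0;
        run = 1;
        for (short s : encoded) {
            if (s < 0) { run = -s; continue; }
            Arrays.fill(prices, day, day + run, s * scaleFactor);
            day += run;
            run = 1;
        }
        return prices;
    }

    public static void main(String[] args) {
        int[] prices = {10, 10, 10, 12, 0, 0, 15};
        // prints [-3, 10, 12, -2, 0, 15]
        System.out.println(Arrays.toString(encode(prices, 1)));
    }
}
```

With a scale factor greater than 1 the encoding is lossy, since prices are rounded down to a multiple of the scale factor.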

98-99 Space Savings
Reduction compared to TreeMap<Date, Double>: 155 times
Estimated memory for 2 billion price points: 1.2 GB << 180 GB

100 Price History Model

    import java.util.Date;
    import java.util.SortedMap;

    public class PriceHistory {

        private final Date startDate; // or use org.joda.time.LocalDate
        private final short[] encoded;
        private final int scaleFactor;

        // Method bodies are elided on the slide.
        public PriceHistory(SortedMap<Date, Double> prices) { ... } // encode
        public SortedMap<Date, Double> getPricesByDate() { ... } // decode

        public Date getStartDate() { return startDate; }

        // Below computations implemented directly against encoded data
        public Date getEndDate() { ... }
        public Double getMinPrice() { ... }
        public int getNumChanges(double minChangeAmt, double minChangePct, boolean abs) { ... }
        public PriceHistory trim(Date startDate, Date endDate) { ... }
        public PriceHistory interpolate() { ... }
    }

101 Know Your Data

102 Compress text

103 String Compression: byte arrays
Use the minimum character set encoding

    static Charset UTF8 = Charset.forName("UTF-8");

    String s = "The quick brown fox jumps over the lazy dog"; // 43 chars, 136 bytes
    byte[] b = "The quick brown fox jumps over the lazy dog".getBytes(UTF8); // 64 bytes
    String s1 = "Hello"; // 5 chars, 64 bytes
    byte[] b1 = "Hello".getBytes(UTF8); // 24 bytes

    byte[] toBytes(String s) { return s == null ? null : s.getBytes(UTF8); }
    String toString(byte[] b) { return b == null ? null : new String(b, UTF8); }

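104 String Compression: shared prefix
Great for URLs

    public class PrefixedString {
        private PrefixedString prefix;
        private byte[] suffix; // field name lost in the transcription; "suffix" is an assumed name

        // Method bodies are elided on the slide.
        @Override public int hashCode() { ... }
        @Override public boolean equals(Object o) { ... }
    }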

105 String Compression: short alphanumeric case-insensitive strings

    public abstract class AlphaNumericString {
        public static AlphaNumericString make(String s) {
            try {
                // Strings that parse as base-36 numbers are stored as a single long.
                return new Numeric(Long.parseLong(s, Character.MAX_RADIX));
            } catch (NumberFormatException e) {
                // Everything else falls back to a UTF-8 byte array.
                return new Alpha(s.getBytes(UTF8));
            }
        }

        protected abstract String value();

        @Override public String toString() { return value(); }

        private static class Numeric extends AlphaNumericString {
            long value;
            Numeric(long value) { this.value = value; }
            @Override protected String value() { return Long.toString(value, Character.MAX_RADIX); }
            @Override public int hashCode() { ... }
            @Override public boolean equals(Object o) { ... }
        }

        private static class Alpha extends AlphaNumericString {
            byte[] value;
            Alpha(byte[] value) { this.value = value; }
            @Override protected String value() { return new String(value, UTF8); }
            @Override public int hashCode() { ... }
            @Override public boolean equals(Object o) { ... }
        }
    }

106-109 String Compression: Large Strings
Become the master of your strings!
Gzip
bzip2
Just convert to byte[] first, then compress
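
The "convert to byte[] first, then compress" advice might look roughly like this with the JDK's java.util.zip classes. The class name GzipStrings and its two methods are hypothetical; bzip2 is not part of the JDK and is not shown.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class GzipStrings {
    /** Encodes the string as UTF-8 bytes, then gzips those bytes. */
    static byte[] compress(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
            gz.write(s.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The stream is closed at this point, so the gzip trailer has been written.
        return buf.toByteArray();
    }

    /** Reverses compress(): gunzips, then decodes the bytes as UTF-8. */
    static String decompress(byte[] compressed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = gz.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

Gzip adds a fixed header and trailer, so this only pays off for large strings; short strings can come out bigger than their plain byte[] form.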

110-118 JVM Tuning
make sure to use compressed pointers (-XX:+UseCompressedOops)
use low pause GC (Concurrent Mark Sweep, G1)
Adjust generation sizes/ratios
Overprovision heap by ~30%
Print garbage collection
If GC pauses still prohibitive then consider partitioning

119 In summary

120 Know your Business

121 Know your data

122 Know when to optimize

123 The End! My company: My website: And much of the credit for this talk goes to Leon Stein for developing the technology. Thank you, Leon.

124 How do you load the data?

125 Cache Loading
Diagram components: load balancer, webapp + full cache + data loader (x3), reliable file store (S3), cooked datasets

126-129 Cache loading tips & tricks
Final datasets should be compressed and stored (e.g. S3)
Keep the format simple (CSV, JSON)
Poll for updates
Poll frequency == data inconsistency threshold

130 Cache Loading Time Sensitivity

131 Cache Loading: low time sensitivity
Data
/tax-rates
    /date= tax-rates csv.gz
    /date= tax-rates csv.gz
    /date= tax-rates csv.gz

132 Cache Loading: medium/high time sensitivity
/prices
    /full
        /date= price-obs csv.gz
        /date=
    /inc
        /date= T csv.gz T csv.gz

133-135 Cache Loading Strategy: Swap
Cache is immutable, so no locking is required
Works well for infrequently updated data sets
And for datasets that need to be refreshed each update
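
One way to picture the swap strategy: readers always see a complete, immutable snapshot, and the loader replaces the whole snapshot in a single reference assignment. The class below is a hypothetical sketch of that idea and not the code from the talk; SwappableCache and its methods are assumed names.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

final class SwappableCache<K, V> {
    // Readers only ever dereference this; they never take a lock.
    private final AtomicReference<Map<K, V>> current =
            new AtomicReference<>(Collections.<K, V>emptyMap());

    V get(K key) {
        return current.get().get(key);
    }

    int size() {
        return current.get().size();
    }

    /** Called by the data loader once a full dataset has been built off to the side. */
    void swap(Map<K, V> freshData) {
        // Defensive copy plus unmodifiable view, so the published snapshot can never change.
        current.set(Collections.unmodifiableMap(new HashMap<>(freshData)));
    }
}
```

During a swap both the old and the new snapshot are live, which is one reason to overprovision the heap.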

136-139 Cache Loading Strategy: CRUD
Deletions can be tricky
Avoid full synchronization
Consider loading cache in small batches
Use one container per partition

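140 Concurrent Locking with Trove Map

    import gnu.trove.map.TLongObjectMap;
    import gnu.trove.map.hash.TLongObjectHashMap;
    import java.util.concurrent.locks.Lock;
    import java.util.concurrent.locks.ReentrantReadWriteLock;

    public class LongCache<V> {
        private final TLongObjectMap<V> map = new TLongObjectHashMap<V>();
        private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        private final Lock r = lock.readLock(), w = lock.writeLock();

        // Many readers may hold the read lock at once.
        public V get(long k) {
            r.lock();
            try { return map.get(k); } finally { r.unlock(); }
        }

        // Writers take the exclusive write lock.
        public V update(long k, V v) {
            w.lock();
            try { return map.put(k, v); } finally { w.unlock(); }
        }

        public V remove(long k) {
            w.lock();
            try { return map.remove(k); } finally { w.unlock(); }
        }
    }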

141-143 Cache loading optimizations
Periodically generate serialized data/state
Validate with CRC or hash
Keep local copies
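
Validating a cooked dataset with a CRC could be as small as the sketch below, which uses the JDK's java.util.zip.CRC32. The class name and the idea of shipping the expected checksum next to the dataset are assumptions made for illustration.

```java
import java.util.zip.CRC32;

final class DatasetChecksum {
    /** CRC-32 of the serialized dataset bytes. */
    static long crc32(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    /** True when the bytes match the checksum published with the dataset (assumed convention). */
    static boolean isValid(byte[] data, long expectedCrc) {
        return crc32(data) == expectedCrc;
    }
}
```

A loader would compute the checksum after download and keep serving the previous local copy if validation fails.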

144-148 Dependent Caches
Diagram components: instance, load balancer, health check, status aggregator (servlet)
Caches and dependencies: product summary, offers, matching, predictions

149 Deployment Cell Status
Diagram labels: deployment cell, webapp, status aggregator, load balancer, health check, cell status aggregator, HTTP or JMX, (1) status aggregator, (2) status aggregator

150 Hierarchical Status Aggregation

More information

MID-TIER DEPLOYMENT KB

MID-TIER DEPLOYMENT KB MID-TIER DEPLOYMENT KB Author: BMC Software, Inc. Date: 23 Dec 2011 PAGE 1 OF 16 23/12/2011 Table of Contents 1. Overview 3 2. Sizing guidelines 3 3. Virtual Environment Notes 4 4. Physical Environment

More information

High Frequency Trading and NoSQL. Peter Lawrey CEO, Principal Consultant Higher Frequency Trading

High Frequency Trading and NoSQL. Peter Lawrey CEO, Principal Consultant Higher Frequency Trading High Frequency Trading and NoSQL Peter Lawrey CEO, Principal Consultant Higher Frequency Trading Agenda Who are we? Brief introduction to OpenHFT. What does a typical trading system look like What requirements

More information

In Memory Accelerator for MongoDB

In Memory Accelerator for MongoDB In Memory Accelerator for MongoDB Yakov Zhdanov, Director R&D GridGain Systems GridGain: In Memory Computing Leader 5 years in production 100s of customers & users Starts every 10 secs worldwide Over 15,000,000

More information

Key Components of WAN Optimization Controller Functionality

Key Components of WAN Optimization Controller Functionality Key Components of WAN Optimization Controller Functionality Introduction and Goals One of the key challenges facing IT organizations relative to application and service delivery is ensuring that the applications

More information

Tushar Joshi Turtle Networks Ltd

Tushar Joshi Turtle Networks Ltd MySQL Database for High Availability Web Applications Tushar Joshi Turtle Networks Ltd www.turtle.net Overview What is High Availability? Web/Network Architecture Applications MySQL Replication MySQL Clustering

More information

Liferay Performance Tuning

Liferay Performance Tuning Liferay Performance Tuning Tips, tricks, and best practices Michael C. Han Liferay, INC A Survey Why? Considering using Liferay, curious about performance. Currently implementing and thinking ahead. Running

More information

Cognos8 Deployment Best Practices for Performance/Scalability. Barnaby Cole Practice Lead, Technical Services

Cognos8 Deployment Best Practices for Performance/Scalability. Barnaby Cole Practice Lead, Technical Services Cognos8 Deployment Best Practices for Performance/Scalability Barnaby Cole Practice Lead, Technical Services Agenda > Cognos 8 Architecture Overview > Cognos 8 Components > Load Balancing > Deployment

More information

Tomcat Tuning. Mark Thomas April 2009

Tomcat Tuning. Mark Thomas April 2009 Tomcat Tuning Mark Thomas April 2009 Who am I? Apache Tomcat committer Resolved 1,500+ Tomcat bugs Apache Tomcat PMC member Member of the Apache Software Foundation Member of the ASF security committee

More information

Evidence based performance tuning of

Evidence based performance tuning of Evidence based performance tuning of enterprise Java applications By Jeroen Borgers jborgers@xebia.com Evidence based performance tuning of enterprise Java applications By Jeroen Borgers jborgers@xebia.com

More information

Raima Database Manager Version 14.0 In-memory Database Engine

Raima Database Manager Version 14.0 In-memory Database Engine + Raima Database Manager Version 14.0 In-memory Database Engine By Jeffrey R. Parsons, Senior Engineer January 2016 Abstract Raima Database Manager (RDM) v14.0 contains an all new data storage engine optimized

More information

Performance rule violations usually result in increased CPU or I/O, time to fix the mistake, and ultimately, a cost to the business unit.

Performance rule violations usually result in increased CPU or I/O, time to fix the mistake, and ultimately, a cost to the business unit. Is your database application experiencing poor response time, scalability problems, and too many deadlocks or poor application performance? One or a combination of zparms, database design and application

More information

Distributed File System. MCSN N. Tonellotto Complements of Distributed Enabling Platforms

Distributed File System. MCSN N. Tonellotto Complements of Distributed Enabling Platforms Distributed File System 1 How do we get data to the workers? NAS Compute Nodes SAN 2 Distributed File System Don t move data to workers move workers to the data! Store data on the local disks of nodes

More information

Ground up Introduction to In-Memory Data (Grids)

Ground up Introduction to In-Memory Data (Grids) Ground up Introduction to In-Memory Data (Grids) QCON 2015 NEW YORK, NY 2014 Hazelcast Inc. Why you here? 2014 Hazelcast Inc. Java Developer on a quest for scalability frameworks Architect on low-latency

More information

F1: A Distributed SQL Database That Scales. Presentation by: Alex Degtiar (adegtiar@cmu.edu) 15-799 10/21/2013

F1: A Distributed SQL Database That Scales. Presentation by: Alex Degtiar (adegtiar@cmu.edu) 15-799 10/21/2013 F1: A Distributed SQL Database That Scales Presentation by: Alex Degtiar (adegtiar@cmu.edu) 15-799 10/21/2013 What is F1? Distributed relational database Built to replace sharded MySQL back-end of AdWords

More information

Wisdom from Crowds of Machines

Wisdom from Crowds of Machines Wisdom from Crowds of Machines Analytics and Big Data Summit September 19, 2013 Chetan Conikee Irfan Ahmad About Us CloudPhysics' mission is to discover the underlying principles that govern systems behavior

More information

SAP HANA - Main Memory Technology: A Challenge for Development of Business Applications. Jürgen Primsch, SAP AG July 2011

SAP HANA - Main Memory Technology: A Challenge for Development of Business Applications. Jürgen Primsch, SAP AG July 2011 SAP HANA - Main Memory Technology: A Challenge for Development of Business Applications Jürgen Primsch, SAP AG July 2011 Why In-Memory? Information at the Speed of Thought Imagine access to business data,

More information

How To Store Data On An Ocora Nosql Database On A Flash Memory Device On A Microsoft Flash Memory 2 (Iomemory)

How To Store Data On An Ocora Nosql Database On A Flash Memory Device On A Microsoft Flash Memory 2 (Iomemory) WHITE PAPER Oracle NoSQL Database and SanDisk Offer Cost-Effective Extreme Performance for Big Data 951 SanDisk Drive, Milpitas, CA 95035 www.sandisk.com Table of Contents Abstract... 3 What Is Big Data?...

More information

Rakam: Distributed Analytics API

Rakam: Distributed Analytics API Rakam: Distributed Analytics API Burak Emre Kabakcı May 30, 2014 Abstract Today, most of the big data applications needs to compute data in real-time since the Internet develops quite fast and the users

More information

Low level Java programming With examples from OpenHFT

Low level Java programming With examples from OpenHFT Low level Java programming With examples from OpenHFT Peter Lawrey CEO and Principal Consultant Higher Frequency Trading. Presentation to Joker 2014 St Petersburg, October 2014. About Us Higher Frequency

More information

FAQs. This material is built based on. Lambda Architecture. Scaling with a queue. 8/27/2015 Sangmi Pallickara

FAQs. This material is built based on. Lambda Architecture. Scaling with a queue. 8/27/2015 Sangmi Pallickara CS535 Big Data - Fall 2015 W1.B.1 CS535 Big Data - Fall 2015 W1.B.2 CS535 BIG DATA FAQs Wait list Term project topics PART 0. INTRODUCTION 2. A PARADIGM FOR BIG DATA Sangmi Lee Pallickara Computer Science,

More information

Integrating VoltDB with Hadoop

Integrating VoltDB with Hadoop The NewSQL database you ll never outgrow Integrating with Hadoop Hadoop is an open source framework for managing and manipulating massive volumes of data. is an database for handling high velocity data.

More information

Practical Cassandra. Vitalii Tymchyshyn tivv00@gmail.com @tivv00

Practical Cassandra. Vitalii Tymchyshyn tivv00@gmail.com @tivv00 Practical Cassandra NoSQL key-value vs RDBMS why and when Cassandra architecture Cassandra data model Life without joins or HDD space is cheap today Hardware requirements & deployment hints Vitalii Tymchyshyn

More information

Direct NFS - Design considerations for next-gen NAS appliances optimized for database workloads Akshay Shah Gurmeet Goindi Oracle

Direct NFS - Design considerations for next-gen NAS appliances optimized for database workloads Akshay Shah Gurmeet Goindi Oracle Direct NFS - Design considerations for next-gen NAS appliances optimized for database workloads Akshay Shah Gurmeet Goindi Oracle Agenda Introduction Database Architecture Direct NFS Client NFS Server

More information

Accelerating Enterprise Applications and Reducing TCO with SanDisk ZetaScale Software

Accelerating Enterprise Applications and Reducing TCO with SanDisk ZetaScale Software WHITEPAPER Accelerating Enterprise Applications and Reducing TCO with SanDisk ZetaScale Software SanDisk ZetaScale software unlocks the full benefits of flash for In-Memory Compute and NoSQL applications

More information

HADOOP PERFORMANCE TUNING
