3 Techniques for Database Scalability with Hibernate
Geert Bevin - @gbevin - SpringOne 2009

Goals
- Learn when to use the second-level cache
- Learn when to detach your conversations
- Learn about alternatives to RDBMSs

Hibernate can be really good
- For enterprise apps, all data is usually in the database
- ORM helps us mask the impedance mismatch
- Even when doing lower-level database operations, we can use libraries like Spring

... Hibernate can be really bad too
- The DB still takes the majority of the application workload
  - Persistent storage hub for our application state
  - ACID hub for coordination
  - Query hub for searching for data
- So, we tend to
  - Over-provision the DB
  - Under-utilize the application servers
- Costs us lots of extra money in time and infrastructure

Apps typically have lots of operations on large amounts of small data
[Chart axes: Operations Per Second, Amount of Data]

DBs are sized to peak load
[Chart axes: Operations Per Second, Amount of Data]

Strive to downsize DBs
[Chart axes: Operations Per Second, Amount of Data]
- Frequently accessed app data: Shared Memory (Transactional)
- Business Record Data: Database

Recognize your data lifetimes
[Chart axes: Appropriateness, Data Lifetime; series: Memory, Terracotta, Database]

There's no such thing as stateless
YESTERDAY:
- Stateless == stored in the DB (scalable)
- Stateful == sticky load balancer (app clustering)
TODAY:
- We realize state is state, in the DB or otherwise
- Clustering has evolved (linear scaling beyond 4 nodes)
- Hybrid approach now viable

Techniques to offload the database
- Cluster Hibernate second-level caches
- Detach conversations and flush manually
- Recognize non-relational data

Introducing Terracotta
- JVM-level clustering
- Commodity hardware
- Transactional coordination of JVMs
- Rely on regular JMM semantics
- Leverage existing transaction APIs
- Built-in clustering, partitioning, HA
- Tools and facilities
  - Developer and operations visual console
  - Cluster-wide statistics
  - Optional cluster awareness APIs

Reference Application
- Best practices with Spring & Terracotta
- Same Spring framework, tools and runtime
- Same MySQL
- Using the database less and Terracotta more
  - Simplifies implementation
  - Makes scalability trivial

Reference Application
- Real-world techniques
  - Web flows
  - Cache frequently-read data
  - Key-value stores
- Spring internals remain durable, scalable

Examinator: The App
- Best of breed OSS stack
  - Tomcat, Spring, Hibernate, Terracotta, MySQL
- Based on a real-world customer
- Online test-taking application
  - Multiple choice tests
  - Admin creates and manages
  - Students take timed tests
  - One active long-running test
  - Forced / max durations

Examinator stack
- Spring-based stack: MVC, Web Flow, Security, Transactions
- Open source: Sitemesh, Freemarker, Tomcat / Jetty, Hibernate / JPA
[Diagram labels: View, Controller, Model, Service, DAO, Domain; Spring Security, Spring Web Flow, Spring MVC, Sitemesh, Freemarker, JPA, Hibernate]

Examinator Use Cases
- Password reset requires confirmation
  - Usage: Hold confirmation code in memory; wait for email follow-up
  - Pattern: Key-value store
- Single Sign-on
  - Usage: Authenticate with user in DB; cache for all other requests
  - Pattern: Read-only caching
- Admin editable exam definitions
  - Usage: Load exams in cache; sync changes back to DB
  - Pattern: Hibernate 2nd level cache
- Taking durable, transactional exams
  - Usage: Conversational tests in clustered heap; async write-behind to DB
  - Pattern: Medium term data lifetime
Copyright Terracotta 2007

Password reset requires confirmation
- Key-value store + write-through
- Hold incomplete password reset request
  - Unique code held in map through Terracotta
  - Lookup code at validation
  - Remove code and reset password
- Use DistributedMap structure which also provides eviction

    @Service
    public class DefaultPasswordResetService implements PasswordResetService {
        @Root
        private DistributedMap<String, Long> resetcodes;
        private final UserService userservice;

        // [snip]... constructor, setters, getters... [/snip]

        public boolean requestconfirmation(final String email, final String url) {
            final User user = userservice.findbyemail(email);
            final String uuidstring = SecurityHelper.generateUniqueCode();
            resetcodes.put(uuidstring, user.getid());
            // [snip]... send email... [/snip]
            return true;
        }

        public boolean generatenewpassword(final String code) {
            final Long userid = resetcodes.remove(code);
            if (null == userid) return false;
            final User user = userservice.findbyid(userid);
            final String uuidstring = SecurityHelper.generateUniqueCode();
            // [snip]... send email... [/snip]
            user.setandencodepassword(uuidstring);
            user.setconfirmed(true);
            userservice.store(user);
            return true;
        }
    }

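The slide relies on the clustered map to evict abandoned reset requests. As a rough illustration of that idea only, here is a minimal single-JVM sketch built on `ConcurrentHashMap`. The class `ExpiringCodeStore`, its methods, and the explicit `nowMillis` parameter are all hypothetical names chosen for this example; this is not Terracotta's `DistributedMap` API and it does no clustering.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Single-JVM stand-in for the clustered reset-code map; all names are hypothetical.
class ExpiringCodeStore {
    // Value plus the time (in millis) at which it is considered expired.
    private static final class Entry {
        final long userId;
        final long expiresAtMillis;

        Entry(long userId, long expiresAtMillis) {
            this.userId = userId;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> codes = new ConcurrentHashMap<>();
    private final long ttlMillis;

    ExpiringCodeStore(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    // Remember which user a confirmation code belongs to.
    void put(String code, long userId, long nowMillis) {
        codes.put(code, new Entry(userId, nowMillis + ttlMillis));
    }

    // Consume a code: returns the user id once, or null if the code is
    // unknown, already used, or expired. remove() makes the hand-off atomic.
    Long consume(String code, long nowMillis) {
        Entry e = codes.remove(code);
        if (e == null || nowMillis >= e.expiresAtMillis) {
            return null;
        }
        return e.userId;
    }

    // Drop expired entries so abandoned requests do not accumulate.
    // The returned count is only exact when no other thread mutates the map.
    int evictExpired(long nowMillis) {
        int before = codes.size();
        codes.values().removeIf(e -> nowMillis >= e.expiresAtMillis);
        return before - codes.size();
    }

    int size() {
        return codes.size();
    }
}
```

Passing the clock value in explicitly keeps the sketch deterministic; a real service would more likely read the time itself and run eviction on a schedule.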
DEMO

Single Sign-on
- Read-only caching
- Clustered login through HTTP session
- Leverages Spring Security
- Just flip the switch through configuration

    <tc:tc-config>
      <!-- server array config -->
      <clients>
        <modules>
          <module name="tim-tomcat-6.0" version="1.0.0"/>
          <!-- other modules -->
        </modules>
      </clients>
      <application>
        <!-- other application config -->
        <spring>
          <jee-application name="examinator">
            <session-support>true</session-support>
          </jee-application>
          <!-- other spring config -->
        </spring>
      </application>
    </tc:tc-config>

DEMO

Admin editable exam definitions
- Hibernate 2nd level cache
- Don't hit the database for infrequently changed data
- Hibernate second level cache shares Exam data amongst users
- Dehydrated cached entity data is clustered through Terracotta across nodes
- Use tim-hibernate-cache module
- No changes to how you use the 2nd level cache

    <tc:tc-config>
      <!-- server array config -->
      <clients>
        <modules>
          <module name="tim-hibernate-3.2.5" version="1.4.0"/>
          <module name="tim-hibernate-cache-3.2" version="1.0.0"/>
          <!-- other modules -->
        </modules>
      </clients>
      <!-- application config -->
    </tc:tc-config>

    <bean id="entitymanagerfactory"
          class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
      <property name="dataSource" ref="datasource" />
      <property name="jpaVendorAdapter" ref="hibernatejpavendoradapter" />
      <property name="persistenceUnitName" value="exam" />
      <property name="jpaProperties">
        <value>
          # Auto-detect annotated JPA entities
          hibernate.archive.autodetection = class
          # Caching
          hibernate.cache.provider_class = org.terracotta.modules.hibernatecache.TerracottaHibernateCacheProvider
          hibernate.cache.use_second_level_cache = true
        </value>
      </property>
    </bean>

    @Entity
    @Table(name = "EXAM")
    @Cache(usage = CacheConcurrencyStrategy.NONSTRICT_READ_WRITE)
    @InstrumentedClass
    @HonorTransient
    public class Exam {
        // [snip]... domain model fields and methods... [/snip]
    }

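To make the "don't hit the database for infrequently changed data" point concrete, here is a conceptual read-through cache in plain Java. It is only a sketch of the idea behind a second-level cache, not how Hibernate or the Terracotta cache provider implement it; `ReadThroughCache`, its `loader` function (standing in for a database query), and `loadCount` are hypothetical names.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Conceptual stand-in for a second-level cache; not Hibernate's implementation.
class ReadThroughCache<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // hypothetical "load from database" call
    private final AtomicInteger loads = new AtomicInteger();

    ReadThroughCache(Function<K, V> loader) {
        this.loader = loader;
    }

    // Return the cached value, invoking the loader only when the key is absent.
    V get(K key) {
        return cache.computeIfAbsent(key, k -> {
            loads.incrementAndGet();
            return loader.apply(k);
        });
    }

    // Called after an admin edit so the next read reloads fresh data.
    void invalidate(K key) {
        cache.remove(key);
    }

    // Number of times the loader (the "database") was actually hit.
    int loadCount() {
        return loads.get();
    }
}
```

For example, with `new ReadThroughCache<Long, String>(id -> "exam-" + id)`, reading the same exam twice invokes the loader once; after `invalidate`, the next read invokes it again.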
Taking durable, transactional exams
- Medium term + key-value + write-behind
- Leverage Spring Web Flow's conversations
- Cluster conversations through HTTP session
  - Simply flip the switch through config again
- Store active exams in clustered map
- Use async write-behind to database for exam results
  - Terracotta provides a TIM: tim-async
  - Fault-tolerant, HA, locality aware, work stealing, ...

    public class ExamSessionServiceImpl implements ExamSessionService {
        @Root
        private final ConcurrentMap<String, ExamSession> ongoingexams =
            new ConcurrentStringMap<ExamSession>();
        @Root
        private final AsyncCoordinator<ExamResult> asynccommitter =
            new AsyncCoordinator<ExamResult>();
        // [snip]... other fields... [/snip]

        @Autowired
        public ExamSessionServiceImpl(final ExamService examservice,
                                      final UserService userservice,
                                      final ExamResultCommitHandler handler) {
            this.examservice = examservice;
            this.userservice = userservice;
            asynccommitter.start(handler);
        }

        public ExamResult endexam(final String username) throws ExamException {
            final ExamSession examsession = getexamsession(username);
            final ExamResult result = getexamresult(examsession);
            asynccommitter.add(result);
            final boolean removed = ongoingexams.remove(username, examsession);
            if (removed) {
                removetimeouttask(username);
            }
            return result;
        }

        // [snip]... other service methods using ongoingexams, etc... [/snip]
    }

    @Service
    public class ExamResultCommitHandler implements ItemProcessor<ExamResult> {
        private final ExamService examservice;

        @Autowired
        public ExamResultCommitHandler(final ExamService examservice) {
            this.examservice = examservice;
        }

        public void process(final ExamResult result) {
            // We use UUID for the exam result ID, this is set as the value of the
            // identifier property before actually storing the entity in the
            // database. If the node would go down before actually removing the
            // result from the queue, it will be processed again later.
            if (result.getid() != null) {
                if (examservice.examresultexistsbyid(result.getid())) {
                    // this entity was already persisted
                    return;
                } else {
                    // the entity was not persisted, but the ID was set
                    // clear the ID so that it will be persisted below
                    result.setid(null);
                }
            }

            // save the new exam result
            examservice.saveexamresult(result);
        }
    }

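The commit handler above has to tolerate the same result being delivered twice. The following is a simplified single-JVM sketch of that write-behind pattern with an idempotent consumer. `WriteBehindSketch`, `ResultStore`, and `Result` are hypothetical names, the in-memory map stands in for the database, and unlike the handler on the slide this version keeps the ID rather than clearing it. It does not use the tim-async API and is not thread-safe.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Single-JVM sketch of write-behind with at-least-once delivery.
// All names are hypothetical; not thread-safe, for illustration only.
class WriteBehindSketch {
    // Stand-in for the database table of exam results, keyed by result ID.
    static final class ResultStore {
        private final Map<String, Integer> rows = new HashMap<>();
        private int inserts = 0;

        boolean exists(String id) {
            return rows.containsKey(id);
        }

        void insert(String id, int score) {
            rows.put(id, score);
            inserts++;
        }

        int insertCount() {
            return inserts;
        }
    }

    static final class Result {
        final String id;
        final int score;

        Result(String id, int score) {
            this.id = id;
            this.score = score;
        }
    }

    private final Queue<Result> pending = new ArrayDeque<>();
    private final ResultStore store;

    WriteBehindSketch(ResultStore store) {
        this.store = store;
    }

    // Called on the request path: enqueue only, no database work.
    void add(Result r) {
        pending.add(r);
    }

    // Called later by a background worker. Results that were already stored
    // are skipped, so a redelivered item does not produce a duplicate row.
    void drain() {
        Result r;
        while ((r = pending.poll()) != null) {
            if (!store.exists(r.id)) {
                store.insert(r.id, r.score);
            }
        }
    }
}
```

Adding the same `Result` twice and then calling `drain()` yields a single insert, which is the property that makes reprocessing after a node failure safe.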
DEMO

Performance Results 1/2
- Test taking application
  - 20,000 online concurrent test takers
  - MySQL reads only to load test definitions once
  - MySQL writes only to flush test when done
- 16 JVMs and 2 mirrored Terracotta servers
  - Dual core, SLES, RHEL, 2GB heap (Application)
  - 8 core, RHEL, 2GB heap (Terracotta)

Performance Results 2/2
- Latency
  - 5ms average
  - Tight distribution / variation
  - No outliers >10ms
  - When run on VMware: 6ms latency
- Utilization
  - ~30% for app JVMs
  - ~50% for TC servers
  - Near 0% for MySQL!!!

Getting started
- Have a target application?
  - Start by identifying data that can benefit from being managed by Terracotta
    - Web Flows that can detach from DB
    - Medium-term middle tier state that needs to persist
    - Profile your DB queries for caching opportunities
    - Key-value stores that benefit from high locality
- No target?
  - Download the Examinator
  - Learn how it works and augment it to represent your opportunities

- www.springsource.com
- www.terracotta.org
  - Examinator on Terracotta's home page
- SpringSource team blog: blog.springsource.com
- Ari's Blog: blog.terracottatech.com
- Alex's Blog: tech.puredanger.com
- Spring in Action, Spring Recipes: recent top sellers
- Definitive Guide to Terracotta: for sale on Amazon or Apress.com