Virtual file system on NoSQL for processing high volumes of HL7 messages


Digital Healthcare Empowering Europeans
R. Cornet et al. (Eds.)
2015 European Federation for Medical Informatics (EFMI).
This article is published online with Open Access by IOS Press and distributed under the terms of the Creative Commons Attribution Non-Commercial License.
doi: /

Eizen KIMURA¹ and Ken ISHIHARA
Dept. Medical Informatics of Medical School of Ehime University

Abstract. The Standardized Structured Medical Information Exchange (SS-MIX) is intended to be the standard repository for HL7 messages that depend on a local file system. However, its scalability is limited. We implemented a virtual file system using NoSQL to incorporate modern computing technology into SS-MIX and allow the system to integrate local patient IDs from different healthcare systems into a universal system. We discuss its implementation using the database MongoDB and describe its performance in a case study.

Keywords. HL7, Data Science, Distributed Computing, NoSQL, SS-MIX

Introduction

Leveraging big data analysis in the medical domain could break new ground in the management of lifestyle-related diseases and increase the speed of drug development. The US government recently launched the Big Data Research and Development Initiative, with the NIH announcing that more than 200 terabytes of genomic data from the 1000 Genomes Project would be available on Amazon Web Services [1]. Conversely, in Japan, the National Database (NDB) already contains more than 6.9 billion health insurance claims, yet there is still no concrete plan to develop a big data analysis framework. In 2006, the Ministry of Health, Labour and Welfare introduced the Standardized Structured Medical Information Exchange project to promote the exchange of health information among institutions [2].
The design concept of SS-MIX aims at simplicity by making use of standard file systems and storing HL7 messages in a standard directory structure. However, its development started before the Internet era, and it is intended for use with local file systems. The idea was for a clinic or hospital to be able to provide patient data on a portable storage device (e.g., CD-ROM, USB memory stick), enabling a patient to take the data to another institution. SS-MIX also lacks a distributed data processing scheme and has limited scalability. Moreover, in Japan, there is no nationwide patient ID system, but rather institution-specific IDs, regional patient IDs, clinical research registration IDs, and so on. A scheme needs to be developed to aggregate these IDs and medical records, and to enable cross-sectional analysis. One way forward would be to preserve the simplicity of SS-MIX, but add capabilities including large-scale storage, high-speed search, distributed processing, and the ability to aggregate multiple patient IDs into unique nationwide IDs.

Google has already built a distributed data storage system called BigTable [3]. By separating user data based on metadata, it offers an individualised user experience, although all of the resources of each user are accumulated on a single cloud storage system [4]. In the present study, we leverage cloud technology to aggregate all patient medical records in SS-MIX, improve its search and distributed processing performance, and share medical information with stakeholders securely. To this end, we applied the same metadata scheme used in BigTable.

¹ Corresponding Author. Eizen Kimura, Medical School of Ehime Univ. [email protected]

Fig. 1 Storing and MapReduce HL7 messages on virtual file system

Panel "Convert HL7 messages to BSON for MongoDB store": an Original Raw HL7 Message (an ORU^R01 lipid panel result with OBX entries JHDL, JTRIG, JVLDL, JLDL and JCHO) is converted to a Serialized XML Message and then to a BSON Message. Each field is named after its segment and position, and each component becomes a child entry, for example "OBX_3"=>{"OBX_3_0"=>"JHDL", "OBX_3_1"=>"HDL Cholesterol (CAD)"} and "OBX_5"=>"62". Empty fields are stored as nil, and the OBX segments are stored as an array.

Panel "Import BSON Messages into MongoDB Sharding Clusters": Config Server (mongod), Routing Server (mongos), Sharding Nodes (mongod).

Panel "Mongo Map Reduce Framework": Query, Mapping, Shuffling, Reducing, Final Result.

Question: The average value of cholesterol of male adults (age: )
Select HL7 records satisfying the following conditions: sex is male (PID-8) AND birth date is between 1936/04/01 and 1972/03/31 AND OBX has a JHDL entry.
Retrieve the cholesterol laboratory result from the 5th field of OBX.
Count the number of results and sum the cholesterol values.
Calculate the average value of the collected cholesterol values.

    // Map: emit one (sum, count) pair per JHDL observation in a matched message.
    var map = function() {
      for (var idx in this['HL7Message']['OBX']) {
        if (this['HL7Message']['OBX'][idx]['OBX_3']['OBX_3_0'] == "JHDL") {
          var key = "JHDL";
          var value = { sum: parseInt(this['HL7Message']['OBX'][idx]['OBX_5']), count: 1 };
          emit(key, value);
        }
      }
    };
    // Reduce: add up sums and counts, skipping values that did not parse as numbers.
    var reduce = function(key, values) {
      var reducedVal = { sum: 0, count: 0 };
      values.forEach(function(value) {
        if (!isNaN(value.sum)) {
          reducedVal.sum += value.sum;
          reducedVal.count += value.count;
        }
      });
      return reducedVal;
    };
    // Finalize: derive the average from the reduced sum and count.
    var finalize = function(key, reducedVal) {
      return {
        sum: reducedVal.sum,
        count: reducedVal.count,
        average: reducedVal.sum / reducedVal.count
      };
    };
    var res = db.somecoll.mapReduce(map, reduce, {
      finalize: finalize,
      out: { replace: "map_reduce_example" },
      query: {
        "HL7Message.PID.PID_8": "M",
        "HL7Message.PID.PID_7": { "$gte": 19360401, "$lte": 19720331 },
        "HL7Message.OBX.OBX_3.OBX_3_0": "JHDL"
      }
    });

1. Methods

The virtual file system uses MongoDB (version 2.4.9) as the NoSQL backend. MongoDB offers distributed processing on multiple nodes via sharding [3]; our sharded cluster consists of 10 nodes. One node runs the routing server process (mongos), one (mongod) mediates the interaction between sharding nodes and clients, and the remaining nodes process the distributed data. Each node is deployed on Science Cloud [5] and runs CentOS 5.7 (64 bit) on an Intel Xeon X5675 at 3.97 GHz with 12 cores, 96 GB RAM, and 10 Gbps x 2 network links. The General Parallel File System (GPFS) [6] was built on a RAID6 system consisting of 600 sets of 3-TB/7200-rpm hard disks; we linked this system to the 10 nodes using 10 Gbps connections.

Because MongoDB uses Binary JavaScript Object Notation (BSON) as an internal representation [7], we developed a tool that converts raw HL7 ver 2.x messages into BSON format and then stores the converted messages under an HL7Message node of the document (Figs. 1, 2). It splits the HL7 message at every separator, arranges the contents in accordance with the hierarchical structure of the BSON document, and assigns consecutive numbers to each element. It also extracts the patient ID, institutional ID, and the type of message from the original HL7 message and arranges this metadata, which simulates SS-MIX standard storage, under the SS-MIX node of the BSON document (Fig. 1).

The virtual file system that simulates SS-MIX storage was developed using Filesystem in Userspace (FUSE) [8]. It mounts a virtual SS-MIX storage system on the host and translates file system requests into MongoDB queries and the query results into file system responses. The FUSE module was developed using Ruby and the FUSEfs module. It simulates the file system hierarchy using the metadata under the SS-MIX node in the BSON document (Fig. 1).

The tool for aggregating patient IDs adds an entry containing a universal patient ID and a new institution ID to the existing metadata under the SS-MIX node. This makes it possible to search all medical records for any given patient across healthcare facilities.

The system must be able to efficiently register data from nationwide healthcare systems in real time. To test this, we performed several evaluations. First, to assess the relationship between the number of sharding nodes and the performance of data registration, we averaged the processing time over five runs. We simulated a case in which 40 clients each sent a message that included 100 HL7 messages and repeated this 500 times. Thus, we determined the registration time to process 2 million HL7 messages.
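The conversion tool described in this section splits each message at its separators and numbers the pieces by position. The following is a minimal JavaScript sketch of that idea, not the authors' Ruby tool. The function name `hl7ToDocument` and the sample message are hypothetical, the separators are fixed to the common defaults (`|` for fields, `^` for components), and storing a repeated segment type as an array is an assumption based on how the OBX entries appear in Fig. 1.

```javascript
// Sketch: convert one raw HL7 v2 message into a nested, position-numbered object.
// Assumption: default separators; repetitions ("~") and escape sequences are ignored.
function hl7ToDocument(raw) {
  const doc = {};
  const segments = raw.split(/\r\n|\r|\n/).filter((line) => line.length > 0);
  for (const line of segments) {
    const fields = line.split("|");
    const name = fields[0];
    const segment = {};
    fields.forEach((field, i) => {
      const key = `${name}_${i}`;
      if (field === "") {
        segment[key] = null; // empty field
      } else if (field.includes("^") && !(name === "MSH" && i === 1)) {
        // Components become a nested object: OBX_3 -> OBX_3_0, OBX_3_1, ...
        const parts = {};
        field.split("^").forEach((part, j) => {
          parts[`${key}_${j}`] = part === "" ? null : part;
        });
        segment[key] = parts;
      } else {
        segment[key] = field; // plain value, or the MSH encoding characters
      }
    });
    // Assumption: a segment type that repeats (e.g. OBX) is stored as an array.
    if (!(name in doc)) {
      doc[name] = segment;
    } else if (Array.isArray(doc[name])) {
      doc[name].push(segment);
    } else {
      doc[name] = [doc[name], segment];
    }
  }
  return { HL7Message: doc };
}

// Made-up sample message, for illustration only.
const sample = [
  "MSH|^~\\&|XXXX|C",
  "PID|1||12345||LASTNAME^FIRSTNAME^INIT||19600101|M",
  "OBX|1|NM|JHDL^HDL Cholesterol (CAD)|1|62|CD:289^mg/dL",
  "OBX|2|NM|JTRIG^Triglyceride (CAD)|1|72|CD:289^mg/dL",
].join("\r");

const converted = hl7ToDocument(sample);
console.log(JSON.stringify(converted.HL7Message.OBX[0].OBX_3));
```

Because every value keeps its positional name, a dotted path such as `HL7Message.PID.PID_8` addresses the same field in every stored message, which is what the query in Fig. 1 relies on.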
We repeated this registration test using different numbers of nodes, from one to eight. Next, we investigated how the numbers of concurrent connections and of bulk-transferred HL7 messages affected registration performance. With a sharding setting of eight nodes, we measured the average number of registrations while varying the number of concurrent connections (from 1 to 40 clients) and the number of HL7 messages per inquiry (1000, 2000, 4000, and 8000). Each condition was repeated 500 times.

To evaluate performance in processing distributed data, we prepared a MapReduce scenario that collected laboratory data on high-density lipoprotein (HDL) cholesterol levels of men aged years in April. First, the system performed a query to narrow down the HL7 records to only those that matched our conditions (gender, PID-8; birth date, PID-7; lab test result whose OBX-3 is JHDL). In the Map process, the system extracts the JHDL laboratory test results from the OBX-5 field of the OBX segments in the previously matched HL7 messages. In the Reduce process, it counts the number of laboratory tests and sums the laboratory test result values from every node. In the finalizing process, it calculates the average value for HDL cholesterol from the previously collected values. We conducted this process 10 times, changing the number of nodes and the number of HL7 messages involved, and determined the average processing time.

2. Results

The average size of HL7 messages was 824 bytes, and that of BSON-converted ones was 3568 bytes. When 100 million HL7 messages were stored in MongoDB, its physical volume was GB. Figure 3 shows the relationship between the number of nodes and data registration performance. Registration performance increased up to four nodes, after which it remained constant. Figure 4 shows the relationship between the number of concurrent connections and the number of bulk-transferred HL7 messages. As the number of concurrent connections increased, the registration performance improved, up to 34 simultaneous connections. At that point, registration peaked at 7664 messages per second and thereafter reached a plateau. The number of bulk-transferred messages had no impact on the overall performance. As long as every node held fewer than 30 million messages, the MapReduce processing time was inversely proportional to the number of nodes. The processing time was measured as t = /x (s) (R² = 0.987), where x is the number of nodes, and it shows O(n)-order performance scaling.

Fig. 2 SS-MIX schema on MongoDB
Fig. 3 Performance of bulk message transfer
Fig. 4 Performance of MongoDB sharding
Fig. 5 MapReduce processing time

3. Discussion

MongoDB uses sharding keys to maintain O(n)-order search performance as a whole when sharding nodes are added. We had to use an assigned, evenly distributed ID rather than the patient ID as the sharding key, because patient IDs are known to vary considerably in distribution. This method shows high scalability in processing cross tabulations by reducing the need to cross-reference data on other nodes. As MongoDB depends on memory-mapped files [9], its performance is reduced greatly when its contents exceed the capacity of the server memory. According to our tests, performance degraded once 30 million messages were stored in a single node. MongoDB is a document-oriented NoSQL database that uses the BSON format as its internal storage representation and allows indexing of all document contents. Hence, we believe that MongoDB is suitable as the NoSQL infrastructure for structured documents such as HL7 CDA R2. Our system converts raw HL7 messages into BSON format, and it proved to be scalable. Assuming a server akin to what we used in the present study, 55 nodes will be sufficient to process one billion HL7 records from all healthcare institutions in Japan. Our system may conduct a MapReduce process in minutes and handle real-time streaming of laboratory results to detect anomalies, such as signs of infectious disease spread. However, in our tests, registration performance eventually reached a plateau at the routing server. Hence, we will have to increase the number of routing server nodes to avoid this problem in the future.

A previous study [10] used a system configuration similar to ours. The main difference is that it stores data files on a legacy file system, builds metadata indexes for the files, and stores those indexes in MongoDB. It provides a virtual file system that shows only the files selected by queries against the metadata. In contrast, our system stores the data directly in MongoDB and adds the metadata needed to simulate a virtual file system, which overcomes the performance limitation of a legacy file system. A healthcare setting can mount its data through a virtual file system, separated from the data of other healthcare settings. Although our approach does not need a preexisting file system, the virtual file system was required to ensure compatibility with legacy applications, which expect SS-MIX storage to be a file system.

HL7 has been developing an innovative standards framework, Fast Healthcare Interoperability Resources (FHIR), for sharing medical information [11]. In its specification, FHIR adopts JSON as a standard representation format, which has side-by-side compatibility with BSON. Therefore, FHIR documents will be prime targets for parallel distributed processing as soon as they are stored in MongoDB. We will verify whether FHIR is a suitable format for distributed processing in cloud computing.
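The MapReduce scenario from the Methods (query, Map, Reduce, finalize) can be tried without a MongoDB cluster. The sketch below is a stand-in under stated assumptions, not the authors' setup: `runMapReduce` is a hypothetical single-process driver that imitates the query, map, shuffle, reduce, and finalize stages, the sample documents are made up, and birth dates are assumed to be stored as YYYYMMDD strings so that string comparison orders them correctly.

```javascript
// Map: emit one (sum, count) pair for every JHDL observation in a document.
function mapHdl(doc, emit) {
  for (const obx of doc.HL7Message.OBX) {
    if (obx.OBX_3.OBX_3_0 === "JHDL") {
      emit("JHDL", { sum: parseInt(obx.OBX_5, 10), count: 1 });
    }
  }
}

// Reduce: add up sums and counts, skipping results that did not parse as numbers.
function reduceHdl(key, values) {
  const reduced = { sum: 0, count: 0 };
  for (const value of values) {
    if (!Number.isNaN(value.sum)) {
      reduced.sum += value.sum;
      reduced.count += value.count;
    }
  }
  return reduced;
}

// Finalize: derive the average from the reduced sum and count.
function finalizeHdl(key, reduced) {
  return { sum: reduced.sum, count: reduced.count, average: reduced.sum / reduced.count };
}

// Hypothetical driver: filter (query), map, group by key (shuffle), reduce, finalize.
function runMapReduce(docs, query, map, reduce, finalize) {
  const groups = new Map();
  for (const doc of docs.filter(query)) {
    map(doc, (key, value) => {
      if (!groups.has(key)) groups.set(key, []);
      groups.get(key).push(value);
    });
  }
  const results = {};
  for (const [key, values] of groups) {
    results[key] = finalize(key, reduce(key, values));
  }
  return results;
}

// Made-up documents in the converted layout (only the fields used here).
const makeDoc = (sex, birth, hdl) => ({
  HL7Message: {
    PID: { PID_7: birth, PID_8: sex },
    OBX: [{ OBX_3: { OBX_3_0: "JHDL" }, OBX_5: hdl }],
  },
});
const docs = [
  makeDoc("M", "19500101", "62"),
  makeDoc("M", "19650615", "48"),
  makeDoc("F", "19550303", "70"), // excluded by sex
  makeDoc("M", "19800101", "55"), // excluded by birth date
];

// Query stage: male patients born between 1936/04/01 and 1972/03/31.
const query = (d) =>
  d.HL7Message.PID.PID_8 === "M" &&
  d.HL7Message.PID.PID_7 >= "19360401" &&
  d.HL7Message.PID.PID_7 <= "19720331";

const result = runMapReduce(docs, query, mapHdl, reduceHdl, finalizeHdl);
console.log(result.JHDL);
```

The reduce step only adds sums and counts, so partial results computed on separate nodes can be combined in any order before the average is taken in the finalize step.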
Acknowledgements: Data processing and other research was performed using the NICT Science Cloud at the National Institute of Information and Communications Technology (NICT) as a collaborative research project. This work was supported by MEXT KAKENHI Grant Number

References

[1] Office of Science and Technology Policy. Obama Administration Unveils Big Data Initiative: Announces $200 Million in New R&D Investments. 2012; Available from:
[2] Kimura M, Nakayasu K, Ohshima Y, Fujita N, Nakashima N, Jozaki H, et al. SS-MIX: A Ministry Project to Promote Standardized Healthcare Information Exchange. Methods of Information in Medicine. 2011;50(2):131.
[3] Chodorow K. Scaling MongoDB: O'Reilly Media, Inc.;
[4] Cooper J. How Entities and Indexes are Stored. 2009; Available from:
[5] Murata KT, Watari S, Nagatsuma T, Kunitake M, Watanabe H, Yamamoto K, et al. A Science Cloud for Data Intensive Sciences. Data Science Journal. 2013;12:WDS139-WDS46.
[6] Schmuck FB, Haskin RL. GPFS: A Shared-Disk File System for Large Computing Clusters. FAST. 2002;2:19.
[7] Cattell R. Scalable SQL and NoSQL data stores. ACM SIGMOD Record. 2011;39(4):
[8] Szeredi M. Filesystem in Userspace. 2013; Available from:
[9] Parker Z, Poe S, Vrbsky SV. Comparing NoSQL MongoDB to an SQL DB. Proceedings of the 51st ACM Southeast Conference; Savannah, Georgia: ACM; p
[10] Jacobi MR, editor. Applied Parallel Metadata Indexing. Conference: 4th Annual Computing and Information Technology Student Mini-Showcase; 2012: Los Alamos National Laboratory (LANL).
[11] HL7. FHIR: Fast healthcare interoperability resources [cited /17]; Available from:


An Oracle White Paper November 2010. Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics An Oracle White Paper November 2010 Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics 1 Introduction New applications such as web searches, recommendation engines,

More information

MongoDB. The Definitive Guide to. The NoSQL Database for Cloud and Desktop Computing. Apress8. Eelco Plugge, Peter Membrey and Tim Hawkins

MongoDB. The Definitive Guide to. The NoSQL Database for Cloud and Desktop Computing. Apress8. Eelco Plugge, Peter Membrey and Tim Hawkins The Definitive Guide to MongoDB The NoSQL Database for Cloud and Desktop Computing 11 111 TECHNISCHE INFORMATIONSBIBLIO 1 HEK UNIVERSITATSBIBLIOTHEK HANNOVER Eelco Plugge, Peter Membrey and Tim Hawkins

More information

Open Source Technologies on Microsoft Azure

Open Source Technologies on Microsoft Azure Open Source Technologies on Microsoft Azure A Survey @DChappellAssoc Copyright 2014 Chappell & Associates The Main Idea i Open source technologies are a fundamental part of Microsoft Azure The Big Questions

More information

Big Data and Hadoop with components like Flume, Pig, Hive and Jaql

Big Data and Hadoop with components like Flume, Pig, Hive and Jaql Abstract- Today data is increasing in volume, variety and velocity. To manage this data, we have to use databases with massively parallel software running on tens, hundreds, or more than thousands of servers.

More information

Assignment # 1 (Cloud Computing Security)

Assignment # 1 (Cloud Computing Security) Assignment # 1 (Cloud Computing Security) Group Members: Abdullah Abid Zeeshan Qaiser M. Umar Hayat Table of Contents Windows Azure Introduction... 4 Windows Azure Services... 4 1. Compute... 4 a) Virtual

More information

How To Store Data On An Ocora Nosql Database On A Flash Memory Device On A Microsoft Flash Memory 2 (Iomemory)

How To Store Data On An Ocora Nosql Database On A Flash Memory Device On A Microsoft Flash Memory 2 (Iomemory) WHITE PAPER Oracle NoSQL Database and SanDisk Offer Cost-Effective Extreme Performance for Big Data 951 SanDisk Drive, Milpitas, CA 95035 www.sandisk.com Table of Contents Abstract... 3 What Is Big Data?...

More information

Binary search tree with SIMD bandwidth optimization using SSE

Binary search tree with SIMD bandwidth optimization using SSE Binary search tree with SIMD bandwidth optimization using SSE Bowen Zhang, Xinwei Li 1.ABSTRACT In-memory tree structured index search is a fundamental database operation. Modern processors provide tremendous

More information

Big data and urban mobility

Big data and urban mobility Big data and urban mobility Antònia Tugores,PereColet Instituto de Física Interdisciplinar y Sistemas Complejos, IFISC(UIB-CSIC) Abstract. Data sources have been evolving the last decades and nowadays

More information

Bi-Directional Interface between EMR and Quest Diagnostics Microsoft.NET with SQL Server Reporting Services solution for Healthcare Company

Bi-Directional Interface between EMR and Quest Diagnostics Microsoft.NET with SQL Server Reporting Services solution for Healthcare Company Bi-Directional Interface between EMR and Quest Diagnostics Microsoft.NET with SQL Server Reporting Services solution for Healthcare Company Executive Summary One of our EMR clients approached us to setup

More information

Building Heavy Load Messaging System

Building Heavy Load Messaging System CASE STUDY Building Heavy Load Messaging System About IntelliSMS Intelli Messaging simplifies mobile communication methods so you can cost effectively build mobile communication into your business processes;

More information

Cloud Scale Distributed Data Storage. Jürmo Mehine

Cloud Scale Distributed Data Storage. Jürmo Mehine Cloud Scale Distributed Data Storage Jürmo Mehine 2014 Outline Background Relational model Database scaling Keys, values and aggregates The NoSQL landscape Non-relational data models Key-value Document-oriented

More information

Leveraging the Power of SOLR with SPARK. Johannes Weigend QAware GmbH Germany pache Big Data Europe September 2015

Leveraging the Power of SOLR with SPARK. Johannes Weigend QAware GmbH Germany pache Big Data Europe September 2015 Leveraging the Power of SOLR with SPARK Johannes Weigend QAware GmbH Germany pache Big Data Europe September 2015 Welcome Johannes Weigend - CTO QAware GmbH - Software architect / developer - 25 years

More information

Implement Hadoop jobs to extract business value from large and varied data sets

Implement Hadoop jobs to extract business value from large and varied data sets Hadoop Development for Big Data Solutions: Hands-On You Will Learn How To: Implement Hadoop jobs to extract business value from large and varied data sets Write, customize and deploy MapReduce jobs to

More information

Making Sense ofnosql A GUIDE FOR MANAGERS AND THE REST OF US DAN MCCREARY MANNING ANN KELLY. Shelter Island

Making Sense ofnosql A GUIDE FOR MANAGERS AND THE REST OF US DAN MCCREARY MANNING ANN KELLY. Shelter Island Making Sense ofnosql A GUIDE FOR MANAGERS AND THE REST OF US DAN MCCREARY ANN KELLY II MANNING Shelter Island contents foreword preface xvii xix acknowledgments xxi about this book xxii Part 1 Introduction

More information

Big Data and Hadoop with Components like Flume, Pig, Hive and Jaql

Big Data and Hadoop with Components like Flume, Pig, Hive and Jaql Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 7, July 2014, pg.759

More information

Challenges for Data Driven Systems

Challenges for Data Driven Systems Challenges for Data Driven Systems Eiko Yoneki University of Cambridge Computer Laboratory Quick History of Data Management 4000 B C Manual recording From tablets to papyrus to paper A. Payberah 2014 2

More information

Big Systems, Big Data

Big Systems, Big Data Big Systems, Big Data When considering Big Distributed Systems, it can be noted that a major concern is dealing with data, and in particular, Big Data Have general data issues (such as latency, availability,

More information

IMPLEMENTING GREEN IT

IMPLEMENTING GREEN IT Saint Petersburg State University of Information Technologies, Mechanics and Optics Department of Telecommunication Systems IMPLEMENTING GREEN IT APPROACH FOR TRANSFERRING BIG DATA OVER PARALLEL DATA LINK

More information

Problem Solving Hands-on Labware for Teaching Big Data Cybersecurity Analysis

Problem Solving Hands-on Labware for Teaching Big Data Cybersecurity Analysis , 22-24 October, 2014, San Francisco, USA Problem Solving Hands-on Labware for Teaching Big Data Cybersecurity Analysis Teng Zhao, Kai Qian, Dan Lo, Minzhe Guo, Prabir Bhattacharya, Wei Chen, and Ying

More information

CONFIGURATION GUIDELINES: EMC STORAGE FOR PHYSICAL SECURITY

CONFIGURATION GUIDELINES: EMC STORAGE FOR PHYSICAL SECURITY White Paper CONFIGURATION GUIDELINES: EMC STORAGE FOR PHYSICAL SECURITY DVTel Latitude NVMS performance using EMC Isilon storage arrays Correct sizing for storage in a DVTel Latitude physical security

More information

DYNAMIC QUERY FORMS WITH NoSQL

DYNAMIC QUERY FORMS WITH NoSQL IMPACT: International Journal of Research in Engineering & Technology (IMPACT: IJRET) ISSN(E): 2321-8843; ISSN(P): 2347-4599 Vol. 2, Issue 7, Jul 2014, 157-162 Impact Journals DYNAMIC QUERY FORMS WITH

More information

NoSQL - What we ve learned with mongodb. Paul Pedersen, Deputy CTO [email protected] DAMA SF December 15, 2011

NoSQL - What we ve learned with mongodb. Paul Pedersen, Deputy CTO paul@10gen.com DAMA SF December 15, 2011 NoSQL - What we ve learned with mongodb Paul Pedersen, Deputy CTO [email protected] DAMA SF December 15, 2011 DW2.0 and NoSQL management decision support intgrated access - local v. global - structured v.

More information

Supporting in- and off-hospital Patient Management Using a Web-based Integrated Software Platform

Supporting in- and off-hospital Patient Management Using a Web-based Integrated Software Platform Digital Healthcare Empowering Europeans R. Cornet et al. (Eds.) 2015 European Federation for Medical Informatics (EFMI). This article is published online with Open Access by IOS Press and distributed under

More information

THE ATLAS DISTRIBUTED DATA MANAGEMENT SYSTEM & DATABASES

THE ATLAS DISTRIBUTED DATA MANAGEMENT SYSTEM & DATABASES THE ATLAS DISTRIBUTED DATA MANAGEMENT SYSTEM & DATABASES Vincent Garonne, Mario Lassnig, Martin Barisits, Thomas Beermann, Ralph Vigne, Cedric Serfon [email protected] [email protected] XLDB

More information

How To Test For Performance And Scalability On A Server With A Multi-Core Computer (For A Large Server)

How To Test For Performance And Scalability On A Server With A Multi-Core Computer (For A Large Server) Scalability Results Select the right hardware configuration for your organization to optimize performance Table of Contents Introduction... 1 Scalability... 2 Definition... 2 CPU and Memory Usage... 2

More information

Performance Analysis of Book Recommendation System on Hadoop Platform

Performance Analysis of Book Recommendation System on Hadoop Platform Performance Analysis of Book Recommendation System on Hadoop Platform Sugandha Bhatia #1, Surbhi Sehgal #2, Seema Sharma #3 Department of Computer Science & Engineering, Amity School of Engineering & Technology,

More information

IBM Rational Asset Manager

IBM Rational Asset Manager Providing business intelligence for your software assets IBM Rational Asset Manager Highlights A collaborative software development asset management solution, IBM Enabling effective asset management Rational

More information

.NET User Group Bern

.NET User Group Bern .NET User Group Bern Roger Rudin bbv Software Services AG [email protected] Agenda What is NoSQL Understanding the Motivation behind NoSQL MongoDB: A Document Oriented Database NoSQL Use Cases What is

More information

Using Synology SSD Technology to Enhance System Performance Synology Inc.

Using Synology SSD Technology to Enhance System Performance Synology Inc. Using Synology SSD Technology to Enhance System Performance Synology Inc. Synology_SSD_Cache_WP_ 20140512 Table of Contents Chapter 1: Enterprise Challenges and SSD Cache as Solution Enterprise Challenges...

More information

Testing Big data is one of the biggest

Testing Big data is one of the biggest Infosys Labs Briefings VOL 11 NO 1 2013 Big Data: Testing Approach to Overcome Quality Challenges By Mahesh Gudipati, Shanthi Rao, Naju D. Mohan and Naveen Kumar Gajja Validate data quality by employing

More information

NoSQL web apps. w/ MongoDB, Node.js, AngularJS. Dr. Gerd Jungbluth, NoSQL UG Cologne, 4.9.2013

NoSQL web apps. w/ MongoDB, Node.js, AngularJS. Dr. Gerd Jungbluth, NoSQL UG Cologne, 4.9.2013 NoSQL web apps w/ MongoDB, Node.js, AngularJS Dr. Gerd Jungbluth, NoSQL UG Cologne, 4.9.2013 About us Passionate (web) dev. since fallen in love with Sinclair ZX Spectrum Academic background in natural

More information

How to Ingest Data into Google BigQuery using Talend for Big Data. A Technical Solution Paper from Saama Technologies, Inc.

How to Ingest Data into Google BigQuery using Talend for Big Data. A Technical Solution Paper from Saama Technologies, Inc. How to Ingest Data into Google BigQuery using Talend for Big Data A Technical Solution Paper from Saama Technologies, Inc. July 30, 2013 Table of Contents Intended Audience What you will Learn Background

More information

EOFS Workshop Paris Sept, 2011. Lustre at exascale. Eric Barton. CTO Whamcloud, Inc. [email protected]. 2011 Whamcloud, Inc.

EOFS Workshop Paris Sept, 2011. Lustre at exascale. Eric Barton. CTO Whamcloud, Inc. eeb@whamcloud.com. 2011 Whamcloud, Inc. EOFS Workshop Paris Sept, 2011 Lustre at exascale Eric Barton CTO Whamcloud, Inc. [email protected] Agenda Forces at work in exascale I/O Technology drivers I/O requirements Software engineering issues

More information

these three NoSQL databases because I wanted to see a the two different sides of the CAP

these three NoSQL databases because I wanted to see a the two different sides of the CAP Michael Sharp Big Data CS401r Lab 3 For this paper I decided to do research on MongoDB, Cassandra, and Dynamo. I chose these three NoSQL databases because I wanted to see a the two different sides of the

More information

Big Data Visualization and Dashboards

Big Data Visualization and Dashboards Big Data Visualization and Dashboards Boney Pandya Marketing Manager Greg Harris Systems Engineer Follow us @Jinfonet #BigDataWebinar JReport Highlights Advanced, Embedded Data Visualization Platform:

More information

MakeMyTrip CUSTOMER SUCCESS STORY

MakeMyTrip CUSTOMER SUCCESS STORY MakeMyTrip CUSTOMER SUCCESS STORY MakeMyTrip is the leading travel site in India that is running two ClustrixDB clusters as multi-master in two regions. It removed single point of failure. MakeMyTrip frequently

More information

Cisco UCS and Fusion- io take Big Data workloads to extreme performance in a small footprint: A case study with Oracle NoSQL database

Cisco UCS and Fusion- io take Big Data workloads to extreme performance in a small footprint: A case study with Oracle NoSQL database Cisco UCS and Fusion- io take Big Data workloads to extreme performance in a small footprint: A case study with Oracle NoSQL database Built up on Cisco s big data common platform architecture (CPA), a

More information

Leveraging BlobSeer to boost up the deployment and execution of Hadoop applications in Nimbus cloud environments on Grid 5000

Leveraging BlobSeer to boost up the deployment and execution of Hadoop applications in Nimbus cloud environments on Grid 5000 Leveraging BlobSeer to boost up the deployment and execution of Hadoop applications in Nimbus cloud environments on Grid 5000 Alexandra Carpen-Amarie Diana Moise Bogdan Nicolae KerData Team, INRIA Outline

More information

NoSQL. Thomas Neumann 1 / 22

NoSQL. Thomas Neumann 1 / 22 NoSQL Thomas Neumann 1 / 22 What are NoSQL databases? hard to say more a theme than a well defined thing Usually some or all of the following: no SQL interface no relational model / no schema no joins,

More information

Manifest for Big Data Pig, Hive & Jaql

Manifest for Big Data Pig, Hive & Jaql Manifest for Big Data Pig, Hive & Jaql Ajay Chotrani, Priyanka Punjabi, Prachi Ratnani, Rupali Hande Final Year Student, Dept. of Computer Engineering, V.E.S.I.T, Mumbai, India Faculty, Computer Engineering,

More information

Tunebot in the Cloud. Arefin Huq 18 Mar 2010

Tunebot in the Cloud. Arefin Huq 18 Mar 2010 Tunebot in the Cloud Arefin Huq 18 Mar 2010 What is Tunebot? What is Tunebot? http://tunebot.cs.northwestern.edu Automated online music search engine for query-by-humming (QBH). What is Tunebot? http://tunebot.cs.northwestern.edu

More information

Business Process Management with @enterprise

Business Process Management with @enterprise Business Process Management with @enterprise March 2014 Groiss Informatics GmbH 1 Introduction Process orientation enables modern organizations to focus on the valueadding core processes and increase

More information

Ad Hoc Analysis of Big Data Visualization

Ad Hoc Analysis of Big Data Visualization Ad Hoc Analysis of Big Data Visualization Dean Yao Director of Marketing Greg Harris Systems Engineer Follow us @Jinfonet #BigDataWebinar JReport Highlights Advanced, Embedded Data Visualization Platform:

More information

An Open Source NoSQL solution for Internet Access Logs Analysis

An Open Source NoSQL solution for Internet Access Logs Analysis An Open Source NoSQL solution for Internet Access Logs Analysis A practical case of why, what and how to use a NoSQL Database Management System instead of a relational one José Manuel Ciges Regueiro

More information

Accelerating Enterprise Applications and Reducing TCO with SanDisk ZetaScale Software

Accelerating Enterprise Applications and Reducing TCO with SanDisk ZetaScale Software WHITEPAPER Accelerating Enterprise Applications and Reducing TCO with SanDisk ZetaScale Software SanDisk ZetaScale software unlocks the full benefits of flash for In-Memory Compute and NoSQL applications

More information

White Paper February 2010. IBM InfoSphere DataStage Performance and Scalability Benchmark Whitepaper Data Warehousing Scenario

White Paper February 2010. IBM InfoSphere DataStage Performance and Scalability Benchmark Whitepaper Data Warehousing Scenario White Paper February 2010 IBM InfoSphere DataStage Performance and Scalability Benchmark Whitepaper Data Warehousing Scenario 2 Contents 5 Overview of InfoSphere DataStage 7 Benchmark Scenario Main Workload

More information