Big Data Processing and Analytics for Mouse Embryo Images
1 Big Data Processing and Analytics for Mouse Embryo Images Liangxiu Han, Zheng Xie, Richard Baldock The AGILE Project team FUNDS Research Group - Future Networks and Distributed Systems School of Computing, Mathematics and Digital Technology, Manchester Metropolitan University Group Webpage:
2 Outline Background Motivation and Challenges Methodology Experimental Evaluation Exploration of Parallelisation Ongoing Work
3 Background - Big Data Era Increasing capability of generating and capturing data: user-generated data (e.g. social media/networks); machine and sensor data (e.g. experimental simulations, environmental sensors, RFID, etc.); open government and public data, etc. Big Data Era: data-intensive, data-centric, data-driven. Data size trends (IDC report): [Chart: data volumes in zettabytes (10^21 bytes)]
4 Background - Big Data Era Data from different domains: astrophysics, biomedical science, geoscience, social science, etc. Facebook: 625,000 TB per day; Sloan Sky Survey: over 40 TB; caBIG: 4.7+ million biomedical images for cancers; gene expression data in GEO and ArrayExpress: over 1 million; GenBank: over
5 Background - Big Data Era [Chart: base pairs against date (mm/yy)] Source: Nature 487, (19 July 2012) doi: /487282a
6 Background - Big Data Era What is Big Data? A relative term (don't define it in terms of size being larger than a certain number of terabytes or petabytes). Data that are larger, more complex, and harder to access, organise and analyse than existing tools can handle (this varies by sector). The volume (amount of data), variety (types), velocity (the growth rate of data coupled with the need to deliver insights and make decisions faster) and complexity (difficulties in transforming and integrating/linking various data) of the data are beyond the capability of an organisation to capture, store, manage and process.
7 Background - Challenges How to filter and reduce the amount of data to enable timely insight and decisions? Data analytics (computational modelling). How to process large-scale data efficiently (data movement)? Data processing (parallel and distributed/high performance computing). [Figure] Source:
8 Background - AGILE Project BBSRC-funded AGILE project: A Cloud Approach to Automatic Gene Expression Pattern Recognition and Annotation over Large-Scale Images. Foundational work: automatic gene expression pattern recognition and annotation (data analytics). Goal I: development of parallel approaches to allow efficient exploitation of Cloud computing (data processing). Goal II: development of a generic data-reuse mechanism and standard services for performance enhancement and cost reduction in the Cloud (data processing).
9 Outline Background Motivation and Challenges Methodology Experimental Evaluation Exploration of Parallelisation Ongoing Work
10 Motivation & Challenges - I Annotation of gene expression patterns: tagging an anatomical term from an ontology with the gene expression patterns of the corresponding anatomical component in images.
11 Motivation & Challenges - II euxaxssay_007708_02.jpg, euxaxssay_007708_06.jpg, euxaxssay_007708_16.jpg
12 Motivation & Challenges - III Gene expression patterns: a way to understand the interaction between genes. The availability of both ontological annotation and spatial gene patterns: a resource to identify the mechanism of embryo organisation. The current manual annotation: costly and time-consuming. Massive amounts of data and a complicated organism: the annotation process needs to be automated.
13 Motivation & Challenges - IV Big data, now over 20 TB. Multiple components coexist in an image. Variable shape, location and orientation of images. The number of images associated with a given gene is uneven. The dimensionality of each image is high (3k x 4k pixels).
14 Outline Background Motivation and Challenges Methodology Experimental Evaluation Exploration of Parallelisation Ongoing Work
15 Methodology - I The Framework
16 Methodology - II Core methodologies: image processing; wavelet transform; Fisher ratio; LDA (SVM, ANN, LSVM)
17 Methodology - III Image Processing - Filtering
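The slide names filtering without saying which filter is used. As one possible illustration only, the sketch below applies a 3x3 median filter, a common choice for removing isolated noisy pixels. The filter type, the function name `median_filter3x3` and the tiny test image are assumptions, not taken from the slides.

```python
from statistics import median_low


def median_filter3x3(image):
    """Return a smoothed copy of a 2D grey-level image given as a list of rows.

    Each pixel is replaced by the median of its 3x3 neighbourhood.
    Border pixels use only the neighbours that fall inside the image.
    """
    height, width = len(image), len(image[0])
    filtered = [[0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - 1), min(height, r + 2))
                for cc in range(max(0, c - 1), min(width, c + 2))
            ]
            # median_low always returns one of the actual pixel values
            filtered[r][c] = median_low(window)
    return filtered


# Hypothetical 3x4 image with a single noisy pixel (255)
noisy = [
    [10, 10, 10, 10],
    [10, 255, 10, 10],
    [10, 10, 10, 10],
]
clean = median_filter3x3(noisy)
```

On real 3k x 4k images a vectorised library routine would be preferable; the nested loops here are only meant to make the neighbourhood logic explicit.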
18 Methodology - IV Wavelet transform: wavelet decomposition
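The slide does not state which wavelet family or how many decomposition levels are used. The following is a minimal sketch of a one-level 2D decomposition, assuming the Haar wavelet in its simple averaging form (not the orthonormal scaling). All function names and the sample image are illustrative.

```python
def haar_step_1d(values):
    """One Haar step on an even-length sequence.

    Returns pairwise averages (low-pass) and pairwise half-differences (high-pass).
    """
    averages = [(values[i] + values[i + 1]) / 2 for i in range(0, len(values), 2)]
    details = [(values[i] - values[i + 1]) / 2 for i in range(0, len(values), 2)]
    return averages, details


def transpose(matrix):
    return [list(column) for column in zip(*matrix)]


def split_columns(half_image):
    """Apply the Haar step down every column of a row-transformed image."""
    lows, highs = [], []
    for column in transpose(half_image):
        averages, details = haar_step_1d(column)
        lows.append(averages)
        highs.append(details)
    return transpose(lows), transpose(highs)


def haar_decompose_2d(image):
    """One-level 2D Haar decomposition of an image with even height and width.

    Returns four sub-bands, each half the size of the input:
    ll (low rows, low columns), lh (low rows, high columns),
    hl (high rows, low columns), hh (high rows, high columns).
    """
    low_rows, high_rows = [], []
    for row in image:
        averages, details = haar_step_1d(row)
        low_rows.append(averages)
        high_rows.append(details)
    ll, lh = split_columns(low_rows)
    hl, hh = split_columns(high_rows)
    return ll, lh, hl, hh


# Hypothetical 4x4 grey-level image
image = [
    [4, 2, 6, 6],
    [4, 2, 6, 6],
    [0, 0, 8, 4],
    [0, 0, 8, 4],
]
ll, lh, hl, hh = haar_decompose_2d(image)
```

The `ll` band is a half-resolution approximation of the image, which is why wavelet decomposition is useful for shrinking very high-dimensional images before feature selection; applying `haar_decompose_2d` again to `ll` would give the next level.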
19 Methodology - V Fisher's Ratio; LDA (Linear Discriminant Analysis). Formulas shown on the slide: linear discriminant function; target function; between-class scatter matrix; within-class scatter matrix.
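The formulas on this slide are missing from the transcript, so the exact notation is unknown. As a hedged illustration, the sketch below uses the common two-class, per-feature form of the Fisher ratio, (difference of class means)^2 / (sum of class variances), to rank features by how well they separate two classes. The function names and sample data are hypothetical.

```python
from statistics import mean, pvariance


def fisher_ratio(class_a, class_b):
    """Fisher ratio of a single feature for two classes.

    Large values mean the class means are far apart relative to the
    spread inside each class. Assumes the summed variance is non-zero.
    """
    spread = pvariance(class_a) + pvariance(class_b)
    return (mean(class_a) - mean(class_b)) ** 2 / spread


def rank_features(samples_a, samples_b):
    """Score every feature and return indices from most to least discriminative."""
    n_features = len(samples_a[0])
    scores = [
        fisher_ratio([s[j] for s in samples_a], [s[j] for s in samples_b])
        for j in range(n_features)
    ]
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order, scores


# Hypothetical feature vectors: feature 0 separates the classes, feature 1 does not
samples_a = [[1.0, 5.0], [3.0, 6.0]]
samples_b = [[9.0, 5.0], [11.0, 6.0]]
order, scores = rank_features(samples_a, samples_b)
```

Keeping only the top-ranked features is one way to cut the dimensionality of wavelet coefficients before a classifier such as LDA is trained.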
20 Outline Background Motivation and Challenges Methodology Experimental Evaluation Exploration of Parallelisation Ongoing Work
21 Experimental Evaluation This work has been published in the journal Bioinformatics: *Han, L., van Hemert, J., Baldock, R., "Automatically Identifying and Annotating Mouse Embryo Gene Expression Patterns", Bioinformatics 27(8), pp , Oxford Journals, Oxford University Press, 2011. DOI: /BIOINFORMATICS/BTR105
22 Outline Background Motivation and Challenges Methodology Experimental Evaluation Exploration of Parallelisation Ongoing Work
23 Parallelisation Exploration - I Parallelisation (e.g. multicore, cloud computing), along with parallel programming models (e.g. MPI, MapReduce), is a sought-after solution to big data problems. Three considerations for parallelising an application: how to distribute workloads or decompose an algorithm into parts; how to map the tasks onto computing nodes and execute subtasks in parallel; how to coordinate and communicate between subtasks on those computing nodes.
24 Parallelisation Exploration - II Data parallelism: the workload is distributed across different computing nodes, and the same task is executed on different subsets of the data simultaneously. Task parallelism: tasks are independent and can be executed purely in parallel. Pipelining: a task consists of many stages that are chained and executed in order, with the output of one stage being the input of the next.
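The three patterns can be sketched in a few lines of Python. This is an illustration on a single machine using threads from the standard library, not the cluster set-up used in the project; the pixel values and helper functions are made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def brightness(chunk):
    """The same task, applied to one subset of the data."""
    return sum(chunk) / len(chunk)


def count_dark(pixels):
    return sum(1 for p in pixels if p < 50)


def count_bright(pixels):
    return sum(1 for p in pixels if p > 200)


def rescale(stream):
    """Pipeline stage 1: map grey levels to the range 0..1."""
    for p in stream:
        yield p / 255


def threshold(stream):
    """Pipeline stage 2: consume the output of stage 1."""
    for p in stream:
        yield 1 if p >= 0.5 else 0


pixels = [0, 30, 60, 90, 120, 150, 210, 255]

with ThreadPoolExecutor(max_workers=2) as pool:
    # Data parallelism: one function, different subsets of the data
    chunks = [pixels[:4], pixels[4:]]
    chunk_means = list(pool.map(brightness, chunks))

    # Task parallelism: different, independent tasks submitted side by side
    dark_future = pool.submit(count_dark, pixels)
    bright_future = pool.submit(count_bright, pixels)
    dark, bright = dark_future.result(), bright_future.result()

# Pipelining: the output of one stage is the input of the next
binary = list(threshold(rescale(pixels)))
```

The generator chain only shows the stage-to-stage data flow and runs in a single thread; in a real pipeline each stage would run on its own worker so that different items are in different stages at the same time.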
25 Parallelisation Exploration - III Two parallel implementations: MapReduce; MPI (Message Passing Interface). [Figure: workflow of processing elements over the training and testing datasets: read image file, image rescale, image denoise, feature generation, sample split, feature selection, feature extraction, classifier, prediction evaluation] Publications: *Xie, Z., Han, L., Baldock, R., "Enhancing Parallelism of Data-Intensive Bioinformatics Applications", 8th EUROSIM Congress on Modelling and Simulation (EUROSIM 2013), 2013, Cardiff. *Han, L., Ong, H.-Y., "Accelerating biomedical data intensive application using MapReduce", the 13th ACM/IEEE International Conference on Grid Computing (Grid 2012), Sep , Beijing, China, 2012.
26 Parallelisation Exploration - IV MapReduce implementation [Figure: MapReduce workflow: image processing, feature generation, training set features and testing set features, data packer, loop for 10-fold cross validation, feature selection, classifier, prediction]
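For readers unfamiliar with the programming model, here is a minimal in-memory sketch of the map, shuffle and reduce phases. It is not the project's Hadoop code: the record layout, the `mapper` and `reducer` functions and the feature values are hypothetical, and a real framework would run the phases across many nodes.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every input record and emit (key, value) pairs."""
    for record in records:
        yield from mapper(record)


def shuffle(pairs):
    """Group all emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Apply the reducer once per key to the grouped values."""
    return {key: reducer(key, values) for key, values in groups.items()}


# Hypothetical input: (gene, image id, one feature value extracted from the image)
records = [
    ("geneA", "img1", 0.25),
    ("geneA", "img2", 0.75),
    ("geneB", "img3", 1.0),
]


def mapper(record):
    gene, _image_id, feature = record
    yield gene, feature


def reducer(_gene, features):
    return sum(features) / len(features)


mean_feature = reduce_phase(shuffle(map_phase(records, mapper)), reducer)
```

Because each mapper call touches a single record, the map phase is a direct fit for data parallelism over images; only the shuffle requires moving data between nodes.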
27 Parallelisation Exploration - V MPI implementation [Figure: parallel architecture for the workflow. Stages: image processing and feature generation; feature selection and extraction; classification. MPI scatter distributes the datasets to the rank 0, rank 1 and rank 2 processors, and MPI gather collects the results at the rank 0 processor]
28 Parallelisation Exploration - VI Experiment Results (Speedup) [Figure: two panels, MapReduce and MPI, plotting speedup against number of nodes; legend: 1X, 2X, 4X, 8X, 16X, 32X, 64X, ideal; x = 128 images]
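The plotted values are not recoverable from the transcript. For reference, speedup is conventionally the single-node run time divided by the run time on n nodes, and the ideal curve is speedup = n. The sketch below computes speedup and parallel efficiency from timings that are invented purely for illustration.

```python
def speedup(serial_time, parallel_time):
    """Speedup = time on one node / time on n nodes."""
    return serial_time / parallel_time


def efficiency(serial_time, parallel_time, nodes):
    """Fraction of the ideal (linear) speedup that is actually achieved."""
    return speedup(serial_time, parallel_time) / nodes


# Hypothetical wall-clock timings in seconds, keyed by number of nodes
timings = {1: 640.0, 2: 320.0, 4: 200.0, 8: 160.0}

table = {
    nodes: (speedup(timings[1], t), efficiency(timings[1], t, nodes))
    for nodes, t in timings.items()
}

for nodes, (s, e) in table.items():
    print(f"{nodes} nodes: speedup {s:.2f}, efficiency {e:.0%}")
```

A curve that flattens as nodes are added, as in these made-up numbers, usually points to communication or data-movement overhead growing faster than the computation shrinks.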
29 Outline What is Annotation? Motivation and Challenges Methodology Experimental Evaluation Exploration of Parallelisation Ongoing Work
30 Ongoing Work - I Data analytics: feature extraction (locate the region, process only that region of the image, and reduce computing time) and classification algorithms
31 Ongoing Work - II Data analytics: combination of high-level concepts (semantics) and low-level features (image processing and data mining)
32 Ongoing Work - III Data processing (parallelisation and cloud computing): development of a data-reuse mechanism for the cost-effectiveness and optimisation of data-intensive applications running in the Cloud; large-scale evaluation in the Cloud
33 FUNDS Research Group Group members: 4 academic staff + 1 PDRA + 8 PhDs, plus two associate members; more new members are expected to join. Areas of interest: novel architectures for networked distributed systems (parallel and distributed computing, cloud computing, wired and wireless sensor networks, IoT, etc.); large-scale data mining (application domains: biomedical images, environmental sensors, computer network traffic, web pages including content and link graphs, social network analysis, etc.)
34 Thank you
More informationExploiting the power of Big Data
Exploiting the power of Big Data Timos Sellis School of Computer Science and Information Technology timos.sellis@rmit.edu.au ITECHLAW Asia-Pacific Conference, February 26-28, 2014 Melbourne Australia Timeline
More informationHow to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning
How to use Big Data in Industry 4.0 implementations LAURI ILISON, PhD Head of Big Data and Machine Learning Big Data definition? Big Data is about structured vs unstructured data Big Data is about Volume
More informationBig Data for smart infrastructure: London Bridge Station Redevelopment. Sinan Ackigoz & Krishna Kumar 10.09.15 Cambridge, UK
Big Data for smart infrastructure: London Bridge Station Redevelopment Sinan Ackigoz & Krishna Kumar 10.09.15 Cambridge, UK Redeveloping the redeveloped station 1972 vision 2012 vision 1972 Vision: Two
More informationParallel Databases. Parallel Architectures. Parallelism Terminology 1/4/2015. Increase performance by performing operations in parallel
Parallel Databases Increase performance by performing operations in parallel Parallel Architectures Shared memory Shared disk Shared nothing closely coupled loosely coupled Parallelism Terminology Speedup:
More informationInformation Management course
Università degli Studi di Milano Master Degree in Computer Science Information Management course Teacher: Alberto Ceselli Lecture 01 : 06/10/2015 Practical informations: Teacher: Alberto Ceselli (alberto.ceselli@unimi.it)
More informationBig Data in Subsea Solutions
Big Data in Subsea Solutions Subsea Valley Conference 2014 Telenor Arena, Fornebu, April 2-3 Roar Fjellheim, Computas AS Computas AS - Brief company profile Norwegian IT consulting company providing services
More informationBig Data and Analytics: Challenges and Opportunities
Big Data and Analytics: Challenges and Opportunities Dr. Amin Beheshti Lecturer and Senior Research Associate University of New South Wales, Australia (Service Oriented Computing Group, CSE) Talk: Sharif
More informationMapReduce and Hadoop Distributed File System
MapReduce and Hadoop Distributed File System 1 B. RAMAMURTHY Contact: Dr. Bina Ramamurthy CSE Department University at Buffalo (SUNY) bina@buffalo.edu http://www.cse.buffalo.edu/faculty/bina Partially
More informationG-Cloud Big Data Suite Powered by Pivotal. December 2014. G-Cloud. service definitions
G-Cloud Big Data Suite Powered by Pivotal December 2014 G-Cloud service definitions TABLE OF CONTENTS Service Overview... 3 Business Need... 6 Our Approach... 7 Service Management... 7 Vendor Accreditations/Awards...
More informationBig Data Analytics. The Hype and the Hope* Dr. Ted Ralphs Industrial and Systems Engineering Director, COR@L Laboratory
Big Data Analytics The Hype and the Hope* Dr. Ted Ralphs Industrial and Systems Engineering Director, COR@L Laboratory * Source: http://www.economistinsights.com/technology-innovation/analysis/hype-and-hope/methodology
More informationAN EFFICIENT SELECTIVE DATA MINING ALGORITHM FOR BIG DATA ANALYTICS THROUGH HADOOP
AN EFFICIENT SELECTIVE DATA MINING ALGORITHM FOR BIG DATA ANALYTICS THROUGH HADOOP Asst.Prof Mr. M.I Peter Shiyam,M.E * Department of Computer Science and Engineering, DMI Engineering college, Aralvaimozhi.
More informationData Wrangling: From the Wild to the Lake
Data Wrangling: From the Wild to the Lake Ignacio Terrizzano Peter Schwarz Mary Roth John Colino IBM Research - Almaden 48 hours of video is uploaded to YouTube every minute Walmart processes million transactions
More informationIndustry 4.0 and Big Data
Industry 4.0 and Big Data Marek Obitko, mobitko@ra.rockwell.com Senior Research Engineer 03/25/2015 PUBLIC PUBLIC - 5058-CO900H 2 Background Joint work with Czech Institute of Informatics, Robotics and
More informationBig Data Mining Services and Knowledge Discovery Applications on Clouds
Big Data Mining Services and Knowledge Discovery Applications on Clouds Domenico Talia DIMES, Università della Calabria & DtoK Lab Italy talia@dimes.unical.it Data Availability or Data Deluge? Some decades
More informationSimultaneous Gamma Correction and Registration in the Frequency Domain
Simultaneous Gamma Correction and Registration in the Frequency Domain Alexander Wong a28wong@uwaterloo.ca William Bishop wdbishop@uwaterloo.ca Department of Electrical and Computer Engineering University
More informationMachine Learning over Big Data
Machine Learning over Big Presented by Fuhao Zou fuhao@hust.edu.cn Jue 16, 2014 Huazhong University of Science and Technology Contents 1 2 3 4 Role of Machine learning Challenge of Big Analysis Distributed
More information