Big Data Processing and Analytics for Mouse Embryo Images
|
|
|
- Ruth Fleming
- 10 years ago
- Views:
Transcription
1 Big Data Processing and Analytics for Mouse Embryo Images liangxiu han Zheng xie, Richard Baldock The AGILE Project team FUNDS Research Group - Future Networks and Distributed Systems School of Computing, Mathematics and Digital Technology, Manchester Metropolitan University Group Webpage:
2 Outline Background Motivation and Challenges Methodology Experimental Evaluation Exploration of Parallelisation Ongoing Work
3 Background- Big data Era Increasing Capability of Generating and Capturing Data: User-generated (e.g. social media/networks), Machines and sensor data (e.g. experimental simulations, environmental sensors, RFID, etc.), Open government and public data, etc. Big Data Era: Data intensive/data centric/data-driven Data size trends (IDC report) 40 Data Volumes 35 Zettabytes (10^21 bytes)
4 Background- Big data Era Data from different domains: Astrophysics, Biomedical Science, Geoscience, Social Science, etc. Facebook: 625,000TB per day Sloan Sky Survey: over 40TB cabig: 4.7+ millions biomedical images for cancers Gene expression data in GEO and ArrayExpress: over 1 millions GeneBank: over
5 Background- Big data Era base pairs Nature 487, (19 July 2012) doi: /487282a /82 01/86 01/90 01/94 01/98 01/02 01/06 01/10 01/14 date,mm/yy
6 Background- Big data Era What is Big Data? A relative term ( don t define it in terms of size being larger than a certain number of terabytes or petabytes) Larger, more complex and hard to access, organise and analysis beyond the capability of the existing tools (varying on sectors) Volume (amount of data), variety ( types), velocity (the growth rate of data coupled with the need to deliver insights and make decisions faster) and complexity (difficulties in transforming and integrating/linking various data) of data beyond the capability of an organisation to capture, store, manage and process
7 Background- Challenges How to filter and reduce the amount of data to enable timely insight and decisions? - Data analytics (computational modelling) How to process large-scale data efficiently( data movement) - Data processing (parallel and distributed /high performance computing) 2+-3 "#"$6'078)9$ 6"&+)#19$ ('8%0):+#19$ "#"$%&'()**+,- $",.$/,"01*+* 4'5 4'5 "#"$6"07) 2+-3 Source:
8 Background- AGILE PROJECT BBSRC funded AGILE project: A Cloud Approach to Automatic Gene Expression Pattern Recognition and Annotation over Large-Scale Images Foundational work - Automatic Gene Expression Pattern Recognition and Annotation : Data Analytics Goal I - Development of parallel approaches to allow efficient exploitation of Cloud computing: Data Processing Goal II -- Development of generic data reuse mechanism and standard services for performance enhancement and cost reduction in the Cloud: Data Processing
9 Outline Background Motivation and Challenges Methodology Experimental Evaluation Exploration of Parallelisation Ongoing Work
10 Motivation &Challenges-I Annotation of Gene Expression Patterns: tagging an anatomical term from ontology with gene expression patterns of the anatomical component in images
11 Motivation &Challenges-II euxaxssay_007708_02.jpg euxaxssay_007708_06.jpg euxaxssay_007708_16.jpg
12 Motivation &Challenges-Iii Gene expression patterns --- a way to understand the interaction between genes The availability of both ontological annotation and spatial gene pattern --- a resource to identify the mechanism of embryo organisation The current manual annotation --- costly and time consuming Massive amounts of data and complicated organism --- necessity to automate the process of annotation
13 Motivation &Challenges-Iv Big data, now over 20 TB Multi-components coexisting in an image Variable shape, location and orientation of images The number of images associated with a certain gene is uneven The dimensionality of each image is high (3kx4k pixels)
14 Outline Background Motivation and Challenges Methodology Experimental Evaluation Exploration of Parallelisation Ongoing Work
15 Methodology-i The Framework
16 Methodology-iI Core methodologies Image Processing Wavelet Transform Fisher Ratio LDA(SVM, ANN, LSVM)
17 Methodology-iII Image Processing - Filtering
18 Methodology-IV Wavelet transform Wavelet decomposition
19 Methodology-V Fishers Ratio LDA (Linear Discrimination Analysis) Linear discriminant function: Target function: Between-class scatter matrix Within-class scatter matrix
20 Outline Background Motivation and Challenges Methodology Experimental Evaluation Exploration of Parallelisation Ongoing Work
21 Experimental Evaluation This work has been published in Bioinformatics (Journal): *Han, L., van Hemert, J., Baldock, R. "Automatically Identifying and Annotating Mouse Embryo Gene Expression Patterns", Bioinformatics 27(8),pp , Oxford Journals, Oxford University Press. DOI: / BIOINFORMATICS/BTR105, 2011
22 Outline Background Motivation and Challenges Methodology Experimental Evaluation Exploration of Parallelisation Ongoing Work
23 parallelisation EXPLORATION- I Parallelisation (e.g. multicore, cloud computing), along with parallel programming models (e.g. MPI, MapReduce), is a sought after solution to address big data problems Three considerations for parallelising an application How to distribute workloads or decompose an algorithm into parts How to map the tasks onto various computing nodes and execute subtasks in parallel How to coordinate and communicate subtasks on those computing nodes.
24 parallelisation EXPLORATION- II Data Parallelism: workload are distributed into different computing nodes and the same task can be executed on different subsets of the data simultaneously Task Parallelism: tasks are independent and can be executed purely in parallel Pipelining: an iteration of a task consisting of many stages, where each stage in the task is chained and executed in order and the output of one stage is the input of the next one.
25 parallelisation EXPLORATION- IIi Two parallel implementations: MapReduce MPI -- Message Passing Interface "#$%$%&'(#)#*+)'?+#('>8#&+',$/+' 234E6 >8#&+'?+*0#/+ >8#&+ A+%1$*+234B6,+#)-"+' C+%+"#)$1%234D6,.I,.E,.7.FG'F-+"H'' 234I6.#89/+.9/$)' 23476,+#)-"+'.+/+0)$1% *)$%&'(#)#*+)?+#('>8#&+',$/+' 234E6 >8#&+'?+*0#/+ >8#&+ A+%1$*+234B6,+#)-"+' C+%+"#)$1%234D6 "#$%&"'"(()(*#+,"-"%&"'"(()(*#+ Publications *Xie, Z., Han, L., Baldock, R., "Enhancing Parallelism of Data-Intensive Bioinformatics Applications", 8th EUROSIM Congress on Modelling and Simulation (EUROSIM 2013), 2013, Cardiff. *Han, L., Ong, H.-Y.,"Accelerating biomedical data intensive application using MapReduce", the 13th ACM ACM/IEEE International conference on Grid Computing (Grid 2012), Sep , Beijing, China, 2012.,+#)-"+' 4:)"#0)$1%234;6 </#**$+" 234=6,+#)-"+' 4:)"#0)$1%234;6 3"+($0)$1%' 4J#/-#)$1%234IK6
26 parallelisation EXPLORATION- Iv MapReduce implementation A7D)-1()==%( "#$%&'()*%++,-$.'/& 5%#16(%& E%-%(#1,)-& 3(#,-,-$4%1&5%#16(%.37/ 3%+1,-$4%1&5%#16(%.33/ 0#1#&'#*2%( 8))9&:)(& ;<&:)=>& *()++&?#=,>#1,)- 5%#16(%&4%=%*1,)-.54/ D=#++,%(.D=#/ "#$%&'()"#*+,-*.&/)'01"#$23.+ 6##"('()"#5 *4.'$,5 '(%>,*1,)-&
27 parallelisation EXPLORATION- v MPI implementation 56%7."80+).&&',7" 2"-.%*/0." 9.,.0%*'+," -.%*/0." 1.$.)*'+,"2" 34*0%)*'+," "#$%&&'(')%*'+," #$%&'(%)*(+(,)+ -.&,/(++)0,123*(+(,)+405'6(789.05/)::50 6(789&'(%) 6(78=&'(%) 6(78>&'(%).05/)::$7%.05/)::$7%.05/)::$7% ;)(+10)*(+(,)+ ;)(+10)*(+(,)+ ;)(+10)*(+(,)+ -.&?(+@)0+5;50'A5'BC)+);)(+10)*(+(,)+(+6(789.05/)::50 -.&,/(++)0,12;)(+10)*(+(,)+(7D&7D)E(+6(789B05/)::50 6(789;$7)?0($7 ;)(+10),)C)/+$57 6(78=;$7)?0($7 ;)(+10),)C)/+$57 6(78>;$7)?0($7 ;)(+10),)C)/+$57 -.&?(+@)0+5D5;)(+10)FE+0(/+$57(+6(789.05/)::50 -.&,/(++)0,12;)(+10)*(+(,)+(7D&7D)E(+6(789.05/)::50 ;$7)3?0($7?051B ;$7)3?0($7?051B ;$7)?0($7?C52(C >)%G-)(7.5:G-)(7 -)(7 -.&,)7DI6)/H -.&,)7DI6)/H -.&,)7DI6)/H ;$7)?0($7 A5H(0$(7/) <):+*(+(,)+ ;$7)?0($7*$:/0$'$7(+$57 ;17/+$57 -.&?(+@)0+5D5AC(::$4$/(+$57.0)D$/(+$57(+6(789.05/)::50 ;$%10)JG.(0(CC)C(0/@$+)/+10)450+@)K5084C5KG :" "
28 parallelisation EXPLORATION-vi Experiment Results ( Speedup) speedup X 2X 4X 8X 16X 32X ideal 18 x=128 images 16 1X 2X x=128 images 14 4X 8X 12 16X 32X 10 64X ideal speedup nodes nodes MapReduce MPI
29 Outline What is Annotation? Motivation and Challenges Methodology Experimental Evaluation Exploration of Parallelisation Ongoing Work
30 Ongoing work- I Data analytics Feature extraction ( locate the region, process the region of the image, and reduce computing time) and classification algorithms
31 Ongoing work- II Data analytics Combination of high-level concepts (semantics) and low-level features ( image processing and data mining)
32 Ongoing work- III Data processing side (Parallelisation and cloud computing) Development of data-reuse mechanism for costeffective and optimisation of the data intensive applications running in the Cloud Large-scale evaluation in the Cloud
33 FUNDS Research Group Group members: 4 academic staff + 1 PDRA+ 8 PhDs. Also two associate members. There will have more new members to come. Areas of interest Novel architecture of networked distributed system (parallel and distributed computing, cloud computing, wired and wireless sensor networks, IoT, etc.) Large-scale data mining ( application domains: biomedical images, environmental sensors, computer network traffic, web pages including content and linkage of graph, social network analysis, etc.)
34 Thank you
Surfing the Data Tsunami: A New Paradigm for Big Data Processing and Analytics
Surfing the Data Tsunami: A New Paradigm for Big Data Processing and Analytics Dr. Liangxiu Han Future Networks and Distributed Systems Group (FUNDS) School of Computing, Mathematics and Digital Technology,
Conquering the Astronomical Data Flood through Machine
Conquering the Astronomical Data Flood through Machine Learning and Citizen Science Kirk Borne George Mason University School of Physics, Astronomy, & Computational Sciences http://spacs.gmu.edu/ The Problem:
Big Data Analytics. An Introduction. Oliver Fuchsberger University of Paderborn 2014
Big Data Analytics An Introduction Oliver Fuchsberger University of Paderborn 2014 Table of Contents I. Introduction & Motivation What is Big Data Analytics? Why is it so important? II. Techniques & Solutions
Using Proxies to Accelerate Cloud Applications
Using Proxies to Accelerate Cloud Applications Jon Weissman and Siddharth Ramakrishnan Department of Computer Science and Engineering University of Minnesota, Twin Cities Abstract A rich cloud ecosystem
Developing Scalable Smart Grid Infrastructure to Enable Secure Transmission System Control
Developing Scalable Smart Grid Infrastructure to Enable Secure Transmission System Control EP/K006487/1 UK PI: Prof Gareth Taylor (BU) China PI: Prof Yong-Hua Song (THU) Consortium UK Members: Brunel University
The Scientific Data Mining Process
Chapter 4 The Scientific Data Mining Process When I use a word, Humpty Dumpty said, in rather a scornful tone, it means just what I choose it to mean neither more nor less. Lewis Carroll [87, p. 214] In
Outline. High Performance Computing (HPC) Big Data meets HPC. Case Studies: Some facts about Big Data Technologies HPC and Big Data converging
Outline High Performance Computing (HPC) Towards exascale computing: a brief history Challenges in the exascale era Big Data meets HPC Some facts about Big Data Technologies HPC and Big Data converging
Mining Large Datasets: Case of Mining Graph Data in the Cloud
Mining Large Datasets: Case of Mining Graph Data in the Cloud Sabeur Aridhi PhD in Computer Science with Laurent d Orazio, Mondher Maddouri and Engelbert Mephu Nguifo 16/05/2014 Sabeur Aridhi Mining Large
BIG DATA IN THE CLOUD : CHALLENGES AND OPPORTUNITIES MARY- JANE SULE & PROF. MAOZHEN LI BRUNEL UNIVERSITY, LONDON
BIG DATA IN THE CLOUD : CHALLENGES AND OPPORTUNITIES MARY- JANE SULE & PROF. MAOZHEN LI BRUNEL UNIVERSITY, LONDON Overview * Introduction * Multiple faces of Big Data * Challenges of Big Data * Cloud Computing
Volume 3, Issue 6, June 2015 International Journal of Advance Research in Computer Science and Management Studies
Volume 3, Issue 6, June 2015 International Journal of Advance Research in Computer Science and Management Studies Research Article / Survey Paper / Case Study Available online at: www.ijarcsms.com Image
Are You Ready for Big Data?
Are You Ready for Big Data? Jim Gallo National Director, Business Analytics February 11, 2013 Agenda What is Big Data? How do you leverage Big Data in your company? How do you prepare for a Big Data initiative?
Large-Scale Data Processing
Large-Scale Data Processing Eiko Yoneki [email protected] http://www.cl.cam.ac.uk/~ey204 Systems Research Group University of Cambridge Computer Laboratory 2010s: Big Data Why Big Data now? Increase
Big Data Systems CS 5965/6965 FALL 2015
Big Data Systems CS 5965/6965 FALL 2015 Today General course overview Expectations from this course Q&A Introduction to Big Data Assignment #1 General Course Information Course Web Page http://www.cs.utah.edu/~hari/teaching/fall2015.html
Concept and Project Objectives
3.1 Publishable summary Concept and Project Objectives Proactive and dynamic QoS management, network intrusion detection and early detection of network congestion problems among other applications in the
Chapter 7. Using Hadoop Cluster and MapReduce
Chapter 7 Using Hadoop Cluster and MapReduce Modeling and Prototyping of RMS for QoS Oriented Grid Page 152 7. Using Hadoop Cluster and MapReduce for Big Data Problems The size of the databases used in
Big Data. Lyle Ungar, University of Pennsylvania
Big Data Big data will become a key basis of competition, underpinning new waves of productivity growth, innovation, and consumer surplus. McKinsey Data Scientist: The Sexiest Job of the 21st Century -
Big Data: Image & Video Analytics
Big Data: Image & Video Analytics How it could support Archiving & Indexing & Searching Dieter Haas, IBM Deutschland GmbH The Big Data Wave 60% of internet traffic is multimedia content (images and videos)
Text Mining Approach for Big Data Analysis Using Clustering and Classification Methodologies
Text Mining Approach for Big Data Analysis Using Clustering and Classification Methodologies Somesh S Chavadi 1, Dr. Asha T 2 1 PG Student, 2 Professor, Department of Computer Science and Engineering,
AGENDA. What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story. Our BIG DATA Roadmap. Hadoop PDW
AGENDA What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story Hadoop PDW Our BIG DATA Roadmap BIG DATA? Volume 59% growth in annual WW information 1.2M Zetabytes (10 21 bytes) this
What happens when Big Data and Master Data come together?
What happens when Big Data and Master Data come together? Jeremy Pritchard Master Data Management fgdd 1 What is Master Data? Master data is data that is shared by multiple computer systems. The Information
Using the Grid for the interactive workflow management in biomedicine. Andrea Schenone BIOLAB DIST University of Genova
Using the Grid for the interactive workflow management in biomedicine Andrea Schenone BIOLAB DIST University of Genova overview background requirements solution case study results background A multilevel
Exploiting Data at Rest and Data in Motion with a Big Data Platform
Exploiting Data at Rest and Data in Motion with a Big Data Platform Sarah Brader, [email protected] What is Big Data? Where does it come from? 12+ TBs of tweet data every day 30 billion RFID tags
Data Centric Computing Revisited
Piyush Chaudhary Technical Computing Solutions Data Centric Computing Revisited SPXXL/SCICOMP Summer 2013 Bottom line: It is a time of Powerful Information Data volume is on the rise Dimensions of data
Image Analytics on Big Data In Motion Implementation of Image Analytics CCL in Apache Kafka and Storm
Image Analytics on Big Data In Motion Implementation of Image Analytics CCL in Apache Kafka and Storm Lokesh Babu Rao 1 C. Elayaraja 2 1PG Student, Dept. of ECE, Dhaanish Ahmed College of Engineering,
Data-intensive HPC: opportunities and challenges. Patrick Valduriez
Data-intensive HPC: opportunities and challenges Patrick Valduriez Big Data Landscape Multi-$billion market! Big data = Hadoop = MapReduce? No one-size-fits-all solution: SQL, NoSQL, MapReduce, No standard,
Beyond Watson: The Business Implications of Big Data
Beyond Watson: The Business Implications of Big Data Shankar Venkataraman IBM Program Director, STSM, Big Data August 10, 2011 The World is Changing and Becoming More INSTRUMENTED INTERCONNECTED INTELLIGENT
Data Mining and Machine Learning in Bioinformatics
Data Mining and Machine Learning in Bioinformatics PRINCIPAL METHODS AND SUCCESSFUL APPLICATIONS Ruben Armañanzas http://mason.gmu.edu/~rarmanan Adapted from Iñaki Inza slides http://www.sc.ehu.es/isg
Research Statement Immanuel Trummer www.itrummer.org
Research Statement Immanuel Trummer www.itrummer.org We are collecting data at unprecedented rates. This data contains valuable insights, but we need complex analytics to extract them. My research focuses
Introduction to Engineering Using Robotics Experiments Lecture 17 Big Data
Introduction to Engineering Using Robotics Experiments Lecture 17 Big Data Yinong Chen 2 Big Data Big Data Technologies Cloud Computing Service and Web-Based Computing Applications Industry Control Systems
HP Vertica at MIT Sloan Sports Analytics Conference March 1, 2013 Will Cairns, Senior Data Scientist, HP Vertica
HP Vertica at MIT Sloan Sports Analytics Conference March 1, 2013 Will Cairns, Senior Data Scientist, HP Vertica So What s the market s definition of Big Data? Datasets whose volume, velocity, variety
Introduction of Information Visualization and Visual Analytics. Chapter 2. Introduction and Motivation
Introduction of Information Visualization and Visual Analytics Chapter 2 Introduction and Motivation Overview! 2 Overview and Motivation! Information Visualization (InfoVis)! InfoVis Application Areas!
ANALYTICS IN BIG DATA ERA
ANALYTICS IN BIG DATA ERA ANALYTICS TECHNOLOGY AND ARCHITECTURE TO MANAGE VELOCITY AND VARIETY, DISCOVER RELATIONSHIPS AND CLASSIFY HUGE AMOUNT OF DATA MAURIZIO SALUSTI SAS Copyr i g ht 2012, SAS Ins titut
Open source software framework designed for storage and processing of large scale data on clusters of commodity hardware
Open source software framework designed for storage and processing of large scale data on clusters of commodity hardware Created by Doug Cutting and Mike Carafella in 2005. Cutting named the program after
Advanced In-Database Analytics
Advanced In-Database Analytics Tallinn, Sept. 25th, 2012 Mikko-Pekka Bertling, BDM Greenplum EMEA 1 That sounds complicated? 2 Who can tell me how best to solve this 3 What are the main mathematical functions??
Big data and its transformational effects
Big data and its transformational effects Professor Fai Cheng Head of Research & Technology September 2015 Working together for a safer world Topics Lloyd s Register Big Data Data driven world Data driven
www.pwc.com/oracle Next presentation starting soon Business Analytics using Big Data to gain competitive advantage
www.pwc.com/oracle Next presentation starting soon Business Analytics using Big Data to gain competitive advantage If every image made and every word written from the earliest stirring of civilization
Index Terms : Load rebalance, distributed file systems, clouds, movement cost, load imbalance, chunk.
Load Rebalancing for Distributed File Systems in Clouds. Smita Salunkhe, S. S. Sannakki Department of Computer Science and Engineering KLS Gogte Institute of Technology, Belgaum, Karnataka, India Affiliated
Big Data Challenges in Bioinformatics
Big Data Challenges in Bioinformatics BARCELONA SUPERCOMPUTING CENTER COMPUTER SCIENCE DEPARTMENT Autonomic Systems and ebusiness Pla?orms Jordi Torres [email protected] Talk outline! We talk about Petabyte?
Are You Ready for Big Data?
Are You Ready for Big Data? Jim Gallo National Director, Business Analytics April 10, 2013 Agenda What is Big Data? How do you leverage Big Data in your company? How do you prepare for a Big Data initiative?
Big Data a threat or a chance?
Big Data a threat or a chance? Helwig Hauser University of Bergen, Dept. of Informatics Big Data What is Big Data? well, lots of data, right? we come back to this in a moment. certainly, a buzz-word but
Advances in Natural and Applied Sciences
AENSI Journals Advances in Natural and Applied Sciences ISSN:1995-0772 EISSN: 1998-1090 Journal home page: www.aensiweb.com/anas Clustering Algorithm Based On Hadoop for Big Data 1 Jayalatchumy D. and
Bringing Compute to the Data Alternatives to Moving Data. Part of EUDAT s Training in the Fundamentals of Data Infrastructures
Bringing Compute to the Data Alternatives to Moving Data Part of EUDAT s Training in the Fundamentals of Data Infrastructures Introduction Why consider alternatives? The traditional approach Alternative
Parallel Compression and Decompression of DNA Sequence Reads in FASTQ Format
, pp.91-100 http://dx.doi.org/10.14257/ijhit.2014.7.4.09 Parallel Compression and Decompression of DNA Sequence Reads in FASTQ Format Jingjing Zheng 1,* and Ting Wang 1, 2 1,* Parallel Software and Computational
Big Data Processing with Google s MapReduce. Alexandru Costan
1 Big Data Processing with Google s MapReduce Alexandru Costan Outline Motivation MapReduce programming model Examples MapReduce system architecture Limitations Extensions 2 Motivation Big Data @Google:
Big Data and Analytics 21 A Technical Perspective Abhishek Bhattacharya, Aditya Gandhi and Pankaj Jain November 2012
Big Data and Analytics 21 A Technical Perspective Abhishek Bhattacharya, Aditya Gandhi and Pankaj Jain November 2012 Between the dawn of civilization and 2003, the human race created 5 exabytes of data
Ontology construction on a cloud computing platform
Ontology construction on a cloud computing platform Exposé for a Bachelor's thesis in Computer science - Knowledge management in bioinformatics Tobias Heintz 1 Motivation 1.1 Introduction PhenomicDB is
Towards a Thriving Data Economy: Open Data, Big Data, and Data Ecosystems
Towards a Thriving Data Economy: Open Data, Big Data, and Data Ecosystems Volker Markl [email protected] dima.tu-berlin.de dfki.de/web/research/iam/ bbdc.berlin Based on my 2014 Vision Paper On
ISSN: 2320-1363 CONTEXTUAL ADVERTISEMENT MINING BASED ON BIG DATA ANALYTICS
CONTEXTUAL ADVERTISEMENT MINING BASED ON BIG DATA ANALYTICS A.Divya *1, A.M.Saravanan *2, I. Anette Regina *3 MPhil, Research Scholar, Muthurangam Govt. Arts College, Vellore, Tamilnadu, India Assistant
What is Analytic Infrastructure and Why Should You Care?
What is Analytic Infrastructure and Why Should You Care? Robert L Grossman University of Illinois at Chicago and Open Data Group [email protected] ABSTRACT We define analytic infrastructure to be the services,
Problems to store, transfer and process the Big Data 6/2/2016 GIANG TRAN - [email protected] 1
Problems to store, transfer and process the Big Data COURSE: COMPUTING CLUSTERS, GRIDS, AND CLOUDS LECTURER: ANDREY SHEVEL ITMO UNIVERSITY SAINT PETERSBURG 6/2/2016 GIANG TRAN - [email protected]
Augmented Search for Web Applications. New frontier in big log data analysis and application intelligence
Augmented Search for Web Applications New frontier in big log data analysis and application intelligence Business white paper May 2015 Web applications are the most common business applications today.
Industrial Internet @GE. Dr. Stefan Bungart
Industrial Internet @GE Dr. Stefan Bungart The vision is clear The real opportunity for change surpassing the magnitude of the consumer Internet is the Industrial Internet, an open, global network that
Role of Cloud Computing in Big Data Analytics Using MapReduce Component of Hadoop
Role of Cloud Computing in Big Data Analytics Using MapReduce Component of Hadoop Kanchan A. Khedikar Department of Computer Science & Engineering Walchand Institute of Technoloy, Solapur, Maharashtra,
Extracting Business. Value From CAD. Model Data. Transformation. Sreeram Bhaskara The Boeing Company. Sridhar Natarajan Tata Consultancy Services Ltd.
Extracting Business Value From CAD Model Data Transformation Sreeram Bhaskara The Boeing Company Sridhar Natarajan Tata Consultancy Services Ltd. GPDIS_2014.ppt 1 Contents Data in CAD Models Data Structures
CURTAIL THE EXPENDITURE OF BIG DATA PROCESSING USING MIXED INTEGER NON-LINEAR PROGRAMMING
Journal homepage: http://www.journalijar.com INTERNATIONAL JOURNAL OF ADVANCED RESEARCH RESEARCH ARTICLE CURTAIL THE EXPENDITURE OF BIG DATA PROCESSING USING MIXED INTEGER NON-LINEAR PROGRAMMING R.Kohila
Learning from Big Data in
Learning from Big Data in Astronomy an overview Kirk Borne George Mason University School of Physics, Astronomy, & Computational Sciences http://spacs.gmu.edu/ From traditional astronomy 2 to Big Data
ICT Perspectives on Big Data: Well Sorted Materials
ICT Perspectives on Big Data: Well Sorted Materials 3 March 2015 Contents Introduction 1 Dendrogram 2 Tree Map 3 Heat Map 4 Raw Group Data 5 For an online, interactive version of the visualisations in
CS Master Level Courses and Areas COURSE DESCRIPTIONS. CSCI 521 Real-Time Systems. CSCI 522 High Performance Computing
CS Master Level Courses and Areas The graduate courses offered may change over time, in response to new developments in computer science and the interests of faculty and students; the list of graduate
Manifest for Big Data Pig, Hive & Jaql
Manifest for Big Data Pig, Hive & Jaql Ajay Chotrani, Priyanka Punjabi, Prachi Ratnani, Rupali Hande Final Year Student, Dept. of Computer Engineering, V.E.S.I.T, Mumbai, India Faculty, Computer Engineering,
M2M Communications and Internet of Things for Smart Cities. Soumya Kanti Datta Mobile Communications Dept. Email: Soumya-Kanti.Datta@eurecom.
M2M Communications and Internet of Things for Smart Cities Soumya Kanti Datta Mobile Communications Dept. Email: [email protected] WHAT IS EURECOM A graduate school & research centre in communication
Efficient Analysis of Big Data Using Map Reduce Framework
Efficient Analysis of Big Data Using Map Reduce Framework Dr. Siddaraju 1, Sowmya C L 2, Rashmi K 3, Rahul M 4 1 Professor & Head of Department of Computer Science & Engineering, 2,3,4 Assistant Professor,
BIG DATA & ANALYTICS. Transforming the business and driving revenue through big data and analytics
BIG DATA & ANALYTICS Transforming the business and driving revenue through big data and analytics Collection, storage and extraction of business value from data generated from a variety of sources are
Challenges for Data Driven Systems
Challenges for Data Driven Systems Eiko Yoneki University of Cambridge Computer Laboratory Quick History of Data Management 4000 B C Manual recording From tablets to papyrus to paper A. Payberah 2014 2
INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY
INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY A PATH FOR HORIZING YOUR INNOVATIVE WORK A SURVEY ON BIG DATA ISSUES AMRINDER KAUR Assistant Professor, Department of Computer
Managing Cloud Server with Big Data for Small, Medium Enterprises: Issues and Challenges
Managing Cloud Server with Big Data for Small, Medium Enterprises: Issues and Challenges Prerita Gupta Research Scholar, DAV College, Chandigarh Dr. Harmunish Taneja Department of Computer Science and
MapReduce and Hadoop Distributed File System V I J A Y R A O
MapReduce and Hadoop Distributed File System 1 V I J A Y R A O The Context: Big-data Man on the moon with 32KB (1969); my laptop had 2GB RAM (2009) Google collects 270PB data in a month (2007), 20000PB
INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY
INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY A PATH FOR HORIZING YOUR INNOVATIVE WORK A REVIEW ON BIG DATA MANAGEMENT AND ITS SECURITY PRUTHVIKA S. KADU 1, DR. H. R.
TECHNOLOGY ANALYSIS FOR INTERNET OF THINGS USING BIG DATA LEARNING
TECHNOLOGY ANALYSIS FOR INTERNET OF THINGS USING BIG DATA LEARNING Sunghae Jun 1 1 Professor, Department of Statistics, Cheongju University, Chungbuk, Korea Abstract The internet of things (IoT) is an
How To Handle Big Data With A Data Scientist
III Big Data Technologies Today, new technologies make it possible to realize value from Big Data. Big data technologies can replace highly customized, expensive legacy systems with a standard solution
SCALABLE FILE SHARING AND DATA MANAGEMENT FOR INTERNET OF THINGS
Sean Lee Solution Architect, SDI, IBM Systems SCALABLE FILE SHARING AND DATA MANAGEMENT FOR INTERNET OF THINGS Agenda Converging Technology Forces New Generation Applications Data Management Challenges
Big Data Hope or Hype?
Big Data Hope or Hype? David J. Hand Imperial College, London and Winton Capital Management Big data science, September 2013 1 Google trends on big data Google search 1 Sept 2013: 1.6 billion hits on big
Distributed forests for MapReduce-based machine learning
Distributed forests for MapReduce-based machine learning Ryoji Wakayama, Ryuei Murata, Akisato Kimura, Takayoshi Yamashita, Yuji Yamauchi, Hironobu Fujiyoshi Chubu University, Japan. NTT Communication
3rd International Symposium on Big Data and Cloud Computing Challenges (ISBCC-2016) March 10-11, 2016 VIT University, Chennai, India
3rd International Symposium on Big Data and Cloud Computing Challenges (ISBCC-2016) March 10-11, 2016 VIT University, Chennai, India Call for Papers Cloud computing has emerged as a de facto computing
Figure 1: Architecture of a cloud services model for a digital education resource management system.
World Transactions on Engineering and Technology Education Vol.13, No.3, 2015 2015 WIETE Cloud service model for the management and sharing of massive amounts of digital education resources Binwen Huang
FOUNDATIONS OF A CROSS- DISCIPLINARY PEDAGOGY FOR BIG DATA
FOUNDATIONS OF A CROSSDISCIPLINARY PEDAGOGY FOR BIG DATA Joshua Eckroth Stetson University DeLand, Florida 3867402519 [email protected] ABSTRACT The increasing awareness of big data is transforming
Exploiting the power of Big Data
Exploiting the power of Big Data Timos Sellis School of Computer Science and Information Technology [email protected] ITECHLAW Asia-Pacific Conference, February 26-28, 2014 Melbourne Australia Timeline
How to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning
How to use Big Data in Industry 4.0 implementations LAURI ILISON, PhD Head of Big Data and Machine Learning Big Data definition? Big Data is about structured vs unstructured data Big Data is about Volume
Big Data for smart infrastructure: London Bridge Station Redevelopment. Sinan Ackigoz & Krishna Kumar 10.09.15 Cambridge, UK
Big Data for smart infrastructure: London Bridge Station Redevelopment Sinan Ackigoz & Krishna Kumar 10.09.15 Cambridge, UK Redeveloping the redeveloped station 1972 vision 2012 vision 1972 Vision: Two
Parallel Databases. Parallel Architectures. Parallelism Terminology 1/4/2015. Increase performance by performing operations in parallel
Parallel Databases Increase performance by performing operations in parallel Parallel Architectures Shared memory Shared disk Shared nothing closely coupled loosely coupled Parallelism Terminology Speedup:
Information Management course
Università degli Studi di Milano Master Degree in Computer Science Information Management course Teacher: Alberto Ceselli Lecture 01 : 06/10/2015 Practical informations: Teacher: Alberto Ceselli ([email protected])
Big Data in Subsea Solutions
Big Data in Subsea Solutions Subsea Valley Conference 2014 Telenor Arena, Fornebu, April 2-3 Roar Fjellheim, Computas AS Computas AS - Brief company profile Norwegian IT consulting company providing services
Big Data and Analytics: Challenges and Opportunities
Big Data and Analytics: Challenges and Opportunities Dr. Amin Beheshti Lecturer and Senior Research Associate University of New South Wales, Australia (Service Oriented Computing Group, CSE) Talk: Sharif
MapReduce and Hadoop Distributed File System
MapReduce and Hadoop Distributed File System 1 B. RAMAMURTHY Contact: Dr. Bina Ramamurthy CSE Department University at Buffalo (SUNY) [email protected] http://www.cse.buffalo.edu/faculty/bina Partially
Big Data Analytics. The Hype and the Hope* Dr. Ted Ralphs Industrial and Systems Engineering Director, COR@L Laboratory
Big Data Analytics The Hype and the Hope* Dr. Ted Ralphs Industrial and Systems Engineering Director, COR@L Laboratory * Source: http://www.economistinsights.com/technology-innovation/analysis/hype-and-hope/methodology
AN EFFICIENT SELECTIVE DATA MINING ALGORITHM FOR BIG DATA ANALYTICS THROUGH HADOOP
AN EFFICIENT SELECTIVE DATA MINING ALGORITHM FOR BIG DATA ANALYTICS THROUGH HADOOP Asst.Prof Mr. M.I Peter Shiyam,M.E * Department of Computer Science and Engineering, DMI Engineering college, Aralvaimozhi.
Industry 4.0 and Big Data
Industry 4.0 and Big Data Marek Obitko, [email protected] Senior Research Engineer 03/25/2015 PUBLIC PUBLIC - 5058-CO900H 2 Background Joint work with Czech Institute of Informatics, Robotics and
Big Data Mining Services and Knowledge Discovery Applications on Clouds
Big Data Mining Services and Knowledge Discovery Applications on Clouds Domenico Talia DIMES, Università della Calabria & DtoK Lab Italy [email protected] Data Availability or Data Deluge? Some decades
Simultaneous Gamma Correction and Registration in the Frequency Domain
Simultaneous Gamma Correction and Registration in the Frequency Domain Alexander Wong [email protected] William Bishop [email protected] Department of Electrical and Computer Engineering University
Machine Learning over Big Data
Machine Learning over Big Presented by Fuhao Zou [email protected] Jue 16, 2014 Huazhong University of Science and Technology Contents 1 2 3 4 Role of Machine learning Challenge of Big Analysis Distributed
