How To Use A Webmail On A Pc Or Macodeo.Com
|
|
|
- Bernard Mathews
- 5 years ago
- Views:
Transcription
1 Big data workloads and real-world data sets Gang Lu Institute of Computing Technology, Chinese Academy of Sciences BigDataBench Tutorial MICRO 2014 Cambridge, UK INSTITUTE OF COMPUTING TECHNOLOGY 1
2 Five domains n Search engine n Social network n E- commence n Mul9- media n Bioinforma9cs
3 Search Engine General search and ver9cal search Online server and Offline analy9cs
4 n Parsing: n Search Engine: Parsing Extract the text content and out links from the raw web pages
5 n Indexing n Search Engine: Indexing The process of create the mapping of term to document id lists
6 Search Engine: PageRank n PageRank n Compute the importance of the page according to the web link graph using PageRank
7 n Querying n Search Engine: Search query The online web search server serving users requests
8 n Sor9ng n Search Engine: Sor9ng Sort the results according the page ranks and the relevance of between the query and the document
9 Search Engine: Recommenda9on n Recommenda9on n Recommend related queries to users by mining the search log
10 Search Engine: Sta9s9c cou9ng n Sta9s9c coun9ng n Coun9ng the word frequency to extract the key word which represent the features of the page
11 Search Engine: Classifica9on n Classifica9on n Classify text content into different categories, users can filter the results to a special category they are interested in
12 Search Engine: Filter & Seman9c n Filter n Iden9fy pages with specific topic which can be used for ver9cal search n Seman9c extrac9on n extract seman9c informa9on extrac9on
13 Search Engine: Data access n Data access opear9ons n Read, write, and scan the seman9c informa9on.
14 Social network n Data sets n User table n Rela9on table n Ar9cle table n Workloads n Offline analy9cs
15 Social network: Data schema User table Rela9on table Tweet table
16 Social network: Workloads n Hot review topic n Select the top N tweets by the number of review n Hot transmit topic n Select the tweets which are transmiwed more than N 9mes. n Ac9ve user n Select the top N person who post the largest number of tweets. n Leader of opinion n Select top ones whose number of review and transmit are both large than N.
17 Social network: Workloads n Topic classify n Classify the tweets to certain category according to the topic. n Sen9ment classify n Classify the tweets to nega9ve or posi9ve according to the sen9ment. n Friend recommenda9on n Recommend friend to person according the rela9onal graph. n Community detec9on n Detec9ng clusters or communi9es in large social networks. n Breadth first search n Sort persons according to the distance between two people.
18 Specifica9on: E- commerce Order table Item table
19 E- commerce n Data sets n Order table n Item table n Workloads n Offline analy9cs
20 E- commerce: Workloads n Select query n Find the items of which the sales amount is over 100 in a single order. n Aggrega9on query n Count the sales number of each goods. n Join query n Count the number of each goods that each buyer purchased between certain period of 9me. n Recommenda9on n Predict the preferences of the buyer and recommend goods to them. n Sensi9ve classifica9on n Iden9fy posi9ve or nega9ve review. n Basic data opera9on n Unit of opera9on of the data The workloads of select, aggrega2on, and join are similar as queries used in A. Pavlo s sigmod09 paper,but are BigDataBench specified in the e- commence environment MICRO 2014
21 Mul9media Voice Data Extrac1on Speech Recogni1on Video Data MPEG Decoder Frame Data Extrac1on Feature Extrac1on Image Segmenta1on Face Detec1on Three- Dimensional Reconstruc1on Tracing
22 Mul9media: Workloads n MPEG Decoder. n Decode video streams using MPEG- 2 standard. n Feature extrac9on n For a given video frame, extract features which are invariant to scale, noise, and illumina9on. n Speech Recogni9on. n For a given audio file, recognize the content of the file and find whether exists sensi9ve words.
23 n Ray Tracing. Mul9media: Workloads n Render a 2- Dimensional video frame to a 3- Dimensional scene. n Image Segmenta9on. n Segment the input video frame according to color, intensity, and texture, and extract concerned regions. n Face Detec9on. n Detect whether face exists in the input data, if exists, then extract the face. n Deep Learning. n The input images are classified into different categories, and then detect human face.
24 Bioinforma9cs n Sequence assembly. n Assemble scawered and repe99ve DNA fragments to original long sequence. n Sequence alignment. n Align assembled DNA sequence to known sequences in the database, and detect disease. Gene Sequencing Genome Sequence Data Sequence Assembly Sequence Mapping Sequence Alignment Detec1on Result
25 Summary:Real data sets
26 Summary:Search Engine Various implementa1on
27 Summary:Social network
28 Summary:E- commerce
29 Summary:Mul9media
30 Summary:Bioinforma9cs
31 Any Questions
Big Data in medical image processing
Big Data in medical image processing Konstan3n Bychenkov, CEO Aligned Research Group LLC [email protected] Big data in medicine Genomic Research Popula3on Health Images M- Health hips://cloud.google.com/genomics/v1beta2/reference/
Social Media Marke-ng for Academic Research
Social Media Marke-ng for Academic Research 1 David Altman Mar.n Son Susu Wong @MassTTC #Social @TOMO3603 Using Social Media in Technology Licensing Offices 2 David Altman Manager Marke9ng and Communica9ons
TOLOMEO. ORFEO Toolbox. Jordi Inglada - CNES. TOoLs for Open Mul/- risk assessment using Earth Observa/on data TOLOMEO
ORFEO Toolbox Jordi Inglada - CNES TOoLs for Open Mul/- risk assessment using Earth Observa/on data Outline ORFEO Toolbox : general characteris>cs Example of OTB features OTB Applica>ons & Processing Chains
Search and Real-Time Analytics on Big Data
Search and Real-Time Analytics on Big Data Sewook Wee, Ryan Tabora, Jason Rutherglen Accenture & Think Big Analytics Strata New York October, 2012 Big Data: data becomes your core asset. It realizes its
Online Gambling - Advantages And Disadvantages
MOVING YOUR BUSINESS ONLINE TO MAXIMIZE ROI By Shelby Landeck Manager of Client Relations, Income Access PRESENTATION OVERVIEW Why going online is important And what your business can achieve online Defining
How To Analyze Medical Image Data With A Feature Based Approach To Big Data Medical Image Analysis
A Feature- based Approach to Big Data Medical Image Analysis Ma$hew Toews $, Chris/an Wachinger, Raul San Jose Estepar, William Wells III $ École de Technologie Supérieur, Montreal Canada BWH, Harvard
DNS Big Data Analy@cs
Klik om de s+jl te bewerken Klik om de models+jlen te bewerken! Tweede niveau! Derde niveau! Vierde niveau DNS Big Data Analy@cs Vijfde niveau DNS- OARC Fall 2015 Workshop October 4th 2015 Maarten Wullink,
2014/02/13 Sphinx Lunch
2014/02/13 Sphinx Lunch Best Student Paper Award @ 2013 IEEE Workshop on Automatic Speech Recognition and Understanding Dec. 9-12, 2013 Unsupervised Induction and Filling of Semantic Slot for Spoken Dialogue
Cloudian The Storage Evolution to the Cloud.. Cloudian Inc. Pre Sales Engineering
Cloudian The Storage Evolution to the Cloud.. Cloudian Inc. Pre Sales Engineering Agenda Industry Trends Cloud Storage Evolu4on of Storage Architectures Storage Connec4vity redefined S3 Cloud Storage Use
Big Data. The Big Picture. Our flexible and efficient Big Data solu9ons open the door to new opportuni9es and new business areas
Big Data The Big Picture Our flexible and efficient Big Data solu9ons open the door to new opportuni9es and new business areas What is Big Data? Big Data gets its name because that s what it is data that
Understanding and Detec.ng Real- World Performance Bugs
Understanding and Detec.ng Real- World Performance Bugs Gouliang Jin, Linhai Song, Xiaoming Shi, Joel Scherpelz, and Shan Lu Presented by Cindy Rubio- González Feb 10 th, 2015 Mo.va.on Performance bugs
Ensemble Methods. Adapted from slides by Todd Holloway h8p://abeau<fulwww.com/2007/11/23/ ensemble- machine- learning- tutorial/
Ensemble Methods Adapted from slides by Todd Holloway h8p://abeau
Computer Networks. Examples of network applica3ons. Applica3on Layer
Computer Networks Applica3on Layer 1 Examples of network applica3ons e- mail web instant messaging remote login P2P file sharing mul3- user network games streaming stored video clips social networks voice
Run$me Query Op$miza$on
Run$me Query Op$miza$on Robust Op$miza$on for Graphs 2006-2014 All Rights Reserved 1 RDF Join Order Op$miza$on Typical approach Assign es$mated cardinality to each triple pabern. Bigdata uses the fast
The Data Reservoir. 10 th September 2014. Mandy Chessell FREng CEng FBCS Dis4nguished Engineer, Master Inventor Chief Architect, Informa4on Solu4ons
Mandy Chessell FREng CEng FBCS Dis4nguished Engineer, Master Inventor Chief Architect, Solu4ons The Reservoir 10 th September 2014 A growing demand Business Teams want Open access to more informa4on More
.nl ENTRADA. CENTR-tech 33. November 2015 Marco Davids, SIDN Labs. Klik om de s+jl te bewerken
Klik om de s+jl te bewerken Klik om de models+jlen te bewerken Tweede niveau Derde niveau Vierde niveau.nl ENTRADA Vijfde niveau CENTR-tech 33 November 2015 Marco Davids, SIDN Labs Wie zijn wij? Mijlpalen
OVERVIEW OF DATA EXPLORATION TECHNIQUES. Stratos Idreos, Olga Papaemmanouil, Surajit Chaudhuri SIGMOD 2015, Melbourne
OVERVIEW OF DATA EXPLORATION TECHNIQUES Stratos Idreos, Olga Papaemmanouil, Surajit Chaudhuri SIGMOD 2015, Melbourne USER INTERACTION express interests query/results recommendasons annotate collaborate
Università Telema/ca Internazionale UNINETTUNO Corso Vi9orio Emanuele II, n.39 00186 Roma Italia [email protected]
Università Telema/ca Internazionale UNINETTUNO Corso Vi9orio Emanuele II, n.39 00186 Roma Italia [email protected] Access to the online learning environment Management of the Student s page
Data Management in the Cloud: Limitations and Opportunities. Annies Ductan
Data Management in the Cloud: Limitations and Opportunities Annies Ductan Discussion Outline: Introduc)on Overview Vision of Cloud Compu8ng Managing Data in The Cloud Cloud Characteris8cs Data Management
Data Stream Algorithms in Storm and R. Radek Maciaszek
Data Stream Algorithms in Storm and R Radek Maciaszek Who Am I? l Radek Maciaszek l l l l l l Consul9ng at DataMine Lab (www.dataminelab.com) - Data mining, business intelligence and data warehouse consultancy.
SeqScape Software Version 2.5 Comprehensive Analysis Solution for Resequencing Applications
Product Bulletin Sequencing Software SeqScape Software Version 2.5 Comprehensive Analysis Solution for Resequencing Applications Comprehensive reference sequence handling Helps interpret the role of each
The Scientific Data Mining Process
Chapter 4 The Scientific Data Mining Process When I use a word, Humpty Dumpty said, in rather a scornful tone, it means just what I choose it to mean neither more nor less. Lewis Carroll [87, p. 214] In
Applying Deep Learning to Car Data Logging (CDL) and Driver Assessor (DA) October 22-Oct-15
Applying Deep Learning to Car Data Logging (CDL) and Driver Assessor (DA) October 22-Oct-15 GENIVI is a registered trademark of the GENIVI Alliance in the USA and other countries Copyright GENIVI Alliance
Search and Information Retrieval
Search and Information Retrieval Search on the Web 1 is a daily activity for many people throughout the world Search and communication are most popular uses of the computer Applications involving search
NCDS Leadership Summit " The Friday Center" Chapel Hill, North Carolina" April 23 & 24, 2013!
NCDS Leadership Summit " The Friday Center" Chapel Hill, North Carolina" April 23 & 24, 2013! Data Collection Scale of Problem Challenges v Research versus clinical contexts v Science versus medicine v
Analysis of Web Archives. Vinay Goel Senior Data Engineer
Analysis of Web Archives Vinay Goel Senior Data Engineer Internet Archive Established in 1996 501(c)(3) non profit organization 20+ PB (compressed) of publicly accessible archival material Technology partner
Big Data Benchmark Suite
BigDataBench: An Open source Big Data Benchmark Suite Jianfeng Zhan http://prof.ict.ac.cn/bigdatabench Professor, ICT, Chinese Academy of Sciences and University of Chinese Academy of Sciences WBDB 2015
WLAN Spectrum Analyzer Technology White Paper HUAWEI TECHNOLOGIES CO., LTD. Issue 01. Date 2013-05-10
WLAN Spectrum Analyzer Technology White Paper Issue 01 Date 2013-05-10 HUAWEI TECHNOLOGIES CO., LTD. 2013. All rights reserved. No part of this document may be reproduced or transmitted in any form or
M3039 MPEG 97/ January 1998
INTERNATIONAL ORGANISATION FOR STANDARDISATION ORGANISATION INTERNATIONALE DE NORMALISATION ISO/IEC JTC1/SC29/WG11 CODING OF MOVING PICTURES AND ASSOCIATED AUDIO INFORMATION ISO/IEC JTC1/SC29/WG11 M3039
Search Result Optimization using Annotators
Search Result Optimization using Annotators Vishal A. Kamble 1, Amit B. Chougule 2 1 Department of Computer Science and Engineering, D Y Patil College of engineering, Kolhapur, Maharashtra, India 2 Professor,
Search and Data Mining: Techniques. Applications Anya Yarygina Boris Novikov
Search and Data Mining: Techniques Applications Anya Yarygina Boris Novikov Introduction Data mining applications Data mining system products and research prototypes Additional themes on data mining Social
DTCC Data Quality Survey Industry Report
DTCC Data Quality Survey Industry Report November 2013 element 22 unlocking the power of your data Contents 1. Introduction 3 2. Approach and participants 4 3. Summary findings 5 4. Findings by topic 6
Analysis of Data Mining Concepts in Higher Education with Needs to Najran University
590 Analysis of Data Mining Concepts in Higher Education with Needs to Najran University Mohamed Hussain Tawarish 1, Farooqui Waseemuddin 2 Department of Computer Science, Najran Community College. Najran
TLD Data Analysis. ICANN Tech Day, Dublin. October 19th 2015 Maarten Wullink, SIDN. Klik om de s+jl te bewerken
Klik om de s+jl te bewerken Klik om de models+jlen te bewerken Tweede niveau Derde niveau Vierde niveau TLD Data Analysis Vijfde niveau ICANN Tech Day, Dublin October 19th 2015 Maarten Wullink, SIDN Wie
Introduc8on to Apache Spark
Introduc8on to Apache Spark Jordan Volz, Systems Engineer @ Cloudera 1 Analyzing Data on Large Data Sets Python, R, etc. are popular tools among data scien8sts/analysts, sta8s8cians, etc. Why are these
ANALYTICAL TECHNIQUES FOR DATA VISUALIZATION
ANALYTICAL TECHNIQUES FOR DATA VISUALIZATION CSE 537 Ar@ficial Intelligence Professor Anita Wasilewska GROUP 2 TEAM MEMBERS: SAEED BOOR BOOR - 110564337 SHIH- YU TSAI - 110385129 HAN LI 110168054 SOURCES
Research at the Department of Computer Science and Software Engineering. Professor Yong Yue BEng, PhD, CEng, FIET, FIMechE 17 October 2014
Research at the Department of Computer Science and Software Engineering Professor Yong Yue BEng, PhD, CEng, FIET, FIMechE 17 October 2014 Research Areas Ar%ficial intelligence Robo%cs Data mining Image
Applying Machine Learning to Network Security Monitoring. Alex Pinto Chief Data Scien2st MLSec Project @alexcpsec @MLSecProject!
Applying Machine Learning to Network Security Monitoring Alex Pinto Chief Data Scien2st MLSec Project @alexcpsec @MLSecProject! whoami Almost 15 years in Informa2on Security, done a licle bit of everything.
Ganzheitliches Datenmanagement
Ganzheitliches Datenmanagement für Hadoop Michael Kohs, Senior Sales Consultant @mikchaos The Problem with Big Data Projects in 2016 Relational, Mainframe Documents and Emails Data Modeler Data Scientist
BPOE Research Highlights
BPOE Research Highlights Jianfeng Zhan ICT, Chinese Academy of Sciences 2013-10- 9 http://prof.ict.ac.cn/jfzhan INSTITUTE OF COMPUTING TECHNOLOGY What is BPOE workshop? B: Big Data Benchmarks PO: Performance
ViCSiM. CAN/LIN Simulator and Monitor
CAN/LIN Simulator and Monitor ViCSiM CAN/LIN Communication Simulator and Monitor Even though it is low in price, it has advanced functions, such as log playback simulation and graph monitor. A product
Price pa)erns, charts and technical analysis: Technical Analysis. Aswath Damodaran
Price pa)erns, charts and technical analysis: Technical Analysis Aswath Damodaran Founda;ons of Technical Analysis: What are the assump;ons? 1. Price is determined solely by the interac;on of supply &
HiBench Introduction. Carson Wang ([email protected]) Software & Services Group
HiBench Introduction Carson Wang ([email protected]) Agenda Background Workloads Configurations Benchmark Report Tuning Guide Background WHY Why we need big data benchmarking systems? WHAT What is
BigDataBench. Khushbu Agarwal
BigDataBench Khushbu Agarwal Last Updated: May 23, 2014 CONTENTS Contents 1 What is BigDataBench? [1] 1 1.1 SUMMARY.................................. 1 1.2 METHODOLOGY.............................. 1 2
IMAGING SOFTWARE. Image-Pro Insight Image Analysis Made Easy. Capture, Process, Measure, and Share
IMAGING SOFTWARE Image-Pro Insight Image Analysis Made Easy Capture, Process, Measure, and Share Image-Pro Insight Image Analysis Made Easy Capture, Process, Measure, and Share Image-Pro Insight, the latest
2003-2015 Take 5 Solutions - All Rights Reserved.
2003 - Take 5 Solutions - All Rights Reserved. Overview Why Take 5 Solu/ons? Take 5's Unique Advantages Leadership Team Product Offerings Direct Mail List Rental Email List Rental and Retarge/ng Social
Privacy- Preserving P2P Data Sharing with OneSwarm. Presented by. Adnan Malik
Privacy- Preserving P2P Data Sharing with OneSwarm Presented by Adnan Malik Privacy The protec?on of informa?on from unauthorized disclosure Centraliza?on and privacy threat Websites Facebook TwiFer Peer
Performance Management in Big Data Applica6ons. Michael Kopp, Technology Strategist @mikopp
Performance Management in Big Data Applica6ons Michael Kopp, Technology Strategist NoSQL: High Volume/Low Latency DBs Web Java Key Challenges 1) Even Distribu6on 2) Correct Schema and Access paperns 3)
Three Step Redirect API
Inspire Commerce &.pay Three Step Redirect API Inspire Commerce 800-261-3173 [email protected] Contents Overview... 3 Methodology... 3 XML Communica:on... 5 Transac:on Opera:ons... 6 Customer
BIG DATA: FROM HYPE TO REALITY. Leandro Ruiz Presales Partner for C&LA Teradata
BIG DATA: FROM HYPE TO REALITY Leandro Ruiz Presales Partner for C&LA Teradata Evolution in The Use of Information Action s ACTIVATING MAKE it happen! Insights OPERATIONALIZING WHAT IS happening now? PREDICTING
Alarms of Stream MultiScreen monitoring system
STREAM LABS Alarms of Stream MultiScreen monitoring system Version 1.0, June 2013. Version history Version Author Comments 1.0 Krupkin V. Initial version of document. Alarms for MPEG2 TS, RTMP, HLS, MMS,
A SURVEY ON WEB MINING TOOLS
IMPACT: International Journal of Research in Engineering & Technology (IMPACT: IJRET) ISSN(E): 2321-8843; ISSN(P): 2347-4599 Vol. 3, Issue 10, Oct 2015, 27-34 Impact Journals A SURVEY ON WEB MINING TOOLS
ECBDL 14: Evolu/onary Computa/on for Big Data and Big Learning Workshop July 13 th, 2014 Big Data Compe//on
ECBDL 14: Evolu/onary Computa/on for Big Data and Big Learning Workshop July 13 th, 2014 Big Data Compe//on Jaume Bacardit [email protected] The Interdisciplinary Compu/ng and Complex BioSystems
Gyrus: A Framework for User- Intent Monitoring of Text- Based Networked ApplicaAons
Gyrus: A Framework for User- Intent Monitoring of Text- Based Networked ApplicaAons Yeongjin Jang*, Simon P. Chung*, Bryan D. Payne, and Wenke Lee* *Georgia Ins=tute of Technology Nebula, Inc 1 Tradi=onal
LSST Data Management plans: Pipeline outputs and Level 2 vs. Level 3
LSST Data Management plans: Pipeline outputs and Level 2 vs. Level 3 Mario Juric Robert Lupton LSST DM Project Scien@st Algorithms Lead LSST SAC Name of Mee)ng Loca)on Date - Change in Slide Master 1 Data
Apache Hadoop: The Pla/orm for Big Data. Amr Awadallah CTO, Founder, Cloudera, Inc. [email protected], twicer: @awadallah
Apache Hadoop: The Pla/orm for Big Data Amr Awadallah CTO, Founder, Cloudera, Inc. [email protected], twicer: @awadallah 1 The Problems with Current Data Systems BI Reports + Interac7ve Apps RDBMS (aggregated
Pu?ng B2B Research to the Legal Test
With the global leader in sampling and data services Pu?ng B2B Research to the Legal Test Ashlin Quirk, SSI General Counsel 2014 Survey Sampling Interna6onal 1 2014 Survey Sampling Interna6onal Se?ng the
Develop Computer Animation
Name: Block: A. Introduction 1. Animation simulation of movement created by rapidly displaying images or frames. Relies on persistence of vision the way our eyes retain images for a split second longer
Video compression: Performance of available codec software
Video compression: Performance of available codec software Introduction. Digital Video A digital video is a collection of images presented sequentially to produce the effect of continuous motion. It takes
Big Data /Data Science Data Intensive (Science) Technologies
Big Data /Data Science Data Intensive (Science) Technologies Adam Belloum Ins:tute of Informa:cs University of Amsterdam [email protected] High Performance compu:ng Curriculum, Jan 2015 hmp://www.hpc.uva.nl/
How To Get More Data From Your Computer
Industry Perspective: Big Data and Big Data Analytics David Barnes Program Director Emerging Internet Technologies IBM Software Group What is Big Data? The Adjacent Possible Inexpensive disk + Increased
