Methods for Analysing Large-Scale Resources and Big Music Data
1 Methods for Analysing Large-Scale Resources and Big Music Data
Tillman Weyde
Department of Computer Science, Music Informatics Research Group, City University London
2 Overview
- Large-Scale Music Analysis
  - Big Music Data
  - Available technologies and methods
- The Digital Music Lab
  - Architecture
  - Data
- Example
  - Chord recognition
  - Chord sequence analysis
  - Visualisations
- Hands-on tasks
  - Exploring British Library content
The DML Project: Objectives and Methodology
3 Large Scale Music Analysis
- Music has gone digital on a large scale.
- What about musicology?
- What is different about large-scale data?
- How can we analyse large-scale music data?
The Digital Music Lab Project
4 Large Scale Music Analysis
- Big Music Data collections are becoming increasingly available digitally, e.g.
  - iTunes and Spotify (and others) offer over 30 million tracks each
  - The British Library holds ~3 million audio recordings (~10% are digitised), 1.5 million music prints and 100k manuscripts
  - Internet Archive: 2.5 million audio tracks
5 Large Scale Music Analysis
- Big Music Data in Research
  - Systematic Musicology has developed as "data oriented empirical research". Parncutt, R. Systematic musicology and the history and future of Western musical scholarship. Journal of Interdisciplinary Music Studies, 1, 1-32.
  - ... working with larger data sets will open up new areas of musicology. N. Cook (2005). The Compleat Musicologist. Keynote speech at ISMIR.
6 Big Data Analysis Workflow
- Acquisition, Storage and Management
  - Large Hardware & Software Systems
- Exploration, Hypothesis development
  - Query & Search, Visualization
- Modelling & Testing
  - Statistical tools
7 Big Data Analysis Technology
- Parallel processing on large arrays
  - Cheap unreliable hardware
- Software Architecture
  - Map/Reduce, Computation Graphs (Hadoop, Spark)
- Algorithms
  - Failure-tolerant, efficient, simple
- Visualisations
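The Map/Reduce pattern named on this slide can be sketched in plain Python. This is only a single-machine illustration of the idea, not Hadoop or Spark code; the track records and the helper names `map_track` and `reduce_counts` are hypothetical.

```python
from collections import Counter
from functools import reduce

# Hypothetical per-track records; a real system would read these from
# distributed storage rather than an in-memory list.
tracks = [
    {"id": "t1", "chords": ["C", "G", "Am", "F"]},
    {"id": "t2", "chords": ["C", "F", "G", "C"]},
    {"id": "t3", "chords": ["Am", "F", "C", "G"]},
]


def map_track(track):
    """Map step: turn one track into partial chord counts.

    Each track is processed independently, so this step can be spread
    over many workers and simply re-run if one of them fails.
    """
    return Counter(track["chords"])


def reduce_counts(left, right):
    """Reduce step: merge two partial results (associative and commutative)."""
    return left + right


partials = map(map_track, tracks)
totals = reduce(reduce_counts, partials, Counter())
print(totals)
```

Because the reduce step is associative, partial results can be merged in any order, which is what makes the approach tolerant of cheap, unreliable hardware.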
8 Big Music Data Applications (1)
- Popular:
  - Animated
  - Recommendation (network based)
[Figures: Narrative 2.0; Liveplasma 2.0]
9 Big Music Data Applications (2)
- Not many, e.g. music history graph by Google (since 1950, not classical music)
10 Digital Transformations in Musicology
- Challenges
  - Gap between musicology and music technology (music information retrieval)
  - Large heterogeneous data collections
- Need for software infrastructure
  - Audio and symbolic music processing
  - Connecting resources (semantic web, linked data)
  - Tools and visual interfaces
- Methods for gaining musical insight from data
11 Open Questions (partly)
- How can music research use audio transcription and analysis on large data collections?
- How can we provide an infrastructure that enables researchers to make use of large data collections and create reusable open datasets?
- How can computational tools be made usable for music researchers, musicians and other users (who are not necessarily computer scientists)?
12 Musicological Questions From 2014 Workshop
- Analysing styles, trends over time
- Work across different heterogeneous collections
- Utilise external metadata and annotations
13 Infrastructure needed
- Feature Extraction
  - Vamp and other plug-ins
  - Parallelisation
- Middleware
  - Semantic Web
  - Music Ontology
  - Aggregation and collection level analysis
13/03/2015
14 Large Scale Music Analysis
- Plan for this session
  - Show Digital Music Lab and our approach to large scale music analysis
  - Workflow and technologies
  - Present features and interfaces
  - Hands-on data exploration
  - Methods for further analysis
  - Discussion
15 The Digital Music Lab
16 The Digital Music Lab project
- January March 2015 small follow-up project running now
- City University (Dpt of Computer Science, Dpt of Music)
  - Tillman Weyde, Stephen Cottrell, Jason Dykes, Emmanouil Benetos, Daniel Wolff, Dan Tidhar, Alex Kachkaev
- Queen Mary UoL (Centre for Digital Music)
  - Mark Plumbley, Simon Dixon, Mathieu Barthet, Steven Hargreaves
- University College London (Dpt of Computer Science, Centre for Digital Humanities)
  - Nicolas Gold, Samer Abdallah
- British Library (BL Labs)
  - Aquiles Alencar-Brayner, Mahendra Mahey, Adam Tovell
17 Goals
- Develop a networked infrastructure to bring computation to the data
  - Avoid copyright problems by design
- Integrate audio feature extraction and transcription
- Development of analysis tools
- Interactive visual interfaces
- Musicological applications
18 Outputs
- Curated datasets and derived data (>4 Terabytes)
- Web service with visual interfaces for data exploration
- Publications (more to come)
- Redistributable virtual machine images (in preparation)
19 The Digital Music Lab: Overview
20 The DML System Provides...
- Access: Systematic exploration of heterogeneous and large music libraries
- Control: Interfacing with complex automatic music analysis tools
- Analysis: Gain summarised knowledge on large numbers of recordings
- Sharing: Experiments reproducible with same data, clear provenance of analysis results.
21 The Technical Perspective
- Access to data
  - Audio access restricted by physical location
  - Metadata unification of different formats
- Control via web interface to large-scale analysis
  - Interactive UI for overview and exploration
  - Scalable analysis is available on collection-level and recording-level
- Share the well-defined and derived data
  - Re-use of existing software and published code for analysis
22 Software Ecosystem
- Distributed system
  - Virtual machines (VirtualBox)
  - Open Source OS (Ubuntu)
- Parallelised existing analysis tools
  - Python (NumPy)
  - Vamp Plugins
  - Big-Data map-reduce (Spark)
- Computation management
  - Built on semantic architecture
- Interactive user interface for exploration and analysis
  - Built using state-of-the-art web technologies
23 Data-Flow for Computational Analysis
[Diagram components: User Interface; Web Server; Database: Results & Metadata; Analysis Management: ClioPatria; Audio, Transcriptions and Feature Storage; Computing Server. Edge labels: Provide; Access Audio and Features]
24 Physical Locations Matter: Content Access
- Two computing servers, located at BL and ILM
  - Allow for in-place access to restricted data
- Dedicated server at City for web access
25 Sustainability
- Preference for Open Source
  - Basic infrastructure (Ubuntu, Spark, Vamp...)
- Soundsoftware repository for
  - Publishing versioned code of newly developed software
  - Backup and sharing: Open data / features / results
- Open and reproducible method
  - Enables similar set-up in further institutions
26 Results Implemented in the DML System
- Conceptual framework (including implementation) for collection-level analysis
  - Collection in focus as object of analysis
  - Data-flow for interactive retrieval of results
- Secure, responsive and redundant network structure
  - Distributed computation resources
- Open-source software ecosystem for large-scale music analysis
  - Parallelised feature extraction and results management
  - Collection-level analysis, interface and visualisation
27 Feature Extraction
28 Audio Descriptors List
1. Spectrogram
2. MFCCs
3. Chroma
4. Onsets
5. Speech/Music Segmentation
6. Chords
7. Beats/Tempo
8. Key
9. Melody
10. Note Transcription
29 Raw Audio 1. B Sample from CHARM: JS Bach, Chorale Prelude - Beloved Jesus, Cohen, Harriet (piano), Columbia,
30 Spectrogram
- 2 versions:
  - STFT magnitude spectrogram
  - Constant-Q Transform magnitude spectrogram
31 MFCCs
- Stand for: Mel-Frequency Cepstral Coefficients
- Extracted using QM Vamp Plugin Set
32 Chroma
- Spectrum projected onto 12 bins (representing semitones of an octave)
- Extracted using: QM Chromagram and NNLS Chroma Vamp plugins
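The projection of a spectrum onto 12 semitone bins can be illustrated with a short sketch. It is a deliberately simplified version of the idea and not the algorithm used by the QM Chromagram or NNLS Chroma plugins; the input format (a list of frequency/magnitude pairs) and the function names are assumptions made for this example.

```python
import math


def pitch_class(freq_hz, ref_a4=440.0):
    """Map a frequency to one of 12 pitch classes (0 = C, ..., 9 = A)."""
    midi = 69 + 12 * math.log2(freq_hz / ref_a4)  # fractional MIDI note number
    return int(round(midi)) % 12


def chroma(spectrum):
    """Fold (frequency_hz, magnitude) pairs into a normalised 12-bin vector."""
    bins = [0.0] * 12
    for freq_hz, magnitude in spectrum:
        if freq_hz > 0:
            bins[pitch_class(freq_hz)] += magnitude
    total = sum(bins)
    return [b / total for b in bins] if total else bins


# Hypothetical spectral peaks of an A major triad spread over two octaves.
peaks = [(220.0, 1.0), (277.18, 0.8), (329.63, 0.7), (440.0, 0.5)]
vector = chroma(peaks)
print([round(v, 2) for v in vector])
```

Octave information is discarded on purpose: the peaks at 220 Hz and 440 Hz both contribute to the bin for A.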
33 Onsets
- Onset: the beginning of a musical note or another sound
- Extracted using QM Onset Vamp plugin
34 Speech/Music Segmentation
- Useful for ethnographic recordings/radio broadcasts
- Extracted using BBC Speech/Music Segmentation Vamp Plugin
35 Chords
- Extracted using Chordino Vamp Plugin
36 Beats
- Beat locations labelled with metrical position
- Extracted using Beatroot, Marsyas, Tempotracker Vamp Plugins
37 Tempo
- Estimated based on onset/beat information
- Extracted using Tempotracker and Tempogram Vamp plugins
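How tempo follows from beat information can be shown with a minimal sketch. It assumes beat times in seconds are already available (for example from a beat tracker) and uses the median inter-beat interval; this is an illustrative choice, not the method of the Tempotracker or Tempogram plugins, and the function names are hypothetical.

```python
from statistics import median


def tempo_bpm(beat_times):
    """Global tempo estimate from beat times (seconds), via the median interval."""
    intervals = [b - a for a, b in zip(beat_times, beat_times[1:])]
    if not intervals:
        raise ValueError("need at least two beats")
    return 60.0 / median(intervals)


def tempo_curve(beat_times):
    """Local tempo per beat interval, as (start_time, bpm) pairs."""
    return [
        (a, 60.0 / (b - a))
        for a, b in zip(beat_times, beat_times[1:])
        if b > a
    ]


# Hypothetical beat times with a slowing-down at the end.
beats = [0.0, 0.5, 1.0, 1.5, 2.1, 2.8]
print(tempo_bpm(beats))
print(tempo_curve(beats))
```

The median keeps the global estimate robust against a few irregular beats, while the curve keeps the local changes that a tempo-curve view would display.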
38 Key
- Extracted using QM Key Vamp plugin (supports major/minor keys)
39 Melody
- Or more precisely: Sequence of fundamental frequency (F0) values corresponding to the perceived pitch of the main melody.
- Extracted using MELODIA Vamp plugin
40 Note Transcription - Semitone Resolution
- Multiple-pitch detection (onset/offset/pitch/velocity)
- Extracted using Silvet Vamp plugin
- Synthesized transcription example:
41 Note Transcription - High Pitch Resolution
- Multiple-pitch detection on a 20-cent resolution, useful for tuning/temperament analysis and analysis of non-Western music
- Extracted using Silvet Vamp plugin
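Why a 20-cent grid matters for tuning analysis can be seen with a small calculation. The sketch below converts a frequency to cents and snaps it to a grid; the function names are hypothetical and it says nothing about how Silvet works internally.

```python
import math


def cents_from_a4(freq_hz, ref_a4=440.0):
    """Pitch distance from A4 in cents (100 cents = one equal-tempered semitone)."""
    return 1200.0 * math.log2(freq_hz / ref_a4)


def quantise(cents, step=20):
    """Snap a cent value to the nearest multiple of `step`."""
    return step * round(cents / step)


def deviation_from_semitone(cents):
    """Signed offset in cents from the nearest equal-tempered semitone."""
    return cents - 100 * round(cents / 100)


# A just major third above A4 (frequency ratio 5:4).
third = cents_from_a4(440.0 * 5 / 4)
print(round(third, 2), quantise(third, 100), quantise(third, 20))
```

On a semitone grid the just third is indistinguishable from the equal-tempered one at 400 cents; on a 20-cent grid it lands in a different bin, which is the kind of difference temperament studies rely on.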
42 Data
- British Library
- CHARM
- I Like Music
43 Data - British Library Music Dataset
- Currently identifying, organising, and curating available music data collections from the BL Sound Archive
- Over 3M digital audio recordings, in a variety of formats
- Copyright-cleared material will be made available to the public
- Copyright-restricted material will be accessible to BL users
44 Data - I Like Music Dataset
- I Like Music: digital music service provider to companies who hold public performance licences
- Sole provider of online music to the BBC
- Holds a commercial music library of 1.2M tracks and a production music library of 400k tracks
45 Data - CHARM and Mazurka Dataset
- CHARM: AHRC Research Centre for the History and Analysis of Recorded Music
- CHARM Dataset: 5k copyright-free historical recordings + metadata
- Mazurka Dataset: 3k recordings + metadata
- Ideal for musicological analysis using computational methods
46 Extracted features available today
- ILM, BL, CHARM datasets ~ tracks
- Transcriptions: MIDI and high resolution
- Beats and tempo curves
- Chroma
- Chord and key
47 Infrastructure
- Feature Extraction
  - Vamp plug-ins
  - Spark and other techniques for parallelisation
- Middleware
  - Semantic Web server (RDF with Prolog using ClioPatria)
  - Music Ontology
  - Manages aggregation and collection level analysis
  - Provides SPARQL endpoint
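As a rough idea of what using a SPARQL endpoint looks like from Python, the sketch below builds a request URL following the standard SPARQL protocol (a GET request with a `query` parameter). The endpoint address is a placeholder, and the assumption that tracks are typed as `mo:Track` with a `dc:title` is illustrative only; the actual DML data model may differ. No network request is made.

```python
from urllib.parse import urlencode

# Placeholder address, NOT the DML endpoint.
ENDPOINT = "http://example.org/sparql"

# Assumed modelling: tracks typed with the Music Ontology and titled with
# Dublin Core. Adjust to the vocabulary the endpoint really uses.
QUERY = """
PREFIX mo: <http://purl.org/ontology/mo/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?track ?title WHERE {
  ?track a mo:Track ;
         dc:title ?title .
} LIMIT 10
"""


def build_request_url(endpoint, query):
    """Encode a query as a SPARQL-protocol GET URL."""
    return endpoint + "?" + urlencode({"query": query.strip()})


url = build_request_url(ENDPOINT, QUERY)
print(url)
```

The resulting URL could be fetched with any HTTP client, asking for a JSON or CSV result format via the `Accept` header.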
48 Infrastructure
- Derived data from 2 collections
- Accessible via the web
49 Interfaces and Visualisations
- Audio collections
- Chord sequence patterns
- Tag crowd-sourcing
50 Studies
- Temperament
- Chord progressions
[Figure: temperament comparison; labels: ET, Just, Vallotti, fcmtge, scmtfd, Kellner, Lehman]
51 Examples: Chord Sequence Analysis
52 Large-Scale Analysis of Chord Sequences
- Extracted chord sequences (e.g. Am7/E7/Gmaj7, etc.)
  - On ILM's commercial music collection (1 million tracks)
- Parallel processing of multiple music clips
  - Chordino Vamp plugin (Queen Mary University of London)
  - 6 weeks on 8 core virtual machine
- Retrieve most frequent chord patterns using Sequential Pattern Mining (SPM)
  - In specific genre subsets (classical, folk, jazz, blues, rock, reggae)
- Chord pattern graphs visualised with open source graphviz
Barthet, M. et al. Big Chord Data Extraction and Mining. In: CIM 2014
53 Audio-based Automatic Chord Recognition
- Chordino Vamp plugin [Mauch and Dixon, ISMIR 2010].
- Uses chromagram obtained with a non-negative least squares (NNLS) procedure for approximate note transcription.
- Accuracy of 80% when assessed with MIREX 2009's dataset (popular songs from the Beatles and Queen).
- Excerpt from Buddy Guy's Mary Had A Little Lamb
54 Chord Sequence Patterns
- Segmentation of audio recordings into chord sequences
- Representation of no detection (N)
- The most frequent (non-contiguous) chord sequences are the patterns
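The notion of a frequent non-contiguous chord pattern can be made concrete with a brute-force sketch. It enumerates every ordered sub-sequence of a fixed length and counts in how many recordings each one occurs; real Sequential Pattern Mining algorithms avoid this exhaustive enumeration, so this is an illustration of the definition rather than the method used in the study. The example sequences are invented.

```python
from collections import Counter
from itertools import combinations


def subsequences(chords, length):
    """All distinct ordered (possibly non-contiguous) sub-sequences of `length`."""
    return {
        tuple(chords[i] for i in idx)
        for idx in combinations(range(len(chords)), length)
    }


def frequent_patterns(sequences, length, min_support):
    """Patterns found in at least `min_support` sequences (once per sequence)."""
    support = Counter()
    for chords in sequences:
        support.update(subsequences(chords, length))
    return {p: n for p, n in support.items() if n >= min_support}


# Invented chord sequences; "N" marks a segment with no chord detected.
songs = [
    ["C", "F", "G", "C"],
    ["C", "Am", "F", "G"],
    ["N", "C", "G", "Am"],
]
patterns = frequent_patterns(songs, length=2, min_support=3)
print(patterns)
```

Here C followed later by G is the only length-2 pattern present in all three sequences, even though the two chords are adjacent in only one of them.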
55 Chord Pattern Length
- Bell-shaped curve for all genres
- Jazz and Classical have more patterns: greater harmonic diversity?
- Folk has more long patterns?
56 Frequent Chord Patterns: Classical vs. Blues
[Figure panels: Classical; Blues]
- Classical patterns more distributed
- Blues connects dom7 with dom7, classical doesn't (dom7 is typically resolved to major or minor tonic)
57 Visualisation of Chord Sequence Patterns
- Most prominent chord sequences
- Compare two collections or visualisations
Kachkaev, A. et al. Visualising Chord Progressions in Music Collections. In: CIM 2014
58 Circular Grid (circle of fifths, straight and twisted)
59 Parallel coordinates (chord types, circle of fifths)
60 Transition matrix (chord types, roots)
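The data behind a transition-matrix view of chord roots can be sketched as follows. The chord-label parsing is a simplifying assumption (roots written with sharps only, e.g. `F#maj7`), and the progression is invented; the actual visualisation may derive its matrix differently.

```python
from collections import defaultdict


def root_of(label):
    """Root of a chord label such as 'Am7' or 'F#maj7' (sharps only here)."""
    return label[:2] if len(label) > 1 and label[1] == "#" else label[:1]


def transition_matrix(chords):
    """Row-normalised transition probabilities between successive chord roots."""
    counts = defaultdict(lambda: defaultdict(int))
    for a, b in zip(chords, chords[1:]):
        counts[root_of(a)][root_of(b)] += 1
    matrix = {}
    for src, row in counts.items():
        total = sum(row.values())
        matrix[src] = {dst: n / total for dst, n in row.items()}
    return matrix


# Invented progression for illustration.
progression = ["C", "Am7", "F", "G7", "C", "F", "G7", "C"]
matrix = transition_matrix(progression)
for src, row in matrix.items():
    print(src, row)
```

Each row sums to 1, so rows can be compared between collections of different sizes.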
61 Tonnetz
62 Folk vs. Jazz
63 Demo
64 Practical exercises
- Open (similarity) or (tempo curve)
- Follow the worksheet or explore freely
  - Pitch (class) distribution
  - Similarity
  - Tuning
  - Tempo curves
- Take notes of results, ideas and hypotheses
- You can explore the chord sequences here:
65 Practical exercises
- Tasks 1
  - Schoenberg has a relatively flat pitch profile
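One way to put a number on how "flat" a pitch profile is uses the normalised Shannon entropy of the pitch-class histogram. This is a suggestion for follow-up analysis, not a measure provided by the DML interface, and the histograms below are invented rather than taken from any recording.

```python
import math


def normalised_entropy(histogram):
    """Shannon entropy of a histogram, scaled to [0, 1] (1 = perfectly flat)."""
    total = sum(histogram)
    if total == 0:
        raise ValueError("empty histogram")
    probs = [count / total for count in histogram if count > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return entropy / math.log2(len(histogram))


# Invented pitch-class counts (C, C#, ..., B), for illustration only.
tonal = [30, 2, 18, 2, 20, 15, 2, 25, 2, 12, 2, 10]
flat = [10] * 12

print(round(normalised_entropy(tonal), 3))
print(round(normalised_entropy(flat), 3))
```

A profile concentrated on the notes of one key scores lower than a profile that uses all twelve pitch classes equally.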
66 Practical exercises
- Tasks 2
  - Mozart sonatas seem more homogeneous
67 Practical exercises
- Tasks 3
  - Tuning was stable between these periods
68 Practical exercises
- Tasks 4
  - Tempo smooths out, some final ritardando
69 Where to go from here
- Found something interesting?
- Results need careful interpretation wrt
  - Noisy data
  - Data collection
  - Significance
  - Cross-validation
  - Meaning
- Can be challenging, needs expertise in music and data
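Checking significance can start with something as simple as a permutation test. The sketch below asks whether a difference in mean tempo between two collections could plausibly arise by chance; the test choice, the function name and the numbers are all illustrative assumptions, and noisy or unevenly sampled data still need separate scrutiny.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation test for a difference in means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # random relabelling of the pooled values
        a, b = pooled[:len(group_a)], pooled[len(group_a):]
        if abs(mean(a) - mean(b)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Invented tempo estimates (BPM) for two collections.
collection_a = [118, 122, 125, 119, 121, 124]
collection_b = [96, 101, 99, 104, 98, 100]
p = permutation_p_value(collection_a, collection_b)
print(p)
```

A small p-value only says the difference is unlikely under random relabelling; whether it is musically meaningful remains a question for interpretation.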
70 Where to go from here (ctd.)
- Follow up ideas and hypotheses by close examination of the data
- Extract data with SPARQL interface (more on Semantic Web and SPARQL tomorrow)
- Use programming languages and statistical tools, e.g. Python, Pandas, SciPy, R, Matlab, SPSS
71 The end
- Open discussion
- Thank you
More informationWhat do Big Data & HAVEn mean? Robert Lejnert HP Autonomy
What do Big Data & HAVEn mean? Robert Lejnert HP Autonomy Much higher Volumes. Processed with more Velocity. With much more Variety. Is Big Data so big? Big Data Smart Data Project HAVEn: Adaptive Intelligence
More informationM3039 MPEG 97/ January 1998
INTERNATIONAL ORGANISATION FOR STANDARDISATION ORGANISATION INTERNATIONALE DE NORMALISATION ISO/IEC JTC1/SC29/WG11 CODING OF MOVING PICTURES AND ASSOCIATED AUDIO INFORMATION ISO/IEC JTC1/SC29/WG11 M3039
More informationPrinciples for Working with Big Data"
Principles for Working with Big Data" Juliana Freire Visualization and Data Analysis (ViDA) Lab Computer Science & Engineering Center for Urban Science & Progress (CUSP) Center for Data Science New York
More informationChapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization
Turban, Aronson, and Liang Decision Support Systems and Intelligent Systems, Seventh Edition Chapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization
More informationGUITAR TAB MINING, ANALYSIS AND RANKING
12th International Society for Music Information Retrieval Conference (ISMIR 2011) GUITAR TAB MINING, ANALYSIS AND RANKING Robert Macrae Centre for Digital Music Queen Mary University of London robert.macrae@eecs.qmul.ac.uk
More informationSoftware Description Technology
Software applications using NCB Technology. Software Description Technology LEX Provide learning management system that is a central resource for online medical education content and computer-based learning
More informationLJMU Research Data Policy: information and guidance
LJMU Research Data Policy: information and guidance Prof. Director of Research April 2013 Aims This document outlines the University policy and provides advice on the treatment, storage and sharing of
More informationA Review of Data Mining Techniques
Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 4, April 2014,
More informationWROX Certified Big Data Analyst Program by AnalytixLabs and Wiley
WROX Certified Big Data Analyst Program by AnalytixLabs and Wiley Disclaimer: This material is protected under copyright act AnalytixLabs, 2011. Unauthorized use and/ or duplication of this material or
More informationCiteSeer x in the Cloud
Published in the 2nd USENIX Workshop on Hot Topics in Cloud Computing 2010 CiteSeer x in the Cloud Pradeep B. Teregowda Pennsylvania State University C. Lee Giles Pennsylvania State University Bhuvan Urgaonkar
More informationMetadata Repositories in Health Care. Discussion Paper
Health Care and Informatics Review Online, 2008, 12(3), pp 37-44, Published online at www.hinz.org.nz ISSN 1174-3379 Metadata Repositories in Health Care Discussion Paper Dr Karolyn Kerr karolynkerr@hotmail.com
More informationApache Hadoop: The Big Data Refinery
Architecting the Future of Big Data Whitepaper Apache Hadoop: The Big Data Refinery Introduction Big data has become an extremely popular term, due to the well-documented explosion in the amount of data
More informationOpenCB a next generation big data analytics and visualisation platform for the Omics revolution
OpenCB a next generation big data analytics and visualisation platform for the Omics revolution Development at the University of Cambridge - Closing the Omics / Moore s law gap with Dell & Intel Ignacio
More informationBYODs & FAIR Data Stewardship
BYODs & FAIR Data Stewardship Luiz Olavo Bonino luiz.bonino@dtls.nl www.elixir-europe.org Summary FAIR Data stewardship Approach in NL BYOD FAIR Data tooling ecosystem Way of working (FAIR) Data Stewardship
More informationAmit Sheth & Ajith Ranabahu, 2010. Presented by Mohammad Hossein Danesh
Amit Sheth & Ajith Ranabahu, 2010 Presented by Mohammad Hossein Danesh 1 Agenda Introduction to Cloud Computing Research Motivation Semantic Modeling Can Help Use of DSLs Solution Conclusion 2 3 Motivation
More informationBIG DATA: FROM HYPE TO REALITY. Leandro Ruiz Presales Partner for C&LA Teradata
BIG DATA: FROM HYPE TO REALITY Leandro Ruiz Presales Partner for C&LA Teradata Evolution in The Use of Information Action s ACTIVATING MAKE it happen! Insights OPERATIONALIZING WHAT IS happening now? PREDICTING
More informationLinked Science as a producer and consumer of big data in the Earth Sciences
Linked Science as a producer and consumer of big data in the Earth Sciences Line C. Pouchard,* Robert B. Cook,* Jim Green,* Natasha Noy,** Giri Palanisamy* Oak Ridge National Laboratory* Stanford Center
More informationBIGDATA GREENPLUM DBA INTRODUCTION COURSE OBJECTIVES COURSE SUMMARY HIGHLIGHTS OF GREENPLUM DBA AT IQ TECH
BIGDATA GREENPLUM DBA Meta-data: Outrun your competition with advanced knowledge in the area of BigData with IQ Technology s online training course on Greenplum DBA. A state-of-the-art course that is delivered
More informationThe Knowledge Sharing Infrastructure KSI. Steven Krauwer
The Knowledge Sharing Infrastructure KSI Steven Krauwer 1 Why a KSI? Building or using a complex installation requires specialized skills and expertise. CLARIN is no exception. CLARIN is populated with
More informationBig Data and Analytics: A Conceptual Overview. Mike Park Erik Hoel
Big Data and Analytics: A Conceptual Overview Mike Park Erik Hoel In this technical workshop This presentation is for anyone that uses ArcGIS and is interested in analyzing large amounts of data We will
More informationImplement Hadoop jobs to extract business value from large and varied data sets
Hadoop Development for Big Data Solutions: Hands-On You Will Learn How To: Implement Hadoop jobs to extract business value from large and varied data sets Write, customize and deploy MapReduce jobs to
More informationThe Value of Taxonomy Management Research Results
Taxonomy Strategies November 28, 2012 Copyright 2012 Taxonomy Strategies. All rights reserved. The Value of Taxonomy Management Research Results Joseph A Busch, Principal What does taxonomy do for search?
More informationProgramme Specification Postgraduate Programmes
Programme Specification Postgraduate Programmes Awarding Body/Institution Teaching Institution University of London Goldsmiths, University of London Name of Final Award and Programme Title MSc Data Science
More informationSensor data management software, requirements and considerations. Don Henshaw H.J. Andrews Experimental Forest
Sensor data management software, requirements and considerations Don Henshaw H.J. Andrews Experimental Forest Joint NERC Environmental Sensor Network/LTER SensorNIS Workshop, October 25-27 th, 2011 COMMON
More informationDatabase preservation toolkit:
Nov. 12-14, 2014, Lisbon, Portugal Database preservation toolkit: a flexible tool to normalize and give access to databases DLM Forum: Making the Information Governance Landscape in Europe José Carlos
More informationUsing Big Data and GIS to Model Aviation Fuel Burn
Using Big Data and GIS to Model Aviation Fuel Burn Gary M. Baker USDOT Volpe Center 2015 Transportation DataPalooza June 17, 2015 The National Transportation Systems Center Advancing transportation innovation
More informationKnowledgent White Paper Series. Developing an MDM Strategy WHITE PAPER. Key Components for Success
Developing an MDM Strategy Key Components for Success WHITE PAPER Table of Contents Introduction... 2 Process Considerations... 3 Architecture Considerations... 5 Conclusion... 9 About Knowledgent... 10
More information