Building a Linked Data Graph for Education

Building a Linked Data Graph for Education. Dr Tom Heath (tom.heath@talis.com), Talis Education Ltd. SemTechBiz, London, September 2012.

What do we mean by an 'Education Graph'? The set of all (connected) entities that have an active or potential bearing on the education process (initially focused on higher education).

Why do we care about an 'Education Graph'? Because learning is artificially disconnected. An education graph can bridge those disconnects.

Components of the Education Graph: Course Descriptions; Publishers; Khan Academy, P2PU, etc.; Scheduling; Course Content; Biblio Databases; Students and Teachers; Subject Codes; Wikipedia; Facebook, etc.; YouTube.

Talis Aspire Reading Lists

Talis Aspire Reading Lists: more than 40 enterprise customers in the UK and beyond (= 1/3 of UK universities); 10,000s of reading lists; 100,000s of learning resources; heavy usage with interesting peaks in demand.

Talis Aspire Reading Lists: An RDF-native application, hosted by us. Backed by a hosted triplestore. Migrating to MongoDB. Linked Data views available on the public Web. A real, live Linked Data application with paying customers. (Probably) the most heavily used Linked Data application in the education domain.

Building and Enriching the Education Graph

From Plain Text to a 'Biblio-graph-ic' Record. Problem: only some data is entered in structured form; legacy data is typically plain-text citations. Our approach: (1) pre-process the citation text with regex; (2) pass it through a heavily modified version of FreeCite; (3) clean the output again with regex; (4) return the result as a JSON object; (5) pass it through the entity reconciliation process.
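The shape of this pipeline can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Talis implementation: `toy_parse` is a hypothetical stand-in for the modified FreeCite step (it does not use FreeCite or its API and only understands "Author (Year) Title. Publisher" citations), and the regexes, function names and sample citation are made up for the example.

```python
import json
import re

# Hypothetical pattern for "Author (Year) Title. Publisher" citations only.
CITATION_PATTERN = re.compile(
    r"(?P<author>[^()]+?)\s*\((?P<year>\d{4})\)\s*"
    r"(?P<title>[^.]+)\.\s*(?P<publisher>.*?)\.?$"
)


def preprocess(citation: str) -> str:
    """Collapse whitespace and strip a leading list marker such as '[12]' or '3.'."""
    text = re.sub(r"\s+", " ", citation).strip()
    return re.sub(r"^(\[\d+\]|\d+\.)\s*", "", text)


def toy_parse(citation: str) -> dict:
    """Stand-in for the citation parser (an assumption, NOT FreeCite's API)."""
    match = CITATION_PATTERN.match(citation)
    return match.groupdict() if match else {"raw": citation}


def postprocess(record: dict) -> dict:
    """Trim stray separators from each field and drop empty fields."""
    cleaned = {key: value.strip(" ,;:") for key, value in record.items()}
    return {key: value for key, value in cleaned.items() if value}


def citation_to_json(citation: str) -> str:
    """Run pre-process, parse and clean-up, then serialise the record as JSON."""
    return json.dumps(postprocess(toy_parse(preprocess(citation))), sort_keys=True)


if __name__ == "__main__":
    # Made-up citation, for illustration only.
    print(citation_to_json("[3]  Smith, J. (2010)  An Example Book.  Example Press."))
```

Citations the toy parser cannot handle are passed through under a `raw` key, so a later stage can still see the original text.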

Enhancing Data Quality with Entity Reconciliation. Validate the accuracy of the record by matching against high-quality reference data sources: OpenLibrary, OpenKB (serials/journals), CrossRef. Books: match on a precise edition, roll up to the work, map to our canonical resource. Articles: enrich the graph describing the resource using OpenKB, search CrossRef using the enriched description, map to our canonical resource.
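For books, the "edition, then work, then canonical resource" chain can be illustrated as two lookups. This is a minimal sketch that assumes editions are identified by ISBN and that the reference data is already available as in-memory tables; the identifiers, table names and `reconcile_book` function are hypothetical, and a real system would consult the reference sources instead of hard-coded dictionaries.

```python
from typing import Optional

# Hypothetical reference data with made-up identifiers.
EDITION_TO_WORK = {
    "9780000000001": "work-1",
    "9780000000002": "work-1",  # a second edition of the same work
    "9780000000003": "work-2",
}
WORK_TO_CANONICAL = {
    "work-1": "canonical/resource/1",
    "work-2": "canonical/resource/2",
}


def normalise_isbn(raw: str) -> str:
    """Keep only digits and the ISBN-10 check character 'X'."""
    return "".join(ch for ch in raw if ch.isdigit() or ch in "xX").upper()


def reconcile_book(record: dict) -> Optional[str]:
    """Match a precise edition, roll up to its work, map to a canonical resource.

    Returns None when the edition is not found in the reference data.
    """
    work = EDITION_TO_WORK.get(normalise_isbn(record.get("isbn", "")))
    if work is None:
        return None
    return WORK_TO_CANONICAL.get(work)


if __name__ == "__main__":
    print(reconcile_book({"isbn": "978-0-00-000000-1"}))
    print(reconcile_book({"isbn": "978-0-00-000000-2"}))
```

Both sample records carry different edition identifiers but resolve to the same canonical resource, which is the point of rolling editions up to a work.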

A Happy By-Product. [Figure: Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/]

Unifying the Institutional Sub-Graphs. Goal: create a cross-institution (portion of the) education graph, centred around learning resources. Process: harvest the data from each Reading List store; repeat the entity reconciliation process; retain the links mapping canonical resources to institutional data.
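The harvest, reconcile and retain-links steps can be sketched as a single grouping pass. The example below is an assumption-laden illustration: the institution names, record shapes and the `unify` function are hypothetical, and the reconciliation step is passed in as a plain function so any matcher can be plugged in.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional, Tuple

Link = Tuple[str, str]  # (institution, institutional record id)


def unify(
    harvested: Dict[str, List[dict]],
    reconcile: Callable[[dict], Optional[str]],
) -> Tuple[Dict[str, List[Link]], List[Link]]:
    """Group harvested records under canonical resources, keeping back-links.

    Returns the canonical-to-institutional links and the records that could
    not be reconciled.
    """
    links: Dict[str, List[Link]] = defaultdict(list)
    unmatched: List[Link] = []
    for institution, records in harvested.items():
        for record in records:
            canonical = reconcile(record)
            target = unmatched if canonical is None else links[canonical]
            target.append((institution, record["id"]))
    return dict(links), unmatched


if __name__ == "__main__":
    # Made-up harvest from two hypothetical institutions.
    harvested = {
        "uni-a": [{"id": "a1", "key": "x"}, {"id": "a2", "key": "?"}],
        "uni-b": [{"id": "b1", "key": "x"}],
    }
    toy_reconcile = lambda record: {"x": "canonical/1"}.get(record["key"])
    print(unify(harvested, toy_reconcile))
```

Keeping the `(institution, record id)` pairs alongside each canonical resource is what lets the cross-institution graph point back to each institution's own data.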

Warehousing the Education Graph. Goal: a single point of access to the education graph. Applications: analytics and business intelligence (for us and customers); data science experiments. Approach: based on Apache Jena TDB/Fuseki + MongoDB.
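As an illustration of what an analytics query against such a warehouse might look like, the sketch below builds (but does not send) a SPARQL Protocol GET request of the kind a Fuseki server accepts. The endpoint URL, dataset name and the `listedOn` predicate are assumptions made up for this example; the talk does not describe the actual vocabulary or deployment.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint and vocabulary, for illustration only.
FUSEKI_QUERY_URL = "http://localhost:3030/education/query"
LISTED_ON = "http://example.org/edu#listedOn"


def build_usage_query(limit: int = 10) -> str:
    """SPARQL query counting how many reading lists each resource appears on."""
    return (
        "SELECT ?resource (COUNT(?list) AS ?lists)\n"
        f"WHERE {{ ?resource <{LISTED_ON}> ?list }}\n"
        "GROUP BY ?resource\n"
        "ORDER BY DESC(?lists)\n"
        f"LIMIT {int(limit)}"
    )


def build_request(query: str, endpoint: str = FUSEKI_QUERY_URL) -> Request:
    """Build a GET request carrying the query, asking for JSON results."""
    url = f"{endpoint}?{urlencode({'query': query})}"
    return Request(url, headers={"Accept": "application/sparql-results+json"})


if __name__ == "__main__":
    request = build_request(build_usage_query(limit=5))
    # Passing the request to urllib.request.urlopen would execute it
    # against a running server.
    print(request.full_url)
```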

Applications of the Education Graph

Directory of Quality Learning Resources

Recommending Learning Resources

Reflections

Reflections: What we think matters may not be important, e.g. native RDF apps vs. Linked Data views online. Tooling reality check: how do we measure up compared to e.g. NoSQL databases, Hadoop, etc.? Is numerical data a second-class citizen of the graph?

Questions? Web: talisaspire.com; Twitter: @talisaspire; YouTube: youtube.com/user/talisaspire; Facebook: facebook.com/talisaspire