The Semantic Data Web: A Networked Knowledge Ecosystem Sören Auer
Creating Knowledge
Why do we need the Data Web?
Problem: try to search for these things on the current Web:
- Apartments near German-Russian bilingual childcare in Berlin.
- ERP service providers with offices in Vienna and London.
- Researchers working on breast cancer topics in Eastern Europe.
The information is available on the Web, but opaque to current search.
Solution: complement the text on Web pages with structured linked open data, and intelligently combine/integrate such structured information from different sources.
[Figure: two sources, berlin.de ("Has everything about childcare in Berlin") and Immobilienscout.de ("Knows all about real estate offers in Germany"). Each has a DB behind a Web server that serves HTML to the Web and RDF to a search engine.]
Sören Auer, DBpedia and the Emerging Web of Linked Data, http://lod2.eu
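The idea of combining structured data from two independent sources can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, coordinates, names, and the 1 km threshold are all invented, and plain dictionaries stand in for the RDF that the two sites would publish.

```python
import math

# Hypothetical structured records standing in for RDF published by two sites.
# None of these values come from berlin.de or Immobilienscout.de.
childcare = [
    {"name": "Kita Raduga", "languages": {"de", "ru"}, "lat": 52.5200, "lon": 13.4050},
    {"name": "Kita Sonnenschein", "languages": {"de"}, "lat": 52.4900, "lon": 13.3500},
]
apartments = [
    {"id": "apt-1", "lat": 52.5215, "lon": 13.4100},
    {"id": "apt-2", "lat": 52.4500, "lon": 13.3000},
]


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres (haversine formula)."""
    radius = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))


def apartments_near_childcare(apartments, childcare, languages, max_km):
    """Join both sources: apartments within max_km of a facility offering all languages."""
    matching = [c for c in childcare if languages <= c["languages"]]
    hits = []
    for apt in apartments:
        for c in matching:
            if distance_km(apt["lat"], apt["lon"], c["lat"], c["lon"]) <= max_km:
                hits.append((apt["id"], c["name"]))
    return hits


hits = apartments_near_childcare(apartments, childcare, {"de", "ru"}, max_km=1.0)
print(hits)
```

The join only works because both sources expose machine-readable fields (languages, coordinates); with HTML text alone, a search engine has nothing to join on.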
Web 1.0, Web 2.0, Web 3.0
- Web 1.0: many Web sites containing unstructured, textual content
- Web 2.0: few large Web sites specialized on specific content types (pictures + video + encyclopedic articles)
- Web 3.0: many Web sites containing & semantically syndicating arbitrarily structured content
The Long Tail of Information Domains
[Figure: "The Long Tail" by Chris Anderson (Wired, Oct. 04), adapted to information domains. Vertical axis: popularity.]
- Currently supported structured content types (head): pictures, recipes, news, video, calendar
- Not or insufficiently supported content types (tail): requirements engineering, talent management, special interest communities, itinerary of King George, gene sequences
- SemWeb supported structured content covers the tail
Linked Data in a Nutshell
1. Uses the RDF data model
   [Graph: Klopotek --organizes--> PubFor2012; PubFor2012 --starts--> 23.4.2012; PubFor2012 --takesplacein--> Berlin]
2. Is serialised in triples:
   Klopotek    organizes     PubFor2012
   PubFor2012  starts        20120423^^xsd:date
   PubFor2012  takesplaceat  Berlin
3. Uses content negotiation
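The three points can be sketched in plain Python. This is a minimal illustration, not a full RDF toolkit: the `http://example.org/` namespace is a hypothetical placeholder, the helper names are invented here, and the `negotiate` function is a simplified reading of HTTP content negotiation that ignores wildcards such as `*/*`.

```python
from datetime import date

EX = "http://example.org/"  # hypothetical namespace, not taken from the slides
XSD_DATE = "http://www.w3.org/2001/XMLSchema#date"


def iri(local_name):
    """Wrap a local name as an N-Triples IRI in the example namespace."""
    return f"<{EX}{local_name}>"


def date_literal(d):
    """Typed literal for a date, using the xsd:date lexical form YYYY-MM-DD."""
    return f'"{d.isoformat()}"^^<{XSD_DATE}>'


# 1. RDF data model: a graph is a set of (subject, predicate, object) statements.
triples = [
    (iri("Klopotek"), iri("organizes"), iri("PubFor2012")),
    (iri("PubFor2012"), iri("starts"), date_literal(date(2012, 4, 23))),
    (iri("PubFor2012"), iri("takesplacein"), iri("Berlin")),
]


def to_ntriples(triples):
    """2. Serialise the graph, one 'subject predicate object .' line per triple."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


def negotiate(accept_header, available=("text/html", "text/turtle", "application/rdf+xml")):
    """3. Pick the available media type with the highest q-value (no wildcard support)."""
    best, best_q = None, 0.0
    for part in accept_header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type, q = fields[0], 1.0
        for param in fields[1:]:
            if param.startswith("q="):
                q = float(param[2:])
        if media_type in available and q > best_q:
            best, best_q = media_type, q
    return best


print(to_ntriples(triples))
print(negotiate("text/turtle;q=0.9, text/html;q=0.5"))
```

With content negotiation, the same resource URL can return HTML to a browser and an RDF serialisation to a data client, depending on the `Accept` header sent.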
The emerging Web of Data
[Figure, labels only: year labels 2007, 2008, 2009; lifecycle activities interlink, create, fuse, repair, enrich, classify; tools SILK, DXX Engine, poolparty, SemMF, OntoWiki, Sigma, WiQA, ORE, Virtuoso, DL-Learner, MonetDB, Sindice]
[Figure: Linked Open Data cloud, http://lod-cloud.net/. Dataset domains: media, geographic, publications, user-generated content, government, cross-domain, life sciences]
Linked Data Lifecycle
- Extraction
- Storage/Querying
- Manual revision/authoring
- Interlinking/Fusing
- Classification/Enrichment
- Quality Analysis
- Evolution/Repair
- Search/Browsing/Exploration
Extraction
Extraction
- From unstructured sources: NLP, text mining, annotation
- From semi-structured sources: DBpedia, LinkedGeoData, SCOVO/DataCube
- From structured sources: RDB2RDF
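For the structured case, the basic RDB2RDF idea (one subject per row, one predicate per column) can be sketched as follows. This is a simplified illustration only: the base IRI, the `city` table, its columns, and the function name are hypothetical, and the IRI scheme is merely inspired by direct-mapping approaches rather than an implementation of any specific tool or standard.

```python
import sqlite3

BASE = "http://example.org/db/"  # hypothetical base IRI


def rows_to_triples(conn, table, key):
    """Map every row of `table` to (subject, predicate, object) tuples.

    The table and key names are interpolated into SQL, so they must come
    from trusted code, never from user input.
    """
    cur = conn.execute(f'SELECT * FROM "{table}" ORDER BY "{key}"')
    columns = [d[0] for d in cur.description]
    triples = []
    for row in cur:
        record = dict(zip(columns, row))
        subject = f"{BASE}{table}/{record[key]}"  # row -> subject IRI
        for column, value in record.items():
            if column == key or value is None:
                continue  # the key is already in the subject; NULL yields no triple
            triples.append((subject, f"{BASE}{table}#{column}", value))
    return triples


# Hypothetical sample database, created in memory for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE city (id INTEGER PRIMARY KEY, name TEXT, country TEXT)")
conn.execute("INSERT INTO city VALUES (1, 'Berlin', 'Germany')")
conn.execute("INSERT INTO city VALUES (2, 'Vienna', NULL)")

triples = rows_to_triples(conn, "city", "id")
for t in triples:
    print(t)
```

Real mappings usually go further, for example turning foreign keys into links between resources and mapping columns to existing vocabularies.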
Transforming Wikipedia into a Knowledge Base
Extract structured information from Wikipedia & make this information available on the Web as LOD:
- ask sophisticated queries against Wikipedia (e.g. universities in Brandenburg, mayors of elevated towns, soccer players)
- link other data sets on the Web to Wikipedia data
- represents a community consensus
The recently launched DBpedia Live transforms Wikipedia into a structured knowledge base.
Structure in Wikipedia
- Title
- Abstract
- Infoboxes
- Geo-coordinates
- Categories
- Images
- Links: to other language versions, to other Wikipedia pages, to the Web
- Redirects
- Disambiguations
Infobox templates
Wikitext syntax:
    {{Infobox Korean settlement
    | title = Busan Metropolitan City
    | img = Busan.jpg
    | imgcaption = A view of the [[Geumjeong]] district in Busan
    | hangul = 부산광역시...
    | area_km2 = 763.46
    | pop = 3635389
    | popyear = 2006
    | mayor = Hur Nam-sik
    | divs = 15 wards (Gu), 1 county (Gun)
    | region = [[Yeongnam]]
    | dialect = [[Gyeongsang]]
    }}
RDF representation (http://dbpedia.org/resource/busan):
    dbp:busan  dbpp:title     Busan Metropolitan City
    dbp:busan  dbpp:hangul    부산광역시 @Hang
    dbp:busan  dbpp:area_km2  763.46 ^^xsd:float
    dbp:busan  dbpp:pop       3635389 ^^xsd:int
    dbp:busan  dbpp:region    dbp:yeongnam
    dbp:busan  dbpp:dialect   dbp:gyeongsang
    ...
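The step from infobox wikitext to triples can be illustrated with a deliberately small parser. This is a sketch under assumptions, not the DBpedia extraction framework: it expects one `| key = value` pair per line, handles only plain `[[Link]]` values and simple numbers, and the function name and the tagged-tuple value format are invented for the example.

```python
import re

DBP = "http://dbpedia.org/resource/"   # resource namespace (dbp: on the slide)
DBPP = "http://dbpedia.org/property/"  # property namespace (dbpp: on the slide)

# Shortened infobox; assumes one "| key = value" pair per line.
WIKITEXT = """{{Infobox Korean settlement
| title = Busan Metropolitan City
| area_km2 = 763.46
| pop = 3635389
| region = [[Yeongnam]]
}}"""


def extract_infobox(page, wikitext):
    """Turn infobox attributes into (subject, predicate, typed value) triples."""
    subject = DBP + page
    triples = []
    for line in wikitext.splitlines():
        m = re.match(r"\|\s*(\w+)\s*=\s*(.+?)\s*$", line)
        if not m:
            continue  # template name, closing braces, or an empty attribute
        key, raw = m.groups()
        link = re.fullmatch(r"\[\[([^\]|]+)\]\]", raw)
        if link:
            # A wiki link becomes a link to another resource.
            value = ("iri", DBP + link.group(1).replace(" ", "_"))
        elif re.fullmatch(r"-?\d+", raw):
            value = ("int", int(raw))
        elif re.fullmatch(r"-?\d+\.\d+", raw):
            value = ("float", float(raw))
        else:
            value = ("string", raw)
        triples.append((subject, DBPP + key, value))
    return triples


triples = extract_infobox("Busan", WIKITEXT)
for t in triples:
    print(t)
```

The type guessing mirrors what the slide shows (an `xsd:float` area, an `xsd:int` population, a resource link for the region), but real infoboxes need far more robust handling of units, nested templates, and mixed text.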
A vast multi-lingual, multi-domain knowledge base
DBpedia extraction results in:
- descriptions of ca. 3.4 million things (1.5 million classified in a consistent ontology, including 312,000 persons, 413,000 places, 94,000 music albums, 49,000 films, 15,000 video games, 140,000 organizations, 146,000 species, 4,600 diseases)
- labels and abstracts for these 3.2 million things in up to 92 different languages
- 1,460,000 links to images and 5,543,000 links to external web pages
- 4,887,000 external links into other RDF datasets, 565,000 Wikipedia categories, and 75,000 YAGO categories
- altogether over 1 billion pieces of information (i.e. RDF triples): 257M from the English edition, 766M from other language editions
- DBpedia Live (http://live.dbpedia.org/sparql/) & Mappings Wiki (http://mappings.dbpedia.org) integrate the community into a refinement cycle
DBpedia SPARQL Endpoint
    SELECT ?name ?birth ?description ?person WHERE {
      ?person dbp:birthplace dbp:berlin .
      ?person skos:subject dbp:cat:german_musicians .
      ?person dbp:birth ?birth .
      ?person foaf:name ?name .
      ?person rdfs:comment ?description .
      FILTER (LANG(?description) = 'en') .
    }
    ORDER BY ?name
2011/05/12 CONSEGI - Sören Auer: DBpedia
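A client typically sends such a query over HTTP using the SPARQL protocol (a `query` parameter on a GET request) and reads the result as SPARQL JSON. The sketch below only builds the request and parses a hand-written sample response, so it runs without network access. The endpoint URL, the shortened query, the helper names, and the sample data are assumptions made for illustration; the sample is not real endpoint output.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "http://dbpedia.org/sparql"  # assumed endpoint URL

# Shortened illustrative query; any SPARQL SELECT string can be passed instead.
QUERY = """
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?person ?name WHERE {
  ?person foaf:name ?name .
}
LIMIT 5
"""


def build_request(endpoint, query):
    """Build a SPARQL protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def bindings_to_rows(results_json):
    """Flatten a SPARQL JSON result document into a list of plain dicts."""
    doc = json.loads(results_json)
    variables = doc["head"]["vars"]
    return [
        {v: binding[v]["value"] for v in variables if v in binding}
        for binding in doc["results"]["bindings"]
    ]


# Hand-written sample in the SPARQL JSON results format (hypothetical values).
SAMPLE = json.dumps({
    "head": {"vars": ["person", "name"]},
    "results": {"bindings": [
        {
            "person": {"type": "uri", "value": "http://example.org/person/1"},
            "name": {"type": "literal", "value": "Example Person"},
        }
    ]},
})

request = build_request(ENDPOINT, QUERY)
rows = bindings_to_rows(SAMPLE)
print(request.full_url)
print(rows)
```

To run the query for real, the request could be passed to `urllib.request.urlopen` and the response body handed to `bindings_to_rows`; whether the prefixes used on the slide are predefined depends on the endpoint's configuration.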
DBpedia Applications: Spotlight
Tapping the intelligence of the crowd for text annotation
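What "text annotation" against DBpedia means can be shown with a deliberately naive sketch: look up known surface forms in a text and attach resource URIs to them. This is not how DBpedia Spotlight works internally (the slide does not describe its pipeline, and a real annotator must also disambiguate between candidate resources); the lexicon, the sample sentence, and the function name are assumptions for illustration.

```python
import re

# Tiny hand-made lexicon mapping surface forms to DBpedia resource URIs.
LEXICON = {
    "Berlin": "http://dbpedia.org/resource/Berlin",
    "Leipzig": "http://dbpedia.org/resource/Leipzig",
}

TEXT = "The conference takes place in Berlin, not in Leipzig."


def annotate(text, lexicon):
    """Return (start, end, surface_form, uri) for every whole-word lexicon match."""
    # Longer surface forms first, so they win over shorter overlapping ones.
    forms = sorted(lexicon, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(f) for f in forms) + r")\b")
    return [
        (m.start(), m.end(), m.group(1), lexicon[m.group(1)])
        for m in pattern.finditer(text)
    ]


annotations = annotate(TEXT, LEXICON)
for start, end, surface, uri in annotations:
    print(f"{surface} [{start}:{end}] -> {uri}")
```

The character offsets make it possible to write the links back into the text, for example as RDFa or hyperlinks.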
DBpedia Applications: RelFinder
http://www.visualdataweb.org/relfinder.php
DBpedia Applications: Zemanta
DBpedia Applications: Faceted-Browser
Authoring
Two Kinds of Semantic Wikis
1. Semantic (Text) Wikis: authoring of semantically annotated texts
2. Semantic Data Wikis: direct authoring of structured information (i.e. RDF, RDF-Schema, OWL)
OntoWiki: Dynamic views on knowledge bases
OntoWiki: RDF triples on resource details page
OntoWiki: Dynamic suggestions from the Data Web
Catalogus Professorum Lipsiensis
RDFaCE: RDFa Content Editor for rNews (IPTC)
Integrating various NLP APIs
LOD Lifecycle supported by Debian based LOD2 Stack (http://stack.lod2.eu)
- Extraction
- Storage/Querying
- Manual revision/authoring
- Interlinking/Fusing
- Classification/Enrichment
- Quality Analysis
- Evolution/Repair
- Search/Browsing/Exploration
Sören Auer, The Emerging Web of Linked Data, 23.4.2012, http://lod2.eu
Publisher Opportunities
- Publishers have vast repositories of textual (and semi-structured) content
- Publishing metadata will raise awareness about publishing products
- Semantic annotation of content will enable the repurposing, repackaging, and tailoring of content
- Facilitate better content search (e.g. faceted browsing)
- Increase interoperability
- Optimize publishing workflows
Publishers in the Linked Data Web
- Extraction: extract and publish structured (meta-)data for publishing content
- Storage/Querying: provide storage facilities for Linked Data
- Manual revision/authoring: support facilities for knowledge-based authoring & collaboration
- Interlinking/Fusing: becoming linking hubs for the Data Web
- Classification/Enrichment and Evolution/Repair: publisher data is valuable background knowledge for KB enrichment & repair
- Quality Analysis: authoritative Linked Data for quality assessment
- Search/Browsing/Exploration: hosting & maintenance of exploration tools
Be the lighthouse for the LOD ocean.
Thanks for your attention!
Sören Auer
http://www.uni-leipzig.de/~auer/
http://aksw.org
http://lod2.org
auer@uni-leipzig.de