Linguistic and Legal Ontologies




Linguistic and Legal Ontologies Tommaso Agnoloni ITTIG-CNR Institute of Legal Information Theory and Techniques (CNR, Italy) LEX Summer School 2012 Sept. 13, 2012, Ravenna

Semantic resources
- Thesauri: concepts and their lexicalizations (keywords, topics)
- Linguistic knowledge (lexicons, semantic networks, lightweight ontologies)
- Domain knowledge (ontologies, conceptual models)

Lexical resources
- Knowledge representation about words that refer to objects
- Semantic networks (lexical or lightweight ontologies): nets of concepts structured according to lexical, taxonomic and conceptual relations
- Focus is on terminology
- Concept density
- Language dependent
- Constraints over relations are based on POS (Part Of Speech) categories

Conceptual resources
- Knowledge representation about objects of the world
- Formal ontologies: classes of entities described by formal (meta)properties and attributes
- Add axioms and constraints to clarify the intended meaning of the terms gathered in the ontology
- Agreed among the members of a community of interest
- Axiomatic characterisation (allows reasoning)
- Formal constraints density
- Language independent

Legal domain RESOURCES TASKS Legal sources Access / publication / transparency Comparison / Harmonization Drafting Interoperability / exchange Compliance / procedures Services Multilingualism Multi Legal systems Legal Sources integration Legal language Legal knowledge

1 Thesauri

Traditional lexical resources
- Structured vocabularies (e.g. taxonomies, directories, thesauri, classification schemas) are lists of terms organized in hierarchies (BT/NT, broader/narrower term) and linked by generic RT (related term) relations
- No semantic constraints
- Information Retrieval oriented
- Focus is on document tagging and classification
- Do not address linguistic aspects of terminology

KOS and SKOS
- Knowledge Organization Systems (KOS), including controlled vocabularies, taxonomies and thesauri, have been in use for a long time, particularly in libraries and later in digital libraries.
- More recently, standards for their representation and exchange on the web have been introduced.
- SKOS (Simple Knowledge Organization System) is a W3C recommendation providing a format for the standard representation of KOS in an interoperable, distributed and linkable way.
- The fundamental difference between SKOS and other representation formats is that it is based on the principles of the Semantic Web.
- Unlike other existing standards, SKOS has been designed from the very beginning to allow the creation of modular KOSs that can be reused and referenced over the web (interconnected controlled vocabularies).

SKOS
- The Semantic Web standards RDF and RDFS provide the infrastructure for the creation of a distributed network of data
- Being based on RDF, SKOS inherits its flexibility and distributed nature
- Since SKOS is represented in RDF, each concept has its own unique identifier (its URI), which identifies it as a resource that can be unambiguously referenced on the web and about which assertions can be stated
http://www.w3.org/tr/skos-reference/

SKOS Data Model (figure; source: Sean Bechhofer, SKOS Past, present, future)

SKOS Concept in RDF
<skos:Concept rdf:about="http://www.ittig.cnr.it/dogi/descriptor#s1018">
  <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  <skos:prefLabel xml:lang="it">responsabilità penale</skos:prefLabel>
  <skos:altLabel xml:lang="it">...</skos:altLabel>
  <skos:prefLabel xml:lang="en">criminal liability</skos:prefLabel>
  <skos:broader rdf:resource="http://www.ittig.cnr.it/dogi#c0014"/>
  <skos:related rdf:resource="http://www.ittig.cnr.it/dogi/term#d4977"/>
  <!-- e.g. mapping to BNCF Nuovo Soggettario -->
  <skos:exactMatch rdf:resource="http://purl.org/bncf/tid/12679"/>
</skos:Concept>
See: Monolingual and multilingual variants
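A minimal sketch of how such a concept could be read programmatically, using only the Python standard library. The sample document is a shortened copy of the concept above wrapped in an rdf:RDF element; the function name read_concepts and the returned dictionary layout are assumptions made for illustration, not part of SKOS or of any tool mentioned in these slides.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Shortened copy of the DoGi concept shown on the slide.
SAMPLE = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:skos="{SKOS}">
  <skos:Concept rdf:about="http://www.ittig.cnr.it/dogi/descriptor#s1018">
    <skos:prefLabel xml:lang="it">responsabilità penale</skos:prefLabel>
    <skos:prefLabel xml:lang="en">criminal liability</skos:prefLabel>
    <skos:broader rdf:resource="http://www.ittig.cnr.it/dogi#c0014"/>
    <skos:exactMatch rdf:resource="http://purl.org/bncf/tid/12679"/>
  </skos:Concept>
</rdf:RDF>"""


def read_concepts(rdf_xml):
    """Return {concept URI: {"prefLabel": {lang: label}, "broader": [...], "exactMatch": [...]}}."""
    root = ET.fromstring(rdf_xml)
    concepts = {}
    for node in root.iter(f"{{{SKOS}}}Concept"):
        uri = node.get(f"{{{RDF}}}about")
        labels = {
            el.get(XML_LANG): el.text
            for el in node.findall(f"{{{SKOS}}}prefLabel")
        }

        def links(prop):
            # Collect the target URIs of one linking property.
            return [el.get(f"{{{RDF}}}resource") for el in node.findall(f"{{{SKOS}}}{prop}")]

        concepts[uri] = {
            "prefLabel": labels,
            "broader": links("broader"),
            "exactMatch": links("exactMatch"),
        }
    return concepts


concept = read_concepts(SAMPLE)["http://www.ittig.cnr.it/dogi/descriptor#s1018"]
print(concept["prefLabel"]["en"])
```

Because every concept and every link target is a URI, the same lookup works unchanged when the broader or exactMatch targets live in a different vocabulary.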

Multilingual Thesaurus of the European Union http://eurovoc.europa.eu/

Thesaurus based cross language retrieval

Low level interoperability: ECRIS http://ec.europa.eu/justice/criminal/european-ejustice/ecris/index_en.htm The computerised system ECRIS was established to achieve an efficient exchange of information on criminal convictions between EU countries.

IPSV Integrated Public Sector Vocabulary http://doc.esd.org.uk/ipsv

http://www.esd.org.uk/esdtoolkit/tree/root.aspx enables all local authorities to record their public facing services against a comprehensive list of services, processes and interactions

Economic thesaurus http://zbw.eu/stw

Crowd sourced resources
- Manual construction: huge effort, costly
- Concepts, definitions, categories, multilinguality: WIKIPEDIA, WIKTIONARY (crowdsourcing)
- Wide coverage
- Low formality
- DBpedia: mapping/linking, reuse

LAW on WIKIPEDIA http://en.wikipedia.org/wiki/portal:law

Formalization: DBPedia resource (RDF)

EDITING/BROWSING: VocBench http://godel.ittig.cnr.it:8081/vocbench VocBench is an open-source, web-based, multilingual, collaborative vocabulary editing and workflow tool originally developed by FAO for editing and maintaining the AGROVOC agricultural thesaurus (20 languages, 40000 terms)

VocBench Data model / SKOS-xl SKOS-XL defines an extension for the Simple Knowledge Organization System, providing additional support for describing and linking lexical entities http://www.w3.org/2005/incubator/lld/wiki/use_case_agrovoc_thesaurus

References /1
- SKOS Reference Documentation, http://www.w3.org/tr/skos-reference/
- Sean Bechhofer, SKOS past present and future, http://www.slideshare.net/seanb/skos-past-present-and-future
- Heather Hedden, The Accidental Taxonomist, Medford, N.J.: Information Today, 2010
- Eurovoc Conference: Mind The Lexical Gap (2010), http://eurovoc.europa.eu/drupal/?q=node/936
- Agrovoc VocBench: http://aims.fao.org/tools/vocbench
- Greenberg, J., Losee, R., Pérez Agüera, J.R., Scherle, R., White, H., and Willis, C. (2011). HIVE: Helping Interdisciplinary Vocabulary Engineering. Bulletin of the American Society for Information Science and Technology, 37 (4): 23-26.

2 Linguistic ontologies

Lexical Resources
- CHALLENGE: natural languages are strongly characterized by AMBIGUITY and LEXICAL VARIABILITY
- AIM: find an adequate model taking such phenomena into account
- TASKS: acquisition of lexical meaning from texts, lexical similarity, text categorization, automatic disambiguation (WSD, Word Sense Disambiguation), multilinguality
- NLP algorithms need data to operate:
  - big textual collections (corpora), multilingual parallel corpora
  - proper linguistic resources like lexicons and grammars

Lexical Ontologies
- A lexical knowledge base is a database representing lexical meaning, to be accessed by systems for text analysis
- It provides (explicit) background knowledge for machines to deal with natural language
- System of symbols representing concepts encoded by natural language expressions (lexical units, terms, etc.)
- Further: specify semantic classes grouping terms at the semantic level
Example (figure): terms and the semantic classes that group them
- car, van, truck: VEHICLE, ARTIFACT
- dog, cat, horse: MAMMAL, ANIMAL
- beach, spiaggia: BEACH, LOCATION
- piano concert, rock concert: CONCERT, EVENT
- upper-level classes: OBJECT, ENTITY

Typologies of lexical ontologies Monolingual vs multilingual General purpose vs domain specific Type of content (Morpho)syntactic Semantic Terminological Mixed

Semantic computational lexicons Represent the meaning of a word Distinguish different senses of a word Represent similarity, relatedness etc. (e.g. bank, check, money are concepts related to finance)

WordNet Inspired by psycholinguistic theories of human lexical memory Born and developed for English at the cognitive science laboratory of Princeton University Lexical Semantic Resource Semantic Lexicon Maps words to meanings (senses) WordNet is organized around word meaning (not word forms as with traditional lexicons) Freely available http://wordnet.princeton.edu/ As a DB, in formal structure, as API

WordNet Synsets
- WordNet partitions the lexicon into nouns, verbs, adjectives and adverbs, each organized in sets of cognitively equivalent synonyms, or synsets:
  - noun synset {vacation, holiday}
  - verb synset {close, shut}
  - adjective synset {soiled, dirty}
- each set of synonyms represents a lexical concept and thus a sense
- each member of a synset encodes the same concept
- every concept is associated with a particular PART OF SPEECH (noun, verb, adjective, adverb)

Synsets and Senses
- Synsets represent word meaning
- Words that occur in several synsets have a corresponding number of meanings (senses)
- Synsets do not explain a concept but merely exemplify that a concept exists
- Concept meaning is explained by a gloss
- A specific concept can be referred to using one or more lexical forms
- Lexical form: the way a single word is represented as a sequence of characters
- A term can represent several concepts (word senses): e.g. diritto (right) and diritto (law)

The WordNet knowledge organization system
- Every WordSense is associated with exactly one synset
- Every WordSense refers to a single lexical form
- Every lexical form can belong to one or more WordSenses and thus can be associated with one or more synsets (polysemy)
Cardinalities (figure): Synset 1:N WordSense, WordSense N:1 LexicalForm
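The cardinalities above can be sketched as a small in-memory structure in which a word sense is simply a (lexical form, synset) pair. This is only an illustration of the data model, not the WordNet database format or API; the class name Lexicon, the synset identifiers and the glosses are hypothetical.

```python
class Lexicon:
    """Toy model: a WordSense is one (lexical form, synset) pair."""

    def __init__(self):
        self.glosses = {}       # synset id -> gloss
        self.word_senses = []   # list of (lexical_form, synset_id)

    def add_synset(self, synset_id, gloss, forms):
        # One synset groups N word senses, one per member form.
        self.glosses[synset_id] = gloss
        for form in forms:
            self.word_senses.append((form, synset_id))

    def synsets_of(self, form):
        # One lexical form may take part in several word senses.
        return [sid for f, sid in self.word_senses if f == form]

    def members(self, synset_id):
        return [f for f, sid in self.word_senses if sid == synset_id]

    def is_polysemous(self, form):
        return len(self.synsets_of(form)) > 1


LEXICON = Lexicon()
# Synset ids and glosses below are invented for the example.
LEXICON.add_synset("en-vacation", "leisure time away from work", ["vacation", "holiday"])
LEXICON.add_synset("it-right", "diritto in the sense of 'right'", ["diritto"])
LEXICON.add_synset("it-law", "diritto in the sense of 'law'", ["diritto"])

print(LEXICON.synsets_of("diritto"))      # two senses: polysemy
print(LEXICON.is_polysemous("holiday"))
```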

Synset Hierarchy Synsets are organized in hierarchies generalization (hypernymy) specialization (hyponymy) Hierarchy Example
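As a rough illustration of hypernymy and hyponymy, the sketch below stores one hypernym per synset and walks up the hierarchy. The toy hierarchy (dog, cat, mammal, animal, entity) and the function names are assumptions made for this example; real wordnets allow multiple hypernyms per synset.

```python
# Hypothetical toy hierarchy: child synset -> its hypernym (generalization).
HYPERNYM_OF = {
    "dog": "mammal",
    "cat": "mammal",
    "mammal": "animal",
    "animal": "entity",
}


def hypernym_chain(synset, hypernym_of):
    """Walk from a synset up to the root, returning each generalization in order."""
    chain = []
    seen = {synset}
    while synset in hypernym_of:
        synset = hypernym_of[synset]
        if synset in seen:
            raise ValueError("cycle in hypernym hierarchy")
        seen.add(synset)
        chain.append(synset)
    return chain


def hyponyms(synset, hypernym_of):
    """Direct specializations of a synset."""
    return sorted(child for child, parent in hypernym_of.items() if parent == synset)


print(hypernym_chain("dog", HYPERNYM_OF))
print(hyponyms("mammal", HYPERNYM_OF))
```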

WordNet Relations
- Synsets are bound by intra-lingual semantic relations (hyponymy, hypernymy, meronymy, holonymy), constituting a rich semantic network,
- and by inter-lingual equivalence relations, set by an Inter-Lingual Index (ILI)

Multilingual access: EuroWordNet
- EWN developed a methodology to connect the semantic networks created for the different European languages;
- ILI (Inter-Lingual Index): each synset in a monolingual WN has at least one equivalence relation with an ILI record;
- concepts (synsets of a monolingual WN) connected to the same ILI record are considered equivalent concepts.
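The ILI mechanism can be sketched as a lookup table: synsets from different monolingual wordnets count as equivalent when they point to the same ILI record. The record identifiers (ili:drive, ili:ride) and the word-to-record assignments below are assumptions for illustration, not actual EuroWordNet data.

```python
from collections import defaultdict

# (language, synset, ILI record) equivalence links; all values are illustrative.
EQUIVALENCES = [
    ("en", "drive", "ili:drive"),
    ("it", "guidare", "ili:drive"),
    ("es", "conducir", "ili:drive"),
    ("nl", "rijden", "ili:drive"),
    ("en", "ride", "ili:ride"),
    ("it", "cavalcare", "ili:ride"),
]


def build_index(equivalences):
    """Group (language, synset) pairs by the ILI record they are linked to."""
    index = defaultdict(set)
    for lang, synset, ili in equivalences:
        index[ili].add((lang, synset))
    return index


def equivalents(lang, synset, target_lang, equivalences):
    """Synsets of target_lang that share at least one ILI record with the given synset."""
    found = set()
    for members in build_index(equivalences).values():
        if (lang, synset) in members:
            found |= {s for l, s in members if l == target_lang}
    return sorted(found)


print(equivalents("it", "guidare", "es", EQUIVALENCES))
```

No pairwise mapping between languages is stored: adding a new wordnet only requires linking its synsets to ILI records.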

The Inter-Lingual Index

The EuroWordNet Architecture (figure): the English, Spanish, Dutch and Italian WordNets are linked through the Inter-Lingual Index (example ILI record: drive), which in turn is connected to a Top ontology (e.g. 2nd order entity, dynamic, location) and a Domain ontology (e.g. traffic: air traffic, road traffic). Example words in the figure: ride, move, drive; conducir, mover, cabalgar; berijden, rijden, betragen; guidare, cavalcare, muoversi.

OntoWordNet: KOS translation in a standard OWL format Formal specification of WordNet through extension and axiomatization of its conceptual relations http://www.w3.org/tr/wordnet-rdf/ - Mark van Assem, Aldo Gangemi, Guus Schreiber

Linguistic resources in the legal domain
LEGAL LANGUAGE:
- TERMS have precise meaning (e.g. legal definition)
- Still, terms can be polysemous and should therefore be assigned to more than one concept:
  - prescrizione (decorrenza dei termini, "lapse of time limits"; obbligo, "obligation")
  - IT: Diritto and ES: Derecho correspond to both EN: Right and EN: Law
- LEGAL CONCEPT (language independent, legal system independent?)
- Its LEXICAL REPRESENTATION within a linguistic system
- Its LEXICAL REPRESENTATION in different linguistic systems

JurWordNet - a Semantic Network for the Legal Domain
From a list of domain specific words:
<activity, civil action, proceeding, legal proceeding, judicial proceeding, criminal procedure, administrative procedure, civil law suit>
To a structured vocabulary:
{Activity}
{proceeding}
{judicial proceeding, legal proceeding}
{criminal procedure} {civil law suit, civil action} {administrative procedure}
Synset: a set of words that can be interchanged in a given context

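JurWordNet (figure): senses of the Italian term canone
- IW, sense no. 1: catalogo dei santi canonizzati (catalogue of canonized saints)
- IW, sense no. 2: composizione musicale a più voci (musical composition for several voices)
- JW, sense no. 3: norma, legge (norm, law); related: Ordinamento canonico (canon law)
- JW, sense no. 4: Prestazione in denaro (payment in money); e.g. canone annuo (annual fee), canone di abbonamento (subscription fee)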

http://godel.ittig.cnr.it/jwn/editor

Word sense disambiguation: semantic polysemy
The 4 senses of the Italian term ordine:
1. a command given either in speech or writing by a person or body having the authority to do so: {ordine} IT, {Befehl_2} DE, {bevel, opdracth-2} NL, {ordem_5} PT, {prikaz} CZ
2. arrangement of separate elements according to specific criteria: {ordine-2} IT, {order_6} EN, {Ordnung} DE, {ordening} NL, {ordem_6} PT, {usporadami} CZ
3. a group of persons or things which form a separate/independent category, because they share a condition or some particular characteristics: {ordine_3} IT, {} EN, {klasse} DE, {klasse, soort} NL, {ordem_3} PT, {} CZ
4. the political and social structure of a state: {ordine_4} IT, {system} EN, {gesellschaftsordnung} DE, {maatshapellicjik systeem} NL, {ordem_4} PT

Word sense disambiguation: context-based polysemy
Lexical definition:
- worker_1: a person who works at a specific occupation.
EU Directives definitions (connected by has_hyper relations in the figure):
- 8.2005-02-02, worker_2: any person who, in the Member State concerned, is protected as an employee under national employment law and in accordance with national practice;
- 23.2005-02-02, worker_3: any person carrying out an occupation on board a vessel, including trainees and apprentices, but excluding port pilots and shore personnel carrying out work on board a vessel at the quayside;
- 22.2005-02-02, worker_4: any person employed by an employer, including trainees and apprentices but excluding domestic servants;
- 21.2005-02-02, worker_5: any worker as defined in Article 3 (a) of Directive 89/391/EEC who habitually uses display screen equipment as a significant part of his normal work.

Ontology-based disambiguation: systematic polysemy
- between an institution, a function, and a physical object: the entry President of the Republic can indicate a physical person, the constitutional body, or the holder of the state function.
- between a normative content and a physical entity: the entry contract is a legal transaction (e.g. the content of a contract), which is expressed by an information object (e.g. the linguistic encoding of the content of a contract), which is realized by a legal document (e.g. the physical object realizing the encoding of the contract)

Linking linguistic to formal ontologies
Linguistic ontologies support semantic annotation and improve conceptual and cross-lingual retrieval, even if:
- Multilingual mapping is language dependent
- Semantic characterization is shallow (no distinction between domain and linguistic relations)
- Similarity settings and sense distinctions are made explicit but not justified
In more complex applications, such as:
- terminological consistency checking (e.g. in legislative drafting)
- concept comparison
- knowledge sharing
the lexical layer must be anchored to a deeper conceptualization

Disambiguation by external link to Core and Foundational Ontology (figure)
- DOLCE Foundational (Upper) Ontology: Entity, Endurant, physical object, non-physical-object, non-agentive-physical-object, non-ag-social-object, agentive-social-object
- Legal Core Ontology: legal document, legal description, legally-constructed-institution
- Legal WordNet: {legal-evidence, legal proof}, municipality

The Lois multilingual database
The LOIS (e-content 2005-2006) database (33,000 concepts of the private law domain in Italian, English, German, Czech, Portuguese and Dutch) was built combining manual and semi-automatic methods and bottom-up and top-down concept extraction:
- manual translation of the Italian set of lexical concepts (JurWordNet)
- manual creation of new synsets by legal experts
- automatic extraction of explicitly defined concepts from legislative text by pattern extraction
- automatic extraction of lexical elements from text
Disambiguation is made explicit by links to Core and Foundational Ontologies (Dolce+CLO)

Translating legal concepts
- 'Translating' legal concepts implies interpretation, but: interpretation acts within a single legal system, whereas translation acts from a source linguistic and legal system to a target linguistic and legal system
- 'Translating' legal concepts implies comparison among national juridical notions and institutes, but: in the European supra-national context, the legislative process has a creative, innovative role: Member States agree upon meaning assumptions which are not necessarily consistent with the national conceptualizations

Translating legal concepts
An example, in Directive 99/44/EC:
- English word: reasonably
- Italian (translation of the Directive): ragionevolmente (reasonably)
- Italian (transposition law): con ordinaria diligenza (with ordinary diligence)
The latter is highly specialised, common in the terminology of the Italian legal literature, and specifically denotes the correct fulfilment of the duties of the contractual parties (Art. 1176 civil code).
This is evidence of extra-EU polysemy

Conceptual misalignment
Example: German "Klar und verständlich" covers three meanings:
1) the print or the writing of the information must be clear and legible
2) the information must be intelligible to the consumer
3) the language of the information must be the national language of the consumer
Figure: the terms Klar und verständlich, Chiaro e Comprensibile and Clear and Understandable set against the three meanings (1. print or writing, 2. intelligibility, 3. national language)
Need to distinguish between terms and meanings (concepts)
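One way to make the term/concept distinction concrete is to store, for each term, the set of meanings it covers and compare the sets. In the sketch below only the German entry (three meanings) comes from the slide; the second entry is a purely hypothetical placeholder added to have something to compare against, and the function name is an assumption.

```python
# Meanings (concepts) listed on the slide, keyed by their number.
MEANINGS = {
    1: "print or writing is clear and legible",
    2: "information is intelligible to the consumer",
    3: "information is in the consumer's national language",
}

# term -> set of meanings it covers.
COVERAGE = {
    ("de", "klar und verständlich"): {1, 2, 3},   # from the slide
    ("xx", "hypothetical term"): {1, 2},          # invented for illustration only
}


def misalignment(term_a, term_b, coverage):
    """Return (meanings only in term_a, meanings only in term_b)."""
    a, b = coverage[term_a], coverage[term_b]
    return a - b, b - a


only_a, only_b = misalignment(
    ("de", "klar und verständlich"), ("xx", "hypothetical term"), COVERAGE
)
for number in sorted(only_a):
    print("not covered by the second term:", MEANINGS[number])
```

Two terms that are acceptable translations of each other can still differ in coverage, which is why equivalence has to be stated between concepts rather than between strings.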

Multi-lingual Multi-LegalSystem Legal Taxonomy Syllabus
- The meaning(s) of legal terms cannot be separated from their relation to a legal system
- Legal terms can be translated, but conceptual frames (= taxonomies) do not necessarily follow
Methodology
- In-depth analysis of terms in different systems (including European Directives)
- Manual comparison based on legislation, jurisprudence, and doctrine
- Development of a resource for private law and an interface for resource update

Multi-lingual Multi-LegalSystem

Multi-lingual Multi-LegalSystem ontologies depend on the legal system!

Building lexical ontologies for law: methodological choices
- Definition or adoption of a Knowledge Organization System (KOS), e.g. WordNet
- Concept selection: bottom-up / top-down, manual / automatic
- Legal concept translation: how to set equivalence relations among legal concepts of different languages (belonging to different legal systems)?
- Disambiguating several kinds of polysemy

Dalos (Drafting Legislation with Ontology-based Support)
- Dalos (eParticipation 2007-2008) aims at providing the European Legislator with a semantic tool to support terminological harmonization and meaning coherence
- A modular architecture allows multiple views of the semantic components and flexible interfacing between the lexical and formal ontological layers

Concept selection: the bottom-up approach of Dalos
- The DALOS lexicon is automatically extracted from EU texts (Directives and Judgments) on Consumer Law, using NLP tools (syntactic parsing, conceptual clustering, statistics, learning, etc.)
- Concepts selected by two NLP learning tools are merged and structured in a WordNet-like model
- A perfect equivalence relation holds among concepts automatically extracted from aligned fragments of multilingual parallel corpora (no need for a shared ILI)
- Ontology alignment (classification of concepts in a Domain Ontology)

Dalos (www.dalosproject.eu)

References /2
- Carlo Strapparava, Semantica, book chapter in: "Instrumentum vocale: intelligenza artificiale e linguaggio", Bononia University Press, 2008 (in Italian)
- Fellbaum, C. (ed.) (1998). WordNet: An Electronic Lexical Database. MIT Press, Cambridge, Mass.
- Gangemi, A., Guarino, N., Masolo, C., Oltramari, A., Schneider, L. (2002), Sweetening Ontologies with DOLCE. In: Proceedings of EKAW 2002
- Peters W., Sagri M.T., Tiscornia D. (2007), The structuring of legal knowledge in LOIS, Artificial Intelligence and Law, 15:117-135.
- Vossen, P., Peters, W. & Díez-Orzas, P. (1997). The Multilingual design of the EuroWordNet Database. In: Mahesh, K. (ed.), Ontologies and multilingual NLP, Proceedings of IJCAI-97 workshop
- L. Lesmo, G. Boella, A. Mazzei, and P. Rossi. Multilingual conceptual dictionaries based on ontologies: Analytical tools and case studies. In Proc. of Conference Approaching the Multilanguage Complexity of European Law: Methodologies in Comparison, pages 1-14, Florence, November 2006
- T. Agnoloni, L. Bacci, E. Francesconi, W. Peters, S. Montemagni and G. Venturi, A two-level knowledge approach to support multilingual legislative drafting, in J. Breuker, P. Casanovas, E. Francesconi, M. Klein (eds.), Legal Ontologies and Semantic Web, IOS Press, 2008

3 Ontologies - Legal ontologies

Ontology (def.)
An ontology is a formal specification of a shared conceptualization of a domain of interest
- formal: the ontology should be machine-readable
- shared: it is accepted by a group or community
- further: it should be restricted to a given domain of interest and therefore model concepts and relations that are relevant to a particular task or application domain
- axiomatic characterisation (allows reasoning)
- formal constraints density
- language independent

Typologies
Purpose
- Interoperability (systems) / communication (humans)
- Systems engineering
- Support knowledge acquisition
- Reasoning and problem solving
- Indexing and search
Generality
- Upper / Foundational / Top (e.g. Event)
- Core (central concepts)
- Domain (a vocabulary of a particular domain)
Content
- Task (activity or process)
- Domain (vocabulary)

Upper-Core-Domain Breuker et al. 2005

Design approaches Top down Bottom up Reuse (e.g. informal knowledge, legacy) Standard vocabularies (dc, skos, foaf, FRBR,..) Patterns (typical solutions to recurrent modelling problems)

(some) Legal Ontologies (source: Nuria Casellas, Lex Summer School 2010)
- LKIF-Core: http://www.estrellaproject.org/lkif-core/
- Core Legal Ontology: http://www.loa-cnr.it/ontologies/clo/corelegal.owl
- Ontology of Legal Cases: http://wyner.info/research/ontologies/legalcaseontology_v9.owl
- Copyright ontology: http://rhizomik.net/ontologies/2006/01/copyrightonto.owl
- Best-user Ontology (BATNA): http://www.best-project.nl/ontology/best.daml
- Minimal Model Economic Crimes Ontology: http://www.man.poznan.pl/~jolac/minimalmodel/minimalmodel.owl
- Free Legal Ontology (powers, persons, institutions): http://purl.org/derecho/vocabulario#

CLO Core Legal Ontology
- The CLO provides the general categories of the legal domain that are in principle found in all legal systems and sub-domains, like: law, legal norm, regulation, legal agent, legal role [Gangemi et al. 2005]
- CLO depends on DOLCE+.
- The Norm Case pattern is a component within the CoreLegal (CLO) module: legal cases conform to norms when actions, objects and values are classified by tasks, roles, and parameters respectively. For example, an obligation for a role towards a task should correspond to a participation of an agent (object) in an action.

LKIF-Core Core ontology of basic legal concepts http://www.estrellaproject.org/lkif-core
Table: concepts listed under three criteria
- Importance: Law, Right, Jurisdiction, Permission, Prohibition, Rule, Sanction, Violation, Power, Duty, Legal position, Norm, Obligation, Permissive right, Argument
- Abstractness: Deontic operator, Law, Norm, Obligative right, Permissive right, Power, Right, Rule, Time, Anancastic rule, Existential initiation, Existential termination, Potestative right, Productive characterisation, Absolute obligative right
- Legal Relevance: Civil law, Law, Legal consequence, Legislation, Obligation, Right, Authority, Deontic operator, Duty, Jurisdiction, Legal fact, Legal person, Legal position, Legal procedure, Liability

LKIF-Core Core ontology of basic legal concepts 14 modules Top Basic action, expression, role Legal top, mereology, place, time, process Norm, legal-role, legal-action Vocabulary & Frames Modification, rules

A Legal Case OWL Ontology http://wyner.info/research/ontologies/legalcaseontology_v9.owl

Domain Ontology (DALOS) www.dalosproject.eu

Ontology-lexicon interfacement
Ontology lexicalisation
- Integrating ontologies (knowledge representation about objects) and lexicons (knowledge representation about words that refer to objects)
- Enriching ontologies with a lexical layer
- Conceptual characterization of concepts in semantic lexicons
TASKS
- Q&A systems
- Ontology-based Information Extraction from text
- Text analytics
- Text mining
- Ontology Learning from text
- Lexical methods in Ontology Alignment
- Automatic Translation
(Buitelaar, 2011)
Requirements for an ontology-lexicon model:
- Keep semantics separate from linguistic info
- Clearly separate world knowledge from word knowledge

Dalos knowledge base

LemOn (Lexical Model for Ontologies) www.lemon-model.net
- General model for formalizing lexical features relative to independently defined ontological semantics
- Embeds different lexical models: LMF, OWN, SKOS
- Provides a means to connect lexical features to ontological semantics
- Ontology Entity: the ontology entity that describes the meaning of the concept in a language-independent manner
- Lexical Sense: this object is used to attach all meaning-dependent properties of the word or term
- Lexical Entry: this represents the word or term itself
- Lexical Form: this object is used to describe a single form (e.g. plural, perfect, etc.) of an entry
- Written Representation: the actual string that the lexical entry is realized as
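The elements listed above can be sketched as plain Python data classes to show how word knowledge stays separate from world knowledge: the ontology entity is only referenced by URI from the sense. This mirrors the structure described on the slide and is not the lemon RDF vocabulary itself; the class and field names, the example URI and the sample entries are assumptions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class LexicalForm:
    written_rep: str          # the actual string


@dataclass(frozen=True)
class LexicalSense:
    reference: str            # URI of the language-independent ontology entity


@dataclass
class LexicalEntry:
    language: str
    canonical_form: LexicalForm
    other_forms: List[LexicalForm] = field(default_factory=list)
    senses: List[LexicalSense] = field(default_factory=list)


def lexicalizations(reference, lexicon):
    """(language, canonical written form) of every entry with a sense pointing at the entity."""
    return sorted(
        (entry.language, entry.canonical_form.written_rep)
        for entry in lexicon
        if any(sense.reference == reference for sense in entry.senses)
    )


# Hypothetical ontology entity and lexicon entries.
CONTRACT = "http://example.org/legal#Contract"
LEXICON = [
    LexicalEntry("en", LexicalForm("contract"), [LexicalForm("contracts")], [LexicalSense(CONTRACT)]),
    LexicalEntry("it", LexicalForm("contratto"), [LexicalForm("contratti")], [LexicalSense(CONTRACT)]),
]

print(lexicalizations(CONTRACT, LEXICON))
```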

References /3
- D. Allemang, J. Hendler, Semantic Web for the Working Ontologist, Second Edition: Effective Modeling in RDFS and OWL, Morgan Kaufmann, 2011
- Casellas, N. (2010), Semantic Enhancement of Legal Information Are We Up for the Challenge?, available at http://blog.law.cornell.edu/voxpop/2010/02/15/semantic-enhancement-of-legal-informatiom...-are
- Breuker, J., Casanovas, P., Klein, M.C.A., Francesconi, E. (Eds.), Law, Ontologies and the Semantic Web, Frontiers in Artificial Intelligence and Applications, IOS Press, 2009
- Gangemi, A., M.-T. Sagri, D. Tiscornia (2003). A Constructive Framework for Legal Ontologies, in Benjamins e.a., editors, Law and the Semantic Web, pp. 36-64. Springer Verlag, Berlin, vol. 3396
- R. Hoekstra, J. Breuker, M. Di Bello, and A. Boer. The LKIF Core ontology of basic legal concepts. In P. Casanovas, M. A. Biasiotti, E. Francesconi, and M.T. Sagri, editors, Proceedings of the Workshop on Legal Ontologies and Artificial Intelligence Techniques (LOAIT 2007), June 2007.
- A. Wyner and R. Hoekstra. A Legal Case OWL Ontology with an Instantiation of Popov v. Hayashi. Knowledge Engineering Review, xx:xx, 2011. To appear. http://wyner.info/research/papers/wynerhoekstraker2010ontology.pdf
- J. McCrae, D. Spohr, and P. Cimiano. 2011. Linking lexical resources and ontologies on the semantic web with lemon. In Proceedings of the 8th extended semantic web conference on The semantic web: research and applications - Volume Part I (ESWC'11)

4 Linked Open Data and Resources Integration LEX Summer School 2012 Sept. 13, 2012, Ravenna

Open Data
Open Data has to do with:
- Semantic Web technologies
- Open formats
- Transparency and participation
- Economic value, innovation, services
Open Data is about making data easily reusable in applications

The Open in Open Data
Open Data is about data re-use.
"A piece of content or data is open if anyone is free to use, reuse, and redistribute it, subject only, at most, to the requirement to attribute and/or share-alike."
The Open Knowledge Definition (OKD) (opendefinition.org/okd)

2 kinds of openness Legal Openness Technical Openness

Make Data legally open
Use existing licensing:
- Public Domain (no rights reserved): Public Domain Dedication & License (PDDL), Creative Commons Zero (CC0)
- Attribution (you must give credit): Open Data Commons Attribution (ODC-BY), Creative Commons Attribution (CC-BY)
- Sharealike (you must share back): Open Data Commons Open Database License (ODbL), Creative Commons Attribution Sharealike (CC-BY-SA)
Or create your own license

Make Data technically open
- Raw data: separate informative content from presentation
- Use open formats (txt, XML, HTML, odt)
- Use URIs for identification
- Expose the data for access via the HTTP protocol
- Use the RDF data model to describe the content of resources and to link them to other useful information (machine-readable METADATA)

RDF Data Model
Provides the missing relational level
- Simple data model based on triples
- Statements are <subject, predicate, object> triples: <Book X, hasauthor, Author Y>
- Can be represented as a graph: Book X --hasauthor--> Author Y
- Statements describe properties of resources
- A resource is any object that can be pointed to by a URI
- The subject of one statement can be the object of another
- A collection of statements creates a directed labeled graph
- Resources are distributed across the web
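The triple model described on this slide can be sketched in a few lines of Python. This is a minimal in-memory illustration, not an RDF library: the `http://example.org/` namespace, the `name` and `worksFor` predicates, and the helper names `match` and `author_names` are assumptions made for the example.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

EX = "http://example.org/"  # hypothetical namespace

triples: List[Triple] = [
    (EX + "BookX", EX + "hasAuthor", EX + "AuthorY"),
    # AuthorY is the object above and the subject below:
    (EX + "AuthorY", EX + "name", "Jane Doe"),
    (EX + "AuthorY", EX + "worksFor", EX + "UniversityZ"),
]


def match(graph: List[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return every triple matching the pattern; None acts as a wildcard."""
    return [t for t in graph
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


def author_names(graph: List[Triple], book: str) -> List[str]:
    """Walk two edges of the graph: book -> hasAuthor -> author -> name."""
    names = []
    for _, _, author in match(graph, s=book, p=EX + "hasAuthor"):
        for _, _, name in match(graph, s=author, p=EX + "name"):
            names.append(name)
    return names
```

Because the object of the first statement is the subject of the next two, `author_names(triples, EX + "BookX")` follows a path through the directed labeled graph rather than looking up a single record.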

From a Web of Documents to a Web of Data
Establish machine-readable, meaningful links among resources on the web of data, just as hyperlinks connect HTML documents on the web of documents
[Figure: 2007 vs. 2011]

Data as a Service (DaaS)
[Diagram: applications accessing a data cloud through data APIs]

Linked Data Principles
- Use URIs for identification of resources
- Expose the data for access via the HTTP protocol
- Use the RDF data model to describe the content of resources and to link them (and make them linkable) to other useful information
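As a rough illustration of these principles, the sketch below describes a resource with HTTP URIs and serializes the statements as N-Triples lines, including one link to a resource in another dataset. The resource URIs under `example.org` and `example.com` are hypothetical; the Dublin Core `title` and OWL `sameAs` property URIs are the standard ones. Treating every string that starts with `http://` or `https://` as a URI is a simplifying assumption of this sketch.

```python
from typing import Iterable, Tuple

Triple = Tuple[str, str, str]

DCT_TITLE = "http://purl.org/dc/terms/title"
OWL_SAMEAS = "http://www.w3.org/2002/07/owl#sameAs"

# Hypothetical URIs for the same directive in two different datasets.
ACT = "http://example.org/act/2003/98"
OTHER = "http://example.com/legislation/dir-2003-98"

DATA = [
    (ACT, DCT_TITLE,
     "Directive 2003/98/EC on the re-use of public sector information"),
    (ACT, OWL_SAMEAS, OTHER),  # link into another dataset
]


def is_uri(term: str) -> bool:
    """Simplification: anything that looks like an HTTP(S) URL is a URI."""
    return term.startswith("http://") or term.startswith("https://")


def escape_literal(text: str) -> str:
    """Escape backslashes, quotes and line breaks for an N-Triples literal."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def to_ntriples(triples: Iterable[Triple]) -> str:
    """Serialize triples as N-Triples, one statement per line."""
    lines = []
    for s, p, o in triples:
        obj = f"<{o}>" if is_uri(o) else f'"{escape_literal(o)}"'
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)
```

Serving the output of `to_ntriples(DATA)` at the resource's own HTTP address would cover all three principles at once; the serving side is left out here to keep the sketch self-contained.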

Open Government Data

First steps in Europe Directive 2003/98/EC on the re-use of public sector information (PSI)

Motivations for open PSI
- Transparency
- Accountability
- Public control of government
- Engagement / participation / democracy
- More efficient use of public resources
- Economic value: European PSI is estimated to be worth 27 billion euros
- Added value on data (metadata)
- Apps/services fed by publicly available data: a new market

Public Legal Information: towards a Legal Data Cloud EU Legal Sources (mockup)

Legal Information Access
Legal sources fragmentation
"To a worryingly large extent, statutory law is not practically accessible today, even to the courts whose constitutional duty it is to interpret and enforce it. There are four principal reasons. First, the majority of legislation is secondary legislation. Secondly, the volume of legislation has increased very greatly over the last 40 years. Thirdly, on many subjects the legislation cannot be found in a single place, but in a patchwork of primary and secondary legislation. Fourthly, there is no comprehensive statute law database with hyperlinks which would enable an intelligent person, by using a search engine, to find out all the legislation on a particular topic."
Lord Justice Toulson in R v Chambers [2008] EWCA Crim 2467
LINKED DATA PRINCIPLES / Open Standards

Legal Information Standards
- URN:Lex
- XML Schema (MetaLex/CEN, Crown XML, NormeInRete, Akoma Ntoso, ...)
- Legislative metadata model (FRBR + MetaLex/CEN + Dublin Core, ...)
- Legal ontologies: Core Legal Ontologies, LKIF
Ready for reuse in a linked open data context
Instantiate the annotation models with real legal document corpora

Interplay with semantic assets
Technically interoperable, thanks to web standards, with:
- Thesauri: EuroVoc (SKOS/XML)
- Legal ontologies (OWL/RDF)
- Multilingual retrieval
- Conceptual access, e.g. topic-filtered view on data
- Computational lexicons (OWL/RDF)
- Improved automated semantic relation

Semantic Web Applications
- Access/browse a global interconnected DB
- Merge / mix data
- Perform powerful cross-dataset queries
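Merging and cross-dataset querying can be sketched with plain Python sets of triples. This is a toy illustration under stated assumptions: the `http://example.org/` namespace, the `subject` and `interprets` predicates, and the sample acts and cases are all hypothetical, and a real application would more likely use a triple store and a query language.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]

EX = "http://example.org/"  # hypothetical namespace

# Dataset 1: legislation, classified by subject.
legislation: Set[Triple] = {
    (EX + "act/1", EX + "subject", "data protection"),
    (EX + "act/2", EX + "subject", "public procurement"),
}

# Dataset 2: case law, published separately, pointing at the same act URIs.
case_law: Set[Triple] = {
    (EX + "case/10", EX + "interprets", EX + "act/1"),
    (EX + "case/11", EX + "interprets", EX + "act/2"),
    (EX + "case/12", EX + "interprets", EX + "act/1"),
}

# Merging is a set union, because both datasets share one data model
# and identify acts with the same URIs.
merged = legislation | case_law


def cases_on_topic(graph: Set[Triple], topic: str) -> List[str]:
    """Join across datasets: cases interpreting any act on the given topic."""
    acts = {s for s, p, o in graph if p == EX + "subject" and o == topic}
    return sorted(s for s, p, o in graph
                  if p == EX + "interprets" and o in acts)
```

The query only returns results on the merged graph: neither dataset alone knows both which acts concern a topic and which cases interpret them.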

Innovative legal applications/services
On top of the legal data layer:
- Views / services / domain-specific apps
- Integrated access by subject
- Cross-dataset query
- Recompose fragmented sources in a single place, providing services that access distributed resources
- Automated (machine-readable resources)
- The more deeply annotated, the more powerful

LEGAL DATA: Legislation, Case-Law, Bills, Amendments, Proposals, Votes
Mash-ups
SOCIAL DATA: Blogs, Comments, News, Social networks
SCIENTIFIC DATA: Literature, Bibliography, Abstracts, Doctrine
FACTUAL DATA: Trends, Statistics, Indicators

legislation.gov.uk

doc.metalex.eu

api.epdb.eu

opencongress.org

openparlamento.it

openspending.org

References /4
- Berners-Lee, T. (2006), Linked Data - Design Issues, available at http://www.w3.org/DesignIssues/LinkedData.html
- Bizer, C., Heath, T., Berners-Lee, T. (2009), Linked Data - The Story So Far, in Heath, T., Hepp, M., Bizer, C. (eds.), Special Issue on Linked Data, International Journal on Semantic Web and Information Systems (IJSWIS)
- Heath, T., Bizer, C. (2011), Linked Data: Evolving the Web into a Global Data Space (1st edition), Synthesis Lectures on the Semantic Web: Theory and Technology, 1:1, 1-136, Morgan & Claypool
- Maali, F., Cyganiak, R., Peristeras, V. (2012), A publishing pipeline for Linked Government Data, in 9th Extended Semantic Web Conference (ESWC 2012), Springer, Heraklion, Crete, Greece
- Agnoloni, T., Sagri, M.T., Tiscornia, D., Opening Public Data: a path towards innovative legal services, in Proceedings of LVI 2011