Semantic Technology and Cloud Computing Applied to Tactical Intelligence Domain
Steve Hamby
Chief Technology Officer
Orbis Technologies, Inc.
shamby@orbistechnologies.com
678.346.6386
Abstract
The tactical intelligence community is plagued by an inability to share actionable, processed information across the services. The Distributed Common Ground System (DCGS) Integration Backbone (DIB) allows services to share metadata about information resources, but it does not provide the capability to share the relationships among that information that are needed to make decisions. Semantic technologies have been successfully applied in the tactical intelligence community to address this gap in actionable intelligence sharing. Semantic technologies, and the associated standards Resource Description Framework (RDF) and Web Ontology Language (OWL), hold promise for providing the tools required to capture and disseminate complex information across the services. Providing accurate solutions that are scalable and reusable can be challenging as the number of data sources and the volume of collected data continue to grow. This session will present a proven approach to applying semantic solutions to enterprise-level information sharing: cloud computing technologies serve as a scalable back-end infrastructure, lightweight Ozone widget-based UIs form the front end, and semantic technologies provide the agile data space. This approach delivers solutions that leverage the power of large-scale RDF graphs in a cloud environment while shielding the end user from the complexities of the back-end infrastructure (e.g., complex queries, data mappings, and rules).
Real-world use cases of this approach, primarily in tactical intelligence but also in commercial industry, will be discussed to highlight both the benefits and the technological challenges of building enterprise-level solutions, including:
  Semantic integration of disparate data sources using layered federated ontologies
  Provenance of topics and concepts across unstructured data sources (e.g., HUMINT reports) and integration of those items with structured data
  Platform requirements for giving large numbers of user communities access to their data via a shared cloud environment
  Capture of SME information necessary to build end-user applications (e.g., desktop and mobile apps) for a variety of domain applications and different levels of the organization
Overview
  Semantic Technology: Future for Actionable Intelligence
  Cloud Computing: More than Lean IT
  Convergence of Cloud Computing and Semantic Technologies
  Tactical Intelligence Community Opportunities & Capabilities
  Linked Open Data and Open Government Initiative
  Layering Semantics in the Cloud for IC
  Smarter Data Drives User Interface
  Summary
Semantic Technology: Future for Standards-based, Interoperable, Actionable Intelligence
  World Wide Web Consortium (W3C) Based Standards
    HTML, XML, RDF / OWL and others
    Industry and Government Involvement in Standards
  Schema-less RDF
    Derived from Subject-Predicate-Object in Language
    Can Express Anything
  Layered Semantic Stack Provides Richer Information
    Based on RDF/XML Foundation
  Designed to Model Human Cognition to Help Analyst With Decisions
  W3C Semantic Web Stack
  RDF Schema-less Context For Resources
  Doesn't Replace Analyst
  Dennis Wisnosky
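The subject-predicate-object idea on this slide can be sketched in a few lines of plain Python. This is only an illustrative sketch: the `http://example.org/intel/` namespace, the entity names, and the `match` helper are assumptions made for the example, not part of any DCGS or W3C tooling.

```python
# Hypothetical namespace and facts, for illustration only.
EX = "http://example.org/intel/"

# A graph is just a set of (subject, predicate, object) statements.
triples = {
    (EX + "report42", EX + "mentions", EX + "personA"),
    (EX + "personA", EX + "memberOf", EX + "groupX"),
    (EX + "personA", EX + "lastSeenAt", EX + "locationY"),
}


def match(graph, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Schema-less: a new kind of statement needs no table or schema change.
triples.add((EX + "groupX", EX + "operatesIn", EX + "locationY"))

# Everything stated with personA as the subject.
about_person = match(triples, s=EX + "personA")
```

Because every fact has the same three-part shape, new relationship types can be added at any time, which is the sense in which the slide calls RDF schema-less.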
Cloud Computing: Lean IT and more
  Evolution of DISA Distributed Enterprise Computing Centers (DECC)
  Source: Defense Information Systems Agency (DISA)
  Source: National Institute of Standards and Technology (NIST)
  The New Capabilities Cloud Computing Paradigm
  High Level Apache HDFS Architecture (Source: IBM)
  Notional Map Reduce Architecture (Source: kk.org)
  "The primary advantage of the cloud is the ability to be agile and support new requirements... It's a blank slate. You can add technologies rapidly," MAJ Philip Root, Asst. PM, DCGS-A SIPRNet Cloud
  [Chart: Comparison of Costs for On-Premise vs. Cloud-based Office Solutions; USD costs per user per month; series: Infrastructure, Subscription; data labels: 25, 0, 11, 9, 4, 4. Source: Forrester Research]
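The Map Reduce architecture named on this slide can be illustrated with a small, single-process sketch. It does not use the Hadoop API; the partitions, the `mentions` predicate, and the function names are hypothetical and only mimic the map, shuffle, and reduce phases that a cluster framework would distribute across nodes.

```python
from collections import defaultdict
from itertools import chain

# Hypothetical partitions of (subject, predicate, object) triples, standing in
# for blocks of a large graph stored across cluster nodes.
partitions = [
    [("r1", "mentions", "pA"), ("pA", "memberOf", "gX")],
    [("r2", "mentions", "pA"), ("r2", "mentions", "pB")],
]


def map_phase(partition):
    """Emit (entity, 1) for every entity that some report mentions."""
    return [(o, 1) for (_, p, o) in partition if p == "mentions"]


def shuffle(pairs):
    """Group emitted values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the grouped values to get one count per entity."""
    return {key: sum(values) for key, values in grouped.items()}


mapped = chain.from_iterable(map_phase(part) for part in partitions)
mention_counts = reduce_phase(shuffle(mapped))
```

In a real cluster each `map_phase` call would run on the node that already holds its partition, so only the small emitted pairs cross the network.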
Faster, Better Actionable Intel with Less: The Perfect Storm for Semantics in Cloud Convergence
  Semantic systems magnify data problems
    RDF provides verbose data description
    A single text document could result in tens of thousands of RDF triples
    A large relational database can easily generate billions of triples
    Large Enterprise Graphs present scalability opportunities
    Provenance and pedigree highlight years of IT evolution with lack of proper data efficacy
    Authoritative and supplementary data sources are either treated equally or require additional logic to resolve
    Confidence of results is reduced without addressing data efficacy
  Cloud provides enterprise scalability infrastructure
    Rich analytics provide more expressive logic to enrich RDF and OWL
  Linked Open Data and Data.gov Have Shown Viable Success That the IC Can Leverage
  Data Increase and Semantic Explosion Compound Needle in a Stack of Needles Analysis
  Analytics and data are processed together to maximize network efficiency
  Source: Kevin Meiner's DI2E Kickoff Briefing, 31 March 2011
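The "additional logic" needed to resolve authoritative against supplementary sources might look like the following sketch. The source names, their ranking, and the `resolve` helper are assumptions for illustration; the sketch also simplifies by keeping a single value per subject and predicate, which would be wrong for genuinely multi-valued properties.

```python
# Hypothetical source ranking: a lower number means a more authoritative source.
SOURCE_RANK = {"authoritative_db": 0, "field_report": 1, "open_source": 2}

# Each statement carries its provenance as a fourth element.
statements = [
    ("personA", "locatedIn", "cityX", "open_source"),
    ("personA", "locatedIn", "cityY", "authoritative_db"),
    ("personA", "alias", "the Fox", "field_report"),
]


def resolve(stmts, rank):
    """Keep, per (subject, predicate), the value from the best-ranked source."""
    best = {}
    for s, p, o, src in stmts:
        key = (s, p)
        if key not in best or rank[src] < rank[best[key][1]]:
            best[key] = (o, src)
    return best


resolved = resolve(statements, SOURCE_RANK)
```

Keeping the winning source alongside each value lets a user interface show the pedigree behind a result, which is what supports confidence in it.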
Actionable Intelligence Sharing: Capabilities and Opportunities in Tactical IC
  Semantic Technology is already in DCGS programs
    Semantic Wiki
    RDF Repositories
    OWL Ontology
  DIB Supports Sharing of Entities and Intel Sources (Reports, Imagery, etc.)
    Does not share relationships and properties well (Actionable Intelligence)
  OUSD(I) FVEY Memo
  DIB 2.0 Security
  Linked Data in DIB (4.0?)
  "We must develop and deploy a cross-domain, global, integrated DIIE that interfaces seamlessly with the emerging intelligence community architecture, and incorporates important new capabilities." OUSD(I) DI2E Memo
Integrating IC Clouds: Leveraging Linked Open Data and Open Government Initiative
  Linked Open Data Provides Method to Publish Data To Automatically Connect With Other Data Sources
    URI / IRI: a generic means to identify entities or concepts
    HTTP: a simple yet universal mechanism for retrieving resources, or descriptions of resources
    RDF: a generic graph-based data model with which to structure and link data
  Successful Open Government Initiative
    Data Used Beyond Plans
    Standards-based is KEY
  LOD Effort in National IC will be Open Sourced
    Enhanced Security
  W3C Government Linked Data WG
  Linked Data Integration Framework (LDIF)
  PRISM (Blackbook) Architecture (Source: IARPA Blackbook Briefing)
  Open Government Directive / Initiative
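A minimal sketch of why shared URIs let published data connect automatically: when two publishers identify an entity with the same URI, combining their graphs is a set union. The namespace, the two graphs, and the `describe` helper below are hypothetical, and retrieving the graphs over HTTP is left out.

```python
EX = "http://example.org/id/"  # hypothetical URI namespace

# Two independently published graphs that reuse the same URI for one entity.
service_a_graph = {(EX + "vehicle7", EX + "observedAt", EX + "checkpoint3")}
service_b_graph = {(EX + "vehicle7", EX + "registeredTo", EX + "personA")}

# Because both sources use the same identifier, integration is a set union;
# no schema mapping step is needed for these statements.
merged = service_a_graph | service_b_graph


def describe(graph, uri):
    """Collect every (predicate, object) pair known about one URI."""
    return {(p, o) for (s, p, o) in graph if s == uri}


vehicle_facts = describe(merged, EX + "vehicle7")
```

Neither publisher had to know about the other: the relationship between the observation and the registration appears only once the graphs are merged.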
Notional Stack for Semantics in Cloud: Leveraging Best Practices From IC, Government, and Industry
  Cloud
  User Visualization & Experience
  Increased Actionable Intelligence
  Analytics & Enrichment
  Semantic Representation
  Data Source & Data Access
  Security
  Process Management
  Sensors & Sensor Feeds
Cloud-based Semantic Systems: Expanding User Interface Capabilities
  Cloud Semantic Systems Drive New UI Capabilities:
    Semantic Similarity Analysis
    Dynamic Faceted Navigation
    Enterprise Graph Analysis
    Automated Link Analysis / Social Network Analysis
    Graph Matching
  "The old, industrial-age framework paradigm was one that said [buy] 'one system that does it all'. Where we need to get to is a framework that provides analysts with the ability to choose applications. Once that is in place, the industry is not driving the show. G-2, the intel corps, is driving the show." MG John Custer III, former commander of the US Army Intelligence Center, Fort Huachuca.
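Dynamic faceted navigation, one of the UI capabilities listed on this slide, can be sketched as deriving the facets from whatever predicates appear in the data instead of hard-coding them. The report names, predicates, and helper functions below are hypothetical examples, not an Ozone widget API.

```python
from collections import Counter, defaultdict

# Hypothetical report metadata expressed as (subject, predicate, object).
triples = [
    ("report1", "type", "HUMINT"),
    ("report1", "region", "North"),
    ("report2", "type", "IMINT"),
    ("report2", "region", "North"),
    ("report3", "type", "HUMINT"),
    ("report3", "region", "South"),
]


def facets(graph):
    """Derive facet names and value counts from the predicates in the data."""
    result = defaultdict(Counter)
    for _, p, o in graph:
        result[p][o] += 1
    return result


def filter_subjects(graph, predicate, value):
    """Return the subjects that carry the selected facet value."""
    return {s for s, p, o in graph if p == predicate and o == value}


available = facets(triples)
north_reports = filter_subjects(triples, "region", "North")
```

If a new predicate shows up in the graph, it appears as a new facet with no change to the interface code, which is how richer data can drive the user interface.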
Summary
  Semantic Technology Provides Standards-based Information Interoperability
    Complements SOA's System Interoperability and XML's Data Interoperability
    Enables Sharing of Rich Relationships and Properties
  Cloud Technology is More than Lean IT
    Provides a Blank Slate for Driving New Capabilities Faster and Cheaper
  Demands for Better, Faster Intel in a Cost Reduction Environment Create a Perfect Storm for Convergence of Semantics in Cloud
    Technologies like Apache Hadoop and Map Reduce Enable Hard Semantic Tasks like Enterprise Graph Matching, Semantic Text Analytics, etc.
  Linked Open Data Promotes Data Co-Existence Without Cost of Complete Integration
    Open Government Initiative Success Makes this Approach Appealing
    Open Source Tools Exist to Make Publishing Easy; IC Program will Open Source their Tool Soon
    The DIB Needs to be Modified to Promote Linked Open Data in MDC
  A Layered Approach for Ontology Design and Implementation Works Best
    Needs to be Integrated with Process Controls and Security
  Semantics in Cloud Expands User Interface and Experience
    Provides Analysts Better, Faster Actionable Intelligence