InSciTe Project
Hanmin Jung, Head of the Dept. of Computer Intelligence Research
KISTI Institute of Advanced Information S/W Research Center Dept. of Computer Intelligence Research
Human vs. Machine Intelligence
Machine Intelligence: IBM Watson
http://powet.tv/powetblog/wp-content/uploads/2011/02/watson_the_computer_beats_ken_jennings_and_brad_rutter_at_jeopardy_full.jpg
Machine Intelligence: Stanford's Robotic Car
http://cdn3.digitaltrends.com/wp-content/uploads/2011/10/1200-siri.jpg
Machine Intelligence: Apple Siri
http://cdn3.digitaltrends.com/wp-content/uploads/2011/10/1200-siri.jpg
Web Evolution
Size of Data in the World
  Q: How about humans?
  A: The human brain can store information in the range of hundreds of terabytes to petabytes.
  http://www.ektron.com/billcavablog/big-data-big-content-big-challenges/
Effect of Big Data: Search Evaluation
http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en//pubs/archive/40491.pdf
Value Pyramid
  InSciTe Adaptive (2012)
  InSciTe Advanced (2011)
  Figure labels: Forecasting, Scenario Planning, Advising, Decision Support, Extracting, Search, Clustering
  Modified from D. Bousfield & P. Fooladi, STM Information: 2009 Final Market Size and Share Report, 2010.
Needs of Experts
  Relationship between technologies
  Technology gap
  Market shares
  Key players in group
  Citation information
  Leading companies
  Social information
  New entries
  Product information
  Partner candidates recommendation
  Standard patents
  Technology hierarchy
  Trend reports
  Market size
  Significance of papers/patents
  Search history
  Core technologies
  Information verification
Technology Intelligence
  R. Rohrbeck, H. Arnold, and J. Heuer, Strategic Foresight in Multinational Enterprises, 2007.
Quantitative Analytics
Quantitative Analytics: Insights for Search
http://www.google.com/insights/search/
TI Projects: FUSE
  Funded by IARPA (early 2011 ~ early 2016)
  Kick-off meeting in summer 2011
  Foresight and Understanding from Scientific Exposition Program
  Seeks to develop automated methods that aid in the systematic, continuous, and comprehensive assessment of technical emergence using information found in the published scientific, technical, and patent literature
  Partners: BAE Systems, Brandeis Univ., New York Univ., 1790 Analytics,
TI Projects: CUBIST
  Funded by the European Commission (late 2010 ~ late 2013)
  1st CUBIST workshop in July 2011
  Combining and Uniting Business Intelligence with Semantic Technologies Program
  Aims to develop new ways not only to interrogate the massive volume of data on the Internet, but also to analyze the different formats it exists in, such as blogs, wikis, and video
  Partners: SAP, Ontotext, Sheffield Hallam Univ.,
TI Projects: Common Technologies
  Semantic technologies
    Ontology, reasoning, URI scheme
  Analytics model
    BYOM (e.g. technology opportunity discovery model, technology evolution model, formal concept analysis model)
  Information extraction (InSciTe, FUSE)
    Named entities and events/relations in textual documents
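Formal concept analysis, one of the analytics models named on this slide, can be illustrated with a minimal sketch. The slides do not describe how any of the projects implement it, so the toy context, the technology and attribute names, and the brute-force enumeration below are all assumptions made for illustration only. A formal concept is a pair (extent, intent) in which the extent is exactly the set of objects sharing every attribute in the intent, and the intent is exactly the set of attributes shared by every object in the extent.

```python
from itertools import combinations

# Hypothetical toy context (not from the slides): technology -> attributes.
context = {
    "tech_a": {"semantic", "search"},
    "tech_b": {"semantic", "analytics"},
    "tech_c": {"semantic", "search", "analytics"},
}


def common_attributes(objects, context, all_attrs):
    """Attributes shared by every object in `objects` (all attributes if empty)."""
    shared = set(all_attrs)
    for obj in objects:
        shared &= context[obj]
    return shared


def objects_having(attrs, context):
    """Objects whose attribute set contains every attribute in `attrs`."""
    return {obj for obj, have in context.items() if attrs <= have}


def formal_concepts(context):
    """Enumerate all formal concepts by brute force over object subsets.

    This is exponential in the number of objects, so it only suits tiny
    contexts; it is not an efficient FCA algorithm.
    """
    all_attrs = set().union(*context.values())
    names = sorted(context)
    concepts = set()
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            intent = common_attributes(subset, context, all_attrs)
            extent = objects_having(intent, context)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts


concepts = formal_concepts(context)
for extent, intent in sorted(concepts, key=lambda c: len(c[0])):
    print(sorted(extent), sorted(intent))
```

Each printed pair groups technologies by the attributes they have in common, which is the kind of grouping a concept lattice is built from.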
InSciTe Advanced (2011)
InSciTe Advanced (2011): Data Fact Sheet
  Articles: 15.4 million (6.7 million for papers, 8.7 million for patents)
    IEEE proceedings/journals (2001~2011)
    Papers for all technical areas (2009~2011)
    US/EU/Japan patents (2001~2011)
  Technical terms: 68 thousand
  Institutions: 340 thousand
InSciTe Adaptive (2012)
InSciTe Adaptive (2012): Crawling Web Data by RSS & Google API
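The RSS side of such a crawler can be sketched as follows. The slides give no implementation details, so this is only a minimal illustration using Python's standard library: the sample feed, the `parse_rss_items` helper, and the chosen fields are hypothetical, and a real crawler would also fetch feeds over HTTP, deduplicate items, and handle malformed XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 document standing in for a fetched feed.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>First item</title>
      <link>http://example.com/1</link>
      <pubDate>Mon, 02 Jan 2012 00:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Second item</title>
      <link>http://example.com/2</link>
    </item>
  </channel>
</rss>"""


def parse_rss_items(xml_text):
    """Return one dict per <item> element, with empty strings for missing fields."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "published": item.findtext("pubDate", default=""),
        })
    return items


items = parse_rss_items(SAMPLE_RSS)
for entry in items:
    print(entry["title"], entry["link"])
```

The extracted links would then be the starting points for downloading the Web documents themselves.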
InSciTe Adaptive (2012): Data Fact Sheet
  Articles: 22.6 million (9.8 million for papers, 7.6 million for patents, 5.3 million for Web data)
    All technical areas (2001~2011)
  Named entities: 1.9 million
  Authority dictionary: 1.5 million entries
  Linked Data: 290 GB (will be connected)
InSciTe Adaptive (2012): Big Data Test Bed
Case Studies: Ministry of Justice (2007~)
Case Studies: Korea Customs Service (2010~2011)
Case Studies: Defense Agency for Technology and Quality (2011~2012)
Case Studies: ISTIC, China
  For a national digital library based on analytics
InSciTe Architecture
  OntoVerifier: Reasoning Verifier
  OntoPipeliner: Semantic Service Composer
  OntoRelFinder: Relationship Path Finder
  OntoReasoner: Reasoning Engine
  OntoFrame SS&AE: Semantic Search & Analytics Engine
  OntoURI: Semantic Knowledge Manager
  OntoURIResolver: Identity Resolver
  SINDI-CORE/LINK: Entity & Relationship Extractor
  Web Data Crawler: RSS/Google API
  Analytics Models
    TLCD Model: Technology Life Cycle Discovery Model
    TLC Model: Technology Life Cycle Model
    ETD Model: Emerging Technology Discovery Model
    TUC Model: Terminology Use Cycle Model
  Ontology
  Linked Data
  Web Data
  Literature
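The idea of a relationship path finder can be illustrated with a generic breadth-first search over an entity graph. The slides do not say how OntoRelFinder works internally, so this sketch is not its algorithm: the graph, the entity names, and the `find_path` function are hypothetical and only show what finding a shortest chain of relationships between two entities could look like.

```python
from collections import deque

# Hypothetical undirected entity graph (adjacency lists), not from the slides.
graph = {
    "researcher_x": ["paper_1"],
    "paper_1": ["researcher_x", "technology_t"],
    "technology_t": ["paper_1", "patent_9"],
    "patent_9": ["technology_t", "company_c"],
    "company_c": ["patent_9"],
}


def find_path(graph, start, goal):
    """Return a shortest path from start to goal as a list, or None if unreachable."""
    if start == goal:
        return [start]
    parent = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, []):
            if neighbor in parent:
                continue
            parent[neighbor] = node
            if neighbor == goal:
                # Walk the parent links back to the start, then reverse.
                path = []
                current = neighbor
                while current is not None:
                    path.append(current)
                    current = parent[current]
                return path[::-1]
            queue.append(neighbor)
    return None


path = find_path(graph, "researcher_x", "company_c")
print(" -> ".join(path))
```

Breadth-first search is used here because it returns a path with the fewest hops in an unweighted graph; weighted or typed relationships would need a different search strategy.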
InSciTe Project Goal & Tasks (2013)
  Development of S&T Literature Big Data Analytics/Application Platform
    Big Data mining technology
    Semantic analytics technology
    Big Data relationship analytics/application technology
  Technologies
    Text mining
    Multimedia mining
    Semantic integration
    Reasoning and graph analysis
    Modeling and assessment for relationship analytics and application
InSciTe Project Partners (2013)
  OVUM, UK
    Building analytics model
    Understanding business needs
    Planning InSciTe service
  MSRA, China
    TBD
  GESIS & Hildesheim Univ., Germany
    Analyzing patent trends
    Assessing InSciTe service platform
Homepage: http://semantics.kisti.re.kr
"A lot of times, people don't know what they want until you show it to them." by Steve Jobs
"Many people won't be convinced until they've seen it for themselves." by Jakob Nielsen
Thank you
jhm@kisti.re.kr