Search Based Applications




CHAPTER 1
Search Based Applications

1.1 INTRODUCTION

Figure 1.1: Can you see the search engine behind these screens?

Management of information via computers is undergoing a revolutionary change as the frontier between databases and search engines is disappearing. Against this backdrop of nascent convergence, a new class of software has emerged that combines the advantages of each technology, right now, in Search Based Applications.

Until just a short while ago, the lines were still relatively clear. Database software concentrated on creating, storing, maintaining and accessing structured data, where discrete units of information (e.g. product number, quantity available, quantity sold, date) and their relation to each other were well defined. Search engines were primarily concerned with locating a document or a bit of information within collections of unstructured textual data: short abstracts, long reports, newspaper articles, email, Web pages, etc. (classic Information Retrieval, or IR; see Chap. 3). Business applications were built on top of databases, which defined the universe of information available to the end user, and search engines were used for IR on the Web and in the enterprise.

Figure 1.2: Databases have traditionally been concerned with the world of structured data; search engines with that of unstructured data (some of these data types, like HTML pages and email messages, contain a certain level of exploitable structure, and are consequently sometimes referred to as "semi-structured").

Such neat distinctions are now falling away as the core architectures, functionality and roles of search engines and databases have begun to evolve and converge. A new generation of non-relational databases, which shares conceptual models and structures with search engines, has emerged from the world of the Web (see Chapter 4), and a new breed of search engine has arisen which provides native functionality akin to both relational and non-relational databases (described in Chapters 3-9 and listed in Chapter 10).

It is this new generation engine that supports Search Based Applications, which offer precise, multi-axial information access and analysis that is virtually indistinguishable at a surface level from database applications, yet are endowed with the usability and massive scalability of Web search.

1.1.1 WHAT IS A SEARCH BASED APPLICATION?

We define a Search Based Application (SBA) as any software application built on a search engine backbone rather than a database infrastructure, and whose purpose is not classic IR, but rather mission-oriented information access, analysis or discovery. [1]

[1] This new type of application has alternately been referred to as a "search application," "search-centric application," "extended business application," "unified information access application" and "search-based application." The latter is the label used by IDC's Susan Feldman, one of the first industry analysts to identify SBAs as a disruptive trend and an influential force in the SBA label being adopted as the industry standard. Feldman has recently moved toward a more precise definition, limiting SBAs to "fully packaged applications" supplying "all the tools that are commonly needed for a specific task or workflow," that is to say, commercial-off-the-shelf (COTS) software [Feldman and Reynolds, 2010]. However, we prefer a broader definition to underscore one of the great benefits of the SBA model: the ability for anyone to rapidly and inexpensively develop highly specific solutions for unique contexts. Following the same pattern as database applications, we expect both custom and COTS SBAs to flourish over the next decade.

Definition: Search Based Application
A software application that uses a search engine as the primary information access backbone, and whose main purpose is performing a domain-oriented task rather than locating a document.
Examples: customer service and support; logistical track and trace; contextual advertising; decision intelligence; e-discovery.

SBAs may be used to provide more intuitive, meaningful and scalable access to the content in a single database, hiding away the complexity of the database structure as data is extracted and re-purposed by search engine techniques. They may also be used to autonomously and intelligently gather together massive volumes of unstructured and structured data from an unlimited number of sources (internal or external) and to make this aggregate data available in real time to a wide base of users for a broad range of purposes.

While search engines in the SBA context complement rather than replace databases, which remain ideal tools for many types of transaction processing, this re-purposing of search engines nonetheless represents a major rupture with a 30-year tradition of database-centered software application development.

In spite of the significance of this shift, the SBA trend has been unfolding largely under the radar of researchers, systems architects and software developers. However, SBAs have begun to capture the focused attention of business. [2]

"The elements that make search powerful are not necessarily the search box, but the ability to bring together multiple types of information quickly and understandably, in real time, and at massive scale. Databases have been the underpinning for most of the current generation of enterprise applications; search technologies may well be the software backbone of the future."
Susan Feldman, IDC LINK, June 9, 2010

[2] SBAs are fueling a significant portion of the growth in the search and information access market, which IDC estimates grew at double-digit rates in 2007 and 2008, and at a healthy 3.9% (to $2.1 billion) in 2009 [Feldman and Reynolds, 2010]. Gartner, Inc. estimates a compound annual growth rate of 11.7% from 2007 to 2013 for the enterprise search market [Andrews, 2010].
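To make the definition above more concrete, the following is a minimal sketch of the idea of placing structured rows and unstructured text behind one search index and then querying along more than one axis. It is an illustration under stated assumptions, not a description of any real search engine: the `TinyIndex` class, its methods, the field names and the sample records are all hypothetical.

```python
import re
from collections import defaultdict


class TinyIndex:
    """Toy inverted index: maps each token to the ids of records containing it."""

    def __init__(self):
        self.postings = defaultdict(set)  # token -> set of record ids
        self.records = {}                 # record id -> original record (dict)

    def add(self, rec_id, record):
        """Index every field value of a record, whatever its source."""
        self.records[rec_id] = record
        for value in record.values():
            for token in re.findall(r"\w+", str(value).lower()):
                self.postings[token].add(rec_id)

    def search(self, query, **facets):
        """Return sorted ids matching all query tokens and all facet values."""
        tokens = re.findall(r"\w+", query.lower())
        if not tokens:
            return []
        hits = set.intersection(*(self.postings.get(t, set()) for t in tokens))
        return sorted(
            r for r in hits
            if all(self.records[r].get(k) == v for k, v in facets.items())
        )


index = TinyIndex()
# Structured rows, as they might come from an order database (hypothetical data).
index.add("order-1", {"source": "orders", "product": "router", "status": "shipped"})
index.add("order-2", {"source": "orders", "product": "modem", "status": "returned"})
# Unstructured text, as it might come from support email (hypothetical data).
index.add("mail-1", {"source": "email", "body": "My router keeps dropping the connection."})

print(index.search("router"))                  # the order and the email
print(index.search("router", source="email"))  # the email only
```

A customer-support screen built on such an index could show a customer's orders and messages together without the user ever writing a database query, which is the kind of task-oriented access the definition describes. A production engine would add ranking, linguistic processing and distributed storage, none of which this sketch attempts.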

1.2 HIGH IMPACT, LOW RISK SOLUTION FOR BUSINESSES

SBAs offer businesses a rapid, low-risk way to eliminate some of the peskiest and most common information systems (IS) problems: siloed data, poor application usability, shifting user requirements, systemic rigidity and limited scalability.

Figure 1.3: Search engine-based Sourcier makes vast volumes of structured water quality data accessible via map-based search and visualization, and ad hoc, point-and-click analysis.

SBAs allow businesses to clear these hurdles and bring together large volumes of real-time information in an immediately actionable form, thereby improving productivity, decision making and innovation. Even so, too many in the business community are still unaware that search engines can serve as an information integration, discovery and analysis platform. This is the reason we have written this book.

1.3 FERTILE GROUND FOR INTERDISCIPLINARY RESEARCH

We have also undertaken this project to introduce SBAs to a wider segment of the data management research community. Though the convergence of search and database technologies is gradually being recognized by this community [3], many researchers are still unaware of the pragmatic benefits of SBAs and the mutually beneficial evolutions underway in both search and database disciplines.

[3] See, for example, recent workshops like Using Search Engine Technology for Information Management (USETIM 09), held in August 2009 at the 35th International Conference on Very Large Data Bases (VLDB09), which examined whether search engine technology can be used to perform tasks usually undertaken by databases. http://vldb2009.org/?q=node/30

However, as a group of prominent database and search scientists recently noted, exploding data volumes and usage scenarios along with major shifts in computing hardware and platforms have resulted in an "urgent, widespread need for new data management technologies," innovations that will only come about through interdisciplinary research. [4]

Figure 1.4: This Akerys portal generates personalized, real-time real estate market intelligence based on unstructured online classifieds and in-house databases.

1.4 A VALUABLE TOOL FOR DATABASE ADMINISTRATORS

Like their research counterparts, many Database Administrators (DBAs) are also unfamiliar with SBAs. We hope this book will raise awareness of SBAs among DBAs as well, because SBAs offer these professionals a fast and non-intrusive way to offload overtaxed systems [5] and to reveal the full richness of the data those systems contain, opening database content up for free-wheeling discovery and analysis, and enabling it to be contextualized with external Web, database and enterprise content.

[4] From The Claremont Report on Database Research, the summary report of the May 2008 meeting of a group of leading database and data management researchers who meet every five years to discuss the state of the research field and its impacts on practice: http://db.cs.berkeley.edu/claremont/claremontreport08.pdf

[5] Offloading a database means extracting all the data that a user might want to access and indexing a copy of this information in a search engine. The term offloading refers to the fact that search requests no longer access the original database, whose processing load is hence reduced.
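The offloading described above (extracting database content and indexing a copy of it so that searches no longer reach the database) can be sketched in a few lines. This is only an illustration under assumptions: the `products` table, its rows, and the `offload` and `search` helpers are hypothetical, an in-memory SQLite database stands in for the production system, and a plain dictionary stands in for a real search engine index.

```python
import sqlite3
from collections import defaultdict

# A stand-in for the production database (hypothetical schema and rows).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, description TEXT)")
db.executemany(
    "INSERT INTO products (id, name, description) VALUES (?, ?, ?)",
    [
        (1, "Water filter", "Removes chlorine and sediment"),
        (2, "Test kit", "Measures nitrate levels in well water"),
    ],
)


def offload(connection):
    """Read every row once; build a token -> row-ids index and a copy of the rows."""
    postings = defaultdict(set)
    copies = {}
    for row_id, name, description in connection.execute(
        "SELECT id, name, description FROM products"
    ):
        copies[row_id] = {"name": name, "description": description}
        for token in f"{name} {description}".lower().split():
            postings[token].add(row_id)
    return postings, copies


postings, copies = offload(db)
db.close()  # the searches below never touch the database again


def search(term):
    """Answer a single-term query from the indexed copy only."""
    return [copies[i]["name"] for i in sorted(postings.get(term.lower(), ()))]


print(search("water"))
```

Because the index holds a copy, it reflects the database only as of the last extraction; how often to re-extract is a design decision this sketch leaves open.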

1.5 NEW OPPORTUNITIES FOR SEARCH SPECIALISTS

For search specialists who are not yet familiar with SBAs, we hope to introduce them to this significant new way of using search technology to improve our day-to-day personal and professional lives, and to make them aware of the new opportunities for scientific advancement and entrepreneurship awaiting as we seek ways to improve the performance of search engines in the context of SBA usage.

1.6 NEW FLEXIBILITY FOR SOFTWARE DEVELOPERS

We also hope to make software developers aware of the new options SBAs offer: one doesn't always need to access an existing database (or create a new one) to develop business applications, or to meticulously identify all user needs in advance of programming, and one need not settle for applications that must be modified every time these needs or source data change.

1.6.1 LECTURE ROADMAP

While this diversity of audiences and the short format of the book necessitate a surface treatment of many issues, we will consider our mission accomplished if each of our readers walks away with a solid (if basic) understanding of the significance, function, capabilities and limitations of SBAs, and a desire to go forth and learn more.

To begin, we'll first take a look at the ways in which information access needs have changed, then provide a comparative view of the ways in which search engines and databases work and how each has evolved. We'll then explain how SBAs work and how and when they are being used, including presenting several case studies. Finally, we will situate this shift within the larger context of evolutions taking place on the Web, including conceptions of the Deep Web, the Semantic Web, and the Mobile Web, and what these evolutions may mean for the next generation of SBAs.