Marketplace Overview: Text Analytics Vendor Options Nick Patience Research Director, Information Management The 451 Group
Choosing a vendor: things to consider
YOUR REQUIREMENTS
Corpus size and growth
Scalability
On-site vs. SaaS
Languages
Interoperability with existing systems
Compatibility with future tech (OWL, RDF, XML, etc.)

Choosing a vendor: things to consider
VENDOR ISSUES
Are there other customers in your specialty?
Viability of vendor
Willingness for pilot or proof of concept
Help available for configuration and installation

Issues affecting the text analytics market
1. Mergers and acquisitions
2. Regulatory mandates (FRCP, SarbOx)
3. On-premise licensing or SaaS
4. Economic uncertainty: discretionary or must-have?
5. Does a market even exist?

M&A to Date
Target        Acquirer           Deal Value   When    Why?
ClearForest   Reuters            $25M*        4/07    Customer buying supplier
Inxight       Business Objects   $76M         5/07    Need to understand text
Stratify      Iron Mountain      $158M        10/07   E-discovery
FAST          Microsoft          $1.24B       1/08    Boost SharePoint search
Teragram      SAS Institute      $10-15M*     3/08    Acquire own text analytics
*451 estimate; Source: 451 M&A KnowledgeBase

Business Drivers: E-discovery
December 1, 2006: Effective date of the Electronic Discovery Amendments to the Federal Rules of Civil Procedure

Business Drivers: Electronic Publishing
"There will be no media consumption left in ten years that is not delivered over an IP network. There will be no newspapers, no magazines that are delivered in paper form. Everything gets delivered in an electronic form."
Steve Ballmer, CEO, Microsoft, June 6, 2008

Business Drivers: Security / Fraud Detection / Risk Management

Market Map, End of 2008
Govt/Military Intelligence: SAP [Inxight], IBM Cognos, SAS, SPSS, Attensity, Autonomy, Infonic
Pharma & Life Sciences: Temis, IBM, SPSS
Early Warning (Mfg): SPSS, SAS, Attensity
Banking & Insurance: IBM, SAS, SPSS, Autonomy, FAST, Megaputer
Media & Publishing: FAST, Temis, Nstein, Infonic, Autonomy, ClearForest, Lexalytics

Market Map, End of 2008, cont.
Market Research & Surveys: SPSS, SAS
Customer Analytics: SAS, SPSS, Autonomy, Attensity, IXReveal
Business Intelligence: SAP [Inxight], Cognos, Clarabridge
Security: Autonomy, IBM, Cyveillance
General OEM: Basis, Lexalytics, SAS [Teragram]
Records Management: IBM

Founded: 2000
Base: Palo Alto, CA
Funding: $28m venture capital
Markets: Customer analytics; Travel and hospitality; Intelligence; Law enforcement
Key Customers: Whirlpool, JetBlue, Travelocity
Product: the Attensity Text Analytics suite
Capabilities: Exhaustive extraction; Targeted extraction; Statistical extraction; Output in OWL ontology language; Auto-categorization; Anaphora resolution
Avg deal size: $250,000 before services
On premise: Windows, Linux
SaaS

Founded: 1996
Base: Cambridge, UK
Funding: Public
Markets: Government / military intelligence; Security; Customer analytics; Banking and insurance; Media and publishing; Law
Key Customers: Standard & Poor's, Cisco, Sony, Halliburton, Gillette
Product: IDOL Server 7
Capabilities: Automatic categorization; Automatic taxonomy generation; Conceptual retrieval
Avg deal size: not available
On Premise: Windows, Linux, Solaris, AIX
SaaS: XXXX

Founded: 1995
Base: Cambridge, MA
Funding: < $10 million, In-Q-Tel
Markets: Government / military intelligence; Commercial search engines; General OEM
Key Customers: Google, Oracle, Siebel, FAST, HP, Yahoo
Product: Rosette Linguistics Platform
Capabilities: Multilingual text analytics; Language identification; Entity extraction; Name matching; Name translation
Avg deal size: $250-300,000
On Premise: Windows, Linux, Solaris
SaaS: XXXXXXX

Founded: 1972
Base: Walldorf, Germany
Funding: Public
Markets: Government & military intelligence; Business intelligence
Key Customers: Federal Agencies (DOA, DAA, DHS); OEM: SAS, IBM, Oracle
Product: BusinessObjects Text Analysis
Capabilities: 32 languages; Segmentation; Stemming; Part-of-speech tagging; Entity extraction; Document-level classification; Document summarization
Avg deal size: $$$$
On premise: Windows, Solaris, Linux, HP-UX, UIX
SaaS: XXXXXX

Founded: 2005
Base: Reston, VA
Funding: $10.2m, venture capital
Markets: Business intelligence
Key Customers: Intuit, H&R Block, Gaylord Hotels
Product: Content Mining Platform
Capabilities: Categorization; BI-tool friendly
Avg deal size: $150-300,000; $10,000 / month SaaS
On-premise:???????
SaaS

Founded: 1998
Base: Waltham, MA
Funding: Public
Markets: Media and publishing
Key Customers: Dow Jones, Air Force, Elsevier
Product: Calais
Capabilities: Tagging concepts; Categorization; Semantic tagging; Entity, fact and event extraction; Packaged extraction modules; Statistical and semantic tagging
Avg deal size: Not available
On Premise
SaaS: available

Founded: 1997
Base: Needham, MA
Funding: MSFT, public
Markets: Banking & insurance; Media & publishing
Key Customers: WeightWatchers.com, National Instruments, Autotrader.com
Product: FAST ESP
Capabilities: Language detection; Lemmatization; Synonyms; Thesaurus; Phrase detection; Spell-checking; Anti-phrasing
Avg deal size: $$$
On Premise: Windows, Linux, HP-UX, Solaris, UIX
SaaS: XXXXXXX

Founded: 1889
Base: Armonk, NY
Funding: Public
Markets: Military / govt intelligence; Pharma & life sciences; Security; Records management; Banking & insurance
Key Customers: Large Japanese telco provider; Large financial data provider; Large Japanese auto manufacturer
Product: Omnifind Analytics Edition
Capabilities: Trend analysis; Keyword search; Delta analysis; Automated alerting; Semantic search; Drill down search
Avg deal size: $$$$
On Premise: Windows, AIX, Linux
SaaS: XXXXX

Founded: 2000
Base: London, UK
Funding: Public
Markets: Media and publishing
Key Customers: Thomson Reuters, Dow Jones Factiva
Product: Sentiment
Capabilities: Sentiment analysis of print media
Avg deal size: Not available
On premise: Windows
SaaS: XXXXXXXX

Founded: 2000
Base: Jacksonville, FL
Funding: Private
Markets: Law enforcement; Security
Key Customers: Jacksonville Sheriff's Office, Fireman's Fund
Product: ureveal
Capabilities: Concept extraction; Thesaurus; Relationship discovery; Categorization; Bayesian, SVD, keyword and concept search; Clustering; Classification
Avg deal size: $
On Premise: Windows
SaaS: available

Founded: 2003
Base: Amherst, MA
Funding: Private
Markets: Media & publishing; Marketing & surveys
Key Customers: FT.com, Cymfony, SmartBrief, Cisco Systems
Product: Salience Engine w/ Sentiment Toolkit
Capabilities: Entity extraction; Entity relationships; Document summarization; Sentiment extraction; Tailored sentiment toolkit
Avg deal size: $125-150,000
On premise: Windows
SaaS: XXXXX

Founded: 1997
Base: Bloomington, IN
Funding: Private
Markets: Defense; Aviation; Pharmaceuticals; Insurance
Key Customers: Ernst & Young, Pfizer, DVA, FAA
Product: Polyanalyst
Capabilities: Taxonomy creation; Taxonomy-based categorization; Entity extraction; Clustering
Avg deal size: $300,000
On premise: Windows
SaaS: XXXXXX

Founded: 2001
Base: Montreal, Quebec
Funding: Public
Markets: Media & publishing
Key Customers: Le Monde; Conde Nast; Reed Business; Reader's Digest; Time, Inc.
Product: Text Mining Engine (TME)
Capabilities: Automated entity extraction; Categorizer; Concept extraction; Taxonomy management; Optional summarizer; Sentiment analysis engine
Avg deal size: $750,000
On premise: Windows, Linux
SaaS: XXXXX

Founded: 1976
Base: Cary, NC
Funding: Private
Markets: Government / military; Early warning (mfg); Banking & insurance; Customer analytics; General OEM; Market research & surveys
Key Customers: Ford, Pitney Bowes, Eli Lilly, Department of the Treasury
Product: SAS Text Miner
Capabilities: Multi-lingual; Multiple languages; POS tagging; Clustering; Entity extraction; Stemming; Concept extraction
Avg deal size: $200-300,000 (Inxight 2005)
On premise: Windows, Solaris, AIX
SaaS: XXXXX

Founded: 1968
Base: Chicago, Illinois
Funding: Public
Markets: Market research & surveys; Govt / military intelligence; Pharma & life sciences; Customer analytics; Early warning (mfg); Banking & insurance
Key Customers: Fortune 500
Product: Clementine 12
Capabilities: Multi-lingual sentiment analysis; Support Vector Machines algorithms; Bayesian Networks algorithms; Recency, frequency and monetary analysis; Survival analysis
Avg deal size: $$$
On Premise: Windows, Linux, Solaris, HP-UX, IBM AIX
SaaS: XXXXX

Founded: 2000
Base: Paris, FR
Funding: 7m, private equity
Markets: Govt / military intelligence; Industrial; Pharma & life sciences
Key Customers: Novartis, BASF, Pfizer
Product: Luxid
Capabilities: Entity extraction; Categorization; Information clustering; Concept-based searching; Keyword searching
Avg deal size: 3,000-10,000 per user per year. On-premise version is priced on a per CPU basis and typically costs 200,000-300,000
On premise: Windows, Linux
SaaS: Hosted version available

Founded: 2003
Base: McLean, VA
Funding: Private - undisclosed
Markets: Govt / military intelligence
Key Customers: Federal agencies
Product: Viziant 1.0
Capabilities: Base set of 10 taxonomies; Statistical and NLP techniques; Frame of Reference; Entity extraction and stemming; Classification; Discovery
Avg deal size:????????
On Premise: Linux
SaaS: XXXXX

Sentiment Analysis
Andiamo Systems
Biz360, a veteran of the space
BrandIntel
Buzzlogic, a recent startup
Collective Intellect, about a year old
Jodange, media-based opinion tracking for chosen topics or influencers
Monitor110, aimed at institutional investors
MotiveQuest, tweaks its linguistic model depending on the domain being analyzed
Nielsen Media Research's BuzzMetrics, the 800-pound gorilla that rolled up some of the early players
Northern Light, veteran search company, with its MI Analyst sentiment analysis product
Perception Metrics, claims to be able to do phrase-level sentiment analysis, aimed at PR and marketing professionals
RavenPack International, counts Dow Jones & Company as a partner
Sentiment Metrics, a British-based brand monitoring company
SAS, offers the service
SPSS, offers the service
SentiMetrix, still in stealth, apparently
ScoutLabs, is in beta and uses Lexalytics technology
SkyGrid, aggregates and analyzes financial news
Summize, analyzes online product reviews for sentiment
Umbria, focused on online sentiment analysis of social media, such as blogs

Questions? nick.patience@the451group.com Nick Patience Research Director, Information Management http://blogs.the451group.com/information_management/