isearch TECHNICAL SPECIFICATIONS Features Add-ons Content Connectors Cloud Connectors


1. isearch TECHNICAL SPECIFICATIONS
   Features
   Add-ons
      Content Connectors
      Cloud Connectors
      Social Media Connectors
      Sentiment Analysis
      Summary generation
      Speech-to-Text
      Keyword Spotting
      Interlinking and Content Enrichment
      Pre-loaded Sector Specific Ontologies
   Technology
      Linguistic Resources
      Semantics
      Advanced Text Analytics
      Query Expansion
      Search
      Visual Query Expansion
      Content Federation
   Technical Data
      Supported Repositories
      Supported Cloud Facilities
      Supported Social Media Connectors
      Pre-Loaded Ontologies
      Supported Languages
      Supported Media Types

1. isearch TECHNICAL SPECIFICATIONS

1.1. Features

Feature comparison table (columns: FEATURE, ONTOLOGY BASED, TAXONOMY & THESAURUS-BASED, KEYWORD BASED). Rows:

Spelling checker and "Did you mean" option
Predictive search
Instant search
Highlighting and snippets
Faceted search
Last searches produced
Most frequent and recently used terms queried
Rich media
Advanced search - Tags
Advanced search - Concepts & Instances
Synonyms
Multilingualism
Natural Language
Category
Cross-Language
Probabilistic tag cloud
Taxonomy support
Thesaurus support
Ontology support
Contextual
Conceptual
Browser
Relationship viewer
Triplets

Spelling checker and "Did you mean" option: Suggests orthographically correct sentences when users mistype (e.g., "Don Quijotte" will trigger "Did you mean Don Quijote?").
Predictive search: Suggests relevant terms to complete a user query or sentence. For instance, as the user types "Don", the system generates a drop-down menu with suggestions: Don Quijote, Don Antonio, etc.
Instant search: Builds on the category search by directly displaying the categorized results in a drop-down menu so the user can choose among them.
Highlighting and snippets: Displays results using highlights and configurable snippets.
Faceted search: Makes it possible to search through structured documents (e.g., XML documents or databases), applying the elements and properties of documents to filter through combo boxes, sliders, drop-down lists, etc.
Last searches run: Summary of the last searches run, for tuning and configuration purposes.
Most frequent and recently used terms queried:
Rich media: Ability to search for and within multimedia content.
Advanced search: Tags.
Advanced search - Concepts & Instances: Defines complex Boolean relationships, directly applying the concepts and instances in the target repository conceptual models.
Synonyms: Makes it possible to expand the search using common synonyms for the seed terms (e.g., "Don Quijote", "Alonso Quijano", "El ingenioso hidalgo Don Quijote de la Mancha").
Multilingualism: Allows locating and retrieving documents even if the annotation and search languages differ. For example, "Don Quijote" will yield the same results as "Don Quixote" or "Don Quichotte."
Natural language: Fully fledged natural language (human) queries.
Category: Complements the previous search functionality by proposing categorized results. For example, as the user types "Don Quijote", isearch will display a drop-down menu detailing DVDs, books or CDs from which the user can choose.
Cross-language: Allows users to access and retrieve assets regardless of the annotation language and the one used in the query; e.g., "Don Quijote" will trigger the same results as "Don Quixote" or "Don Quichotte".
Probabilistic tag cloud: Expands the traditional concept of the tag cloud, enabling results filtering by including or excluding particular tags from the set of results.
Taxonomy support: Support for taxonomies.
Thesaurus support: Support for thesauri.
Ontology support: Fully fledged ontological support.
Contextual: Makes it possible to locate documents semantically related to the terms of the seed query, using not only the exact words but, more importantly, their meaning.
Conceptual: Advanced search based on the concepts of the underlying information modelling formalism (i.e., taxonomies, thesauri, ontologies).
Browser: Enables users to visually navigate the underlying ontology, while visually adding concepts to build queries or refine results.
Relationship viewer: An intuitive, powerful, and visual way to filter the result set by navigating the concepts closest to the one included as part of the seed query.
Triplets: Permits creation of highly complex queries that mimic natural language by editing variables in predefined sentences that represent some of the most common queries.
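The "Did you mean" and predictive search behaviours listed above can be illustrated with a minimal sketch. It is not the isearch implementation: the vocabulary, the function names, and the similarity cutoff are assumptions made for the example, and the matching relies on Python's standard difflib module rather than on any Taiger component.

```python
import difflib

# Hypothetical vocabulary, used only for this illustration.
KNOWN_TERMS = ["Don Quijote", "Don Antonio", "Sancho Panza"]


def did_you_mean(query, known_terms=KNOWN_TERMS, cutoff=0.8):
    """Return the closest known term, or None when the query is already
    known or nothing is similar enough (cutoff is an assumed threshold)."""
    if query in known_terms:
        return None
    matches = difflib.get_close_matches(query, known_terms, n=1, cutoff=cutoff)
    return matches[0] if matches else None


def complete(prefix, known_terms=KNOWN_TERMS):
    """Predictive search: known terms that start with the typed prefix."""
    lowered = prefix.lower()
    return sorted(t for t in known_terms if t.lower().startswith(lowered))


if __name__ == "__main__":
    print(did_you_mean("Don Quijotte"))
    print(complete("Don"))
```

Calling did_you_mean with the mistyped "Don Quijotte" proposes the known spelling, while complete("Don") returns the candidates a drop-down menu could display.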

1.2. Add-ons

ADD-ONS
Content connectors
Cloud connectors
Social media connectors
Sentiment analysis
Summary generation
Speech-to-text
Keyword spotting
Interlinking & content enrichment
Pre-loaded industry specific ontologies

Content Connectors

The Taiger content connector layer sets up an easy-to-use yet powerful data access and federation mechanism. Able to seamlessly interface with relational databases, content management systems, collaboration platforms or messaging systems, it provides the necessary logic to navigate and ingest content into the Taiger enterprise search platform. The Taiger content connector layer ensures seamless access to more than 40 different repositories. Should your installation require access to content available through other sources, we are able to develop a connector upon request.

Cloud Connectors

The Taiger cloud connector layer provides a single point of access to data stored in various popular cloud-based services. Relying on Taiger's powerful semantic-enabled search and visual interface, users are able to ingest, index, and retrieve corporate or private information more precisely, saving time and avoiding the need to log onto and visit a multitude of services. Furthermore, the reliable workflow and fine-grained privacy mechanisms ensure that only the data owner and those assigned access are able to retrieve sensitive content, in accordance with the ongoing workflow policy. Should your installation require access to content available through other sources, we are able to develop a connector upon request.

Features:
Search across cloud-based services through a single access point.
Support for the most widely used document and media types.
Advanced semantic annotation and search facilities.
Powerful visual interface to refine search results and find information faster.
Web-based interface.
Support for a wide array of languages and cross-language search.

Social Media Connectors

Taiger social media connectors set up an agile mechanism to search across social media platforms relying on a common interface. Through Taiger faceted search capabilities, users are able to filter tweets by tags, sentiment or topic. Furthermore, the browsing tools help extract information based on the user who published it or the brand it relates to. Just as with other Taiger connectors, strong privacy and workflow mechanisms are available to ensure that only users with the right access level are able to recover sensitive content. Should your installation require access to content available through other sources, we are able to develop a connector upon request.

Features:
Search across social media platforms through a single access point.
Tweet indexing and filtering by tag.
Search through published tweets, mentions, and private messages.
Access to content published on your wall.
Powerful filtering capabilities.

Sentiment Analysis

Sentiment determination is another step in the process of converting unstructured content to structured content, with the aim of spotting trends and patterns within the information. The Taiger sentiment analysis add-on sets up the means to consistently and reliably measure emotion and, more importantly, a way to adequately ascertain just who that sentiment targets in the article. We do this by rating the positive or negative assertions that are associated with a document or entity.

Our technology combines the transparency of a dictionary approach (knowing exactly what words are applied) with sophisticated natural language techniques, to ensure that the right words are allocated to the right entities and themes. Currently we have over 250,000 scored sentiment-bearing phrases, part of several completely customizable dictionaries. This enables us to provide very fine-grained control, and our customers can have multiple sentiment dictionaries, each with subtle tweaks to cover different verticals.

Furthermore, Taiger is able to identify objective vs. subjective sentences and use them to change sentiment weighting. For example, "I saw (movie name) last night, and it was great" should have more weight than "I heard (movie name) was great".

Summary generation

Summarization shortens a document in order to present its meaning (in the best possible way) using a limited number of words. Taiger semantic analysis software accomplishes this at the sentence level; this means we are able to pick out the most important and representative sentences from the content and use them to form a meaningful summary.

Speech-to-Text

Speech-to-text technology is used to extract information from the audio track of video and audio files by processing and transcribing it to text. The result is a highly accurate time-stamped XML, which feeds directly into the annotation process for linguistic and semantic processing.
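To make the idea of a time-stamped XML transcript more concrete, the following sketch parses a small transcript and looks up the moments at which a word is spoken. The XML layout (a transcript element containing word elements with start and end attributes in seconds) is an assumption made purely for illustration; the specification does not describe the actual schema produced by the Taiger engine.

```python
import xml.etree.ElementTree as ET

# Hypothetical transcript layout; the real output schema is not documented here.
SAMPLE_TRANSCRIPT = """\
<transcript>
  <word start="0.00" end="0.40">in</word>
  <word start="0.40" end="0.55">a</word>
  <word start="0.55" end="1.10">village</word>
  <word start="1.10" end="1.30">of</word>
  <word start="1.30" end="1.90">la</word>
  <word start="1.90" end="2.60">Mancha</word>
</transcript>
"""


def find_word(xml_text, target):
    """Return the start times (in seconds) at which `target` is spoken."""
    root = ET.fromstring(xml_text)
    wanted = target.lower()
    return [
        float(word.get("start"))
        for word in root.iter("word")
        if (word.text or "").strip().lower() == wanted
    ]


if __name__ == "__main__":
    print(find_word(SAMPLE_TRANSCRIPT, "mancha"))
```

Because every word carries its own time stamp, a search hit can be mapped back to a position in the audio or video track.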

Taiger speech-to-text technology stands out due to its accuracy and unparalleled performance. It supports the most widespread European and Asian languages, including several varieties of English and Spanish, and runs as a batch process able to support pre-recorded and live feeds. Transcription completes in less than real time, taking as little as 1/8 of the duration of the audio, and features accuracy of up to 90%, which is unprecedented on the market. It can process all major file types, and the overall accuracy can be easily improved by feeding the system with a text document containing new words or expressions.

Features:
Automated time-stamped video and audio transcription.
Live input feed processing and speaker recognition.
Processes 8 hours of audio in 1 hour.
Accuracy improvement by feeding the system a text document with new words or expressions.

Keyword Spotting

Able to process audio signals at speeds of up to 15x real time, this adaptable state-of-the-art keyword spotting software appliance makes it possible to locate single words and groups of words. The system can process pre-recorded signals, as well as live input feeds such as news broadcasts or interviews, depositions, or symposia.

Taiger keyword spotting can be configured to work with the most widely spoken European and international languages, thereby serving a wide range of needs and configurations. A convenient and easy-to-use GUI administration tool, fully integrated in Taiger products, allows the preferred configuration to be set up and the target word list to be defined easily. The result is a time-stamped XML indicating the incidence of each word together with a confidence index to improve reliability and ensure quality control. It is used in a wide variety of scenarios ranging from search to alerts and notifications.

In combination with the Speech-to-Text solution, the Taiger keyword spotting solution permits improvement of the transcription outcome by stressing the presence of controlled vocabularies.

Features:
Automated keyword location.
15x real time processing.
Configurable keyword source.
Language-independent.
Telephone, microphone, pre-recorded or live input feeds.
Software appliance.
Time-stamped XML.
Confidence index.
Alerts and notifications.

Interlinking and Content Enrichment

While Taiger already provides mechanisms to search content available within the boundaries of the organization, current needs also demand access to data sources outside the corporate firewall. Thus, Taiger provides advanced technology able to enrich corporate content by interlinking it with other available external resources, such as an external website or Wikipedia. The interlinking and content enrichment mechanism is based on a two-step process:

Interlinking: first, external content is aligned with the target information structure in terms of its entities and schemas. For example, assuming that there is a dataset with a list of different anti-aging creams marketed by a cosmetics company, this process will match their ingredients with those ingredients available in other external datasets. However, the mechanism is much more complex and requires the analysis of the graph similarities between the information found in external datasets and the information already existing within the organization.

Content enrichment: second, in addition to the previous step, there is a content enrichment process responsible for providing mechanisms to navigate through the new dataset and extract information from it, showing information and relationships that were unknown to the organization. For example, it might associate an anti-aging cream marketed by a cosmetics company, provided as a query result, with information about the ingredients it contains; e.g., where an ingredient was used in the past or where it is used for other purposes.

Pre-loaded Sector Specific Ontologies

Taiger products have been designed to support ontologies, taxonomies, and thesauri. Yet they also contain all the features common to traditional keyword-based search engines, such as spelling suggestion and correction, relevancy scores, auto-suggestion, etc., and are able to work without the semantic enhancements provided by ontologies.

While thesauri and taxonomies provide little expressivity, ontologies set up a much richer mechanism to model the relevant concepts in a business application or domain. In a nutshell, they make it possible to characterize any type of relationship among formalized concept categories.

We know that organizations across verticals invest considerable resources in developing or buying formalized categories of concepts. Therefore, Taiger products have been designed in such a way that they are able to swiftly integrate and work with pre-existing models. The Taiger annotation engine will take care of reindexing content based on the existing model as needed, while the search engine will employ it to retrieve results.

Sometimes existing models need to be updated or fine-tuned to meet a certain purpose. Taiger ontology and taxonomy modeling methodology has been honed over more than 12 years, during the deployment of our products in a wide range of businesses and industries. Our methodology aims at fulfilling targets, covering all the relevant stages of the ontology life cycle:
Ontology creation
Ontology population
Ontology validation
Ontology deployment
Ontology evolution and maintenance

Additionally, Taiger offers a set of pre-loaded sector and application specific ontologies that can be readily used. This allows a significant reduction of the set-up and configuration time, while guaranteeing best practices.

Technology

Linguistic Resources

The fully fledged linguistic resources available in all Taiger products provide complete language processing capabilities in a wide array of languages and linguistic varieties. Among other features, Taiger supports tokenization, removal of stop words, language identification, and named entity recognition. All these help improve search result precision and relevance, while keeping recall limited to avoid information overload.

Tokenization: this process divides a stream of text (e.g., a paragraph) into smaller pieces called tokens. These tokens can be single words, symbols, punctuation marks, etc. The output of this process is the centerpiece for the parsing process.

Stop words: in most languages there are words that appear in profusion in every text (articles, prepositions, conjunctions, etc.) but convey little meaning. As search engines are based on the relevance of query words and their frequency in indexed documents, these words are removed in the indexing and querying processes to avoid recovering documents that are not relevant to the query terms.

Stemming: the stemming process is responsible for reducing tokens to their root or stem form. Specifically, during the indexing process words such as reduce, reducing, reduces and reduced are identified and normalized to the stem reduce. Later, in the query expansion phase, the same process is applied to the query terms, identifying any document that matches the stem form.

Lemmatization: this is a broader process through which the system is able to identify the lemma and its part of speech, i.e., whether the lemma represents a noun, an adverb, a verb, etc. For example, "meeting" could be a verb or a noun, depending on the context. Lemmatization therefore helps the contextualization of information in a given document.

Phrasing: while stemming and lemmatization are concerned with detecting single words, phrasing deals with identifying common expressions formed by more than one word, e.g., "free of charge" or "natural gas". This identification allows better disambiguation in the query and matching processes.

Synonym expansion: in order to increase recall (i.e., the number of relevant documents retrieved for a given query), the terms in a query are expanded with their corresponding synonyms so that all relevant documents are taken into consideration. Synonyms can be taken from a gazetteer or obtained using a SKOS plug-in.

Word decomposition: some languages, such as German, Japanese, Russian or Dutch, form words by concatenating simpler ones. For example, Wolkenkratzer ("skyscraper" in English) is formed by the words Wolken, meaning clouds, and Kratzer, meaning scraper. The word decomposition feature provides the means to break complex words down into their basic components, in order to improve query processing and document matching.

Language identification: in a multilingual setting, the ability to automatically identify the content language is key. Taiger is able to recognize the language and encoding of a text in order to automatically apply the corresponding language settings, resources, and techniques.

Named entity recognition (NER): Taiger offers the means to identify well-known entities, such as people or location names, dates or figures, to name but a few. It allows the system to automatically add useful information to the indexed document base and to increase disambiguation strength. The system can be fed generic dictionaries or specific grammars, or be extended with private dictionaries, thus precisely meeting customer needs.

Spelling checker: through powerful spelling checker technology, Taiger is able to detect spelling mistakes in the user's query and suggest a corrected version, while returning useful answers based on the most correct version of the query. The system uses dictionaries and previously indexed documents to provide suggestions.
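The first linguistic steps described above (tokenization, stop-word removal, and stemming) can be sketched as a small pipeline. This is a deliberately naive illustration, not Taiger's linguistic engine: the stop-word list and the suffix-stripping rules are assumptions chosen for the example, and the stemmer collapses the reduce/reducing/reduces/reduced family to the common key "reduc" where a production stemmer or lemmatizer would choose its own normal form.

```python
import re

# Hypothetical, tiny stop-word list for the example.
STOP_WORDS = {"the", "a", "an", "of", "and", "or", "in", "to", "is"}

# Naive suffix rules, tried in order; not a real stemming algorithm.
SUFFIXES = ("ing", "ed", "es", "e", "s")


def tokenize(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"\w+", text.lower())


def stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def index_terms(text):
    """Tokenize, drop stop words, and stem what remains."""
    return [stem(t) for t in tokenize(text) if t not in STOP_WORDS]


if __name__ == "__main__":
    print(index_terms("Reducing the cost of natural gas"))
```

Applying the same index_terms function to documents at indexing time and to queries at search time is what lets a query for "reduced" match a document containing "reducing".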

Semantics

Semantic technology and ontologies are at the heart of our products, supporting contextual annotation, searching, and navigation. Ontologies (i.e., knowledge maps) formalize categories, concepts, and relationships for a specific domain. The links connecting concepts have precise meaning, which models the structure and organizational know-how in a common and agreed way, ensuring consistent information presentation, understanding, and sharing.

Ontologies are the evolution of traditional database schemas, in the sense that they provide a structure for application data. Yet the levels of expressiveness and flexibility are much greater than those of databases, making ontology-based applications significantly more powerful.

Compared to thesauri and taxonomies, ontologies are the most advanced information modeling mechanism currently available. Taxonomies only allow representation of controlled vocabulary: word listings grouped according to their similarity in meaning, typically containing synonyms and sometimes antonyms. Thesauri expand the concept of taxonomies by providing a tree structure that unerringly organizes concepts through supertype-subtype relationships: for example, car is a subtype of vehicle. Ontologies offer all the previous features, plus a graph-like structure to classify concepts and model the relationships among them, thus providing a superior, top-notch information modeling mechanism that only a handful of vendors in the market are able to fully exploit and offer with their products.

Semantic technology and ontologies also help contextualize traditional tags, enabling better understanding, description, and matching of an asset to a user's query. This is the case where tags do not provide the context of the information they are intended to convey.

Advanced Text Analytics

Ontologies and semantic technology are at the core of the advanced linguistic capabilities present in Taiger products. They help improve the performance of traditional phrasing, synonym expansion, and entity extraction by narrowing the context around a term or phrase to increase precision.

Ontology normalization unequivocally matches entities to elements in an ontology or information graph. This novel functionality represents the first step towards semantically exploited content. As a result, we are able to make more of the information available, going from just identifying linguistic relationships to also pinpointing semantic ones.

All of these groundbreaking features, including the revelation of hidden assets and the exploitation of the semantic relationships among them, enable information access in ways that are unprecedented and beyond the reach of other market players. These advances clearly differentiate Taiger technology from the rest.

Ontology normalization: this process makes it possible to identify not only generic entities (people, locations, companies, and dates) but also any other relevant entity in the target domain. Moreover, the system can associate it unequivocally with an entity in the corresponding information map. Thus, entities such as San Francisco, Frisco, or SF are normalized and matched against the entity /#ontology/sanfrancisco. Consequently, any mention of any of those entities will be easily recognized as the city of San Francisco. This translates into improved textual and semantic analysis.

Ontology-based phrasing: by using ontologies, Taiger can improve its text analyses. In particular, applications can be customized to detect expressions that are made up of more than one word and are explicitly relevant to a specific domain. Thus, an application in the energy sector will understand "light crude" as a domain expression, something completely different from, and more relevant than, what an application analyzing the same words in the food domain would find.

Ontology-based synonym expansion: just as with phrasing, the use of ontologies allows the definition of synonyms for entities that are part of specific domain knowledge. For example, a customer in the energy sector can define crude as a synonym for petroleum, although outside this sector that definition may not make any sense.

Ontology-based entity extraction: using ontologies to improve and customize entity extraction provides increased precision in the entity detection and normalization process. This novel feature extends the generic functionality of Named Entity Recognition (NER) through the use of domain ontologies, providing a set of instances belonging to each concept. In this way, the system is able to detect not only generic entities such as people, companies, and locations, but also elements more specific to the client's interest, such as products in a particular sector, movies or securities, to name but a few.

Document summary: when handling large volumes of information, the ability to rapidly browse through content and discard whatever is not relevant can result in a competitive advantage. Taiger summary generation technology produces accurate and reliable summaries in a fast and convenient way that helps save time and money.

Sentiment analysis: a text usually conveys a particular attitude regarding a topic, or a contextual polarity. The ability to extract and measure this sentiment is called sentiment analysis. Using a continuous scale ranging from +1 (most positive) to -1 (most negative), this advanced functionality provides the means to measure the sentiment in a particular document in a standardized manner.

Query Expansion

Having linguistic and semantic resources at the core of our technology and solutions enables Taiger to implement and improve a mechanism known as query expansion. Users often do not formulate search queries using the best terms:
Not precise enough: the seed query returns too many results; low precision.
Not abstract enough: the seed query does not return any results at all; low recall.

To overcome these limitations, query expansion techniques are applied to reformulate a seed query and improve information retrieval by increasing the number and quality of search results returned. Our products feature traditional and semantic query expansion.

Traditional query expansion: the linguistic resources are used, for instance, to expand the seed query with synonyms, homonyms (words that share the same spelling but have different meanings) and other morphological forms. At the same time, stop words are removed and spelling mistakes are corrected. As a result, overall recall is improved; e.g., the system also retrieves documents where "automobile" appears when given a query with the term "car".

Semantic query expansion: the advanced text analytics (including ontology-based normalization, phrasing, synonym expansion, and extraction) are put to work to enrich users' queries with related entities and relationships originating from the target ontology. For example, a query with the term "accommodation" would expand into a query with the different types of accommodation available in the system (e.g., hotel, camping, etc.).

To compensate for the behavioral variability of queries with respect to the precision vs. recall trade-off, both traditional and semantic query expansions feature heuristics to fine-tune the critical parameters affecting query expansion performance. They also contribute to deciding whether the expansion is useful for a particular seed query.
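The two kinds of query expansion described above can be sketched in a few lines. The synonym table and the subtype table below are hypothetical stand-ins for a gazetteer and a domain ontology; they reuse the car/automobile and accommodation/hotel/camping examples from the text and do not reflect Taiger's actual data structures or heuristics.

```python
# Hypothetical linguistic resource (traditional expansion).
SYNONYMS = {"car": ["automobile"]}

# Hypothetical ontology fragment: concept -> subtypes (semantic expansion).
SUBTYPES = {"accommodation": ["hotel", "camping"]}


def expand_query(terms, synonyms=SYNONYMS, subtypes=SUBTYPES):
    """Expand each seed term with its synonyms and ontology subtypes,
    preserving order and avoiding duplicates."""
    expanded = []
    for term in terms:
        candidates = [term, *synonyms.get(term, []), *subtypes.get(term, [])]
        for candidate in candidates:
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


if __name__ == "__main__":
    print(expand_query(["car"]))
    print(expand_query(["accommodation"]))
```

A real system would also need the heuristics mentioned above to decide when an expansion helps, since every added term trades precision for recall.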

Search+

In classical keyword-based search engines, simple pattern-matching techniques are used to match the user's seed query with the assets in the repository. Nonetheless, when dealing with unstructured information, results are not always what users are seeking. More advanced search engines are able to take full advantage of the context of the user's seed query to increase the system's precision and yield significantly better search results.

Taiger search and browsing technology fully supports traditional keyword search and all its accompanying features. We complement it with groundbreaking contextual, multimedia, and cross-language search technology, in addition to faceted search. The result is a superior search and user experience.

Contextual: by using contextual search, Taiger can locate documents semantically related to the terms of the seed query. It uses not only the exact words but, more importantly, their meaning. For example, a query for "accommodation in UK" will return documents referencing different accommodation types (e.g., hotels, B&Bs, guesthouses, apartments, etc.) in UK locations (London, Manchester, Liverpool, etc.). Ontologies enable this advanced matching process by making it easier to extract information from the seed query and facilitating the corresponding document retrieval.

Conceptual: through the use of ontologies, the conceptual search is able to understand and translate the seed query into a set of relevant matching concepts and relationships. For instance, the seed query "hotels in London with swimming pools for less than 90" would be translated into: hotels located in London, hotels with a swimming pool, hotels with rates less than 90. The corresponding asset set would include all those documents that precisely meet all these relationships.

Multimedia: searching in audio and video files just as we currently do with text documents, taking advantage of time-stamped metadata. Through contextualization technology, Taiger can unerringly search across and within rich media content, exploiting information context and seamlessly processing structured and unstructured metadata. The result: a superior technology that makes it possible to pinpoint a search result at the exact second a character appears or a word is pronounced.

Cross-language: this feature enables asset access and retrieval regardless of the annotation language or the one used in the query.

Faceted search: this permits a search through structured documents (e.g., XML documents, databases) using the same structures that shape the documents. Users can filter the set of results by combo boxes, text fields, calendar objects, sliders for number ranges, drop-down lists, etc.

Visual Query Expansion

The Taiger GUI is one of the most differentiating and powerful features of our solutions and our company. The various configurable controls enable the iterative expansion of queries in order to visually and progressively refine the result set in unprecedented ways. The advanced, user-friendly tools belie the technological complexity, offering an exceptional user experience while providing swift, fresh, precise, and engaging target information, contextual exploration, and discovery. Users can now rapidly gain access to the information they need, saving time and other valuable resources while simultaneously increasing productivity.

We use six different tools, namely: probabilistic tag clouds, relationship viewers, browsing, advanced search, tags and concepts, and faceted search. All the controls work in a cooperative way, with the single goal of refining the result set in the desired way to facilitate information location.
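As an illustration of how one of these controls can refine a result set, the sketch below computes a tag cloud from the results of a single query and then filters those results by including or excluding tags. The result records, the field names, and the use of relative tag frequency as the tag weight are all assumptions made for the example; the specification does not state how isearch weights its tags.

```python
from collections import Counter

# Hypothetical result set for one user query.
RESULTS = [
    {"id": 1, "tags": ["Kendrick", "concert"]},
    {"id": 2, "tags": ["concert"]},
    {"id": 3, "tags": ["Kendrick", "interview"]},
]


def tag_cloud(results):
    """Weight each tag by its relative frequency within the result set only."""
    counts = Counter(tag for doc in results for tag in doc["tags"])
    total = sum(counts.values())
    return {tag: n / total for tag, n in counts.items()} if total else {}


def refine(results, include=(), exclude=()):
    """Keep results carrying every included tag and none of the excluded ones."""
    return [
        doc
        for doc in results
        if all(tag in doc["tags"] for tag in include)
        and not any(tag in doc["tags"] for tag in exclude)
    ]


if __name__ == "__main__":
    print(tag_cloud(RESULTS))
    print([doc["id"] for doc in refine(RESULTS, exclude=["Kendrick"])])
```

Computing the cloud from the query results rather than from the whole repository keeps the offered tags specific to what the user is currently looking at.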

Probabilistic tag cloud: this control expands the traditional concept of the tag cloud in two ways. First, tags are calculated using only the results of each individual user query. This provides significantly more accurate tags than calculating them based on the contents of the whole repository. Second, query-specific tags can be used to filter the result set, by including or excluding particular tags from the set of results. For example, if "Kendrick" appears in the probabilistic tag cloud, the user can indicate whether all the assets containing this tag in their description or caption should be excluded from or maintained in the target result set.

Relationship viewer: the relationship viewer provides a very intuitive, powerful, and visual way to filter the result set. Taking the main concept introduced in the user query as the starting point, the control presents all related concepts that are one degree away according to the current information map. The user can then navigate back and forth, exploring the different relationships among concepts and filtering the result set at will until the desired assets are found.

Browser: this control offers an expanded graphical view of the whole underlying conceptual model. It enables the user to navigate in search of a concept or set of concepts and to directly refine the result set by indicating whether concepts should be added to or removed from it, just as in the case of the tag cloud. It is also a useful tool for learning about the underlying ontology or discovering other relevant assets the user was unaware of. The browser supports multiple ontologies.

Ontology-based faceted search: expanding the concept of traditional faceted search through the use of ontologies, this control also enables the creation of facets for unstructured content, such as audio, video, and text assets. Users are now able to perform advanced faceted search through structured and unstructured content alike, easily reducing a set of several thousand results to only a few in just a couple of clicks.
Advanced search: in addition to several common advanced search features, Taiger offers the ability to define complex Boolean relationships, directly applying the concepts and instances in the target repositories and ontologies. As with all the other controls, this one can be combined with the other visual query expansion tools, thereby providing users with precise filtering capabilities.

Tags and concepts: an asset's properties include tags and concepts that originate directly from the target repositories and ontologies, and which are used to annotate the asset. Taiger enables result set filtering through application of the same techniques as in the case of probabilistic tag clouds. The result set is therefore modified accordingly, enhancing the overall user experience and search precision.

Content Federation

The Taiger content federation layer provides a robust and extensible mechanism for accessing content residing in different repositories and platforms. The platform currently supports the most popular content, cloud, and social media connectors.

Content connectors are able to seamlessly interface with more than 40 relational databases, content management systems, collaboration platforms or messaging systems; they provide the necessary logic to navigate and ingest content into Taiger middleware.

Cloud connectors provide a single point of access to data residing in various popular cloud-based services. They enable information ingestion and indexing, as well as corporate and individual information retrieval, saving time and avoiding the need to visit and log onto a multitude of services.

Social media connectors set up an agile mechanism to gain access to information and search across social media platforms, relying on a common interface and single point of access.

In all cases, strong privacy and workflow mechanisms are available to ensure that only users with the right access level are able to retrieve sensitive content, according to the ongoing workflow policy. When it comes to administration, the Web-based administration interface is able to support large IT admin teams scattered in different locations.

Technical Data

Supported Repositories

RDBMS: IBM DB2, JDBC, Microsoft SQL Server, MySQL, ODBC, Oracle RDBMS, Sybase
Content Management Systems: Alfresco, EMC Document Content Server, IBM FileNET P8, IBM FileNET CS and IS, IBM Content Manager, IBM Content Manager On Demand, IBM Websphere Portal PDM, Interwoven Worksite NT, Open Text LiveLink, OpenText eDOCS (Hummingbird DM), Oracle Stellent UCM, SAP KM, Xerox Docushare, IBM Enterprise Records
RM & Archiving: IBM Enterprise Records, HP Trim Context (TowerSoft), Symantec Enterprise Vault
Messaging Systems: Lotus Notes, Symantec Enterprise Vault
Cloud: Iron Mountain, Box, Huddle, Dropbox, Google Drive, Google Cloud Storage, Wuala, FilesAnywhere, Syncplicity, Microsoft Sky Drive
Web Content Management: IBM Web Content Management, Interwoven Teamsite, Vignette
Product Lifecycle Management: Enovia MatrixOne
Collaboration: EMC CenterStage, EMC Documentum eRoom, Lotus Quickplace, Lotus QuickR, IBM Connections, MS SharePoint 2003, 2007, 2010
Generic Systems: CMIS, JCR
File Servers: FTP, WebDAV Server
Other: Twitter, GoogleDocs, SalesForce

Supported Cloud Facilities

Dropbox, Box.net, Evernote, Google Contacts, Google Docs, Google Calendar, Salesforce, SkyDrive, Campfire, MindMeister, RSS

Supported Social Media Connectors

Twitter, Facebook, YouTube, Vimeo

Pre-Loaded Ontologies

Energy
Knowledge management
Media and Entertainment
Digital copyright management
Football
Financial services
Banking

Supported Languages

Language support table (columns: SPEECH-TO-TEXT, KEYWORD SPOTTING). Languages listed:

English
Spanish (Spain, Argentina, Mexico, Colombia, and Caribbean)
French
Arabic
Hebrew
Swedish
Mandarin Chinese
Japanese
Farsi
Russian
Danish
Korean
German
Dutch
Portuguese (Portugal & Brazil)
Turkish
Greek
Italian
Catalan
Basque
Valencian
Galician

Supported Media Types

Media type support table (columns: SPEECH-TO-TEXT, KEYWORD SPOTTING). Media types listed:

AVI, FLV, MP3, WMA, WMV, WAV, AU, MPEG, MPEG2, VOX, OGG