Best Practices for Architecting Taxonomy and Metadata in an Open Source Environment
Zach Wahl, President and Chief Executive Officer, Enterprise Knowledge
zwahl@enterprise-knowledge.com, Twitter @EKConsulting
Don Miller, Vice President of Sales, Concept Searching
donm@conceptsearching.com, Twitter @conceptsearch
Expert Speakers
Zach Wahl, President and Chief Executive Officer at Enterprise Knowledge, has over 15 years of experience leading programs in knowledge and information management, working with more than 200 public and private organizations to successfully design and implement information management systems. He has developed his own taxonomy design methodology, has authored courses on knowledge management, and is a frequent speaker and trainer.
Don Miller, Vice President of Sales at Concept Searching, has over 20 years of experience in knowledge management. He is a frequent speaker on records management and on information architecture challenges and solutions, and has been a guest speaker at Taxonomy Boot Camp and at numerous SharePoint events about information organization and records management.
Agenda
Enterprise Knowledge
  Introduction to Business Taxonomy for Open Source
  Open Source Challenges and Considerations
  Design Best Practices
  Taxonomy in Action
Concept Searching
  Unique Approach
  Considerations
  Use Case
  Demonstration
Next Steps
Company founded in 2002
Product launched in 2003
Focus on management of structured and unstructured information
Technology Platform
  Delivered as a web service
  Automatic concept identification, content tagging, auto-classification, taxonomy management
  Only statistical vendor that can extract conceptual metadata
2009, 2010, 2011, 2012, 2013 "100 Companies that Matter in KM" (KMWorld) and Trend Setting product of 2009, 2010, 2011, 2012, 2013
Authority to Operate enterprise wide US Air Force and enterprise wide NETCON US Army
Locations: US, UK, and South Africa
Client base: Fortune 500/1000 organizations
Managed Partner under Microsoft global ISV Program: go-to partner for Microsoft for auto-classification and taxonomy management
The Global Leader in Managed Metadata Solutions
Smart Content Framework for Information Governance, comprising Six Building Blocks for success
Product Suite: conceptsearch, concepttaxonomymanager, conceptclassifier, conceptclassifier for SharePoint, concepttaxonomyworkflow, conceptcontenttypeupdater for SharePoint
Enterprise Knowledge
Dedicated to Making Your Information Work for You
Principals bring over 15 years of taxonomy design consulting with support for over 200 organizations globally.
www.enterprise-knowledge.com
Twitter: @EKConsulting
Blog: http://www.enterprise-knowledge.com/category/blog/
Core services include:
  Knowledge Management and Taxonomy
  Enterprise Search
  Application Development
  Agile Consulting and Project Management
Taxonomy Definitions
tax·on·o·my (tăk-sŏn′ə-mē) n., pl. tax·on·o·mies
1. The classification of organisms in an ordered system that indicates natural relationships.
2. The science, laws, or principles of classification; systematics.
3. Division into ordered groups or categories: "Scholars have been laboring to develop a taxonomy of young killers" (Aric Press).
Zach's Definition
Controlled vocabularies used to describe or characterize explicit concepts of information, for purposes of capture, management, and presentation.
Taxonomy and Metadata
Provide structure to unstructured information
Join or relate multiple disparate sources of information
Provide multiple avenues to find and discover information
Enable findability
Taxonomy and Metadata
Metadata Card
  Title: Free Text Entry
  Author
  Doc Type: Brochures & Manuals, Memos, News, Policies & Procedures, Presentations, Reports
  Topic: Employee Services, Compensation, Retirement, Insurance, Education & Training, Manufacturing, Safety, Quality
  Department
Taxonomy and Metadata
Content ~ Information ~ Data ~ Files
Metadata Fields
Metadata Values/Tags
Taxonomies (Flat or Hierarchical) ~ Controlled Vocabularies
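The layering above (content, metadata fields, metadata values, controlled vocabularies) can be sketched in a few lines of Python. This is a minimal illustration, not any particular tool's data model: the names `MetadataCard`, `CONTROLLED_VOCABULARIES`, and `invalid_tags` are hypothetical, and assigning the slide's values to the Doc Type and Topic fields as flat lists is an assumption.

```python
from dataclasses import dataclass, field

# Controlled vocabularies, one per metadata field. The values are taken from
# the metadata card slide; which values belong to which field, and treating
# Topic as a flat list, are assumptions made for illustration.
CONTROLLED_VOCABULARIES = {
    "doc_type": {
        "Brochures & Manuals", "Memos", "News",
        "Policies & Procedures", "Presentations", "Reports",
    },
    "topic": {
        "Employee Services", "Compensation", "Retirement", "Insurance",
        "Education & Training", "Manufacturing", "Safety", "Quality",
    },
}


@dataclass
class MetadataCard:
    title: str                                # free text entry
    author: str = ""                          # free text in this sketch
    tags: dict = field(default_factory=dict)  # field name -> list of values


def invalid_tags(card):
    """Return (field, value) pairs that are not in the controlled vocabulary."""
    problems = []
    for field_name, values in card.tags.items():
        allowed = CONTROLLED_VOCABULARIES.get(field_name, set())
        for value in values:
            if value not in allowed:
                problems.append((field_name, value))
    return problems


card = MetadataCard(
    title="Retirement Plan Overview",
    tags={"doc_type": ["Presentations"], "topic": ["Retirement", "Pensions"]},
)
print(invalid_tags(card))  # [('topic', 'Pensions')]
```

The title stays free text while the tagged fields are constrained, which is the distinction the metadata card draws between free text entry and taxonomy-driven fields.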
Traditional v. Business Taxonomies
                      Traditional Taxonomy                          Business Taxonomy
Purpose               Categorization                                Findability
Designed By           Scientists/Librarians                         The Business
Managed By            Scientists/Librarians                         The Business
Used By               Scientists/Librarians                         Everyone
Complexity            Deep, Wide, Detailed                          Flat, Simple, Deconstructed
Key Characteristics   Mutually Exclusive, Collectively Exhaustive   Usable, Intuitive, Natural
The Business Taxonomy
Usable
  Easy to adopt and utilize for any skill level
  Relatively flat (2-3 levels)
  Easy to navigate
Intuitive
  Does not require training and reflects the way the user thinks
Natural
  Uses the organization, vocabulary, and logic of the user
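The "relatively flat (2-3 levels)" guideline is easy to check mechanically. The sketch below assumes a taxonomy stored as nested dictionaries; the function names and the sample hierarchy (including which topics are grouped under which parent) are hypothetical and only illustrate the check.

```python
def depth(taxonomy):
    """Number of levels in a nested-dict taxonomy (an empty dict has depth 0)."""
    if not taxonomy:
        return 0
    return 1 + max(depth(children) for children in taxonomy.values())


def max_breadth(taxonomy):
    """Largest number of sibling terms found under any single node."""
    if not taxonomy:
        return 0
    return max(len(taxonomy), *(max_breadth(c) for c in taxonomy.values()))


def is_relatively_flat(taxonomy, max_levels=3):
    """True when the taxonomy stays within the suggested number of levels."""
    return depth(taxonomy) <= max_levels


# Hypothetical grouping of topic terms, for illustration only.
topics = {
    "Employee Services": {"Compensation": {}, "Retirement": {}, "Insurance": {}},
    "Education & Training": {},
    "Manufacturing": {"Safety": {}, "Quality": {}},
}

print(depth(topics))               # 2
print(max_breadth(topics))         # 3
print(is_relatively_flat(topics))  # True
```

A check like this could run whenever the taxonomy changes, so that depth and breadth limits agreed during design are not eroded over time.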
The Business Taxonomy
Tend to be less rigid and constrained
Influenced by traditional usability design
Driven by the content and needs you have today
Leverages multiple categorization approaches (via multiple metadata fields and multiple taxonomies)
Accepts imperfect categorization
Open Source Challenges and Considerations
Open Source is free and easy
  But taxonomy isn't
There are multiple ways to use taxonomy
  Menus, Search, Tag Clouds, Page Tags
Taxonomy design is not enough; you need to plan for taxonomy implementation and exposure
Open Source tools like Drupal favor flat taxonomies
Faceting is easy to enable but requires diligent tagging and oversight
Taxonomy Design for Open Source Best Practices
Define taxonomy purpose, audience, and use cases upfront.
Design before you build.
Follow usability design best practices (limit depth and breadth, use plain language, etc.).
Flat lists work best in Open Source content management tools.
Leverage a primary category/topic taxonomy with supporting metadata fields. For instance, in Drupal, use multiple Lists with Views to enable faceting.
Design for your end users and publishers.
Employ analytics and support iterative design.
Plan for the long term: ensure governance plans are in place before content migration and rollout.
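The idea of a primary topic list supported by additional flat lists can be illustrated with a small faceting sketch. This is plain Python, not Drupal or Views code; the content items, field names, and functions are all hypothetical.

```python
from collections import Counter

# Hypothetical content items, each tagged from two flat lists.
ITEMS = [
    {"title": "Plant Safety Manual", "doc_type": "Brochures & Manuals", "topic": "Safety"},
    {"title": "Retirement Plan Update", "doc_type": "News", "topic": "Retirement"},
    {"title": "Safety Policy", "doc_type": "Policies & Procedures", "topic": "Safety"},
    {"title": "Quality Report", "doc_type": "Reports", "topic": "Quality"},
]


def filter_items(items, **selected):
    """Keep items whose tags match every selected facet value."""
    return [i for i in items if all(i.get(f) == v for f, v in selected.items())]


def facet_counts(items, facet):
    """Count how many items carry each value of one facet."""
    return Counter(i[facet] for i in items if facet in i)


safety = filter_items(ITEMS, topic="Safety")
print([i["title"] for i in safety])  # ['Plant Safety Manual', 'Safety Policy']
print(dict(facet_counts(safety, "doc_type")))
```

Note that an item with a missing tag simply never appears under that facet, which is one reason faceting depends on diligent tagging and oversight.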
Taxonomy in Action (Drupal)
Creating a Taxonomy
Associating a Taxonomy to a Content Type
Filtering Using Taxonomy
Unique Approach
Concept Searching has a unique approach to ensure success
Concept Searching's unique statistical concept identification underpins all technologies
Multi-word suggestion is explicitly more valuable than single-term suggestion algorithms
Concept Searching provides Automatic Concept Term Extraction
Single keywords are ambiguous on their own:
  Triple: Baseball, Three
  Heart: Organ, Center
  Bypass: Highway, Avoid
conceptclassifier will generate conceptual metadata by extracting multi-word terms that identify "triple heart bypass" as a concept, as opposed to single keywords
Metadata can be used by any search engine index or any application/process that uses metadata.
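The difference between single keywords and multi-word concepts can be shown with a toy example. The sketch below is not Concept Searching's statistical method; it swaps in a simple dictionary lookup with greedy longest match, and the concept list and function names are hypothetical.

```python
import re

# Hypothetical list of known multi-word concepts. A statistical engine would
# derive such terms from the content rather than from a hand-written list.
KNOWN_CONCEPTS = {("triple", "heart", "bypass"), ("heart", "bypass")}
MAX_LEN = max(len(c) for c in KNOWN_CONCEPTS)


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def extract_concepts(text):
    """Greedy longest match of known multi-word concepts, left to right."""
    tokens = tokenize(text)
    found, i = [], 0
    while i < len(tokens):
        # Try the longest candidate first, down to two-word terms.
        for n in range(min(MAX_LEN, len(tokens) - i), 1, -1):
            if tuple(tokens[i:i + n]) in KNOWN_CONCEPTS:
                found.append(" ".join(tokens[i:i + n]))
                i += n
                break
        else:
            i += 1  # no concept starts at this token
    return found


text = "The patient recovered well after a triple heart bypass."
print(tokenize(text)[-3:])     # ['triple', 'heart', 'bypass']
print(extract_concepts(text))  # ['triple heart bypass']
```

Tagging the document with the three separate keywords would also match content about baseball triples or highway bypasses, while the single compound term does not.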
Smart Content Framework
Sum of parts is greater than whole
Metadata-driven application and enforcement of policies: conceptclassifier has been deployed since 2010 to automatically generate metadata and use that metadata to apply and enforce policies. Many clients are using the platform to support their information governance strategy.
Proven, mature functionality out of the box: the platform has been deployed in numerous sites and applications across the enterprise, including MOSS and SharePoint 2010 and 2013, Solr, Stellent, Documentum, SQL, Oracle, file shares, and Exchange via SharePoint.
Open Source Considerations
"Given enough eyeballs, all bugs are shallow." (Linus's Law, coined by Eric S. Raymond and named for Linus Torvalds, creator of Linux)
Security
Quality
Customizability
Freedom (avoid vendor lock-in)
Interoperability
Auditability
Support
Cost
Try Before You Buy
Any difference if you are purchasing proprietary software? Not much!
Open Source or Proprietary: OK By Us
Concept Searching Technology Platform
  conceptsearch, conceptclassifier, concepttaxonomymanager, conceptsql, concepttaxonomyworkflow
conceptclassifier Technology Platform
  Compound Term Processing Engine, licensed for concept extraction only
  conceptclassifier, concepttaxonomymanager, concepttaxonomyworkflow
Use Case
Smart Content Framework™ Building Blocks: Metadata, Insight
Situation
  Company is the premier global provider of fee-based market intelligence, advisory services, and events for the information technology, telecommunications, and consumer technology markets
  Seeking a solution to enhance site visitors' search experience
  Potential loss of revenues
Challenge
  Complex taxonomy requirements
  Inability for clients to identify the relevant information they were seeking
Solution
  concepttaxonomymanager and conceptclassifier
  Solr
  Integrated in-house
"Automation is great, but still needs a human eye to gain that last bit of ground. Anyway, it's a great story and I'm still very happy with Concept Searching and the flexibility it gives us." (Director, Enterprise Solutions Planning)
Benefits
  Improved search results
  Increased accuracy and relevant retrieval of information for external clients and site visitors
Concept Searching Demonstration
What's the End Result?
Technology from Concept Searching complements Enterprise Knowledge's strategic and tactical planning experience and expertise in architecting solutions that improve business processes.
Utilizing Concept Searching's Smart Content Framework and intelligent metadata-enabled solutions, this partnership addresses key challenges in enterprise search, records management, data privacy, migration, and content management in secure and complex environments.
For a comprehensive demo of the combined solution and discussion of expected ROI, please contact Don Miller at Concept Searching or Zach Wahl at Enterprise Knowledge.
Thank You
Zach Wahl, President and Chief Executive Officer, Enterprise Knowledge
zwahl@enterprise-knowledge.com, Twitter @EKConsulting
Don Miller, Vice President of Sales, Concept Searching
donm@conceptsearching.com, Twitter @conceptsearch