Things Made Easy: One Click CMS Integration with Solr & Drupal




May 10, 2012
Things Made Easy: One Click CMS Integration with Solr & Drupal
Peter M. Wolanin, Ph.D.
Momentum Specialist (principal engineer), Acquia, Inc.
Drupal contributor (drupal.org/user/49851), co-maintainer of the Drupal Apache Solr Search Integration module

Key Questions to Be Answered
What is Drupal?
What Apache Solr features are integrated with Drupal?
Why is Drupal plus Apache Solr better than starting from scratch?
What elements of the search can you configure in the UI without code?

Why Are You Here?
You are starting a new website project?
You are wondering how hard it is to actually integrate Apache Solr with a website?
You already use Drupal but not with Apache Solr?
You like things that are easy yet powerful?

Drupal: Web Application Framework + CMS == Social Publishing Platform
"Drupal is as much a Social Software platform as it is a web content management system." (CMS Watch, The Web CMS Report 2009)
[Diagram: Content Mgmt Systems and Social Software Tools, labeled with workflow, content, users, blogs / wikis, taxonomy, semantic web, forums / comments, social ranking, RSS, social tagging, analytics, social networks]

Drupal + Solr Provides Immediate Access to Rich Search Features
Dynamic content requires dynamic navigation, which is provided by an effective search
Search facets mean no dead ends
Solr provides better keyword relevancy in results
Much faster searches for sites with lots of content
By avoiding database queries, Drupal with Solr scales better

DEMO: A Drupal 7 partial copy of the conference site with Apache Solr integration http://youtu.be/yy6kma_viwc

Drupal Has User Accounts, Roles & Permissions
Define custom roles
Set granular access controls by role
Configure user behavior: registration, email, profiles, pictures

Drupal Modules Add Functionality
There's a module for that
More than 4100 Drupal 7 community modules
Often controlled by role-based permissions
Drupal core and modules are GPL v2+, and have a huge, active community

Drupal is Written in PHP, Which Makes for Easy Customization
The Drupal architecture encourages and provides many avenues for customization by writing modules rather than patching Drupal core
Drupal has a huge community of users. Approximately 10,000 sites report to Drupal.org that they use the Apache Solr Search Integration module.

Drupal Adapts to You!!

Drupal Entities are Content + Data
Nodes are the basic entity used for text content
The entity system is extensible and can represent any data
Examples of data stored within Drupal entities: text, geographic location, node reference
[Diagram: Node 1 through Node 9]

Entity Types are Enriched With User-configurable Data Fields
Define new data fields on a node using the Field API module.
Text, images, integers, date, reference, etc.
Flexible and configurable in the UI
No programming required (many existing modules)

A Strong Framework for Content Classification
Core taxonomy system
Modules provide taxonomy-based appearance, access control
Standard input options include free tagging, flat-controlled, and hierarchical-controlled

Drupal + Solr Search for Business, Government and NGOs
http://www.mattel.com/search/apachesolr_search/
https://www.eff.org/search/site/
http://www.poly.edu/search/apachesolr_search/
http://www.whitehouse.gov/search/site/
http://opensource.com/search/apachesolr_search/
https://www.ethicshare.org/publications/
http://www.nypl.org/search/apachesolr_search/
http://www.mylifetime.com/community/search/apachesolr_search/
http://www.emporia.edu/search/site/
http://www.restorethegulf.gov/search/apachesolr_search/
http://www.hrw.org/en/search/apachesolr_search/

Drupal Has Already Solved Many Solr Integration Challenges
The most important: content indexing.
Facets, sorting, and highlighting of results.
Immediate integration with the More Like This and spell-check handlers.
An included sub-module integrates content access permissions by indexing them and filtering Solr results based on the current user.

Easy Content Recommendation!
Uses the MLT (More Like This) handler
Picks fields from the currently viewed node

The Module Has a Pipeline for Indexing Drupal Content to Solr
Drupal entities are processed into one (or more) document objects.
Each document object is converted to XML and sent to Solr.
[Diagram: Node object (title, nid, type) -> Drupal functions -> Document object (entity_type, label, entity_id, bundle) -> XML string]
<doc>
  <field name="entity_type">node</field>
  <field name="label">hello Drupal</field>
  <field name="entity_id">101</field>
  <field name="bundle">session</field>
</doc>
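The two pipeline steps above can be sketched in standalone PHP. This is only an illustration under stated assumptions: the function names `sketch_node_to_document()` and `sketch_document_to_xml()` are hypothetical, and the real module uses its own document class rather than plain arrays.

```php
<?php
// Hypothetical helpers for illustration; not the module's actual API.

/**
 * Step 1: map a node-like array to a flat document (field name => value).
 */
function sketch_node_to_document(array $node): array {
  return [
    'entity_type' => 'node',
    'label' => $node['title'],
    'entity_id' => $node['nid'],
    'bundle' => $node['type'],
  ];
}

/**
 * Step 2: serialize the document to a Solr <doc> XML fragment.
 */
function sketch_document_to_xml(array $document): string {
  $xml = '<doc>';
  foreach ($document as $name => $value) {
    // Escape both the attribute and the text so the XML stays well-formed.
    $xml .= '<field name="' . htmlspecialchars($name, ENT_QUOTES, 'UTF-8') . '">'
      . htmlspecialchars((string) $value, ENT_NOQUOTES, 'UTF-8')
      . '</field>';
  }
  return $xml . '</doc>';
}

$node = ['nid' => 101, 'title' => 'hello Drupal', 'type' => 'session'];
$xml = sketch_document_to_xml(sketch_node_to_document($node));
echo $xml, "\n";
```

The sample node mirrors the values shown on the slide; a real update request would wrap one or more such `<doc>` elements in an `<add>` element before posting to Solr.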

Entity Meta-data Gives Automatic Facets!
Content types
Taxonomy terms per vocabulary
Content authors
Posted and modified dates
Text and numbers selected via select list/radios/check boxes

Drupal Modules Implement Hooks to Control Indexing and Display
HOOK_apachesolr_index_document_build($document, $entity, $entity_type, $env_id)
By creating a Drupal module (in PHP), you can implement module and theme hooks to extend or alter Drupal behavior.
Change or replace the data normally indexed.
Modify the search results and their appearance.
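A minimal sketch of what implementing the indexing hook might look like. To run outside Drupal it uses a stand-in class, `SketchDocument`, instead of the module's document class; the module name `mymodule`, the `room` property, and the field name `ss_session_room` are all hypothetical. Check the module's own API documentation for the methods its document class actually provides.

```php
<?php
// Stand-in for the module's document class so the sketch runs outside Drupal.
class SketchDocument {
  public $fields = [];

  public function addField($name, $value) {
    $this->fields[$name][] = $value;
  }
}

// Drupal calls a hook by replacing HOOK with the implementing module's name.
// "mymodule" is a hypothetical module name.
function mymodule_apachesolr_index_document_build($document, $entity, $entity_type, $env_id) {
  // Add one extra field, but only for "session" nodes.
  if ($entity_type === 'node' && $entity->type === 'session') {
    $document->addField('ss_session_room', $entity->room);
  }
}

$entity = (object) ['type' => 'session', 'room' => 'Ballroom A'];
$document = new SketchDocument();
mymodule_apachesolr_index_document_build($document, $entity, 'node', 'solr');
print_r($document->fields);
```

Whatever field name is used has to match a field or dynamic field pattern defined in the schema.xml of the target Solr index.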

Updates to an Entity or Related Meta-data Cause Reindexing
Drupal entities are indexed during Drupal cron (typically invoked via *nix cron).
By using a specialized tracking table, content can automatically be queued for reindex when changed, and subsets of content can potentially be sent to different Solr indexes.
Entities include many ID-based reference fields (e.g. the User ID of the author). Changes to the referenced data are also watched.

Indexing Tracking Tables Maintain Order
+-------------+-----------+-------------+--------+------------+
| entity_type | entity_id | bundle      | status | changed    |
+-------------+-----------+-------------+--------+------------+
| node        |        36 | session     |      1 | 1336520756 |
| node        |        37 | session     |      1 | 1336510489 |
| node        |        38 | session     |      1 | 1336510456 |
| node        |        39 | session     |      1 | 1336510456 |
| node        |        40 | speaker_bio |      1 | 1336510456 |
+-------------+-----------+-------------+--------+------------+
When a node is updated, the changed timestamp is updated.
The indexing pipeline tracks the largest timestamp and entity_id that have been indexed.
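One way to read the "largest timestamp and entity_id" rule is as a cursor over the tracking table: anything ordered after the last indexed (changed, entity_id) pair is still pending. The sketch below works on an in-memory copy of the rows from the slide; `sketch_next_batch()` is a hypothetical name, and the module's real query may differ in detail.

```php
<?php
/**
 * Return up to $limit rows that sort after the last indexed position,
 * ordered by changed timestamp and then entity_id.
 */
function sketch_next_batch(array $rows, int $last_changed, int $last_entity_id, int $limit): array {
  $pending = array_filter($rows, function (array $row) use ($last_changed, $last_entity_id) {
    return $row['changed'] > $last_changed
      || ($row['changed'] == $last_changed && $row['entity_id'] > $last_entity_id);
  });
  usort($pending, function (array $a, array $b) {
    return [$a['changed'], $a['entity_id']] <=> [$b['changed'], $b['entity_id']];
  });
  return array_slice($pending, 0, $limit);
}

$rows = [
  ['entity_id' => 36, 'bundle' => 'session', 'changed' => 1336520756],
  ['entity_id' => 37, 'bundle' => 'session', 'changed' => 1336510489],
  ['entity_id' => 38, 'bundle' => 'session', 'changed' => 1336510456],
  ['entity_id' => 39, 'bundle' => 'session', 'changed' => 1336510456],
  ['entity_id' => 40, 'bundle' => 'speaker_bio', 'changed' => 1336510456],
];

// Assume node 38 was the last one indexed in the previous cron run.
$batch = sketch_next_batch($rows, 1336510456, 38, 3);
print_r(array_column($batch, 'entity_id'));
```

Using the entity_id as a tie-breaker matters here because several rows share the same timestamp; a cursor on the timestamp alone would either skip or repeat nodes 39 and 40.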

Example: Taxonomy Term Classifying a Node is Changed
[Diagram: term Grapefruit changed to Citrus fruit]
function apachesolr_taxonomy_term_update($term)
All nodes classified with this term are queued to be re-indexed by setting the changed column to the current time.
Thus you will correctly match Citrus instead of Grapefruit for those documents.
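The re-queue step can be pictured as a bulk update of the changed column. The following is a sketch over in-memory arrays, not the module's implementation: `sketch_requeue_for_term()`, the term ID 7, and the term-to-node map are hypothetical.

```php
<?php
/**
 * Mark every tracked node that carries term $tid as changed at time $now,
 * so the next indexing run picks it up again.
 */
function sketch_requeue_for_term(array $tracking, array $term_nodes, int $tid, int $now): array {
  $nids = $term_nodes[$tid] ?? [];
  foreach ($tracking as &$row) {
    if ($row['entity_type'] === 'node' && in_array($row['entity_id'], $nids, true)) {
      $row['changed'] = $now;
    }
  }
  unset($row); // Break the reference left over from the loop.
  return $tracking;
}

$tracking = [
  ['entity_type' => 'node', 'entity_id' => 36, 'changed' => 1336520756],
  ['entity_type' => 'node', 'entity_id' => 37, 'changed' => 1336510489],
  ['entity_type' => 'node', 'entity_id' => 38, 'changed' => 1336510456],
];
// Hypothetical mapping: term 7 classifies nodes 36 and 38.
$term_nodes = [7 => [36, 38]];

$updated = sketch_requeue_for_term($tracking, $term_nodes, 7, 1336600000);
print_r(array_column($updated, 'changed', 'entity_id'));
```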

When Unpublished, Content is Purged
Drupal core includes a simple editorial workflow where content may be toggled between published (visible) and unpublished (incomplete, removed, spam, etc).
The module immediately removes content from the index when it is unpublished, and also tracks it for future removal in case the Solr server is unavailable.

Search Using Dismax Query Parsing & Boosting Features
Dynamic fields in schema.xml are used to index standard and custom entity data fields
The Dismax (or EDismax) handler is used for keyword searching across multiple fields with per-field boosts
Query-time boosting options are available in the UI
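To make the per-field boosts concrete, here is a sketch of how a dismax request could be assembled by hand. `defType`, `q`, and `qf` are standard Solr dismax parameters; the function name, the field names `label` and `content`, and the boost values are assumptions chosen for illustration, not the module's defaults.

```php
<?php
/**
 * Build Solr dismax parameters from search keywords and a field => boost map.
 */
function sketch_dismax_params(string $keys, array $boosts): array {
  $qf = [];
  foreach ($boosts as $field => $boost) {
    // Dismax expects "field^boost" pairs separated by spaces.
    $qf[] = $field . '^' . $boost;
  }
  return [
    'defType' => 'dismax',
    'q' => $keys,
    'qf' => implode(' ', $qf),
  ];
}

$params = sketch_dismax_params('solr drupal', ['label' => 5, 'content' => 1]);
$query_string = http_build_query($params, '', '&');
echo $query_string, "\n";
```

With these values a keyword match in `label` counts five times as much toward the score as a match in `content`, which is the kind of weighting the UI boost settings adjust.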

A Query Object Is Used to Prepare and Run Searches
HOOK_apachesolr_query_prepare($query)
$query->setparam('hl.fl', $field);
$keys = $query->getparam('q');
$response = $query->search();

More Modules Available to Add More Features
A few examples:
ApacheSolr Attachments
Apache Solr Multisite Search
Apache Solr Organic Groups Integration
Apachesolr User indexing
Apachesolr Commerce

To Wrap Up!
Drupal has extensive Apache Solr integration already, and is highly customizable.
The Drupal platform is widely adopted, and the Drupal community drives rapid innovation.
Acquia provides Enterprise Drupal support and a network of partners.
Acquia includes a secure, hosted Solr index with every support subscription.

Did I Answer These?
What is Drupal?
What Apache Solr features are integrated with Drupal?
Why is Drupal plus Apache Solr better than starting from scratch?
What elements of the search can you configure in the UI without code?

Other PHP Integration Tools
http://www.solarium-project.org/
http://php.net/solr
http://pecl.php.net/package/solr
http://code.google.com/p/solr-php-client/
Caveat: don't use the serialized PHP response format in a custom integration; use the JSON writer.
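A short sketch of the JSON approach recommended in the caveat. The response body below is a made-up sample shaped like the output of Solr's JSON writer (requested with `wt=json`); a real integration would fetch it over HTTP. The slide does not give its reasoning, but one likely concern is that calling `unserialize()` on data received over the network can instantiate arbitrary objects, whereas `json_decode()` only produces arrays and scalars.

```php
<?php
// Illustrative response body, shaped like Solr's JSON writer output (wt=json).
$body = '{"response":{"numFound":1,"start":0,"docs":[{"entity_id":101,"label":"hello Drupal"}]}}';

// Decode to associative arrays and fail loudly on anything unexpected.
$data = json_decode($body, true);
if (!is_array($data) || !isset($data['response']['docs'])) {
  throw new RuntimeException('Unexpected Solr response');
}

// Collect the result labels keyed by entity ID.
$labels = [];
foreach ($data['response']['docs'] as $doc) {
  $labels[$doc['entity_id']] = $doc['label'];
}
print_r($labels);
```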

Acquia is Hiring! Do you love Drupal, Solr, the LAMP stack, DevOps or anything related, and working at a fast-growing and successful startup? Boston and Portland area U.S. offices. Some remote opportunities as well. Come talk to me! peter.wolanin@acquia.com pwolanin in IRC #drupal or #solr

Resources... Questions?!
http://drupal.org/project/apachesolr
http://drupal.org/project/apachesolr_attachments
http://archive.org/details/drupalconchi_day2_attain_apache_solr_coding_chops
http://www.acquia.com/tags/apachesolr
http://groups.drupal.org/lucene-nutch-and-solr