Using Solr search in a Dot Net environment
Matthew Brumpton
Senior Systems and Applications Developer, University of Essex
DevCon1, 12 April 2013
Overview of today's talk
  UK Data Archive and the UK Data Service
  What is Solr?
  Why Solr?
  Life before Solr
  Current applications
  Architecture
    Application
    Server
  Working with Solr
  Acknowledgements
  Q&A
The UK Data Archive and the UK Data Service
  Based at the University of Essex since 1967
  Curator of the largest collection of digital data in the social sciences and humanities in the UK
  See data-archive.ac.uk for more details
  Makes these available via the new UK Data Service
  UK Data Service also provides value-added services for UK Census data, government surveys and beyond
  UK Data Service includes Universities of Essex, Manchester (Mimas, CCSR), Leeds, Southampton, Edinburgh (Edina) and University College London
  See ukdataservice.ac.uk for more details
What is Solr?
  Open-source solution, largely supported by the Apache Software Foundation
  Written in Java
  Solr is built on Lucene
  Lucene itself is just an indexing and search library
  Capable of indexing billions of items in a clustered environment
  Features include:
    Full-text search
    Faceted search
    Highlighting
    Rich document handling
    Distributed search (Solr Cloud)
    Highly scalable
    NoSQL
Why Solr?
  Alternatives
    Microsoft FAST (SharePoint 2010)
    HP Autonomy (TNA)
    Elastic Search (Lucene)
  Large community (http://wiki.apache.org/solr/publicservers)
    British Library
    The Guardian
    Ticketmaster
  Scalability
    Capable of indexing billions of items in a clustered environment
  Performance
    Can search millions of records in milliseconds
  Low cost
    No purchasing costs
Life before Solr
  ESDS Qualidata search interface
  ESDS International search interface
  ESDS Data catalogue
  ESDS Government Survey Finder
  BROWSE Major Studies
  BROWSE Subject Headings
  BROWSE New releases
  HASSET Subject Headings
  BROWSE Subject Headings
  BROWSE Thematic pages
  Comparable indicators (Long)
  ESDS Longitudinal search interface
  ESDS Government search interface
  Comparable geography (Long)
  ESDS Qualidata free text search interface
  ESDS Government Variable Search
  Variable Search
  ESDS Data Catalogue
  CESSDA catalogue
  DATA
  ESDS Government: publications citing ESDS International data
  ESDS International: publications citing ESDS International data
  ESDS Longitudinal: publications citing ESDS Longitudinal surveys
  Survey Question Bank
  RELU-DSS
  Census data catalogue (Data exploration)
  Nesstar (Data exploration)
  Quali Online
  UKDA-Store
  SDS
  HDS
Current applications
  We now have one architecture that supports all our search interfaces.
  The following applications have been built over the last two years:
    UGEO - spatial content of studies
      http://geo.data-archive.ac.uk/
    RELU - research data, publications and outputs
      http://relu.data-archive.ac.uk/explore-data/search-browse
    Discover - UK Data Service data collections, support guides, case studies and related publications
      http://discover.ukdataservice.ac.uk
    Variable search - variables and questions from survey datasets
      http://discover.ukdataservice.ac.uk/variables
    UK Data Service website search
      http://ukdataservice.ac.uk/web-search.aspx
Application Architecture
  Application
    Umbraco / ASP.NET MVC
    HTML
    jQuery
  Business / Data Access
    .NET Libraries
    UKDA.Search.Library
    SolrNet
  Solr Cloud
    Solr 4.1
    Lucene
    JVM 7
    Tomcat 7
  Data
    MS SQL
    .NET 4 Console
    Entity Framework / WebAPI
Server Architecture
Working with Solr
  Sending a query to Solr
    http://dasolrc3:8983/solr/catalogue/select?q="computer+program"&sort=date+desc&fl=commontitle%2c+commonlink%2c+commondescription%2c+date
  Responds with a JSON, XML or CSV result
  Built-in admin panel
    http://dasolrc3:8983/solr/
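The query URL on this slide can be assembled programmatically rather than by hand. The following is a minimal Python sketch, not the code used in the talk (which goes through SolrNet in .NET). The helper name build_select_url is hypothetical, and the wt=json parameter is an addition not shown on the slide; it is the standard Solr parameter for choosing the response format. Host, core and field names are taken from the slide.

```python
from urllib.parse import urlencode


def build_select_url(core_url, phrase, fields, sort="date desc", wt="json"):
    """Return a Solr /select URL for an exact-phrase query (hypothetical helper)."""
    params = {
        "q": f'"{phrase}"',       # double quotes ask Solr for a phrase match
        "sort": sort,             # e.g. newest first
        "fl": ", ".join(fields),  # fields to return for each document
        "wt": wt,                 # response writer: json, xml or csv (assumed addition)
    }
    # urlencode percent-encodes quotes and commas and turns spaces into "+"
    return f"{core_url}/select?{urlencode(params)}"


url = build_select_url(
    "http://dasolrc3:8983/solr/catalogue",
    "computer program",
    ["commontitle", "commonlink", "commondescription", "date"],
)
print(url)
```

The printed URL is the percent-encoded equivalent of the one on the slide, with wt=json appended. Building the query string from a dictionary keeps the encoding of quotes, spaces and commas out of application code.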
Demo
  1. Code-behind request
    Stack trace
      HTML Form
      Request: MVC Controller (HTTP Post)
      UKDA.Search.Library
      SolrNet
      UKDA.Search.Library
      HTTP Response
      Solr
        Tomcat 7
        Java 7
        Solr 4.2
        Lucene
  2. Ajax request
    Stack trace
      Request: Get
      Response: XML
      jQuery
      Request: REST
      MVC WebService (HTTP Post)
      UKDA.Search.Library
      SolrNet
      UKDA.Search.Library
      Response: JSON
      jQuery
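In the Ajax flow above, the middle tier receives Solr's result and hands a JSON response back to jQuery. The following Python sketch illustrates that reshaping step only. It is a stand-in under stated assumptions, not the UKDA.Search.Library or SolrNet code: the function name summarise_solr_response, the output shape and the sample document are all hypothetical. The input follows the standard layout of a Solr wt=json response (responseHeader, then response with numFound, start and docs).

```python
import json


def summarise_solr_response(raw_json, fields):
    """Reduce a Solr JSON response to a total and the requested fields per hit."""
    body = json.loads(raw_json)["response"]
    # Keep only the named fields; a field missing from a document becomes None
    results = [{name: doc.get(name) for name in fields} for doc in body["docs"]]
    return {"total": body["numFound"], "results": results}


# Made-up sample in the shape Solr returns for wt=json; not real catalogue data.
sample = json.dumps({
    "responseHeader": {"status": 0, "QTime": 3},
    "response": {
        "numFound": 1,
        "start": 0,
        "docs": [
            {
                "commontitle": "Example study",
                "commonlink": "http://example.org/1",
                "internal_id": 42,
            }
        ],
    },
})

summary = summarise_solr_response(sample, ["commontitle", "commonlink"])
print(json.dumps(summary, indent=2))
```

Returning only the fields the page needs keeps the payload sent to the browser small and hides index-only fields from the client.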
Acknowledgements
  The architecture shown was built with the input of the following people:
  Project managers
    Lucy Bell
    Jack Kneeshaw
    David Hall
    Van Den Eynden
    Tom Ensom
  Solr
    Oscar Dovao
  .Net
    Jonathan Sexton
    Steve Warin
    Sidharth Balakrishnan
    John Payne
    Darren Bell
    Raju Golla
    Sirisha Kakarla
    Nic Dragos
    Erkan Bostanci
    Bayar Menzat
Thanks for listening
Any questions?
CONTACT
Matthew Brumpton
UNIVERSITY OF ESSEX
WIVENHOE PARK
COLCHESTER
ESSEX CO4 3SQ
T +44 (0)1206 872001
E mbrump@data-archive.ac.uk
W data-archive.ac.uk