PDQ-Wizard Prototype 1.0 Installation Guide




University of Edinburgh 2005
GTI and edikt

1. Introduction

This document is for users who want to set up the PDQ-Wizard system. It explains how to configure the environment, compile and build the source code, and deploy the system.

1.1. What is PDQ-Wizard

PDQ-Wizard is a web-based application that provides easy queries to the PubMed literature database.

1.2. Prerequisites

    Java 5.0
    Axis 1.2 RC1
    Ant 1.6.5
    JUnit
    Tomcat 5.0
    MySQL 5.0 and JDBC connector
    Python 2.4.2
    PubMed web service library

2. Prepare the tools

2.1. Java

The Java 2 Standard Edition (J2SE) 5.0 SDK is used to compile and run the application. The download page is http://java.sun.com/j2se/1.5.0/download.jsp. Earlier versions of the Java SDK do not work.

2.1.1. JavaServer Faces

JavaServer Faces is the component that supports dynamic JSP web pages using the MVC (Model-View-Controller) pattern. We used the latest 1.1 release (file jsf-1_1_01.zip), which can be downloaded from http://java.sun.com/j2ee/javaserverfaces/download.html.

2.1.2. Other components

The project uses the Jakarta Taglibs standard 1.0 library (http://jakarta.apache.org/taglibs/). The Jakarta Commons DBCP and Collections components are also used, to support database connection pooling in Tomcat. Download these jar files from http://jakarta.apache.org/commons/ and save them in the common/lib folder of Tomcat (see below).

2.2. Ant

Ant is the build tool, and the PDQ-Wizard application relies on it heavily. Ant version 1.6.5 is needed to handle Tomcat deployment as a build target. (tools.types.redirectelement is required.) Ant can be downloaded at http://ant.apache.org/.

2.3. JUnit

JUnit is a tool for unit testing.
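The version requirements stated so far (Java 5.0, that is version string 1.5.0, and Ant 1.6.5) can be checked before going further. The following is a minimal sketch, not part of PDQ-Wizard: the helper names are hypothetical, the command names java and ant are assumed to be how the tools appear on the PATH, and the code is written for a current Python 3 interpreter rather than the Python 2.4.2 listed in the prerequisites.

```python
import re
import shutil


def parse_version(text):
    """Turn '1.6.5' or '1.5.0_11' into a tuple of ints, e.g. (1, 6, 5)."""
    return tuple(int(part) for part in re.findall(r"\d+", text))


def meets_minimum(found, minimum):
    """True when the found version is at least the minimum version."""
    a, b = parse_version(found), parse_version(minimum)
    # Pad the shorter tuple with zeros so that '1.6' compares equal to '1.6.0'.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def missing_tools(names):
    """Return the command names that cannot be found on the PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # 'java' and 'ant' are assumed command names for the tools in 2.1 and 2.2.
    print("Not on PATH:", missing_tools(["java", "ant"]))
    print("Ant 1.6.5 is new enough:", meets_minimum("1.6.5", "1.6.5"))
    print("Java 1.4.2 is new enough:", meets_minimum("1.4.2", "1.5.0"))
```

The version strings passed to meets_minimum would come from running java -version and ant -version by hand; the sketch deliberately does not parse their output, since the exact format varies between releases.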

2.4. Axis

Axis is a tool for web service support. Version 1.2 RC1 has been tested; other versions may not work. Axis can be downloaded from http://www.apache.org/dyn/closer.cgi/ws/axis/1_2rc1/. After downloading, unpack the file into any folder.

2.5. MySQL

MySQL is the relational database management system that PDQ-Wizard uses to store its local data cache. It can be downloaded at http://dev.mysql.com/downloads/mysql/5.0.html. MySQL version 5.0 is used because it supports views. Installation of MySQL is straightforward and no configuration is necessary. The PDQ-Wizard database needs to be created; this is discussed later.

An additional component, the JDBC driver for MySQL, is required to access MySQL from Java. It can be downloaded at http://dev.mysql.com/downloads/connector/j/3.1.html.

2.6. Tomcat

Tomcat is the web server that lets PDQ-Wizard present query results as web pages. It can be downloaded at http://tomcat.apache.org/download-55.cgi and installed easily.

2.7. Python

The Python interpreter is used to run the data extraction programs that populate the alias database. It can be downloaded from http://www.python.org/. Installation is straightforward.

3. Build the PubMed library

PDQ-Wizard accesses the PubMed database through the Entrez Utilities web service interface. To use it, we first download the web service definition file and then generate the stub Java source code.

3.1. Download PubMed WSDL

The Entrez Utilities web service WSDL (Web Service Description Language) file can be downloaded from http://eutils.ncbi.nlm.nih.gov/entrez/eutils/soap/eutils.wsdl. It is used to generate the stub code that connects to the Entrez database for data queries. The current version used by PDQ-Wizard is 1.3. Save this file in <PDQ-Wizard folder>\entrez for later use.

3.2. Generate the stub code library

To make use of the Entrez web service, we need to generate stub source code from the WSDL file, compile this source code, and package the class files into a jar file. These steps are built into the Ant build script as a single target:

    <target name="gen-entrez" description="generate ncbi.jar file from eutils.wsdl">
        <java classname="org.apache.axis.wsdl.WSDL2Java" fork="true" dir="entrez">
            <classpath refid="classpath" />
            <arg value="eutils.wsdl"/>
        </java>
        <mkdir dir="entrez/build" />
        <javac destdir="entrez/build" source="1.4">
            <src path="entrez" />
            <classpath refid="classpath" />
        </javac>
        <jar destfile="ncbi.jar" basedir="entrez/build" />
    </target>

To run this target at the command line:

    ant gen-entrez

As a result, a gov folder is created in the entrez folder, and all Java source files generated from the WSDL file are placed in the gov directory tree. The target also creates the build folder, which contains all compiled Java class files. The file we need is the jar file ncbi.jar, which is placed in the current (PDQ-Wizard root) folder.

4. Preparing the data source

This section explains how to prepare the database for the local data cache and the alias collection.

4.1. Create the database

After MySQL is installed successfully, we can create the PDQ-Wizard database. To create the database manually, run mysql at the command console like this:

    mysql -u root --password=<yourpassword>

Then issue the command to create the database:

    mysql> CREATE DATABASE IF NOT EXISTS PDQDB;
    mysql> exit

4.2. Create data tables

After the database is created, the data tables and indexes can be created. The build script does this work. In the PDQ-Wizard folder, run the Ant target that creates the tables:

    ant create-tables

To verify that the tables were created, start mysql and try the following commands:

    mysql> use pdqdb;
    mysql> describe aliasmaps;

If the tables were created, you will see the following:

    +-------------+--------------+------+-----+---------+-------+
    | Field       | Type         | Null | Key | Default | Extra |
    +-------------+--------------+------+-----+---------+-------+
    | geneid      | varchar(16)  | NO   | PRI |         |       |
    | symbol      | varchar(64)  | NO   | MUL |         |       |
    | category    | varchar(2)   | NO   |     | GE      |       |
    | description | varchar(128) | NO   |     |         |       |
    +-------------+--------------+------+-----+---------+-------+
    4 rows in set (0.24 sec)

4.3. Download the alias datasets

Several datasets are used to search for synonyms or aliases of gene IDs or protein IDs. They can be downloaded from these sites:

    ftp://ftp.ncbi.nlm.nih.gov/gene/
    http://www.pir.uniprot.org/database/downloads.shtml

From the FTP site, download gene2accession, gene2refseq and gene_info; from the HTTP site, download uniprot_sprot.dat. Save these files in the db folder under the PDQ-Wizard folder.

4.4. Extract and load the datasets

The downloaded datasets must be extracted before they can be stored in the database tables. Several programs written in Python extract the datasets from the original files into text files that can be loaded directly into the MySQL database. The programs are listed below.

extract.py
    Extracts the gene2refseq and gene2accession files.
        python extract.py gene2refseq refseq.txt
        python extract.py gene2accession accession.txt
    Output files: refseq.txt, accession.txt

exgeneinfo.py
    Extracts datasets from gene_info.
        python exgeneinfo.py 128 256
    Output files: geneinfo.txt, aliasmaps.txt

extractunigene.py
    Extracts data from gene2unigene.
        python extractunigene.py unigene.txt Hs Mm
    Output file: unigene.txt

exswissprot.py
    Extracts AC and DE from the SwissProt text file.
    Output files: swissprot.txt, swissprotsyns.txt

After running these extractor programs, we have a few text files that are ready to be loaded into the database. The file loaddata.sql contains the commands that populate the database tables from these files:

    LOAD DATA LOCAL INFILE 'geneinfo.txt' INTO TABLE geneinfo LINES TERMINATED BY '\r\n';
    LOAD DATA LOCAL INFILE 'swissprot.txt' INTO TABLE geneinfo LINES TERMINATED BY '\r\n';
    LOAD DATA LOCAL INFILE 'aliasmaps.txt' INTO TABLE aliasmaps LINES TERMINATED BY '\r\n';
    LOAD DATA LOCAL INFILE 'swissprotsyns.txt' INTO TABLE aliasmaps LINES TERMINATED BY '\r\n';
    LOAD DATA LOCAL INFILE 'refseq.txt' INTO TABLE aliasmaps LINES TERMINATED BY '\r\n';
    LOAD DATA LOCAL INFILE 'accession.txt' INTO TABLE aliasmaps LINES TERMINATED BY '\r\n';
    LOAD DATA LOCAL INFILE 'unigene.txt' INTO TABLE aliasmaps LINES TERMINATED BY '\r\n';

To run this file, issue the command:

    mysql -u root --password=<password> --database=pdqdb < loaddata.sql

4.5. Data cleansing

The alias table contains synonyms or alias names of genes and proteins. These terms are used as query terms when searching the PubMed literature database. However, some of them are common English words or very short acronyms that occur too often in articles to be useful, so they should be removed from the alias table. We rely on dictionaries to identify the terms to delete. The resources we use include:

    English dictionary: OpenBSD, usr/share/words
    Cell-line dictionary: http://www.biotech.ist.unige.it/cldb/indexes.html
    Abbreviations and Acronyms in Biomedical Research and Practice: http://focosi.altervista.org/abbreviations.html

These words are put in a SQL script file, delwords.sql, with commands like:

    DELETE FROM AliasMaps WHERE LENGTH(Alias)<3;
    DELETE FROM AliasMaps WHERE Alias='aand';
    DELETE FROM AliasMaps WHERE Alias='able';
    DELETE FROM AliasMaps WHERE Alias='about';
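A script file of this shape can be produced mechanically from a word list. The sketch below is one possible way to do it and is not the tool used by the PDQ-Wizard authors: the function names, the one-word-per-line input format and the input file name words.txt are assumptions, and the code targets a current Python 3 interpreter. The table and column names are copied from the example commands.

```python
def sql_quote(word):
    """Escape a word for use inside a single-quoted MySQL string literal."""
    return word.replace("\\", "\\\\").replace("'", "''")


def delete_statements(words, min_length=3):
    """Build the DELETE commands: one length rule, then one statement per word."""
    lines = ["DELETE FROM AliasMaps WHERE LENGTH(Alias)<%d;" % min_length]
    seen = set()
    for raw in words:
        word = raw.strip()
        # Blank lines and words shorter than min_length are already
        # covered by the length rule; duplicates are emitted only once.
        if len(word) < min_length or word in seen:
            continue
        seen.add(word)
        lines.append("DELETE FROM AliasMaps WHERE Alias='%s';" % sql_quote(word))
    return lines


def write_script(word_file, sql_file="delwords.sql"):
    """Read one word per line from word_file and write the SQL script."""
    with open(word_file, encoding="utf-8") as src:
        lines = delete_statements(src)
    with open(sql_file, "w", encoding="utf-8") as out:
        out.write("\n".join(lines) + "\n")


if __name__ == "__main__":
    for statement in delete_statements(["able", "about", "an", "able"]):
        print(statement)
```

Calling write_script("words.txt") would write delwords.sql in the current folder. Quoting each word matters because dictionary entries can contain apostrophes, which would otherwise end the SQL string early.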

Then run the script against the database:

    mysql -u root --password=<password> --database=pdqdb < delwords.sql

5. Building the system

This section describes how to build the PDQ-Wizard system from the source code.

5.1. Download the package

The PDQ-Wizard application is distributed as one package that contains the source code, web pages, JSP pages and other related files. Download the package from http://forge.nesc.ac.uk/ and unpack it into a folder, for example C:\pdq.

5.2. Directory structure

After unpacking the application, the directory structure looks like this:

    pdq/
        src/
            pdq/*.java
            test/*.java
        db/*.sql
        web/*.html, *.jsp
            css/*.css
            images/*.jpg, *.gif
            WEB-INF/web.xml, faces-config.xml

5.3. Modify the build properties

Under the pdq folder (the root of PDQ-Wizard) there is a file named build.properties. It is a text file that holds the paths of the libraries as property values, for example:

    tomcat.home=C:\\Program Files\\Apache Software Foundation\\Tomcat 5.5
    deploy.path=${tomcat.home}/webapps
    jsf-api.jar=${tomcat.home}/jsf-1_1_01/lib/jsf-api.jar
    jsf-impl.jar=${tomcat.home}/jsf-1_1_01/lib/jsf-impl.jar
    commons-logging.jar=${tomcat.home}/jsf-1_1_01/lib/commons-logging.jar

Change the directory paths to match the folders where these components are actually installed. For example, the JavaServer Faces components (such as jsf-impl.jar) may not be installed under the Tomcat home folder. The ${name} notation refers to a property defined earlier in the file.

5.4. Build the source code

To check whether the source code compiles, use the Ant script:

    ant compile

If the compile fails, check the tools and libraries and try again. After a successful compile, run the build:

    ant build

The build script creates a build folder under the pdq folder. It contains everything that is to be deployed into the web container.

6. Deploying the system

PDQ-Wizard needs to be deployed into a web container, in this case the Tomcat web server. Deploying an application into Tomcat can be as simple as copying the files into a particular folder, namely webapps under the Tomcat home folder. The easiest way is to use the Ant build script.

6.1. Deploy using build script

To deploy the application, run the Ant deploy target:

    ant deploy

After a successful deployment, a pdq folder exists in the webapps folder of Tomcat. Its content is a copy of the build folder created when the application was built.

    [Tomcat]/
        webapps/
            pdq/*.html, *.jsp
                css/*.css
                images/*.jpg, *.gif
                WEB-INF/web.xml, faces-config.xml
                    classes/
                        pdq/*.class
                    lib/*.jar

6.2. Restart the application

Whenever something is modified, such as a web page or a Java bean, the application must be rebuilt and redeployed into the web container. For the update to take effect immediately, the container must reload the updated application. With Tomcat, the build script can do this:

    ant stop
    ant start

7. Running the system

7.1. Unit testing

Some components have unit tests. To run the unit test code, issue the command:

    ant runtest

7.2. Start from a browser

Start a browser such as Internet Explorer and point it to http://<host>:8080/pdq. On the local machine, use http://localhost:8080/pdq.
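The browser check can also be scripted, which is convenient after each redeploy. This is a minimal sketch under stated assumptions: the function names are hypothetical, the default host, port and context path are taken from the URL above, and the code uses only the Python 3 standard library (not the Python 2.4.2 used by the extraction programs).

```python
import urllib.error
import urllib.request


def pdq_url(host="localhost", port=8080, context="pdq"):
    """Build the application URL in the form http://<host>:8080/pdq."""
    return "http://%s:%d/%s" % (host, port, context)


def is_up(url, timeout=5.0):
    """Return True if the server answers the URL with an HTTP status below 400."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status < 400
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts and HTTP error responses.
        return False
```

For example, is_up(pdq_url()) checks the local deployment, and is_up(pdq_url("myserver")) checks a remote one, where myserver stands for whatever host runs Tomcat. A False result only says that the page did not load; the Tomcat logs are still the place to look for the cause.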