Text Mining in a Pharmaceutical Company

Similar documents
Discover more, discover faster. High performance, flexible NLP-based text mining for life sciences

ResearchGate. Scientific Profile. Professional network for scientists. ResearchGate is. Manage your online presence

Big Data and Text Mining

ITIL Knowledge Management

Informatics and Knowledge Management at the Novartis Institutes for BioMedical Research (NIBR)

Elsevier ClinicalKey TM FAQs

James Hardiman Library. Digital Scholarship Enablement Strategy

Joint Position on the Disclosure of Clinical Trial Information via Clinical Trial Registries and Databases 1 Updated November 10, 2009

Integrated Information Services (IIS) Strategic Plan

Centro de Salud México España Centro de Salud San Francisco Culhuacán Mexico City, Mexico

A Model for Centralized Monitoring & Clinical Data Management Reducing costs while ensuring compliance, risk mitigation & quality

Faut-il des cyberarchivistes, et quel doit être leur profil professionnel?

BOOSTING THE COMMERCIAL RETURNS FROM RESEARCH

Local Loading. The OCUL, Scholars Portal, and Publisher Relationship

Digital Asset Management 数 字 媒 体 资 源 管 理 任 课 老 师 : 张 宏 鑫

Associate Group Director, Regulatory Intelligence & Policy

SUBJECT TABLES METHODOLOGY

Introduction. The Healthcare Sector in Asia - Current Situation. Projected Healthcare Expenditure

DFG form /15 page 1 of 8. for the Purchase of Licences funded by the DFG

Master Data Management. (References)

Abbott Diagnostics. Durable Growth Business

Overview of Active Directory Rights Management Services with Windows Server 2008 R2

High Performance Computing Initiatives

1.00 ATHENS LOG IN BROWSE JOURNALS 4

BUILDING A SCALABLE BIG DATA INFRASTRUCTURE FOR DYNAMIC WORKFLOWS

Chapter 8: Types of Business Process Outsourcing. Imtiaz Surve H. Surve

College of Communication and Information. Library and Information Science

Building a Scalable Big Data Infrastructure for Dynamic Workflows

High-Performance Business Analytics: SAS and IBM Netezza Data Warehouse Appliances

NERC Data Policy Guidance Notes

Introducing to E-resources. Orientation Program By Swati Bhattacharyya, Librarian

Drug Discovery in China

Job Profile Clinical Research Associate III (CRA)

EUROPEAN INDUSTRIAL PHARMACISTS GROUP

Sourcing best practices SAP AG. All rights reserved. Internal


Citi. Commercial Cards. Commercial Card Program Provider to the State of Texas. Transaction Services

10 Reasons why Visma is the best software supplier for your business

Office SharePoint Server 2007

MEDICAL DATA MINING. Timothy Hays, PhD. Health IT Strategy Executive Dynamics Research Corporation (DRC) December 13, 2012

Enterprise Release Management

License Administration Guide. FlexNet Publisher 2014 R1 ( )

A leader in the development and application of information technology to prevent and treat disease.

Board of Member States ERN implementation strategies

F. Hoffmann-La Roche AG chooses enteo NetInstall to manage software packaging and distribution

IBM Analytical Decision Management

Bachelor of Health Sciences

Rewriting the rules of print management and procurement at your fingertips

C A S E S T UDY The Path Toward Pervasive Business Intelligence at Merck

SAS Drug Development User Connections Conference 23-24Jan08

License Administration Guide. FlexNet Publisher Licensing Toolkit

IT Risk Management. Agenda. Roche at a glance OSI Layers above 7 What we did What we plan to do how to not get shoot

Innovate and Grow: SAP and Teradata

Siemens Industry Automation Division

How To Use Open Source Software For Library Work

Plagiarism. Dr. M.G. Sreekumar UNESCO Coordinator, Greenstone Support for South Asia Head, LRC & CDDL, IIM Kozhikode

SILOBREAKER ENTERPRISE SOFTWARE SUITE

UCD School of Psychology. Library Information Resources Policy. Page 1

Unifying IT How Dell Is Using BMC

Data Management Plan. Name of Contractor. Name of project. Project Duration Start date : End: DMP Version. Date Amended, if any

URM and Its Benefits FAQ

FileNet and SharePoint Better Together. Tom Moen Channel Development Manager

Web Archiving and Scholarly Use of Web Archives

BIG DATA BREATHES LIFE INTO NEXT-GEN PHARMA R&D

Worldwide Advanced and Predictive Analytics Software Market Shares, 2014: The Rise of the Long Tail

BIG DATA: FROM HYPE TO REALITY. Leandro Ruiz Presales Partner for C&LA Teradata

Delivering Smart Answers!

Focused on your goals SERVICES AND SUPPORT CAPABILITIES OVERVIEW

SINTERO SERVER. Simplifying interoperability for distributed collaborative health care

Should Costing Version 1.1

Boost the Success of Medical Device Development With Systematic Literature Reviews

Developing Microsoft SharePoint Server 2013 Advanced Solutions

Mike Maxey. Senior Director Product Marketing Greenplum A Division of EMC. Copyright 2011 EMC Corporation. All rights reserved.

CASE STUDY. NHS Board. Contact. . Title. Category. Background/ context. NHS Tayside

IBM Cloud Managed Infrastructure Services for New Zealand Government

Multistep Dynamic Expert Sourcing

MICROSOFT DYNAMICS CRM Roadmap. Release Preview Guide. Q Service Update. Updated: August, 2011

Transcription:

Not for redistribution Text Mining in a Pharmaceutical Company Carola Lefrank, Roche Innovation Center Basel Bern, June 2015

About Us Use of Text Mining Negotiation Challenges Commercial Services Summary Q & A

About F. Hoffmann La Roche Ltd. Roche develops products offering advances in the prevention, diagnosis and treatment of disease. As the world s leading supplier of in vitro diagnostic systems and tests, we help guide clinical decision-making in every patient-care setting, from hospitals to patients homes. In the Pharmaceuticals Division we are focusing on areas of significant and pressing unmet medical need. Areas like cancer, where we lead the industry both in innovations and in market share. http://www.roche.com

About F. Hoffmann La Roche Ltd. Key Figures 2014 Group Sales: 47.5 bchf (Pharma: 36.7 bchf, Dia: 10.8 bchf) Over 88 000 people working together across more than 150 countries Our R & D Departments employ scientists with expertise in many disciplines, such as Chemistry, Biology, Physics, Pharmacology, Clinical Medicine and Informatics

Published Scientific Content Services Our Mission Our service ensures access to external scientific content (ejournals, ebooks and external databases) to Roche employees worldwide. The scope is Science, Technology and Medicine, as well as selected Business Information. The main activities include review and optimization of the Scientific Content portfolio on an ongoing basis and evaluation of new products in close collaboration with the Roche user community (this includes the product as well as the financial perspective). The service team manages all procurement activities related to the acquisition of external information, including negotiations with vendors, as well as backcharging of these services within the Roche Group. In addition, the ST ensures the integration of full text links into our database products (RocheLink), provides access to non-licensed material via our Document Delivery service, organize trainings for end user tools, provides guidance for copyright compliance and ensures maintenance of the Roche central intranet access point for external scientific content (info.roche.com).

Published Scientific Content Services Numbers 2014 Resources Ebooks: approx. 80 000 Ejournals: approx. 15 000 Databases: 48 Vendors: 176 People Users accessing external content: approx. 13 500 Staff Elibrary and VM: 4.7 FTE Scientific Documentation: 5.4 FTE Information Science: 8 FTE Contracts: approx. 300

About Us Use of Text Mining Negotiation Challenges Commercial Services Summary Q & A

The Relevance of Text Mining Exponentially increasing number of publications and complexity of topics results in growing workload for scientists to keep track of their area of expertise and explore new areas screening large corpora in a fact-based manner for research libraries to make content accessible in depth Text-mining allows in-depth analysis of full-text content in areas where searches on abstract level are not sufficient (e.g. Pharmacovigilance) Text-mining is creating a new quality in content analysis by complementing formerly established methods with new features: Improved retrieval Reduce search time Cast a wider net (federated search) Fact retrieval in place of document retrieval Integration Literature review using automated text-mining methods is more comprehensive and systematic

Definition Textmining describes the computer-based process of deriving relevant information from free texts by using a combination of linguistic, statistical, and machine learning techniques.

Information Sources Scientific Articles (journals and books) Abstract, Fulltext Patents WPO, EPO, USPTO and more Data Feeds Competitive Intelligence Information Clinical Trial News Databases Search Result Lists Internal Information Document Management System, Intranet, Collaborative Spaces, Lab Notebooks et al.

Information Sources - Abstract vs Full Text Abstract Fulltext Easy access Clearly structured Easy processing Limited access/license needed Complete content Difficult processing Including diverse type of data

Roche Use Case Linguamatics I2E Select text corpus for queries (also federated queries) Complete database Dataset based on database search Internal data Run a query on I2E using pre-built queries or refine queries on-the-fly Receive a excel-like result list of extracted information: fact retrieval linking scientific entities by relationships Do an intellectual check of the result list and modify/ refine query if needed Provide a reference list with links to fulltext documents (if available) to customer

About Us Use of Text Mining Negotiation Challenges Commercial Services Summary Q & A

Text and Data Mining Clause for Pharma Industry Developed and released by ALPSP, PDR and STM Association in 2012 Licensee may download, extract and index information from the Publisher s Content to which the Subscriber has access under this Subscription Agreement. Where required, mount, load and integrate the results on a server used for the Subscriber s text mining system and evaluate and interpret the TDM Output for access and use by Authorized Users. The Subscriber shall ensure compliance with Publisher's Usage policies, including security and technical access requirements. Text and data mining may be undertaken on either locally loaded Publisher Content or as mutually agreed.

Challenges Clause needs to be implemented in multiple bilateral publisher contracts: Number of contracts Multiple clause versions Diverse business models Scepticsm against text and data mining for business reasons Lack of cross-publisher cooperation (e.g. unified XML format) Technical and financial limitations for smaller vendors

About Us Use of Text Mining Negotiation Challenges Commercial Services Summary Q & A

Commercial Service 1 CCC - XML for Mining It is a growing platform of normalized full text data in XML format End user can create corpora including not-subscribed content Optional purchase functionality is available for the not-subscribed content Content is imported into text and data mining software CCC is in charge of copyright compliance The customer needs to have a CCC Copyright license in place The service is available for corporate customers

CCC - XML for Mining Publishers 1. Publishers provide content and rights <XML> <XML> <XML> CCC XML Article corpus TDM Software TDM Service 2. Companies create article sets using keywords or other criteria (including not-subscribed content) DirectPath Platform (ES backend) 3. Companies preview results including subscription status. Optional content purchase slide content provided by CCC 4. Final results are imported into text mining tools

Commercial Service 2 Elsevier - Text Mining for Life Sciences It is a growing text mining platform using abstracts, conference proceedings and full text journal articles Document input and output is available in multiple data formats Main focus is on interactions API is under development Elsevier is interested in collaborating with academic institutions as well as with Pharma and Biotech

Commercial Service 2 Elsevier - Text mining Process slide content provided by Elsevier

Non Commercial Service CrossRef Not-for-profit association of scholarly publishers CrossRef core service is DOI allocation and reference linking CrossRef text and data mining service: DOI is the basis for a text and data mining API The Cross Ref Metadata API is used to check (open access or subscribed) full text availability of content identified by CrossRef DOIs across publisher sites and regardless of the business model. The API provides links to fulltext in multiple formats Customer use their own tools to mine the content The service is currently free of charge

Non Commercial Service CrossRef slide content provided by CrossRef

About Us Use of Text Mining Negotiation Challenges Commercial Services Summary Q & A

Summary Our scientists need excellent laboratory, technology and information support Faster access to more content Better retrieval of relevant content More functionality More convenience Latest technology All this can be improved with the help of Text Mining Technologies

Summary We will need to manage potential constraints Financial Budget Headcount Number of available FTE s Skills, Background Technical System Requirements

About Us Use of Text Mining Negotiation Challenges Commercial Services Summary Q & A

Doing now what patients need next