May 2010
LIMS Integration Framework Model
Dr. Partha Mukherjee
Contents

Abstract
Market Trend
Target Audience
Problem Statement
Solution
Conclusion
References
About the Author
About HCL

Abstract

Today's LIMS business applications rarely live in isolation. LIMS users expect instant access to every LIMS business function the enterprise offers, regardless of which LIMS system hosts it. This requires disparate LIMS applications to be connected into a larger, integrated solution. Such integration is usually achieved through some form of middleware, which provides the plumbing: data transport, data transformation, and routing.

Integrating Laboratory Information Management Systems (LIMS) is a major challenge in the industry. Whether a pharmaceutical company runs multiple versions of LIMS or a single version, integration among LIMS products is still under research. The ideal solution would systematically integrate and cleanse LIMS data, including unstructured data, and store the data from the various operational LIMS systems in a single common LIMS repository from which users can retrieve data for reporting.

The proposed framework model is designed so that any type of LIMS product can be plugged into it. Intelligent, generic adaptors handle the data push/pull operations needed for seamless integration between different types of LIMS applications. These adaptors can be plugged into any type of LIMS.
Market Trend

Revenues in the laboratory information management systems (LIMS) market have received a huge boost from the growing use of these systems in mature sectors such as pharmaceuticals, manufacturing, and IT. LIMS are also used in emerging sectors such as biotechnology and food and beverage. This increased usage can be attributed to the launch of innovative LIMS technologies and products, especially in the biotech industry. LIMS adoption and the biotech sector are likely to expand together: as biotech enterprises become successful and begin to resemble pharmaceutical companies, they will have a greater need for LIMS integration so that they can work from one common platform and repository.

The Frost & Sullivan research service examined the key trends in the LIMS market and observed that considerable research is still required on LIMS integration. Because LIMS software is expensive to install, low research funding has become a major concern among LIMS companies. Most LIMS products available in the market are built on old technologies, and laboratories do not obtain adequate funds for managing their data and upgrading their LIMS, so they have to make do with less compatible systems. Because these older systems lack flexible operation, they reduce the efficiency of work. Maintaining multiple versions of LIMS is therefore practically impossible for an organization, and businesses are searching for a way to integrate multiple versions of LIMS seamlessly into a single version.
Target Audience

The target audience comes from the following industries:

- Pharmaceutical Industry & Biotech Industry
- Petrochemicals & Chemicals
- Contract Research Organization
- Clinical Lab
- Food Industry
- Others

Problem Statement

Global life science organizations are under increasing pressure to harmonize and integrate their different, asynchronously operating LIMS business processes, and are searching for ways to standardize on mission-critical computing solutions that integrate multiple types of LIMS into a single version. LIMS integration is a real challenge because most LIMS products available in the industry are proprietary systems. Integrating different LIMS business processes is difficult because their data formats differ and because unstructured data is exchanged in asynchronous mode. Most LIMS on the market are based on old technologies, and most are also standalone systems. It is very difficult to align multiple types of source LIMS with a single target LIMS because of unstructured data exchange and differing data formats. Businesses are under pressure to integrate their multiple versions of LIMS into a single version, in which master data as well as transaction data play a major role.

Solution

Architecting LIMS integration solutions is a complex task. There are many conflicting LIMS process rule engines and even more possible "right" solutions. Most integration vendors provide methodologies and best practices, but these instructions tend to be geared towards the vendor-provided tool set and often lack treatment of the bigger picture, including underlying guidelines, principles and best practices.

Asynchronous messaging architectures have proven to be the best strategy for enterprise integration because they allow for a loosely coupled solution that overcomes the limitations of remote communication, such as latency and unreliability. The trend towards asynchronous messaging has manifested itself in a variety of EAI (Enterprise Application Integration) suites as well as emerging standards for reliable, asynchronous web services. Many of the assumptions that hold true when developing single, synchronous applications are no longer valid. What is needed is vendor-independent design guidance and a framework model for building robust LIMS integration architectures based on asynchronous messaging. The LIMS integration layer of the proposed solution focuses on data virtualization, discussed in section 5.1, and on the data integration techniques discussed in section 5.2.

LIMS Integration Framework Model

Data integration is a difficult proposition, but help is coming in the form of a relatively new approach to information management called data virtualization. The typical enterprise today runs multiple types of LIMS, and integrating data across them is a real challenge.
Data integration is getting harder all the time, and we believe the major issues are data volumes and unstructured data formats, both of which continue to grow. Data integration is needed because it delivers value to the business, to consumers and to partners. Customers want quality data so that they can make better business decisions.

Data virtualization, also referred to as Information-as-a-Service and Data-as-a-Service, promises to ease the impediments to data integration by decoupling data from applications and storing it on the middleware layer. Data virtualization can essentially be thought of as a service-oriented architecture (SOA) for data. But where the traditional SOA approach has focused on business processes, data virtualization focuses on the information that those business processes use.

The features of a LIMS integration become challenges from a data integration perspective. There are three key data integration challenges involved in LIMS integration:
- Completeness of the LIMS life cycle information, which deals with the unstructured data format
- Correctness of the LIMS information, which deals with data accuracy and data quality
- Criticality of the information, which deals with on-time availability

These challenges become more prominent with the expectation that data integrated from unstructured formats is complete, and with the requirement that multiple views of the data must not lead to inaccuracy when they are blended with data from external sources. To address these issues, big vendors in the market have started promoting industry-specific data integration models that can be implemented with some customization to fit the needs of the enterprise. However, the challenges specific to integrating unstructured data sources go beyond these generic data models.

Fig.-1: LIMS Integration and Migration

In the proposed integration layer, structured or unstructured data from the different types of LIMS is pulled by an intelligent software device called the Pull Adaptor. The Pull Adaptor extracts the data from the different types of LIMS to a single location, where meaningful LIMS data held in different formats is transformed into a single data format. Once the data from the different versions of LIMS has been transformed, it is pushed to a common repository. The common repository will hold a single version of the LIMS data extracted from the different types of LIMS.

The data integration framework for unstructured data is based on an ESB (Enterprise Service Bus) and MoM (Message-Oriented Middleware). The pull/push adaptors are the integrators encapsulated within the MoM. An enterprise service bus (ESB) is a middleware software architecture that provides fundamental services for more complex integration architectures.
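The pull, transform and push flow described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the framework's actual implementation: the class names (PullAdaptor, LimsAPullAdaptor, LimsBPullAdaptor, CommonRepository), the function integrate and every field name are hypothetical, and in-memory lists stand in for real LIMS connections and for the common repository.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterable, List

# Hypothetical record type: a flat mapping of field name to value.
Record = Dict[str, str]


class PullAdaptor(ABC):
    """Generic adaptor: pulls raw records from one source LIMS and
    transforms each of them into the single common data format."""

    @abstractmethod
    def pull(self) -> Iterable[Record]:
        """Return raw records in the source LIMS's own format."""

    @abstractmethod
    def transform(self, raw: Record) -> Record:
        """Map one raw record to the common data format."""


class LimsAPullAdaptor(PullAdaptor):
    """Assumed source format A: keys 'SampleNo' and 'State'."""

    def __init__(self, rows: List[Record]) -> None:
        self._rows = rows

    def pull(self) -> Iterable[Record]:
        return list(self._rows)

    def transform(self, raw: Record) -> Record:
        return {"sample_id": raw["SampleNo"],
                "status": raw["State"].lower(),
                "source": "LIMS-A"}


class LimsBPullAdaptor(PullAdaptor):
    """Assumed source format B: keys 'id' and 'sample_status'."""

    def __init__(self, rows: List[Record]) -> None:
        self._rows = rows

    def pull(self) -> Iterable[Record]:
        return list(self._rows)

    def transform(self, raw: Record) -> Record:
        return {"sample_id": raw["id"],
                "status": raw["sample_status"].lower(),
                "source": "LIMS-B"}


class CommonRepository:
    """In-memory stand-in for the common LIMS repository."""

    def __init__(self) -> None:
        self.records: List[Record] = []

    def push(self, record: Record) -> None:
        self.records.append(record)


def integrate(adaptors: Iterable[PullAdaptor], repository: CommonRepository) -> int:
    """Pull from every source, transform, push; return the record count."""
    count = 0
    for adaptor in adaptors:
        for raw in adaptor.pull():
            repository.push(adaptor.transform(raw))
            count += 1
    return count
```

Under these assumptions, supporting a further source LIMS only requires one more PullAdaptor subclass; integrate and the repository stay unchanged, which is the sense in which the adaptors are generic.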
Messaging concentrates on the reliable exchange of messages across a network, using queues as a reliable load balancer and topics to implement publish and subscribe. An ESB typically adds features above and beyond messaging, such as orchestration, routing, transformation, adapters and mediation. The adaptors are generic in nature and can be plugged into any type of LIMS product.

This paper presents two LIMS integration scenarios. The first is integration and migration between two or more types of source LIMS products and a single version of a target LIMS; after the migration and integration, the old source LIMS can be decommissioned and the new target LIMS continues in operation. In the second scenario, data is extracted from the various types of source LIMS and integrated into a single common repository, from which various reports can be generated across the multiple versions or types of source LIMS. This type of integration is well suited to LIMS analytics.

Data Integration Techniques

Various types of middleware are generally used for integration, such as:

1. Remote procedure calls
2. Message-oriented middleware
3. Message brokers

The proposed system uses MoM (message-oriented middleware), as shown in Fig.-3. The proposed integration system is based on the ESB framework, within which the MoM is created for seamless data integration. The MoM is also able to guarantee that data and messages will reach their destination even when the destination is not available at the time of sending (asynchronous mode). The root pattern hierarchy is based on messaging technologies, where the creation of the message channels depends on the deployment time, on which and how many channels are involved, and on the channel direction. Because data exchange and data integration are required in both directions, from the source LIMS to the target LIMS and from the target LIMS back to the source LIMS, the channels should be bi-directional. The structure of the channel adaptors is shown in Fig.-2.

Fig.-2: Two different LIMS Applications with different data formats

LIMS-A and LIMS-B are two different source LIMS applications: LIMS-A sends messages in data format A, and LIMS-B sends messages in a different data format B. Only one process, the Sample Life Cycle, is considered here. As an example, let LIMS-A be Sapphire and LIMS-B be Labware, where the Sapphire database is an RDBMS whereas the Labware database is a DBMS.
Because the Labware database is a DBMS, all data integrity rules are implemented at the business layer, whereas in Sapphire they are implemented at the database layer. If the data from both of these LIMS products has to be transformed into a third, target LIMS whose database is an RDBMS, the proposed steps and the canonical data model can be followed. The different source data formats are translated by the source translator and mapped through the target translator to the target data format C, and the resulting data is finally transformed to the format of the target LIMS-X. Integration and transformation between the source and target applications are performed via the canonical data model so that seamless data exchange can occur between the source LIMS and the target LIMS.

The basic steps for the LIMS integration described above are as follows:

Step-1: Data Synchronization (data mapping) between the source LIMS and the target LIMS
Step-2: Data Migration from the source LIMS to the target LIMS
Step-3: Process Integration between the source LIMS and the target LIMS
Step-4: Data Integration between the source LIMS and the target LIMS
Step-5: Data build-up (following an incremental and iterative process)

The proposed protocol for the integration layer is SOAP (Simple Object Access Protocol).

Fig.-4: Application Integration with Different data format

Asynchronous data mapping allows the application to continue processing after it makes a middleware service request through the message channel, even though the data is extracted in different formats from different types of LIMS products. The message is dispatched to the queue manager, which makes sure that the message is delivered to its final destination LIMS. The message transformation layer understands the data format of all messages passed among the applications and transforms those messages while they move. The message channels carry the canonical data model, which can handle the unstructured data formats of the different types of LIMS applications.
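The translator and queue manager roles described in this section can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the canonical type CanonicalSample, the two source formats (a dictionary for format A, a pipe-delimited string for format B), the target format C, and the QueueManager class are hypothetical, and an in-process queue stands in for real message-oriented middleware and for the SOAP transport proposed above.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, Dict


@dataclass(frozen=True)
class CanonicalSample:
    """Hypothetical canonical data model for one sample result."""
    sample_id: str
    test_name: str
    result: float


def from_format_a(msg: Dict[str, str]) -> CanonicalSample:
    """Source translator for assumed format A (a dictionary)."""
    return CanonicalSample(msg["SAMPLE"], msg["TEST"], float(msg["VALUE"]))


def from_format_b(msg: str) -> CanonicalSample:
    """Source translator for assumed format B ('id|test|value')."""
    sample_id, test_name, result = msg.split("|")
    return CanonicalSample(sample_id, test_name, float(result))


def to_format_c(sample: CanonicalSample) -> Dict[str, object]:
    """Target translator: canonical model to assumed target format C."""
    return {"id": sample.sample_id,
            "analysis": sample.test_name,
            "value": sample.result}


class QueueManager:
    """Holds canonical messages until the target LIMS accepts them."""

    def __init__(self, deliver: Callable[[Dict[str, object]], bool]) -> None:
        # deliver returns True on success, False if the target is unavailable.
        self._deliver = deliver
        self._pending: Deque[CanonicalSample] = deque()

    def dispatch(self, sample: CanonicalSample) -> None:
        """Accept a message; the sender can continue immediately."""
        self._pending.append(sample)

    def flush(self) -> int:
        """Try to deliver queued messages in order; stop at first failure."""
        delivered = 0
        while self._pending:
            if not self._deliver(to_format_c(self._pending[0])):
                break  # target unavailable: keep the message for a later retry
            self._pending.popleft()
            delivered += 1
        return delivered

    @property
    def pending(self) -> int:
        return len(self._pending)
```

In this sketch each source LIMS needs only one translator into the canonical model and the target needs only one translator out of it, so adding a source does not require a new source-to-target mapping. Messages stay queued while the target is unavailable and are delivered, in order, on a later flush.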
Conclusion

In the proposed integration framework model, the system is flexible enough to add more source LIMS as and when required. This would also help in adding new LIMS business lines in future. The target LIMS will hold both transaction-level information at the lowest level and master data, so there will be no bottleneck in generating operational or ad hoc reports from a single source as and when required. The migration process migrates data incrementally, which would bring down the data loading time to a great extent. The proposed design acts as a single unified source of reporting for optimal performance. Because of the integration, licensing costs will be reduced considerably. The number of physical servers will also fall, which would bring down recurring hardware and maintenance costs. Using this framework, cost can therefore be optimized to a great extent.

References

1. Worldwide Regulatory Compliance Issues in Life Science (IDC #32690, December 2004)
2. 1Q05 Leading Indicators in Life Science's IT Spending Survey, an IDC Report
3. Knowledge Management in Drug Discovery R&D, 3rd Millennium, Inc., February 2003
4. Laboratory Automation: Smart Strategies and Practical Applications, Donald S. Young, University of Pennsylvania Medical Center, 3400 Spruce St., Philadelphia, PA 19014-4283
5. Laboratory Information Management System Outlook: Five year market analysis and technology forecast through 2008, Arc Advisory Group
6. Laboratory Automation and Information Management 32 (1996) 7-22
7. A Revolution in Agility: Business Integration Through Service-Oriented Architecture, an Oracle White Paper, updated August 2008
8. Enterprise Integration Patterns: Mr. Eva Shon, Software & Systems Engineering Seminar
About the Author

Dr. Partha Mukherjee is a Project Manager at the LIMS COE, Kolkata, HCL Technologies Ltd. He has around 16 years of IT experience. For the last 10 years he has worked with HCLT in the area of Business Intelligence and data warehouse technology. Partha specializes in BI, LIMS Integration, LIMS Analytics and Master Data Management. He did his doctoral research on Multimedia Database Management Systems and BI at the CSE Department, Jadavpur University, Kolkata, and has published multiple research papers in international IEEE journals and conferences.
ABOUT HCL

HCL Technologies

HCL Technologies is a leading global IT services company, working with clients in the areas that impact and redefine the core of their businesses. Since its inception into the global landscape after its IPO in 1999, HCL has focused on transformational outsourcing, underlined by innovation and value creation, and offers an integrated portfolio of services including software-led IT solutions, remote infrastructure management, engineering and R&D services and BPO. HCL leverages its extensive global offshore infrastructure and network of offices in 26 countries to provide holistic, multi-service delivery in key industry verticals including Financial Services, Manufacturing, Consumer Services, Public Services and Healthcare. HCL takes pride in its philosophy of Employee First, which empowers its 58,129 transformers to create real value for customers. HCL Technologies, along with its subsidiaries, had consolidated revenues of US$ 2.6 billion (Rs. 12,048 crores), as on 31st March 2010 (on LTM basis).

About HCL Enterprise

HCL is a $5 billion leading global Technology and IT Enterprise that comprises two companies listed in India: HCL Technologies and HCL Infosystems. Founded in 1976, HCL is one of India's original IT garage start-ups, a pioneer of modern computing, and a global transformational enterprise today. Its range of offerings spans Product Engineering, Custom & Package Applications, BPO, IT Infrastructure Services, IT Hardware, Systems Integration, and distribution of ICT products across a wide range of focused industry verticals. The HCL team comprises over 64,000 professionals of diverse nationalities, who operate from 26 countries including over 500 points of presence in India. HCL has global partnerships with several leading Fortune 1000 firms, including leading IT and Technology firms. For more information, please visit www.hcl.in.