1 Big Data Challenges
2 A pretty picture, or a measurement? Organelles Dynamics Cells Retinal Imaging Physiology Pathology
3 Fundus Camera Optical coherence tomography Fluorescence Histology High Content Screening Fluorescence Lifetime Imaging Atomic Force Microscopy Electron Microscopy Dicom from Bio- magnetic Imaging to Ultrasound
4 Overview of types of imaging Objective Data Type Dimensionality Size Analysis Complexity Methods
5 Overview The specimen is illuminated with light of a specific wavelength (or wavelengths) which is absorbed by the fluorophores, causing them to emit light of longer wavelengths (i.e., of a different colour than the absorbed light). The illumination light is separated from the much weaker emitted fluorescence through the use of a spectral emission filter. This emitted light is stored as channels. Multi- colour images of several types of fluorophores must be composed by combining several single- colour images.
6 Detector Ocular Emission Filter Light Source Dichroic Filter Excitation Filter Objective Sample
7 X, Y typically 512x512 or 1024x1024 but now seeing larger(8192x8192). 50Kx50K common in pathology images. Z component microscopes have a depth of focus meaning they can see into a sample by a few microns. Time component Time lapse images are common in cell biology and Fluorescein angiography. Timescale: up to 72 hours. Channel component In fluorescence microscopy proteins can tagged with a dye that fluoresces at a particular wavelength. In AFM microscopy there can be many non- image features recorded. It is typical to have 3-4 channels in an image, though some imaging techniques can have 30+. Bit depth Typically 12 bit, but can be 8, 16,32, float, double or complex
8 Type: 5D Images Data type: typically 8, 12 or 16bit X, Y: 512, 512 now moving on to 4096, 4096 Z: In fixed cell can be 64+, less in live cell imaging T: 0 in fixed cells, can be in live cell imaging C: 3-4 typical, can be more. Size: 8MB- 50GB
9 Pipettes and vials 96/384 well plates and robot control
11 Cells Image Data Numerical Data Information
13 Type: 5D Images Data type: typically 8, 12 or 16bit X, Y: 512, 512 now moving on to 2048, 2048 Wells: 96, 384, 3840 Z: Commonly only on 1 section T: 0, but recent article in science showing live cell HCS. C: was 1, or2, 3-4 typical, can be more. Size: 1GB- 600GB
14 Performed by examining a thin slice (section) of tissue under a light microscope or electron microscope. The ability to visualize or differentially identify microscopic structures is frequently enhanced through the use of histological stains. Histology is an essential tool of biology and medicine. Biological tissue has little inherent contrast in either the light or electron microscope. Staining is employed to give both contrast to the tissue as well as highlighting particular features of interest.
15 Type: 3D Images Data type: typically 8 bit X, Y: 30K, 30K can be over 100K, 100K Z: Normally 1 z- section, but can br 20+ T: 0, fixed cell imaging C: 2-3 typical, quite often mixed colour rather than pseudo colour. Size: Highly compressed using JPEG2000- often very lossy; uncompressed 10GB- 30GB single plane images.
16 Accurate protocols and efficient algorithms can be run on whole slides or specified areas. Under the control of a pathologist scoring can become more accurate and achieve higher throughput to increase statistical significance. There is currently high interest in cancer biomarker expression quantification. In the research and pharmaceutical industries these methods become increasingly important in proportion to the number of markers, targets, and drugs to test.
17 Digital Acquisition System Image Raw Data Metadata Processed Data Metadata OMERO Quantitative Analysis Metaphase Image for Publication Anaphase Image for Publication Telophase Image for Publication Visualizing Data Management, tagging and querying
18 OMERO DB Image Data Associated Files ICE.grid Hibernate Spring AOP IOC C3PO DB ICE C++ Java Bio- Formats OMERO.editor OMERO.insight OMERO.insight Services Services Services Services OMERO.fs Python OMERO.matlab OMERO.web OMERO.scripts
19 Structure your image data Annotate your analysis results Attach your analysis results Images Datasets Projects Other data
22 Histology one big image, low compute Pyramid Generation High Content Screening many smaller images, high compute Classification FLIM smaller images, high compute Interactive, curve fitting and model selection
23 Image Data Associated Files OMERO DB Submit node ICE.grid Hibernate Spring AOP IOC C3PO DB Services Services OMERO.fs Services Services ICE C++ Java Python Submit node HPC Service
24 Xml specification held on OMERO server User calls HPC service (which wraps RAPID) with id of xml file and parameters. Rapid submits the job using the xml specification The resulting script runs on the cluster gets the data from OMERO There are two datasets, one for control and one containing the experimental condition. Distributes image via MPI to worker nodes The analysis results are attached to the dataset containing the images with the experimental condition
25 Image generation occurs on OMERO server OMERO server overloaded building image pyramid Short term solution Generate pyramids on cluster Problem Moving data from OMERO to cluster Rendering settings change, rebuild the pyramid
26 Queen's Medical Research Institute Royal Infirmary 500 user site Training Scripts for profiling server configurations Histology data, 200 images/day, new scope z- sections and stains Change colour mapping in realtime Interactive analysis of ROI. Leveraging analysis