Behavior Grouping based on Trajectories Mining. Department of Medical Informatics Shimane University, School of Medicine, Japan


1 Behavior Grouping based on Trajectories Mining
Shoji Hirano, Shusaku Tsumoto
Department of Medical Informatics, Shimane University, School of Medicine, Japan

2 Outline
- Introduction: Background, Objective, Approach
- Method: Multiscale comparison and grouping of trajectories
- Experimental Results: Australian Sign Language data, Hospital Management
- Conclusions

3 Temporal Data Mining
- One-dimensional time series: chronological behavior of one variable
- Two-dimensional time series (trajectory): behavior of two variables
- Grouping of temporal sequences: capture the dynamic behavior of temporal variables
- 2D: detection of covariant variables, disease grouping, ...

4 Discoveries from Hepatitis Data
[Figure: ALB-PLT trajectories. Left: ALB, PLT covariant. Right: ALB, PLT non-covariant. Cases shown: #170 (C5;F4), #602 (C5;F4), #558 (C15;F1), #636 (C15;F3).]
Two groups of disease progression of liver fibrosis:
- Group 1: ALB, PLT: decreasing
- Group 2: PLT: decreasing, ALT: stable

5 Trajectory Mining Process
1. Segmentation and generation of multiscale trajectories
2. Segment hierarchy trace and matching
3. Calculation of dissimilarities
4. Clustering of trajectories

6 Multiscale Structural Comparison
- Represent trajectories using a multiscale description
- Search for the best correspondences between partial trajectories across all scales (cf. Ueda et al. (1990))
[Figure: Trajectory A and Trajectory B in the Attr.1-Attr.2 plane, each shown at Scale 0, Scale 1, and Scale 2 starting from t=0, with a segment marked.]

7 Multiscale Description
Represent the convex/concave structure of trajectories on various observation scales (cf. Mokhtarian et al. (1986)).

Trajectory representation:
$$c(t) = (ex_1(t), ex_2(t), \ldots, ex_I(t))$$
where $ex_i(t)$, $i \in I$, is the time series of test $i$.

Trajectory at scale $\sigma$:
$$C(t,\sigma) = (EX_1(t,\sigma), EX_2(t,\sigma), \ldots, EX_I(t,\sigma))$$
$$EX_i(t,\sigma) = ex_i(t) \otimes g(t,\sigma) = \sum_{n=-\infty}^{\infty} e^{-\sigma} I_n(\sigma)\, ex_i(t-n)$$
where $\otimes$ denotes convolution and $I_n$ is the modified Bessel function of order $n$.

- $\sigma$ large: global feature of the trajectory
- $\sigma$ small: local feature of the trajectory
[Figure: the trajectory C(t,σ) shown from σ = small, starting at C(t,0), to σ = large.]
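The smoothing step on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the infinite sum is truncated to a finite radius and renormalized, and the series ends are padded by repeating the first and last values. All three are assumptions the slide does not specify. The power series for $I_n$ is only practical for small $\sigma$.

```python
import math

def bessel_i(n, x, terms=60):
    """Modified Bessel function I_n(x) from its power series (small x only)."""
    n = abs(n)  # I_{-n}(x) equals I_n(x) for integer n
    return sum((x / 2.0) ** (2 * k + n) / (math.factorial(k) * math.factorial(k + n))
               for k in range(terms))

def discrete_gaussian(sigma, radius):
    """Weights e^{-sigma} * I_n(sigma) for n = -radius..radius, renormalized to sum to 1."""
    weights = [math.exp(-sigma) * bessel_i(n, sigma) for n in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth(series, sigma, radius=None):
    """EX_i(t, sigma): convolve one time series with the discrete Gaussian kernel."""
    if radius is None:
        radius = int(4 * math.sqrt(sigma)) + 4  # assumption: tail beyond this is negligible
    kernel = discrete_gaussian(sigma, radius)
    last = len(series) - 1
    smoothed = []
    for t in range(len(series)):
        acc = 0.0
        for n, weight in zip(range(-radius, radius + 1), kernel):
            index = min(max(t - n, 0), last)  # assumption: repeat the end values as padding
            acc += weight * series[index]
        smoothed.append(acc)
    return smoothed

def trajectory_at_scale(trajectory, sigma):
    """C(t, sigma): smooth every attribute series of a trajectory at the same scale."""
    return [smooth(series, sigma) for series in trajectory]
```

Calling `trajectory_at_scale([alb_series, plt_series], 2.0)` with two equally long lists would return the two smoothed series; at `sigma = 0.0` the kernel collapses to a single unit weight and the trajectory is returned unchanged.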

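8 Segment Matching based on Concave/Convex Structures
Segment: partial trajectory between inflection points.

Curvature at scale $\sigma$ (2D case) (cf. Ueda et al. (1990)):
$$K(t,\sigma) = \frac{EX_1' EX_2'' + EX_1'' EX_2'}{({EX_1'}^2 + {EX_2'}^2)^{3/2}}$$
where the derivatives are obtained by convolution with the derivatives of the kernel:
$$EX_i^{(m)}(t,\sigma) = \frac{\partial^m EX_i(t,\sigma)}{\partial t^m} = ex_i(t) \otimes g^{(m)}(t,\sigma)$$

Inflection point: a point $t$ at which the curvature changes sign:
$$K(t-1,\sigma) \cdot K(t,\sigma) < 0$$

Segment representation:
$$A^{(\sigma)} = \{ a_i^{(\sigma)} \mid i = 1, 2, \ldots, N \}$$
[Figure: trajectory c_j(t,σ) with segment set A^{(σ)} at σ = large, and segments a_1^{(0)}, a_2^{(0)} of A^{(0)} at σ = small.]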
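The segmentation rule on this slide, cutting the trajectory wherever the curvature changes sign, can be illustrated as follows. This is a sketch under assumptions: the curvature values are taken as already computed, the function names are hypothetical, and adjacent segments are assumed to share their boundary point so that no gaps arise.

```python
def inflection_points(curvature):
    """Indices t where the curvature sign differs between t-1 and t."""
    # A value of exactly zero gives a product of zero and is not counted as a sign change.
    return [t for t in range(1, len(curvature)) if curvature[t - 1] * curvature[t] < 0]

def split_segments(curvature):
    """Return (start, end) index pairs of the segments between inflection points."""
    last = len(curvature) - 1
    # The set removes a duplicate boundary when an inflection falls on the last index.
    bounds = sorted({0, last, *inflection_points(curvature)})
    # Consecutive segments share one boundary index, so concatenation leaves no gaps.
    return [(bounds[i], bounds[i + 1]) for i in range(len(bounds) - 1)]
```

For a curvature sequence such as `[1.0, 2.0, -1.0, -2.0, 3.0, 1.0]`, the sign changes at indices 2 and 4 split the trajectory into three segments.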

9 Multiscale Structural Comparison: Global Matching Criteria
- Minimization of total segment dissimilarity
- Complete match: the original trajectory must be formed without gaps or overlaps by concatenating the segments

Dissimilarity $d(a_i^{(k)}, b_j^{(h)})$ between two segments $a_i^{(k)}$ and $b_j^{(h)}$:
$$d(a_i^{(k)}, b_j^{(h)}) = \sqrt{ \left( g_{a_i}^{(k)} - g_{b_j}^{(h)} \right)^2 + \left( \theta_{a_i}^{(k)} - \theta_{b_j}^{(h)} \right)^2 } + \left| v_{a_i}^{(k)} - v_{b_j}^{(h)} \right| + \gamma \left( c(a_i^{(k)}) + c(b_j^{(h)}) \right)$$
where $g$ is the gradient, $\theta$ the rotation angle, $v$ the velocity, and $c(\cdot)$ the replacement cost. The velocity of a segment is its length divided by its number of points:
$$v_{a_i}^{(k)} = \frac{l_{a_i}^{(k)}}{n_{a_i}^{(k)}}$$
[Figure: segments a_i^{(k)} and b_j^{(h)} annotated with their gradient g, rotation angle θ, and velocity v.]
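A small Python sketch may help make the four terms of the segment dissimilarity concrete. It assumes that the gradient and rotation-angle differences are combined as a Euclidean distance and that the velocity difference is taken as an absolute value; the `Segment` class, its field names, and the default `gamma` are hypothetical and are not taken from the slides.

```python
import math
from dataclasses import dataclass

@dataclass
class Segment:
    gradient: float      # g: gradient of the segment
    rotation: float      # theta: rotation angle of the segment
    length: float        # l: length of the segment
    n_points: int        # n: number of points in the segment
    cost: float = 0.0    # c: replacement cost accumulated for this segment

    @property
    def velocity(self):
        """v = length / number of points."""
        return self.length / self.n_points

def segment_dissimilarity(a, b, gamma=1.0):
    """Shape term + velocity term + gamma-weighted replacement costs."""
    shape = math.hypot(a.gradient - b.gradient, a.rotation - b.rotation)
    speed = abs(a.velocity - b.velocity)
    return shape + speed + gamma * (a.cost + b.cost)
```

With `a = Segment(3.0, 0.0, 10.0, 5)` and `b = Segment(0.0, 4.0, 6.0, 3, cost=1.0)`, the shape term is 5, the velocities are equal, and only `gamma` times the replacement cost of `b` is added on top.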

10 Value-based Dissimilarity of Trajectories
After structural matching, calculate the value-based dissimilarity for each pair of matched segments.
- Attribute 1 dissimilarity: dv1(a_p, b_p) = peak difference + (left diff. + right diff.)/2
- Attribute 2 dissimilarity: dv2(a_p, b_p) = peak difference + (left diff. + right diff.)/2

$$d_{val}(a_p^{(0)}, b_p^{(0)}) = \sqrt{dv_1^2 + dv_2^2} + \mathrm{cost}$$
$$D_{val}(A, B) = \frac{1}{P} \sum_{p=1}^{P} d_{val}(a_p^{(0)}, b_p^{(0)})$$
[Figure: Trajectory A and Trajectory B in the Attr.1-Attr.2 plane, with the center of gravity (CoG) marked.]
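The per-attribute rule "peak difference + (left diff. + right diff.)/2" and the final averaging over the P matched segment pairs can be sketched as below. The sketch assumes each difference is an absolute difference and that a segment is summarized by its left end, peak, and right end values; both function names are hypothetical. How the two attribute dissimilarities and the cost are combined into one value per pair is left to the caller.

```python
def attribute_dissimilarity(a, b):
    """a, b: (left, peak, right) values of one attribute on two matched segments."""
    left, peak, right = (abs(x - y) for x, y in zip(a, b))
    return peak + (left + right) / 2.0

def trajectory_dissimilarity(pair_dissimilarities):
    """D_val(A, B): mean of the per-pair dissimilarities over all P matched pairs."""
    return sum(pair_dissimilarities) / len(pair_dissimilarities)
```

For example, segments with (left, peak, right) values `(1.0, 5.0, 2.0)` and `(2.0, 3.0, 4.0)` differ by 2 at the peak and by 1 and 2 at the ends.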

11 Experiment 1: ASL Data
Dataset: Australian Sign Language dataset in the UCI KDD Archive
- Time-series data on hand positions (3D) collected from 5 signers during the performance of sign language.
- Used for experimental validation by Vlachos et al. in ICDE02 (as 2D trajectories) and Keogh et al. in KDD00 (as 1D time series).
- For each signer, two to five sessions were conducted. In each session, five sign samples were recorded for each of the 95 words.
- The samples differed in length and typically contained about [...] time points.
[Figure: data hierarchy from signers (A to E) through sessions (1 to n) and words (1 to 95) to samples (1 to 5); examples of the sign "Norway".]

12 Experiment 1: ASL Data, Experimental Procedure
1. Out of the 95 signs (words), select the following 10 signs: Norway, cold, crazy, eat, forget, happy, innocent, later, lose, spend.
2. Select a pair of words such as {Norway, cold}. Each word has 5 sign samples, so a total of 10 samples are selected.
3. Calculate the dissimilarity for each pair of the 10 samples with the proposed method.
4. Construct two groups by applying average-linkage hierarchical clustering.
5. Evaluate whether the samples are grouped correctly.
Apply this procedure to every pair of the 10 words (45 pairs per session in total).
[Figure: samples 1 to 5 of word 1 ("Norway") and word 2 ("cold") go through pairwise comparison and grouping into two clusters, followed by an evaluation of whether the groups are correct.]
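Steps 4 and 5 of this procedure can be sketched with a plain average-linkage agglomerative clustering over a precomputed dissimilarity matrix. This is an illustrative sketch, not the authors' code: the function names are hypothetical, and "grouped correctly" is assumed to mean that each of the two clusters contains samples of only one word.

```python
from itertools import combinations

def average_linkage(dist, n_clusters=2):
    """Merge the closest clusters (by mean pairwise dissimilarity) until n_clusters remain."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            total = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
            mean = total / (len(clusters[a]) * len(clusters[b]))
            if best is None or mean < best[0]:
                best = (mean, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # a < b, so deleting b keeps index a valid
        del clusters[b]
    return clusters

def grouped_correctly(clusters, labels):
    """Assumed criterion: every cluster holds samples of a single word."""
    return all(len({labels[i] for i in cluster}) == 1 for cluster in clusters)

# The 10 selected words yield 45 unordered word pairs per session.
WORDS = ["Norway", "cold", "crazy", "eat", "forget",
         "happy", "innocent", "later", "lose", "spend"]
WORD_PAIRS = list(combinations(WORDS, 2))
```

In the experiment, `dist` would be the 10 x 10 matrix of dissimilarities between the 5 + 5 samples of one word pair, and `labels` the word of each sample; counting the word pairs for which `grouped_correctly` holds gives the per-session score.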

13 Experiment 1: ASL Data, Results

Session     # of correct pairs
andrew2     26/45
john2       34/45
john3       29/45
john4       30/45
stephen2    38/45 (best)
stephen4    29/45
waleed1     33/45
waleed2     36/45
waleed3     25/45 (worst)
waleed4     26/45

According to Vlachos et al., the results with the Euclidean distance, DTW, and LCSS were 15/45, 20/45, and 21/45, respectively. Signer/session information was not given in that paper.
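The correct-grouping ratio per session follows directly from the counts on this slide. The short sketch below recomputes it, assuming that every session was scored on the 45 word pairs named in the experimental procedure; the variable names are hypothetical.

```python
PAIRS_PER_SESSION = 45  # assumption: each session was evaluated on all 45 word pairs

# Number of correctly grouped word pairs per session, as listed on the slide.
correct_pairs = {
    "andrew2": 26, "john2": 34, "john3": 29, "john4": 30,
    "stephen2": 38, "stephen4": 29, "waleed1": 33, "waleed2": 36,
    "waleed3": 25, "waleed4": 26,
}

# Counts reported by Vlachos et al. for the three baseline measures.
baselines = {"Euclidean": 15, "DTW": 20, "LCSS": 21}

ratios = {session: count / PAIRS_PER_SESSION for session, count in correct_pairs.items()}
best = max(ratios, key=ratios.get)
worst = min(ratios, key=ratios.get)

for session in sorted(ratios):
    print(f"{session:10s} {ratios[session]:.3f}")
print("best:", best, "worst:", worst)
```

Because the signer and session behind the baseline figures are unknown, a comparison between these ratios and the baseline counts is only indicative.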

14 Background for 2nd Experiment
Hospital Information System (1980s-)
- Computerization of all hospital information
- Large-scale databases
- Data: an order and its record (1 order -> 3 to 5 transactions)
- All clinical actions are described as orders
  - Prescription: Doctor (Order) -> Pharmacist
  - Laboratory Examination: Doctor (Order) -> Laboratory

15 Background: HIS (2)
Hospital Information System
- Computerization of orders; results of orders: data for clinical actions
- Reuse of stored data: laboratory examinations, prescriptions, ... are results of orders
- History of orders: history of clinical actions
- Data-centric hospital management

16 Background: HIS (3)
How many orders are made every day? A case: Shimane University Hospital
- 616 beds, 1000 for outpatient clinic
- #Orders: about 8000
- Prescription: 700, Injection: 700
- Actions (Doctors & Nurses): 4300
- Storage of data: 100 MB/day, 30 GB/year (cf. Image: 2.5 TB/year)

17 Chronology of #Orders ( ~6.7)
[Figure: number of orders per day; axis labels: days of the week.]

18 Chronology of #Orders ( )
[Figure: series shown: Descriptions, Documents, Nursery.]

19 #Login 2008/6/2~2008/6/7
[Figure: series shown: Wards, Outpatient Clinic.]

20 Reuse of Data
Understanding the dynamic behavior of the hospital, doctors, and patients: temporal data mining
- Reuse of orders: analysis of clinical actions
- Data mining for temporal behaviors of the hospital or medical staff
- New type of hospital management

21 Co-occurrence of #Orders ( )
[Figure: labels: Reservations, Prescription, Examinations, Records; Morning, Afternoon.]

22 Experiment 2: Data of #Orders
- Data: # of orders for each day ( ~6.7)
- Objective: find groups of similar trajectories; analyze the relationships between the grouped trajectories
- Method: generate a dissimilarity matrix using the proposed method; perform cluster analysis using dendrograms generated by a hierarchical clustering method
- Results: 2 major groups: Outpatient/Ward + Ward

23 Clustering Results

24 Visualization for Clusters

25 Records + Reservations
[Figure: trajectory of Records and Reservations; labels: Morning, Afternoon, Outpatient, Wards. Group members: Prescriptions, Examinations, Radiology, Reservations.]

26 Records and Nursery (Wards)
[Figure: trajectory of Records and Nursery; labels: Morning, Afternoon, Outpatient, Wards. Group members: Nursery and Injections.]

27 Conclusions
- Presented a new method for trajectory mining: trajectory representation -> multiscale, structural comparison -> value-based dissimilarity -> clustering
- Application to the Australian Sign Language dataset
  - Correct grouping ratio: 25 of 45 pairs (worst), 38 of 45 pairs (best)
  - High robustness to noise
- Application to hospital data
  - Two groups of behavior of #Orders: Outpatient, Ward
  - Captured the macroscopic behavior of the university hospital
- Future work: extension to multidimensional trajectories

28 Preliminary Results (3D)
Matching results for 3-D trajectories


Effective Clustering of Time-Series Data Using FCM Effective Clustering of Time-Series Data Using FCM Saeed Aghabozorgi and Teh Ying Wah Abstract Today, wide important advances in clustering time series have been obtained in the field of data mining. A

More information

Cluster Analysis. Isabel M. Rodrigues. Lisboa, 2014. Instituto Superior Técnico

Cluster Analysis. Isabel M. Rodrigues. Lisboa, 2014. Instituto Superior Técnico Instituto Superior Técnico Lisboa, 2014 Introduction: Cluster analysis What is? Finding groups of objects such that the objects in a group will be similar (or related) to one another and different from

More information

PDF hosted at the Radboud Repository of the Radboud University Nijmegen

PDF hosted at the Radboud Repository of the Radboud University Nijmegen PDF hosted at the Radboud Repository of the Radboud University Nijmegen The following full text is a publisher's version. For additional information about this publication click this link. http://hdl.handle.net/2066/54957

More information

Unsupervised Data Mining (Clustering)

Unsupervised Data Mining (Clustering) Unsupervised Data Mining (Clustering) Javier Béjar KEMLG December 01 Javier Béjar (KEMLG) Unsupervised Data Mining (Clustering) December 01 1 / 51 Introduction Clustering in KDD One of the main tasks in

More information

Operations Research in Health Care or Who Let the Engineer Into the Hospital?

Operations Research in Health Care or Who Let the Engineer Into the Hospital? Operations Research in Health Care or Who Let the Engineer Into the Hospital? Michael W. Carter Health Care Resource Modelling Group Mechanical and Industrial Engineering University of Toronto 1 Outline

More information

Introduction to Data Mining

Introduction to Data Mining Introduction to Data Mining 1 Why Data Mining? Explosive Growth of Data Data collection and data availability Automated data collection tools, Internet, smartphones, Major sources of abundant data Business:

More information

TOWARDS SIMPLE, EASY TO UNDERSTAND, AN INTERACTIVE DECISION TREE ALGORITHM

TOWARDS SIMPLE, EASY TO UNDERSTAND, AN INTERACTIVE DECISION TREE ALGORITHM TOWARDS SIMPLE, EASY TO UNDERSTAND, AN INTERACTIVE DECISION TREE ALGORITHM Thanh-Nghi Do College of Information Technology, Cantho University 1 Ly Tu Trong Street, Ninh Kieu District Cantho City, Vietnam

More information

The SPSS TwoStep Cluster Component

The SPSS TwoStep Cluster Component White paper technical report The SPSS TwoStep Cluster Component A scalable component enabling more efficient customer segmentation Introduction The SPSS TwoStep Clustering Component is a scalable cluster

More information

The Trials and Tribulations and ultimate success of parallelisation using Hadoop within the SCAPE project

The Trials and Tribulations and ultimate success of parallelisation using Hadoop within the SCAPE project The Trials and Tribulations and ultimate success of parallelisation using Hadoop within the SCAPE project Alastair Duncan STFC Pre Coffee talk STFC July 2014 SCAPE Scalable Preservation Environments The

More information

Visualization of large data sets using MDS combined with LVQ.

Visualization of large data sets using MDS combined with LVQ. Visualization of large data sets using MDS combined with LVQ. Antoine Naud and Włodzisław Duch Department of Informatics, Nicholas Copernicus University, Grudziądzka 5, 87-100 Toruń, Poland. www.phys.uni.torun.pl/kmk

More information

Open & Big Data for Life Imaging Technical aspects : existing solutions, main difficulties. Pierre Mouillard MD

Open & Big Data for Life Imaging Technical aspects : existing solutions, main difficulties. Pierre Mouillard MD Open & Big Data for Life Imaging Technical aspects : existing solutions, main difficulties Pierre Mouillard MD What is Big Data? lots of data more than you can process using common database software and

More information

Sales Associate Business Plan-Tobias Realty

Sales Associate Business Plan-Tobias Realty Sales Associate Business Plan-Tobias Realty Sales Associate Year Date Use this guide to stimulate your thoughts about your career planning. Begin by identifying and prioritizing your vision for the next

More information

Staying on Schedule. Tips for taking your HIV medicines

Staying on Schedule. Tips for taking your HIV medicines Staying on Schedule Tips for taking your HIV medicines 9 1 3 8 7 6 5 Taking HIV medicines is a big step in fighting your HIV. These medicines can reduce the amount of HIV in your blood to very low levels

More information

Comparison of Elastic Matching Algorithms for Online Tamil Handwritten Character Recognition

Comparison of Elastic Matching Algorithms for Online Tamil Handwritten Character Recognition Comparison of Elastic Matching Algorithms for Online Tamil Handwritten Character Recognition Niranjan Joshi, G Sita, and A G Ramakrishnan Indian Institute of Science, Bangalore, India joshi,sita,agr @ragashrieeiiscernetin

More information

Unsupervised learning: Clustering

Unsupervised learning: Clustering Unsupervised learning: Clustering Salissou Moutari Centre for Statistical Science and Operational Research CenSSOR 17 th September 2013 Unsupervised learning: Clustering 1/52 Outline 1 Introduction What

More information

Methodology for Emulating Self Organizing Maps for Visualization of Large Datasets

Methodology for Emulating Self Organizing Maps for Visualization of Large Datasets Methodology for Emulating Self Organizing Maps for Visualization of Large Datasets Macario O. Cordel II and Arnulfo P. Azcarraga College of Computer Studies *Corresponding Author: macario.cordel@dlsu.edu.ph

More information

CHAPTER 1 INTRODUCTION

CHAPTER 1 INTRODUCTION 1 CHAPTER 1 INTRODUCTION Exploration is a process of discovery. In the database exploration process, an analyst executes a sequence of transformations over a collection of data structures to discover useful

More information

High Blood Pressure in People with Diabetes:

High Blood Pressure in People with Diabetes: Prepared in collaboration with High Blood Pressure in People with Diabetes: Are you at risk? Updated 2012 People with diabetes are more likely to have high blood pressure. What is blood pressure? The force

More information

Using multiple models: Bagging, Boosting, Ensembles, Forests

Using multiple models: Bagging, Boosting, Ensembles, Forests Using multiple models: Bagging, Boosting, Ensembles, Forests Bagging Combining predictions from multiple models Different models obtained from bootstrap samples of training data Average predictions or

More information

14 Sep 2015-31 Jul 2016 (Weeks 1 - V14)

14 Sep 2015-31 Jul 2016 (Weeks 1 - V14) Mon MAN2012L/Employability and Enterprise Skills/ Mrs. F Malik, Mrs. J Morgan MAN2012L/Employability & Enterprise Skills/ Mrs. F Malik, Mrs. J Morgan Assignment Success Workshop NOT COMPULSORY Wks 3,5

More information

STScI Bandwidth and the Archive. STUC Presentation 11/12/2009 Carl Johnson

STScI Bandwidth and the Archive. STUC Presentation 11/12/2009 Carl Johnson STScI Bandwidth and the Archive STUC Presentation 11/12/2009 Carl Johnson 1 Prologue STScI established an Archive Team to unify all our multi-mission archive services, operations, and resources under a

More information

Trajectory based Behavior Analysis for User Verification

Trajectory based Behavior Analysis for User Verification Trajectory based Behavior Analysis for User Verification Hsing-Kuo Pao 1, Hong-Yi Lin 1, and Kuan-Ta Chen 2 1 Dept. of Computer Science & Information Engineering, National Taiwan University of Science

More information

Steven M. Ho!and. Department of Geology, University of Georgia, Athens, GA 30602-2501

Steven M. Ho!and. Department of Geology, University of Georgia, Athens, GA 30602-2501 CLUSTER ANALYSIS Steven M. Ho!and Department of Geology, University of Georgia, Athens, GA 30602-2501 January 2006 Introduction Cluster analysis includes a broad suite of techniques designed to find groups

More information

A Review of Missing Data Treatment Methods

A Review of Missing Data Treatment Methods A Review of Missing Data Treatment Methods Liu Peng, Lei Lei Department of Information Systems, Shanghai University of Finance and Economics, Shanghai, 200433, P.R. China ABSTRACT Missing data is a common

More information

Supplementary Material: Covariate-adjusted matrix visualization via correlation decomposition

Supplementary Material: Covariate-adjusted matrix visualization via correlation decomposition Supplementary Material: Covariate-adjusted matrix visualization via correlation decomposition Han-Ming Wu 1, Yin-Jing Tien 2, Meng-Ru Ho 3,4,5, Hai-Gwo Hwu 6, Wen-chang Lin 5, Mi-Hua Tao 5, and Chun-Houh

More information

Data Mining Project Report. Document Clustering. Meryem Uzun-Per

Data Mining Project Report. Document Clustering. Meryem Uzun-Per Data Mining Project Report Document Clustering Meryem Uzun-Per 504112506 Table of Content Table of Content... 2 1. Project Definition... 3 2. Literature Survey... 3 3. Methods... 4 3.1. K-means algorithm...

More information

Spatio-Temporal Map for Time-Series Data Visualization

Spatio-Temporal Map for Time-Series Data Visualization Spatio-Temporal Map for Time-Series Data Visualization Hiroko Nakamura Miyamura, Yoshio Suzuki, and Hiroshi Takemiya Center for Computational Science and E-system Japan Atomic Energy Research Agency 6-9-3

More information

INTRODUCTION TO MACHINE LEARNING 3RD EDITION

INTRODUCTION TO MACHINE LEARNING 3RD EDITION ETHEM ALPAYDIN The MIT Press, 2014 Lecture Slides for INTRODUCTION TO MACHINE LEARNING 3RD EDITION alpaydin@boun.edu.tr http://www.cmpe.boun.edu.tr/~ethem/i2ml3e CHAPTER 1: INTRODUCTION Big Data 3 Widespread

More information

PLASTIC REGION BOLT TIGHTENING CONTROLLED BY ACOUSTIC EMISSION MONITORING

PLASTIC REGION BOLT TIGHTENING CONTROLLED BY ACOUSTIC EMISSION MONITORING PLASTIC REGION BOLT TIGHTENING CONTROLLED BY ACOUSTIC EMISSION MONITORING TADASHI ONISHI, YOSHIHIRO MIZUTANI and MASAMI MAYUZUMI 2-12-1-I1-70, O-okayama, Meguro, Tokyo 152-8552, Japan. Abstract Troubles

More information

HDDVis: An Interactive Tool for High Dimensional Data Visualization

HDDVis: An Interactive Tool for High Dimensional Data Visualization HDDVis: An Interactive Tool for High Dimensional Data Visualization Mingyue Tan Department of Computer Science University of British Columbia mtan@cs.ubc.ca ABSTRACT Current high dimensional data visualization

More information