FEATURE SPECIFIC CRIMINAL MAPPING USING DATA MINING TECHNIQUES AND GENERALIZED GAUSSIAN MIXTURE MODEL


Uttam Mande, Dept of CSE, GITAM University, Visakhapatnam, mineuttam@gmail.com
Y. Srinivas, Dept of IT, GITAM University, Visakhapatnam, sriteja.y@gmail.com
J.V.R. Murthy, Dept of CSE, J.N.T. University, Kakinada, mjonnalagedda@gmail.com

Abstract: Much research has been devoted to mapping criminals to crimes, yet the crime rate continues to rise sharply because of the gap between investigation practice and the optimal use of available technology. This has created scope for new crime-investigation methodologies based on data mining, image processing, forensics, and social mining. This paper presents a model that uses a new methodology for mapping the criminal to the crime. The model clusters the criminal data by type of crime. When a crime occurs, the criminal is mapped on the basis of the features specified by the eyewitness. We propose a novel methodology that uses the Generalized Gaussian Mixture Model to match the features specified by the eyewitness against the features of criminals who have committed the same type of crime. If no criminal is mapped, the suspect table is checked and reports are generated.

Key words: Data mining, Generalized Gaussian Mixture model, Crime, criminal mapping, criminal identification, forensics.

1. Introduction

In the quest for a luxurious life, people are turning to crime and criminal activities to fulfil their ambitions. The National Crime Records Bureau states that nearly 230000 crimes occurred up to 200. The Government is working rigorously on steps to decrease the crime rate through 1) prevention of crime occurrence and 2) finding the accused and punishing them.
1.1 Prevention of the crime occurrence (before the crime occurs)

It is always better to prevent a crime than to take measures after it has occurred. For this, the analysis of crime data plays a vital role in predicting future crimes. Much work has been reported in this area, mostly confined to predicting crime occurrence based on location, which helps to analyze the locations chosen by criminals. Analyzing crime data to find the locations frequently used for a particular type of crime, the time of the crime, the social status of the people involved, crime links and so on is of even greater use. This helps law enforcers to predict crime and take the necessary action.

1.2 Finding the accused and punishing (after the crime occurs)

Punishment creates fear among criminals and thereby acts as a preventive measure that deters others from being drawn to criminal activities. If a criminal is identified and punished rigorously for his criminal activities, this indirectly contributes to crime prevention. According to the latest statistics, more than 70% of cases are unsolved and pending, which implies an increase in the crime rate. Survey reports also indicate that crimes are mostly repeated by offenders who have committed similar crimes earlier, which suggests that solving old crimes indirectly helps to prevent new ones.

With this gap in mind, in this paper we have developed a model that helps the police department and other investigation agencies to find the criminal at the earliest. The model is confined to situations where an eyewitness is present at the place of the crime. For our model we have collected a data set generated from the criminal data available at different police stations.
The database is created by considering two basic facts, viz., the availability of a witness at the crime spot and the reports from the forensic labs collected from the clue spot [uttam et al 202]. Data mining concepts are exploited to mine the likelihood of the criminals based on the features/witness available. Clustering is performed based on the type of crime: weights are assigned to each crime, and the crimes are clustered using k-means clustering. This yields four main clusters. The crime activities considered in this paper are murder, riot, kidnap and robbery.

Based on the features described about the criminal, a face is generated, and the generated face is compared with the existing faces to find the likelihood of the criminals [uttam et al202]. These identities are further mapped with the forensic clues to formulate a unique identity. Data mining techniques help to explore the enormous data and make it possible to reach the ultimate goal of criminal analysis by using the concepts of clustering and classification. In this paper clustering is carried out based on the type of crime.

The rest of the paper is organized as follows. Section 2 describes feature extraction, Section 3 gives the details of crime identification and discusses the clustering technique, Section 4 presents the Generalized Gaussian Mixture Model, Section 5 highlights the experimentation, and Section 6 concludes the paper.

2. Features Extraction:

The way a crime investigation is carried out depends on different factors such as 1) the type of crime, 2) the witnesses available and 3) the clues available. Using these factors, crimes can be mapped to criminals who have the same pattern. In the crime database we have to consider the features which are available at the crime spot. The identification of any crime depends on different crime variables: 1) clue variables and 2) criminal relating/identification variables. Crime clues play a vital role in the proper identification of the criminal. The clues are the stepping stone towards crime analysis, and criminal relating is the mapping of the criminal, based on the clues, to the data available in the database by the use of intelligent knowledge mapping.

These criminal variables help to analyze the data set, enabling crime investigators to plan the identification of the criminal. The choice of variables depends on the particular scenario of the crime. In this paper we use k-means clustering to cluster the database by type of crime, and classification is carried out from the features available.

3. k-means clustering:

In order to simplify the analysis, the huge data set available has to be clustered. The clustering in this paper is based on the type of crime. A data set is generated from the database available from the Andhra Pradesh police department, and a table is created by considering the FIR reports. The fields considered include the criminal identification numbers, criminal attributes, criminal psychological behavior, crime location, time of crime (day/night) and witness/clue. The data set is generated by indexing each feature with an index value, as shown in Fig 3.1.

3.1 Crime variables for criminal mapping

The important crime variables used for crime mapping include
1) Type of crime
   a. (murder, rape, riot, kidnap)
2) Way of doing the crime
   a. The criminal's psychological behavior can be recognized here, e.g. for murder (harsh, smooth, removing parts)
3) Time of crime
   a. (time, day time, night)
4) Modus operandi
   a. (object used for crime): 1) Pistol 2) Rope 3) Stick 4) Knife
5) Crime location
   a. (place: restaurant, theater, road, railway station, shop/gold shop, mall, house, apartment)
6) Physical attributes of criminal
   a. (hair, build, eyebrows, nose, teeth, beard, age group, mustache, languages known)

Fig 3.1
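The clustering by type of crime described in this section can be sketched as follows. This is a minimal illustration, not the authors' implementation: the weights in `crime_weights` and the record IDs are hypothetical, chosen only so that related offences receive nearby values, and `kmeans_1d` is a plain Lloyd's k-means on scalar values written for this example.

```python
# Hypothetical weights per crime record (illustrative values only):
# related offences are given nearby weights, as the weighting scheme requires.
crime_weights = {
    "r1": 1.0, "r2": 1.2,   # e.g. robbery-like records
    "k1": 3.0, "k2": 3.1,   # e.g. kidnap-like records
    "m1": 5.0, "m2": 5.3,   # e.g. murder-like records
    "t1": 7.0, "t2": 7.2,   # e.g. riot-like records
}


def kmeans_1d(values, k, iters=100):
    """Lloyd's k-means on scalar values; returns (centroids, labels)."""
    distinct = sorted(set(values))
    # Deterministic initialisation: spread the starting centroids
    # evenly over the sorted distinct values.
    step = (len(distinct) - 1) / (k - 1) if k > 1 else 0
    centroids = [distinct[round(i * step)] for i in range(k)]
    labels = []
    for _ in range(iters):
        # Assignment step: each value goes to its nearest centroid.
        labels = [min(range(k), key=lambda j: abs(v - centroids[j]))
                  for v in values]
        # Update step: each centroid becomes the mean of its members.
        updated = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            updated.append(sum(members) / len(members) if members else centroids[j])
        if updated == centroids:
            break
        centroids = updated
    return centroids, labels


record_ids = list(crime_weights)
values = [crime_weights[r] for r in record_ids]
centroids, labels = kmeans_1d(values, k=4)
clusters = {}
for rid, lab in zip(record_ids, labels):
    clusters.setdefault(lab, []).append(rid)
print(clusters)
```

With four well-separated groups of weights, the four clusters recover the four crime types; on real data the weighting scheme determines how cleanly the types separate.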

Depending on the different offences and the victims, crimes are categorized as shown in Fig 3.2. Different sections of law apply to different crimes; hence, from the time a crime occurs until the investigation is complete, law enforcers cannot assign the crime to an exact category. This implies that at the early stages the crime is defined in a generalized way.

Fig 3.2 Categories of crimes

Crimes are categorized in many ways. Here we have given a weight to each type of crime, where the weighting scheme is chosen so that related crimes are given nearby values. After applying the clustering algorithm on this type-of-crime feature we obtained four clusters of crime data: robbery, kidnap, murder and riot, as shown in Fig 3.3 (X-axis: crime indexes, Y-axis: CID).

When a crime occurs, the data to be analyzed is taken from the data set of the cluster to which the crime belongs. The features specified by the eyewitness are given to the mapping model, which calculates the PDFs of the features using the Generalized Gaussian Mixture Model and compares them with the PDFs of the criminals' features.

4. Generalized Gaussian Mixture Model (GGMM)

Generalized Gaussian Distribution

In this section we briefly discuss the probability distribution and its properties used in the crime identification algorithm. Let the crime data values (intensities) in the entire crime data be a random variable that follows a finite Generalized Gaussian Mixture Distribution. It is also assumed that the entire crime data is a collection of K crime data regions, and that the crime data values in each region follow a Generalized Gaussian Distribution. The probability density function is

$$ f(z \mid \mu, \sigma, P) = \frac{1}{2\,\Gamma\!\left(1 + \frac{1}{P}\right) A(P, \sigma)} \exp\!\left( - \left| \frac{z - \mu}{A(P, \sigma)} \right|^{P} \right) \qquad (4.1) $$

where $\Gamma(\cdot)$ is the gamma function and

$$ A(P, \sigma) = \left[ \frac{\sigma^{2}\, \Gamma\!\left(\frac{1}{P}\right)}{\Gamma\!\left(\frac{3}{P}\right)} \right]^{1/2} \qquad (4.2) $$

The parameter $\mu$ is the mean, the function $A(P, \sigma)$ is a scaling factor which ensures that $\mathrm{Var}(Z) = \sigma^{2}$, and $P$ is the shape parameter.
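The generalized Gaussian density defined above can be evaluated numerically with the sketch below. It assumes the standard form of the density, with normalizing constant 2Γ(1 + 1/P)·A(P, σ) and scale A(P, σ) = sqrt(σ²Γ(1/P)/Γ(3/P)). The function names, the mixture helper `ggmm_pdf` and its component weights are assumptions introduced for illustration; the paper does not spell out the mixture weights.

```python
import math


def scale_factor(p, sigma):
    """A(P, sigma) = sqrt(sigma^2 * Gamma(1/P) / Gamma(3/P))."""
    return math.sqrt(sigma ** 2 * math.gamma(1.0 / p) / math.gamma(3.0 / p))


def gg_pdf(z, mu, sigma, p):
    """Generalized Gaussian density with mean mu, std sigma, shape p."""
    a = scale_factor(p, sigma)
    norm = 2.0 * math.gamma(1.0 + 1.0 / p) * a
    return math.exp(-abs((z - mu) / a) ** p) / norm


def ggmm_pdf(z, components):
    """Finite mixture density (assumed form): sum of weight * gg_pdf
    over (weight, mu, sigma, p) tuples whose weights sum to 1."""
    return sum(w * gg_pdf(z, mu, sigma, p) for w, mu, sigma, p in components)


def gaussian_pdf(z, mu, sigma):
    """Reference Gaussian density, used to check the case p = 2."""
    return math.exp(-((z - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))


def laplace_pdf(z, mu, sigma):
    """Reference Laplacian density with variance sigma^2 (scale b = sigma / sqrt(2))."""
    b = sigma / math.sqrt(2.0)
    return math.exp(-abs(z - mu) / b) / (2.0 * b)


# Shape p = 2 reproduces the Gaussian and p = 1 the Laplacian.
print(gg_pdf(0.3, 0.0, 1.5, 2.0), gaussian_pdf(0.3, 0.0, 1.5))
print(gg_pdf(0.3, 0.0, 1.5, 1.0), laplace_pdf(0.3, 0.0, 1.5))
```

Comparing against the Gaussian and Laplacian references is a quick sanity check that the scaling factor keeps the variance at σ² for different shape parameters.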
When P = 1, the Generalized Gaussian corresponds to a Laplacian (double exponential) distribution; when P = 2, it corresponds to a Gaussian distribution. In the limiting case P → ∞, equation (4.1) converges to a uniform distribution on (μ − √3 σ, μ + √3 σ), and when P → 0+, the distribution becomes degenerate at Z = μ.

5. Experimentation

The features are taken as input from the available witness and, based on these features, the database is compared for relevance to the crime. The features considered are the standard features generally used in the FIR at police stations, as shown in Fig 5.1.

Fig 5.1

The features given as input are used to calculate the PDFs with the Generalized Gaussian Mixture Model. These PDFs are compared with the PDFs of the criminals' features in the database, and the criminals with the nearest values are displayed, as shown in Fig 5.2.

Fig 5.2

The outputs obtained include the CIDs, and their filtered data is stored in the database for further investigation. Metrics such as MSE and PSNR are used to evaluate the outputs, as shown in Fig 5.3.

cid    MSE      PSNR
00     0.       28.835
003    0.       23.836
004    0.22     27.33
005    0.888    24.32
008    0.555    25.34
009    0.444    25.426
02     0.       23.436
03     0        50
04     0.777    24.6
05     0.       28.836
06     0.666    24.945
07     2.       22.445

Fig 5.3 Snapshot of the MSE and PSNR data

The MSE and PSNR values in the table are analyzed, and the criminal with the highest PSNR is considered the most likely criminal. As shown in Fig 5.5, the criminal with cid 03 has the highest PSNR and is considered the most likely criminal.

Fig 5.5 Graph showing the CID with high PSNR

The identification of the criminal considered in this paper applies to cases where a witness is available at the crime incident or forensic reports are available. The criminal is mapped by collecting the features of the crime from the witness and comparing them with those available in the database; if there is a match, the criminal can be identified. If the data available from the witness is not sufficient, the available forensic reports are also considered and correlated with the witness report to confirm the criminal. Using this methodology a criminal is identified, and for uniqueness this data is given as input to the Generalized Gaussian Mixture Model to identify a unique criminal. From the above table, 03 is identified as the criminal.

6. Conclusion:

This paper presents a novel methodology for identifying a criminal in the presence of a witness or of any clue provided by forensic experts. In these situations we have tried to identify the criminal by mapping the criminal using the Generalized Gaussian Mixture Model.
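The MSE/PSNR comparison used in the experimentation of Section 5 can be sketched as follows. The paper does not state its exact PSNR formula, so this example assumes the common definition 10·log10(peak²/MSE); the `peak` value, the finite `cap` used when the MSE is zero, the candidate IDs and all PDF values are hypothetical and serve only to illustrate ranking by highest PSNR.

```python
import math


def mse(a, b):
    """Mean squared error between two equal-length sequences of PDF values."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def psnr(err, peak=1.0, cap=50.0):
    """PSNR in dB under the assumed definition 10*log10(peak^2 / MSE).
    A zero error would give an infinite PSNR, so it is capped (assumption)."""
    if err == 0:
        return cap
    return min(cap, 10.0 * math.log10(peak ** 2 / err))


# Hypothetical PDF values of the features described by the witness.
witness_pdf = [0.20, 0.35, 0.10]

# Hypothetical PDF values of the same features for three database entries.
candidates = {
    "A": [0.20, 0.35, 0.10],
    "B": [0.25, 0.30, 0.10],
    "C": [0.60, 0.05, 0.40],
}

scores = {cid: psnr(mse(witness_pdf, pdf)) for cid, pdf in candidates.items()}
# The candidate with the highest PSNR is treated as the most likely match.
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked, scores)
```

Ranking by PSNR is equivalent to ranking by lowest MSE; the PSNR form simply puts the scores on a logarithmic scale where an exact match takes the capped maximum.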