Novel Data Extraction Language for Structured Log Analysis
P.W.D.C. Jayathilake
99X Technology, Sri Lanka.

ABSTRACT

This paper presents the implementation of a new log data extraction language. The theoretical formulation of the language schema was presented in our previous work (Jayathilake, 2011). In designing the new language, we focus on specific problems encountered in automating log analysis. Emphasis is put on the structured nature of log files. A brief review of existing data format description mechanisms is also provided. After describing the implementation of the new language, we compare it with another popular data description language to highlight its unique capabilities.

KEYWORDS

Log data extraction, Data format description language, Log analysis, Declarative language

INTRODUCTION

Software log files contain information pertaining to most user and system actions within an organization. Regulations such as PCI DSS, FISMA and HIPAA, and frameworks like ISO and COBIT, emphasize standards on logging. If utilized properly, log data can generate considerable value in various facets of a business. Log analysis has proven its potential in intrusion detection, unintended user activity identification, system compliance testing, software troubleshooting, software monitoring, performance benchmarking and functional testing.

Despite its benefits, log analysis incurs a huge cost if the entire range of its phases is performed manually. The reasons are twofold: log analysis requires expertise, and significant time is consumed in digging through large volumes of data and making inferences. Commercial tools on the market deliver a range of functionality for automating certain stages of the analysis process.
These include log data collection from different sources, data indexing, searching, automatic identification of common log file constructs such as timestamps and IP addresses, customizable dashboards for data visualization, highlighting of anomalies, automatic compliance checks, etc. All existing tools treat log data as unstructured information. Though the high entropy of log information justifies this practice, it imposes numerous limitations on automating log analysis. Lack of contextual correctness, for example, poses many challenges in creating semantics for inferring results automatically.

Jayathilake (2011) published an initial version of a framework that creates a platform for structured log analysis. Its core constituent was a new procedural language, which was designed to be used in every phase of automated log analysis. Though the language proved to be powerful in processing log data, we soon realized that it was ill-suited to log data extraction. Jayathilake (2011) later published a specification for a declarative language for describing the format of any log file. The intention was to make the log file format declaration more readable and to make it easier to pick information of interest from log files. We demonstrated the flexibility of the specification in expressing the formats of different log file types such as line logs, highly structured logs and tabular logs. Furthermore, we verified that the specification is resilient to log file corruption, which is a prominent problem in the domain.
This paper presents an implementation of that specification based on Simple Declarative Language. Simple Declarative Language (SDL) is an easy representation mechanism for data structures (Leuck, 2012). Java and .NET implementations of the language already exist, so a syntax expressed in a compliant format can be parsed easily. We formulate a syntax that facilitates easy expression of all language constructs. Since the syntax is compliant with SDL, we could use an existing SDL parser for lexical analysis. The interpretation stage is implemented according to a new algorithm, which uses recursion extensively. Inconsistent log data are handled through a hierarchical fault tolerance mechanism that allows users to select the level of recovery after a log file corruption is detected. Selective data extraction is supported to enable users to cherry-pick data from huge log files for further analysis. Supporting routines are added to reduce the effort of dealing with common log file constructs such as timestamps, IP addresses, port numbers and error codes. The output of the data extraction process is a tree that embodies the semantic relationships between log entries. High expressiveness, simplicity, a short learning curve, readability and immunity to log file corruption are the strengths that we identify in the language.

EXISTING DATA DESCRIPTION LANGUAGES

This section provides an overview of existing data description languages.

EAST - EAST is an ISO standard data description language developed by the Consultative Committee for Space Data Systems (CCSDS, 2007). It provides a rich mechanism to express data formats completely and unambiguously. Data are regarded as a collection of data entities, and the EAST description is used to interpret and gain access to those entities. Its main design goals are strong data description capabilities, human readability, and computer interpretability.
One prominent problem with EAST is the lack of support for describing file structures where the position of one data entity needs to be determined at run time by examining fields in other entities.

DRB - Data Request Broker is an open source Java application programming interface (GAEL, 2009). It is an expansion on EAST. It can be used for reading, writing and processing heterogeneous data. DRB is a software abstraction layer that developers can use to program applications independently of the way data are encoded within files. It is also possible to perform calculations using XQuery from within the data description, allowing full description of files where the locations of data fields within a file must be calculated from other data fields. However, calculations must be described in XQuery, which can increase complexity and hence reduce human readability.

PADS/ML - This is a domain-specific language designed to improve the productivity of data analysts. It is a functional language for formally specifying the logical and physical structure of data (Mandelbaum et al., 2007). In contrast to other data description languages, PADS/ML provides a platform where the description can also stand as sound documentation of the data. However, it does not offer a satisfactory level of support for describing semantic information.

DFDL - Data Format Description Language is an open standard that arose from the need to represent text and binary data of various formats in a common XML paradigm (OGF, 2010). It also allows data to be taken from an XML model and written out to its native format. By describing a data format with a DFDL description that is accessible to multiple applications, one can provide a common interface to the data, thereby facilitating data interchange. DFDL does not inherently support semantic information but can be used in conjunction with ontologies for this purpose. One drawback is the verbose nature of DFDL, caused by XML metadata, which affects human readability.

HAWK - This is a powerful, flexible language for log file analysis that relies on simple methods. Its basis in pattern-action pairs allows for flexible combination of programs. It supports a range of log file analysis functionality such as filtering, recoding, and counting. The language provides the processing power for analysing the log files (HAWK, 2009).

BFD - The Binary Format Description language is an XML-based language for expressing binary data formats (National Collaboratories, 2003). It is an extension to the eXtensible Scientific Interchange Language (XSIL). A BFD template can be used to extract data from a set of files and put them into XML for further processing.

REQUIREMENT FOR A NEW LOG DATA EXTRACTION LANGUAGE

The above-mentioned languages are mostly generic data format description languages that target a wide range of applications. Log analysis is one niche where those tools can be utilized. However, log analysis, as a separate domain, exhibits unique characteristics and poses specific problems. For example, corrupted data is a prominent challenge facing any attempt to automate the analysis process. Huge volumes of data, inconsistent formats and frequent format changes add further to this. It is also vital to have a data description scheme that results in more human-readable templates than highly verbose XML solutions.

THE NEW LANGUAGE

In order to address these unique needs, we designed a new log data extraction language based on a simple schema. Jayathilake (2011) discussed the theoretical formulation of this language along with case studies on its applications.
In summary, it is based on interpreting a log file as a hierarchy of units termed log entities. Three types of log entities are identified.

1. Type A - This type is a sequence of other log entries defined by the pair ([LE_1, LE_2, ..., LE_N], ERROR_RECOVERY), where each LE_i is a log entry. The sequence must be built in the same order of log entries as specified inside the square brackets in the first element of the pair. ERROR_RECOVERY is a flag that indicates whether the system should try to recover from parse errors for this type of log entry.

2. Type B - This is a sequence of other log entries defined by the 4-tuple ({LE_1, LE_2, ..., LE_N}, MAX, MIN, ERROR_RECOVERY), where each LE_i is a log entry. The sequence can be built from those log entries in any order. Each LE_i can appear in the sequence zero or more times. The list containing the LE_i is termed the candidate list for the sequence. MAX is the maximum number of log entries permitted in the sequence. If its value is -1, there is no limit on the length of the sequence. Similarly, MIN is the minimum number of log entries that should be present in the sequence; -1 indicates that there is no lower bound on the length of the sequence. ERROR_RECOVERY is a flag with the same semantics as in the definition of Type A.

3. Type C - A singleton (k), where k is a fixed sequence of bytes.

The language also provides a mechanism to recover from corruptions in a log file. When a part of the text that does not follow the format in the description is detected, the interpreter can fall back to the next log entry and continue execution without premature termination.

IMPLEMENTATION

We implemented the language syntax in Simple Declarative Language (SDL), which provides infrastructure for describing arbitrary data formats. Below we explain the syntax for each of the three log entry types through examples.

1. Type A

    Line typea Timestamp Process TID Area Category ER=true

This syntax defines a log entry named Line, which is built from a sequence of other log entries: Timestamp, Process, TID, Area, and Category. Error recovery (ER) is set to true.

2. Type B

    Gap typeb Space Tab Max=-1 Min=2 ER=false

This is a definition of a log entry named Gap, which stands for an empty space created by two or more spaces and tabs. Spaces and tabs can occur in any order and quantity. Error recovery (ER) is set to false.

3. Type C

    Char typec a

The Type C log entry Char defined here stands for the character a.

The implementation of the language is shown in Fig. 1. It comprises two main components: the lexical analysis module (parser) and an interpreter module. The parser module processes the given format specification using SDL. This is possible since the new language syntax is compliant with SDL. Log file content is lexically analysed with respect to the pre-processed format specification. After that, the interpreter extracts the log file content and converts it to a proprietary binary format. This data format is ready to be processed by the log data analysis framework presented in Jayathilake (2011).
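To make the three declaration forms above concrete, the following Python sketch turns one declaration line into a plain dictionary. It is only an illustration: the paper's implementation relies on an existing SDL parser, whereas this sketch splits on whitespace exactly as the declarations are printed above. The function name `parse_declaration`, the dictionary keys, and the defaults used when Max, Min or ER are omitted are assumptions made for the example.

```python
def parse_declaration(line):
    """Parse one declaration of the form '<Name> <type> <items...> [attributes]'.

    Hypothetical helper: whitespace splitting stands in for a real SDL parser.
    """
    tokens = line.split()
    if len(tokens) < 3:
        raise ValueError(f"incomplete declaration: {line!r}")
    name, kind = tokens[0], tokens[1].lower()

    # Type C: a fixed character sequence, taken verbatim from the third token.
    if kind == "typec":
        return {"name": name, "kind": kind, "literal": tokens[2]}
    if kind not in ("typea", "typeb"):
        raise ValueError(f"unknown entry type: {tokens[1]!r}")

    # Assumed defaults: no error recovery, and no bounds (-1) for Type B.
    decl = {"name": name, "kind": kind, "children": [], "er": False}
    if kind == "typeb":
        decl.update({"max": -1, "min": -1})

    for token in tokens[2:]:
        key, sep, value = token.partition("=")
        if not sep:
            # A bare word names a child log entry.
            decl["children"].append(token)
        elif key.lower() == "er":
            decl["er"] = value.lower() == "true"
        elif kind == "typeb" and key.lower() in ("max", "min"):
            decl[key.lower()] = int(value)
        else:
            raise ValueError(f"unexpected attribute: {token!r}")
    return decl


line_decl = parse_declaration("Line typea Timestamp Process TID Area Category ER=true")
gap_decl = parse_declaration("Gap typeb Space Tab Max=-1 Min=2 ER=false")
char_decl = parse_declaration("Char typec a")
```

Keeping the parsed form as plain data separates the format description from the interpreter that later walks it, which mirrors the parser/interpreter split described above.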
A recursive algorithm is used to implement the interpreter module.
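The paper does not list the algorithm itself, so the following Python sketch only illustrates how a recursive matcher over the three entry types could look. The class names, the greedy (non-backtracking) treatment of Type B, the per-line matching in `extract`, and the reading of error recovery as "skip the corrupted line and continue" are all assumptions made for the example, not the author's implementation.

```python
from dataclasses import dataclass


@dataclass
class TypeC:
    name: str
    literal: str      # fixed character sequence


@dataclass
class TypeA:
    name: str
    parts: list       # entries that must appear in exactly this order
    er: bool = False


@dataclass
class TypeB:
    name: str
    candidates: list  # entries that may appear in any order
    max: int = -1     # -1 means no upper bound
    min: int = -1     # -1 means no lower bound
    er: bool = False


def match(entity, text, pos=0):
    """Return (tree, new_pos) if entity matches text at pos, else None."""
    if isinstance(entity, TypeC):
        if text.startswith(entity.literal, pos):
            return (entity.name, entity.literal), pos + len(entity.literal)
        return None
    children = []
    if isinstance(entity, TypeA):
        for part in entity.parts:
            result = match(part, text, pos)   # recurse into each child in order
            if result is None:
                return None
            node, pos = result
            children.append(node)
        return (entity.name, children), pos
    # Type B: greedily take whichever candidate matches next.
    while entity.max == -1 or len(children) < entity.max:
        for candidate in entity.candidates:
            result = match(candidate, text, pos)
            if result is not None and result[1] > pos:  # require progress
                node, pos = result
                children.append(node)
                break
        else:
            break  # no candidate matched at this position
    if entity.min != -1 and len(children) < entity.min:
        return None
    return (entity.name, children), pos


def extract(line_entity, lines):
    """Match every line; with ER set, skip corrupted lines instead of aborting."""
    trees, skipped = [], []
    for number, line in enumerate(lines, start=1):
        result = match(line_entity, line)
        if result is not None and result[1] == len(line):
            trees.append(result[0])
        elif getattr(line_entity, "er", False):
            skipped.append(number)
        else:
            raise ValueError(f"line {number} does not follow the format")
    return trees, skipped


gap = TypeB("Gap", [TypeC("Space", " "), TypeC("Tab", "\t")], max=-1, min=2)
entry = TypeA("Line", [TypeC("Key", "id"), gap, TypeC("Char", "a")], er=True)
trees, skipped = extract(entry, ["id  a", "id \ta", "id a", "garbage"])
```

In this usage example the third line has only one space, so Gap falls short of Min=2, and the fourth line does not start with the expected literal. Because ER is set on the line entry, both are recorded in `skipped` and the two well-formed lines are still returned as trees.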
[Figure 1: Implementation of the new language]

COMPARISON WITH DFDL

In this section we compare the new language with DFDL, another promising technology for expressing file formats. Like any other XML-based schema, DFDL incurs significant metadata overhead. Log entry formats expressed in DFDL are much more verbose than their expressions in our schema.

[Figure 2: Comparison between our schema and DFDL. Our schema (example: Line typea Timestamp Process TID Area Category ER=true): less verbose; resilient to log corruptions; optimized for log file formats. DFDL: a lot of metadata; unable to handle log corruptions; offers a powerful type system.]
Fig. 2 compares the expressions of one log entry in our schema and in DFDL. The new schema results in more compact and readable format expressions. Since the new language is specifically designed for log file formats, in contrast to DFDL, which is a generic format expression mechanism, the new language offers a few other benefits too. One prominent advantage is its ability to deal with log corruptions. On the other hand, DFDL provides a rich type system, so most common data types are natively identified.

CONCLUSION

The new log data extraction language can express a wide range of log file formats while offering a simple, human-readable syntax. Its hierarchical interpretation of log entries enables it to capture difficult log formats containing many peculiarities that many other existing data format description languages fail on. The schema has been shown to work with many industrial log file types such as line logs, message logs, XML logs and tabular logs. A prominent feature of the new language is its ability to deal with inconsistencies and corruptions in log files. It strengthens the automated log analysis mechanism with the ability to use as much correct data as possible. Simple Declarative Language provided a useful platform for implementing the language syntax.

The current implementation of the language supports only text log files, which constitutes a limitation. It can be enhanced to handle binary logs too. A further improvement would be the capability to handle log file formats where the location of one log entry must be read dynamically from another log entry.

REFERENCES

Jayathilake, D. (2011) A mind map based framework for automated software log file analysis, Proceedings of the International Conference on Software and Computer Applications (ICSCA 2011), pp

Jayathilake, D. (2011) A novel mind map based approach for log data extraction, Proceedings of the 6th IEEE International Conference on Industrial and Information Systems (ICIIS 2011), pp

Andrews, J. H. (1988) Testing using log file analysis: tools, methods and issues, Proceedings of the 13th IEEE International Conference on Automated Software Engineering, pp

Valdman, J. (2001) Log file analysis, Department of Computer Science and Engineering (FAV UWB), Tech. Rep. DCSE/TR

Consultative Committee for Space Data Systems, The Data Description language EAST Specification. [pdf]. Available at: < p-2.1/ccsds p-2.1.pdf> [Accessed 05 May 2012].

GAEL Consultant, Data Request Broker. [online] Available at: < [Accessed 05 May 2012].

Mandelbaum, Y., Fisher, K., Walker, D., Fernandez, M. and Gleyzer, A. (2007) PADS/ML: A Functional Data Description Language, Proceedings of the 34th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages (POPL 07), pp

OGF Data Format Description Language Working Group, Data Format Description Language (DFDL) v1.0 Core Specification. [pdf]. Available at: < [Accessed 05 May 2012].

HAWK Network Defense, The Future: Dynamic Log Analysis. [pdf]. Available at: < Whitepaper4.pdf> [Accessed 05 May 2012].

National Collaboratories, Binary Format Description (BFD) Language. [online] Available at: < [Accessed 05 May 2012].

Daniel Leuck, Simple Declarative Language. [online] Available at: < [Accessed 05 May 2012].
A Generic, Light Weight, Pluggable Data Transformation and Visualization Tool for XML to XML Transformation Rahil A. Khera 1, P. S. Game 2 1,2 Pune Institute of Computer Technology, Affiliated to SPPU,
More informationCHAPTER 1 INTRODUCTION
CHAPTER 1 INTRODUCTION 1.1 Background The command over cloud computing infrastructure is increasing with the growing demands of IT infrastructure during the changed business scenario of the 21 st Century.
More informationTest Data Management Concepts
Test Data Management Concepts BIZDATAX IS AN EKOBIT BRAND Executive Summary Test Data Management (TDM), as a part of the quality assurance (QA) process is more than ever in the focus among IT organizations
More informationProtecting Business Information With A SharePoint Data Governance Model. TITUS White Paper
Protecting Business Information With A SharePoint Data Governance Model TITUS White Paper Information in this document is subject to change without notice. Complying with all applicable copyright laws
More informationSemantic Search in Portals using Ontologies
Semantic Search in Portals using Ontologies Wallace Anacleto Pinheiro Ana Maria de C. Moura Military Institute of Engineering - IME/RJ Department of Computer Engineering - Rio de Janeiro - Brazil [awallace,anamoura]@de9.ime.eb.br
More informationImproving the visualisation of statistics: The use of SDMX as input for dynamic charts on the ECB website
Improving the visualisation of statistics: The use of SDMX as input for dynamic charts on the ECB website Xavier Sosnovsky, Gérard Salou European Central Bank Abstract The ECB has introduced a number of
More informationINTRUSION DETECTION ALARM CORRELATION: A SURVEY
INTRUSION DETECTION ALARM CORRELATION: A SURVEY Urko Zurutuza, Roberto Uribeetxeberria Computer Science Department, Mondragon University Mondragon, Gipuzkoa, (Spain) {uzurutuza,ruribeetxeberria}@eps.mondragon.edu
More informationLumousoft Visual Programming Language and its IDE
Lumousoft Visual Programming Language and its IDE Xianliang Lu Lumousoft Inc. Waterloo Ontario Canada Abstract - This paper presents a new high-level graphical programming language and its IDE (Integration
More informationTransparency and Efficiency in Grid Computing for Big Data
Transparency and Efficiency in Grid Computing for Big Data Paul L. Bergstein Dept. of Computer and Information Science University of Massachusetts Dartmouth Dartmouth, MA pbergstein@umassd.edu Abstract
More informationScalable Extraction, Aggregation, and Response to Network Intelligence
Scalable Extraction, Aggregation, and Response to Network Intelligence Agenda Explain the two major limitations of using Netflow for Network Monitoring Scalability and Visibility How to resolve these issues
More informationEFFICIENT DATA PRE-PROCESSING FOR DATA MINING
EFFICIENT DATA PRE-PROCESSING FOR DATA MINING USING NEURAL NETWORKS JothiKumar.R 1, Sivabalan.R.V 2 1 Research scholar, Noorul Islam University, Nagercoil, India Assistant Professor, Adhiparasakthi College
More informationMALLET-Privacy Preserving Influencer Mining in Social Media Networks via Hypergraph
MALLET-Privacy Preserving Influencer Mining in Social Media Networks via Hypergraph Janani K 1, Narmatha S 2 Assistant Professor, Department of Computer Science and Engineering, Sri Shakthi Institute of
More informationOn the general structure of ontologies of instructional models
On the general structure of ontologies of instructional models Miguel-Angel Sicilia Information Engineering Research Unit Computer Science Dept., University of Alcalá Ctra. Barcelona km. 33.6 28871 Alcalá
More informationData Mining & Data Stream Mining Open Source Tools
Data Mining & Data Stream Mining Open Source Tools Darshana Parikh, Priyanka Tirkha Student M.Tech, Dept. of CSE, Sri Balaji College Of Engg. & Tech, Jaipur, Rajasthan, India Assistant Professor, Dept.
More informationAutomated Medical Citation Records Creation for Web-Based On-Line Journals
Automated Medical Citation Records Creation for Web-Based On-Line Journals Daniel X. Le, Loc Q. Tran, Joseph Chow Jongwoo Kim, Susan E. Hauser, Chan W. Moon, George R. Thoma National Library of Medicine,
More informationConnections to External File Sources
Connections to External File Sources By using connections to external sources you can significantly speed up the process of getting up and running with M-Files and importing existing data. For instance,
More informationWeb Forensic Evidence of SQL Injection Analysis
International Journal of Science and Engineering Vol.5 No.1(2015):157-162 157 Web Forensic Evidence of SQL Injection Analysis 針 對 SQL Injection 攻 擊 鑑 識 之 分 析 Chinyang Henry Tseng 1 National Taipei University
More informationDoctor of Philosophy in Computer Science
Doctor of Philosophy in Computer Science Background/Rationale The program aims to develop computer scientists who are armed with methods, tools and techniques from both theoretical and systems aspects
More informationHadoop Technology for Flow Analysis of the Internet Traffic
Hadoop Technology for Flow Analysis of the Internet Traffic Rakshitha Kiran P PG Scholar, Dept. of C.S, Shree Devi Institute of Technology, Mangalore, Karnataka, India ABSTRACT: Flow analysis of the internet
More informationRecovering Business Rules from Legacy Source Code for System Modernization
Recovering Business Rules from Legacy Source Code for System Modernization Erik Putrycz, Ph.D. Anatol W. Kark Software Engineering Group National Research Council, Canada Introduction Legacy software 000009*
More informationSEMANTIC WEB BASED INFERENCE MODEL FOR LARGE SCALE ONTOLOGIES FROM BIG DATA
SEMANTIC WEB BASED INFERENCE MODEL FOR LARGE SCALE ONTOLOGIES FROM BIG DATA J.RAVI RAJESH PG Scholar Rajalakshmi engineering college Thandalam, Chennai. ravirajesh.j.2013.mecse@rajalakshmi.edu.in Mrs.
More informationBASI DI DATI II 2 modulo Parte II: XML e namespaces. Prof. Riccardo Torlone Università Roma Tre
BASI DI DATI II 2 modulo Parte II: XML e namespaces Prof. Riccardo Torlone Università Roma Tre Outline What is XML, in particular in relation to HTML The XML data model and its textual representation The
More information131-1. Adding New Level in KDD to Make the Web Usage Mining More Efficient. Abstract. 1. Introduction [1]. 1/10
1/10 131-1 Adding New Level in KDD to Make the Web Usage Mining More Efficient Mohammad Ala a AL_Hamami PHD Student, Lecturer m_ah_1@yahoocom Soukaena Hassan Hashem PHD Student, Lecturer soukaena_hassan@yahoocom
More informationML for the Working Programmer
ML for the Working Programmer 2nd edition Lawrence C. Paulson University of Cambridge CAMBRIDGE UNIVERSITY PRESS CONTENTS Preface to the Second Edition Preface xiii xv 1 Standard ML 1 Functional Programming
More informationParsing Technology and its role in Legacy Modernization. A Metaware White Paper
Parsing Technology and its role in Legacy Modernization A Metaware White Paper 1 INTRODUCTION In the two last decades there has been an explosion of interest in software tools that can automate key tasks
More informationDLDB: Extending Relational Databases to Support Semantic Web Queries
DLDB: Extending Relational Databases to Support Semantic Web Queries Zhengxiang Pan (Lehigh University, USA zhp2@cse.lehigh.edu) Jeff Heflin (Lehigh University, USA heflin@cse.lehigh.edu) Abstract: We
More informationPreservation Handbook
Preservation Handbook [Binary Text / Word Processor Documents] Author Rowan Wilson and Martin Wynne Version Draft V3 Date 22 / 08 / 05 Change History Revised by MW 22.8.05; 2.12.05; 7.3.06 Page 1 of 7
More informationPTK Forensics. Dario Forte, Founder and Ceo DFLabs. The Sleuth Kit and Open Source Digital Forensics Conference
PTK Forensics Dario Forte, Founder and Ceo DFLabs The Sleuth Kit and Open Source Digital Forensics Conference What PTK is about PTK forensics is a computer forensic framework based on command line tools
More informationCHAPTER 1 INTRODUCTION
1 CHAPTER 1 INTRODUCTION Exploration is a process of discovery. In the database exploration process, an analyst executes a sequence of transformations over a collection of data structures to discover useful
More informationTool Support for Model Checking of Web application designs *
Tool Support for Model Checking of Web application designs * Marco Brambilla 1, Jordi Cabot 2 and Nathalie Moreno 3 1 Dipartimento di Elettronica e Informazione, Politecnico di Milano Piazza L. Da Vinci,
More informationData Integration Hub for a Hybrid Paper Search
Data Integration Hub for a Hybrid Paper Search Jungkee Kim 1,2, Geoffrey Fox 2, and Seong-Joon Yoo 3 1 Department of Computer Science, Florida State University, Tallahassee FL 32306, U.S.A., jungkkim@cs.fsu.edu,
More informationInformation Visualization of Attributed Relational Data
Information Visualization of Attributed Relational Data Mao Lin Huang Department of Computer Systems Faculty of Information Technology University of Technology, Sydney PO Box 123 Broadway, NSW 2007 Australia
More informationNETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCE
NETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCE Anjali P P 1 and Binu A 2 1 Department of Information Technology, Rajagiri School of Engineering and Technology, Kochi. M G University, Kerala
More informationA Review of Contemporary Data Quality Issues in Data Warehouse ETL Environment
DOI: 10.15415/jotitt.2014.22021 A Review of Contemporary Data Quality Issues in Data Warehouse ETL Environment Rupali Gill 1, Jaiteg Singh 2 1 Assistant Professor, School of Computer Sciences, 2 Associate
More informationSnapshots in Hadoop Distributed File System
Snapshots in Hadoop Distributed File System Sameer Agarwal UC Berkeley Dhruba Borthakur Facebook Inc. Ion Stoica UC Berkeley Abstract The ability to take snapshots is an essential functionality of any
More informationBPCMont: Business Process Change Management Ontology
BPCMont: Business Process Change Management Ontology Muhammad Fahad DISP Lab (http://www.disp-lab.fr/), Université Lumiere Lyon 2, France muhammad.fahad@univ-lyon2.fr Abstract Change management for evolving
More information