Curriculum Vitae et Studiorum




Giovanni Costa
March 1, 2007

1 Personal Information

Birth date and birth place: April 14, 1976, Milano (MI), Italy.
Nationality: Italian.
Address: via Messina 42, 89026 San Ferdinando (RC), Italy.
Office telephone: +39 0984/494618
Personal telephone: +39 393/3197290
Home page: http://www.icar.cnr.it/costa
E-mail: costa@icar.cnr.it; gcosta@deis.unical.it

2 Education

On February 26, 2007, he received a Ph.D. in Computer Science from Università della Calabria, Cosenza, Italy. Title of the thesis: Knowledge Management and Extraction in XML Data. Advisors: Prof. Domenico Saccà, Dr. Giuseppe Manco.

On May 28, 2003, he received a Laurea degree cum laude in Computer Science Engineering from Università della Calabria, Cosenza, Italy. Title of the thesis: Compressione di dati XML per l'ottimizzazione delle interrogazioni (XML data compression for query optimization). Advisors: Prof. Domenico Saccà, Ing. Angela Bonifati, Ing. Andrea Pugliese.

3 Research activities

He was a Ph.D. student at D.E.I.S. (Electronic, Informatics and Systems Department, Università della Calabria) until October 2006. He is currently a post-doc Research Fellow at ICAR-CNR (National Research Council of Italy), Rende (CS).

3.1 Research topics

Querying of XML data in the compressed domain. XML documents have an inherently textual nature, due to redundant tags and to their PCDATA content. Therefore, they lend themselves naturally to compression. Once the compressed documents are produced, however, one would like to keep querying them in compressed form as far as possible (lazy decompression). Processing queries in the compressed domain has several advantages: first, in a traditional query setting, accessing smaller chunks of data may require fewer disk I/Os and reduce query processing time; second, the memory and computation effort needed to process compressed data can be dramatically lower than for uncompressed data, so that even low-battery mobile devices can afford it; third, obtaining compressed query results saves network bandwidth when the results are sent to a remote location. The XQueC system compresses XML data and queries it, as far as possible, in its compressed form, covering all real-life, complex classes of queries.
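
As a rough illustration of evaluating a predicate in the compressed domain with lazy decompression, the following Python sketch may help. It assumes a toy order-preserving dictionary code over string values; this is a stand-in chosen for brevity, not the compression scheme actually used by XQueC, and all function names are hypothetical.

```python
from bisect import bisect_left


def build_order_preserving_dictionary(values):
    """Assign integer codes so that code order matches string order (toy scheme)."""
    distinct = sorted(set(values))
    codes = {value: code for code, value in enumerate(distinct)}
    return codes, distinct


def compress(values, codes):
    """Replace every value by its integer code."""
    return [codes[value] for value in values]


def select_less_than(compressed, distinct, constant):
    """Return positions whose original value is < constant, comparing codes only."""
    # Number of dictionary entries strictly smaller than the constant.
    bound = bisect_left(distinct, constant)
    return [pos for pos, code in enumerate(compressed) if code < bound]


def decompress(positions, compressed, distinct):
    """Lazy decompression: decode only the positions selected by the query."""
    return [distinct[compressed[pos]] for pos in positions]


# Hypothetical zero-padded price values taken from an XML document.
prices = ["120", "035", "990", "120", "450"]
codes, distinct = build_order_preserving_dictionary(prices)
compressed = compress(prices, codes)
hits = select_less_than(compressed, distinct, "200")
result = decompress(hits, compressed, distinct)
```

Because the codes preserve the ordering of the original strings, the inequality is decided on integers alone, and only the matching entries are ever decoded.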

The XQueC system adheres to the following approach:

(i) XQueC takes advantage of the XMill principle of compressing data and structure separately, in order to query compressed data efficiently.
(ii) It adopts a simple storage model suitable for compressed XML, together with a set of access support structures that allow many evaluation alternatives for complex XQuery queries. Several storage methods are possible; the one adopted is a simple choice intended as a proof of concept.
(iii) XQueC seamlessly extends a simple algebra for evaluating XML queries to include compression and decompression. This algebra is exploited by a comprehensive cost-based optimizer, able to devise query evaluation methods that freely mix regular operators and compression-relevant ones.
(iv) It exploits an adaptation of the order-preserving string compression algorithm ALM in order to evaluate, in the compressed domain, the class of queries involving inequality comparisons.

Clustering of XML documents. The increasing relevance of the Web as a means for sharing information has made traditional approaches to information handling ineffective. Indeed, they are mainly devoted to the management of highly structured information, such as relational databases, whereas Web data are semistructured and encoded in different formats. In particular, XML is touted as the driving force for exchanging data on the Web, since it offers several advantages over other data models. Examples are the flexibility to design ad hoc markup languages for representing and exchanging semistructured data within any application context, and the support of document type definitions (DTDs) and XML Schema, which make it possible to specify both the structure and the content of documents. As the heterogeneity of XML sources increases, organizing XML documents according to their structural features has become a challenging need. XRep is a novel methodology for clustering XML documents by structure, based on the notion of XML cluster representative. A cluster representative is a prototype XML document subsuming the most relevant structural features of the documents within a cluster.
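
The notion of a representative that subsumes the structure of a cluster's documents can be made concrete with a small Python sketch. It assumes documents reduced to tag trees of the form (tag, [children]) and a naive overlap that keeps the root-to-node tag paths shared by a minimum fraction of the documents. The min_support parameter and the function names are hypothetical, and the sketch does not reproduce the tree matching and merging procedure of XRep.

```python
from collections import Counter


def tag_paths(tree, prefix=()):
    """Yield every root-to-node tag path of a tree given as (tag, [children])."""
    tag, children = tree
    path = prefix + (tag,)
    yield path
    for child in children:
        yield from tag_paths(child, path)


def representative(documents, min_support=0.5):
    """Overlap the documents and keep paths present in at least min_support of them."""
    counts = Counter()
    for doc in documents:
        counts.update(set(tag_paths(doc)))  # count each path once per document
    threshold = min_support * len(documents)
    return sorted(path for path, count in counts.items() if count >= threshold)


# Three hypothetical documents belonging to the same cluster.
doc_a = ("book", [("title", []), ("author", [("name", [])])])
doc_b = ("book", [("title", []), ("author", [("name", []), ("email", [])])])
doc_c = ("book", [("title", []), ("year", [])])

rep = representative([doc_a, doc_b, doc_c], min_support=0.6)
```

Here the representative keeps book, book/title, book/author and book/author/name, while the rarely occurring email and year elements are dropped, so a new document can be compared against one compact summary rather than against every member of the cluster.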

The intuition at the core of the approach is that a suitable cluster prototype can be obtained by properly overlapping all the documents within a given cluster. The resulting tree has the main advantage of retaining the specifics of the enclosed documents while guaranteeing a compact representation. This makes the proposed notion of cluster representative particularly useful in the envisaged applications: as a summary of the cluster, a representative highlights the subparts common to the enclosed documents, and avoids expensive comparisons with the individual documents in the cluster.

Trajectory clustering. The discovery of frequently used trajectory segments can be useful in the context of Intelligent Transportation Systems, as well as for improving the quality of network services. AT-DCS (Automatic Top-Down Clustering of Sequences) is a new approach to clustering trajectories that scales to large volumes of such data, in terms of both effectiveness and efficiency. The main idea of the approach is borrowed from decision-tree learning, where traditional classification algorithms (such as C4.5 or CART) follow a top-down approach, recursively partitioning the available data on the basis of the gain in purity of the subsets w.r.t. the original dataset. There, purity refers to the frequencies of class labels: the more unbalanced the label frequencies within a partition, the purer the partition. These approaches have proven to be both efficient and effective. AT-DCS implements a similar strategy for clustering high-dimensional categorical data. Given an initial dataset, it recursively searches for a partition that improves the overall purity. The algorithm is parametric w.r.t. the notion of purity, which makes it possible to adopt the quality criterion that best suits the specific clustering task. The adopted definition of purity is directly related to the frequency of the attribute values within the partition. Intuitively, the more some attribute values predominate over the others, the purer the partition.

3.2 Project Activities

2003-2004. Technologies and Services for Enhanced Content Delivery (ECD). The project studies and proposes advanced technologies for finding and organizing the contents of data available on the Web. In this setting, one goal was to develop a prototype able to cluster data coming from transactional databases (e.g., sessions of Web users, or documents represented as bags of words). The prototype was applied to the post-processing of the results of search engine queries.

2004-2005. Grid.it - WP6 - Knowledge services for intensive data analysis and intelligent query answering. This project, coordinated by the National Research Council (CNR), is defined within the scientific and technological context of new ICT platforms and of large-scale distributed systems. The goal is to study and experiment with systems and software tools that are innovative at all levels, and to demonstrate their capabilities through some specific applications.
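
The top-down, purity-driven partitioning described for AT-DCS under Research topics (Section 3.1) can be sketched in Python as follows. The sketch assumes records given as sets of categorical items, binary splits on the presence of a single item, and a purity measure defined as the mean squared relative frequency of the items in a partition. These are illustrative assumptions with hypothetical function names, not the definitions adopted by AT-DCS.

```python
def purity(records):
    """Mean squared relative frequency of the items occurring in the records."""
    items = set().union(*records)
    if not items:
        return 1.0
    n = len(records)
    total = sum((sum(item in rec for rec in records) / n) ** 2 for item in items)
    return total / len(items)


def top_down_clusters(records, min_gain=1e-9):
    """Recursively split the records while a split improves the weighted purity."""
    base = purity(records)
    best = None
    for item in sorted(set().union(*records)):
        left = [rec for rec in records if item in rec]
        right = [rec for rec in records if item not in rec]
        if not left or not right:
            continue  # the item does not separate the records
        score = (len(left) * purity(left) + len(right) * purity(right)) / len(records)
        if best is None or score > best[0]:
            best = (score, left, right)
    if best is None or best[0] - base <= min_gain:
        return [records]  # no split improves purity: this partition is a cluster
    return top_down_clusters(best[1], min_gain) + top_down_clusters(best[2], min_gain)


# Four hypothetical Web sessions, each a set of visited page categories.
sessions = [
    frozenset({"home", "news"}),
    frozenset({"shop", "cart"}),
    frozenset({"home", "news"}),
    frozenset({"shop", "cart"}),
]
clusters = top_down_clusters(sessions)
```

Since the purity function is passed around only through one helper, swapping in a different quality criterion requires changing that helper alone, which mirrors the parametric role of purity in the description above.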

3.3 Reviewing Activities

He serves as a reviewer for the following national and international conferences:

International Conference on Data Mining (ICDM).
International Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD).
International Symposium on Applied Computing (SAC).
Convegno Nazionale su Sistemi Evoluti per Basi di Dati (SEBD).
IASTED International Conference on Databases and Applications (DBA).

4 Teaching activities

4.1 Academic Courses

Since September 2003, he has been involved in teaching activities. In particular:

A.Y. 2006-2007
Teacher for the course Fundamentals of Computer Science, Faculty of Political Science, Università della Calabria.
Teaching assistant for the course Computer Architecture, Faculty
Teaching assistant for the course Network Operating Systems, Faculty

A.Y. 2005-2006
Teaching assistant for the course Fundamentals of Computer Science, Faculty
Teaching assistant for the course Computer Architecture, Faculty
Teaching assistant for the course Network Operating Systems, Faculty

A.Y. 2004-2005
Teaching assistant for the course Fundamentals of Computer Science, Faculty
Teaching assistant for the course Computer Architecture, Faculty
Teaching assistant for the course Network Operating Systems, Faculty
Teaching assistant for the course Introduction to Computer Science, Faculty

A.Y. 2003-2004
Teaching assistant for the course Computer Architecture, Faculty

4.2 Master and Training courses

Since September 2003, he has been involved in teaching activities for the following master and training courses:

May 2005. Teacher for the module Business Intelligence: Analisi dei dati finalizzata al marketing (Business Intelligence: data analysis for marketing) of the training course La gestione delle funzioni aziendali nell'era dell'e-business (Managing business functions in the e-business era), for the project M.ENT.E - Management of integrated enterprise, organized by Sviluppo Italia Calabria.

From December 2004 to January 2005. Teacher for the module Programmazione: Architetture e Sistemi operativi (Programming: Architectures and Operating Systems) of the training course La gestione delle funzioni aziendali nell'era dell'e-business, for the project M.ENT.E - Management of integrated enterprise, organized by Sviluppo Italia Calabria.

5 Scientific papers

5.1 Published papers

G. Costa, F. Folino, A. Locane, G. Manco, R. Ortale. Data Mining for Effective Risk Analysis in a Bank Intelligence Scenario. In Proceedings of ICDE Workshop on Data Mining and Business Intelligence (DMBI 2007), Istanbul, Turkey, April 2007 (to appear).

G. Costa, A. D'Atri, G. Manco, R. Ortale, D. Saccà, S. Za. Logistic management in a mobile environment: an approach based on trajectory mining. In Proceedings of IEEE Workshop on Mobile Communications and Learning (MCL 2007), Sainte-Luce, Martinique, April 22-28, 2007 (to appear).

A. Arion, A. Bonifati, G. Costa, S. D'Aguanno, I. Manolescu, A. Pugliese. XQueC: pushing queries to compressed XML data. In Proceedings of International Conference on Very Large Data Bases (VLDB), Berlin, Germany, 2003. ISBN 0-12-722442-4, pp. 1065-1068, Morgan Kaufmann, San Francisco, USA, 2003.

A. Arion, A. Bonifati, G. Costa, S. D'Aguanno, I. Manolescu, A. Pugliese. Efficient query evaluation over compressed XML data. In Proceedings of International Conference on Extending Database Technology (EDBT), 2004. LNCS 2992, pp. 200-218, Springer-Verlag, Berlin, Germany, 2004.

A. Arion, A. Bonifati, G. Costa, S. D'Aguanno, I. Manolescu, A. Pugliese. XQueC: pushing queries to compressed XML data. Journées Bases de Données Avancées (BDA), Lyon, France, 2003.

G. Costa, G. Manco, R. Ortale, A. Tagarelli. Clustering of XML Documents by Structure based on Tree Matching and Merging. Convegno Nazionale su Sistemi Evoluti per Basi di Dati (SEBD), 2004, pp. 314-325.

G. Costa, G. Manco, R. Ortale, A. Tagarelli. A Tree-based Approach to Clustering XML Documents by Structure. In Proceedings of International Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD), 2004, pp. 137-148.

5.2 Technical reports

G. Costa, G. Manco, R. Ortale, A. Tagarelli. A Tree-based Approach to Clustering XML Documents by Structure. Istituto di calcolo e reti ad alte prestazioni (ICAR-CNR), technical report n. 02, 2004.

In accordance with the Italian law 675/96 and with D. Lgs. n. 196 approved June 30th 2003, I hereby authorize the use of my personal and professional details contained in this curriculum vitae.

Rende, March 1, 2007
Giovanni Costa