A COMBINED TEXT MINING METHOD TO IMPROVE DOCUMENT MANAGEMENT IN CONSTRUCTION PROJECTS
|
|
|
- Clementine Davis
- 10 years ago
- Views:
Transcription
1 A COMBINED TEXT MINING METHOD TO IMPROVE DOCUMENT MANAGEMENT IN CONSTRUCTION PROJECTS Caldas, Carlos H. 1 and Soibelman, L. 2 ABSTRACT Information is an important element of project delivery processes. Throughout the project life-cycle, information is needed for the planning, implementation, control, and analysis of project activities. Moreover, the use of communications and information technologies can support the project delivery process by helping to make value adding activities more efficient as well as to eliminate non-value adding activities through enhanced project controls. In the last few decades, the architecture, engineering, and construction (AEC) industry has experienced an increased availability of electronic project data due to new information technologies. Evolving project requirements also led to more complex projects that generate large amounts of data. These data are usually generated and stored in different data types such as relational databases, electronic text documents, CAD drawings, pictures, video, and audio, among others. These factors justify the importance of efficient and effective project data and information management techniques. A large percentage of AEC information is stored on text documents, including contracts, specifications, meeting minutes, change orders, field reports, and requests for information. This emphasizes the need for the development of methods for improving the organization, search, and retrieval of this type of data in project information management systems. This paper describes a study that aimed to formalize a method to address these issues. The proposed method combines the vector space model for information retrieval and text mining algorithms. Experiments conducted using document collections from several construction projects demonstrated the efficiency of the proposed approach. KEYWORDS text mining, document management, information systems, project management, information retrieval. 1 Assistant Professor. Department of Civil Engineering. University of Texas at Austin. 1 University Station C1752. Austin, TX [email protected] 2 Associate Professor. Department of Civil and Environmental Engineering. Carnegie Mellon University. Porter Hall 118N, Pittsburgh, PA [email protected] Page 2912
2 INTRODUCTION Several studies emphasized the importance of information integration methodologies to improve organization and access in inter-organizational construction information management systems (Sanvido and Medeiros, 1990; Reinschmidt et al., 1991; Turk et al., 1994; Fisher and Kunz, 1995; Rezgui et al., 1996; Eastman, 1999; Teicholz, 1999; Zhu et al., 2001; Froese, 2003) The majority of the architecture, engineering, construction, and facilities management (AEC/FM) information integration initiatives focus on structured data. Structured data is defined here as data that have a database like structure, usually in information systems that use some form of database in the background. However, a large percentage of the construction data is exchanged using text-based documents such as contracts, change orders, requests for information, among others. These documents contain valuable information for project planning, implementation, control, and analysis. This large number of documents creates challenges for project information management. Therefore, the management of the information contained in these types of documents becomes crucial to construction management. This paper presents a research study that investigated methodologies to provide semiautomatic support for project document integration in model-based information systems. The main goals of this study were to improve the organization and access to large document collections in project management information systems and to promote the integration of text documents in model-based information systems. A text information integration methodology was proposed, implemented, verified, and validated. The methodology is composed of three main steps: classification, retrieval and ranking, and association. This paper summarizes these steps, introduces a prototype software system that implements the proposed methodology, and presents an overview of the research experiments and results. Details of the proposed approach are included in Caldas and Soibelman (2005). TEXT INFORMATION INTEGRATION METHODOLOGY The formalization of the proposed methodology required the identification of the challenges associated with text information integration. For this purpose, construction databases were analyzed and feedback from industry practitioners was obtained regarding current approaches and technologies used for project document management, including project websites, document management systems, and project contract management systems. A text information integration methodology (TIIM) was then formalized and implemented. The methodology was then verified and validated using data from existing projects. The steps of the TIIM methodology will be briefly presented in the following sections. Classification The first step of the TIIM model is automated project document classification. This step is Page 2913
3 based on pattern classification algorithms. However, available classification algorithms cannot be applied directly. Several steps need to be accomplished beforehand. Decisions made in these preprocessing steps affect the results. These factors justified and motivated the development of an automated document classification process to implement the first step of the TIIM method (Caldas and Soibelman, 2003a). Experiments using the proposed process demonstrated that factors and parameters such as the choice of the classification algorithms and index weighting methods, as well as the use of dimensionality reduction techniques, affect the final classification results. Since each construction document can belong to more than one class, the classification process was designed to handle multiple binary classifications. In this case, each document is compared with each class. For each class, a binary decision is made in order to define whether or not the document is related with that particular class. This process is repeated to all classes defined for the project being considered. Pattern classification algorithms are then used to create a classification model for each particular class by using a set of training documents that have previously been classified manually by a domain expert. This approach relies on the existence of previously classified documents. The main components of the proposed document classification process are detailed in Caldas and Soibelman (2003a) and include: data collection, data conversion, dimensionality reduction, data preparation, data transformation, learning, and document classification. Retrieval and Ranking The proposed approach the retrieval and ranking of project documents combines the vector space model for information retrieval and pattern classification techniques to develop a specialized information retrieval and ranking system for the AEC/FM domain. In this step, the concept of an AEC/FM domain-specific information retrieval and ranking system was formalized and a prototype system was implemented to prove this concept. The vector space model (Salton and Lesk, 1968) was selected for document representation because the resulting model could be uniformly applied both to the classification and the retrieval and ranking steps of the proposed TIIM model. As a starting point, a query vector is constructed using terms extracted from the project model related to a selected model object. This vector, as well as the classification notations of the classifications associations from the selected object, is used as input for information retrieval. A classification-based retrieval and ranking algorithm was developed to implement this step of the model. In order to apply the classification-based retrieval and ranking algorithm, all project documents must be classified using the process described previously and all document vectors need to be normalized. Page 2914
4 The documents that match the model object s class are identified and selected. Then, the class centroid vector of all document vectors belonging to the object s class is calculated. The coordinates of the class centroid vector are calculated as the average of the coordinates of each document vector that belongs to the class under consideration. A term vector is created based on terms extracted from the object s description. This vector has coordinate values equal to one for the coordinates that correspond to the object s terms and coordinate values equal to zero for all other coordinates. A query vector is then calculated based on the term vector and the class centroid vector. This query vector is the representation of the project model object in the multidimensional space composed of project document vectors. Since the class centroid vector, the term vector, and the query vector are in the same multidimensional space of the project document collection vector space model, these vectors have the same number of coordinates. The identification of documents that are related to the selected object is based on the similarity between the query vector and each of the document vectors from the project document collection. This similarity is calculated using the cosine between two vectors. The similarity measure is calculated for all documents in the collection. Results are ranked and the documents whose similarities measurements are above a defined threshold are retrieved. Association Association of text documents to project model objects is the last step on the text information integration model. After project documents have been ranked and retrieved, a reference to the relevant documents is added to the project model. It is important to emphasize that the actual document is not added to the model, just a reference to it. This facilitates future updates, saves storage space and computer memory, and helps the integration with existing information systems. This association was implemented in the context of IFC specification (IFC 2x2, 2003). The document association relationship IfcRelAssociateDocument is used to establish links between a document reference IfcDocumentReference or document and any type of project object IfcObject. TIIM MODEL: IMPLEMENTATION A model-based information system prototype based on the TIIM methodology was developed for proof of concept and to conduct the validation experiments. This prototype system, called Unstructured Data Integration System (UDIS), is composed of smaller prototype systems that were developed to implement each of the steps of the TIIM model, including the Vector Space Modeler System (VSMS) and the Construction Document Classification System (CDCS) (Caldas and Soibelman, 2003a). VSMS is used to generate the vector space model representation of the project document collection. CDCS classifies project documents according to items of a construction information classification system. Figure 1 shows the UDIS Page 2915
5 Figure 1: Unstructured Data Integration System TIIM MODEL: RESULTS AND VALIDATION The study involved the analysis of more than 25 construction databases and 30,000 electronic documents. The document types included requests for information (RFI), design reviews, specifications, and change orders, among others. The typical file formats were PDF and MS Word. These databases were from real projects obtained from private contractors and the US Army Corps of Engineers. They encompassed different type of projects, such as commercial building, hotels, dormitories, and military bases, among others. Twenty of these databases were specifically used in the experiments conducted to validate the text information integration methodology. These experiments were performed using the prototype software systems previously described. The main objectives were to compare the proposed integration method with existing solutions for project document management. The technologies used for comparison were project contract management systems, project websites, and information retrieval systems. Popular commercial systems currently being used by the AEC/FM industry were used to conduct this study. Originally these systems do not provide support for the integration of project documents in model-based information systems. Therefore, the comparison was based on the existing retrieval and ranking capabilities of these systems. Documents related to selected IFC objects from each of the twenty construction databases were identified, retrieved, and ranked using the systems under comparison. The performance measures of precision and recall were calculated (Baeza-Yates and Ribeiro- Neto, 1999) and used for comparison. Precision is the percentage of the retrieved documents that were related to the selected IFC object. Recall is the percentage of the related documents that were actually retrieved. Statistical methods were used to analyze the results. Statistical paired tests were selected because the available experimental units were considerably different prior to their random assignment to the groups (systems under comparison) with respect to characteristics that may affect the experimental responses. Page 2916
6 The average recall results were 66.86% for the UDIS system, 44.17% for the project contract management system, 31.06% for the project website, 41.64% for the information retrieval system 1, and 31.70% for the information retrieval system 2. The difference between the results of the system that implements the TIIM model and each of the 4 other approaches was considered significant as indicated by the low values for the level of significance (ρ-value) obtained from the statistical tests. The average precision results were 46.33% for the UDIS, 28.55% for the project contract management system, 29.20% for the project website, 31.45% for the information retrieval system 1, and 29.58% for the information retrieval system 2. Once again, this difference was considered significant according to the results of the tests. The results demonstrated a significant improvement in the capability to identify documents that are related to project model objects. CONCLUSIONS In summary, the contributions from this research project can be divided in two parts. First a methodology for integrating project documents in model-based information systems was developed. The methodology was verified and validated using documents from several existing projects. The research results demonstrated that the methodology promotes a significant improvement in the capability to identify documents that are related to project model objects. Access to project documents is improved because large collections of documents can be analyzed more effectively. Differences in vocabulary are minimized using the classification-based approach and process automation makes the results more consistent. Integrated documents can then be used to support the planning, implementation, control, and analysis of project activities. The second contribution was the development of semiautomatic methods for the classification, retrieval and ranking, and association. The way in which these methods were formalized facilitates their incorporation on existing project management information systems. This is an important requirement for the application of the proposed methodology in real-world projects. The formalisms developed in this research to deal with text documents can be expanded to other types of data generated on AEC/FM projects. Interfaces with other research areas in construction engineering and management such as data collection, data mining, planning, optimization, and simulation can be established. Similarly, new research areas such as integrated project data analysis and proactive project information systems can be explored REFERENCES Baeza-Yates, R. and Ribeiro-Neto, B. Anumba, (1999). Modern Information Retrieval, ACM Press, New York, NY. Caldas, C., Soibelman, L., and Gasser, L. (2005) A methodology for the integration of project documents in model-based information systems. Journal of Computing in Civil Engineering, 19(1), Page 2917
7 Caldas, C. H. and Soibelman, L. (2003a) Automating hierarchical document classification for construction management information systems. Automation in Construction, 12(4), Eastman, C.M. (1999). Building product models: computer environments supporting design and construction. CRC Press, Boca Raton, FL. Fischer, M., and Kunz, J. (1995) The circle: architecture for integrating software. Journal of Computing in Civil Engineering, 9(2), Froese, T. (2003) Future directions for model-based interoperability. Proceedings of the 2003 ASCE Construction Research Congress, Honolulu, HI. IFC 2x2 (2003). IFC2x Edition 2. International Alliance for Interoperability. Reinschmidt, K. F., Griffis, F.H. and Bronner, P.L. (1991). Integration of Engineering, Design, and Construction. Journal of Construction Engineering and Management, 117(4), Rezgui, Y., Brown, Y., Cooper, G., Yip, J., Brandon, P., and Kirkham, J. (1996) An information management model for concurrent construction engineering. Journal of Automation in Construction, 5(4), Salton, G. and Lesk, M. E. (1968) Computer evaluation of indexing and text processing. Journal of the Association for Computing Machinery, 15(1), Sanvido, V. E. and Medeiros, D.J. (1990). Applying computer-integrated manufacturing concepts to construction. J. Constr.Engrg. and Mgmt., ASCE, 116(2), Turk, Ž., Björk, B-C., Johansson, C. and Svensson, K. (1994) Document Management Systems as an Essential Step Towards CIC, Preproceedings CIB W78 Workshop on Computer Integrated Construction, VTT, Helsinki. Zhu, Y., Issa, R. R. and Cox, R. F. (2001). Web-based construction document processing via malleable frame Journal of Computing in Civil Engineering, 15(3), Page 2918
Semantic Concept Based Retrieval of Software Bug Report with Feedback
Semantic Concept Based Retrieval of Software Bug Report with Feedback Tao Zhang, Byungjeong Lee, Hanjoon Kim, Jaeho Lee, Sooyong Kang, and Ilhoon Shin Abstract Mining software bugs provides a way to develop
The Scientific Data Mining Process
Chapter 4 The Scientific Data Mining Process When I use a word, Humpty Dumpty said, in rather a scornful tone, it means just what I choose it to mean neither more nor less. Lewis Carroll [87, p. 214] In
Database Marketing, Business Intelligence and Knowledge Discovery
Database Marketing, Business Intelligence and Knowledge Discovery Note: Using material from Tan / Steinbach / Kumar (2005) Introduction to Data Mining,, Addison Wesley; and Cios / Pedrycz / Swiniarski
MIRACLE at VideoCLEF 2008: Classification of Multilingual Speech Transcripts
MIRACLE at VideoCLEF 2008: Classification of Multilingual Speech Transcripts Julio Villena-Román 1,3, Sara Lana-Serrano 2,3 1 Universidad Carlos III de Madrid 2 Universidad Politécnica de Madrid 3 DAEDALUS
AUTOMATED CONSTRUCTION PLANNING FOR MULTI-STORY BUILDINGS
AUTOMATED CONSTRUCTION PLANNING FOR MULTI-STORY BUILDINGS Tang-Hung Nguyen 1 ABSTRACT This paper outlines a computer-based framework that can assist construction planners and schedulers in automatically
An Information Retrieval using weighted Index Terms in Natural Language document collections
Internet and Information Technology in Modern Organizations: Challenges & Answers 635 An Information Retrieval using weighted Index Terms in Natural Language document collections Ahmed A. A. Radwan, Minia
BIM Extension into Later Stages of Project Life Cycle
BIM Extension into Later Stages of Project Life Cycle Pavan Meadati, Ph.D. Southern Polytechnic State University Marietta, Georgia This paper discusses the process for extending the implementation of the
Development of an Ontology for the Document Management Systems for Construction
Development of an Ontology for the Document Management Systems for Construction Alba Fuertes a,1, Núria Forcada a, Miquel Casals a, Marta Gangolells a and Xavier Roca a a Construction Engineering Department.
CRITICAL SUCCESS FACTORS FOR KNOWLEDGE MANAGEMENT STUDIES IN CONSTRUCTION
CRITICAL SUCCESS FACTORS FOR MANAGEMENT STUDIES IN CONSTRUCTION Yu-Cheng Lin Assistant Professor Department of Civil Engineering National Taipei University of Technology No.1.Chung-Hsiao E. Rd., Sec.3,
Stabilization by Conceptual Duplication in Adaptive Resonance Theory
Stabilization by Conceptual Duplication in Adaptive Resonance Theory Louis Massey Royal Military College of Canada Department of Mathematics and Computer Science PO Box 17000 Station Forces Kingston, Ontario,
SEARCH ENGINE WITH PARALLEL PROCESSING AND INCREMENTAL K-MEANS FOR FAST SEARCH AND RETRIEVAL
SEARCH ENGINE WITH PARALLEL PROCESSING AND INCREMENTAL K-MEANS FOR FAST SEARCH AND RETRIEVAL Krishna Kiran Kattamuri 1 and Rupa Chiramdasu 2 Department of Computer Science Engineering, VVIT, Guntur, India
Clustering Technique in Data Mining for Text Documents
Clustering Technique in Data Mining for Text Documents Ms.J.Sathya Priya Assistant Professor Dept Of Information Technology. Velammal Engineering College. Chennai. Ms.S.Priyadharshini Assistant Professor
Studying the Impact of Text Summarization on Contextual Advertising
Studying the Impact of Text Summarization on Contextual Advertising Giuliano Armano, Alessandro Giuliani and Eloisa Vargiu Dept. of Electric and Electronic Engineering University of Cagliari Cagliari,
Synchronous Building Information Model-Based Collaboration in the Cloud: a Proposed Low Cost IT Platform and a Case Study
Synchronous Building Information Model-Based Collaboration in the Cloud: a Proposed Low Cost IT Platform and a Case Study J. Munkley 1, M. Kassem 2 and N. Dawood 2 1 Niven Architects, Darlington, DL3 7EH,
Categorical Data Visualization and Clustering Using Subjective Factors
Categorical Data Visualization and Clustering Using Subjective Factors Chia-Hui Chang and Zhi-Kai Ding Department of Computer Science and Information Engineering, National Central University, Chung-Li,
Query Recommendation employing Query Logs in Search Optimization
1917 Query Recommendation employing Query Logs in Search Optimization Neha Singh Department of Computer Science, Shri Siddhi Vinayak Group of Institutions, Bareilly Email: [email protected] Dr Manish
A Framework of Personalized Intelligent Document and Information Management System
A Framework of Personalized Intelligent and Information Management System Xien Fan Department of Computer Science, College of Staten Island, City University of New York, Staten Island, NY 10314, USA Fang
Design/Construction Integration thru Virtual Construction for Improved Constructability Walid Thabet, Virginia Tech
Design/Construction Integration thru Virtual Construction for Improved Constructability Walid Thabet, Virginia Tech Executive Summary: Changing industry needs have take big leaps in the past two decades.
Search and Information Retrieval
Search and Information Retrieval Search on the Web 1 is a daily activity for many people throughout the world Search and communication are most popular uses of the computer Applications involving search
International Journal of Information Technology, Modeling and Computing (IJITMC) Vol.1, No.3,August 2013
FACTORING CRYPTOSYSTEM MODULI WHEN THE CO-FACTORS DIFFERENCE IS BOUNDED Omar Akchiche 1 and Omar Khadir 2 1,2 Laboratory of Mathematics, Cryptography and Mechanics, Fstm, University of Hassan II Mohammedia-Casablanca,
MASTER OF SCIENCE IN Computing & Data Analytics. (M.Sc. CDA)
MASTER OF SCIENCE IN Computing & Data Analytics (M.Sc. CDA) Learn. Generate. Innovate. Saint Mary s new Master of Science in Computing & Data Analytics (MSc CDA) is a graduate-level, 16-month professional
Active Learning SVM for Blogs recommendation
Active Learning SVM for Blogs recommendation Xin Guan Computer Science, George Mason University Ⅰ.Introduction In the DH Now website, they try to review a big amount of blogs and articles and find the
Requirements for Developing BIM Governance Model: Practitioners Perception
Requirements for Developing BIM Governance Model: Practitioners Perception 1 Eissa Alreshidi; 2 Prof. Yacine Rezgui; 3 Dr. Monjur Mourshed (1) School of Engineering, Cardiff University, Cardiff, UK, E-mail:
Bisecting K-Means for Clustering Web Log data
Bisecting K-Means for Clustering Web Log data Ruchika R. Patil Department of Computer Technology YCCE Nagpur, India Amreen Khan Department of Computer Technology YCCE Nagpur, India ABSTRACT Web usage mining
Large-Scale Data Sets Clustering Based on MapReduce and Hadoop
Journal of Computational Information Systems 7: 16 (2011) 5956-5963 Available at http://www.jofcis.com Large-Scale Data Sets Clustering Based on MapReduce and Hadoop Ping ZHOU, Jingsheng LEI, Wenjun YE
Incorporating Window-Based Passage-Level Evidence in Document Retrieval
Incorporating -Based Passage-Level Evidence in Document Retrieval Wensi Xi, Richard Xu-Rong, Christopher S.G. Khoo Center for Advanced Information Systems School of Applied Science Nanyang Technological
APPLYING CASE BASED REASONING IN AGILE SOFTWARE DEVELOPMENT
APPLYING CASE BASED REASONING IN AGILE SOFTWARE DEVELOPMENT AIMAN TURANI Associate Prof., Faculty of computer science and Engineering, TAIBAH University, Medina, KSA E-mail: [email protected] ABSTRACT
I. The SMART Project - Status Report and Plans. G. Salton. The SMART document retrieval system has been operating on a 709^
1-1 I. The SMART Project - Status Report and Plans G. Salton 1. Introduction The SMART document retrieval system has been operating on a 709^ computer since the end of 1964. The system takes documents
Engineering Standards in Support of
The Application of IEEE Software and System Engineering Standards in Support of Software Process Improvement Susan K. (Kathy) Land Northrop Grumman IT Huntsville, AL [email protected] In Other Words Using
ISSUES ON FORMING METADATA OF EDITORIAL SYSTEM S DOCUMENT MANAGEMENT
ISSN 1392 124X INFORMATION TECHNOLOGY AND CONTROL, 2005, Vol.34, No.4 ISSUES ON FORMING METADATA OF EDITORIAL SYSTEM S DOCUMENT MANAGEMENT Marijus Bernotas, Remigijus Laurutis, Asta Slotkienė Information
Data, Measurements, Features
Data, Measurements, Features Middle East Technical University Dep. of Computer Engineering 2009 compiled by V. Atalay What do you think of when someone says Data? We might abstract the idea that data are
ESPRIT 29938 ProCure. ICT at Work for the LSE ProCurement Chain:
ESPRIT 29938 ProCure ICT at Work for the LSE ProCurement Chain: Putting Information Technology & Communications Technology to work in the Construction Industry The EU-funded ProCure project was designed
Designing and Evaluating Visualization Techniques for Construction Planning
Designing and Evaluating Visualization Techniques for Construction Planning Kathleen Liston 1, Martin Fischer, and John Kunz 2 Abstract Construction project teams view project information with traditional
Knowledge Discovery from patents using KMX Text Analytics
Knowledge Discovery from patents using KMX Text Analytics Dr. Anton Heijs [email protected] Treparel Abstract In this white paper we discuss how the KMX technology of Treparel can help searchers
Experiments in Web Page Classification for Semantic Web
Experiments in Web Page Classification for Semantic Web Asad Satti, Nick Cercone, Vlado Kešelj Faculty of Computer Science, Dalhousie University E-mail: {rashid,nick,vlado}@cs.dal.ca Abstract We address
A Framework of Information Management System for Construction Projects
A Framework of Information Management System for Construction Projects Ma Zhiliang and Qin Liang Department of Civil Engineering, Tsinghua University, Beijing 100084 ([email protected]; [email protected])
The Enron Corpus: A New Dataset for Email Classification Research
The Enron Corpus: A New Dataset for Email Classification Research Bryan Klimt and Yiming Yang Language Technologies Institute Carnegie Mellon University Pittsburgh, PA 15213-8213, USA {bklimt,yiming}@cs.cmu.edu
Strategic Online Advertising: Modeling Internet User Behavior with
2 Strategic Online Advertising: Modeling Internet User Behavior with Patrick Johnston, Nicholas Kristoff, Heather McGinness, Phuong Vu, Nathaniel Wong, Jason Wright with William T. Scherer and Matthew
Mobile 2D Barcode/BIM-based Facilities Maintaining Management System
Mobile 2D Barcode/BIM-based Facilities Maintaining Management System Yu-Cheng Lin, Yu-Chih Su, Yen-Pei Chen Department of Civil Engineering, National Taipei University of Technology, No.1.Chung-Hsiao E.
A MACHINE LEARNING APPROACH TO FILTER UNWANTED MESSAGES FROM ONLINE SOCIAL NETWORKS
A MACHINE LEARNING APPROACH TO FILTER UNWANTED MESSAGES FROM ONLINE SOCIAL NETWORKS Charanma.P 1, P. Ganesh Kumar 2, 1 PG Scholar, 2 Assistant Professor,Department of Information Technology, Anna University
Building a Database to Predict Customer Needs
INFORMATION TECHNOLOGY TopicalNet, Inc (formerly Continuum Software, Inc.) Building a Database to Predict Customer Needs Since the early 1990s, organizations have used data warehouses and data-mining tools
Personalized Hierarchical Clustering
Personalized Hierarchical Clustering Korinna Bade, Andreas Nürnberger Faculty of Computer Science, Otto-von-Guericke-University Magdeburg, D-39106 Magdeburg, Germany {kbade,nuernb}@iws.cs.uni-magdeburg.de
Big Data Text Mining and Visualization. Anton Heijs
Copyright 2007 by Treparel Information Solutions BV. This report nor any part of it may be copied, circulated, quoted without prior written approval from Treparel7 Treparel Information Solutions BV Delftechpark
III. DATA SETS. Training the Matching Model
A Machine-Learning Approach to Discovering Company Home Pages Wojciech Gryc Oxford Internet Institute University of Oxford Oxford, UK OX1 3JS Email: [email protected] Prem Melville IBM T.J. Watson
Search Result Optimization using Annotators
Search Result Optimization using Annotators Vishal A. Kamble 1, Amit B. Chougule 2 1 Department of Computer Science and Engineering, D Y Patil College of engineering, Kolhapur, Maharashtra, India 2 Professor,
Blog Post Extraction Using Title Finding
Blog Post Extraction Using Title Finding Linhai Song 1, 2, Xueqi Cheng 1, Yan Guo 1, Bo Wu 1, 2, Yu Wang 1, 2 1 Institute of Computing Technology, Chinese Academy of Sciences, Beijing 2 Graduate School
Life-cycle Information Flow Management System
Life-cycle Information Flow Management System 1 Qingpeng MAN, 2 Yaowu WANG, 3 Heng LI, 4 Yuan CHANG, 5 Skitmore MARTIN 1 School of Management, Harbin Institute of Technology; Harbin, China; [email protected]
A Building Life-Cycle Information System For Tracking Building Performance Metrics
LBNL-43136 LC-401 Proceedings of the 8 th International Conference on Durability of Building Materials and Components, May 30 - June 3, 1999, Vancouver, BC A Building Life-Cycle Information System For
SPATIAL DATA CLASSIFICATION AND DATA MINING
, pp.-40-44. Available online at http://www. bioinfo. in/contents. php?id=42 SPATIAL DATA CLASSIFICATION AND DATA MINING RATHI J.B. * AND PATIL A.D. Department of Computer Science & Engineering, Jawaharlal
Florida International University - University of Miami TRECVID 2014
Florida International University - University of Miami TRECVID 2014 Miguel Gavidia 3, Tarek Sayed 1, Yilin Yan 1, Quisha Zhu 1, Mei-Ling Shyu 1, Shu-Ching Chen 2, Hsin-Yu Ha 2, Ming Ma 1, Winnie Chen 4,
Master s Program in Information Systems
The University of Jordan King Abdullah II School for Information Technology Department of Information Systems Master s Program in Information Systems 2006/2007 Study Plan Master Degree in Information Systems
Ontology-Based Semantic Modeling of Safety Management Knowledge
2254 Ontology-Based Semantic Modeling of Safety Management Knowledge Sijie Zhang 1, Frank Boukamp 2 and Jochen Teizer 3 1 Ph.D. Candidate, School of Civil and Environmental Engineering, Georgia Institute
Data Mining for Customer Service Support. Senioritis Seminar Presentation Megan Boice Jay Carter Nick Linke KC Tobin
Data Mining for Customer Service Support Senioritis Seminar Presentation Megan Boice Jay Carter Nick Linke KC Tobin Traditional Hotline Services Problem Traditional Customer Service Support (manufacturing)
Quality Assessment for Geographic Web Services. Pedro Medeiros (1)
Quality Assessment for Geographic Web Services Pedro Medeiros (1) (1) IST / INESC-ID, Av. Prof. Dr. Aníbal Cavaco Silva, 2744-016 Porto Salvo, [email protected] Abstract Being able to assess the
Search Taxonomy. Web Search. Search Engine Optimization. Information Retrieval
Information Retrieval INFO 4300 / CS 4300! Retrieval models Older models» Boolean retrieval» Vector Space model Probabilistic Models» BM25» Language models Web search» Learning to Rank Search Taxonomy!
Interactive Multimedia Courses-1
Interactive Multimedia Courses-1 IMM 110/Introduction to Digital Media An introduction to digital media for interactive multimedia through the study of state-of-the-art methods of creating digital media:
Semantic Search in Portals using Ontologies
Semantic Search in Portals using Ontologies Wallace Anacleto Pinheiro Ana Maria de C. Moura Military Institute of Engineering - IME/RJ Department of Computer Engineering - Rio de Janeiro - Brazil [awallace,anamoura]@de9.ime.eb.br
A Business Process Services Portal
A Business Process Services Portal IBM Research Report RZ 3782 Cédric Favre 1, Zohar Feldman 3, Beat Gfeller 1, Thomas Gschwind 1, Jana Koehler 1, Jochen M. Küster 1, Oleksandr Maistrenko 1, Alexandru
Training Management System for Aircraft Engineering: indexing and retrieval of Corporate Learning Object
Training Management System for Aircraft Engineering: indexing and retrieval of Corporate Learning Object Anne Monceaux 1, Joanna Guss 1 1 EADS-CCR, Centreda 1, 4 Avenue Didier Daurat 31700 Blagnac France
Identifying Best Bet Web Search Results by Mining Past User Behavior
Identifying Best Bet Web Search Results by Mining Past User Behavior Eugene Agichtein Microsoft Research Redmond, WA, USA [email protected] Zijian Zheng Microsoft Corporation Redmond, WA, USA [email protected]
THE USE OF VISUALISATION TO COMMUNICATE DESIGN INFORMATION TO CONSTRUCTION SITES
THE USE OF VISUALISATION TO COMMUNICATE DESIGN INFORMATION TO CONSTRUCTION SITES A. Ganah 1, C. Anumba and N. Bouchlaghem Civil & Building Engineering Dept. Loughborough University, LE11 3TU, UK Architectural
DATA MINING TECHNOLOGY. Keywords: data mining, data warehouse, knowledge discovery, OLAP, OLAM.
DATA MINING TECHNOLOGY Georgiana Marin 1 Abstract In terms of data processing, classical statistical models are restrictive; it requires hypotheses, the knowledge and experience of specialists, equations,
Using News Articles to Predict Stock Price Movements
Using News Articles to Predict Stock Price Movements Győző Gidófalvi Department of Computer Science and Engineering University of California, San Diego La Jolla, CA 9237 [email protected] 21, June 15,
Email Spam Detection Using Customized SimHash Function
International Journal of Research Studies in Computer Science and Engineering (IJRSCSE) Volume 1, Issue 8, December 2014, PP 35-40 ISSN 2349-4840 (Print) & ISSN 2349-4859 (Online) www.arcjournals.org Email
Turning Emergency Plans into Executable
Turning Emergency Plans into Executable Artifacts José H. Canós-Cerdá, Juan Sánchez-Díaz, Vicent Orts, Mª Carmen Penadés ISSI-DSIC Universitat Politècnica de València, Spain {jhcanos jsanchez mpenades}@dsic.upv.es
Keywords : Data Warehouse, Data Warehouse Testing, Lifecycle based Testing
Volume 4, Issue 12, December 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com A Lifecycle
CONSTRUCTION DESIGN DOCUMENT MANAGEMENT SCHEMA AND PROTOTYPE
CONSTRUCTION DESIGN DOCUMENT MANAGEMENT SCHEMA AND PROTOTYPE Ziga Turk Asst.Prof., Faculty of Civil Engineering and Geodesy, University of Ljubljana, Slovenia. CDDM Schema and Prototype dr. Ziga Turk FGG
HELP DESK SYSTEMS. Using CaseBased Reasoning
HELP DESK SYSTEMS Using CaseBased Reasoning Topics Covered Today What is Help-Desk? Components of HelpDesk Systems Types Of HelpDesk Systems Used Need for CBR in HelpDesk Systems GE Helpdesk using ReMind
VisPMIS: A VISUAL PROJECT MANAGEMENT INFORMATION SYSTEM
Eleventh East Asia-Pacific Conference on Structural Engineering & Construction (EASEC-11) Building a Sustainable Environment November 19-21, 2008, Taipei, TAIWAN VisPMIS: A VISUAL PROJECT MANAGEMENT INFORMATION
Prediction of Heart Disease Using Naïve Bayes Algorithm
Prediction of Heart Disease Using Naïve Bayes Algorithm R.Karthiyayini 1, S.Chithaara 2 Assistant Professor, Department of computer Applications, Anna University, BIT campus, Tiruchirapalli, Tamilnadu,
META DATA QUALITY CONTROL ARCHITECTURE IN DATA WAREHOUSING
META DATA QUALITY CONTROL ARCHITECTURE IN DATA WAREHOUSING Ramesh Babu Palepu 1, Dr K V Sambasiva Rao 2 Dept of IT, Amrita Sai Institute of Science & Technology 1 MVR College of Engineering 2 [email protected]
The multilayer sentiment analysis model based on Random forest Wei Liu1, Jie Zhang2
2nd International Conference on Advances in Mechanical Engineering and Industrial Informatics (AMEII 2016) The multilayer sentiment analysis model based on Random forest Wei Liu1, Jie Zhang2 1 School of
An Introduction to Data Mining
An Introduction to Intel Beijing [email protected] January 17, 2014 Outline 1 DW Overview What is Notable Application of Conference, Software and Applications Major Process in 2 Major Tasks in Detail
A Statistical Text Mining Method for Patent Analysis
A Statistical Text Mining Method for Patent Analysis Department of Statistics Cheongju University, [email protected] Abstract Most text data from diverse document databases are unsuitable for analytical
A Framework for Adaptive Process Modeling and Execution (FAME)
A Framework for Adaptive Process Modeling and Execution (FAME) Perakath Benjamin [email protected] Madhav Erraguntla [email protected] Richard Mayer [email protected] Abstract This paper describes the
An evolutionary learning spam filter system
An evolutionary learning spam filter system Catalin Stoean 1, Ruxandra Gorunescu 2, Mike Preuss 3, D. Dumitrescu 4 1 University of Craiova, Romania, [email protected] 2 University of Craiova, Romania,
A STRATEGIC PLANNER FOR ROBOT EXCAVATION' by Humberto Romero-Lois, Research Assistant, Department of Civil Engineering
A STRATEGIC PLANNER FOR ROBOT EXCAVATION' by Humberto Romero-Lois, Research Assistant, Department of Civil Engineering Chris Hendrickson, Professor, Department of Civil Engineering, and Irving Oppenheim,
