Search Engine Based Intelligent Help Desk System: iassist


Sahil K. Shah, Prof. Sheetal A. Takale
Information Technology Department, VPCOE, Baramati, Maharashtra, India

Abstract: An intelligent help desk system is something every user needs. Many organizations use case-based help desk systems, but maintaining an up-to-date case history for each and every problem is difficult and costly. Search engines already act as an intelligent help service for all users of the Internet. For a given keyword query, current web search engines return a list of individual web pages. However, the information relevant to the query is often spread across multiple pages, which degrades the quality of the search results. To address these challenges, an intelligent help desk system, iassist, is developed. It uses search engine results as the case history for the user query. The semantic relevance of the search results to the user query is computed using NEC SENNA and WordNet, and the system ranks the search results by their semantic relevance to the request. The relevant results are grouped into clusters based on the MDL principle and symmetric matrix factorization, and each cluster is summarized to generate recommended solutions. For performance analysis, the system is tested through a user survey. The experiments demonstrate the effectiveness of iassist in semantic text understanding, document clustering and summarization. The better performance of iassist stems from sentence-level semantic analysis and from clustering using the MDL principle and SNMF.

Keywords: Intelligent Helpdesk, Semantic Similarity, Web Search Result Summarization, Document Summarization

I. INTRODUCTION

An intelligent help desk system is something every user needs. Many organizations use case-based help desk systems to improve the quality of customer service. For a given customer request, an intelligent helpdesk system tries to find earlier similar requests and the case history associated with them.
Helpdesk systems usually use databases to store past interactions between customers and companies. An interaction may consist of a problem description and the recommended solutions. The major challenge faced by these help desk systems is keeping the case history up to date, which is difficult and costly to do for each and every problem.

Search engines already act as an intelligent help service for all users of the Internet. However, content on the Web and on enterprise intranets is increasing day by day. The Web is a vast collection of completely uncontrolled, heterogeneous documents: it is huge, diverse, and dynamic. For a user keyword query, current web search engines return a list of pages related to the query. However, the information on a topic is often distributed among multiple physical pages, especially for multitopic queries, in which the individual query keywords occur relatively frequently in the document collection but rarely occur together in the same document. Search engines are thus drowning in information but starving for knowledge.

To address the challenges faced by present help desk systems and web search engines, we have developed an online helpdesk system, iassist. It automatically finds problem-solution patterns on the Web using search engines such as Google and Yahoo. For a given user query, iassist interacts with the search engine to retrieve the relevant solutions. The retrieved solutions are ranked by their semantic similarity to the user query, where semantic similarity is based on semantic roles and semantic meanings.

II. LITERATURE SURVEY

Case-based systems have been developed to interactively search the solution space by suggesting the most informative questions to ask [2,5]. These systems use the initial information to retrieve a first candidate set and then ask the user questions to narrow it down until only a few cases remain or the most suitable items are found.
When the description of cases or items becomes complicated, these case-based systems suffer from the curse of dimensionality, and the similarity or distance between cases or items becomes difficult to measure. Furthermore, the similarity measures used in these systems are usually based on keyword matching, which lacks semantic analysis of customer requests and existing cases.

Help desk systems based on database search and ranking have also been developed. Many methods have been proposed to perform similarity search and rank the results of a query [6]. However, as in the case-based systems, similarity is measured by keyword matching, which has difficulty capturing the semantics and context of text.

Existing search engines often return a long list of search results. Clustering technologies are often used to organize search results [7]. However, existing document-clustering algorithms do not consider the impact of the general and common information contained in the documents. In our work, filtering out this common information improves the clustering quality and yields a better organization of contexts.

III. SYSTEM ARCHITECTURE

Figure 1 shows the system architecture of iassist. The system works in five modules: Preprocessing, Case Ranking, Document Clustering, Sentence Clustering, and Sentence Cluster Summarization. As shown in the figure, the input to the system is a user query in the form of a question. The system retrieves relevant solutions, or past cases, from the search engine. Preprocessing of the user query and past cases involves removing non-words; each retrieved document is then split into sentences and passed through a semantic role parser for semantic role labeling. The Case Ranking module ranks the retrieved documents by their sentence-level semantic similarity to the user query. The semantically ranked documents then need to be grouped according to context: the top-ranking documents are clustered using the Minimum Description Length (MDL) principle [1]. The Sentence Clustering module groups sentences with similar meaning into clusters using Symmetric Non-negative Matrix Factorization (SNMF) [3]. The Sentence Cluster Summarization module selects the most relevant sentences from each cluster to form a concise summary, which is presented to the user as the reference solution.

IV. PREPROCESSING

It is essential to consider only meaningful words, to remove redundancy in the documents, and to reduce document size. Preprocessing of the problem-solution pattern therefore involves removing non-words from both the user query and the documents retrieved from the search engine. Each sentence in a retrieved document is then passed to a semantic role parser to find the semantic meaning of the sentence based on its frames (or verbs).

Semantic role labeling

Semantic role labeling, sometimes also called shallow semantic parsing, is a natural language processing task that consists of detecting the semantic arguments associated with the predicate or verb of a sentence and classifying them into their specific roles. A semantic role describes the relationship that a constituent has with the verb in the sentence.
For example, given a sentence like "Riya sold the book to Abbas", the task would be to recognize the verb "to sell" as the predicate, "Riya" as the seller (agent), "the book" as the goods (theme), and "Abbas" as the recipient. This is an important step towards making sense of the meaning of a sentence. A semantic representation of this sort is at a higher level of abstraction than a syntax tree. For instance, the sentence "The book was sold by Riya to Abbas" has a different syntactic form but the same semantic roles.

To analyze the user query and the documents, the semantic roles of each sentence are computed by passing the sentences through a semantic role parser. This helps in categorizing the documents by their semantic importance to the user query. In iassist, NEC SENNA is used as the semantic role labeler, which is based on PropBank [4] semantic annotation. The labeler labels each verb in a sentence with its propositional arguments, and the labeling for each particular verb is called a frame. Therefore, for each sentence, the number of frames generated by the parser equals the number of verbs in the sentence. A set of abstract arguments given by the labeler indicates the semantic role of each term in a frame. In general, Arg[m] represents the role of a term in the given sentence, where m is the argument number within the sentence. For example, Arg0 is the actor, and Arg-NEG indicates negation.

Figure 1: System Architecture

V. SENTENCE-LEVEL SEMANTIC SIMILARITY COMPUTATION AND TOP RELEVANT DOCUMENT RANKING

To assist users in finding answers relevant to their query, the documents retrieved from the search engine must be ranked by their semantic importance to the input user query. To rank these documents, similarity scores between the retrieved documents and the input user query are computed.
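As a rough illustration of the frame-based comparison used for ranking, the Python sketch below hand-codes frames for the example sentences of Section IV and compares them role by role. It does not call SENNA or WordNet: the frame dictionaries, the `SYNONYMS` table, the `Arg2` label for the recipient, and all function names are hypothetical, and the way term matches are averaged into role and frame scores is an assumption made for illustration only.

```python
# Hypothetical stand-in for a WordNet lookup: word pairs treated as synonyms.
SYNONYMS = {frozenset({"book", "volume"})}


def term_sim(a, b):
    """1.0 if the terms are identical or listed as synonyms, else 0.0."""
    a, b = a.lower(), b.lower()
    return 1.0 if a == b or frozenset({a, b}) in SYNONYMS else 0.0


def frame_sim(f1, f2):
    """Average, over the roles shared by both frames, of the share of f1's
    terms that have a match in the same role of f2 (assumed aggregation)."""
    shared = f1.keys() & f2.keys()
    if not shared:
        return 0.0
    total = 0.0
    for role in shared:
        t1, t2 = f1[role], f2[role]
        if not t1:
            continue
        matched = sum(max((term_sim(a, b) for b in t2), default=0.0) for a in t1)
        total += matched / len(t1)
    return total / len(shared)


def sentence_sim(frames1, frames2):
    """Sentence similarity is the maximum similarity over all frame pairs."""
    return max((frame_sim(a, b) for a in frames1 for b in frames2), default=0.0)


# Hand-written frames (one per verb); a real labeler would produce these.
active = {"V": {"sold"}, "Arg0": {"Riya"}, "Arg1": {"book"}, "Arg2": {"Abbas"}}
passive = {"V": {"sold"}, "Arg1": {"book"}, "Arg0": {"Riya"}, "Arg2": {"Abbas"}}
query_synonym = {"V": {"sold"}, "Arg0": {"Riya"}, "Arg1": {"volume"}}
query_wrong_role = {"V": {"sold"}, "Arg0": {"Abbas"}}

same_meaning = sentence_sim([active], [passive])
synonym_match = sentence_sim([query_synonym], [active])
role_mismatch = sentence_sim([query_wrong_role], [active])
```

The last query shows why roles matter: "Abbas" does occur in the stored sentence, but as the recipient rather than the seller, so only the verb contributes to the score.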
Simple keyword-based similarity measures, such as cosine similarity, cannot capture semantic similarity. The system therefore calculates the semantic similarity between the sentences of the documents retrieved from the search engine and the user query on the basis of semantic role analysis. The similarity computation also uses WordNet to better capture semantically related words.

Table 1: Sentence-Level Semantic Similarity Calculation and Top Document Ranking

Input: sentences $S_i$ and $S_j$
Algorithm:
1. $S_i$ and $S_j$ are parsed into frames by the semantic role labeler.
2. For terms in frames that have the same semantic role, the term similarity is computed using WordNet:
   a. If the two words are identical in the query and the sentence and have the same semantic role, the term similarity is set to 1.
   b. If the two words are not identical, the WordNet hierarchy is checked for a semantic relation such as synonymy; if a similar meaning is found, the term similarity is set to 1.
   c. If both of the above cases fail, the term similarity is set to 0.
   d. This can be represented mathematically, for two terms $t_1$ and $t_2$ in the same semantic role, as
      \[
      \mathrm{tsim}(t_1, t_2) =
      \begin{cases}
      1 & \text{if } t_1 = t_2 \text{ or } t_1 \text{ and } t_2 \text{ are related in WordNet (e.g. synonyms)},\\
      0 & \text{otherwise.}
      \end{cases}
      \]
3. Let $\{r_1, r_2, \ldots, r_K\}$ be the set of $K$ common semantic roles between frames $f_1$ and $f_2$. Let $T_1(r_i)$ be the set of terms of frame $f_1$ in role $r_i$, and $T_2(r_i)$ the set of terms of frame $f_2$ in role $r_i$. The role similarity is defined between the two term sets $T_1(r_i)$ and $T_2(r_i)$, and the term similarity between two terms in the same role $r_i$.
4. This computation is then used to compute the similarity between two frames, which in turn gives the sentence similarity.
5. The maximum of the frame similarities between two sentences is the similarity between the two sentences. This value lies in the interval 0 to 1.
6. The documents returned by the search engine for the given query are ranked by their document score, where $d_n$ denotes the $n$-th document retrieved from the search engine. The list of ranked documents is returned to the user as the search results.

VI. DOCUMENT CLUSTERING USING MDL PRINCIPLE

The identified top-ranking cases are all relevant to the user query, but they may belong to different categories. For example, if the user query is "Give information about Taj Mahal", the relevant cases may involve Taj Mahal as a tea brand, Taj as a five-star hotel, or the Taj Mahal as a white marble mausoleum. It is therefore necessary to further group these cases into different contexts. The proposed system uses the Minimum Description Length (MDL) principle to cluster documents with similar meaning into one group. The MDL principle states that the best model inferred from given data is the one that minimizes the sum of the length of the model, in bits, and the length of the encoding of the data, in bits.

Table 2: Document Clustering Algorithm

1. Generate the set of distinct keywords for each document.
2. Calculate the support value of each keyword in the distinct keyword set. Each document is represented by a vector whose entries are the support values of its keywords.
3. Decide the support threshold value for all documents in the document set.
4. Represent the document set and the keywords in matrix form; let $M_{TD}$ be the term-document matrix.
5. Let C be the set of clusters for document set D. Clustering information is represented by a pair of matrices, $M_{TC}$ and $M_{DC}$: $M_{TC}$ is the term-cluster matrix, and $M_{DC}$ represents each cluster together with its member documents. The term-document matrix $M_{TD}$ is represented using $M_{TC}$ and $M_{DC}$ together with a difference matrix whose entries are 0, 1, or -1.
6. Initially each document represents one cluster, and an agglomerative clustering algorithm is applied for document clustering.

Table 3: Procedures

Algorithm AggloMDL(D)
1. Let C = {c_1, c_2, c_3, ..., c_n}, with c_i = ({d_i})
2. Select the best cluster pair (c_i, c_j) from C for merging and form the new cluster c_k:
3. (c_i, c_j, c_k) := GetBestPair(C)
4. while (c_i, c_j, c_k) is not empty do {
5.     C := C - {c_i, c_j} ∪ {c_k}
6.     (c_i, c_j, c_k) := GetBestPair(C) }
7. return C
End

procedure GetBestPair(C)
1. MDLcostmin := ∞
2. for each pair (c_i, c_j) of clusters in C do {
3.     (MDLcost, c_k) := GetMDLCost(c_i, c_j, C)
       /* GetMDLCost returns the optimal MDL cost when c_k is made by merging c_i and c_j */
4.     if MDLcost < MDLcostmin then {
5.         MDLcostmin := MDLcost;
6.         (c_i^B, c_j^B, c_k^B) := (c_i, c_j, c_k) } }
7. return (c_i^B, c_j^B, c_k^B);
End

procedure GetMDLCost(c_i, c_j, C)
1. D_k := D_i ∪ D_j;
3. c_k := (D_k);
4. C' := C - {c_i, c_j} ∪ {c_k};
5. MDL := approximate MDL cost of C' by the MDL cost equation
6. return (MDL, c_k);
End
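The control flow of AggloMDL and GetBestPair can be sketched in Python as follows. The cost is passed in as a function, because this sketch does not implement the MDL cost equation itself: `make_toy_cost` is a hypothetical stand-in that charges one unit per keyword shared by a whole cluster plus one unit per keyword a document has beyond that shared set. Accepting a merge only when it lowers the total cost is likewise an assumption about when GetBestPair comes back empty, and the document names and keyword sets are invented for the Taj Mahal example.

```python
from itertools import combinations


def make_toy_cost(keywords):
    """Build a stand-in cost (NOT the MDL cost of the paper): shared keywords
    are paid once per cluster, leftover keywords once per document."""
    def cost(clusters):
        total = 0
        for cluster in clusters:
            shared = set.intersection(*(keywords[d] for d in cluster))
            total += len(shared)
            total += sum(len(keywords[d] - shared) for d in cluster)
        return total
    return cost


def merge(clusters, ci, cj):
    """Return a new clustering in which ci and cj are replaced by their union."""
    return [c for c in clusters if c not in (ci, cj)] + [ci | cj]


def get_best_pair(clusters, cost):
    """Return the pair whose merge lowers the cost the most, or None."""
    best, best_cost = None, cost(clusters)
    for ci, cj in combinations(clusters, 2):
        candidate_cost = cost(merge(clusters, ci, cj))
        if candidate_cost < best_cost:
            best, best_cost = (ci, cj), candidate_cost
    return best


def agglo_mdl(docs, cost):
    """Start with one cluster per document and merge greedily."""
    clusters = [frozenset([d]) for d in docs]
    pair = get_best_pair(clusters, cost)
    while pair is not None:
        clusters = merge(clusters, *pair)
        pair = get_best_pair(clusters, cost)
    return clusters


# Invented keyword sets for five search results of a "Taj Mahal" query.
keywords = {
    "tea1": {"taj", "tea", "brand"},
    "tea2": {"taj", "tea", "price"},
    "hotel1": {"taj", "hotel", "mumbai"},
    "hotel2": {"taj", "hotel", "rooms"},
    "tomb1": {"taj", "mausoleum", "agra"},
}
contexts = agglo_mdl(list(keywords), make_toy_cost(keywords))
```

With this toy cost the keyword "taj", which every document contains, is not enough to pull the tea, hotel and mausoleum pages into one cluster, which mirrors the idea of discounting information common to all documents.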

MDL cost equation: the terms of the cost are computed using the $M_{TD}$ matrix.

VII. CLUSTERING USING SYMMETRIC NON-NEGATIVE MATRIX FACTORIZATION (SNMF) ALGORITHM

$W$ is the sentence similarity matrix, where element $W_{ij}$ is the similarity value between the sentence pair $S_i$ and $S_j$. $H$ is the sentence cluster matrix. Initially, $H$ is set to the identity matrix, with the same size as $W$. The factorization problem can be stated as follows: given a matrix $W$, find non-negative matrices $H$ and $H^T$ that minimize the function

\[ F(W, H) = \lVert W - H H^T \rVert_F^2 \]

where $\lVert \cdot \rVert_F$ is the Frobenius norm (squared error norm). To derive the update rule for $H$ under the constraint $H \ge 0$, we use the Karush-Kuhn-Tucker (KKT) condition, which leads to the fixed-point relation

\[ \left( -4\,W H + 4\,H H^T H \right)_{ij} H_{ij} = 0 \]

As long as this condition is not satisfied, the value of $H$ is updated using the following equation:

\[ H_{ij} \leftarrow \frac{1}{2} \left[ H_{ij} \left( 1 + \frac{(W H)_{ij}}{(H H^T H)_{ij}} \right) \right] \]

Hence, the procedure for solving SNMF is: given an initial guess of $H$ (the identity matrix in this case), iteratively update $H$ using the above equation until convergence, that is, until $H$ satisfies the KKT condition. After these repeated updates, the matrix $H$ contains the clusters of sentences. Because SNMF maintains near-orthogonality of the columns of $H$, it is useful for data (sentence) clustering. The result is a soft clustering, in which an object can belong to more than one cluster.

VIII. SUMMARIZATION OF EACH SENTENCE CLUSTER

Table 4: Within-cluster sentence selection

After grouping the sentences into clusters by the SNMF algorithm:
1. Remove the noisy clusters (clusters containing fewer than three sentences).
2. Then, in each sentence cluster, rank the sentences by their sentence score, calculated as shown in the following equations. The score of a sentence measures how important it is to include the sentence in the final concise solution (summary).

\[ \mathrm{Score}(S_i) = \lambda\, F_1(S_i) + (1 - \lambda)\, F_2(S_i) \]

Internal similarity measure:

\[ F_1(S_i) = \frac{1}{N - 1} \sum_{S_j \in C_k,\; j \ne i} \mathrm{Sim}(S_i, S_j) \]

External similarity measure:

\[ F_2(S_i) = \mathrm{Sim}(S_i, \text{request}) \]

where $F_1(S_i)$ measures the average similarity score between sentence $S_i$ and all other sentences in cluster $C_k$, and $N$ is the number of sentences in $C_k$.
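The SNMF iteration of Section VII and the within-cluster ranking of Table 4 can be sketched together in plain Python as below. Several choices here are assumptions rather than the authors' implementation: `H` has `k` columns instead of the same size as `W`, and it starts from random positive values, because a multiplicative update can never move an entry away from exactly zero; each sentence is assigned to its largest column, which turns the soft clustering into a hard one; the weight is passed as `lam`; and the similarity of each sentence to the request is supplied by the caller as `query_sim`. The matrix `W` and all function names are invented for the example.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def snmf(w, k, iters=500, seed=0, eps=1e-12):
    """Approximate w by h h^T with h >= 0 using
    h_ij <- 0.5 * h_ij * (1 + (w h)_ij / (h h^T h)_ij)."""
    rng = random.Random(seed)
    n = len(w)
    # Random positive start (assumption): zero entries would stay zero forever.
    h = [[rng.uniform(0.1, 1.0) for _ in range(k)] for _ in range(n)]
    for _ in range(iters):
        wh = matmul(w, h)
        ht = [list(col) for col in zip(*h)]
        hhth = matmul(h, matmul(ht, h))
        h = [[0.5 * h[i][j] * (1.0 + wh[i][j] / (hhth[i][j] + eps))
              for j in range(k)] for i in range(n)]
    return h


def clusters_from(h, min_size=3):
    """Assign each sentence to its largest column; drop clusters below min_size."""
    groups = {}
    for i, row in enumerate(h):
        groups.setdefault(row.index(max(row)), []).append(i)
    return [members for members in groups.values() if len(members) >= min_size]


def rank_cluster(members, w, query_sim, lam=0.7):
    """Order sentence indices by lam * F1 + (1 - lam) * F2, best first."""
    def score(i):
        f1 = sum(w[i][j] for j in members if j != i) / (len(members) - 1)
        return lam * f1 + (1.0 - lam) * query_sim[i]
    return sorted(members, key=score, reverse=True)


# Invented similarity matrix: sentences 0-2 and 3-5 form two groups.
W = [
    [1.0, 0.9, 0.8, 0.0, 0.0, 0.0],
    [0.9, 1.0, 0.7, 0.0, 0.0, 0.0],
    [0.8, 0.7, 1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 0.8, 0.6],
    [0.0, 0.0, 0.0, 0.8, 1.0, 0.7],
    [0.0, 0.0, 0.0, 0.6, 0.7, 1.0],
]
query_sim = [0.2, 0.9, 0.1, 0.5, 0.5, 0.5]

H = snmf(W, k=2)
sentence_clusters = clusters_from(H)
ranked_first_group = rank_cluster([0, 1, 2], W, query_sim)
```

In the ranking step, sentence 1 is placed ahead of sentence 0 even though sentence 0 is slightly more central to its cluster, because sentence 1 is much closer to the request; lowering `lam` would strengthen that effect.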
$F_2(S_i)$ represents the similarity between sentence $S_i$ and the input request. The weight parameter $\lambda$ is set to 0.7 by trial and error; a high value of $\lambda$ gives more weight to the internal similarity.

IX. RESULT ANALYSIS

All experiments reported here were performed on an Intel Core i3 processor with 4 GB RAM. All algorithms are implemented in Java.

Implemented algorithms:
1) Sentence-level semantic similarity calculation and top-ranking cases, making use of NEC SENNA and WordNet
2) Clustering of top-ranking documents using the MDL principle
3) Sentence clustering using SNMF
4) Multi-document summarization: within-cluster sentence selection

In the experiments, we randomly select questions from different contexts, together with the search results returned by the search engine. During the user survey, the user is asked to manually generate a solution for each selected query. The sentences in this solution are considered the relevant sentence set. We then compare the solution generated by iassist with that of a standard automated summarization tool. Table 6 shows the solutions generated by the user and by iassist. The performance of iassist is measured using the standard IR measures, precision and recall:

\[ \mathrm{Precision} = \frac{\lvert S_{man} \cap S_{sys} \rvert}{\lvert S_{sys} \rvert}, \qquad \mathrm{Recall} = \frac{\lvert S_{man} \cap S_{sys} \rvert}{\lvert S_{man} \rvert} \]

where $S_{man}$ is the set of sentences selected by manual evaluation and $S_{sys}$ is the set of sentences selected by iassist or by the automated summarization tool for the final summary. Table 5 shows the precision and recall values for sample user queries. Figures 2 and 3 show the average precision and recall of the two techniques. The higher precision of iassist compared with automated summarization tools demonstrates that the semantic similarity calculation better captures the meaning of the user requests and of the case documents returned by the search engine. A comparison of the proposed iassist system with current helpdesk systems is shown in Table 7. We observe that user satisfaction can be improved by capturing semantically related cases rather than only keyword-matched cases.
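The two evaluation measures reduce to set operations over sentence identifiers. The short Python sketch below shows the computation for one query; the function name and the sentence labels are made up for illustration and are not values from the user survey.

```python
def precision_recall(manual, system):
    """Return (precision, recall) of the system's sentences against the
    manually selected ones; an empty set yields 0.0 for its measure."""
    manual, system = set(manual), set(system)
    overlap = len(manual & system)
    precision = overlap / len(system) if system else 0.0
    recall = overlap / len(manual) if manual else 0.0
    return precision, recall


# Made-up sentence identifiers for a single query.
s_man = {"s1", "s2", "s3", "s4"}   # chosen by the human evaluator
s_sys = {"s2", "s3", "s5"}         # chosen by the summarizer
precision, recall = precision_recall(s_man, s_sys)
```

Here two of the three system sentences are relevant and two of the four relevant sentences are found, so precision is 2/3 and recall is 1/2.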
From the values of recall and precision obtained for the sample scenarios, we conclude that combining the MDL principle, which groups documents according to different contexts, with the SNMF clustering algorithm helps users easily find their desired solutions across multiple physical pages. The problem of maintaining an up-to-date history of past cases is solved by using the search engine as the database. In addition, the user can query any problem in any domain.

X. CONCLUSION

The proposed iassist system gives its users a single point of access for their problems by providing solutions from different domains. The system automatically finds the problem-solution pattern for a new user request by making use of the search results returned by the search engine. The use of semantic case ranking, MDL clustering, and SNMF with request-focused multi-document summarization helps to improve the performance of iassist. The proposed approach of semantic role labeling contributes to improving the overall result of summarization. Because the proposed system uses search engine results as the case history for the user query, the problem of maintaining an updated case history for each and every problem is automatically resolved.

Figure 2: Precision of Retrieved Cases
Figure 3: Recall of Retrieved Cases
Table 5: Performance Analysis
Table 6: Top-Ranking Summary Sample by Manual Evaluation and iassist for Sample Scenario
Table 7: Comparison of iassist with Current Helpdesk Systems

REFERENCES

[1] Chulyun Kim and Kyuseok Shim, "TEXT: Automatic Template Extraction from Heterogeneous Web Pages," IEEE Transactions, Vol. 23, No. 4, April.
[2] D. Wang, T. Li, S. Zhu, and Y. Gong, "ihelp: An Intelligent Online Helpdesk System," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, Vol. 41, No. 1, February.
[3] D. Wang, S. Zhu, T. Li, and C. Ding, "Multi-document summarization via sentence-level semantic analysis and symmetric matrix factorization," in Proc. SIGIR, 2008.
[4] M. Palmer, P. Kingsbury, and D. Gildea, "The proposition bank: An annotated corpus of semantic roles," Comput. Linguist., vol. 31, no. 1, Mar.
[5] D. Bridge, M. H. Goker, L. McGinty, and B. Smyth, "Case-based recommender systems," Knowl. Eng. Rev., vol. 20, no. 3, Sep.
[6] R. Agrawal, R. Rantzau, and E. Terzi, "Context-sensitive ranking," in Proc. SIGMOD, 2006.
[7] Leuski and J. Allan, "Improving interactive retrieval by combining ranked list and clustering," in Proc. RIAO, 2000.


More information

A Direct Numerical Method for Observability Analysis

A Direct Numerical Method for Observability Analysis IEEE TRANSACTIONS ON POWER SYSTEMS, VOL 15, NO 2, MAY 2000 625 A Direct Numerical Method for Observability Analysis Bei Gou and Ali Abur, Senior Member, IEEE Abstract This paper presents an algebraic method

More information

DATA ANALYTICS USING R

DATA ANALYTICS USING R DATA ANALYTICS USING R Duration: 90 Hours Intended audience and scope: The course is targeted at fresh engineers, practicing engineers and scientists who are interested in learning and understanding data

More information

TREC 2007 ciqa Task: University of Maryland

TREC 2007 ciqa Task: University of Maryland TREC 2007 ciqa Task: University of Maryland Nitin Madnani, Jimmy Lin, and Bonnie Dorr University of Maryland College Park, Maryland, USA nmadnani,jimmylin,bonnie@umiacs.umd.edu 1 The ciqa Task Information

More information

Non-negative Matrix Factorization (NMF) in Semi-supervised Learning Reducing Dimension and Maintaining Meaning

Non-negative Matrix Factorization (NMF) in Semi-supervised Learning Reducing Dimension and Maintaining Meaning Non-negative Matrix Factorization (NMF) in Semi-supervised Learning Reducing Dimension and Maintaining Meaning SAMSI 10 May 2013 Outline Introduction to NMF Applications Motivations NMF as a middle step

More information

Clustering Connectionist and Statistical Language Processing

Clustering Connectionist and Statistical Language Processing Clustering Connectionist and Statistical Language Processing Frank Keller keller@coli.uni-sb.de Computerlinguistik Universität des Saarlandes Clustering p.1/21 Overview clustering vs. classification supervised

More information

Mining Signatures in Healthcare Data Based on Event Sequences and its Applications

Mining Signatures in Healthcare Data Based on Event Sequences and its Applications Mining Signatures in Healthcare Data Based on Event Sequences and its Applications Siddhanth Gokarapu 1, J. Laxmi Narayana 2 1 Student, Computer Science & Engineering-Department, JNTU Hyderabad India 1

More information

Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches

Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches PhD Thesis by Payam Birjandi Director: Prof. Mihai Datcu Problematic

More information

International Journal of Engineering Research-Online A Peer Reviewed International Journal Articles are freely available online:http://www.ijoer.

International Journal of Engineering Research-Online A Peer Reviewed International Journal Articles are freely available online:http://www.ijoer. RESEARCH ARTICLE SURVEY ON PAGERANK ALGORITHMS USING WEB-LINK STRUCTURE SOWMYA.M 1, V.S.SREELAXMI 2, MUNESHWARA M.S 3, ANIL G.N 4 Department of CSE, BMS Institute of Technology, Avalahalli, Yelahanka,

More information

Intelligent Agents Serving Based On The Society Information

Intelligent Agents Serving Based On The Society Information Intelligent Agents Serving Based On The Society Information Sanem SARIEL Istanbul Technical University, Computer Engineering Department, Istanbul, TURKEY sariel@cs.itu.edu.tr B. Tevfik AKGUN Yildiz Technical

More information

Selection of Optimal Discount of Retail Assortments with Data Mining Approach

Selection of Optimal Discount of Retail Assortments with Data Mining Approach Available online at www.interscience.in Selection of Optimal Discount of Retail Assortments with Data Mining Approach Padmalatha Eddla, Ravinder Reddy, Mamatha Computer Science Department,CBIT, Gandipet,Hyderabad,A.P,India.

More information

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014 RESEARCH ARTICLE OPEN ACCESS A Survey of Data Mining: Concepts with Applications and its Future Scope Dr. Zubair Khan 1, Ashish Kumar 2, Sunny Kumar 3 M.Tech Research Scholar 2. Department of Computer

More information

PRACTICAL DATA MINING IN A LARGE UTILITY COMPANY

PRACTICAL DATA MINING IN A LARGE UTILITY COMPANY QÜESTIIÓ, vol. 25, 3, p. 509-520, 2001 PRACTICAL DATA MINING IN A LARGE UTILITY COMPANY GEORGES HÉBRAIL We present in this paper the main applications of data mining techniques at Electricité de France,

More information

Fast Contextual Preference Scoring of Database Tuples

Fast Contextual Preference Scoring of Database Tuples Fast Contextual Preference Scoring of Database Tuples Kostas Stefanidis Department of Computer Science, University of Ioannina, Greece Joint work with Evaggelia Pitoura http://dmod.cs.uoi.gr 2 Motivation

More information

SPATIAL DATA CLASSIFICATION AND DATA MINING

SPATIAL DATA CLASSIFICATION AND DATA MINING , pp.-40-44. Available online at http://www. bioinfo. in/contents. php?id=42 SPATIAL DATA CLASSIFICATION AND DATA MINING RATHI J.B. * AND PATIL A.D. Department of Computer Science & Engineering, Jawaharlal

More information

DESKTOP BASED RECOMMENDATION SYSTEM FOR CAMPUS RECRUITMENT USING MAHOUT

DESKTOP BASED RECOMMENDATION SYSTEM FOR CAMPUS RECRUITMENT USING MAHOUT Journal homepage: www.mjret.in ISSN:2348-6953 DESKTOP BASED RECOMMENDATION SYSTEM FOR CAMPUS RECRUITMENT USING MAHOUT 1 Ronak V Patil, 2 Sneha R Gadekar, 3 Prashant P Chavan, 4 Vikas G Aher Department

More information

COURSE RECOMMENDER SYSTEM IN E-LEARNING

COURSE RECOMMENDER SYSTEM IN E-LEARNING International Journal of Computer Science and Communication Vol. 3, No. 1, January-June 2012, pp. 159-164 COURSE RECOMMENDER SYSTEM IN E-LEARNING Sunita B Aher 1, Lobo L.M.R.J. 2 1 M.E. (CSE)-II, Walchand

More information

MULTI LAYER PERCEPTRON FOR WEB PAGE CLASSIFICATION BASED ON TDF/IDF ONTOLOGY BASED FEATURES AND GENETIC ALGORITHMS

MULTI LAYER PERCEPTRON FOR WEB PAGE CLASSIFICATION BASED ON TDF/IDF ONTOLOGY BASED FEATURES AND GENETIC ALGORITHMS MULTI LAYER PERCEPTRON FOR WEB PAGE CLASSIFICATION BASED ON TDF/IDF ONTOLOGY BASED FEATURES AND GENETIC ALGORITHMS N.VANJULAVALLI 1, DR.A.KOVALAN 2 1. Research Scholar, Department of Computer Science and

More information

A QoS-Aware Web Service Selection Based on Clustering

A QoS-Aware Web Service Selection Based on Clustering International Journal of Scientific and Research Publications, Volume 4, Issue 2, February 2014 1 A QoS-Aware Web Service Selection Based on Clustering R.Karthiban PG scholar, Computer Science and Engineering,

More information

Research of Postal Data mining system based on big data

Research of Postal Data mining system based on big data 3rd International Conference on Mechatronics, Robotics and Automation (ICMRA 2015) Research of Postal Data mining system based on big data Xia Hu 1, Yanfeng Jin 1, Fan Wang 1 1 Shi Jiazhuang Post & Telecommunication

More information

Web Mining. Margherita Berardi LACAM. Dipartimento di Informatica Università degli Studi di Bari berardi@di.uniba.it

Web Mining. Margherita Berardi LACAM. Dipartimento di Informatica Università degli Studi di Bari berardi@di.uniba.it Web Mining Margherita Berardi LACAM Dipartimento di Informatica Università degli Studi di Bari berardi@di.uniba.it Bari, 24 Aprile 2003 Overview Introduction Knowledge discovery from text (Web Content

More information

INFORMATION LOGISTICS VERSUS SEARCH. How context-sensitive information retrieval saves time spent reaching goals

INFORMATION LOGISTICS VERSUS SEARCH. How context-sensitive information retrieval saves time spent reaching goals INFORMATION LOGISTICS VERSUS SEARCH How context-sensitive information retrieval saves time spent reaching goals 2 Information logictics versus search Table of contents Page Topic 3 Search 3 Basic methodology

More information

Precision and Relative Recall of Search Engines: A Comparative Study of Google and Yahoo

Precision and Relative Recall of Search Engines: A Comparative Study of Google and Yahoo and Relative Recall of Engines: A Comparative Study of Google and Yahoo B.T. Sampath Kumar J.N. Prakash Kuvempu University Abstract This paper compared the retrieval effectiveness of the Google and Yahoo.

More information

Semantic Search in Portals using Ontologies

Semantic Search in Portals using Ontologies Semantic Search in Portals using Ontologies Wallace Anacleto Pinheiro Ana Maria de C. Moura Military Institute of Engineering - IME/RJ Department of Computer Engineering - Rio de Janeiro - Brazil [awallace,anamoura]@de9.ime.eb.br

More information

Information Retrieval Systems in XML Based Database A review

Information Retrieval Systems in XML Based Database A review Information Retrieval Systems in XML Based Database A review Preeti Pandey 1, L.S.Maurya 2 Research Scholar, IT Department, SRMSCET, Bareilly, India 1 Associate Professor, IT Department, SRMSCET, Bareilly,

More information

Text Classification Using Symbolic Data Analysis

Text Classification Using Symbolic Data Analysis Text Classification Using Symbolic Data Analysis Sangeetha N 1 Lecturer, Dept. of Computer Science and Applications, St Aloysius College (Autonomous), Mangalore, Karnataka, India. 1 ABSTRACT: In the real

More information

CHAPTER 3 DATA MINING AND CLUSTERING

CHAPTER 3 DATA MINING AND CLUSTERING CHAPTER 3 DATA MINING AND CLUSTERING 3.1 Introduction Nowadays, large quantities of data are being accumulated. The amount of data collected is said to be almost doubled every 9 months. Seeking knowledge

More information

Towards SoMEST Combining Social Media Monitoring with Event Extraction and Timeline Analysis

Towards SoMEST Combining Social Media Monitoring with Event Extraction and Timeline Analysis Towards SoMEST Combining Social Media Monitoring with Event Extraction and Timeline Analysis Yue Dai, Ernest Arendarenko, Tuomo Kakkonen, Ding Liao School of Computing University of Eastern Finland {yvedai,

More information

Keywords: Information Retrieval, Vector Space Model, Database, Similarity Measure, Genetic Algorithm.

Keywords: Information Retrieval, Vector Space Model, Database, Similarity Measure, Genetic Algorithm. Volume 3, Issue 8, August 2013 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Effective Information

More information

Investigation of Latent Semantic Analysis for Clustering of Czech News Articles

Investigation of Latent Semantic Analysis for Clustering of Czech News Articles Investigation of Latent Semantic Analysis for Clustering of Czech News Articles Michal Rott, Petr Cerva Institute of Information Technology and Electronics Technical University of Liberec Studentska 2,

More information

International Journal of Electronics and Computer Science Engineering 1449

International Journal of Electronics and Computer Science Engineering 1449 International Journal of Electronics and Computer Science Engineering 1449 Available Online at www.ijecse.org ISSN- 2277-1956 Neural Networks in Data Mining Priyanka Gaur Department of Information and

More information

Component visualization methods for large legacy software in C/C++

Component visualization methods for large legacy software in C/C++ Annales Mathematicae et Informaticae 44 (2015) pp. 23 33 http://ami.ektf.hu Component visualization methods for large legacy software in C/C++ Máté Cserép a, Dániel Krupp b a Eötvös Loránd University mcserep@caesar.elte.hu

More information

Optimization of Search Results with Duplicate Page Elimination using Usage Data A. K. Sharma 1, Neelam Duhan 2 1, 2

Optimization of Search Results with Duplicate Page Elimination using Usage Data A. K. Sharma 1, Neelam Duhan 2 1, 2 Optimization of Search Results with Duplicate Page Elimination using Usage Data A. K. Sharma 1, Neelam Duhan 2 1, 2 Department of Computer Engineering, YMCA University of Science & Technology, Faridabad,

More information

Performance Analysis of Clustering using Partitioning and Hierarchical Clustering Techniques. Karunya University, Coimbatore, India. India.

Performance Analysis of Clustering using Partitioning and Hierarchical Clustering Techniques. Karunya University, Coimbatore, India. India. Vol.7, No.6 (2014), pp.233-240 http://dx.doi.org/10.14257/ijdta.2014.7.6.21 Performance Analysis of Clustering using Partitioning and Hierarchical Clustering Techniques S. C. Punitha 1, P. Ranjith Jeba

More information

A Lightweight Solution to the Educational Data Mining Challenge

A Lightweight Solution to the Educational Data Mining Challenge A Lightweight Solution to the Educational Data Mining Challenge Kun Liu Yan Xing Faculty of Automation Guangdong University of Technology Guangzhou, 510090, China catch0327@yahoo.com yanxing@gdut.edu.cn

More information

Inner Classification of Clusters for Online News

Inner Classification of Clusters for Online News Inner Classification of Clusters for Online News Harmandeep Kaur 1, Sheenam Malhotra 2 1 (Computer Science and Engineering Department, Shri Guru Granth Sahib World University Fatehgarh Sahib) 2 (Assistant

More information

Clustering Technique in Data Mining for Text Documents

Clustering Technique in Data Mining for Text Documents Clustering Technique in Data Mining for Text Documents Ms.J.Sathya Priya Assistant Professor Dept Of Information Technology. Velammal Engineering College. Chennai. Ms.S.Priyadharshini Assistant Professor

More information

Dr. Antony Selvadoss Thanamani, Head & Associate Professor, Department of Computer Science, NGM College, Pollachi, India.

Dr. Antony Selvadoss Thanamani, Head & Associate Professor, Department of Computer Science, NGM College, Pollachi, India. Enhanced Approach on Web Page Classification Using Machine Learning Technique S.Gowri Shanthi Research Scholar, Department of Computer Science, NGM College, Pollachi, India. Dr. Antony Selvadoss Thanamani,

More information

SERG. Reconstructing Requirements Traceability in Design and Test Using Latent Semantic Indexing

SERG. Reconstructing Requirements Traceability in Design and Test Using Latent Semantic Indexing Delft University of Technology Software Engineering Research Group Technical Report Series Reconstructing Requirements Traceability in Design and Test Using Latent Semantic Indexing Marco Lormans and Arie

More information

IMPROVING DATA INTEGRATION FOR DATA WAREHOUSE: A DATA MINING APPROACH

IMPROVING DATA INTEGRATION FOR DATA WAREHOUSE: A DATA MINING APPROACH IMPROVING DATA INTEGRATION FOR DATA WAREHOUSE: A DATA MINING APPROACH Kalinka Mihaylova Kaloyanova St. Kliment Ohridski University of Sofia, Faculty of Mathematics and Informatics Sofia 1164, Bulgaria

More information

Domain Classification of Technical Terms Using the Web

Domain Classification of Technical Terms Using the Web Systems and Computers in Japan, Vol. 38, No. 14, 2007 Translated from Denshi Joho Tsushin Gakkai Ronbunshi, Vol. J89-D, No. 11, November 2006, pp. 2470 2482 Domain Classification of Technical Terms Using

More information

RRSS - Rating Reviews Support System purpose built for movies recommendation

RRSS - Rating Reviews Support System purpose built for movies recommendation RRSS - Rating Reviews Support System purpose built for movies recommendation Grzegorz Dziczkowski 1,2 and Katarzyna Wegrzyn-Wolska 1 1 Ecole Superieur d Ingenieurs en Informatique et Genie des Telecommunicatiom

More information

REVIEW ON QUERY CLUSTERING ALGORITHMS FOR SEARCH ENGINE OPTIMIZATION

REVIEW ON QUERY CLUSTERING ALGORITHMS FOR SEARCH ENGINE OPTIMIZATION Volume 2, Issue 2, February 2012 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: A REVIEW ON QUERY CLUSTERING

More information

Course 803401 DSS. Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization

Course 803401 DSS. Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization Oman College of Management and Technology Course 803401 DSS Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization CS/MIS Department Information Sharing

More information

An Empirical Approach for Document Clustering in Forensic Analysis: A Review

An Empirical Approach for Document Clustering in Forensic Analysis: A Review An Empirical Approach for Document Clustering in Forensic Analysis: A Review Tanushri Potphode, Prof. Amit Pimpalkar Abstract: Now a day, in the world of digital technologies especially in computer world,

More information

Automatic Annotation Wrapper Generation and Mining Web Database Search Result

Automatic Annotation Wrapper Generation and Mining Web Database Search Result Automatic Annotation Wrapper Generation and Mining Web Database Search Result V.Yogam 1, K.Umamaheswari 2 1 PG student, ME Software Engineering, Anna University (BIT campus), Trichy, Tamil nadu, India

More information

Fuzzy-Set Based Information Retrieval for Advanced Help Desk

Fuzzy-Set Based Information Retrieval for Advanced Help Desk Fuzzy-Set Based Information Retrieval for Advanced Help Desk Giacomo Piccinelli, Marco Casassa Mont Internet Business Management Department HP Laboratories Bristol HPL-98-65 April, 998 E-mail: [giapicc,mcm]@hplb.hpl.hp.com

More information

Practical Graph Mining with R. 5. Link Analysis

Practical Graph Mining with R. 5. Link Analysis Practical Graph Mining with R 5. Link Analysis Outline Link Analysis Concepts Metrics for Analyzing Networks PageRank HITS Link Prediction 2 Link Analysis Concepts Link A relationship between two entities

More information

A Case Retrieval Method for Knowledge-Based Software Process Tailoring Using Structural Similarity

A Case Retrieval Method for Knowledge-Based Software Process Tailoring Using Structural Similarity A Case Retrieval Method for Knowledge-Based Software Process Tailoring Using Structural Similarity Dongwon Kang 1, In-Gwon Song 1, Seunghun Park 1, Doo-Hwan Bae 1, Hoon-Kyu Kim 2, and Nobok Lee 2 1 Department

More information

Generatin Coherent Event Schemas at Scale

Generatin Coherent Event Schemas at Scale Generatin Coherent Event Schemas at Scale Niranjan Balasubramanian, Stephen Soderland, Mausam, Oren Etzioni University of Washington Presented By: Jumana Table of Contents: 1 Introduction 2 System Overview

More information

Accelerating and Evaluation of Syntactic Parsing in Natural Language Question Answering Systems

Accelerating and Evaluation of Syntactic Parsing in Natural Language Question Answering Systems Accelerating and Evaluation of Syntactic Parsing in Natural Language Question Answering Systems cation systems. For example, NLP could be used in Question Answering (QA) systems to understand users natural

More information

TIMELINE SUMMARIZATION

TIMELINE SUMMARIZATION Evolutionary Timeline Summarization S. N Deshmukh, S. S. Nandagaonkar M.E.(computer engg-ii), Professor, computer department Abstract Faced with thousands of news articles, people usually try to ask the

More information

A MACHINE LEARNING APPROACH TO FILTER UNWANTED MESSAGES FROM ONLINE SOCIAL NETWORKS

A MACHINE LEARNING APPROACH TO FILTER UNWANTED MESSAGES FROM ONLINE SOCIAL NETWORKS A MACHINE LEARNING APPROACH TO FILTER UNWANTED MESSAGES FROM ONLINE SOCIAL NETWORKS Charanma.P 1, P. Ganesh Kumar 2, 1 PG Scholar, 2 Assistant Professor,Department of Information Technology, Anna University

More information

Understanding Web personalization with Web Usage Mining and its Application: Recommender System

Understanding Web personalization with Web Usage Mining and its Application: Recommender System Understanding Web personalization with Web Usage Mining and its Application: Recommender System Manoj Swami 1, Prof. Manasi Kulkarni 2 1 M.Tech (Computer-NIMS), VJTI, Mumbai. 2 Department of Computer Technology,

More information

Keywords cosine similarity, correlation, standard deviation, page count, Enron dataset

Keywords cosine similarity, correlation, standard deviation, page count, Enron dataset Volume 4, Issue 1, January 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Cosine Similarity

More information

Optimization of Image Search from Photo Sharing Websites Using Personal Data

Optimization of Image Search from Photo Sharing Websites Using Personal Data Optimization of Image Search from Photo Sharing Websites Using Personal Data Mr. Naeem Naik Walchand Institute of Technology, Solapur, India Abstract The present research aims at optimizing the image search

More information

An ontology-based approach for semantic ranking of the web search engines results

An ontology-based approach for semantic ranking of the web search engines results An ontology-based approach for semantic ranking of the web search engines results Editor(s): Name Surname, University, Country Solicited review(s): Name Surname, University, Country Open review(s): Name

More information

Keywords Big Data; OODBMS; RDBMS; hadoop; EDM; learning analytics, data abundance.

Keywords Big Data; OODBMS; RDBMS; hadoop; EDM; learning analytics, data abundance. Volume 4, Issue 11, November 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Analytics

More information

Table of Contents. Chapter No. 1 Introduction 1. iii. xiv. xviii. xix. Page No.

Table of Contents. Chapter No. 1 Introduction 1. iii. xiv. xviii. xix. Page No. Table of Contents Title Declaration by the Candidate Certificate of Supervisor Acknowledgement Abstract List of Figures List of Tables List of Abbreviations Chapter Chapter No. 1 Introduction 1 ii iii

More information

Taxonomy learning factoring the structure of a taxonomy into a semantic classification decision

Taxonomy learning factoring the structure of a taxonomy into a semantic classification decision Taxonomy learning factoring the structure of a taxonomy into a semantic classification decision Viktor PEKAR Bashkir State University Ufa, Russia, 450000 vpekar@ufanet.ru Steffen STAAB Institute AIFB,

More information

Analysis and Synthesis of Help-desk Responses

Analysis and Synthesis of Help-desk Responses Analysis and Synthesis of Help-desk s Yuval Marom and Ingrid Zukerman School of Computer Science and Software Engineering Monash University Clayton, VICTORIA 3800, AUSTRALIA {yuvalm,ingrid}@csse.monash.edu.au

More information