KEYWORD SEARCH OVER PROBABILISTIC RDF GRAPHS



Similar documents
OPTIMAL MULTI SERVER CONFIGURATION FOR PROFIT MAXIMIZATION IN CLOUD COMPUTING

SECURITY ANALYSIS OF A SINGLE SIGN-ON MECHANISM FOR DISTRIBUTED COMPUTER NETWORKS

Online Student Attendance Management System using Android

Trust based Peer-to-Peer System for Secure Data Transmission ABSTRACT:

CloudFTP: A free Storage Cloud

Dynamic Resource allocation in Cloud

Detecting false users in Online Rating system & Securing Reputation

Cloud Cost Management for Customer Sensitive Data

DYNAMIC GOOGLE REMOTE DATA COLLECTION

DURGA SOFTWARE SOLUTUIONS,S.R NAGAR,HYDERABAD. Ph: , Abstract

Practical Graph Mining with R. 5. Link Analysis

Protein Protein Interaction Networks

Asking Hard Graph Questions. Paul Burkhardt. February 3, 2014

Secure cloud access system using JAR ABSTRACT:

PSG College of Technology, Coimbatore Department of Computer & Information Sciences BSc (CT) G1 & G2 Sixth Semester PROJECT DETAILS.

preliminary experiment conducted on Amazon EC2 instance further demonstrates the fast performance of the design.

An Empirical Study of Two MIS Algorithms

Lepide Active Directory Self Service. Configuration Guide. Follow the simple steps given in this document to start working with

SIGMOD RWE Review Towards Proximity Pattern Mining in Large Graphs

Graph Mining and Social Network Analysis

Branch-and-Price Approach to the Vehicle Routing Problem with Time Windows

Mining Social Network Graphs

Bandaru, Mounika; Gangishetti, Anil; and Putha, Sudharshan Reddy, "Attendance Tracker" (2015). All Capstone Projects. Paper 160.

1. Introduction 1.1 Methodology

Yandex: Webmaster Tools Overview and Guidelines

Automation for Customer Care System

Android City Tour Guide System

Social Media Mining. Network Measures

A Secure Intrusion detection system against DDOS attack in Wireless Mobile Ad-hoc Network Abstract

It is designed to resist the spam in the Internet. It can provide the convenience to the user and save the bandwidth of the network.

Sage Intelligence Financial Reporting for Sage ERP X3 Version 6.5 Installation Guide

Installing GFI MailArchiver

How To Set Up A Firewall Enterprise, Multi Firewall Edition And Virtual Firewall

Personalization of Web Search With Protected Privacy

Installing GFI MailArchiver

Graph Database Proof of Concept Report

Mining Large Datasets: Case of Mining Graph Data in the Cloud

Database Design Patterns. Winter Lecture 24

A THREE-TIERED WEB BASED EXPLORATION AND REPORTING TOOL FOR DATA MINING

Installing GFI MailArchiver

Procedia Computer Science 00 (2012) Trieu Minh Nhut Le, Jinli Cao, and Zhen He. trieule@sgu.edu.vn, j.cao@latrobe.edu.au, z.he@latrobe.edu.

Ching-Yung Lin, Ph.D. Adjunct Professor, Dept. of Electrical Engineering and Computer Science IBM Chief Scientist, Graph Computing. October 29th, 2015

ONLINE DEGREE-BOUNDED STEINER NETWORK DESIGN. Sina Dehghani Saeed Seddighin Ali Shafahi Fall 2015

IE 680 Special Topics in Production Systems: Networks, Routing and Logistics*

Globally Optimal Crowdsourcing Quality Management

How To Cluster Of Complex Systems

MapReduce Algorithms. Sergei Vassilvitskii. Saturday, August 25, 12

Stellar Phoenix. SQL Database Repair 6.0. Installation Guide

! E6893 Big Data Analytics Lecture 9:! Linked Big Data Graph Computing (I)

FREE VOICE CALLING IN WIFI CAMPUS NETWORK USING ANDROID

Top-k Exploration of Query Candidates for Efficient Keyword Search on Graph-Shaped (RDF) Data

FUZZY CLUSTERING ANALYSIS OF DATA MINING: APPLICATION TO AN ACCIDENT MINING SYSTEM

SAP InfiniteInsight 7.0 SP1

SQL Server 2008 R2 Express Installation for Windows 7 Professional, Vista Business Edition and XP Professional.

PRIVACY-PRESERVING PUBLIC AUDITING FOR SECURE CLOUD STORAGE

Effective User Navigation in Dynamic Website

Exchange Granular Restore User Guide

USERGUIDE. Introduction

Storage Systems Autumn Chapter 6: Distributed Hash Tables and their Applications André Brinkmann

FileMaker 11. ODBC and JDBC Guide

How To Use Gfi Mailarchiver On A Pc Or Macbook With Gfi From A Windows 7.5 (Windows 7) On A Microsoft Mail Server On A Gfi Server On An Ipod Or Gfi.Org (


Large-Scale Test Mining

G-SPARQL: A Hybrid Engine for Querying Large Attributed Graphs

SYSTEM SETUP FOR SPE PLATFORMS

Random vs. Structure-Based Testing of Answer-Set Programs: An Experimental Comparison

Part 2: Community Detection

An Ontology-based e-learning System for Network Security

Big Data and Scripting map/reduce in Hadoop

How To Use Gps Navigator On A Mobile Phone

So today we shall continue our discussion on the search engines and web crawlers. (Refer Slide Time: 01:02)

Nan Kong, Andrew J. Schaefer. Department of Industrial Engineering, Univeristy of Pittsburgh, PA 15261, USA

Server Installation Guide ZENworks Patch Management 6.4 SP2

Efficient Identification of Starters and Followers in Social Media

FileMaker 12. ODBC and JDBC Guide

G Porcupine. Robert Grimm New York University


Network device management solution

HP OpenView Service Desk Process Insight 2.10 software

System Requirements Table of contents

Lesson Plans Configuring Exchange Server 2007

Emergency Alert System using Android Text Message Service ABSTRACT:

IBM SPSS Modeler Social Network Analysis 15 User Guide

Network (Tree) Topology Inference Based on Prüfer Sequence

Exchange Granular Restore Instructional User Guide

Measuring the Performance of an Agent

International Journal of Advance Research in Computer Science and Management Studies

Service Desk Intelligence System Requirements

Operating System Installation Guide

AHUDesigner. The Air Handling Units selection software. Product description

Social Media Mining. Graph Essentials

Project Report BIG-DATA CONTENT RETRIEVAL, STORAGE AND ANALYSIS FOUNDATIONS OF DATA-INTENSIVE COMPUTING. Masters in Computer Science

Microsoft Office Outlook 2013: Part 1

Exchange Granular Restore. User Guide

Implementing Graph Pattern Queries on a Relational Database

Software Engineering I CS524 Professor Dr. Liang Sheldon X. Liang

Cloud Data Protection for the Masses

DomainKeys Identified Mail DKIM authenticates senders, message content

A collaborative platform for knowledge management

NETWRIX EVENT LOG MANAGER

Transcription:

ABSTRACT KEYWORD SEARCH OVER PROBABILISTIC RDF GRAPHS In many real applications, RDF (Resource Description Framework) has been widely used as a W3C standard to describe data in the Semantic Web. In practice, RDF data may often suffer from the unreliability of their data sources, and exhibit errors or inconsistencies. In this paper, we model such unreliable RDF data by probabilistic RDF graphs, and study an important problem, keyword search query over probabilistic RDF graphs (namely, the pg-kws query). To retrieve meaningful keyword search answers, we design the score rankings for subgraph answers specific for RDF data. Furthermore, we propose effective pruning methods (via offline pre-computed score bounds and probabilistic threshold) to quickly filter out false alarms. We construct an index over the pre-computed data for RDF, and present an efficient query answering approach through the index. Extensive experiments have been conducted to verify the effectiveness and efficiency of our proposed approaches. EXISTING SYSTEM

The keyword search on deterministic (general) graphs, to our best knowledge, no prior works studied the pg-kws problem on probabilistic RDF graphs. While the feature score uses a hybrid of structural and IR scores that are similar to existing ranking scores,the entropy score can reflect our preference to keyword distributions over RDF subgraph entities. The existing works on keyword search over RDF graphs often assume that, labels of vertices in RDF graphs are certain. There are some existing works on modeling probabilistic RDF data. PROPOSE SYSTEM we propose effective pruning methods (via offline pre-computed score bounds and probabilistic threshold) to quickly filter out false alarms. Extensive experiments have been conducted to verify the effectiveness and efficiency of our proposed approaches. proposed to answer keyword search queries on certain graphs.

we can transform probabilistic RDF graph with uncertain vertex/edge keywords to the one with uncertain keywords in vertices only, to which we apply our proposed approaches. we will utilize this entropy concept to propose a metric that can indicate our preference to RDF keyword search results in probabilistic RDF graphs. we will propose two pruning strategies, score bound pruning and probabilistic pruning, which utilize score bounds or probabilistic threshold, respectively, to enable the pruning. our proposed pruning methods via score bounds. we propose a heuristicbased algorithm, which obtains w PWGs of a probabilistic subgraph g with low cost. we report the experimental results of our proposed approaches for answering pg-kws queries on both real and synthetic data. The proposed techniques explored pruning methods with graph structures and matching probabilities. various probabilistic queries over uncertain data have been proposed, including probabilistic range query (PRQ), nearest neighbor

(PNN), reverse nearest neighbor (PRNN), skyline (PSQ), reverse skyline (PRSQ), and similarity join.(psj). The bidirectional search, which uses the pre-computed distance between keywords and nodes, and obtains the bounds of ranking scores to enable fast pruning and retrieval. ALGORITHM Inspired by the complexity of finding a good PWG partitioning, we propose a heuristicbased algorithm, which obtains w PWGs of a probabilistic subgraph g with low cost. We execute the randomized algorithm for iter rounds, and select the one with the lowest cost as the output of function F( ). our PWG partitioning algorithm will iteratively obtain PWG partitioning strategy PS(i; j),as well as its corresponding cost PS(i; j):cost. Specifically, the algorithm maintains a cost matrix, cost matrix, in which each element cost matrix[i][j] = PS(i; j):cost for i j.

The time complexity of the algorithm is given by O( V (g) w C), where C is the cost to evaluate the cost model. SYSTEM CONFIGURATION SOFTWARE REQUIREMENTS: Operating System : Windows Technology : Java and J2EE Web Technologies : Html, JavaScript, CSS IDE : My Eclipse Web Server : Tomcat Tool kit : Android Phone Database : My SQL Java Version : J2SDK1.5 HARDWARE REQUIREMENTS: Hardware : Pentium

Speed : 1.1 GHz RAM : 1GB Hard Disk : 20 GB Floppy Drive : 1.44 MB Key Board : Standard Windows Keyboard Mouse : Two or Three Button Mouse Monitor : SVGA IMPLEMENTATION: Implementation is the stage of the project when the theoretical design is turned out into a working system. Thus it can be considered to be the most critical stage in achieving a successful new system and in giving the user, confidence that the new system will work and be effective. The implementation stage involves careful planning, investigation of the existing system and it s constraints on implementation, designing of methods to achieve changeover and evaluation of changeover methods. MODULE DESCRIPTION:

Number of Modules; After careful analysis the system has been identified to have the following modules: 1. Keyword searching based Social Authentication Module 2. Security Module 3. Search Module 4. Graph Module 1.Keyword searching based Social Authentication Module: A Keyword -based social authentication includes two phases, User Phase: The system prepares keyword searching for a user registration in this phase. Specifically, this is first authenticated with her main authenticator (i.e., password) and then a few(e.g., 5) friends, who also have accounts in the system, are selected by either herself or the service provider from registration friend list and are appointed as Keyword searching.

Admin Phase: The Admin is using for files upload for system. Specifically, this is which file is more then seeing is ease to findout. 2.Security Module: Authentication is essential for securing your account and preventing spoofed messages from damaging your online reputation. Imagine a phishing email being sent from your mail because someone had forged your information. Angry recipients and spam complaints resulting from it become your mess to clean up, in order to repair your reputation. keywordsearching-based social authentication systems ask users to select their own trustees without any constraint. In our experiments we show that the service provider can constrain keywordsearching selections via imposing that no users are selected as keywordsearching by too many other users, which can achieve better security guarantees. 3.Search Module we use a measure, which is the (summed) difference between upper and lower score bounds for keyword pairs (or keywords).

Different from leaf nodes, aggregates in non-leaf nodes summarize bounds or keyword existence information for all possible worlds of graphs (rather than w PWGs). Thus, we do not need to record existence Intuitively, probabilistic r-radius graphs with similar keywords tend to have similar score bounds. This way, tree nodes on different levels can be iteratively built until a final root is obtained we randomly generate m possible keywords (each keyword is hashed to an integer within range [1; 100]), as well as conditional probability tables (CPTs) that depend on vertex keywords from its incoming edges, where m is randomly picked up within [1; 5] by default. We tested two types of distributions for uncertain vertex keywords, Uniform and Skew and denote their corresponding data sets as Uniform and Skew, respectively. 4.Graph Module Which file is more than searching show the graph. we first generate a schema graph, by producing vertices that represent RDF entities, and randomly connecting vertices via directed edges, where the averages of vertex out-degrees and in-degrees are set to by default. CONCLUSIONS: In this paper, we formulate and tackle an important problem of keyword search over probabilistic RDF graphs, called pg-kws queries. We

propose effective pruning methods via offline pre-computed score bounds and probabilistic threshold to quickly filter out false alarms. Moreover, we construct an index for the pre-computed data for RDF, and present an efficient query answering approach. Extensive experiments have been conducted to verify the effectiveness and efficiency of our proposed approaches.