PRICE SEARCH ENGINE (SUMMARY)





The internet's influence on daily life and the extent of its use grow day by day. Used at first only for communication and for finding information, the internet now offers its users many functions. One of the areas where it is widely used is online shopping. Thanks to the steadily growing number of online shopping sites, users can now shop comfortably over the internet.

As the number of online shopping sites has grown, a new problem has emerged: finding the best deal and the most suitable provider for a given product. Getting the best deal means visiting many sites and comparing them. The problem becomes intractable when more than one product is to be bought.

In this project, a Price Search Engine was developed that lets users who shop online learn, from a single web page, the best online shopping terms for the product or products they are looking for, and, in light of this information, shop more cheaply and more quickly.

The project consists essentially of two stages: collecting product and product price information from online shopping sites, and presenting the collected information to users through a website. To provide these functions, six project components that work in interaction with one another were designed and developed. The components are:

- Core Library
- Database
- Site Adding Wizard
- Crawler
- Product Integration XML Web Service
- Website

As a result of development and testing, the project components were completed and a working system that provides the expected features was obtained.

PRICE SEARCH ENGINE (SUMMARY)

The internet's effect on daily life and the extent of its use grow day by day. In its early days, the internet was used for communication and for access to basic information on general topics. Today it offers users many other features. One of its most widespread uses is online shopping. Through online shopping and e-commerce websites, whose number increases day by day, users can meet their shopping needs online.

With the rapid increase in the number of online shopping websites, the problem of finding the most suitable provider has emerged. To get the best deal, users have to search many online shopping websites and compare them. The problem becomes much harder when the user is looking for more than one product, or a group of products, to buy together from one website.

In this project, a Price Search Engine was developed that gives users information about a product or group of products they intend to buy online. In light of this information, users can shop online more cheaply and much faster.

The project basically consists of collecting product information and product price information from online shopping websites and presenting the collected information to users on the system website. To achieve this functionality, six separate modules that interact with each other were designed and developed. These modules are:

- Core Library
- Database
- Site Adding Wizard
- Crawler
- Product Integration XML Web Service
- Website

As an innovation, a new approach to the crawling mechanism was developed. This mechanism is explained in the Pattern Selector, Site Adding Wizard and Crawler topics below. In addition, two new features were developed that do not exist in the Price Search Engines currently active in our country:

- Storing the product price information collected in earlier crawls and presenting it to users through a chart interface
- Enabling users to search for more than one product and compare the results, so that they can buy the products together from one online shopping website

Pattern Selector:

Most Price Search Engines use product integration or crawling to collect the product information that exists on online shopping websites. When crawling is used, the general approach is to develop a separate search method for each website to be crawled. The reason is that a crawler program receives only the HTML source code of a web page, obtained through HttpWebRequest and WebResponse. For a crawler that has to extract specific information from that source code, it is almost impossible to do so automatically with any parsing method, even if artificial intelligence technologies are used. Thus, for a crawler to collect accurate information, the general approach is to develop a different search method for each website to be crawled.

This project aims to develop a generic crawling technique that can crawl any online shopping website with a single search method. To achieve this, generic product info and product price info data structures were designed first.

The generic product info data structure consists of these elements:

- Product name
- Category of product
- Brand of product
- Product image
- Address of the product detail web page
- Last updated date
- Stock status

The generic product price info data structure consists of these elements:

- Raw price (excluding taxes)
- Final price (including taxes)
- Special price (money order discount, etc.)
- Discount price (last-5-days discount, etc.)
- Last updated date

With the generic product info and generic product price info structures, a generic crawling method becomes possible. A human operator is needed only once, when an online shopping website is added to the system to be crawled. The human operator shows the system the fields that correspond to product info and product price info on any product detail page of the website being added; after that, the crawler crawls all products on that website automatically.
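As an illustration, the two generic structures listed above could be modeled roughly as follows. This is a minimal Python sketch, not the project's actual entity classes: the class names, field names and types (Decimal for prices, a boolean for stock status, optional special and discount prices) are assumptions, as are the example product and URLs. Only the list of elements comes from the text.

```python
from dataclasses import dataclass
from datetime import datetime
from decimal import Decimal
from typing import Optional


@dataclass
class ProductInfo:
    """Generic product info; names and types are illustrative assumptions."""
    name: str
    category: str
    brand: str
    image_url: str          # product image
    detail_page_url: str    # address of the product detail web page
    last_updated: datetime
    in_stock: bool          # stock status, assumed here to be a simple yes/no


@dataclass
class ProductPriceInfo:
    """Generic product price info; names and types are illustrative assumptions."""
    raw_price: Decimal                         # excluding taxes
    final_price: Decimal                       # including taxes
    last_updated: datetime
    special_price: Optional[Decimal] = None    # e.g. money order discount
    discount_price: Optional[Decimal] = None   # e.g. limited-time discount


# Example records for a hypothetical product on a hypothetical shop.
crawled_at = datetime(2024, 1, 15, 12, 0)
kettle = ProductInfo(
    name="Blue Kettle",
    category="Kitchen",
    brand="ExampleBrand",
    image_url="https://shop.example/images/kettle.jpg",
    detail_page_url="https://shop.example/products/kettle",
    last_updated=crawled_at,
    in_stock=True,
)
kettle_price = ProductPriceInfo(
    raw_price=Decimal("16.58"),
    final_price=Decimal("19.90"),
    last_updated=crawled_at,
)
```

Because every shop is mapped onto the same two records, the crawler needs only one extraction routine; what differs per website is where each field is located on the page.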
Pattern Selector is the technique used while the human operator shows the system the fields that correspond to product info and product price info. An HTML web page can be represented as an HTML document tree, and some elements of this tree contain the information that the human operator shows to the system. When the human operator clicks a node on the web page to mark it as a product info node, Pattern Selector analyzes that node and finds its position in the HTML document tree.

On some websites, especially those developed with technologies such as ASP.NET or JSP, all nodes in the HTML document tree have id attributes, so the position of a node can easily be determined from the tree. On websites developed with technologies such as PHP or ASP, however, some nodes may not have an id attribute, so determining their position becomes much harder. For this case, Pattern Selector includes a method that walks from the node up to the root element (usually the <html> element), recording at each step the node's position among the children of its parent. The result is a route through the document tree, called a TagPath. Using the TagPath, the same node can easily be found again by starting from the root element and following the route in reverse. The crawler can therefore reach any node marked as product info or product price info by using either the node's id attribute or its TagPath. This works for every product detail page of the same website, because the pages are created dynamically and so share the same page structure.

Project Modules:

The modules developed in the project are explained below.

Core Library:
This module consists of the following structures:

- Data modeling classes (entity classes), which represent the data structures used in the project domain
- Manager classes for the data modeling classes (entity manager classes), which manage database operations such as create, read, update and delete on the entity classes
- Data access classes, which run commands on the SQL database
- Business tier classes, which run various operations and are shared by the other project modules

Database:
The database consists of tables that correspond to the entity classes and tables that are used by the website.

Site Adding Wizard:
This module is a desktop application that a human operator uses to add online shopping websites to the system to be crawled. The application consists of an integrated web browser control and the Pattern Selector structure. Its wizard-style interface asks the human operator to show each required info field on a product detail page of the website to be crawled.
The human operator simply clicks on the fields that the application asks for. Once the operator clicks on a field, Pattern Selector finds the position of the underlying HTML node in the page's HTML document tree. At the end, the human operator clicks a button to save the information about that online shopping website to the database.

Crawler:
The crawler is a desktop application that runs in the system tray of the operating system. It keeps track of the online shopping websites that the human operator has added to the system with the Site Adding Wizard. For each website in the system, the crawler creates a thread and starts crawling that website. It starts from the website's home page and follows the hyperlinks that stay within the same website. If a crawled page is a product detail page, the crawler extracts the product info and product price info, using the id attribute or TagPath stored for each required field, and saves them to the database. For each product found, the product image is also downloaded, resized and saved to the product images folder of the system website. If the crawler program is closed by the user or interrupted by an exception, the crawling data for each website is saved to disk, and after the application restarts, crawling continues from where it left off.

Product Integration XML Web Service:
A web service was developed for online shopping websites that choose to integrate product info directly from their own database. The web service has a method that takes product info and product price info and saves them to the database. Before saving, it authenticates the web service user with a username and password, which the system operator provides to that user and which are kept in the database.

Website:
The website module shows users the products that were collected by the crawler or added through the integration web service. Users can run product searches and product group searches and view the results, which consist of product info, product price info (also shown in a chart interface), and user comments and votes on the product. A user can also comment on a product, vote on it, and add it to their own alarm list. When the price of a product in a user's alarm list changes, the user is informed by e-mail.

Conclusion:
As a result of the development and test phases, the project modules were completed and a working system that satisfies the expected features was achieved. According to the test results, however, crawling speed remains too slow for a realistic environment, given that most popular online shopping websites have a very large product range containing tens of thousands of products. The crawler application could therefore be improved to run in parallel on many computers to achieve a faster crawling process.
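The id-or-TagPath lookup described in the Pattern Selector section can be illustrated with a small sketch. It is written in Python rather than the project's own technology, and it assumes well-formed markup so that the standard library's `xml.etree.ElementTree` can parse it; real shop pages would need a tolerant HTML parser. The function names `make_locator` and `resolve_locator` and the two sample pages are hypothetical.

```python
import xml.etree.ElementTree as ET


def make_locator(root, node):
    """Return ("id", value) if the node has an id, else ("tagpath", [child indexes])."""
    node_id = node.get("id")
    if node_id:
        return ("id", node_id)
    # ElementTree nodes have no parent pointer, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    path = []
    current = node
    while current is not root:
        parent = parent_of[current]
        # Record the node's position among the children of its parent.
        path.append(list(parent).index(current))
        current = parent
    path.reverse()  # store the route from the root downwards
    return ("tagpath", path)


def resolve_locator(root, locator):
    """Find the node that a locator points to in another page of the same site."""
    kind, value = locator
    if kind == "id":
        return root.find(f".//*[@id='{value}']")
    node = root
    for index in value:
        node = node[index]
    return node


# Two product detail pages that share one structure (hypothetical content).
PAGE_A = (
    "<html><head><title>Shop</title></head><body>"
    "<div><h1>Blue Kettle</h1><span>19.90</span></div>"
    '<div id="stock">In stock</div></body></html>'
)
PAGE_B = (
    "<html><head><title>Shop</title></head><body>"
    "<div><h1>Red Toaster</h1><span>34.50</span></div>"
    '<div id="stock">Out of stock</div></body></html>'
)

page_a = ET.fromstring(PAGE_A)
page_b = ET.fromstring(PAGE_B)

# The operator "clicks" the price node (no id) and the stock node (has an id) on page A.
price_locator = make_locator(page_a, page_a[1][0][1])
stock_locator = make_locator(page_a, page_a[1][1])

# The crawler reuses both locators on page B.
price_on_b = resolve_locator(page_b, price_locator).text
stock_on_b = resolve_locator(page_b, stock_locator).text
```

A locator built this way depends on the page layout staying the same, which is the assumption the text makes for dynamically generated product pages; if a site changes its template, the operator would have to mark the fields again.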