Block-o-Matic: a Web Page Segmentation Tool and its Evaluation
To cite this version: Andrés Sanoja, Stéphane Gançarski. Block-o-Matic: a Web Page Segmentation Tool and its Evaluation. 29ème journées Base de données avancées, BDA 13, Oct 2013, Nantes, France. <hal >
HAL Id: hal
Submitted on 8 Nov 2013
Block-o-Matic: a Web Page Segmentation Tool and its Evaluation¹
Andrés Sanoja, Stéphane Gançarski
Laboratoire d'informatique Paris 6 (LIP6)

1 Introduction

Web page segmentation is an extensively studied topic. It refers to the process of dividing a Web page into visually and semantically coherent segments called blocks. Segmentation benefits a variety of Web applications: mobile web, voice browsers, web page phishing detection, duplicate detection, information retrieval, information extraction, user interest detection, evaluation of visual quality (aesthetics), caching, archiving, publishing dynamic content, semantic annotation, web accessibility, and web page classification and clustering. For instance, it can be used to remove noisy content when displaying web pages on devices with small screens, or to block only the part of a web page whose content is inappropriate for some audiences instead of blocking the whole page.

There are different approaches to web page segmentation, most of them described in [5]: structure based, layout based, hybrid, image based, fixed-length and text based. The structure based approach uses only the HTML tags, either as they appear in the source code of the web page or through the DOM interface of the page after rendering in a web browser. The layout based approach focuses on the visual properties of the page. The hybrid approach uses both the structure and the layout of the page to create a hierarchy of blocks through block extraction and recursive refinement. The image based approach takes an image of a web page and applies image processing techniques to detect blocks. The fixed-length based approach removes all the semantic information (tags) from the page and then uses fixed-length algorithms to segment it. The text based approach retrieves segments from web pages based on properties of the text, such as passage similarity and clustering.
We think that the hybrid approach gives a good perception of the layout of a web page, because it analyzes the document from the structural perspective and also evaluates the visual and geometric aspects of the page. However, these aspects are not enough to obtain a good description of the page. Blocks should be related to a role in the layout, in order to give more information, for instance the layout flow, to applications that depend on web page segmentation. To accomplish this, the logical structure of the web page should be taken into account, along with the content and the geometric structure as used in [1].

We propose a tool for web page segmentation: Block-o-Matic. It is based on an original model which combines two popular approaches for web page and document analysis: the Vision-based model and the Geometric Layout Model [1, 4]. A web page is processed to build three structures: the content, geometric and logical structures, each giving different information about the page. The content structure distinguishes the HTML elements used to hold content from those that serve to organize it. The geometric structure represents the page objects based on their presentation, and also allows the block shape to be adjusted to the geometry of the content. The logical structure describes the connections between blocks based on the page layout, on the geometric properties of the blocks and on the properties of the content elements [2]. The outcome of the processing is a segmented web page, which is a consolidated view of the three structures mentioned above.

In this context, one important challenge is the evaluation of the segmentation algorithm. In order to compare one algorithm with another, it is crucial to have an initial evaluation, validated by human assessors, to check its correctness. The evaluators' task is to indicate which parts of the web page represent semantically coherent segments and, at the same time, to obtain a semantic partition of the page. This information must then be translated in order to identify which HTML elements correspond to those segments. This process is tedious and error prone. Building a ready-to-use ground truth database can take from several days to months, depending on the collection size.

To handle these challenges, we propose a new tool for the pagelyzer suite [6], Block-o-Manual, which allows web page segmentation to be performed manually. It is intended to help assessors explore and segment web pages manually, as efficiently as possible, and to reduce the intervention of the analyst in the process. Together, these allow us to evaluate its automatic counterpart: the Block-o-Matic segmentation tool (in the same suite).

Ruby was used as the programming language for implementing the Block-o-Matic tool, and the Nokogiri libraries are used for HTML/XML manipulation. Block-o-Manual is provided as a Firefox extension. It is implemented using the Mozilla Add-on Builder and pure JavaScript, with its backend in PHP5.

There are several similar tools for annotating web pages; however, they are oriented to text content annotation, notes, highlighting and text formatting. In our case we want to reveal the structural elements used to organize the content of a web page: blocks.

In the rest of this paper, we first present the Block-o-Matic segmentation model, then the prototype architecture, and finally the demonstration scenario.

¹ This work was partially supported by the SCAPE Project. The SCAPE project is co-funded by the European Union under FP7 ICT (Grant Agreement number ).

2 Block-o-Matic Segmentation Model

We extend the concepts of [4], a model for segmenting digitized documents in the optical character recognition domain, and adapt it to web pages. The content structure is the DOM interface after rendering a page in a browser.
The geometric structure describes a page with respect to the geometric properties of its content elements. The logical structure describes the connections between blocks at the layout level. For example, if one logical block is found at the top of the page and the next one at the bottom, this suggests a vertical layout flow.

As shown in Figure 1, the segmentation process of a page (W) is divided into three phases: page analysis, page understanding and page reconstruction. Extracting the geometric structure (Wgeom) from the content structure (Wcont) is called page analysis (c2g function). Mapping the geometric structure (Wgeom) into a logical structure (Wlog) is called page understanding (g2l function). Rebuilding the web page from the logical (Wlog), geometric (Wgeom) and content (Wcont) structures is called page reconstruction (Rec function), where

W' = Rec(Wcont, c2g(Wcont), g2l(c2g(Wcont)))

The c2g function constructs the geometric structure from the content structure. It classifies each element, according to its tag, as a content, container, content-container or default element. Additionally, it checks that sibling blocks do not overlap. The g2l function constructs the logical structure from the geometric structure. It assigns labels to geometric blocks depending on their position in the web page, the relative area they occupy and the visual cues of their content elements. The labels are: left, right, top, bottom, middle or, otherwise, standard. The Rec function builds the page W', which is the original document W enriched with all three structures, as a linear document in an XML-like style.

Figure 1. Web page segmentation process
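The composition of the three functions can be illustrated with a small Python sketch. It is not the Block-o-Matic implementation (which is written in Ruby): the `Element` and `Geom` types, the restriction to top-level children, the position thresholds (0.2, 0.3, 0.7, 0.8) and the output format are all assumptions made for illustration, and only the position of a block is used for labelling, not its relative area or visual cues.

```python
from dataclasses import dataclass, field
from typing import List
from xml.sax.saxutils import quoteattr


@dataclass
class Element:
    """Hypothetical rendered element of Wcont, with its bounding box in pixels."""
    tag: str
    x: float
    y: float
    w: float
    h: float
    children: List["Element"] = field(default_factory=list)


@dataclass
class Geom:
    """Hypothetical geometric structure: page size plus candidate blocks."""
    page_w: float
    page_h: float
    blocks: List[Element]


def c2g(w_cont: Element) -> Geom:
    """Page analysis (simplified): keep top-level children with non-zero area."""
    blocks = [e for e in w_cont.children if e.w * e.h > 0]
    return Geom(w_cont.w, w_cont.h, blocks)


def g2l(w_geom: Geom) -> List[str]:
    """Page understanding (simplified): one label per block, from position only."""
    labels = []
    for e in w_geom.blocks:
        cx = (e.x + e.w / 2) / w_geom.page_w  # relative centre, 0..1
        cy = (e.y + e.h / 2) / w_geom.page_h
        wide = e.w >= 0.8 * w_geom.page_w     # assumed threshold
        if wide and cy < 0.2:
            labels.append("top")
        elif wide and cy > 0.8:
            labels.append("bottom")
        elif cx < 0.3:
            labels.append("left")
        elif cx > 0.7:
            labels.append("right")
        elif 0.2 <= cy <= 0.8:
            labels.append("middle")
        else:
            labels.append("standard")
    return labels


def rec(w_cont: Element, w_geom: Geom, w_log: List[str]) -> str:
    """Page reconstruction: emit a linear, XML-like document (assumed format)."""
    lines = [f"<page root={quoteattr(w_cont.tag)}>"]
    for e, label in zip(w_geom.blocks, w_log):
        lines.append(
            f"  <block label={quoteattr(label)} tag={quoteattr(e.tag)} "
            f'x="{e.x:g}" y="{e.y:g}" w="{e.w:g}" h="{e.h:g}"/>'
        )
    lines.append("</page>")
    return "\n".join(lines)


def segment(w_cont: Element) -> str:
    """W' = Rec(Wcont, c2g(Wcont), g2l(c2g(Wcont)))."""
    w_geom = c2g(w_cont)
    return rec(w_cont, w_geom, g2l(w_geom))


# Made-up 1000x1000 page used only to exercise the sketch.
page = Element("body", 0, 0, 1000, 1000, [
    Element("header", 0, 0, 1000, 100),
    Element("nav", 0, 100, 200, 800),
    Element("div", 200, 100, 800, 800),
    Element("footer", 0, 900, 1000, 100),
    Element("span", 0, 0, 0, 0),  # zero area: dropped by c2g
])
print(segment(page))
```

On this made-up page the four visible blocks are labelled top, left, middle and bottom, in that order. Keeping the labels in a list parallel to the blocks keeps the three structures separate until Rec consolidates them.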
3 Prototype Architecture

The prototype comprises the Block-o-Matic and Block-o-Manual components, which work in conjunction with the other components of the pagelyzer suite. At a glance, this suite is made up of four main components: capture, analyzer, comparator and changedetection. The capture component produces the rendered HTML copy of a web page, integrated with its visual cues and a screenshot. The analyzer component implements the Block-o-Matic segmentation algorithm. The comparator component compares two segmentation results and gives a score. The changedetection component compares two versions of a web page and decides whether they are similar or not. This decision is based on a combination of structural and visual comparison methods embedded in a statistical discriminative model, a visual similarity measure designed for Web pages that improves change detection, and a supervised feature selection method adapted to Web archiving.

Figure 1: Prototype architecture

Block-o-Manual has been developed on top of the Mozilla Add-on SDK. It relies on XUL and JavaScript technologies. With the Add-on SDK, a browser extension is developed by designing and implementing a set of components: widgets, content scripts and general data. Widget components represent functional elements integrated into the browser environment (menus, toolbars, etc.). Content scripts are JavaScript code that interacts with the current web page document. General data components are images, web documents or other needed files. Figure 1 shows the prototype architecture.

Both manual and automatic segmentation should take the same kind of decisions in similar situations. The main idea is that when an assessor selects an element in the page, he or she is given the same options that the automatic segmentation would consider. For that reason we use the same rules in both tools.
Block-o-Matic traverses the DOM tree from the root node to the leaves, making decisions along the way, whereas Block-o-Manual depends on user input and evaluates one element at a time, in response to that input. To be more precise about the kind of decisions both tools should make, let us describe the most relevant rules that should be followed to ensure parity in the comparison.

Rule 1: Only line-break elements can become blocks. Inspired by the HTML specification, the Block-o-Matic segmentation model organizes HTML elements into inline and line-break elements:
Inline elements: elements that use only the space of their content, for example the tags <a>, <b> or <i>.
Line-break elements: elements that organize content and reserve a rectangular space of the web page for rendered content.
Only elements that are valid (non terminal and visible) are included. An element is visible if the value of its display style property is either block or inline and its area is not zero. The line-break elements are classified into the following three categories:
Container elements: elements generally used for organizing content. That is the case, for example, of the <div> and <p> tags.
Content elements: elements used to hold enriched text, for example tags such as <li> or <h1>.
Content-Container elements: elements that, depending on their own configuration and that of their children, belong to both categories. For example, a <p> tag with text and inline elements as children.

Rule 2: If the current element has only one line-break child, the child replaces its parent.

Rule 3: If an ancestor of the current element belongs to any of the line-break categories, it is proposed as a block.

Rule 4: If the current element is selected as a block and there are sibling elements which are not, the siblings are automatically added as blocks to complete the partition.

Other rules are added to cover the entire segmentation model. Nevertheless, these are suggestions for the assessor, who may decide to use them or not.

In the next step, the user is asked to indicate the semantic order of the blocks within the page (the flow of the layout). Once the blocks and the layout flow are defined, a ViXML [3] document is created and sent to the collector server together with the web page URL. The capture component then receives the URL and gets the rendered source code, the visual cues and the screenshot of the web page. Next, the analyzer component is requested to segment the page, producing another ViXML document. Finally, the comparator component is invoked with the two documents produced (manual and automatic) as arguments. It gives a score and two images for visual comparison, one with the result of the manual segmentation and the other with the result of the automatic one.

4 Demonstration Scenario

In this section, we describe our demonstration scenario by giving the inputs and outputs for the evaluation of the segmentation process. The scenario is to evaluate the segmentation of current versions of web pages at a coarse level of granularity, for example: the header, the footer, a main navigation or the main content. It is presented in two steps.

Step 1. Manual segmentation

We ask four assessors to segment one category each of a set of 100 web pages taken from the StumbleUpon collaborative filtering service. The pages are distributed in sets of 25 pages per category. The categories are: Art/History, Computers, Sports and Sci/Tech. These categories are predefined and the pages are already associated with them. Figure 2 shows an example of a web page in the Art/History category. Figure 3 shows two blocks being selected: in red the block that was clicked, and in green the sibling element added in order to have a complete partition. Figure 4 shows the final segmentation, composed of three blocks: one at the header and two parallel content blocks. Figure 5 depicts the layout flow indicated by the user. In this case the user clicks on all the blocks, and the click sequence is taken as the orientation of the flow of the layout.

Step 2. Automatic Segmentation and Comparison

After the manual segmentation is done, the automatic segmentation starts with the capture of the web page. The rendered HTML source code and the visual cues are then passed to the analyzer component, which performs the segmentation. The outcome is another ViXML document. Both documents are then passed as parameters to the comparator component, which returns a score and the two images mentioned above. For instance, Figure 6 shows that the manual and automatic segmentations are similar, but the automatic segmentation did not select one expected block, the one that represents the footer of the page, and also missed one block.
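The manual step described above (a clicked block, its siblings added to complete the partition as in Rule 4, and the click sequence taken as the layout flow) can be sketched in Python as follows. This is only an illustration: the actual Block-o-Manual tool is a Firefox extension written in JavaScript, and the `Node` class and the function names `complete_partition` and `layout_flow` are hypothetical.

```python
from typing import List, Optional


class Node:
    """Hypothetical minimal DOM-like node; nodes are compared by identity."""

    def __init__(self, name: str, children: Optional[List["Node"]] = None):
        self.name = name
        self.parent: Optional["Node"] = None
        self.children: List["Node"] = children or []
        for child in self.children:
            child.parent = self


def complete_partition(clicked: Node) -> List[Node]:
    """Rule 4 sketch: the clicked element plus its siblings, in document order."""
    if clicked.parent is None:
        return [clicked]
    return list(clicked.parent.children)


def layout_flow(blocks: List[Node], clicks: List[Node]) -> List[Node]:
    """Order blocks by the first time the assessor clicked each of them."""
    flow: List[Node] = []
    for node in clicks:
        if node in blocks and node not in flow:
            flow.append(node)
    return flow


# Made-up page with a header and two parallel content blocks.
header, left, right = Node("header"), Node("left"), Node("right")
body = Node("body", [header, left, right])

blocks = complete_partition(left)  # clicking "left" also proposes its siblings
flow = layout_flow(blocks, [header, left, right, left])  # repeated click ignored

print([b.name for b in blocks])
print([b.name for b in flow])
```

A real tool would also apply the validity and visibility checks of Rule 1 before proposing siblings; they are omitted here to keep the sketch short.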
References

[1] Deng Cai, Shipeng Yu, Ji-Rong Wen, and Wei-Ying Ma. Extracting content structure for web pages based on visual representation. In Fifth Asia Pacific Web Conference (APWeb'03).
[2] Jisheng Liang, Ihsin T. Phillips, and Robert M. Haralick. Performance evaluation of document structure extraction algorithms. Computer Vision and Image Understanding, 84(1).
[3] Zeynep Pehlivan, Myriam Ben-Saad, and Stéphane Gançarski. Vi-diff: understanding web pages changes. In Proceedings of the 21st international conference on Database and expert systems applications: Part I, DEXA'10, pages 1-15, Berlin, Heidelberg. Springer-Verlag.
[4] Y. Y. Tang, C. Y. Suen, C. D. Yan, and M. Cheriet. Document analysis and understanding: a brief survey. First International Conference on Document Analysis and Recognition, pages 17-31.
[5] Yeliz Yesilada. Web page segmentation: A review. Technical report, Middle East Technical University Northern Cyprus Campus, March.
[6]
Developing ASP.NET MVC 4 Web Applications Course 20486A; 5 Days, Instructor-led
Developing ASP.NET MVC 4 Web Applications Course 20486A; 5 Days, Instructor-led Course Description In this course, students will learn to develop advanced ASP.NET MVC applications using.net Framework 4.5
Kaldeera Workflow Designer 2010 User's Guide
Kaldeera Workflow Designer 2010 User's Guide Version 1.0 Generated May 18, 2011 Index 1 Chapter 1: Using Kaldeera Workflow Designer 2010... 3 1.1 Getting Started with Kaldeera... 3 1.2 Importing and exporting
Professional Organization Checklist for the Computer Science Curriculum Updates. Association of Computing Machinery Computing Curricula 2008
Professional Organization Checklist for the Computer Science Curriculum Updates Association of Computing Machinery Computing Curricula 2008 The curriculum guidelines can be found in Appendix C of the report
Chapter-1 : Introduction 1 CHAPTER - 1. Introduction
Chapter-1 : Introduction 1 CHAPTER - 1 Introduction This thesis presents design of a new Model of the Meta-Search Engine for getting optimized search results. The focus is on new dimension of internet
Running an HCI Experiment in Multiple Parallel Universes
Running an HCI Experiment in Multiple Parallel Universes,, To cite this version:,,. Running an HCI Experiment in Multiple Parallel Universes. CHI 14 Extended Abstracts on Human Factors in Computing Systems.
Email UAE Bulk Email System. User Guide
Email UAE Bulk Email System User Guide 1 Table of content Features -----------------------------------------------------------------------------------------------------03 Login ---------------------------------------------------------------------------------------------------------08
An introduction to creating JSF applications in Rational Application Developer Version 8.0
An introduction to creating JSF applications in Rational Application Developer Version 8.0 September 2010 Copyright IBM Corporation 2010. 1 Overview Although you can use several Web technologies to create
Expanding Renewable Energy by Implementing Demand Response
Expanding Renewable Energy by Implementing Demand Response Stéphanie Bouckaert, Vincent Mazauric, Nadia Maïzi To cite this version: Stéphanie Bouckaert, Vincent Mazauric, Nadia Maïzi. Expanding Renewable
CHAPTER 1: CLIENT/SERVER INTEGRATED DEVELOPMENT ENVIRONMENT (C/SIDE)
Chapter 1: Client/Server Integrated Development Environment (C/SIDE) CHAPTER 1: CLIENT/SERVER INTEGRATED DEVELOPMENT ENVIRONMENT (C/SIDE) Objectives Introduction The objectives are: Discuss Basic Objects
Visualizing e-government Portal and Its Performance in WEBVS
Visualizing e-government Portal and Its Performance in WEBVS Ho Si Meng, Simon Fong Department of Computer and Information Science University of Macau, Macau SAR [email protected] Abstract An e-government
Web Browsing Quality of Experience Score
Web Browsing Quality of Experience Score A Sandvine Technology Showcase Contents Executive Summary... 1 Introduction to Web QoE... 2 Sandvine s Web Browsing QoE Metric... 3 Maintaining a Web Page Library...
Configuration Manager
After you have installed Unified Intelligent Contact Management (Unified ICM) and have it running, use the to view and update the configuration information in the Unified ICM database. The configuration
A LANGUAGE INDEPENDENT WEB DATA EXTRACTION USING VISION BASED PAGE SEGMENTATION ALGORITHM
A LANGUAGE INDEPENDENT WEB DATA EXTRACTION USING VISION BASED PAGE SEGMENTATION ALGORITHM 1 P YesuRaju, 2 P KiranSree 1 PG Student, 2 Professorr, Department of Computer Science, B.V.C.E.College, Odalarevu,
Unlocking the Java EE Platform with HTML 5
1 2 Unlocking the Java EE Platform with HTML 5 Unlocking the Java EE Platform with HTML 5 Overview HTML5 has suddenly become a hot item, even in the Java ecosystem. How do the 'old' technologies of HTML,
Unemployment Insurance Data Validation Operations Guide
Unemployment Insurance Data Validation Operations Guide ETA Operations Guide 411 U.S. Department of Labor Employment and Training Administration Office of Unemployment Insurance TABLE OF CONTENTS Chapter
Understanding Web personalization with Web Usage Mining and its Application: Recommender System
Understanding Web personalization with Web Usage Mining and its Application: Recommender System Manoj Swami 1, Prof. Manasi Kulkarni 2 1 M.Tech (Computer-NIMS), VJTI, Mumbai. 2 Department of Computer Technology,
Search and Information Retrieval
Search and Information Retrieval Search on the Web 1 is a daily activity for many people throughout the world Search and communication are most popular uses of the computer Applications involving search
Elgg 1.8 Social Networking
Elgg 1.8 Social Networking Create, customize, and deploy your very networking site with Elgg own social Cash Costello PACKT PUBLISHING open source* community experience distilled - BIRMINGHAM MUMBAI Preface
Authoring Guide for Perception Version 3
Authoring Guide for Version 3.1, October 2001 Information in this document is subject to change without notice. Companies, names, and data used in examples herein are fictitious unless otherwise noted.
131-1. Adding New Level in KDD to Make the Web Usage Mining More Efficient. Abstract. 1. Introduction [1]. 1/10
1/10 131-1 Adding New Level in KDD to Make the Web Usage Mining More Efficient Mohammad Ala a AL_Hamami PHD Student, Lecturer m_ah_1@yahoocom Soukaena Hassan Hashem PHD Student, Lecturer soukaena_hassan@yahoocom
Customising Your Mobile Payment Pages
Corporate Gateway Customising Your Mobile Payment Pages V2.0 May 2014 Use this guide to: Understand how to customise your payment pages for mobile and tablet devices XML Direct Integration Guide > Contents
Sitecore InDesign Connector 1.1
Sitecore Adaptive Print Studio Sitecore InDesign Connector 1.1 - User Manual, October 2, 2012 Sitecore InDesign Connector 1.1 User Manual Creating InDesign Documents with Sitecore CMS User Manual Page
Building and Using Web Services With JDeveloper 11g
Building and Using Web Services With JDeveloper 11g Purpose In this tutorial, you create a series of simple web service scenarios in JDeveloper. This is intended as a light introduction to some of the
