# Information Retrieval. Exercises: Skip lists, Positional indexing, Permuterm index

## Transcription

1 Information Retrieval. Exercises: skip lists, positional indexing, permuterm index

2 Faster merging using skip lists

3 Recall basic merge. Walk through the two postings lists simultaneously, in time linear in the total number of postings entries (the slide illustrates this with the postings lists for Brutus and Caesar). If the list lengths are m and n, the merge takes O(m+n) operations. Can we do better? Yes, if the index isn't changing too fast.
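As a reminder of what the linear merge looks like, here is a minimal Python sketch. It assumes each postings list is a sorted Python list of integer docIDs; the function name `intersect` and the two example lists are illustrative and do not come from the slides.

```python
def intersect(p1, p2):
    """Intersect two sorted postings lists in O(m + n) steps."""
    answer = []
    i = j = 0
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            # docID present in both lists: keep it and advance both pointers
            answer.append(p1[i])
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            i += 1  # advance the pointer sitting on the smaller docID
        else:
            j += 1
    return answer


# Hypothetical docID lists, for illustration only.
brutus = [2, 4, 8, 16, 32, 64, 128]
caesar = [1, 2, 3, 5, 8, 13, 21, 34]
print(intersect(brutus, caesar))  # [2, 8]
```

Each loop iteration advances at least one pointer, which is where the O(m+n) bound comes from.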

4 Augment postings with skip pointers (at indexing time). Why? To skip postings that will not figure in the search results. Where do we place skip pointers? There are no skip pointers for intermediate results.

5 Recall: basic merge algorithm, pointer version. The algorithm is Fig. 1.6 of Chapter 1 of Raghavan's book. Modify it to use skip pointers.

6 Merge algorithm with skip lists. The algorithm is given in a figure of Chapter 2 of Raghavan's book.

7 Placing skips. Simple heuristic: for postings of length L, use √L evenly spaced skip pointers. This ignores the distribution of query terms. Easy if the index is relatively static; harder if L keeps changing because of updates.

8 Placing skips. Ex. 2.6 in IIR. We have a two-word query. For one term the postings list consists of the following 16 entries: [4,6,10,12,14,16,18,20,22,32,47,81,120,122,157,180] and for the other it is the one-entry postings list: [47]. Work out how many comparisons would be done to intersect the two postings lists with the following two strategies. Briefly justify your answers: a. Using standard postings lists. b. Using postings lists stored with skip pointers, with a skip length of √P.
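The skip-pointer merge can be sketched in Python as follows. This is an illustration under stated assumptions, not the book's pseudocode: postings are sorted lists of distinct docIDs, a skip pointer is simulated by index arithmetic (every position that is a multiple of the step points `step` entries ahead), and the helper names `skip_step`, `can_skip`, and `intersect_with_skips` are hypothetical.

```python
import math


def skip_step(postings):
    """Distance between skip pointers: about sqrt(length), at least 1."""
    return max(1, math.isqrt(len(postings)))


def can_skip(postings, i, step, target):
    """True if position i has a skip pointer whose target does not overshoot."""
    return i % step == 0 and i + step < len(postings) and postings[i + step] <= target


def intersect_with_skips(p1, p2):
    """Intersect two sorted postings lists, following skips when it is safe."""
    s1, s2 = skip_step(p1), skip_step(p2)
    answer = []
    i = j = 0
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            answer.append(p1[i])
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            if can_skip(p1, i, s1, p2[j]):
                # keep following skip pointers while they stay <= the other docID
                while can_skip(p1, i, s1, p2[j]):
                    i += s1
            else:
                i += 1
        else:
            if can_skip(p2, j, s2, p1[i]):
                while can_skip(p2, j, s2, p1[i]):
                    j += s2
            else:
                j += 1
    return answer


# The two lists from the exercise above.
long_list = [4, 6, 10, 12, 14, 16, 18, 20, 22, 32, 47, 81, 120, 122, 157, 180]
print(intersect_with_skips(long_list, [47]))  # [47]
```

A skip is taken only when its target is no larger than the docID on the other list, so no candidate match can be jumped over. How many comparisons this saves depends on how they are counted, which is exactly what the exercise asks you to work out by hand.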

9 Other operators. Adapted from Ex. 2.5 in IIR. Are skip pointers useful for queries of the form x OR y? Motivate your answer. Are skip pointers useful for queries of the form x AND NOT y? Motivate your answer.

10 Other operators. Adapted from Ex. 2.5 in IIR. Are skip pointers useful for queries of the form x OR y? Motivate your answer. No, because you have to visit every docID of each postings list. Are skip pointers useful for queries of the form x AND NOT y? Motivate your answer. In principle you do fewer comparisons (consider Ex. 2.6 on the earlier slide). In practice, you still have to take all the elements to return and put them into a result list, so there is no gain, at least asymptotically.
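To make the argument concrete, here is a minimal Python sketch of the two merges, again assuming sorted lists of docIDs; the function names `union` and `and_not` are illustrative. In both, entries are copied into the result one by one, which is why the output size, and not the number of comparisons, bounds the work.

```python
def union(p1, p2):
    """x OR y: every docID of both lists ends up in the answer."""
    answer = []
    i = j = 0
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            answer.append(p1[i])
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            answer.append(p1[i])
            i += 1
        else:
            answer.append(p2[j])
            j += 1
    answer.extend(p1[i:])  # leftovers of whichever list is not exhausted
    answer.extend(p2[j:])
    return answer


def and_not(p1, p2):
    """x AND NOT y: docIDs of p1 that do not occur in p2."""
    answer = []
    i = j = 0
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            i += 1  # excluded docID: drop it
            j += 1
        elif p1[i] < p2[j]:
            answer.append(p1[i])  # cannot appear later in p2, so keep it
            i += 1
        else:
            j += 1
    answer.extend(p1[i:])
    return answer


print(union([1, 3, 5], [2, 3, 6]))       # [1, 2, 3, 5, 6]
print(and_not([1, 3, 5, 7], [3, 4, 7]))  # [1, 5]
```

In `and_not`, skip pointers on the list for y could avoid some comparisons in the final `else` branch, but every surviving entry of x is still appended individually.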

11 Phrase queries and positional indexes

12 Phrase queries. We want to answer queries such as "stanford university" as a phrase. Thus the sentence "I went to university at Stanford" is not a match. The concept of phrase queries has proven easily understood by users; about 10% of web queries are phrase queries. It no longer suffices to store only <term : docs> entries.

13 Solution 2: Positional indexes. Store, for each term, entries of the form: <number of docs containing term; doc1: position1, position2, ...; doc2: position1, position2, ...; etc.>
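A positional index of this shape can be sketched in Python as a dictionary that maps each term to a dictionary from docID to a list of positions. The tokenization below (lowercasing and splitting on whitespace), the 1-based positions, and the two toy documents are assumptions made for illustration.

```python
from collections import defaultdict


def build_positional_index(docs):
    """Map term -> {doc_id: [positions]}, with positions counted from 1."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for position, token in enumerate(text.lower().split(), start=1):
            index[token].setdefault(doc_id, []).append(position)
    return index


# Toy collection (hypothetical); the first sentence is the one from the slides.
docs = {
    1: "I went to university at Stanford",
    2: "Stanford University is in California",
}
index = build_positional_index(docs)

print(len(index["stanford"]))  # number of docs containing the term: 2
print(index["stanford"])       # {1: [6], 2: [1]}
print(index["university"])     # {1: [4], 2: [2]}
```

The number of documents containing a term is simply the size of its inner dictionary, so it does not need to be stored separately in this sketch.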

14 Processing a phrase query. Extract inverted index entries for each distinct term: to, be, or, not. Merge their doc:position lists to enumerate all positions with "to be or not to be".

- to: 2: 1,17,74,222,551; 4: 8,16,190,429,433; 7: 13,23,191; ...
- be: 1: 17,19; 4: 17,191,291,430,434; 5: 14,19,101; ...

Same general method for proximity searches.
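The merge of doc:position lists can be illustrated with a short Python sketch. It uses only the fragments shown for "to" and "be" (the real lists continue), represents the index as term -> {doc_id: [positions]}, and the function name `phrase_matches` is hypothetical. It looks for the two-word phrase "to be"; the same function accepts longer phrases when postings for all their terms are available.

```python
def phrase_matches(index, terms):
    """Return {doc_id: [start positions]} where the terms occur consecutively."""
    postings = [index.get(term, {}) for term in terms]
    # Only documents containing every term can match.
    common_docs = set(postings[0]).intersection(*postings[1:])
    result = {}
    for doc in sorted(common_docs):
        later = [set(p[doc]) for p in postings[1:]]
        starts = [
            pos
            for pos in postings[0][doc]
            # the term at offset k must appear exactly k positions after the first
            if all(pos + offset in positions
                   for offset, positions in enumerate(later, start=1))
        ]
        if starts:
            result[doc] = starts
    return result


# Fragments from the slide; the lists are truncated there as well.
index = {
    "to": {2: [1, 17, 74, 222, 551], 4: [8, 16, 190, 429, 433], 7: [13, 23, 191]},
    "be": {1: [17, 19], 4: [17, 191, 291, 430, 434], 5: [14, 19, 101]},
}
print(phrase_matches(index, ["to", "be"]))  # {4: [16, 190, 429, 433]}
```

Only document 4 contains both terms in these fragments, and four of its occurrences of "to" are immediately followed by "be".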

15 Exercise 2.10 in IIR. Consider the following fragment of a positional index with the format: <term>: document: position, position, ...; document: position, ...

- Gates: 1: 3; 2: 6; 3: 2,17; 4: 1;
- IBM: 4: 3; 7: 14;
- Microsoft: 1: 1; 2: 1,21; 3: 3; 5: 16,22,51;

The /k operator, word1 /k word2, finds occurrences of word1 within k words of word2 (on either side), where k is a positive integer argument. Thus k = 1 demands that word1 be adjacent to word2. a. Describe the set of documents that satisfy the query Gates /2 Microsoft. b. Describe each set of values for k for which the query Gates /k Microsoft returns a different set of documents as the answer.
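One way to check an answer to this exercise is to encode the fragment and evaluate the /k operator by brute force. The sketch below is a straightforward reading of the definition, not an efficient algorithm; the dictionary layout and the name `within_k` are assumptions made for illustration.

```python
def within_k(index, word1, word2, k):
    """Return docIDs where some occurrence of word1 is within k words of word2."""
    p1, p2 = index[word1], index[word2]
    hits = []
    for doc in sorted(set(p1) & set(p2)):
        # compare every pair of positions inside the same document
        if any(abs(a - b) <= k for a in p1[doc] for b in p2[doc]):
            hits.append(doc)
    return hits


# The index fragment from the exercise: term -> {doc_id: [positions]}.
index = {
    "Gates": {1: [3], 2: [6], 3: [2, 17], 4: [1]},
    "IBM": {4: [3], 7: [14]},
    "Microsoft": {1: [1], 2: [1, 21], 3: [3], 5: [16, 22, 51]},
}

for k in range(1, 7):
    print(k, within_k(index, "Gates", "Microsoft", k))
```

Running the loop prints the result set for k = 1 to 6, so the values of k at which the set changes can be read off directly and compared with a hand-derived answer.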

16 Exercise 2.12 in IIR (revised). Consider the adaptation of the basic algorithm for intersection of two postings lists (Figure 1.6) to the one in Figure 2.12, which handles proximity queries. A naive algorithm for this operation could be $O(P L_{\max}^2)$, where $P$ is the sum of the lengths of the postings lists (i.e., the sum of term frequencies) and $L_{\max}$ is the maximum length of a document (in tokens). a. Prove the trivial $O(P L_{\max}^2)$ bound. b. Go through this algorithm carefully and explain how it works. c. What is the complexity of this algorithm? Justify your answer carefully. d. For certain queries and data distributions, would another algorithm be more efficient? What complexity does it have?

17 Exercise 2.12 in IIR (revised). Find term pairs in a document that are no more than k positions apart.
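The within-document step can be sketched with a sliding window over the second term's positions, in the spirit of the proximity algorithm the exercise refers to. This is an illustrative Python version, not a transcription of the book's figure: both position lists are assumed to be sorted, and the name `pairs_within_k` is hypothetical.

```python
from collections import deque


def pairs_within_k(positions1, positions2, k):
    """Return (pos1, pos2) pairs with |pos1 - pos2| <= k for one document."""
    pairs = []
    window = deque()  # positions of term 2 that may still be close enough
    j = 0
    for pos1 in positions1:
        # pull in positions of term 2 until one lies beyond pos1 + k
        while j < len(positions2):
            pos2 = positions2[j]
            if abs(pos1 - pos2) <= k:
                window.append(pos2)
            elif pos2 > pos1:
                break
            j += 1
        # drop positions that have fallen too far behind pos1
        while window and abs(window[0] - pos1) > k:
            window.popleft()
        pairs.extend((pos1, pos2) for pos2 in window)
    return pairs


print(pairs_within_k([1, 5, 9], [2, 4, 6, 10], 1))
# [(1, 2), (5, 4), (5, 6), (9, 10)]
```

The pointer `j` never moves backwards, so the scan over the second list is linear; the cost of emitting pairs depends on how many positions sit in the window at once, which is the quantity to reason about in part c.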

18 Positional index size. Position values/offsets can be compressed. Nevertheless, a positional index expands postings storage substantially.

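19 Positional index size. A positional index needs an entry for each occurrence, not just one per document, so index size depends on average document size. The average web page has fewer than 1,000 terms, while SEC filings, books, and even some epic poems easily reach 100,000 terms. Consider a term with frequency 0.1%: in a document of 1,000 terms it yields 1 posting and 1 positional posting; in a document of 100,000 terms it still yields 1 posting but 100 positional postings. Why?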
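The arithmetic behind these numbers can be spelled out in a few lines of Python. The sketch assumes the term occurs at a uniform rate of 1 per 1,000 tokens (that is, 0.1%), and the function name `postings_entries` is illustrative.

```python
def postings_entries(doc_size, occurrences_per_1000_terms=1):
    """Return (non-positional, positional) entries one document contributes."""
    occurrences = doc_size * occurrences_per_1000_terms // 1000
    nonpositional = 1 if occurrences > 0 else 0  # one docID, however many hits
    positional = occurrences                     # one entry per occurrence
    return nonpositional, positional


for size in (1_000, 100_000):
    print(size, postings_entries(size))
# 1000 (1, 1)
# 100000 (1, 100)
```

A non-positional index records the document once no matter how often the term occurs, while a positional index grows with the number of occurrences, hence with document length.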

20 Resources for today's lecture: IIR Chapters
