String Edit Distance (and intro to dynamic programming) Lecture #4 Computational Linguistics CMPSCI 591N, Spring 2006
1 String Edit Distance (and intro to dynamic programming) Lecture #4 Computational Linguistics CMPSCI 591N, Spring 2006 University of Massachusetts Amherst Andrew McCallum
2 Dynamic Programming (Not much to do with "programming" in the CS sense.) Dynamic programming is efficient in finding optimal solutions for cases with lots of overlapping sub-problems. It solves problems by recombining solutions to sub-problems, when the sub-problems themselves may share sub-sub-problems.
3 Fibonacci Numbers
5 Calculating Fibonacci Numbers
F(n) = F(n-1) + F(n-2), where F(0)=0, F(1)=1.
Non-Dynamic Programming implementation:

    def fib(n):
        if n == 0 or n == 1:
            return n
        else:
            return fib(n-1) + fib(n-2)

For fib(8), how many calls to function fib(n)?
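To make the question about fib(8) concrete, here is a minimal sketch that instruments the naive recursion with a call counter. The names `fib_counting_calls` and `count_fib_calls` are illustrative and not part of the lecture code.

```python
def fib_counting_calls(n, counter):
    """Naive recursive Fibonacci that also counts how often it is called."""
    counter[0] += 1  # one more call to the function
    if n == 0 or n == 1:
        return n
    return fib_counting_calls(n - 1, counter) + fib_counting_calls(n - 2, counter)


def count_fib_calls(n):
    """Return (F(n), number of calls made by the naive recursion)."""
    counter = [0]
    value = fib_counting_calls(n, counter)
    return value, counter[0]


print(count_fib_calls(8))  # (21, 67)
```

The count grows roughly as fast as the Fibonacci numbers themselves, which is the repeated work that remembering earlier results removes.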
6 DP Example: Calculating Fibonacci Numbers
Dynamic Programming: avoid repeated calls by remembering function values already calculated.

    table = {}

    def fib(n):
        # Reuse the stored value if F(n) has already been computed.
        if n in table:
            return table[n]
        if n == 0 or n == 1:
            table[n] = n
            return n
        else:
            value = fib(n-1) + fib(n-2)
            table[n] = value
            return value
7 DP Example: Calculating Fibonacci Numbers
...or alternately, in a list instead of a dictionary...

    def fib(n):
        # n+2 slots so that table[1] exists even when n == 0.
        table = [0] * (n+2)
        table[0] = 0
        table[1] = 1
        for i in range(2, n+1):
            table[i] = table[i-1] + table[i-2]
        return table[n]

We will see this pattern many more times in this course:
1. Create a table (of the right dimensions to describe our problem).
2. Fill the table, re-using solutions to previous sub-problems.
8 String Edit Distance
Given two strings (sequences), return the "distance" between the two strings as measured by the minimum number of "character edit operations" needed to turn one sequence into the other.
Example: Andrew / Amdrewz
1. substitute m to n
2. delete the z
Distance = 2
9 String distance metrics: Levenshtein
Given strings s and t, the distance is the cost of the shortest sequence of edit commands that transform s to t (or equivalently t to s).
Simple set of operations:
- Copy character from s over to t (cost 0)
- Delete a character in s (cost 1)
- Insert a character in t (cost 1)
- Substitute one character for another (cost 1)
This is "Levenshtein distance".
10 Levenshtein distance - example
distance("William Cohen", "Willliam Cohon")

    s            W I L L - I A M _ C O H E N
    t            W I L L L I A M _ C O H O N
    edit op      C C C C I C C C C C C C S C
    cost so far  0 0 0 0 1 1 1 1 1 1 1 1 2 2

The "-" in s is an alignment gap. Alignment is a little bit like a parse.
11 Finding the Minimum
What is the minimum number of operations for...?
Another fine day in the park
Anybody can see him pick the ball
Not so easy... not so clear. Not only are the strings longer, but it isn't immediately obvious where the alignments should happen. What if we consider all possible alignments by brute force? How many alignments are there?
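One way to get a feel for the size of the brute-force search is to count alignments directly. The sketch below assumes an "alignment" is a monotone path through the grid that at each step copies or substitutes (diagonal move), deletes (down), or inserts (right); other definitions of alignment give different counts. The function name `count_alignments` is illustrative.

```python
def count_alignments(m, n):
    """Count monotone alignments between strings of length m and n.

    Assumes each alignment is a path from (0, 0) to (m, n) built from
    down, right, and diagonal steps.
    """
    table = [[1] * (n + 1) for _ in range(m + 1)]  # a single path along each edge
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            table[i][j] = table[i - 1][j] + table[i][j - 1] + table[i - 1][j - 1]
    return table[m][n]


print(count_alignments(4, 5))  # 681, already for a 4-letter and a 5-letter string

s = "Another fine day in the park"
t = "Anybody can see him pick the ball"
print(count_alignments(len(s), len(t)))  # far too many to enumerate
```

Note that the counting itself is done with a table filled from smaller sub-problems, the same pattern used for the distance computation.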
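12 Dynamic Program Table for String Edit
Measure distance between strings PARK and SPAKE.
[Table: columns P A R K, rows S P A K E, with one cell c_ij for each row/column pair.]
c_ij = the number of edit operations needed to align the first i characters of SPAKE with the first j characters of PARK.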
13 Dynamic Programming to the Rescue!
How to take our big problem and chop it into building-block pieces: given some partial solution, it isn't hard to figure out what a good next immediate step is.
Partial solution = the cost for aligning s up to position i with t up to position j.
Next step = in order to align up to positions x in s and y in t, should the last operation be a substitute, insert, or delete?
14 Dynamic Program Table for String Edit
Measure distance between strings PARK and SPAKE.
[Figure: edit operations for turning SPAKE into PARK, drawn on the table with moves labeled delete, insert, and substitute.]
15 Dynamic Program Table for String Edit
Measure distance between strings PARK and SPAKE.
[Table: columns P A R K, rows S P A K E; the cells filled so far hold their costs c_ij, and the next cell to compute is marked "???".]
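16 Dynamic Program Table for String Edit
[Table: the same grid; the cell marked "???" can be reached from three neighboring cells, by a subst, a delete, or an insert step.]
D(i,j) = score of best alignment from s1..si to t1..tj

\[
D(i,j) = \min
\begin{cases}
D(i-1,j-1) & \text{if } s_i = t_j \quad \text{(copy)} \\
D(i-1,j-1) + 1 & \text{if } s_i \neq t_j \quad \text{(substitute)} \\
D(i-1,j) + 1 & \text{(insert)} \\
D(i,j-1) + 1 & \text{(delete)}
\end{cases}
\]

17 Computing Levenshtein distance
D(i,j) = score of best alignment from s1..si to t1..tj

\[
D(i,j) = \min
\begin{cases}
D(i-1,j-1) + d(s_i,t_j) & \text{(subst/copy)} \\
D(i-1,j) + 1 & \text{(insert)} \\
D(i,j-1) + 1 & \text{(delete)}
\end{cases}
\]

(simplify by letting d(c,d) = 0 if c = d, 1 else)
Also let D(i,0) = i (for i inserts) and D(0,j) = j.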
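18 Dynamic Program Table Initialized

           P  A  R  K
        0  1  2  3  4
     S  1
     P  2
     A  3
     K  4
     E  5

D(i,j) = score of best alignment from s1..si to t1..tj

\[
D(i,j) = \min
\begin{cases}
D(i-1,j-1) + d(s_i,t_j) & \text{(substitute)} \\
D(i-1,j) + 1 & \text{(insert)} \\
D(i,j-1) + 1 & \text{(delete)}
\end{cases}
\]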
19-21 Dynamic Program Table... filling in
[Tables: three successive snapshots of the PARK / SPAKE table as its cells are filled in, each cell computed from its already-filled neighbors.]
D(i,j) = score of best alignment from s1..si to t1..tj

\[
D(i,j) = \min
\begin{cases}
D(i-1,j-1) + d(s_i,t_j) & \text{(substitute)} \\
D(i-1,j) + 1 & \text{(insert)} \\
D(i,j-1) + 1 & \text{(delete)}
\end{cases}
\]
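22 Dynamic Program Table... filling in
[Table: the completed table, here drawn with S P A K E across the top and P A R K down the side. The last cell holds the final cost of aligning all of both strings.]
D(i,j) = score of best alignment from s1..si to t1..tj

\[
D(i,j) = \min
\begin{cases}
D(i-1,j-1) + d(s_i,t_j) & \text{(substitute)} \\
D(i-1,j) + 1 & \text{(insert)} \\
D(i,j-1) + 1 & \text{(delete)}
\end{cases}
\]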
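The table-filling steps on these slides can be sketched as follows. This is a minimal illustration, not the course's stredit.py; the function name `edit_table` is made up here, and the example strings are read as PARK and SPAKE (an assumption, since some letters are missing from the transcription).

```python
def edit_table(s, t):
    """Return the full table D, where D[i][j] is the cost of aligning s[:i] with t[:j]."""
    rows, cols = len(s) + 1, len(t) + 1
    D = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        D[i][0] = i  # first column: D(i,0) = i
    for j in range(cols):
        D[0][j] = j  # first row: D(0,j) = j
    for i in range(1, rows):
        for j in range(1, cols):
            d = 0 if s[i - 1] == t[j - 1] else 1
            D[i][j] = min(D[i - 1][j - 1] + d,  # subst/copy
                          D[i - 1][j] + 1,
                          D[i][j - 1] + 1)
    return D


table = edit_table("SPAKE", "PARK")
for row in table:
    print(row)
print("distance:", table[-1][-1])  # distance: 3
```

Each cell only looks at three neighbors that are already filled, so the whole table costs time proportional to len(s) * len(t).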
23 DP String Edit Distance

    def stredit(s1, s2):
        "Calculate Levenshtein edit distance for strings s1 and s2."
        len1 = len(s1)  # vertically (rows)
        len2 = len(s2)  # horizontally (columns)
        # Allocate the table
        table = [None] * (len1 + 1)
        for i in range(len1 + 1):
            table[i] = [0] * (len2 + 1)
        # Initialize the table
        for i in range(1, len1 + 1):
            table[i][0] = i
        for j in range(1, len2 + 1):
            table[0][j] = j
        # Do dynamic programming
        for i in range(1, len1 + 1):
            for j in range(1, len2 + 1):
                if s1[i-1] == s2[j-1]:
                    d = 0
                else:
                    d = 1
                table[i][j] = min(table[i-1][j-1] + d,
                                  table[i-1][j] + 1,
                                  table[i][j-1] + 1)
        # The last cell holds the cost of aligning all of both strings.
        return table[len1][len2]
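24 Remembering the Alignment (trace)

\[
D(i,j) = \min
\begin{cases}
D(i-1,j-1) + d(s_i,t_j) & \text{(subst/copy)} \\
D(i-1,j) + 1 & \text{(insert)} \\
D(i,j-1) + 1 & \text{(delete)}
\end{cases}
\]

A trace indicates where the min value came from, and can be used to find edit operations and/or a best alignment (there may be more than 1).
[Table: distance table with trace marks for the example pair M C C O H N / C O H E N.]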
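A trace can also be recovered after the fact by walking back from the last cell and checking which neighbor produced the minimum. The sketch below does that; the function name `align` and the tuple format of the operations are assumptions made for illustration. The operation names describe turning s into t ("delete" consumes a character of s, "insert" emits a character of t), which may not match the labels used on the slide.

```python
def align(s, t):
    """Return (distance, ops), where ops is one minimum-cost edit script from s to t."""
    n, m = len(s), len(t)
    D = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        D[i][0] = i
    for j in range(1, m + 1):
        D[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = 0 if s[i - 1] == t[j - 1] else 1
            D[i][j] = min(D[i - 1][j - 1] + d, D[i - 1][j] + 1, D[i][j - 1] + 1)

    # Walk back from the last cell, choosing a neighbor that explains the value.
    ops = []
    i, j = n, m
    while i > 0 or j > 0:
        d = 0 if (i > 0 and j > 0 and s[i - 1] == t[j - 1]) else 1
        if i > 0 and j > 0 and D[i][j] == D[i - 1][j - 1] + d:
            ops.append(("copy" if d == 0 else "substitute", s[i - 1], t[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and D[i][j] == D[i - 1][j] + 1:
            ops.append(("delete", s[i - 1], "-"))
            i -= 1
        else:
            ops.append(("insert", "-", t[j - 1]))
            j -= 1
    ops.reverse()
    return D[n][m], ops


distance, ops = align("Amdrewz", "Andrew")
print(distance)
print([op for op in ops if op[0] != "copy"])
```

Ties are broken in favor of the diagonal move, so this returns only one of possibly several best alignments.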
25 Three Enhanced Variants
- Needleman-Wunsch: variable costs
- Smith-Waterman: find longest "soft matching" subsequence
- Affine Gap Distance: make repeated deletions (insertions) cheaper
(Implement one for homework?)
26 Needleman-Wunsch distance

\[
D(i,j) = \min
\begin{cases}
D(i-1,j-1) + d(s_i,t_j) & \text{(subst/copy)} \\
D(i-1,j) + G & \text{(insert)} \\
D(i,j-1) + G & \text{(delete)}
\end{cases}
\]

d(c,d) is an arbitrary distance function on characters (e.g. related to typo frequencies, amino acid substitutability, etc.)
G = "gap cost"
Example: William Cohen / Wukkuan Cigeb
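As a sketch of what "variable costs" can buy, the code below plugs a character distance into the recurrence. The keyboard-neighbor cost `typo_cost` (0.5 for adjacent keys on a QWERTY row) is a hypothetical choice made for this example, not something given in the lecture, and the example strings are read as "William Cohen" and "Wukkuan Cigeb".

```python
QWERTY_ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]


def typo_cost(c, d):
    """Hypothetical d(c,d): 0 for a match, 0.5 for neighboring keys, 1 otherwise."""
    if c == d:
        return 0.0
    for row in QWERTY_ROWS:
        a, b = row.find(c.lower()), row.find(d.lower())
        if a >= 0 and b >= 0 and abs(a - b) == 1:
            return 0.5
    return 1.0


def unit_cost(c, d):
    """Plain Levenshtein character cost."""
    return 0 if c == d else 1


def needleman_wunsch(s, t, d, G):
    """Distance with character cost function d and gap cost G."""
    n, m = len(s), len(t)
    D = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        D[i][0] = i * G
    for j in range(1, m + 1):
        D[0][j] = j * G
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i][j] = min(D[i - 1][j - 1] + d(s[i - 1], t[j - 1]),
                          D[i - 1][j] + G,
                          D[i][j - 1] + G)
    return D[n][m]


print(needleman_wunsch("William Cohen", "Wukkuan Cigeb", unit_cost, 1))  # 8
print(needleman_wunsch("William Cohen", "Wukkuan Cigeb", typo_cost, 1))  # 4.0
```

With unit costs the two names look far apart; once slips onto a neighboring key are made cheap, the same pair scores half as much.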
27 Smith-Waterman distance
Instead of looking at each sequence in its entirety, this compares segments of all possible lengths and chooses whichever maximizes the similarity measure. For every cell the algorithm calculates all possible paths leading to it. These paths can be of any length and can contain insertions and deletions.
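28 Smith-Waterman distance

\[
D(i,j) = \min
\begin{cases}
0 & \text{(start over)} \\
D(i-1,j-1) + d(s_i,t_j) & \text{(subst/copy)} \\
D(i-1,j) + G & \text{(insert)} \\
D(i,j-1) + G & \text{(delete)}
\end{cases}
\]

[Table: example for C O H E N against M C C O H N, computed with a gap cost G, a negative cost d(c,c) for matching characters, and a positive cost d(c,d) for mismatches.]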
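A minimal sketch of the "start over" recurrence follows. The cost values (gap cost 1, match cost -2, mismatch cost +1) are assumptions chosen for illustration, as are the function name `smith_waterman` and the example strings MCCOHN and COHEN.

```python
def smith_waterman(s, t, match=-2, mismatch=1, G=1):
    """Return (best_score, (i, j)): the lowest-cost cell of the local-alignment table.

    Costs are distances, so matches are negative and lower is better.
    The default cost values are illustrative assumptions.
    """
    n, m = len(s), len(t)
    D = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_cell = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = match if s[i - 1] == t[j - 1] else mismatch
            D[i][j] = min(0,                      # start over
                          D[i - 1][j - 1] + d,    # subst/copy
                          D[i - 1][j] + G,
                          D[i][j - 1] + G)
            if D[i][j] < best:
                best, best_cell = D[i][j], (i, j)
    return best, best_cell


print(smith_waterman("MCCOHN", "COHEN"))  # (-7, (6, 5))
```

The best cell marks where the best-matching pair of substrings ends; here the leading "MC" is simply ignored because the recurrence can restart at 0.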
29 Example output from Python
[Output: the table printed for the strings "s'allonger" and "lounge", with some cells marked "*".]
(My implementation of HW#, task choice #. -McCallum)
30 Affine gap distances
Smith-Waterman fails on some pairs that seem quite similar:
William W. Cohen
William W. "Don't call me Dubya" Cohen
Intuitively, a single long insertion is cheaper than a lot of short insertions.
31 Affine gap distances - Idea
Current cost of a gap of n characters: nG
Make this cost: A + (n-1)B, where A is the cost of opening a gap, and B is the cost of continuing a gap.
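The effect of the two cost formulas can be checked with a little arithmetic. The numbers below (G = 2, an opening cost of 3, a continuation cost of 1, and 20 inserted characters) are hypothetical values picked only to show the contrast.

```python
def linear_gap_cost(n, G):
    """Cost of a gap of n characters when every gap character costs G."""
    return n * G


def affine_gap_cost(n, A, B):
    """Cost of a gap of n characters: A to open it, B for each further character."""
    return A + (n - 1) * B if n > 0 else 0


G, A, B = 2, 3, 1  # hypothetical cost values
n = 20             # total number of inserted characters

# Linear cost cannot tell one long gap from many short ones.
print(linear_gap_cost(n, G), n * linear_gap_cost(1, G))        # 40 40

# Affine cost makes the single long gap much cheaper.
print(affine_gap_cost(n, A, B), n * affine_gap_cost(1, A, B))  # 22 60
```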
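32 Affine gap distances

\[
D(i,j) = \max
\begin{cases}
D(i-1,j-1) + d(s_i,t_j) \\
IS(i-1,j-1) + d(s_i,t_j) \\
IT(i-1,j-1) + d(s_i,t_j)
\end{cases}
\]

\[
IS(i,j) = \max
\begin{cases}
D(i-1,j) - A \\
IS(i-1,j) - B
\end{cases}
\qquad \text{best score in which } s_i \text{ is aligned with a gap}
\]

\[
IT(i,j) = \max
\begin{cases}
D(i,j-1) - A \\
IT(i,j-1) - B
\end{cases}
\qquad \text{best score in which } t_j \text{ is aligned with a gap}
\]

33 Affine gap distances as automata
[Figure: three-state automaton with states D, IS, and IT. Arcs into D are labeled -d(si,tj), arcs from D into IS and IT are labeled -A, and the self-loops on IS and IT are labeled -B.]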
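The three-table recurrence can be sketched as below. Several details are assumptions made for this example rather than taken from the slide: the score function `sim` is treated as a similarity (+2 for a match, -1 for a mismatch), the boundary cells are initialized to minus infinity apart from D(0,0) = 0, and the final score is the best of the three tables at the last cell.

```python
NEG_INF = float("-inf")


def sim(c, d):
    """Hypothetical character score: reward matches, penalize mismatches."""
    return 2 if c == d else -1


def affine_gap_score(s, t, score, A, B):
    """Best global alignment score with gap-open cost A and gap-continue cost B."""
    n, m = len(s), len(t)
    D = [[NEG_INF] * (m + 1) for _ in range(n + 1)]   # s_i aligned with t_j
    IS = [[NEG_INF] * (m + 1) for _ in range(n + 1)]  # s_i aligned with a gap
    IT = [[NEG_INF] * (m + 1) for _ in range(n + 1)]  # t_j aligned with a gap
    D[0][0] = 0
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and j > 0:
                prev = max(D[i - 1][j - 1], IS[i - 1][j - 1], IT[i - 1][j - 1])
                D[i][j] = prev + score(s[i - 1], t[j - 1])
            if i > 0:
                IS[i][j] = max(D[i - 1][j] - A, IS[i - 1][j] - B)
            if j > 0:
                IT[i][j] = max(D[i][j - 1] - A, IT[i][j - 1] - B)
    return max(D[n][m], IS[n][m], IT[n][m])


# One insertion of three characters versus three scattered single insertions.
print(affine_gap_score("abcdef", "abXYZcdef", sim, A=3, B=1))  # 7
print(affine_gap_score("abcdef", "aXbcYdeZf", sim, A=3, B=1))
```

With these values the single long insertion costs A + 2B = 5, while three separate insertions each pay the opening cost A, so the first pair scores higher.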
34 Generative version of affine gap automata (Bilenko & Mooney, TechReport)
The HMM emits pairs: (c,d) in state M, pairs (c,-) in state D, and pairs (-,d) in state I.
For each state there is a multinomial distribution on pairs.
The HMM can be trained with EM from a sample of pairs of matched strings (s,t).
The E-step is forward-backward; the M-step uses some ad hoc smoothing.
35 Affine gap edit-distance learning: experimental results (Bilenko & Mooney)
Experimental method: parse records into fields; append a few key fields together; sort by similarity; pick a threshold T and call all pairs with distance(s,t) < T "duplicates", picking T to maximize F-measure.
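The threshold-picking step of this method can be sketched as follows. The scored pairs are made-up numbers for illustration, the function name `best_threshold` is an assumption, and F-measure is taken here to be F1 (the harmonic mean of precision and recall); the list of pairs is assumed to be non-empty.

```python
def best_threshold(scored_pairs):
    """Pick the threshold T that maximizes F1.

    scored_pairs: non-empty list of (distance, is_duplicate) tuples.
    A pair is predicted to be a duplicate when distance < T.
    """
    distances = sorted({d for d, _ in scored_pairs})
    candidates = distances + [distances[-1] + 1]  # last candidate accepts every pair
    total_true = sum(1 for _, dup in scored_pairs if dup)
    best_T, best_f = None, -1.0
    for T in candidates:
        tp = sum(1 for d, dup in scored_pairs if d < T and dup)
        fp = sum(1 for d, dup in scored_pairs if d < T and not dup)
        fn = total_true - tp
        denom = 2 * tp + fp + fn
        f = 2 * tp / denom if denom else 0.0  # F1 = 2TP / (2TP + FP + FN)
        if f > best_f:
            best_T, best_f = T, f
    return best_T, best_f


# Hypothetical labeled pairs: (distance between the two records, truly duplicates?)
pairs = [(0.1, True), (0.2, True), (0.35, False),
         (0.4, True), (0.8, False), (0.9, False)]
print(best_threshold(pairs))
```

For this toy data the best threshold accepts all three true duplicates at the price of one false positive.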
37 Affine gap edit-distance learning: experimental results (Bilenko & Mooney)
[Figure: precision/recall for MAILING dataset duplicate detection.]
38 Affine gap distances experiments (from McCallum, Nigam, Ungar KDD)
Goal is to match data like this:
[Figure: sample records to be matched.]
39 Homework #: The assignment
- Start with my stredit.py code
- Make some modifications
- Write a little about your experiences
Some possible modifications:
- Implement Needleman-Wunsch, Smith-Waterman, or Affine Gap Distance.
- Create a little spell-checker: if the entered word isn't in the dictionary, return the dictionary word that is closest.
- Change the implementation to operate on sequences of words rather than characters... get an online translation dictionary, and find alignments between English & French or English & Russian!
- Try to learn the parameters of the function from data. (Tough.)
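As a starting point for the spell-checker idea, here is a minimal sketch. It does not use stredit.py; the helper `levenshtein`, the function `suggest`, and the tiny word list are all made up for illustration. Because the distance function only compares elements for equality, it also works on lists of words, which is the direction the word-sequence modification would take.

```python
def levenshtein(s, t):
    """Edit distance between two sequences, keeping only two table rows in memory."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        cur = [i]
        for j, ct in enumerate(t, 1):
            d = 0 if cs == ct else 1
            cur.append(min(prev[j - 1] + d, prev[j] + 1, cur[j - 1] + 1))
        prev = cur
    return prev[-1]


def suggest(word, dictionary):
    """Return word if it is in the dictionary, else the closest dictionary word."""
    if word in dictionary:
        return word
    return min(dictionary, key=lambda w: levenshtein(word, w))


words = ["andrew", "android", "drew"]  # hypothetical dictionary
print(suggest("amdrewz", words))       # andrew

# The same function on sequences of words instead of characters.
print(levenshtein("the park is fine".split(), "the ball is fine".split()))  # 1
```

Scanning the whole dictionary is fine for a small word list; a real spell-checker would need some way to avoid comparing against every entry.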
Pairwise Sequence Alignment
Pairwise Sequence Alignment carolin.kosiol@vetmeduni.ac.at SS 2013 Outline Pairwise sequence alignment global - Needleman Wunsch Gotoh algorithm local - Smith Waterman algorithm BLAST - heuristics What
More informationDynamic Programming. Lecture 11. 11.1 Overview. 11.2 Introduction
Lecture 11 Dynamic Programming 11.1 Overview Dynamic Programming is a powerful technique that allows one to solve many different types of problems in time O(n 2 ) or O(n 3 ) for which a naive approach
More informationProgramming Exercises
s CMPS 5P (Professor Theresa Migler-VonDollen ): Assignment #8 Problem 6 Problem 1 Programming Exercises Modify the recursive Fibonacci program given in the chapter so that it prints tracing information.
More informationArithmetic Coding: Introduction
Data Compression Arithmetic coding Arithmetic Coding: Introduction Allows using fractional parts of bits!! Used in PPM, JPEG/MPEG (as option), Bzip More time costly than Huffman, but integer implementation
More informationLecture 13: The Knapsack Problem
Lecture 13: The Knapsack Problem Outline of this Lecture Introduction of the 0-1 Knapsack Problem. A dynamic programming solution to this problem. 1 0-1 Knapsack Problem Informal Description: We have computed
More informationScheduling Shop Scheduling. Tim Nieberg
Scheduling Shop Scheduling Tim Nieberg Shop models: General Introduction Remark: Consider non preemptive problems with regular objectives Notation Shop Problems: m machines, n jobs 1,..., n operations
More informationData Warehousing. Jens Teubner, TU Dortmund jens.teubner@cs.tu-dortmund.de. Winter 2014/15. Jens Teubner Data Warehousing Winter 2014/15 1
Jens Teubner Data Warehousing Winter 2014/15 1 Data Warehousing Jens Teubner, TU Dortmund jens.teubner@cs.tu-dortmund.de Winter 2014/15 Jens Teubner Data Warehousing Winter 2014/15 152 Part VI ETL Process
More informationSolutions to Homework 6
Solutions to Homework 6 Debasish Das EECS Department, Northwestern University ddas@northwestern.edu 1 Problem 5.24 We want to find light spanning trees with certain special properties. Given is one example
More informationProgramming Using Python
Introduction to Computation and Programming Using Python Revised and Expanded Edition John V. Guttag The MIT Press Cambridge, Massachusetts London, England CONTENTS PREFACE xiii ACKNOWLEDGMENTS xv 1 GETTING
More informationCS 2112 Spring 2014. 0 Instructions. Assignment 3 Data Structures and Web Filtering. 0.1 Grading. 0.2 Partners. 0.3 Restrictions
CS 2112 Spring 2014 Assignment 3 Data Structures and Web Filtering Due: March 4, 2014 11:59 PM Implementing spam blacklists and web filters requires matching candidate domain names and URLs very rapidly
More informationFUZZY CLUSTERING ANALYSIS OF DATA MINING: APPLICATION TO AN ACCIDENT MINING SYSTEM
International Journal of Innovative Computing, Information and Control ICIC International c 0 ISSN 34-48 Volume 8, Number 8, August 0 pp. 4 FUZZY CLUSTERING ANALYSIS OF DATA MINING: APPLICATION TO AN ACCIDENT
More informationClosest Pair Problem
Closest Pair Problem Given n points in d-dimensions, find two whose mutual distance is smallest. Fundamental problem in many applications as well as a key step in many algorithms. p q A naive algorithm
More informationInferring Probabilistic Models of cis-regulatory Modules. BMI/CS 776 www.biostat.wisc.edu/bmi776/ Spring 2015 Colin Dewey cdewey@biostat.wisc.
Inferring Probabilistic Models of cis-regulatory Modules MI/S 776 www.biostat.wisc.edu/bmi776/ Spring 2015 olin Dewey cdewey@biostat.wisc.edu Goals for Lecture the key concepts to understand are the following
More informationVGRAM: Improving Performance of Approximate Queries on String Collections Using Variable-Length Grams
VGRAM: Improving Performance of Approximate Queries on String Collections Using Variable-Length Grams Chen Li University of California, Irvine CA 9697, USA chenli@ics.uci.edu Bin Wang Northeastern University
More informationR-trees. R-Trees: A Dynamic Index Structure For Spatial Searching. R-Tree. Invariants
R-Trees: A Dynamic Index Structure For Spatial Searching A. Guttman R-trees Generalization of B+-trees to higher dimensions Disk-based index structure Occupancy guarantee Multiple search paths Insertions
More informationTopics in Computational Linguistics. Learning to Paraphrase: An Unsupervised Approach Using Multiple-Sequence Alignment
Topics in Computational Linguistics Learning to Paraphrase: An Unsupervised Approach Using Multiple-Sequence Alignment Regina Barzilay and Lillian Lee Presented By: Mohammad Saif Department of Computer
More informationNear Optimal Solutions
Near Optimal Solutions Many important optimization problems are lacking efficient solutions. NP-Complete problems unlikely to have polynomial time solutions. Good heuristics important for such problems.
More informationEventia Log Parsing Editor 1.0 Administration Guide
Eventia Log Parsing Editor 1.0 Administration Guide Revised: November 28, 2007 In This Document Overview page 2 Installation and Supported Platforms page 4 Menus and Main Window page 5 Creating Parsing
More informationagucacaaacgcu agugcuaguuua uaugcagucuua
RNA Secondary Structure Prediction: The Co-transcriptional effect on RNA folding agucacaaacgcu agugcuaguuua uaugcagucuua By Conrad Godfrey Abstract RNA secondary structure prediction is an area of bioinformatics
More informationTHREE DIMENSIONAL REPRESENTATION OF AMINO ACID CHARAC- TERISTICS
THREE DIMENSIONAL REPRESENTATION OF AMINO ACID CHARAC- TERISTICS O.U. Sezerman 1, R. Islamaj 2, E. Alpaydin 2 1 Laborotory of Computational Biology, Sabancı University, Istanbul, Turkey. 2 Computer Engineering
More informationTopological Data Analysis Applications to Computer Vision
Topological Data Analysis Applications to Computer Vision Vitaliy Kurlin, http://kurlin.org Microsoft Research Cambridge and Durham University, UK Topological Data Analysis quantifies topological structures
More informationB490 Mining the Big Data. 2 Clustering
B490 Mining the Big Data 2 Clustering Qin Zhang 1-1 Motivations Group together similar documents/webpages/images/people/proteins/products One of the most important problems in machine learning, pattern
More informationCourse: Model, Learning, and Inference: Lecture 5
Course: Model, Learning, and Inference: Lecture 5 Alan Yuille Department of Statistics, UCLA Los Angeles, CA 90095 yuille@stat.ucla.edu Abstract Probability distributions on structured representation.
More informationEngineering Problem Solving and Excel. EGN 1006 Introduction to Engineering
Engineering Problem Solving and Excel EGN 1006 Introduction to Engineering Mathematical Solution Procedures Commonly Used in Engineering Analysis Data Analysis Techniques (Statistics) Curve Fitting techniques
More informationDynamic Programming Problem Set Partial Solution CMPSC 465
Dynamic Programming Problem Set Partial Solution CMPSC 465 I ve annotated this document with partial solutions to problems written more like a test solution. (I remind you again, though, that a formal
More informationNew Hash Function Construction for Textual and Geometric Data Retrieval
Latest Trends on Computers, Vol., pp.483-489, ISBN 978-96-474-3-4, ISSN 79-45, CSCC conference, Corfu, Greece, New Hash Function Construction for Textual and Geometric Data Retrieval Václav Skala, Jan
More information2. Select Point B and rotate it by 15 degrees. A new Point B' appears. 3. Drag each of the three points in turn.
In this activity you will use Sketchpad s Iterate command (on the Transform menu) to produce a spiral design. You ll also learn how to use parameters, and how to create animation action buttons for parameters.
More informationData Deduplication in Slovak Corpora
Ľ. Štúr Institute of Linguistics, Slovak Academy of Sciences, Bratislava, Slovakia Abstract. Our paper describes our experience in deduplication of a Slovak corpus. Two methods of deduplication a plain
More informationGENERAL SCIENCE LABORATORY 1110L Lab Experiment 3: PROJECTILE MOTION
GENERAL SCIENCE LABORATORY 1110L Lab Experiment 3: PROJECTILE MOTION Objective: To understand the motion of a projectile in the earth s gravitational field and measure the muzzle velocity of the projectile
More informationCost Model: Work, Span and Parallelism. 1 The RAM model for sequential computation:
CSE341T 08/31/2015 Lecture 3 Cost Model: Work, Span and Parallelism In this lecture, we will look at how one analyze a parallel program written using Cilk Plus. When we analyze the cost of an algorithm
More informationSolving Simultaneous Equations and Matrices
Solving Simultaneous Equations and Matrices The following represents a systematic investigation for the steps used to solve two simultaneous linear equations in two unknowns. The motivation for considering
More informationPaper 109-25 Merges and Joins Timothy J Harrington, Trilogy Consulting Corporation
Paper 109-25 Merges and Joins Timothy J Harrington, Trilogy Consulting Corporation Abstract This paper discusses methods of joining SAS data sets. The different methods and the reasons for choosing a particular
More informationApproximation Algorithms
Approximation Algorithms or: How I Learned to Stop Worrying and Deal with NP-Completeness Ong Jit Sheng, Jonathan (A0073924B) March, 2012 Overview Key Results (I) General techniques: Greedy algorithms
More informationComputational Geometry: Line segment intersection
: Line segment intersection Panos Giannopoulos Wolfgang Mulzer Lena Schlipf AG TI SS 2013 Tutorial room change: 055 this building!!! (from next monday on) Outline Motivation Line segment intersection (and
More informationCSC 180 H1F Algorithm Runtime Analysis Lecture Notes Fall 2015
1 Introduction These notes introduce basic runtime analysis of algorithms. We would like to be able to tell if a given algorithm is time-efficient, and to be able to compare different algorithms. 2 Linear
More informationBio-Informatics Lectures. A Short Introduction
Bio-Informatics Lectures A Short Introduction The History of Bioinformatics Sanger Sequencing PCR in presence of fluorescent, chain-terminating dideoxynucleotides Massively Parallel Sequencing Massively
More informationError Log Processing for Accurate Failure Prediction. Humboldt-Universität zu Berlin
Error Log Processing for Accurate Failure Prediction Felix Salfner ICSI Berkeley Steffen Tschirpke Humboldt-Universität zu Berlin Introduction Context of work: Error-based online failure prediction: error
More informationScheduling Programming Activities and Johnson's Algorithm
Scheduling Programming Activities and Johnson's Algorithm Allan Glaser and Meenal Sinha Octagon Research Solutions, Inc. Abstract Scheduling is important. Much of our daily work requires us to juggle multiple
More informationMath 55: Discrete Mathematics
Math 55: Discrete Mathematics UC Berkeley, Fall 2011 Homework # 5, due Wednesday, February 22 5.1.4 Let P (n) be the statement that 1 3 + 2 3 + + n 3 = (n(n + 1)/2) 2 for the positive integer n. a) What
More informationFast Sequential Summation Algorithms Using Augmented Data Structures
Fast Sequential Summation Algorithms Using Augmented Data Structures Vadim Stadnik vadim.stadnik@gmail.com Abstract This paper provides an introduction to the design of augmented data structures that offer
More informationIntroduction to Parallel Programming and MapReduce
Introduction to Parallel Programming and MapReduce Audience and Pre-Requisites This tutorial covers the basics of parallel programming and the MapReduce programming model. The pre-requisites are significant
More informationEE602 Algorithms GEOMETRIC INTERSECTION CHAPTER 27
EE602 Algorithms GEOMETRIC INTERSECTION CHAPTER 27 The Problem Given a set of N objects, do any two intersect? Objects could be lines, rectangles, circles, polygons, or other geometric objects Simple to
More informationAlgorithm Design and Recursion
Chapter 13 Algorithm Design and Recursion Objectives To understand basic techniques for analyzing the efficiency of algorithms. To know what searching is and understand the algorithms for linear and binary
More informationChapter 2: Algorithm Discovery and Design. Invitation to Computer Science, C++ Version, Third Edition
Chapter 2: Algorithm Discovery and Design Invitation to Computer Science, C++ Version, Third Edition Objectives In this chapter, you will learn about: Representing algorithms Examples of algorithmic problem
More informationComputers. An Introduction to Programming with Python. Programming Languages. Programs and Programming. CCHSG Visit June 2014. Dr.-Ing.
Computers An Introduction to Programming with Python CCHSG Visit June 2014 Dr.-Ing. Norbert Völker Many computing devices are embedded Can you think of computers/ computing devices you may have in your
More informationIntroduction to Computer Science I Spring 2014 Mid-term exam Solutions
Introduction to Computer Science I Spring 2014 Mid-term exam Solutions 1. Question: Consider the following module of Python code... def thing_one (x): y = 0 if x == 1: y = x x = 2 if x == 2: y = -x x =
More informationModule 9 The CIS error profiling technology
Florian Fink Module 9 The CIS error profiling technology 2015-09-15 1 / 24 Module 9 The CIS error profiling technology Florian Fink Centrum für Informations- und Sprachverarbeitung (CIS) Ludwig-Maximilians-Universität
More informationCD-HIT User s Guide. Last updated: April 5, 2010. http://cd-hit.org http://bioinformatics.org/cd-hit/
CD-HIT User s Guide Last updated: April 5, 2010 http://cd-hit.org http://bioinformatics.org/cd-hit/ Program developed by Weizhong Li s lab at UCSD http://weizhong-lab.ucsd.edu liwz@sdsc.edu 1. Introduction
More informationRandom Fibonacci-type Sequences in Online Gambling
Random Fibonacci-type Sequences in Online Gambling Adam Biello, CJ Cacciatore, Logan Thomas Department of Mathematics CSUMS Advisor: Alfa Heryudono Department of Mathematics University of Massachusetts
More informationFace detection is a process of localizing and extracting the face region from the
Chapter 4 FACE NORMALIZATION 4.1 INTRODUCTION Face detection is a process of localizing and extracting the face region from the background. The detected face varies in rotation, brightness, size, etc.
More informationfor ECM Titanium) This guide contains a complete explanation of the Driver Maker plug-in, an add-on developed for
Driver Maker User Guide (Plug-in for ECM Titanium) Introduction This guide contains a complete explanation of the Driver Maker plug-in, an add-on developed for ECM Titanium, the chip-tuning software produced
More informationLempel-Ziv Coding Adaptive Dictionary Compression Algorithm
Lempel-Ziv Coding Adaptive Dictionary Compression Algorithm 1. LZ77:Sliding Window Lempel-Ziv Algorithm [gzip, pkzip] Encode a string by finding the longest match anywhere within a window of past symbols
More informationStatistical Machine Translation: IBM Models 1 and 2
Statistical Machine Translation: IBM Models 1 and 2 Michael Collins 1 Introduction The next few lectures of the course will be focused on machine translation, and in particular on statistical machine translation
More informationPart 2: Community Detection
Chapter 8: Graph Data Part 2: Community Detection Based on Leskovec, Rajaraman, Ullman 2014: Mining of Massive Datasets Big Data Management and Analytics Outline Community Detection - Social networks -
More informationCollecting Polish German Parallel Corpora in the Internet
Proceedings of the International Multiconference on ISSN 1896 7094 Computer Science and Information Technology, pp. 285 292 2007 PIPS Collecting Polish German Parallel Corpora in the Internet Monika Rosińska
More informationIntroduction to: Computers & Programming: Review for Midterm 2
Introduction to: Computers & Programming: Adam Meyers New York University Summary Some Procedural Matters Summary of what you need to Know For the Test and To Go Further in the Class The Practice Midterm
More informationDesCartes (Combined) Subject: Mathematics Goal: Data Analysis, Statistics, and Probability
DesCartes (Combined) Subject: Mathematics Goal: Data Analysis, Statistics, and Probability RIT Score Range: Below 171 Below 171 171-180 Data Analysis and Statistics Data Analysis and Statistics Solves
More informationArrangements And Duality
Arrangements And Duality 3.1 Introduction 3 Point configurations are tbe most basic structure we study in computational geometry. But what about configurations of more complicated shapes? For example,
More information5 INTEGER LINEAR PROGRAMMING (ILP) E. Amaldi Fondamenti di R.O. Politecnico di Milano 1
5 INTEGER LINEAR PROGRAMMING (ILP) E. Amaldi Fondamenti di R.O. Politecnico di Milano 1 General Integer Linear Program: (ILP) min c T x Ax b x 0 integer Assumption: A, b integer The integrality condition
More informationCS177 MIDTERM 2 PRACTICE EXAM SOLUTION. Name: Student ID:
CS177 MIDTERM 2 PRACTICE EXAM SOLUTION Name: Student ID: This practice exam is due the day of the midterm 2 exam. The solutions will be posted the day before the exam but we encourage you to look at the
More informationLinear Programming I
Linear Programming I November 30, 2003 1 Introduction In the VCR/guns/nuclear bombs/napkins/star wars/professors/butter/mice problem, the benevolent dictator, Bigus Piguinus, of south Antarctica penguins
More informationIBM SPSS Direct Marketing 23
IBM SPSS Direct Marketing 23 Note Before using this information and the product it supports, read the information in Notices on page 25. Product Information This edition applies to version 23, release
More informationUnit 5 Length. Year 4. Five daily lessons. Autumn term Unit Objectives. Link Objectives
Unit 5 Length Five daily lessons Year 4 Autumn term Unit Objectives Year 4 Suggest suitable units and measuring equipment to Page 92 estimate or measure length. Use read and write standard metric units
More informationChapter 11. 11.1 Load Balancing. Approximation Algorithms. Load Balancing. Load Balancing on 2 Machines. Load Balancing: Greedy Scheduling
Approximation Algorithms Chapter Approximation Algorithms Q. Suppose I need to solve an NP-hard problem. What should I do? A. Theory says you're unlikely to find a poly-time algorithm. Must sacrifice one
More informationWhy is Internal Audit so Hard?
Why is Internal Audit so Hard? 2 2014 Why is Internal Audit so Hard? 3 2014 Why is Internal Audit so Hard? Waste Abuse Fraud 4 2014 Waves of Change 1 st Wave Personal Computers Electronic Spreadsheets
More informationIBM SPSS Direct Marketing 22
IBM SPSS Direct Marketing 22 Note Before using this information and the product it supports, read the information in Notices on page 25. Product Information This edition applies to version 22, release
More informationHidden Markov Models
8.47 Introduction to omputational Molecular Biology Lecture 7: November 4, 2004 Scribe: Han-Pang hiu Lecturer: Ross Lippert Editor: Russ ox Hidden Markov Models The G island phenomenon The nucleotide frequencies
More informationFTP client Selection and Programming
COMP 431 INTERNET SERVICES & PROTOCOLS Spring 2016 Programming Homework 3, February 4 Due: Tuesday, February 16, 8:30 AM File Transfer Protocol (FTP), Client and Server Step 3 In this assignment you will
More informationDATA ANALYSIS IN PUBLIC SOCIAL NETWORKS
International Scientific Conference & International Workshop Present Day Trends of Innovations 2012 28 th 29 th May 2012 Łomża, Poland DATA ANALYSIS IN PUBLIC SOCIAL NETWORKS Lubos Takac 1 Michal Zabovsky
More informationComputational Mathematics with Python
Computational Mathematics with Python Basics Claus Führer, Jan Erik Solem, Olivier Verdier Spring 2010 Claus Führer, Jan Erik Solem, Olivier Verdier Computational Mathematics with Python Spring 2010 1
More informationGraphing calculators Transparencies (optional)
What if it is in pieces? Piecewise Functions and an Intuitive Idea of Continuity Teacher Version Lesson Objective: Length of Activity: Students will: Recognize piecewise functions and the notation used
More informationLecture 2, Introduction to Python. Python Programming Language
BINF 3360, Introduction to Computational Biology Lecture 2, Introduction to Python Young-Rae Cho Associate Professor Department of Computer Science Baylor University Python Programming Language Script
More information6.02 Practice Problems: Routing
1 of 9 6.02 Practice Problems: Routing IMPORTANT: IN ADDITION TO THESE PROBLEMS, PLEASE SOLVE THE PROBLEMS AT THE END OF CHAPTERS 17 AND 18. Problem 1. Consider the following networks: network I (containing
More informationComputational Mathematics with Python
Numerical Analysis, Lund University, 2011 1 Computational Mathematics with Python Chapter 1: Basics Numerical Analysis, Lund University Claus Führer, Jan Erik Solem, Olivier Verdier, Tony Stillfjord Spring
More information- Easy to insert & delete in O(1) time - Don t need to estimate total memory needed. - Hard to search in less than O(n) time
Skip Lists CMSC 420 Linked Lists Benefits & Drawbacks Benefits: - Easy to insert & delete in O(1) time - Don t need to estimate total memory needed Drawbacks: - Hard to search in less than O(n) time (binary
More informationLossless Data Compression Standard Applications and the MapReduce Web Computing Framework
Lossless Data Compression Standard Applications and the MapReduce Web Computing Framework Sergio De Agostino Computer Science Department Sapienza University of Rome Internet as a Distributed System Modern
More informationThe Advantages and Disadvantages of Network Computing Nodes
Big Data & Scripting storage networks and distributed file systems 1, 2, in the remainder we use networks of computing nodes to enable computations on even larger datasets for a computation, each node
More informationRecursive Algorithms. Recursion. Motivating Example Factorial Recall the factorial function. { 1 if n = 1 n! = n (n 1)! if n > 1
Recursion Slides by Christopher M Bourke Instructor: Berthe Y Choueiry Fall 007 Computer Science & Engineering 35 Introduction to Discrete Mathematics Sections 71-7 of Rosen cse35@cseunledu Recursive Algorithms
More informationPersistent Data Structures
6.854 Advanced Algorithms Lecture 2: September 9, 2005 Scribes: Sommer Gentry, Eddie Kohler Lecturer: David Karger Persistent Data Structures 2.1 Introduction and motivation So far, we ve seen only ephemeral
More informationOffline sorting buffers on Line
Offline sorting buffers on Line Rohit Khandekar 1 and Vinayaka Pandit 2 1 University of Waterloo, ON, Canada. email: rkhandekar@gmail.com 2 IBM India Research Lab, New Delhi. email: pvinayak@in.ibm.com