Index support for regular expression search. Alexander Korotkov PGCon 2012, Ottawa
|
|
|
- Christina Thornton
- 10 years ago
- Views:
Transcription
1 Index support for regular expression search Alexander Korotkov PGCon 2012, Ottawa
2 Introduction
3 What is regular expressions? Regular expressions are: powerful tool for text processing based on formal language theory expressing same class of languages as finite automata
4 So automata Regular expression can be transformed into automaton. Moreover, such transformation is really used by regex engines
5 So automata Automaton is a graph which vertices are states and which arcs are labeled by characters. Automaton reads string if you can type that string by a traversal from initial state to final state
6 Example /a(bc)*d/ becomes
7 Example xyzabcbcdxyz
8 Example xyzabcbcdxyz
9 Example xyzabcbcdxyz
10 Example xyzabcbcdxyz
11 Example xyzabcbcdxyz
12 Example xyzabcbcdxyz
13 Example xyzabcbcdxyz
14 Example xyzabcbcdxyz
15 Example xyzabcbcdxyz
16 Example xyzabcbcdxyz
17 Example xyzabcbcdxyz
18 Example xyzabcbcdxyz
19 Example xyzabcbcdxyz Finish! Match!
20 Okay, now all of us know Index support for regular expression search, Alexander Korotkov, PGCon 2012, Ottawa
21 Regex based search PostgreSQL can regex based search :) It s only a sequential search for a while :(
22 Inverted indexes on q-grams
23 Q-grams Q-gram is substring of length q which can be used as a signature of original string Widely used in various string processing tasks
24 Inverted index on q-grams Maintain association between q- gram and all the strings where it mentioned. pg_trgm has an implementation for q = 3
25 pg_trgm 1. "regular expressions", 2. "expressive speach", 3. "regular speaker" => ' e': {1,2} 'egu': {1,3} 'pre': {1,2} ' r': {1,3} 'er ': {3} 'reg': {1,3} ' s': {2,3} 'ess': {1,2} 'res': {1,2} ' ex': {1,2} 'exp': {1,2} 'sio': {1} ' re': {1,3} 'gul': {1,3} 'siv': {2} ' sp': {2,3} 'ion': {1} 'spe': {2,3} 'ach': {2} 'ive': {2} 'ssi': {1,2} 'ake': {3} 'ker': {3} 'ula': {1,3} 'ar ': {1,3} 'lar': {1,3} 've ': {2} 'ch ': {2} 'ns ': {1} 'xpr': {1,2} 'eac': {2} 'ons': {1} 'eak': {3} 'pea': {2,3} Index support for regular expression search, Alexander Korotkov, PGCon 2012, Ottawa
26 Q-grams frequencies From 2.5M of DBLP paper titles 360K contain trigram the Only 1 contains trigram zzz "Zzzzzzzzzzzzzzzzzzzzzzzzzz" chapter of the book "Formal Specification and Development in Z and B" by David Everett Not all q-grams of fixed q are equally useful.
27 V-grams or multigrams Each q-gram can have specific q Selectivity of all q-grams are similar (and low enough) More effective index search!
28 V-grams or multigrams Problems: Hard to maintain online
29 Patent trololo! Index support for regular expression search, Alexander Korotkov, PGCon 2012, Ottawa
30 How to use it for regex search?
31 General idea /[ab]cde/ => (acd OR bcd) AND cde
32 General idea /[ab]cde/ => (acd OR bcd) AND cde How to do this in general case?
33 Existing approaches for q-gram extraction
34 Scholar paper Junghoo Ch and Sridhar Rajagopalan, A fast regular expression indexing engine, Proceedings 18th International Conference on Data Engineering, 2002 Still widely referenced as state of art work about indexing for regular expressions.
35 FREE method Extract tree of continuous string fraction from regex. Transform those continuous fractions to multigrams (q-grams with variable q). Use inverted index on multigrams for query evaluation
36 FREE method: example Tree for /(abcd efgh)(ijklm x*)/
37 Scholar paper: example Replace * nodes with NULL
38 Scholar paper: example NULL eats parent OR node
39 Scholar paper: example AND node eats child NULL
40 Scholar paper: example Simplify a bit
41 Scholar paper: example Expand continuous string fractions into trigrams
42 Google code search Was launched in Supports regex search. Google guys are smart. It can t be a sequential scan. It also seemed to be something better than previous technique. We don t know what... :(
43 We didn t know what until Google code search was closed in 2011 :( Russ Cox has published description of indexing technique in January p/regexp4.html More than 5 years of intrigue!
44 Google code search method Get 5 characteristics about each part of regex: emptyable, exact, prefix, suffix, match. Recursively union them (with possible simplification) Use inverted index of trigrams for query evaluation (similar to pg_trgm)
45 Google code search method Original regex: /a(bc)+d/ a: {exact: a} bc: {exact: bc} d: {exact: d} (bc)+: {prefix:bc, suffix: bc} a(bc)+: {prefix:abc, suffix:bc} a(bc)+d: {prefix:abc, suffix:bcd}
46 Google code search method /a(bc)+d/ {prefix:abc, suffix:bcd} abc AND bcd
47 Proposed method
48 Proposed method Transform automaton into automaton like graph on trigrams Simplify that graph Use pg_trgm indexes
49 Transformation example /a(b+ c+)d/
50 Transformation example Index support for regular expression search, Alexander Korotkov, PGCon 2012, Ottawa
51 Transformation example Index support for regular expression search, Alexander Korotkov, PGCon 2012, Ottawa
52 Transformation example Index support for regular expression search, Alexander Korotkov, PGCon 2012, Ottawa
53 Transformation example Index support for regular expression search, Alexander Korotkov, PGCon 2012, Ottawa
54 Transformation example Index support for regular expression search, Alexander Korotkov, PGCon 2012, Ottawa
55 Transformation example Index support for regular expression search, Alexander Korotkov, PGCon 2012, Ottawa
56 Transformation example Index support for regular expression search, Alexander Korotkov, PGCon 2012, Ottawa
57 Transformation example Index support for regular expression search, Alexander Korotkov, PGCon 2012, Ottawa
58 Transformation example Index support for regular expression search, Alexander Korotkov, PGCon 2012, Ottawa
59 Transformation example Index support for regular expression search, Alexander Korotkov, PGCon 2012, Ottawa
60 Transformation example Index support for regular expression search, Alexander Korotkov, PGCon 2012, Ottawa
61 Transformation example Index support for regular expression search, Alexander Korotkov, PGCon 2012, Ottawa
62 Transformation example Index support for regular expression search, Alexander Korotkov, PGCon 2012, Ottawa
63 Transformation example Index support for regular expression search, Alexander Korotkov, PGCon 2012, Ottawa
64 Transformation example Index support for regular expression search, Alexander Korotkov, PGCon 2012, Ottawa
65 Transformation example Index support for regular expression search, Alexander Korotkov, PGCon 2012, Ottawa
66 Transformation example Index support for regular expression search, Alexander Korotkov, PGCon 2012, Ottawa
67 Transformation example Result could be simplified
68 abd abb bbd Index support for regular expression search, Alexander Korotkov, PGCon 2012, Ottawa acd acc ccd Transformation example Implemented simplification technique: collect following matrix Means: abd OR (abb AND bbd) OR acd OR (acc AND ccd)
69 Comparison on examples Regex: /(abc cba)def/ FREE: (abc OR cba) AND def GSC: def AND ((abc AND bcd AND cde) OR (ade AND bad AND cba)) My: (abc AND bcd AND cde AND def) OR (ade AND bad AND cba AND def) Index support for regular expression search, Alexander Korotkov, PGCon 2012, Ottawa
70 Comparison on examples Regex: /abc+de/ FREE: nothing GSC: abc AND cde My: (abc AND cde AND bcd) OR (abc AND cde AND bcc AND ccd)
71 Comparison on examples Regex: /(abc*)+de/ FREE: nothing GSC: nothing My: (abd AND bde) OR (abc AND bcd AND cde) OR (abc AND bcc AND ccd AND cde)
72 Comparison on examples Regex: /ab(cd)*ef/ FREE: nothing GSC: nothing My: (abe AND bef) OR (abc AND bde AND cde AND def)
73 Performance results 2.5 M DBLP paper titles of 47 avg. length Regex Index scan Seq scan /database.*(sql query)/ 773 ms ms /postgres(ql)?/ 268 ms ms /plan+er/ 253 ms ms /(nucl anino).*acid/ 200 ms ms /[aei](bc)+a/ 2 ms ms
74 WIP patch for pg_trgm WIP patch was posted to mailing list: Any feedback is welcome. Index support for regular expression search, Alexander Korotkov, PGCon 2012, Ottawa
75 Problems Possible large transformed graph Possible large simplified presentation and it s high computational complexity Usage of trigrams rather than v-grams or multigrams
76 What did we miss? Difference between deterministic and nondeterministic automata Grouping characters into colors Handling of start/end of string/line
77 Help needed Regular expressions and string collections from reallife tasks for proving effectiveness of proposed method.
78 Thank you for attention!
Unique column combinations
Unique column combinations Arvid Heise Guest lecture in Data Profiling and Data Cleansing Prof. Dr. Felix Naumann Agenda 2 Introduction and problem statement Unique column combinations Exponential search
Lecture Notes on Database Normalization
Lecture Notes on Database Normalization Chengkai Li Department of Computer Science and Engineering The University of Texas at Arlington April 15, 2012 I decided to write this document, because many students
HOW TO USE MINITAB: DESIGN OF EXPERIMENTS. Noelle M. Richard 08/27/14
HOW TO USE MINITAB: DESIGN OF EXPERIMENTS 1 Noelle M. Richard 08/27/14 CONTENTS 1. Terminology 2. Factorial Designs When to Use? (preliminary experiments) Full Factorial Design General Full Factorial Design
6.045: Automata, Computability, and Complexity Or, Great Ideas in Theoretical Computer Science Spring, 2010. Class 4 Nancy Lynch
6.045: Automata, Computability, and Complexity Or, Great Ideas in Theoretical Computer Science Spring, 2010 Class 4 Nancy Lynch Today Two more models of computation: Nondeterministic Finite Automata (NFAs)
5. A full binary tree with n leaves contains [A] n nodes. [B] log n 2 nodes. [C] 2n 1 nodes. [D] n 2 nodes.
1. The advantage of.. is that they solve the problem if sequential storage representation. But disadvantage in that is they are sequential lists. [A] Lists [B] Linked Lists [A] Trees [A] Queues 2. The
Data Mining: Partially from: Introduction to Data Mining by Tan, Steinbach, Kumar
Data Mining: Association Analysis Partially from: Introduction to Data Mining by Tan, Steinbach, Kumar Association Rule Mining Given a set of transactions, find rules that will predict the occurrence of
Boolean Algebra (cont d) UNIT 3 BOOLEAN ALGEBRA (CONT D) Guidelines for Multiplying Out and Factoring. Objectives. Iris Hui-Ru Jiang Spring 2010
Boolean Algebra (cont d) 2 Contents Multiplying out and factoring expressions Exclusive-OR and Exclusive-NOR operations The consensus theorem Summary of algebraic simplification Proving validity of an
Formal Languages and Automata Theory - Regular Expressions and Finite Automata -
Formal Languages and Automata Theory - Regular Expressions and Finite Automata - Samarjit Chakraborty Computer Engineering and Networks Laboratory Swiss Federal Institute of Technology (ETH) Zürich March
Unit 3 Boolean Algebra (Continued)
Unit 3 Boolean Algebra (Continued) 1. Exclusive-OR Operation 2. Consensus Theorem Department of Communication Engineering, NCTU 1 3.1 Multiplying Out and Factoring Expressions Department of Communication
Honors Class (Foundations of) Informatics. Tom Verhoeff. Department of Mathematics & Computer Science Software Engineering & Technology
Honors Class (Foundations of) Informatics Tom Verhoeff Department of Mathematics & Computer Science Software Engineering & Technology www.win.tue.nl/~wstomv/edu/hci c 2011, T. Verhoeff @ TUE.NL 1/20 Information
Finite Automata. Reading: Chapter 2
Finite Automata Reading: Chapter 2 1 Finite Automaton (FA) Informally, a state diagram that comprehensively captures all possible states and transitions that a machine can take while responding to a stream
VGRAM: Improving Performance of Approximate Queries on String Collections Using Variable-Length Grams
VGRAM: Improving Performance of Approximate Queries on String Collections Using Variable-Length Grams Chen Li University of California, Irvine CA 9697, USA [email protected] Bin Wang Northeastern University
Die Welt Multimedia-Reichweite
Die Welt Multimedia-Reichweite 1) Background The quantification of Die Welt s average daily audience (known as Multimedia-Reichweite, MMR) has been developed by Die Welt management, including the research
Introduction to Finite Automata
Introduction to Finite Automata Our First Machine Model Captain Pedro Ortiz Department of Computer Science United States Naval Academy SI-340 Theory of Computing Fall 2012 Captain Pedro Ortiz (US Naval
How to bet using different NairaBet Bet Combinations (Combo)
How to bet using different NairaBet Bet Combinations (Combo) SINGLES Singles consists of single bets. I.e. it will contain just a single selection of any sport. The bet slip of a singles will look like
Finite Automata. Reading: Chapter 2
Finite Automata Reading: Chapter 2 1 Finite Automata Informally, a state machine that comprehensively captures all possible states and transitions that a machine can take while responding to a stream (or
10CS35: Data Structures Using C
CS35: Data Structures Using C QUESTION BANK REVIEW OF STRUCTURES AND POINTERS, INTRODUCTION TO SPECIAL FEATURES OF C OBJECTIVE: Learn : Usage of structures, unions - a conventional tool for handling a
The safer, easier way to help you pass any IT exams. SAP Certified Application Associate - SAP HANA 1.0. Title : Version : Demo 1 / 5
Exam : C_HANAIMP_1 Title : SAP Certified Application Associate - SAP HANA 1.0 Version : Demo 1 / 5 1.Which of the following nodes can you use when you create a calculation view with the SAP HANA studio
United States Naval Academy Electrical and Computer Engineering Department. EC262 Exam 1
United States Naval Academy Electrical and Computer Engineering Department EC262 Exam 29 September 2. Do a page check now. You should have pages (cover & questions). 2. Read all problems in their entirety.
KNOWLEDGE IS POWER The BEST first-timers guide to betting on and winning at the races you ll EVER encounter
KNOWLEDGE IS POWER The BEST first-timers guide to betting on and winning at the races you ll EVER encounter Copyright 2013 James Witherite. All Rights Reserved.. WELCOME TO THE RACES! If you re new to
CH3 Boolean Algebra (cont d)
CH3 Boolean Algebra (cont d) Lecturer: 吳 安 宇 Date:2005/10/7 ACCESS IC LAB v Today, you ll know: Introduction 1. Guidelines for multiplying out/factoring expressions 2. Exclusive-OR and Equivalence operations
http://www.castlelearning.com/review/teacher/assignmentprinting.aspx 5. 2 6. 2 1. 10 3. 70 2. 55 4. 180 7. 2 8. 4
of 9 1/28/2013 8:32 PM Teacher: Mr. Sime Name: 2 What is the slope of the graph of the equation y = 2x? 5. 2 If the ratio of the measures of corresponding sides of two similar triangles is 4:9, then the
Compiler Construction
Compiler Construction Regular expressions Scanning Görel Hedin Reviderad 2013 01 23.a 2013 Compiler Construction 2013 F02-1 Compiler overview source code lexical analysis tokens intermediate code generation
Data Mining Apriori Algorithm
10 Data Mining Apriori Algorithm Apriori principle Frequent itemsets generation Association rules generation Section 6 of course book TNM033: Introduction to Data Mining 1 Association Rule Mining (ARM)
Automata and Formal Languages
Automata and Formal Languages Winter 2009-2010 Yacov Hel-Or 1 What this course is all about This course is about mathematical models of computation We ll study different machine models (finite automata,
Data Mining Association Analysis: Basic Concepts and Algorithms. Lecture Notes for Chapter 6. Introduction to Data Mining
Data Mining Association Analysis: Basic Concepts and Algorithms Lecture Notes for Chapter 6 Introduction to Data Mining by Tan, Steinbach, Kumar Tan,Steinbach, Kumar Introduction to Data Mining 4/8/24
A Multiple Sliding Windows Approach to Speed Up String Matching Algorithms
A Multiple Sliding Windows Approach to Speed Up String Matching Algorithms Simone Faro and Thierry Lecroq Università di Catania, Viale A.Doria n.6, 95125 Catania, Italy Université de Rouen, LITIS EA 4108,
Python Loops and String Manipulation
WEEK TWO Python Loops and String Manipulation Last week, we showed you some basic Python programming and gave you some intriguing problems to solve. But it is hard to do anything really exciting until
Converting Finite Automata to Regular Expressions
Converting Finite Automata to Regular Expressions Alexander Meduna, Lukáš Vrábel, and Petr Zemek Brno University of Technology, Faculty of Information Technology Božetěchova 1/2, 612 00 Brno, CZ http://www.fit.vutbr.cz/
So today we shall continue our discussion on the search engines and web crawlers. (Refer Slide Time: 01:02)
Internet Technology Prof. Indranil Sengupta Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Lecture No #39 Search Engines and Web Crawler :: Part 2 So today we
PES Institute of Technology-BSC QUESTION BANK
PES Institute of Technology-BSC Faculty: Mrs. R.Bharathi CS35: Data Structures Using C QUESTION BANK UNIT I -BASIC CONCEPTS 1. What is an ADT? Briefly explain the categories that classify the functions
Domain Name System. DNS is an example of a large scale client-server application. Copyright 2014 Jim Martin
Domain Name System: DNS Objective: map names to IP addresses (i.e., high level names to low level names) Original namespace was flat, didn t scale.. Hierarchical naming permits decentralization by delegating
Association Analysis: Basic Concepts and Algorithms
6 Association Analysis: Basic Concepts and Algorithms Many business enterprises accumulate large quantities of data from their dayto-day operations. For example, huge amounts of customer purchase data
Genetic programming with regular expressions
Genetic programming with regular expressions Børge Svingen Chief Technology Officer, Open AdExchange [email protected] 2009-03-23 Pattern discovery Pattern discovery: Recognizing patterns that characterize
ASSIGNMENT ONE SOLUTIONS MATH 4805 / COMP 4805 / MATH 5605
ASSIGNMENT ONE SOLUTIONS MATH 4805 / COMP 4805 / MATH 5605 (1) (a) (0 + 1) 010 (finite automata below). (b) First observe that the following regular expression generates the binary strings with an even
Frequent item set mining
Frequent item set mining Christian Borgelt Frequent item set mining is one of the best known and most popular data mining methods. Originally developed for market basket analysis, it is used nowadays for
Automata Theory. Şubat 2006 Tuğrul Yılmaz Ankara Üniversitesi
Automata Theory Automata theory is the study of abstract computing devices. A. M. Turing studied an abstract machine that had all the capabilities of today s computers. Turing s goal was to describe the
CSC4510 AUTOMATA 2.1 Finite Automata: Examples and D efinitions Definitions
CSC45 AUTOMATA 2. Finite Automata: Examples and Definitions Finite Automata: Examples and Definitions A finite automaton is a simple type of computer. Itsoutputislimitedto yes to or no. It has very primitive
Lexical Analysis and Scanning. Honors Compilers Feb 5 th 2001 Robert Dewar
Lexical Analysis and Scanning Honors Compilers Feb 5 th 2001 Robert Dewar The Input Read string input Might be sequence of characters (Unix) Might be sequence of lines (VMS) Character set ASCII ISO Latin-1
Large-Scale Data Cleaning Using Hadoop. UC Irvine
Chen Li UC Irvine Joint work with Michael Carey, Alexander Behm, Shengyue Ji, Rares Vernica 1 Overview Importance of information Importance of information quality Data cleaning Large scale Hadoop 2 Data
Soccer Bet Types Content
Soccer Bet Types Content Content 1 1. Asian Handicap 2 Examples on Asian Handicap All-Up Win bets 3 2. Win/Draw/Win 5 3. All-Up Win 6 Banker Combo 9 Multiple Combo 12 4. Over/Under 14 5. Correct Scores
Database Design and Normalization
Database Design and Normalization Chapter 10 (Week 11) EE562 Slides and Modified Slides from Database Management Systems, R. Ramakrishnan 1 Computing Closure F + Example: List all FDs with: - a single
Association Rule Mining
Association Rule Mining Association Rules and Frequent Patterns Frequent Pattern Mining Algorithms Apriori FP-growth Correlation Analysis Constraint-based Mining Using Frequent Patterns for Classification
CIRCLE THEOREMS. Edexcel GCSE Mathematics (Linear) 1MA0
Edexcel GCSE Mathematics (Linear) 1MA0 CIRCLE THEOREMS Materials required for examination Ruler graduated in centimetres and millimetres, protractor, compasses, pen, HB pencil, eraser. Tracing paper may
Boolean Algebra Part 1
Boolean Algebra Part 1 Page 1 Boolean Algebra Objectives Understand Basic Boolean Algebra Relate Boolean Algebra to Logic Networks Prove Laws using Truth Tables Understand and Use First Basic Theorems
The University of the State of New York REGENTS HIGH SCHOOL EXAMINATION GEOMETRY. Thursday, January 24, 2013 9:15 a.m. to 12:15 p.m.
GEOMETRY The University of the State of New York REGENTS HIGH SCHOOL EXAMINATION GEOMETRY Thursday, January 24, 2013 9:15 a.m. to 12:15 p.m., only Student Name: School Name: The possession or use of any
Wan Accelerators: Optimizing Network Traffic with Compression. Bartosz Agas, Marvin Germar & Christopher Tran
Wan Accelerators: Optimizing Network Traffic with Compression Bartosz Agas, Marvin Germar & Christopher Tran Introduction A WAN accelerator is an appliance that can maximize the services of a point-to-point(ptp)
Pushdown Automata. place the input head on the leftmost input symbol. while symbol read = b and pile contains discs advance head remove disc from pile
Pushdown Automata In the last section we found that restricting the computational power of computing devices produced solvable decision problems for the class of sets accepted by finite automata. But along
Class Notes CS 3137. 1 Creating and Using a Huffman Code. Ref: Weiss, page 433
Class Notes CS 3137 1 Creating and Using a Huffman Code. Ref: Weiss, page 433 1. FIXED LENGTH CODES: Codes are used to transmit characters over data links. You are probably aware of the ASCII code, a fixed-length
How To Find Out What A Key Is In A Database Engine
Database design theory, Part I Functional dependencies Introduction As we saw in the last segment, designing a good database is a non trivial matter. The E/R model gives a useful rapid prototyping tool,
Lecture 18 Regular Expressions
Lecture 18 Regular Expressions Many of today s web applications require matching patterns in a text document to look for specific information. A good example is parsing a html file to extract tags
1. Domain Name System
1.1 Domain Name System (DNS) 1. Domain Name System To identify an entity, the Internet uses the IP address, which uniquely identifies the connection of a host to the Internet. However, people prefer to
澳 門 彩 票 有 限 公 司 SLOT Sociedade de Lotarias e Apostas Mútuas de Macau, Lda. Soccer Bet Types
Soccer Bet Types 1. Asian Handicap Bet on a team to win in a designated match. Bets will be fully refunded in the case of a draw result after calculating handicap-goal*. *Handicap-goal Handicap-goal applies
Mining Association Rules. Mining Association Rules. What Is Association Rule Mining? What Is Association Rule Mining? What is Association rule mining
Mining Association Rules What is Association rule mining Mining Association Rules Apriori Algorithm Additional Measures of rule interestingness Advanced Techniques 1 2 What Is Association Rule Mining?
Why is Internal Audit so Hard?
Why is Internal Audit so Hard? 2 2014 Why is Internal Audit so Hard? 3 2014 Why is Internal Audit so Hard? Waste Abuse Fraud 4 2014 Waves of Change 1 st Wave Personal Computers Electronic Spreadsheets
131-1. Adding New Level in KDD to Make the Web Usage Mining More Efficient. Abstract. 1. Introduction [1]. 1/10
1/10 131-1 Adding New Level in KDD to Make the Web Usage Mining More Efficient Mohammad Ala a AL_Hamami PHD Student, Lecturer m_ah_1@yahoocom Soukaena Hassan Hashem PHD Student, Lecturer soukaena_hassan@yahoocom
Lecture 24: Saccheri Quadrilaterals
Lecture 24: Saccheri Quadrilaterals 24.1 Saccheri Quadrilaterals Definition In a protractor geometry, we call a quadrilateral ABCD a Saccheri quadrilateral, denoted S ABCD, if A and D are right angles
Informatique Fondamentale IMA S8
Informatique Fondamentale IMA S8 Cours 1 - Intro + schedule + finite state machines Laure Gonnord http://laure.gonnord.org/pro/teaching/ [email protected] Université Lille 1 - Polytech Lille
Being Regular with Regular Expressions. John Garmany Session
Being Regular with Regular Expressions John Garmany Session John Garmany Senior Consultant Burleson Consulting Who Am I West Point Graduate GO ARMY! Masters Degree Information Systems Graduate Certificate
CS 2112 Spring 2014. 0 Instructions. Assignment 3 Data Structures and Web Filtering. 0.1 Grading. 0.2 Partners. 0.3 Restrictions
CS 2112 Spring 2014 Assignment 3 Data Structures and Web Filtering Due: March 4, 2014 11:59 PM Implementing spam blacklists and web filters requires matching candidate domain names and URLs very rapidly
COLLEGE ALGEBRA. Paul Dawkins
COLLEGE ALGEBRA Paul Dawkins Table of Contents Preface... iii Outline... iv Preliminaries... Introduction... Integer Exponents... Rational Exponents... 9 Real Exponents...5 Radicals...6 Polynomials...5
Hint 1. Answer (b) first. Make the set as simple as possible and try to generalize the phenomena it exhibits. [Caution: the next hint is an answer to
Problem. Consider the collection of all subsets A of the topological space X. The operations of osure A Ā and ementation A X A are functions from this collection to itself. (a) Show that starting with
CS103B Handout 17 Winter 2007 February 26, 2007 Languages and Regular Expressions
CS103B Handout 17 Winter 2007 February 26, 2007 Languages and Regular Expressions Theory of Formal Languages In the English language, we distinguish between three different identities: letter, word, sentence.
Binary Coded Web Access Pattern Tree in Education Domain
Binary Coded Web Access Pattern Tree in Education Domain C. Gomathi P.G. Department of Computer Science Kongu Arts and Science College Erode-638-107, Tamil Nadu, India E-mail: [email protected] M. Moorthi
Benchmark Databases for Testing Big-Data Analytics In Cloud Environments
North Carolina State University Graduate Program in Operations Research Benchmark Databases for Testing Big-Data Analytics In Cloud Environments Rong Huang Rada Chirkova Yahya Fathi ICA CON 2012 April
Finite Automata and Regular Languages
CHAPTER 3 Finite Automata and Regular Languages 3. Introduction 3.. States and Automata A finite-state machine or finite automaton (the noun comes from the Greek; the singular is automaton, the Greek-derived
E-mail Listeners. E-mail Formats. Free Form. Formatted
E-mail Listeners 6 E-mail Formats You use the E-mail Listeners application to receive and process Service Requests and other types of tickets through e-mail in the form of e-mail messages. Using E- mail
28 Simply Confirming Onsite
28 Simply Confirming Onsite Status 28.1 This chapter describes available monitoring tools....28-2 28.2 Monitoring Operational Status...28-5 28.3 Monitoring Device Values... 28-11 28.4 Monitoring Symbol
Today s Agenda. Automata and Logic. Quiz 4 Temporal Logic. Introduction Buchi Automata Linear Time Logic Summary
Today s Agenda Quiz 4 Temporal Logic Formal Methods in Software Engineering 1 Automata and Logic Introduction Buchi Automata Linear Time Logic Summary Formal Methods in Software Engineering 2 1 Buchi Automata
DATA STRUCTURES USING C
DATA STRUCTURES USING C QUESTION BANK UNIT I 1. Define data. 2. Define Entity. 3. Define information. 4. Define Array. 5. Define data structure. 6. Give any two applications of data structures. 7. Give
Introduction to SQL for Data Scientists
Introduction to SQL for Data Scientists Ben O. Smith College of Business Administration University of Nebraska at Omaha Learning Objectives By the end of this document you will learn: 1. How to perform
MAXIMAL FREQUENT ITEMSET GENERATION USING SEGMENTATION APPROACH
MAXIMAL FREQUENT ITEMSET GENERATION USING SEGMENTATION APPROACH M.Rajalakshmi 1, Dr.T.Purusothaman 2, Dr.R.Nedunchezhian 3 1 Assistant Professor (SG), Coimbatore Institute of Technology, India, [email protected]
Business Process Modeling
Business Process Concepts Process Mining Kelly Rosa Braghetto Instituto de Matemática e Estatística Universidade de São Paulo [email protected] January 30, 2009 1 / 41 Business Process Concepts Process
International Journal of Advanced Research in Computer Science and Software Engineering
Volume 3, Issue 7, July 23 ISSN: 2277 28X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Greedy Algorithm:
A Partition-Based Efficient Algorithm for Large Scale. Multiple-Strings Matching
A Partition-Based Efficient Algorithm for Large Scale Multiple-Strings Matching Ping Liu Jianlong Tan, Yanbing Liu Software Division, Institute of Computing Technology, Chinese Academy of Sciences, Beijing,
03 - Lexical Analysis
03 - Lexical Analysis First, let s see a simplified overview of the compilation process: source code file (sequence of char) Step 2: parsing (syntax analysis) arse Tree Step 1: scanning (lexical analysis)
AREAS OF PARALLELOGRAMS AND TRIANGLES
15 MATHEMATICS AREAS OF PARALLELOGRAMS AND TRIANGLES CHAPTER 9 9.1 Introduction In Chapter 5, you have seen that the study of Geometry, originated with the measurement of earth (lands) in the process of
Course Manual Automata & Complexity 2015
Course Manual Automata & Complexity 2015 Course code: Course homepage: Coordinator: Teachers lectures: Teacher exercise classes: Credits: X_401049 http://www.cs.vu.nl/~tcs/ac prof. dr. W.J. Fokkink home:
SPAMMING BOTNETS: SIGNATURES AND CHARACTERISTICS
SPAMMING BOTNETS: SIGNATURES AND CHARACTERISTICS INTRODUCTION BOTNETS IN SPAMMING WHAT IS AUTORE? FACING CHALLENGES? WE CAN SOLVE THEM METHODS TO DEAL WITH THAT CHALLENGES Extract URL string, source server
Managing large sound databases using Mpeg7
Max Jacob 1 1 Institut de Recherche et Coordination Acoustique/Musique (IRCAM), place Igor Stravinsky 1, 75003, Paris, France Correspondence should be addressed to Max Jacob ([email protected]) ABSTRACT
Pushdown automata. Informatics 2A: Lecture 9. Alex Simpson. 3 October, 2014. School of Informatics University of Edinburgh [email protected].
Pushdown automata Informatics 2A: Lecture 9 Alex Simpson School of Informatics University of Edinburgh [email protected] 3 October, 2014 1 / 17 Recap of lecture 8 Context-free languages are defined by context-free
University Convocation. IT 3203 Introduction to Web Development. Pattern Matching. Why Match Patterns? The Search Method. The Replace Method
IT 3203 Introduction to Web Development Regular Expressions October 12 Notice: This session is being recorded. Copyright 2007 by Bob Brown University Convocation Tuesday, October 13, 11:00 AM 12:15 PM
Using Logic to Design Computer Components
CHAPTER 13 Using Logic to Design Computer Components Parallel and sequential operation In this chapter we shall see that the propositional logic studied in the previous chapter can be used to design digital
Deterministic Finite Automata
1 Deterministic Finite Automata Definition: A deterministic finite automaton (DFA) consists of 1. a finite set of states (often denoted Q) 2. a finite set Σ of symbols (alphabet) 3. a transition function
Computational Geometry Lab: FEM BASIS FUNCTIONS FOR A TETRAHEDRON
Computational Geometry Lab: FEM BASIS FUNCTIONS FOR A TETRAHEDRON John Burkardt Information Technology Department Virginia Tech http://people.sc.fsu.edu/ jburkardt/presentations/cg lab fem basis tetrahedron.pdf
Topic 11. Statistics 514: Design of Experiments. Topic Overview. This topic will cover. 2 k Factorial Design. Blocking/Confounding
Topic Overview This topic will cover 2 k Factorial Design Blocking/Confounding Fractional Factorial Designs 3 k Factorial Design Statistics 514: Design of Experiments Topic 11 2 k Factorial Design Each
Lecture 1. Basic Concepts of Set Theory, Functions and Relations
September 7, 2005 p. 1 Lecture 1. Basic Concepts of Set Theory, Functions and Relations 0. Preliminaries...1 1. Basic Concepts of Set Theory...1 1.1. Sets and elements...1 1.2. Specification of sets...2
Regular Languages and Finite State Machines
Regular Languages and Finite State Machines Plan for the Day: Mathematical preliminaries - some review One application formal definition of finite automata Examples 1 Sets A set is an unordered collection
: HP HP0-M28. : Implementing HP Asset Manager Software. Version : R6.1
Exam : HP HP0-M28 Title : Implementing HP Asset Manager Software Version : R6.1 Prepking - King of Computer Certification Important Information, Please Read Carefully Other Prepking products A) Offline
CAD Algorithms. P and NP
CAD Algorithms The Classes P and NP Mohammad Tehranipoor ECE Department 6 September 2010 1 P and NP P and NP are two families of problems. P is a class which contains all of the problems we solve using
ALTIRIS CMDB Solution 6.5 Product Guide
ALTIRIS CMDB Solution 6.5 Product Guide Notice Altiris CMDB Solution 6.5 2001-2007 Altiris, Inc. All rights reserved. Document Date: July 19, 2007 Information in this document: (i) is provided for informational
Lecture 2: Regular Languages [Fa 14]
Caveat lector: This is the first edition of this lecture note. Please send bug reports and suggestions to [email protected]. But the Lord came down to see the city and the tower the people were building.
Geometry Module 4 Unit 2 Practice Exam
Name: Class: Date: ID: A Geometry Module 4 Unit 2 Practice Exam Multiple Choice Identify the choice that best completes the statement or answers the question. 1. Which diagram shows the most useful positioning
