4.5 Symbol Table Applications

Size: px
Start display at page:

Download "4.5 Symbol Table Applications"

Transcription

1 Set ADT 4.5 Symbol Table Applications Set ADT: unordered collection of distinct keys. Insert a key. Check if set contains a given key. Delete a key. SET interface. addkey) insert the key containskey) is given key present? removekey) remove the key iterator) return iterator over all keys Q. How to implement? Java library: java.util.hashset. Set Client: Remove Duplicates More Set Applications Remove duplicates. [e.g., from commercial mailing list] Read in a key. If key is not in set, insert and print it out. Application Purpose Key Spell checker Identify misspelled words Word Browser Highlight previously visited pages URL public class DeDup { public static void mainstring[] args) { SET<String> set = new SET<String>); while StdIn.isEmpty)) { String key = StdIn.readString); if set.containskey)) { set.addkey); System.out.printlnkey); Chess Detect repetition draw Board position Spam blacklist Prevent spam IP addresses of spammers 3 4

2 Vectors and Matrices Inverted Index Inverted index. Given a list of documents, preprocess so that you can quickly find all documents containing a given query. Ex 1. Book index. Ex. Web search engine index. Key. Query word. Value. Set of documents. 6 Inverted Index Implementation Inverted Index Implementation cont) public class InvertedIndex { public static void mainstring[] args) { ST<String, SET<String>> st; st = new ST<String, SET<String>>); use SET to avoid duplicates // create inverted index for String filename : args) { In in = new Infilename); while in.isempty)) { String word = in.readstring); if st.containsword)) st.putword, new SET<String>)); SET<String> set = st.getword); set.addfilename); // read queries from standard input In stdin = new In); while stdin.isempty)) { String query = in.readstring); if st.containsquery)) System.out.println"NOT FOUND"); else { SET<String> set = st.getquery); for String filename : set) System.out.printfilename + " "); System.out.println); 7 8

3 Inverted Index Extensions. Ignore stopwords: the, on, of, etc. Multi-word queries: set intersection AND) set union OR). Record position and number of occurrences of word in document. Sparse Vectors and Matrices 9 Vectors and Matrices Sparsity Vector. Ordered sequence of N real numbers. Matrix. N-by-N table of real numbers. Def. An N-by-N matrix is sparse if it has ON) nonzeros entries. Empirical fact. Large matrices that arise in practice are usually sparse. [ ], b = ["1 ] [ ] a = a + b = " a o b = 0 # "1) + 3 # ) + 15 # ) = 36 Matrix representations. D array: space proportional to N. Goal: space proportional to number of nonzeros, without sacrificing fast access to individual elements. A = # 0 1 1& 4 ", B = $ ' # 1 0 0& 0 0, A + B = $ 0 0 3' # 1 1 1& 6 " $ ' Ex. Google performs matrix-vector product with N = 4 billion # 0 1 1& 4 " ) $ ' #"1& $ ' = # 4& $ 36' 11 1

4 Sparse Vector Implementation Sparse Vector Implementation cont) public class SparseVector { private final int N; private ST<Integer, Double> st = new ST<Integer, Double>); public SparseVectorint N) { this.n = N; public void putint i, double value) { if value == 0.0) st.removei); else st.puti, value); public double getint i) { if st.containsi)) return st.geti); else return 0.0; a[i] = value a[i] // return a b public double dotsparsevector b) { SparseVector a = this; double sum = 0.0; for int i : a.st) if b.st.containsi)) sum += a.geti) * b.geti); return sum; // return c = a + b public SparseVector plussparsevector b) { SparseVector a = this; SparseVector c = new SparseVectorN); for int i : a.st) c.puti, a.geti)); for int i : b.st) c.puti, b.geti) + c.geti)); return c; Sparse Matrix Implementation Sparse Matrix Implementation cont) public class SparseMatrix { private final int N; private SparseVector[] rows; public SparseMatrixint N) { this.n = N; rows = new SparseVector[N]; for int i = 0; i < N; i++) rows[i] = new SparseVectorN); public void putint i, int j, double value) { rows[i].putj, value); public double getint i, int j) { return rows[i].getj); N-by-N matrix of all zeros A[i][j] = value A[i][j] // return b = A*x public SparseVector timessparsevector x) { SparseMatrix A = this; SparseVector b = new SparseVectorN); for int i = 0; i < N; i++) b.puti, rows[i].dotx)); return b; // return C = A + B public SparseMatrix plussparsematrix B) { SparseMatrix A = this; SparseMatrix C = new SparseMatrixN); for int i = 0; i < N; i++) C.rows[i] = A.rows[i].plusB.rows[i]); return C; 15 16

5 A Plan for Spam A Plan for Spam Bayesian spam filter. Filter based on analysis of previous messages. User trains the filter by classifying messages as spam or ham. Parse messages into tokens alphanumeric, dashes, ', $) Build data structures. Hash table A of tokens and frequencies for spam. Hash table B of tokens and frequencies for ham. Hash table C of tokens with probability p that they appear in spam. double h =.0 * ham.freqword); double s = 1.0 * spam.freqword); double p = s/spams) / h/hams + s/spams); bias probabilities to avoid false positives Reference: 18 A Plan for Spam Identify incoming as spam or ham. Find 15 most interesting tokens difference from 0.5). Combine probabilities using Bayes law. which data structure? p 1 " p " L " p 15 p1 " p " L" p15) + 1 p1) " 1 p ) " L" 1 p15) ) Declare as spam if threshold > 0.9. Details. Words you've never seen. Words that appear in ham corpus but not spam corpus, vice versa. Words that appear less than 5 times in spam and ham corpuses. Update data structures. 19

http://algs4.cs.princeton.edu dictionary find definition word definition book index find relevant pages term list of page numbers

http://algs4.cs.princeton.edu dictionary find definition word definition book index find relevant pages term list of page numbers ROBERT SEDGEWICK KEVI WAYE F O U R T H E D I T I O ROBERT SEDGEWICK KEVI WAYE ROBERT SEDGEWICK KEVI WAYE Symbol tables Symbol table applications Key-value pair abstraction. Insert a value with specified.

More information

cs2010: algorithms and data structures

cs2010: algorithms and data structures cs2010: algorithms and data structures Lecture 11: Symbol Table ADT. Vasileios Koutavas 28 Oct 2015 School of Computer Science and Statistics Trinity College Dublin Algorithms ROBERT SEDGEWICK KEVIN WAYNE

More information

Hashing. hash functions collision resolution applications. References: Algorithms in Java, Chapter 14. http://www.cs.princeton.edu/introalgsds/42hash

Hashing. hash functions collision resolution applications. References: Algorithms in Java, Chapter 14. http://www.cs.princeton.edu/introalgsds/42hash Hashing hash functions collision resolution applications References: Algorithms in Java, Chapter 14 http://www.cs.princeton.edu/introalgsds/42hash 1 Summary of symbol-table implementations implementation

More information

1.4 Arrays Introduction to Programming in Java: An Interdisciplinary Approach Robert Sedgewick and Kevin Wayne Copyright 2002 2010 2/6/11 12:33 PM!

1.4 Arrays Introduction to Programming in Java: An Interdisciplinary Approach Robert Sedgewick and Kevin Wayne Copyright 2002 2010 2/6/11 12:33 PM! 1.4 Arrays Introduction to Programming in Java: An Interdisciplinary Approach Robert Sedgewick and Kevin Wayne Copyright 2002 2010 2/6/11 12:33 PM! A Foundation for Programming any program you might want

More information

Achieve more with less

Achieve more with less Energy reduction Bayesian Filtering: the essentials - A Must-take approach in any organization s Anti-Spam Strategy - Whitepaper Achieve more with less What is Bayesian Filtering How Bayesian Filtering

More information

Homework 4 Statistics W4240: Data Mining Columbia University Due Tuesday, October 29 in Class

Homework 4 Statistics W4240: Data Mining Columbia University Due Tuesday, October 29 in Class Problem 1. (10 Points) James 6.1 Problem 2. (10 Points) James 6.3 Problem 3. (10 Points) James 6.5 Problem 4. (15 Points) James 6.7 Problem 5. (15 Points) James 6.10 Homework 4 Statistics W4240: Data Mining

More information

API for java.util.iterator. ! hasnext() Are there more items in the list? ! next() Return the next item in the list.

API for java.util.iterator. ! hasnext() Are there more items in the list? ! next() Return the next item in the list. Sequences and Urns 2.7 Lists and Iterators Sequence. Ordered collection of items. Key operations. Insert an item, iterate over the items. Design challenge. Support iteration by client, without revealing

More information

AP Computer Science Java Mr. Clausen Program 9A, 9B

AP Computer Science Java Mr. Clausen Program 9A, 9B AP Computer Science Java Mr. Clausen Program 9A, 9B PROGRAM 9A I m_sort_of_searching (20 points now, 60 points when all parts are finished) The purpose of this project is to set up a program that will

More information

Introduction to Bayesian Classification (A Practical Discussion) Todd Holloway Lecture for B551 Nov. 27, 2007

Introduction to Bayesian Classification (A Practical Discussion) Todd Holloway Lecture for B551 Nov. 27, 2007 Introduction to Bayesian Classification (A Practical Discussion) Todd Holloway Lecture for B551 Nov. 27, 2007 Naïve Bayes Components ML vs. MAP Benefits Feature Preparation Filtering Decay Extended Examples

More information

Anti Spamming Techniques

Anti Spamming Techniques Anti Spamming Techniques Written by Sumit Siddharth In this article will we first look at some of the existing methods to identify an email as a spam? We look at the pros and cons of the existing methods

More information

Adaption of Statistical Email Filtering Techniques

Adaption of Statistical Email Filtering Techniques Adaption of Statistical Email Filtering Techniques David Kohlbrenner IT.com Thomas Jefferson High School for Science and Technology January 25, 2007 Abstract With the rise of the levels of spam, new techniques

More information

CompSci 125 Lecture 08. Chapter 5: Conditional Statements Chapter 4: return Statement

CompSci 125 Lecture 08. Chapter 5: Conditional Statements Chapter 4: return Statement CompSci 125 Lecture 08 Chapter 5: Conditional Statements Chapter 4: return Statement Homework Update HW3 Due 9/20 HW4 Due 9/27 Exam-1 10/2 Programming Assignment Update p1: Traffic Applet due Sept 21 (Submit

More information

4.2 Sorting and Searching

4.2 Sorting and Searching Sequential Search: Java Implementation 4.2 Sorting and Searching Scan through array, looking for key. search hit: return array index search miss: return -1 public static int search(string key, String[]

More information

Apache Lucene. Searching the Web and Everything Else. Daniel Naber Mindquarry GmbH ID 380

Apache Lucene. Searching the Web and Everything Else. Daniel Naber Mindquarry GmbH ID 380 Apache Lucene Searching the Web and Everything Else Daniel Naber Mindquarry GmbH ID 380 AGENDA 2 > What's a search engine > Lucene Java Features Code example > Solr Features Integration > Nutch Features

More information

Email Spam Detection A Machine Learning Approach

Email Spam Detection A Machine Learning Approach Email Spam Detection A Machine Learning Approach Ge Song, Lauren Steimle ABSTRACT Machine learning is a branch of artificial intelligence concerned with the creation and study of systems that can learn

More information

Building a Multi-Threaded Web Server

Building a Multi-Threaded Web Server Building a Multi-Threaded Web Server In this lab we will develop a Web server in two steps. In the end, you will have built a multi-threaded Web server that is capable of processing multiple simultaneous

More information

6. Cholesky factorization

6. Cholesky factorization 6. Cholesky factorization EE103 (Fall 2011-12) triangular matrices forward and backward substitution the Cholesky factorization solving Ax = b with A positive definite inverse of a positive definite matrix

More information

CS 2112 Spring 2014. 0 Instructions. Assignment 3 Data Structures and Web Filtering. 0.1 Grading. 0.2 Partners. 0.3 Restrictions

CS 2112 Spring 2014. 0 Instructions. Assignment 3 Data Structures and Web Filtering. 0.1 Grading. 0.2 Partners. 0.3 Restrictions CS 2112 Spring 2014 Assignment 3 Data Structures and Web Filtering Due: March 4, 2014 11:59 PM Implementing spam blacklists and web filters requires matching candidate domain names and URLs very rapidly

More information

OPINION MINING IN PRODUCT REVIEW SYSTEM USING BIG DATA TECHNOLOGY HADOOP

OPINION MINING IN PRODUCT REVIEW SYSTEM USING BIG DATA TECHNOLOGY HADOOP OPINION MINING IN PRODUCT REVIEW SYSTEM USING BIG DATA TECHNOLOGY HADOOP 1 KALYANKUMAR B WADDAR, 2 K SRINIVASA 1 P G Student, S.I.T Tumkur, 2 Assistant Professor S.I.T Tumkur Abstract- Product Review System

More information

Lecture 2 Matrix Operations

Lecture 2 Matrix Operations Lecture 2 Matrix Operations transpose, sum & difference, scalar multiplication matrix multiplication, matrix-vector product matrix inverse 2 1 Matrix transpose transpose of m n matrix A, denoted A T or

More information

Email Spam Detection Using Customized SimHash Function

Email Spam Detection Using Customized SimHash Function International Journal of Research Studies in Computer Science and Engineering (IJRSCSE) Volume 1, Issue 8, December 2014, PP 35-40 ISSN 2349-4840 (Print) & ISSN 2349-4859 (Online) www.arcjournals.org Email

More information

A Two-Pass Statistical Approach for Automatic Personalized Spam Filtering

A Two-Pass Statistical Approach for Automatic Personalized Spam Filtering A Two-Pass Statistical Approach for Automatic Personalized Spam Filtering Khurum Nazir Junejo, Mirza Muhammad Yousaf, and Asim Karim Dept. of Computer Science, Lahore University of Management Sciences

More information

Using Files as Input/Output in Java 5.0 Applications

Using Files as Input/Output in Java 5.0 Applications Using Files as Input/Output in Java 5.0 Applications The goal of this module is to present enough information about files to allow you to write applications in Java that fetch their input from a file instead

More information

Webapps Vulnerability Report

Webapps Vulnerability Report Tuesday, May 1, 2012 Webapps Vulnerability Report Introduction This report provides detailed information of every vulnerability that was found and successfully exploited by CORE Impact Professional during

More information

Question 2: How do you solve a matrix equation using the matrix inverse?

Question 2: How do you solve a matrix equation using the matrix inverse? Question : How do you solve a matrix equation using the matrix inverse? In the previous question, we wrote systems of equations as a matrix equation AX B. In this format, the matrix A contains the coefficients

More information

Sample CSE8A midterm Multiple Choice (circle one)

Sample CSE8A midterm Multiple Choice (circle one) Sample midterm Multiple Choice (circle one) (2 pts) Evaluate the following Boolean expressions and indicate whether short-circuiting happened during evaluation: Assume variables with the following names

More information

CS 141: Introduction to (Java) Programming: Exam 1 Jenny Orr Willamette University Fall 2013

CS 141: Introduction to (Java) Programming: Exam 1 Jenny Orr Willamette University Fall 2013 Oct 4, 2013, p 1 Name: CS 141: Introduction to (Java) Programming: Exam 1 Jenny Orr Willamette University Fall 2013 1. (max 18) 4. (max 16) 2. (max 12) 5. (max 12) 3. (max 24) 6. (max 18) Total: (max 100)

More information

Chapter 8: Bags and Sets

Chapter 8: Bags and Sets Chapter 8: Bags and Sets In the stack and the queue abstractions, the order that elements are placed into the container is important, because the order elements are removed is related to the order in which

More information

Bayesian Filtering. Scoring

Bayesian Filtering. Scoring 9 Bayesian Filtering The Bayesian filter in SpamAssassin is one of the most effective techniques for filtering spam. Although Bayesian statistical analysis is a branch of mathematics, one doesn't necessarily

More information

Large-scale Data Mining: MapReduce and Beyond Part 2: Algorithms. Spiros Papadimitriou, IBM Research Jimeng Sun, IBM Research Rong Yan, Facebook

Large-scale Data Mining: MapReduce and Beyond Part 2: Algorithms. Spiros Papadimitriou, IBM Research Jimeng Sun, IBM Research Rong Yan, Facebook Large-scale Data Mining: MapReduce and Beyond Part 2: Algorithms Spiros Papadimitriou, IBM Research Jimeng Sun, IBM Research Rong Yan, Facebook Part 2:Mining using MapReduce Mining algorithms using MapReduce

More information

Part 1: Link Analysis & Page Rank

Part 1: Link Analysis & Page Rank Chapter 8: Graph Data Part 1: Link Analysis & Page Rank Based on Leskovec, Rajaraman, Ullman 214: Mining of Massive Datasets 1 Exam on the 5th of February, 216, 14. to 16. If you wish to attend, please

More information

Securepoint Security Systems

Securepoint Security Systems HowTo: Configuration of the spam filter Securepoint Security Systems Version 2007nx Release 3 Contents 1 Configuration of the spam filter with the Securepoint Security Manager... 3 2 Spam filter configuration

More information

Qlik REST Connector Installation and User Guide

Qlik REST Connector Installation and User Guide Qlik REST Connector Installation and User Guide Qlik REST Connector Version 1.0 Newton, Massachusetts, November 2015 Authored by QlikTech International AB Copyright QlikTech International AB 2015, All

More information

Simple Language Models for Spam Detection

Simple Language Models for Spam Detection Simple Language Models for Spam Detection Egidio Terra Faculty of Informatics PUC/RS - Brazil Abstract For this year s Spam track we used classifiers based on language models. These models are used to

More information

Anti-Spam Filter Based on Naïve Bayes, SVM, and KNN model

Anti-Spam Filter Based on Naïve Bayes, SVM, and KNN model AI TERM PROJECT GROUP 14 1 Anti-Spam Filter Based on,, and model Yun-Nung Chen, Che-An Lu, Chao-Yu Huang Abstract spam email filters are a well-known and powerful type of filters. We construct different

More information

Big Data & Scripting Part II Streaming Algorithms

Big Data & Scripting Part II Streaming Algorithms Big Data & Scripting Part II Streaming Algorithms 1, Counting Distinct Elements 2, 3, counting distinct elements problem formalization input: stream of elements o from some universe U e.g. ids from a set

More information

Machine Learning Final Project Spam Email Filtering

Machine Learning Final Project Spam Email Filtering Machine Learning Final Project Spam Email Filtering March 2013 Shahar Yifrah Guy Lev Table of Content 1. OVERVIEW... 3 2. DATASET... 3 2.1 SOURCE... 3 2.2 CREATION OF TRAINING AND TEST SETS... 4 2.3 FEATURE

More information

J a v a Quiz (Unit 3, Test 0 Practice)

J a v a Quiz (Unit 3, Test 0 Practice) Computer Science S-111a: Intensive Introduction to Computer Science Using Java Handout #11 Your Name Teaching Fellow J a v a Quiz (Unit 3, Test 0 Practice) Multiple-choice questions are worth 2 points

More information

Bayesian Spam Filtering

Bayesian Spam Filtering Bayesian Spam Filtering Ahmed Obied Department of Computer Science University of Calgary [email protected] http://www.cpsc.ucalgary.ca/~amaobied Abstract. With the enormous amount of spam messages propagating

More information

Smart E-Marketer s Guide

Smart E-Marketer s Guide 30 insider tips to maximise your email deliverability rate 30 insider tips Step 1. Ensure the domain you use for sending emails is configured to enable authentication (SPF / Sender ID/ DomainKeys). Step

More information

Spam Filtering based on Naive Bayes Classification. Tianhao Sun

Spam Filtering based on Naive Bayes Classification. Tianhao Sun Spam Filtering based on Naive Bayes Classification Tianhao Sun May 1, 2009 Abstract This project discusses about the popular statistical spam filtering process: naive Bayes classification. A fairly famous

More information

Moving from CS 61A Scheme to CS 61B Java

Moving from CS 61A Scheme to CS 61B Java Moving from CS 61A Scheme to CS 61B Java Introduction Java is an object-oriented language. This document describes some of the differences between object-oriented programming in Scheme (which we hope you

More information

Master of Sciences in Informatics Engineering Programming Paradigms 2005/2006. Final Examination. January 24 th, 2006

Master of Sciences in Informatics Engineering Programming Paradigms 2005/2006. Final Examination. January 24 th, 2006 Master of Sciences in Informatics Engineering Programming Paradigms 2005/2006 Final Examination January 24 th, 2006 NAME: Please read all instructions carefully before start answering. The exam will be

More information

Handout 1. Introduction to Java programming language. Java primitive types and operations. Reading keyboard Input using class Scanner.

Handout 1. Introduction to Java programming language. Java primitive types and operations. Reading keyboard Input using class Scanner. Handout 1 CS603 Object-Oriented Programming Fall 15 Page 1 of 11 Handout 1 Introduction to Java programming language. Java primitive types and operations. Reading keyboard Input using class Scanner. Java

More information

First Java Programs. V. Paúl Pauca. CSC 111D Fall, 2015. Department of Computer Science Wake Forest University. Introduction to Computer Science

First Java Programs. V. Paúl Pauca. CSC 111D Fall, 2015. Department of Computer Science Wake Forest University. Introduction to Computer Science First Java Programs V. Paúl Pauca Department of Computer Science Wake Forest University CSC 111D Fall, 2015 Hello World revisited / 8/23/15 The f i r s t o b l i g a t o r y Java program @author Paul Pauca

More information

(Eng. Hayam Reda Seireg) Sheet Java

(Eng. Hayam Reda Seireg) Sheet Java (Eng. Hayam Reda Seireg) Sheet Java 1. Write a program to compute the area and circumference of a rectangle 3 inche wide by 5 inches long. What changes must be made to the program so it works for a rectangle

More information

Translating to Java. Translation. Input. Many Level Translations. read, get, input, ask, request. Requirements Design Algorithm Java Machine Language

Translating to Java. Translation. Input. Many Level Translations. read, get, input, ask, request. Requirements Design Algorithm Java Machine Language Translation Translating to Java Introduction to Computer Programming The job of a programmer is to translate a problem description into a computer language. You need to be able to convert a problem description

More information

Visual Basic Programming. An Introduction

Visual Basic Programming. An Introduction Visual Basic Programming An Introduction Why Visual Basic? Programming for the Windows User Interface is extremely complicated. Other Graphical User Interfaces (GUI) are no better. Visual Basic provides

More information

System.out.println("\nEnter Product Number 1-5 (0 to stop and view summary) :

System.out.println(\nEnter Product Number 1-5 (0 to stop and view summary) : Benjamin Michael Java Homework 3 10/31/2012 1) Sales.java Code // Sales.java // Program calculates sales, based on an input of product // number and quantity sold import java.util.scanner; public class

More information

1 Determinants and the Solvability of Linear Systems

1 Determinants and the Solvability of Linear Systems 1 Determinants and the Solvability of Linear Systems In the last section we learned how to use Gaussian elimination to solve linear systems of n equations in n unknowns The section completely side-stepped

More information

Common Patterns and Pitfalls for Implementing Algorithms in Spark. Hossein Falaki @mhfalaki [email protected]

Common Patterns and Pitfalls for Implementing Algorithms in Spark. Hossein Falaki @mhfalaki hossein@databricks.com Common Patterns and Pitfalls for Implementing Algorithms in Spark Hossein Falaki @mhfalaki [email protected] Challenges of numerical computation over big data When applying any algorithm to big data

More information

CMSC 202H. ArrayList, Multidimensional Arrays

CMSC 202H. ArrayList, Multidimensional Arrays CMSC 202H ArrayList, Multidimensional Arrays What s an Array List ArrayList is a class in the standard Java libraries that can hold any type of object an object that can grow and shrink while your program

More information

12-6 Write a recursive definition of a valid Java identifier (see chapter 2).

12-6 Write a recursive definition of a valid Java identifier (see chapter 2). CHAPTER 12 Recursion Recursion is a powerful programming technique that is often difficult for students to understand. The challenge is explaining recursion in a way that is already natural to the student.

More information

DNS LOOKUP SYSTEM DATA STRUCTURES AND ALGORITHMS PROJECT REPORT

DNS LOOKUP SYSTEM DATA STRUCTURES AND ALGORITHMS PROJECT REPORT DNS LOOKUP SYSTEM DATA STRUCTURES AND ALGORITHMS PROJECT REPORT By GROUP Avadhut Gurjar Mohsin Patel Shraddha Pandhe Page 1 Contents 1. Introduction... 3 2. DNS Recursive Query Mechanism:...5 2.1. Client

More information

The world s largest matrix computation. (This chapter is out of date and needs a major overhaul.)

The world s largest matrix computation. (This chapter is out of date and needs a major overhaul.) Chapter 7 Google PageRank The world s largest matrix computation. (This chapter is out of date and needs a major overhaul.) One of the reasons why Google TM is such an effective search engine is the PageRank

More information

Savita Teli 1, Santoshkumar Biradar 2

Savita Teli 1, Santoshkumar Biradar 2 Effective Spam Detection Method for Email Savita Teli 1, Santoshkumar Biradar 2 1 (Student, Dept of Computer Engg, Dr. D. Y. Patil College of Engg, Ambi, University of Pune, M.S, India) 2 (Asst. Proff,

More information

General Framework for an Iterative Solution of Ax b. Jacobi s Method

General Framework for an Iterative Solution of Ax b. Jacobi s Method 2.6 Iterative Solutions of Linear Systems 143 2.6 Iterative Solutions of Linear Systems Consistent linear systems in real life are solved in one of two ways: by direct calculation (using a matrix factorization,

More information

Glossary of Object Oriented Terms

Glossary of Object Oriented Terms Appendix E Glossary of Object Oriented Terms abstract class: A class primarily intended to define an instance, but can not be instantiated without additional methods. abstract data type: An abstraction

More information

Lecture 9. Semantic Analysis Scoping and Symbol Table

Lecture 9. Semantic Analysis Scoping and Symbol Table Lecture 9. Semantic Analysis Scoping and Symbol Table Wei Le 2015.10 Outline Semantic analysis Scoping The Role of Symbol Table Implementing a Symbol Table Semantic Analysis Parser builds abstract syntax

More information

This material is built based on, Patterns covered in this class FILTERING PATTERNS. Filtering pattern

This material is built based on, Patterns covered in this class FILTERING PATTERNS. Filtering pattern 2/23/15 CS480 A2 Introduction to Big Data - Spring 2015 1 2/23/15 CS480 A2 Introduction to Big Data - Spring 2015 2 PART 0. INTRODUCTION TO BIG DATA PART 1. MAPREDUCE AND THE NEW SOFTWARE STACK 1. DISTRIBUTED

More information

Spam Filtering Based on Latent Semantic Indexing

Spam Filtering Based on Latent Semantic Indexing Spam Filtering Based on Latent Semantic Indexing Wilfried N. Gansterer Andreas G. K. Janecek Robert Neumayer Abstract In this paper, a study on the classification performance of a vector space model (VSM)

More information

dedupe Documentation Release 1.0.0 Forest Gregg, Derek Eder, and contributors

dedupe Documentation Release 1.0.0 Forest Gregg, Derek Eder, and contributors dedupe Documentation Release 1.0.0 Forest Gregg, Derek Eder, and contributors January 04, 2016 Contents 1 Important links 3 2 Contents 5 2.1 API Documentation...........................................

More information

How To Filter Spam Image From A Picture By Color Or Color

How To Filter Spam Image From A Picture By Color Or Color Image Content-Based Email Spam Image Filtering Jianyi Wang and Kazuki Katagishi Abstract With the population of Internet around the world, email has become one of the main methods of communication among

More information

International Journal of Research in Advent Technology Available Online at: http://www.ijrat.org

International Journal of Research in Advent Technology Available Online at: http://www.ijrat.org IMPROVING PEFORMANCE OF BAYESIAN SPAM FILTER Firozbhai Ahamadbhai Sherasiya 1, Prof. Upen Nathwani 2 1 2 Computer Engineering Department 1 2 Noble Group of Institutions 1 [email protected] ABSTARCT:

More information

Link Analysis. Chapter 5. 5.1 PageRank

Link Analysis. Chapter 5. 5.1 PageRank Chapter 5 Link Analysis One of the biggest changes in our lives in the decade following the turn of the century was the availability of efficient and accurate Web search, through search engines such as

More information

Introduction to Object-Oriented Programming

Introduction to Object-Oriented Programming Introduction to Object-Oriented Programming Programs and Methods Christopher Simpkins [email protected] CS 1331 (Georgia Tech) Programs and Methods 1 / 8 The Anatomy of a Java Program It is customary

More information

MIDTERM 1 REVIEW WRITING CODE POSSIBLE SOLUTION

MIDTERM 1 REVIEW WRITING CODE POSSIBLE SOLUTION MIDTERM 1 REVIEW WRITING CODE POSSIBLE SOLUTION 1. Write a loop that computes (No need to write a complete program) 100 1 99 2 98 3 97... 4 3 98 2 99 1 100 Note: this is not the only solution; double sum

More information

G563 Quantitative Paleontology. SQL databases. An introduction. Department of Geological Sciences Indiana University. (c) 2012, P.

G563 Quantitative Paleontology. SQL databases. An introduction. Department of Geological Sciences Indiana University. (c) 2012, P. SQL databases An introduction AMP: Apache, mysql, PHP This installations installs the Apache webserver, the PHP scripting language, and the mysql database on your computer: Apache: runs in the background

More information

Evaluation. Copy. Evaluation Copy. Chapter 7: Using JDBC with Spring. 1) A Simpler Approach... 7-2. 2) The JdbcTemplate. Class...

Evaluation. Copy. Evaluation Copy. Chapter 7: Using JDBC with Spring. 1) A Simpler Approach... 7-2. 2) The JdbcTemplate. Class... Chapter 7: Using JDBC with Spring 1) A Simpler Approach... 7-2 2) The JdbcTemplate Class... 7-3 3) Exception Translation... 7-7 4) Updating with the JdbcTemplate... 7-9 5) Queries Using the JdbcTemplate...

More information

Building Java Programs

Building Java Programs Building Java Programs Chapter 3 Lecture 3-3: Interactive Programs w/ Scanner reading: 3.3-3.4 self-check: #16-19 exercises: #11 videos: Ch. 3 #4 Interactive programs We have written programs that print

More information

Basics of Java Programming Input and the Scanner class

Basics of Java Programming Input and the Scanner class Basics of Java Programming Input and the Scanner class CSC 1051 Algorithms and Data Structures I Dr. Mary-Angela Papalaskari Department of Computing Sciences Villanova University Course website: www.csc.villanova.edu/~map/1051/

More information

Search Engines. Stephen Shaw <[email protected]> 18th of February, 2014. Netsoc

Search Engines. Stephen Shaw <stesh@netsoc.tcd.ie> 18th of February, 2014. Netsoc Search Engines Stephen Shaw Netsoc 18th of February, 2014 Me M.Sc. Artificial Intelligence, University of Edinburgh Would recommend B.A. (Mod.) Computer Science, Linguistics, French,

More information

Unit 6. Loop statements

Unit 6. Loop statements Unit 6 Loop statements Summary Repetition of statements The while statement Input loop Loop schemes The for statement The do statement Nested loops Flow control statements 6.1 Statements in Java Till now

More information

CS170 Lab 11 Abstract Data Types & Objects

CS170 Lab 11 Abstract Data Types & Objects CS170 Lab 11 Abstract Data Types & Objects Introduction: Abstract Data Type (ADT) An abstract data type is commonly known as a class of objects An abstract data type in a program is used to represent (the

More information

Immunity from spam: an analysis of an artificial immune system for junk email detection

Immunity from spam: an analysis of an artificial immune system for junk email detection Immunity from spam: an analysis of an artificial immune system for junk email detection Terri Oda and Tony White Carleton University, Ottawa ON, Canada [email protected], [email protected] Abstract.

More information

Machine Learning in Spam Filtering

Machine Learning in Spam Filtering Machine Learning in Spam Filtering A Crash Course in ML Konstantin Tretyakov [email protected] Institute of Computer Science, University of Tartu Overview Spam is Evil ML for Spam Filtering: General Idea, Problems.

More information

Web Forensic Evidence of SQL Injection Analysis

Web Forensic Evidence of SQL Injection Analysis International Journal of Science and Engineering Vol.5 No.1(2015):157-162 157 Web Forensic Evidence of SQL Injection Analysis 針 對 SQL Injection 攻 擊 鑑 識 之 分 析 Chinyang Henry Tseng 1 National Taipei University

More information

Install Java Development Kit (JDK) 1.8 http://www.oracle.com/technetwork/java/javase/downloads/index.html

Install Java Development Kit (JDK) 1.8 http://www.oracle.com/technetwork/java/javase/downloads/index.html CS 259: Data Structures with Java Hello World with the IntelliJ IDE Instructor: Joel Castellanos e-mail: joel.unm.edu Web: http://cs.unm.edu/~joel/ Office: Farris Engineering Center 319 8/19/2015 Install

More information

The following program is aiming to extract from a simple text file an analysis of the content such as:

The following program is aiming to extract from a simple text file an analysis of the content such as: Text Analyser Aim The following program is aiming to extract from a simple text file an analysis of the content such as: Number of printable characters Number of white spaces Number of vowels Number of

More information

Hadoop WordCount Explained! IT332 Distributed Systems

Hadoop WordCount Explained! IT332 Distributed Systems Hadoop WordCount Explained! IT332 Distributed Systems Typical problem solved by MapReduce Read a lot of data Map: extract something you care about from each record Shuffle and Sort Reduce: aggregate, summarize,

More information

LAB4 Making Classes and Objects

LAB4 Making Classes and Objects LAB4 Making Classes and Objects Objective The main objective of this lab is class creation, how its constructer creation, object creation and instantiation of objects. We will use the definition pane to

More information

CS 348: Introduction to Artificial Intelligence Lab 2: Spam Filtering

CS 348: Introduction to Artificial Intelligence Lab 2: Spam Filtering THE PROBLEM Spam is e-mail that is both unsolicited by the recipient and sent in substantively identical form to many recipients. In 2004, MSNBC reported that spam accounted for 66% of all electronic mail.

More information

LINKED DATA STRUCTURES

LINKED DATA STRUCTURES LINKED DATA STRUCTURES 1 Linked Lists A linked list is a structure in which objects refer to the same kind of object, and where: the objects, called nodes, are linked in a linear sequence. we keep a reference

More information

Using an Access Database

Using an Access Database A Few Terms Using an Access Database These words are used often in Access so you will want to become familiar with them before using the program and this tutorial. A database is a collection of related

More information

AP Computer Science Java Subset

AP Computer Science Java Subset APPENDIX A AP Computer Science Java Subset The AP Java subset is intended to outline the features of Java that may appear on the AP Computer Science A Exam. The AP Java subset is NOT intended as an overall

More information

How does the Excalibur Technology SPAM & Virus Protection System work?

How does the Excalibur Technology SPAM & Virus Protection System work? How does the Excalibur Technology SPAM & Virus Protection System work? All e-mail messages sent to your e-mail address are analyzed by the Excalibur Technology SPAM & Virus Protection System before being

More information

1. Classification problems

1. Classification problems Neural and Evolutionary Computing. Lab 1: Classification problems Machine Learning test data repository Weka data mining platform Introduction Scilab 1. Classification problems The main aim of a classification

More information

Tightening the Net: A Review of Current and Next Generation Spam Filtering Tools

Tightening the Net: A Review of Current and Next Generation Spam Filtering Tools Tightening the Net: A Review of Current and Next Generation Spam Filtering Tools Spam Track Wednesday 1 March, 2006 APRICOT Perth, Australia James Carpinter & Ray Hunt Dept. of Computer Science and Software

More information

Using ilove SharePoint Web Services Workflow Action

Using ilove SharePoint Web Services Workflow Action Using ilove SharePoint Web Services Workflow Action This guide describes the steps to create a workflow that will add some information to Contacts in CRM. As an example, we will use demonstration site

More information

Spam filtering. Peter Likarish Based on slides by EJ Jung 11/03/10

Spam filtering. Peter Likarish Based on slides by EJ Jung 11/03/10 Spam filtering Peter Likarish Based on slides by EJ Jung 11/03/10 What is spam? An unsolicited email equivalent to Direct Mail in postal service UCE (unsolicited commercial email) UBE (unsolicited bulk

More information

Introduction to Java

Introduction to Java Introduction to Java The HelloWorld program Primitive data types Assignment and arithmetic operations User input Conditional statements Looping Arrays CSA0011 Matthew Xuereb 2008 1 Java Overview A high

More information

CSE 1223: Introduction to Computer Programming in Java Chapter 2 Java Fundamentals

CSE 1223: Introduction to Computer Programming in Java Chapter 2 Java Fundamentals CSE 1223: Introduction to Computer Programming in Java Chapter 2 Java Fundamentals 1 Recall From Last Time: Java Program import java.util.scanner; public class EggBasket { public static void main(string[]

More information