Spot me if you can: Uncovering spoken phrases in encrypted VoIP conversations
C. Wright, L. Ballard, S. Coull, F. Monrose, G. Masson
Talk held by Goran Doychev
Selected Topics in Information Security and Cryptography Seminar
1 / 30
Overview
1. How does VoIP work?
2. Recognizing previously seen phrases
3. Recognizing phrases without example utterances
4. Evaluation
2 / 30
How does VoIP work?
Control channel: SIP, XMPP, Skype (negotiates IP ports, supported codecs, etc.)
Voice data: RTP over UDP
Speech codec: GSM, G.728, iSAC, Speex
4 / 30
Operation of a Codec
The audio stream is sampled at 8000 or 16000 samples per second (Hz).
The n most recent samples are compressed into one packet (usually 20 ms of audio).
Example:
16 kHz audio source: n = 320 samples per packet
8 kHz audio source: n = 160 samples per packet
5 / 30
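The sample-rate arithmetic above can be sketched directly (a minimal illustration; the function name is mine, not from any codec API):

```python
def samples_per_packet(sample_rate_hz: int, frame_ms: int = 20) -> int:
    # n = samples per second * packet duration in seconds
    return sample_rate_hz * frame_ms // 1000

# The two cases from the slide:
print(samples_per_packet(16000))  # -> 320
print(samples_per_packet(8000))   # -> 160
```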
Operation of a Codec (2)
Brute-force search over the entries in a codebook of audio vectors: find the one that most closely reproduces the audio packet.
Example: the digital representation 01001110 of an audio packet is looked up in the codebook

In       | Out
01001010 | 0110
01001110 | 0111
01011001 | 1000
01011010 | 1001
01011110 | 1010

and the matching index 0111 is output.
6 / 30
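The brute-force codebook search can be sketched as follows (a toy illustration with made-up vectors; real codecs search trained codebooks of audio vectors):

```python
def quantize(frame, codebook):
    """Return the index of the codebook entry that most closely
    reproduces the audio frame (minimum squared error)."""
    def dist(entry):
        return sum((a - b) ** 2 for a, b in zip(frame, entry))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))

# Toy codebook of 3-sample audio vectors; only the index is transmitted.
codebook = [
    [0.0, 0.0, 0.0],
    [0.5, 0.5, 0.5],
    [1.0, 1.0, 1.0],
]
print(quantize([0.4, 0.6, 0.5], codebook))  # -> 1
```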
Operation of a Codec (3)
Quality of sound depends on the number of entries in the codebook.
Classification of coders according to bit-rate:

Category          | Bit-rate range
High bit-rate     | > 15 kbps
Medium bit-rate   | 5 to 15 kbps
Low bit-rate      | 2 to 5 kbps
Very low bit-rate | < 2 kbps
7 / 30
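The table maps directly onto a small function (a sketch; the name and the handling of the exact 5 and 15 kbps boundaries are my own reading of the table):

```python
def coder_category(bit_rate_kbps: float) -> str:
    """Classify a speech coder by its bit rate, following the table above."""
    if bit_rate_kbps > 15:
        return "high"
    if bit_rate_kbps >= 5:
        return "medium"
    if bit_rate_kbps >= 2:
        return "low"
    return "very low"

print(coder_category(8))  # -> medium
```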
Variable Bit Rate
Variable bit rate (VBR): the codec adaptively chooses a bit rate for each packet, balancing audio quality against bandwidth.
In a two-way conversation, each speaker is silent about 63% of the time.
8 / 30
Variable Bit Rate (2)
LEAKAGE: the chosen bit rate depends on the encoded data.
E.g., Speex encodes vowel sounds (aa, aw) at a higher bit rate than fricative sounds (f, s).
9 / 30
Problem Description
Given:
- utterances of n phrases (phrase 1, phrase 2, phrase 3, ...)
- the packet sizes of one of the phrases, e.g. (5k, 7k, 3k, 8k, 12k, 2k, 1k)
Goal: recognize which phrase produced the observed packet sizes, e.g. (5k, 7k, 3k, 8k, 12k, 2k, 1k) -> "the phrase"
11 / 30
Profile Hidden Markov Model (HMM)
Match states: expected distribution of packet sizes at each position in the sequence.
Insert states: emit packets according to some (uniform) distribution; allow insertion of additional packets.
Delete states: silent states; allow omitting packets.
12 / 30
Building a Profile HMM
Initially:
- set the Match state emission probabilities to the uniform distribution
- set the transition probabilities so that Match is the most likely transition
Train the HMM using example utterances:
- Apply the Baum-Welch algorithm: iteratively improves the probability of the training sequences.
- Baum-Welch only finds a locally optimal set of parameters, so apply simulated annealing.
- Apply Viterbi training to further refine the parameters.
13 / 30
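The initial model can be sketched as a chain of Match/Insert/Delete triples with uniform Match emissions (a simplified skeleton; the state and field names are mine, and real training would then re-estimate these values with Baum-Welch):

```python
def make_profile_hmm(length, packet_sizes):
    """One Match/Insert/Delete triple per position, Match emissions uniform."""
    uniform = {s: 1.0 / len(packet_sizes) for s in packet_sizes}
    return [
        {
            "match":  {"emit": dict(uniform)},  # per-position size distribution
            "insert": {"emit": dict(uniform)},  # extra packets, uniform emission
            "delete": {"emit": None},           # silent state: skips a position
        }
        for _ in range(length)
    ]

hmm = make_profile_hmm(3, packet_sizes=[3, 5, 7, 8])
print(len(hmm))  # -> 3 (one state triple per position in the phrase)
```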
Searching for a Phrase
Changes to the model:
Random state: emits packets according to a uniform distribution; matches packets that are not part of the phrase of interest.
Profile Start/End states: match the start/end of the phrase; from Profile Start, the transition to the first Match state is the most likely.
15 / 30
Searching for a Phrase (2)
Apply the Viterbi algorithm: find the most likely sequence of states that explains the observed packet sizes.
A "hit": a subsequence of states that belongs to the profile part of the model.
Evaluate a hit's goodness: for the packet lengths l_i, ..., l_j of the phrase of interest,

  score_{i,j} = log( Pr[l_i, ..., l_j | Profile] / Pr[l_i, ..., l_j | Random] )

Discard hits whose score falls below a threshold.
16 / 30
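The log-odds score can be sketched as follows (a toy version that scores Match emissions only; a real profile HMM hit also includes transition, Insert, and Delete probabilities, and all names and numbers here are illustrative):

```python
import math

def log_odds_score(lengths, match_probs, random_prob):
    """score = log( Pr[lengths | Profile] / Pr[lengths | Random] )."""
    profile = sum(math.log(match_probs[k][l]) for k, l in enumerate(lengths))
    random_ = len(lengths) * math.log(random_prob)
    return profile - random_

# Per-position packet-size distributions for a 3-packet phrase:
match_probs = [{5: 0.8, 7: 0.2}, {7: 0.9, 5: 0.1}, {3: 0.7, 8: 0.3}]

good = log_odds_score([5, 7, 3], match_probs, random_prob=0.25)
bad = log_odds_score([7, 5, 8], match_probs, random_prob=0.25)
print(good > 0, bad < 0)  # -> True True (keep the first hit, discard the second)
```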
Phrase Models from Phonemes
Phonemes: sounds like b, ch, t, s, aa, aw (English has 40 to 60 phonemes).
Idea: words are built up from concatenated phonemes, so model phonemes instead.
Advantages:
- Flexibility
- Cheaper: no example utterances of the target phrase are needed
18 / 30
Problem Description
Given:
- recordings of all phonemes: aa, ae, ah, ao, aw, ay, b, ch, d, dh, eh, er, ey, f, g, hh, etc.
- the packet sizes of a phrase, e.g. (5k, 7k, 3k, 8k, 12k, 2k, 1k)
Goal: recognize the phrase, e.g. (5k, 7k, 3k, 8k, 12k, 2k, 1k) -> "the phrase"
19 / 30
Phrase Models from Phonemes (2)
Straightforward method:
1. build HMMs for phonemes
2. concatenate them to build word HMMs
3. concatenate word HMMs into a phrase HMM

But pronunciation varies between dialects, e.g. for "the phrase":
American English: (5k,7k,1k,8k,12k,2k,1k) -> (dh,ah),(f,r,ey,z) -> ("the"),("phrase")
Scottish English: (5k,7k,1k,8k,10k,2k,1k) -> (dh,ah),(f,r,eh,z) -> ("the"),("frese"?)
20 / 30
Problem Description
Given:
- recordings of all phonemes: aa, ae, ah, ao, aw, ay, b, ch, d, dh, eh, er, ey, f, g, hh, etc.
- the packet sizes of a phrase, e.g. (5k, 7k, 3k, 8k, 12k, 2k, 1k)
- a phonetic pronunciation dictionary
Goal: recognize the phrase, e.g. (5k, 7k, 3k, 8k, 12k, 2k, 1k) -> "the phrase"
21 / 30
Phrase Models from Phonemes (3)
Advanced method:
- build an initial profile HMM for the phrase (as before)
- train it using a synthetic training set
- search for the phrase (as before)

Synthetic training set:
phrase: "the phrase"
split into words: "the", "phrase"
create list of phonemes: dh ah f r ey z
replace with packet sizes: 9k 20k 5k 8k 14k 3k

Improved model: use diphones and triphones instead of whole words.
22 / 30
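The synthetic-training-set construction can be sketched like this (the dictionary entries and packet sizes are made up for illustration; the real attack uses a phonetic pronunciation dictionary and packet sizes observed from phoneme recordings):

```python
# Hypothetical pronunciation dictionary and phoneme -> packet-size table.
PRON_DICT = {"the": ["dh", "ah"], "phrase": ["f", "r", "ey", "z"]}
PHONEME_SIZES = {"dh": 9, "ah": 20, "f": 5, "r": 8, "ey": 14, "z": 3}

def synthetic_utterance(phrase):
    """phrase -> words -> phonemes -> expected packet sizes."""
    return [
        PHONEME_SIZES[phoneme]
        for word in phrase.lower().split()
        for phoneme in PRON_DICT[word]
    ]

print(synthetic_utterance("the phrase"))  # -> [9, 20, 5, 8, 14, 3]
```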
Experimental Setup
Use the TIMIT continuous speech corpus; concatenate sentences into conversations.
Training of the HMMs with:
- the TIMIT pronunciation dictionary ("proper" American English)
- the PRONLEX pronunciation dictionary (more colloquial English)
24 / 30
Evaluation Metrics
Recall: probability that the algorithm finds the phrase.
Precision: probability that a reported match is correct.
25 / 30
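Concretely, both metrics reduce to ratios of match counts (the counts below are illustrative, chosen so the numbers resemble the overall results reported in the talk):

```python
def precision_recall(tp, fp, fn):
    """tp: correct matches reported; fp: wrong matches reported;
    fn: phrases present but not found."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall

# Illustrative counts: 100 phrases present, 102 matches reported, 51 correct.
p, r = precision_recall(tp=51, fp=51, fn=49)
print(round(p, 2), round(r, 2))  # -> 0.5 0.51
```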
Results of the Experiment
Overall: recall 51%, precision 50%.
Some phrases were found with high accuracy:
"Young children should avoid exposure to contagious diseases." (recall = 0.99, precision = 1)
High deviation of results across individual speakers.
26 / 30
Robustness to Noise
Using pink noise:
- energy logarithmically distributed across the range of human hearing
- harder for noise-removal algorithms to filter out

sound | noise | recall | precision
100%  |   -   |  .51   |   .50
 90%  |  10%  |  .39   |   .40
 75%  |  25%  |  .23   |   .22

Even in the presence of noise, an attacker can identify an alarming number of the phrases.
27 / 30
Mitigation Techniques
Padding packets to a coarser granularity:

granularity           | recall | precision | overhead
multiples of 128 bits |  0.15  |   0.16    |  8.81%
multiples of 256 bits |  0.04  |   0.04    |  16.5%

These tests used continuous speech; in practice conversations are about 63% idle time, so the relative overhead is even greater.
28 / 30
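Padding and its bandwidth cost can be sketched as follows (the packet sizes are invented; the 8.81% and 16.5% figures above come from the authors' experiments, not from this toy computation):

```python
def pad_to_multiple(packet_bits: int, block_bits: int) -> int:
    """Round a packet length up to the next multiple of block_bits."""
    return -(-packet_bits // block_bits) * block_bits  # ceiling division

def overhead(packets, block_bits):
    """Extra bandwidth as a fraction of the original traffic."""
    original = sum(packets)
    padded = sum(pad_to_multiple(p, block_bits) for p in packets)
    return (padded - original) / original

sizes = [200, 344, 168, 432]          # illustrative VBR packet sizes in bits
print(pad_to_multiple(200, 128))      # -> 256
print(round(overhead(sizes, 128), 3)) # -> 0.231
```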
References
[1] Charles V. Wright, Lucas Ballard, Scott E. Coull, Fabian Monrose, and Gerald M. Masson. Spot me if you can: Uncovering spoken phrases in encrypted VoIP conversations. In Proceedings of the 2008 IEEE Symposium on Security and Privacy (SP '08), pages 35-49, Washington, DC, USA, 2008. IEEE Computer Society.
[2] Charles V. Wright, Lucas Ballard, Fabian Monrose, and Gerald M. Masson. Language identification of encrypted VoIP traffic: Alejandra y Roberto or Alice and Bob? In Proceedings of the 16th USENIX Security Symposium (SS '07), pages 1-12, Berkeley, CA, USA, 2007. USENIX Association.
[3] Wai C. Chu. Speech Coding Algorithms: Foundation and Evolution of Standardized Coders. John Wiley & Sons, Inc., New York, NY, USA, 2003.
[4] Lawrence R. Rabiner. A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 77(2):257-286, 1989.
[5] S. R. Eddy. Profile hidden Markov models (review). Bioinformatics, 14(9):755-763, 1998.
30 / 30