TLD Data Analysis. ICANN Tech Day, Dublin. October 19th 2015 Maarten Wullink, SIDN. Klik om de s+jl te bewerken



Similar documents
DNS Big Data

.nl ENTRADA. CENTR-tech 33. November 2015 Marco Davids, SIDN Labs. Klik om de s+jl te bewerken

ENTRADA: Enabling DNS Big Data Applications

A Privacy framework for DNS big data. November 28, 2014 Jelte Jansen

Big data security on.nl: infrastructure and one application

Decoding DNS data. Using DNS traffic analysis to identify cyber security threats, server misconfigurations and software bugs

Malicious domains: Automatic Detection with DNS traffic analysis

A privacy framework for DNS big data applications *

Telecom and Internet Regulatory Challenges and Opportunities Names, Numbers, Internet Governance

Stream Deployments in the Real World: Enhance Opera?onal Intelligence Across Applica?on Delivery, IT Ops, Security, and More

Big Data. The Big Picture. Our flexible and efficient Big Data solu9ons open the door to new opportuni9es and new business areas

How To Use Splunk For Android (Windows) With A Mobile App On A Microsoft Tablet (Windows 8) For Free (Windows 7) For A Limited Time (Windows 10) For $99.99) For Two Years (Windows 9

A privacy framework for DNS big data applications *

Using RDBMS, NoSQL or Hadoop?

Internet Security and Resiliency: A Collaborative Effort

IANA Functions to cctlds Sofia, Bulgaria September 2008

The Data Reservoir. 10 th September Mandy Chessell FREng CEng FBCS Dis4nguished Engineer, Master Inventor Chief Architect, Informa4on Solu4ons

Protec'ng Communica'on Networks, Devices, and their Users: Technology and Psychology

NANOG DNS BoF. DNS DNSSEC IPv6 Tuesday, February 1, 2011 NATIONAL ENGINEERING & TECHNICAL OPERATIONS

DNS Traffic Monitoring. Dave Piscitello VP Security and ICT Coordina;on, ICANN

Data Management in the Cloud: Limitations and Opportunities. Annies Ductan

Domain Name System Security

Complexity and Transition Management. Theory and practice

Strengthening our Ecosystem through Stakeholder Collaboration. Jia-Rong Low, Sr Director, Asia 20 August 2015

More Than A Buzzword: Big Data in the Environmental Arena

Hadoop & SAS Data Loader for Hadoop

DDOS Mi'ga'on in RedIRIS. SIG- ISM. Vienna

NERSC Data Efforts Update Prabhat Data and Analytics Group Lead February 23, 2015

Apache Spark and the future of big data applica5ons. Eric Baldeschwieler

UNIFIED, END- TO- END EDISCOVERY

The Right BI Tool for the Job in a non- SAP Applica9on Environment

A versatile platform for DNS metrics with its application to IPv6

Hortonworks & SAS. Analytics everywhere. Page 1. Hortonworks Inc All Rights Reserved

SQream Technologies Ltd - Confiden7al

FAQ (Frequently Asked Questions)

DTCC Data Quality Survey Industry Report

Computer Security Incident Handling Detec6on and Analysis

Performance Management in Big Data Applica6ons. Michael Kopp, Technology

Data Stream Algorithms in Storm and R. Radek Maciaszek

Hunk & Elas=c MapReduce: Big Data Analy=cs on AWS

The IANA Functions. An Introduction to the Internet Assigned Numbers Authority (IANA) Functions

Big Data for Investment Research Management

Texas Digital Government Summit. Data Analysis Structured vs. Unstructured Data. Presented By: Dave Larson

Program Model: Muskingum University offers a unique graduate program integra6ng BUSINESS and TECHNOLOGY to develop the 21 st century professional.

How To Use A Webmail On A Pc Or Macodeo.Com

Main Research Gaps in Cyber Security

ICANN STRATEGIC PLAN JULY 2012 JUNE 2015

An to Big Data, Apache Hadoop, and Cloudera

Sicurezza & Big Data: la Security Intelligence aiuta le aziende a difendersi dagli attacchi

Big Data and New Paradigms in Information Management. Vladimir Videnovic Institute for Information Management

APPLICATION PROGRAMMING INTERFACE

Oracle Big Data SQL Technical Update

BEST PRACTICES FOR IMPROVING EXTERNAL DNS RESILIENCY AND PERFORMANCE

Computer Networks: Domain Name System

Best Practices in Domain Name Registry Solutions Understanding the Technical Requirements of ICANN's Applicant Guidebook

DOMAIN NAME DAY. + Helsinki; 14 th February; Nigel Hickson, ICANN

SAC 049 SSAC Report on DNS Zone Risk Assessment and Management

Six Days in the Network Security Trenches at SC14. A Cray Graph Analytics Case Study

Splunk for Networking and SDN

Lost in Space? Methodology for a Guided Drill-Through Analysis Out of the Wormhole

Testing Tools using Visual Studio. Randy Pagels Sr. Developer Technology Specialist Microsoft Corporation

OFERTIE OpenFlow Experiments in Real- Time Interac7ve Edutainment

MSc Data Science at the University of Sheffield. Started in September 2014

Private Cloud Website Solu2on

Final. Dr. Paul Twomey President and Chief Executive Officer Internet Corporation for Assigned Names and Numbers (ICANN)

DNS Security, Stability and Resiliency

Big data analy+cs for global change monitoring and research in forestry and agriculture. Lubia Vinhas

ICANN: achievements and challenges of a multi-stakeholder, bottom up, transparent model

The Internet Ecosystem and ICANN!! Steve Stanford University, Center for Information and Society! 29 April 2013!

Cisco and Splunk: Under the Hood of Cisco IT

Data Governance in the Hadoop Data Lake. Michael Lang May 2015

PLATFORA INTERACTIVE, IN-MEMORY BUSINESS INTELLIGENCE FOR HADOOP

Blue Medora VMware vcenter Opera3ons Manager Management Pack for Oracle Enterprise Manager

Lesson 13: DNS Security. Javier Osuna GMV Head of Security and Process Consulting Division

SIDekICk SuspIcious DomaIn Classification in the.nl Zone

Transcription:

Klik om de s+jl te bewerken Klik om de models+jlen te bewerken Tweede niveau Derde niveau Vierde niveau TLD Data Analysis Vijfde niveau ICANN Tech Day, Dublin October 19th 2015 Maarten Wullink, SIDN Wie zijn wij? Mijlpalen Organisa@e Het huidige internet Missie - Visie Diensten Referen@es SamenvaJng 1

SIDN Domain name registry for.nl cctld > 5,6 million domain names 2,46 million domain names secured with DNSSEC SIDN Labs is the R&D team of SIDN

DNS Data @SIDN > 3.1 million dis@nct resolvers > 1.3 billion query's daily > 300 GB of PCAP data daily

ENTRADA ENhanced Top-Level Domain Resilience through Advanced Data Analysis Goal: data-driven improved security & stability of.nl and the Internet at large Problem: Exis@ng solu@ons for analyzing network data do not work well with large datasets and have limited analy@cal capabili@es. Main requirement: high-performance, near real-@me data warehouse Approach: avoid expensive pcap analysis: Convert pcap data to a performance-op@mized format (key) Perform analysis with tools/engines that leverage that

Use Cases Focussed on increasing the security and stability of.nl Visualize DNS pagerns (visualize traffic pagerns for phishing domain names) Detect botnet infec@ons Real-@me Phishing detec@on Sta@s@cs (stats.sidnlabs.nl) Scien@fic research (collabora@on with Dutch Universi@es) Opera@onal support for DNS operators

Example Applica@ons DNS security scoreboard Resolver reputa@on

DNS Security Scoreboard Goal: Visualize DNS pagerns for malicious ac@vity How: Combine external phishing feeds with DNS data

Architecture Security feed I new event Security feed II new event Hadoop Event Analyzer save enriched event PostgreSQL REST API Web UI retrieve event data

Traffic Visualiza@on

Resolver Reputa@on (RESREP) Goal: Try to detect malicious ac@vity by assigning reputa@on scores to resolvers How: fingerprin@ng resolver behaviour

RESREP Concept ISP Resolvers.nl Registry Authorita@ve DNS.nl Malicious ac@vity: Spam-runs Botnets like Cutwail DNS-amplifica@on agacks DNS ques@ons and responses

RESREP Architecture Root operator.nl Privacy Board ISP network Resolvers RESREP service RESREP Privacy Policy ENTRADA Plaqorm AbuseHUB Abusedesk User HTTP Child operator (example.nl) www.example.nl

ENTRADA Architecture DNS big data system Goal: develop applica@ons and services that further enhance the security and stability of.nl, the DNS, and the Internet at large ENTRADA main components Applica@ons and services Plaqorm and data sources Privacy framework Plaqorm + privacy framework = ENTRADA plumbing

ENTRADA Privacy Framework Part of the ENTRADA plumbing Juridisch%en%organisatorisch% ENTRADA%data%plaKorm%(technisch)% toepassingssilos% ENTRADA%privacyraamwerk% Key concepts Applica@on-specific privacy policy Privacy Board Enforcement Points R&D% licenje% Template% Auteur% (Ontwikkelaar% toepassing%t1)% Aanpassingen% Concept% policy% voor%t1% Privacy% Board% Policy% voor%t1% T1% T2% TN% PEP#G% PEP#A% PEP#O% PEP#G% PEP#A% PEP#O% PEP#G% PEP#A% PEP#O% Security%en%stability%% services%en%dashboards% Data#analyse%% algoritmes% Database%queries% Opslag% DNS%packets%(PCAP)% Policy elements include PEP#V% PEP#V% PEP#V% Verzameling% Purpose Data used.nl%nameservers% Filters DNS%query s%en%responses% Reten@on period resolvers% Type of applica@on (R&D vs. produc@on)

ENTRADA Technical Architecture ENTRADA-specific components Workflow Services DNS Library PCAP Conversion ENTRADA plaqorm Open source Hadoop (generic components) Support IMPALA HDFS Parquet

Workflow name server PCAP staging PCAP decode Applica@on X Applica@on Y Join Filter Hadoop Impala Analyst Enrich Parquet Monitoring Metrics Import Query data available for analysis within 10 minutes

Performance Example query, count # ipv4 queries per day. select concat_ws( -,day,month,year), count(1) from dns.queries where ipv=4 group by concat_ws( -,day,month,year) Query response @mes 1 Year of data is 2.2TB Parquet ~ 52TB of PCAP

ENTRADA Status Name server feeds 2 Queries per day ~320M Daily PCAP volume(gzipped) ~70GB Daily Parquet volume ~14GB Months opera@onal 18 Total # queries stored > 74B Total Parquet volume > 3TB HDFS (3x replica@on) > 9TB Cluster capacity ~150B-200B tuples

Conclusions Technical: Hadoop HDFS + Parquet + Impala is a winning combina@on! Contribu@ons: Research by SIDN Labs and universi@es Iden@fied malicious domain names and botnets External data feed to the Abuse Informa@on Exchange Insight into DNS query data

Future Work Combine data from.nl authorita@ve name server with scans of the complete.nl zone and ISP data. Get data from more name servers and resolvers Expand Open Data program

Ques@ons and Feedback Maarten Wullink Senior Research Engineer maarten.wullink@sidn.nl @wulliak www.sidnlabs.nl hgps://stats.sidnlabs.nl