Lars Francke Diplom Wirtschaftsinformatiker (FH) Sülldorfer Kirchenweg 34




CURRICULUM VITAE

PERSONAL DATA

Name: Lars Francke, Diplom Wirtschaftsinformatiker (FH)
Street + Nr.: Sülldorfer Kirchenweg 34
Postal code + City: 22587 Hamburg
Country: Germany
Phone: +49 172 / 4554978
E-Mail: mail@lars-francke.de
Date of birth: 1981-12-17
Homepage: http://lars-francke.de

PROJECTS

02.2014
Cloudera, Palo Alto, USA
I am a Certified Cloudera Consultant, working with Cloudera EMEA on customer engagements. This entails deployment and optimization of Hadoop clusters based on CDH, including security using Kerberos.
Projects:

  Retailer, Great Britain (October 2015)
  - Setting up High Availability for Hue, Hive, Oozie and others using HAProxy
  - Backup & Disaster Recovery Plan and Implementation using BDR & DistCP
  - YARN Resource Pool Configuration
  - Benchmarking
  - Spark application debugging

  Financial institution, Great Britain (August - September 2015)
  - Setup of secure Hadoop on two CDH Clusters
    o Active Directory Integration
    o Sentry
  - SSL/TLS Encryption
  - LDAP Authentication

  Financial institution, Great Britain (August 2015)
  - Setup of a CDH 5.4 cluster
  - Integration of the company's Active Directory

  Online ticketing and event company, Germany (November 2014)
  - Installation of CDH 5.2 on a new cluster (Ubuntu 14.04, Spark on YARN)
  - Integration of the company's Active Directory to provide secure Hadoop

  Software development company, Poland (9.2014)
  - Installation of CDH5 on a new virtual cluster
    o Installation of a local MIT KDC and setting up Hadoop security
  - Training on Cloudera Manager
  - Integration of Pig in a C++ application
  - Consultation around best practices in Hadoop development

  Price comparison site, United Kingdom (6.2014-7.2014)
  - Installation of CDH5 on a new cluster
  - Review and optimization of the configuration
  - Import of existing MongoDB databases in BSON format to HDFS
  - Processing of the BSON files using Hive, Impala and Pig; transformation of the data to Avro and Parquet
  - Export of data to SQL Server using Sqoop

  Online marketplace for car sales, Germany (5.2014)
  - Migration of an existing CDH4 cluster, which was installed using packages with Puppet, to Cloudera Manager and Parcels
  - Design and deployment of a Security concept using Kerberos with Active Directory Integration and Sentry
  - Optimization of an existing Flume infrastructure
  - Setup and demonstration of Hue, Oozie and Impala

  Telecommunications company, Belgium (3.2014-4.2014)
  - Certification of an existing CDH4 cluster
  - Review and optimization of the cluster
  - Design and deployment of a Security concept using Kerberos with Active Directory Integration and Sentry

  IT Consulting company, France (2.2014)
  - Advice on Hardware selection as well as network design
  - Preparation of the Operating System (CentOS)
  - Installation and optimization of CDH4
  - Training of employees in using Hue and Hadoop as well as development of Hive UDFs

08.2015
Euroclear, Brussels, Belgium
- Setup of a CDH 5.4 cluster
- Integration of the company's Active Directory including Sentry

07.2015
LeanBI, Stettlen, Switzerland
- Spark & Hadoop Consulting and Troubleshooting

07.2015
P3 Communications, Aachen, Germany
- Spark consulting

06.2015
Roche Diagnostics, Mannheim, Germany
- Hadoop consultancy
- Maintenance of a cluster based on Amazon's EC2

05.2015
OTTO, Hamburg, Germany
- Consultancy around the BRAIN project (new BI platform)
- HBase, Hadoop, Spark, Realtime

04.2015
simpli.fi, Fort Worth, USA
- Consultancy around Hadoop, Spark, Best Practices, Kafka
- Review of an architecture based on Kafka, Flume, Hadoop
- Review of an existing cluster regarding Best practices, performance
- Cluster sizing based on predicted usage

04.2015 - 06.2015
T-Systems Iberia, Barcelona, Spain & Deutsche Telekom, Bremen, Germany
- Review of a proposed Hadoop based architecture to replace an Oracle & Informatica based Data Warehouse and ETL process
- Consultancy and training on all things Hadoop, Spark, HBase
- Setup of a development Hadoop cluster

03.2015
SDG Consulting, Hamburg, Germany
- Consultancy and training on all things Hadoop, Spark and Big Data
- Development of Spark applications and Hive UDFs for PoC projects
- Tableau & Spark Integration
- Installation of a Hadoop cluster on Microsoft Azure

01.2015
GfK SE, Nuremberg, Germany
- Documentation and consultation around making and validating informed decisions for the following topics:
  o HBase vs. Accumulo, Spark, SQL-on-Hadoop
  o Backup and High availability of Hadoop clusters
  o PaaS, IaaS, Bare-metal deployments in public and private cloud scenarios
- Development of code for HBase backed projects
- General Hadoop and Spark consultancy

09.2014
xplosion Interactive, Hamburg, Germany
- Migration of a Hadoop installation (set up using Chef) to Cloudera Manager
- Setup of Kerberos with Samba4 & Univention UCS for Hadoop Security
- Troubleshooting

09.2014 - 12.2014
advanced STORE, Berlin, Germany
- Consultancy around Big Data solutions for tools in the real time bidding world (e.g. generating models)
- Development of a prototype/proof of concept in Java using Dropwizard, Aerospike, RxJava and MongoDB
- Focus on pre-processing data using MongoDB and Aerospike and low latency Java web applications
- Setup of the ELK stack (Elasticsearch, Logstash, Kibana)

11.2014
CartoDB, Madrid, Spain
- Big Data consultancy around a scalable solution for ingesting and processing large amounts of Geospatial data
- Prototyping using Amazon's Elastic MapReduce and Cloudera Director

09.2014 - 10.2014
Land Resource Management Unit, JRC, European Commission, Ispra, Italy
- YARN/MRv2)
- Review and optimization of the cluster
- Training of employees on YARN and concepts such as HDFS High Availability
- Development of Hive UDFs and Queries to process large amounts of GIS data using the ESRI Spatial Framework for Hadoop

05.2013 - 01.2015
Collins GmbH & Co. KG, Hamburg, Germany
The project started with building an infrastructure for a newly formed BI team as well as the development of applications:
- Hardware selection for a new Hadoop cluster
- Installation of the Operating System (CentOS) as well as CDH4
- Ingestion of data from various sources (MySQL, Elasticsearch, MongoDB, CSV files and others)
- Prepare and provide the data for analysis with Hive, Pig, Impala, Scalding and other tools
- PoC for a realtime infrastructure to analyse clickstream data using Storm, Kafka and Elasticsearch
- Implementation of a recommendation engine using Hadoop, Mahout, Elasticsearch and other components
- Provide ad-hoc analysis as well as regular reports
- Migration of the cluster's operating system from CentOS to Debian while keeping the cluster running

10.2010 - 12.2013
Global Biodiversity Information Facility (GBIF), Copenhagen, Denmark
- Migration of a batch-oriented MySQL based workflow to process biodiversity data to a Hadoop based solution
- Installation of CDH3 using Puppet
- Upgrade of the cluster from CDH3 to CDH4 (including HBase and Solr) and migration from Puppet to Cloudera Manager
- Management and troubleshooting of the Hadoop cluster
- Provided Hadoop training
- Introduction of Maven, Nexus, Jenkins and SonarQube
- Design and implementation of a Crawler for Biodiversity data using DiGIR, BioCASe, TAPIR and DwC-A

08.2010 - 09.2010
Adternity GmbH, Dortmund, Germany
- Design of a Data Warehousing concept for the online advertisement business on the basis of open source technologies (namely Hadoop and Hive)
- Installation of CDH3
- Implementation of the concept using Hadoop and Hive

03.2010 - 06.2010
VZnet Netzwerke Ltd., Berlin, Germany
- Architecture of projects and highly scalable systems concerning geolocation for the StudiVZ Platform
- Implementation using Java (Jersey, Jackson) and Python
- Consultancy around Hadoop and HBase

IT SKILLS

Core skills
- Big Data (Hadoop ecosystem)
- Software architecture and development in Java

Programming languages
- Java
- Scala
- Python
- JavaScript

Big Data
- Experience using Hadoop, HBase and other related tools (Oozie, Sqoop, Hive, ZooKeeper, Spark, Storm, Kafka, Cloudera CDH etc.) 2009
- Elasticsearch
- Participating in projects (Mailing lists, Reviews, Patches)
- Hive Committer
- Attending conferences and meetups

Details
- Maven, Jenkins, SonarQube, Nexus
- Jersey (JAX-RS), Jackson, Avro, Dropwizard, Play, Akka, as well as the usual libraries and frameworks
- HBase, PostgreSQL, PostGIS, MySQL, Berkeley DB, Cassandra, MongoDB (mongo-hadoop), Redis, SQL
- Linux with a focus on CentOS/RedHat, Puppet, Ansible, Foreman, Fabric, Logstash, Kibana, Graylog2, Ganglia, Graphite
- JIRA, Confluence, Fisheye, Crucible, Git, OpenStreetMap (OSM), RabbitMQ, Varnish, Vagrant, Docker, Kerberos