Massachusetts Institute of Technology (MIT), Cambridge, USA Postdoctoral Associate (April 2013-present), BigData at CSAIL, Mentor: Samuel Madden



Similar documents
Exploring the Efficiency of Big Data Processing with Hadoop MapReduce

Database as a Service Polystore Systems Self-Managed Elastic Systems

A B S T R A C T. Index Terms : Apache s Hadoop, Map/Reduce, HDFS, Hashing Algorithm. I. INTRODUCTION

International Journal of Innovative Research in Computer and Communication Engineering

ANALYSING THE FEATURES OF JAVA AND MAP/REDUCE ON HADOOP

keywords: big-graphs, big-data, graph-systems, knowledge graphs, uncertain graphs, graph streams, revenue maximization, moving objects indexing.

Xianrui Meng. MCS 138, 111 Cummington Mall Department of Computer Science Boston, MA (857)

MATTEO RIONDATO Curriculum vitae

Query Optimizer for the ETL Process in Data Warehouses

IJRIT International Journal of Research in Information Technology, Volume 2, Issue 4, April 2014, Pg: STARFISH

WORKSHOP. if you want to survive, you must learn to adapt! CAMBRIDGE. Naspers Academy presents

1 st Symposium on Colossal Data and Networking (CDAN-2016) March 18-19, 2016 Medicaps Group of Institutions, Indore, India

One-Size-Fits-All: A DBMS Idea Whose Time has Come and Gone. Michael Stonebraker December, 2008

Medical Big Data Workshop 12:30-5pm Star Conference Room. #MedBigData15

In-Memory Columnar Databases HyPer. Arto Kärki University of Helsinki

5-Layered Architecture of Cloud Database Management System

Abhinandan Das EDUCATION AREAS OF INTEREST REPRESENTATIVE RESEARCH

Brave New World: Hadoop vs. Spark

A Comparison of Approaches to Large-Scale Data Analysis

Preview of Award Annual Project Report Cover Accomplishments Products Participants/Organizations Impacts Changes/Problems

AN EFFECTIVE PROPOSAL FOR SHARING OF DATA SERVICES FOR NETWORK APPLICATIONS

Analysing Large Web Log Files in a Hadoop Distributed Cluster Environment

How To Use Big Data For Telco (For A Telco)

City University of Hong Kong Information on a Course offered by Department of Computer Science with effect from Semester A in 2014 / 2015

IN-MEMORY DATABASE SYSTEMS. Prof. Dr. Uta Störl Big Data Technologies: In-Memory DBMS - SoSe

MapReduce With Columnar Storage

3rd International Symposium on Big Data and Cloud Computing Challenges (ISBCC-2016) March 10-11, 2016 VIT University, Chennai, India

Play with Big Data on the Shoulders of Open Source

Juan (Jenn) Du. Homepage: www4.ncsu.edu/ jdu/ Co-advisors: Dr. Xiaohui (Helen) Gu and Dr. Douglas Reeves

Log Mining Based on Hadoop s Map and Reduce Technique

DiNoDB: Efficient Large-Scale Raw Data Analytics

Towards Integrated Data Center Design

Benchmarking and Analysis of NoSQL Technologies

BIG DATA TOOLS. Top 10 open source technologies for Big Data

SCALABLE GRAPH ANALYTICS WITH GRADOOP AND BIIIG

The Recovery System for Hadoop Cluster

Amr El Abbadi. Computer Science, UC Santa Barbara

A REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM

Large-scale Data Processing on the Cloud

Report Data Management in the Cloud: Limitations and Opportunities

Summary. Grants. Publications. Peer Esteem Indicators. Teaching Experience. Administrative Experience. Industrial Experience

In-Memory Analytics for Big Data

Curriculum Vitae Ruben Sipos

Optimization of Analytic Data Flows for Next Generation Business Intelligence Applications

Real Time Fraud Detection With Sequence Mining on Big Data Platform. Pranab Ghosh Big Data Consultant IEEE CNSV meeting, May Santa Clara, CA

Report for the seminar Algorithms for Database Systems F1: A Distributed SQL Database That Scales

List and Sort Biographies of Participants _SUMMER SCHOOL 2010

Big Data Provenance: Challenges and Implications for Benchmarking

From GWS to MapReduce: Google s Cloud Technology in the Early Days

A Professional Big Data Master s Program to train Computational Specialists

In-Memory Data Management for Enterprise Applications

Sharareh Noorbaloochi Department of Psychology New York University 6 Washington Place, 559, New York, NY noorbaloochi@nyu.

COMPUTING SCIENCE. Scalable and Responsive Event Processing in the Cloud. Visalakshmi Suresh, Paul Ezhilchelvan and Paul Watson

Hadoop & its Usage at Facebook

Fig. 3. PostgreSQL subsystems

Big Data :: Big Demand

Daniel J. Adabi. Workshop presentation by Lukas Probst

ISSN: CONTEXTUAL ADVERTISEMENT MINING BASED ON BIG DATA ANALYTICS

Principles of Distributed Database Systems

How to Build a High-Performance Data Warehouse By David J. DeWitt, Ph.D.; Samuel Madden, Ph.D.; and Michael Stonebraker, Ph.D.

OLTP Meets Bigdata, Challenges, Options, and Future Saibabu Devabhaktuni

Managing Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database

Hadoop s Entry into the Traditional Analytical DBMS Market. Daniel Abadi Yale University August 3 rd, 2010

How To Analyze Log Files In A Web Application On A Hadoop Mapreduce System

NoSQL Performance Test In-Memory Performance Comparison of SequoiaDB, Cassandra, and MongoDB

The Sierra Clustered Database Engine, the technology at the heart of

Self-Compressive Approach for Distributed System Monitoring

How, What, and Where of Data Warehouses for MySQL

A Survey on: Efficient and Customizable Data Partitioning for Distributed Big RDF Data Processing using hadoop in Cloud.

Analysis of EU PhD Education and Research. Prof. Dr. Hans G. Sonntag, MF Heidelberg

Big Data, Fast Data, Complex Data. Jans Aasman Franz Inc

Big Data Security Challenges and Recommendations

Distributed Architecture of Oracle Database In-memory

SQL + NOSQL + NEWSQL + REALTIME FOR INVESTMENT BANKS

Stream Processing on GPUs Using Distributed Multimedia Middleware

How To Build A Cloud Computer

Il mondo dei DB Cambia : Tecnologie e opportunita`

Bundling Hadoop & Map Reduce for Data-Intensive Computing in Distributed Systems

Navigating the Big Data infrastructure layer Helena Schwenk

Using OBIEE for Location-Aware Predictive Analytics

Bloomington, IN, USA Fall 2007-Current. CSSE 371- Software Requirements and Specification

Towards a Thriving Data Economy: Open Data, Big Data, and Data Ecosystems

Big Data Research in the AMPLab: BDAS and Beyond

SQL Server 2012 Performance White Paper

Lecture Data Warehouse Systems

In-Memory Databases Algorithms and Data Structures on Modern Hardware. Martin Faust David Schwalb Jens Krüger Jürgen Müller

Data Refinery with Big Data Aspects

Conjugating data mood and tenses: Simple past, infinite present, fast continuous, simpler imperative, conditional future perfect

Keywords Big Data, NoSQL, Relational Databases, Decision Making using Big Data, Hadoop

Big Data and Data Science: Behind the Buzz Words

Challenges for Data Driven Systems

Keywords Big Data Analytic Tools, Data Mining, Hadoop and MapReduce, HBase and Hive tools, User-Friendly tools.

Improving Job Scheduling in Hadoop

Adapting scientific computing problems to cloud computing frameworks Ph.D. Thesis. Pelle Jakovits

An Industrial Perspective on the Hadoop Ecosystem. Eldar Khalilov Pavel Valov

CIS492 Special Topics: Cloud Computing د. منذر الطزاونة

Petabyte Scale Data at Facebook. Dhruba Borthakur, Engineer at Facebook, SIGMOD, New York, June 2013

The Department of Education. in Science and Technology

A STUDY ON HADOOP ARCHITECTURE FOR BIG DATA ANALYTICS

Dr Christos Anagnostopoulos. 1. Education. 2. Present employment. 3. Previous Appointments. Page 1 of 6

Transcription:

Alekh Jindal MIT CSAIL Phone: +1 617-710-0572 32 Vassar St., Room 32-G908 Email: alekh@csail.mit.edu Cambirdge, MA 02141, USA Web: http://people.csail.mit.edu/alekh Education Massachusetts Institute of Technology (MIT), Cambridge, USA Postdoctoral Associate (April 2013-present), BigData at CSAIL, Mentor: Samuel Madden Saarland University & Max Plank Institute for Informatics, Saarbrücken, Germany Ph.D., Summa Cum Laude (2010-2012), Computer Science, Advisor: Jens Dittrich Thesis: OctopusDB: Flexible and Scalable Storage Management for Arbitrary Database Engines M.Sc., Honors (2008-2010), Computer Science, Advisors: Jens Dittrich & Gerhard Weikum Thesis: Quality in Phrase Mining Indian Institute of Technology (IIT), Kanpur, India B.Tech., (2002-2006) Electrical Engineering, Advisor: Shyama P Das Project: Micro-controller based Power Distribution Monitoring & Control Research Interests Awards and Honors Grants Databases, Large-scale Data-intensive Systems, MapReduce, Graph Analytics, Data Preparation, Data Cleaning, and Physical Database Design. VLDB 2014 s Best Paper Award for The Uncracked Pieces in Database Cracking, 2014 Summa Cum Laude for Doctoral degree. Awarded if all reporters have evaluated the thesis with this grade and all members of the examination board vote in favour of this. Committee: Sebastian Hack, Jens Dittrich, Gerhard Weikum, Anastasia Ailamaki, 2012 CIDR 2011 s Best Outrageous Ideas and Vision Paper Award for Towards a one-size-fits-all Database Architecture, 2011 Strategic Innovation Fund support for PhD research from Multimodal Computing and Interaction Cluster of Excellence (M2CI), Saarbruecken, Germany, 2010-2012 Honors degree in Computer Science at Saarland University for finishing studies in a time below the standard duration and the overall grade above 1.3 (scale 1 to 6, with 1 being the best), 2010 Saarland University s merit-based tuition fee exemption award for winter term 2009/2010 Max-Planck Institute for Informatics IMPRS scholarship for Master studies. This was awarded to 15 people from 12 different countries across the world, 2008-2010 British Telecom s Iteration Champion award for best performance in platform development team at Bangalore, India, 2007 British Telecom s Hot-House winner for best fresher project at Bangalore, India, 2006 IIT Kanpur s top-3 final year B.Tech Project in Electrical Engineering, India, 2006 IIT Joint Entrance Examination s All India Rank 216 (out of over 160,000 participants), 2002 1st in College in Indian School Certificate Examination, 2001 3rd in College in Indian Certificate for Secondary Examination, 1999 Served on the National Science Foundation (NSF) Panel on Database Analytics, 2014. Helped Prof. Dittrich in writing a 3-year grant to the German Ministry of Education and Science. The proposal received 1.1 million Euros for 6 PhD positions, 2011. 1

Research Projects Large-scale Data-intensive Systems Database Techniques for Cluster & Cloud Computing (VLDB 10, SOCC 11, VLDB 12, SIGMOD 13) Integrating database concepts, such as data layouts, indexing, and join processing, into largescale data flow systems to significantly improve their performance over several workloads. Heterogenous data replication in storage systems, with each replica specialized for a different subset of the workload, while still preserving fault-tolerance. Performance modeling and resource prediction for database-as-a-service. Data Preparation and Cleaning (SIGMOD 13, under-submissions) Fine-grained data transformation, including sampling, replication, partitioning, erasure coding, compression, and data placement. Query interfaces, join algorithms, and efficient implementation of scalable violation detection and repair. Database Architectures & Database Design Towards One-size-fits-all Database (VLDB PhD Workshop 10, CIDR 11, CIDR 13) Building a flexible data management system for the end-users which mimics a variety of specialized systems and adapts to dynamic query workloads. Dropping the assumption of a fixed data store to improve performance and lower costs. Mixed Query Workloads (CIDR 13, VLDB 13, VLDB 14, IEEE BigData 14, under-submission) Supporting several query workloads in relational databases, including: (i) transactional (OLTP) and analytical (OLAP), (ii) relational (OLAP) and graph analytics. (iii) full-scans and ad-hoc range queries. Database Adaptivity & Robustness (BIRTE 11, VLDB 14, CIDR 15) Adaptive vertical partitioning to hit the middle ground between row and column layouts. Adaptive indexing with faster convergence, robustness, and awareness to modern hardware. Data-driven robust data transformations for ad-hoc query workloads. Selected Publications [P15] Alekh Jindal, Samuel Madden, GraphiQL: A Graph Intuitive Query Language for Relational Databases IEEE BigData, 441-450, 2014. Acceptance rate: 18.5% [P14] Felix Martin Schuhknecht, Alekh Jindal, Jens Dittrich, The Uncracked Pieces in Database Cracking PVLDB, 7(2):97-108, 2014. [Best Paper Award] [P13] Alekh Jindal, Endre Palatinus, Vladimir Pavlov, Jens Dittrich, A Comparison of Knives for Bread Slicing PVLDB, 6(6):361-372, 2013. [P12] Barzan Mozafari, Carlo Curino, Alekh Jindal, Samuel Madden, Performance and Resource Modeling in Highly-Concurrent OLTP Workloads SIGMOD, 301-312, 2013. [P11] Alekh Jindal, Jorge-Arnulfo Quiane-Ruiz, Jens Dittrich, WWHow! Freeing Data Storage from Cages CIDR, 2013. [P10] Alekh Jindal, Felix Martin Schuhknecht, Jens Dittrich, Karen Khachatryan, Alexander Bunte, How Achaeans Would Construct Columns in Troy CIDR, 2013. [P9] Jens Dittrich, Jorge-Arnulfo Quiané-Ruiz, Stefan Richter, Stefan Schuh, Alekh Jindal, Jörg Schad, Only Aggressive Elephants are Fast Elephants PVLDB, 5(9):1591-1602, 2012. 2

[P8] Alekh Jindal, Jorge-Arnulfo Quiane-Ruiz, Jens Dittrich, Trojan Data Layouts: Right Shoes for a Running Elephant ACM SOCC, 21:1-21:14, 2011. [P7] Jens Dittrich, Alekh Jindal, Towards a one-size-fits-all Database Architecture CIDR, 195-198, 2011. [Best Outrageous Ideas and Vision Paper Award] [P6] Jens Dittrich, Jorge-Arnulfo Quiané-Ruiz, Alekh Jindal, Yagiz Kargin, Vinay Setty, and Jörg Schad, Hadoop++: Making a Yellow Elephant Run Like a Cheetah (Without It Even Noticing) PVLDB, 3(1):518-529, 2010. Other Publications [P5] Alekh Jindal, Robust Data Transformations CIDR, 2015. [Abstract] [P4] Alekh Jindal, Praynaa Rawlani, Eugene Wu, Samuel Madden, Amol Deshpande, Mike Stonebraker, Vertexica: Your Relational Friend for Graph Analytics! PVLDB, 7(13):1669-1672, 2014. [Demo] Under Submission/Preparation [P3] Alekh Jindal, Jorge-Arnulfo Quiane-Ruiz, Samuel Madden Cartilage: Adding Flexibility to the Hadoop Skeleton SIGMOD, 1057-1060, 2013. [Demo] [P2] Alekh Jindal, Jens Dittrich Relax and Let the Database do the Partitioning Online VLDB BIRTE, 65-80, 2011. [P1] Alekh Jindal The Mimicking Octopus: Towards a one-size-fits-all Database Architecture VLDB PhD Workshop, 78-83, 2010. [U3] Felix Martin Schuhknecht, Alekh Jindal, Jens Dittrich, An Experimental Evaluation and Analysis of Database Cracking. (Extended Version of [P14]). [U2] Alekh Jindal, Jorge Quiané-Ruiz, Samuel Madden, Fine-grained Data Preparation and Storage, Dec 2014. [U1] Alekh Jindal, Samuel Madden, Malú Castellanos, Meichun Hsu, Graph Analytics using the Vertica Relational Database, Nov 2014. http://arxiv.org/abs/1412.5263 Miscellaneous Writings International Patents Teaching Alekh Jindal, Graph Analytics: The New Use Case for Relational Databases, blog at Intel Science and Technology Center (ISTC) for Big Data, July 2014. Alekh Jindal, Benchmarking Graph Databases, blog at Intel Science and Technology Center (ISTC) for Big Data, Sept 2013. Jens Dittrich, Jorge-Arnulfo Quiané-Ruiz, Stefan Richter, Stefan Schuh, Alekh Jindal, Jörg Schad, Replicated Data Storage System and Methods. WO Patent WO2013139379. Jens Dittrich, Alekh Jindal, A method for storing and accessing data in a database system. US Patent US20130226959 A1, European Patent EP2614450A1, WO Patent WO2012032184A1. Teaching with Educational Technologies, MIT, IAP 2015 Teaching Certificate Program, MIT, Summer 2014 Lab Assistant, From ASCII to Answers, MIT, Fall 2013 TA, Advanced Information Systems Lab, Germany, Winter 2011 TA, Advanced Information Systems Lab, Germany, Summer 2011 TA, NOSQL: Managing Data (almost) without a Database System, Germany, Winter 2010 TA, Advanced Information Systems Lab: OctopusDB, Germany, Summer 2010 3

TA, Database Systems core lecture, Germany, Winter 2009 Research Associate, National Program on Technology Enhanced Learning, India, 2005-2006 Mentoring Industrial Praynaa Rawlani, M.Eng. Thesis, Graphs anlaytics on relational databases. Felix Martin Schuhknecht, 1st year PhD, Evaluating and improving database cracking algorithms. Endre Palatinus, 1st year PhD, Evaluating vertical partitioning algorithms and their impact. Karen Khachatryan, 1st year PhD, Techniques for emulating columns stores in row databases. Stefan Chouteau, B.Sc Thesis, Implementing a log-structured main-memory database system. Sebastian Wendland, M.Sc Thesis, Implementing column store access layer in PostgreSQL. Marco Huester, M.Sc Thesis, Applying database cracking over two-dimensional data. Felix Martin Schuhknecht, B.Sc Thesis, Compression schemes over hybrid data layouts. Qatar Computing Research Institute (QCRI), Doha, Qatar Visiting Scholar, Data Analytics Team March 2014 Worked with QCRI scientists to set up a data cleaning research agenda. Ibibo Web (Naspers Group), Gurgaon, India Senior Software Engineer, Search Development Team 2007-2008 Developed vertical search engines using Lucene/Solr technologies and their front-end using LAMP. Built web analytics framework to analyze search engine activity, including user behavior, search engine performance, and search quality, on a daily as well as real-time basis. Prototyped a lead-generation system for Mobile Ads Business. British Telecom (BT), Bangalore, India Associate Consultant, Platform Development Team 2006-2007 Worked on building an order fulfillment system for the Mobility division. IBM Software Labs, Pune, India Summer Intern, Secure Area Network File System Team Summer 2005 Implemented an autonomic logging and recovery infrastructure using network blocked devices. Professional Activities PC Member, Special Interest Group on Management of Data (SIGMOD), 2015 PC Member, Proceedings on Very Large Data Bases (PVLDB), 2015 PC Member, Demo, Extending Database Technology (EDBT), 2015 PC Member, Data Engineering meets the Semantic Web (DESWeb) Workshop, 2015 PC Member, Proceedings on Very Large Data Bases (PVLDB), 2014 Panelist, National Science Foundation (NSF) panel on Database Analytic, 2014 Referee, SIGMOD Record Research Surveys, 2014 Referee, Distributed and Parallel Databases (DAPD), 2013 Thesis Reviewer, Adaptive Indexing and Covering in OctopusMK at Saarland University, 2013 External Reviewer for several conferences: SIGMOD (2014, 2013), VLDB (2013, 2012, 2010), ICDE (2012, 2013), CIKM (2013), EDBT (2011). Talks Conference on Innovative Database Research (CIDR), Asilomar, California, January, 2015 IEEE International Conference on BigData, Washington DC, October, 2014 IEEE International Conference on BigData, Washington DC, October, 2014 New England Database Summit (NEDB), MIT, Cambridge, Massachusetts, January, 2014 Intel Head-quarters, Santa Clara, USA, December, 2013 Symposium on Cloud Computing, Cascais, Portugal, October, 2011 VLDB, Singapore, September, 2010 4

References Prof. Samuel Madden Massachusetts Institute of Technology 32 Vassar Street Cambridge, MA 02139, USA +1 (617) 258-6643 madden@csail.mit.edu Prof. Jens Dittrich Saarland University Campus E1.1, Room 311 66123 Saarbrücken, Germany +49 (681) 302-70141 jens.dittrich@infosys.uni-saarland.de Prof. Gerhard Weikum Max-Planck Institute for Informatics Campus E1.4, Room 402 66123 Saarbrücken, Germany +49 (681) 9325-500 weikum@mpi-inf.mpg.de 5