Professional Hadoop Solutions




Brochure
More information from http://www.researchandmarkets.com/reports/2542488/

Professional Hadoop Solutions

Description:

The go-to guidebook for deploying Big Data solutions with Hadoop

Today's enterprise architects need to understand how the Hadoop frameworks and APIs fit together, and how they can be integrated to deliver real-world solutions. This book is a practical, detailed guide to building and implementing those solutions, with code-level instruction in the popular Wrox tradition. It covers storing data with HDFS and HBase, processing data with MapReduce, and automating data processing with Oozie. Hadoop security, running Hadoop with Amazon Web Services, best practices, and automating Hadoop processes in real time are also covered in depth.

With in-depth code examples in Java and XML and the latest on recent additions to the Hadoop ecosystem, this complete resource also covers the use of APIs, exposing their inner workings and allowing architects and developers to better leverage and customize them.

- The ultimate guide for developers, designers, and architects who need to build and deploy Hadoop applications
- Covers storing and processing data with various technologies, automating data processing, Hadoop security, and delivering real-time solutions
- Includes detailed, real-world examples and code-level guidelines
- Explains when, why, and how to use these tools effectively
- Written by a team of Hadoop experts in the programmer-to-programmer Wrox style

Professional Hadoop Solutions is the reference enterprise architects and developers need to maximize the power of Hadoop.
Contents:

Introduction

Chapter 1: Big Data and the Hadoop Ecosystem
- Big Data Meets Hadoop
- Hadoop: Meeting the Big Data Challenge
- Data Science in the Business World
- The Hadoop Ecosystem
- Hadoop Core Components
- Hadoop Distributions
- Developing Enterprise Applications with Hadoop
- Summary

Chapter 2: Storing Data in Hadoop
- HDFS
- HDFS Architecture
- Using HDFS Files
- Hadoop-Specific File Types
- HDFS Federation and High Availability
- HBase
- HBase Architecture
- HBase Schema Design
- Programming for HBase
- New HBase Features
- Combining HDFS and HBase for Effective Data Storage
- Using Apache Avro
- Managing Metadata with HCatalog
- Choosing an Appropriate Hadoop Data Organization for Your Applications
- Summary

Chapter 3: Processing Your Data with MapReduce
- Getting to Know MapReduce
- MapReduce Execution Pipeline
- Runtime Coordination and Task Management in MapReduce
- Your First MapReduce Application
- Building and Executing MapReduce Programs
- Designing MapReduce Implementations
- Using MapReduce as a Framework for Parallel Processing
- Simple Data Processing with MapReduce
- Building Joins with MapReduce
- Building Iterative MapReduce Applications
- To MapReduce or Not to MapReduce?
- Common MapReduce Design Gotchas
- Summary

Chapter 4: Customizing MapReduce Execution
- Controlling MapReduce Execution with InputFormat
- Implementing InputFormat for Compute-Intensive Applications
- Implementing InputFormat to Control the Number of Maps
- Implementing InputFormat for Multiple HBase Tables
- Reading Data Your Way with Custom RecordReaders
- Implementing a Queue-Based RecordReader
- Implementing RecordReader for XML Data
- Organizing Output Data with Custom Output Formats
- Implementing OutputFormat for Splitting MapReduce Job's Output into Multiple Directories
- Writing Data Your Way with Custom RecordWriters
- Implementing a RecordWriter to Produce Output tar Files
- Optimizing Your MapReduce Execution with a Combiner
- Controlling Reducer Execution with Partitioners
- Implementing a Custom Partitioner for One-to-Many Joins
- Using Non-Java Code with Hadoop
- Pipes
- Hadoop Streaming
- Using JNI
- Summary

Chapter 5: Building Reliable MapReduce Apps
- Unit Testing MapReduce Applications
- Testing Mappers
- Testing Reducers
- Integration Testing
- Local Application Testing with Eclipse
- Using Logging for Hadoop Testing
- Processing Applications Logs
- Reporting Metrics with Job Counters
- Defensive Programming in MapReduce
- Summary

Chapter 6: Automating Data Processing with Oozie
- Getting to Know Oozie
- Oozie Workflow
- Executing Asynchronous Activities in Oozie Workflow
- Oozie Recovery Capabilities
- Oozie Workflow Job Life Cycle
- Oozie Coordinator
- Oozie Bundle
- Oozie Parameterization with Expression Language
- Workflow Functions
- Coordinator Functions
- Bundle Functions
- Other EL Functions
- Oozie Job Execution Model
- Accessing Oozie
- Oozie SLA
- Summary

Chapter 7: Using Oozie
- Validating Information about Places Using Probes
- Designing Place Validation Based on Probes
- Designing Oozie Workflows
- Implementing Oozie Workflow Applications
- Implementing the Data Preparation Workflow
- Implementing Attendance Index and Cluster Strands Workflows
- Implementing Workflow Activities
- Populating the Execution Context from a java Action
- Using MapReduce Jobs in Oozie Workflows
- Implementing Oozie Coordinator Applications
- Implementing Oozie Bundle Applications
- Deploying, Testing, and Executing Oozie Applications
- Deploying Oozie Applications
- Using the Oozie CLI for Execution of an Oozie Application
- Passing Arguments to Oozie Jobs
- Using the Oozie Console to Get Information about Oozie Applications
- Getting to Know the Oozie Console Screens
- Getting Information about a Coordinator Job
- Summary

Chapter 8: Advanced Oozie Features
- Building Custom Oozie Workflow Actions
- Implementing a Custom Oozie Workflow Action
- Deploying Oozie Custom Workflow Actions
- Adding Dynamic Execution to Oozie Workflows
- Overall Implementation Approach
- A Machine Learning Model, Parameters, and Algorithm
- Defining a Workflow for an Iterative Process
- Dynamic Workflow Generation
- Using the Oozie Java API
- Using Uber Jars with Oozie Applications
- Data Ingestion Conveyer
- Summary

Chapter 9: Real-Time Hadoop
- Real-Time Applications in the Real World
- Using HBase for Implementing Real-Time Applications
- Using HBase as a Picture Management System
- Using HBase as a Lucene Back End
- Using Specialized Real-Time Hadoop Query Systems
- Apache Drill
- Impala
- Comparing Real-Time Queries to MapReduce
- Using Hadoop-Based Event-Processing Systems
- HFlame
- Storm
- Comparing Event Processing to MapReduce
- Summary

Chapter 10: Hadoop Security
- A Brief History: Understanding Hadoop Security Challenges
- Authentication
- Kerberos Authentication
- Delegated Security Credentials
- Authorization
- HDFS File Permissions
- Service-Level Authorization
- Job Authorization
- Oozie Authentication and Authorization
- Network Encryption
- Security Enhancements with Project Rhino
- HDFS Disk-Level Encryption
- Token-Based Authentication and Unified Authorization Framework
- HBase Cell-Level Security
- Putting it All Together: Best Practices for Securing Hadoop
- Authentication
- Authorization
- Network Encryption
- Stay Tuned for Hadoop Enhancements
- Summary

Chapter 11: Running Hadoop Applications on AWS
- Getting to Know AWS
- Options for Running Hadoop on AWS
- Custom Installation using EC2 Instances
- Elastic MapReduce
- Additional Considerations before Making Your Choice
- Understanding the EMR-Hadoop Relationship
- EMR Architecture
- Using S3 Storage
- Maximizing Your Use of EMR
- Utilizing CloudWatch and Other AWS Components
- Accessing and Using EMR
- Using AWS S3
- Understanding the Use of Buckets
- Content Browsing with the Console
- Programmatically Accessing Files in S3
- Using MapReduce to Upload Multiple Files to S3
- Automating EMR Job Flow Creation and Job Execution
- Orchestrating Job Execution in EMR
- Using Oozie on an EMR Cluster
- AWS Simple Workflow
- AWS Data Pipeline
- Summary

Chapter 12: Building Enterprise Security Solutions for Hadoop Implementations
- Security Concerns for Enterprise Applications
- Authentication
- Authorization
- Confidentiality
- Integrity
- Auditing
- What Hadoop Security Doesn't Natively Provide for Enterprise Applications
- Data-Oriented Access Control
- Differential Privacy
- Encrypted Data at Rest
- Enterprise Security Integration
- Approaches for Securing Enterprise Applications Using Hadoop
- Access Control Protection with Accumulo
- Encryption at Rest
- Network Isolation and Separation Approaches
- Summary

Chapter 13: Hadoop's Future
- Simplifying MapReduce Programming with DSLs
- What Are DSLs?
- DSLs for Hadoop
- Faster, More Scalable Processing
- Apache YARN
- Tez
- Security Enhancements
- Emerging Trends
- Summary

Appendix: Useful Reading

Index

Ordering:
- Order Online - http://www.researchandmarkets.com/reports/2542488/
- Order by Fax - using the form below
- Order by Post - print the order form below and send to Research and Markets, Guinness Centre, Taylors Lane, Dublin 8, Ireland.

Fax Order Form

To place an order via fax simply print this form, fill in the information below and fax the completed form to 646-607-1907 (from USA) or +353-1-481-1716 (from Rest of World). If you have any questions please visit http://www.researchandmarkets.com/contact/

Order Information
Please verify that the product information is correct.

Product Name: Professional Hadoop Solutions
Web Address: http://www.researchandmarkets.com/reports/2542488/
Office Code: SC

Product Format
Please select the product format and quantity you require:

Hard Copy (Paper back): USD 115 + USD 29 Shipping/Handling *
Quantity:

* Shipping/Handling is only charged once per order.

Contact Information
Please enter all the information below in BLOCK CAPITALS

Title: Mr / Mrs / Dr / Miss / Ms / Prof
First Name:
Last Name:
Email Address: *
Job Title:
Organisation:
Address:
City:
Postal / Zip Code:
Country:
Phone Number:
Fax Number:

* Please refrain from using free email accounts when ordering (e.g. Yahoo, Hotmail, AOL)

Payment Information
Please indicate the payment method you would like to use by selecting the appropriate box.

Pay by credit card: You will receive an email with a link to a secure webpage to enter your credit card details.

Pay by check: Please post the check, accompanied by this form, to: Research and Markets, Guinness Center, Taylors Lane, Dublin 8, Ireland.

Pay by wire transfer: Please transfer funds to:
Account number: 833 130 83
Sort code: 98-53-30
Swift code: ULSBIE2D
IBAN number: IE78ULSB98533083313083
Bank Address: Ulster Bank, 27-35 Main Street, Blackrock, Co. Dublin, Ireland.

If you have a Marketing Code please enter it below:
Marketing Code:

Please note that by ordering from Research and Markets you are agreeing to our Terms and Conditions at http://www.researchandmarkets.com/info/terms.asp

Please fax this form to:
(646) 607-1907 or (646) 964-6609 - From USA
+353-1-481-1716 or +353-1-653-1571 - From Rest of World