Cloud Computing and Big Data
What Technical Writers Need to Know
Greg Olson, Senior Director, Black Duck Software
For the Society of Technical Writers, Berkeley Chapter
Black Duck 2014

Agenda
- Introduction
- A Brief History of Computing Paradigms
- Defining Cloud Computing
- Big Data Discussion
- Q and A

Computing Paradigm Evolution
[Timeline figure. Recoverable labels: Mainframe, Minicomputer, Data Center, Cloud; Batch Processing, Time Sharing, Applications and Web Clients; Card Punch, Teletype Terminal, Desktop PC, Mobile Device; 1950s-1960s, 1970s-1980s, 1990s, 2000-2010, TODAY]

Defining Cloud Computing

Cloud Computing
- Definition
  - Computing that involves a large number of computers connected through a network such as the Internet
  - Distributed computing over a network, with the ability to run applications on many connected computers at the same time
- Types of Cloud Computing
  - IaaS: Infrastructure-as-a-Service (storage, compute services, etc.)
  - PaaS: Platform-as-a-Service (solution stack or compute platform)
  - SaaS: Software-as-a-Service (cloud-hosted application)
- Cloud vs. Web
  - A web app is one that presents its UI in a web browser; a cloud app is an app hosted in the Cloud
  - Most Cloud apps are web apps, but not all web apps are hosted in the Cloud; some may run on physical (not virtual) servers

Functional Differences
Characteristics of Cloud Computing - The NIST Definition of Cloud Computing
- On-demand self-service
  - Resources provisioned automatically
  - Nearly instant
- Broad network access
  - Capabilities available over the network
  - Can be accessed by a variety of end-user terminal devices, including thick clients on desktop machines as well as thin clients on mobile devices (phones, tablets, laptops, etc.)
- Resource pooling
  - All resources are located in one or more common pools
  - Exact physical location of the resources is not specified
- Rapid elasticity
  - Capacity may be rapidly allocated and released
- Measured service
  - Resources billed at a fine granularity as they are consumed

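The pooling, elasticity, and measured-service characteristics above can be sketched in a few lines of Python. This is a toy model, not any provider's API: the `ResourcePool` class, the tenant name, and the price of one cent per unit-second are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ResourcePool:
    """Toy shared pool of compute units (hypothetical, not a real cloud API)."""
    capacity: int                     # total units in the common pool
    rate_cents_per_unit_second: int   # made-up price, for illustration only
    allocated: dict = field(default_factory=dict)  # tenant -> units held now
    usage: dict = field(default_factory=dict)      # tenant -> unit-seconds used

    def free_units(self) -> int:
        """Units in the pool that no tenant currently holds."""
        return self.capacity - sum(self.allocated.values())

    def scale(self, tenant: str, units: int) -> None:
        """On-demand self-service: set a tenant's allocation, up or down."""
        current = self.allocated.get(tenant, 0)
        if units - current > self.free_units():
            raise ValueError("pool exhausted")
        self.allocated[tenant] = units

    def tick(self, seconds: int) -> None:
        """Measured service: meter unit-seconds for whatever is held now."""
        for tenant, units in self.allocated.items():
            self.usage[tenant] = self.usage.get(tenant, 0) + units * seconds

    def bill(self, tenant: str) -> int:
        """Charge in cents for the unit-seconds consumed so far."""
        return self.usage.get(tenant, 0) * self.rate_cents_per_unit_second


pool = ResourcePool(capacity=100, rate_cents_per_unit_second=1)
pool.scale("docs-team", 2)    # start small
pool.tick(60)                 # 2 units * 60 s = 120 unit-seconds
pool.scale("docs-team", 10)   # rapid elasticity: grow for a traffic spike
pool.tick(30)                 # 10 units * 30 s = 300 unit-seconds
pool.scale("docs-team", 1)    # release capacity when the spike ends
pool.tick(60)                 # 1 unit * 60 s = 60 unit-seconds
print(pool.bill("docs-team"), "cents")
```

The tenant pays only for what it held at each moment, and the capacity it releases returns to the pool for other tenants.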
Cloud Computing Distinctions
- Cloud vs. Web
  - Web apps can be hosted on physical web servers or on virtualized hosts in the Cloud
  - Cloud apps appear as web apps, using HTTP and related protocols, and presenting UI/UX in a web browser
  - Native mobile and desktop apps can have back ends hosted in the Cloud
- Public vs. Private Clouds
  - Public Cloud: a Cloud-based resource, platform, or service available over the public Internet
  - Private Cloud: cloud infrastructure operated for or by a single organization, managed internally or by a third party, and hosted internally or externally

What Makes a Cloud Work?
- Request Management
- Provisioning
- Account Management
- Usage Accounting
- Security
- Monitoring
- Image Management
- Network Virtualization
- Storage Virtualization
- Compute Virtualization
- License Management

Commercial Cloud Offerings (IaaS and PaaS)
- Amazon Web Services
- IBM SmartCloud
- Microsoft Cloud, Windows Azure
- Rackspace
- Oracle Cloud
- + Every other hosting provider now offers cloud services

Commercial Cloud Applications (SaaS)
- Adobe Creative Cloud
- Microsoft Office 365
- Google Docs and Google Drive
- Box
- Dropbox
- Salesforce.com
- Certify
- Netflix
- Gaming, e.g., CityVille
- Healthcare.gov
- New Relic

Open Source Cloud Projects and Technologies
- Platforms
  - OpenStack: the Cloud platform launched by Rackspace and NASA
    - OSS: supported by the OpenStack Foundation, whose members include 300 industry-leading companies
    - Commercial: supported by many companies (Rackspace, IBM, HP)
  - CloudStack: cloud software for building IaaS
    - OSS: supported by the Apache CloudStack project; commercial: supported by Citrix
  - Eucalyptus: cloud platform software for building public and private clouds compatible with Amazon Web Services
    - OSS and commercial: supported by Eucalyptus
  - Xen: a virtualization platform used in most public cloud platforms
    - OSS and commercial: supported by Citrix
  - VMware: a virtualization platform used in most private clouds
    - Commercial: supported by VMware
- Application Development Tools
  - PHP: a popular scripting language for web apps
  - Rails: a web application framework built on Ruby
  - Ruby: a dynamic, reflective, object-oriented, general-purpose programming language popular for building web/cloud apps
  - Node.js: scalable server-side JavaScript
  - Java: widely used 3GL for enterprise-scale applications

Why Open Source for the Cloud?
- Most service providers build on OSS to reduce cost
  - Linux, Xen, etc.
- Open Source is advantageous for any software that interoperates with other software
  - The Internet
  - Mobile communications
- Evolution has been very rapid, placing emphasis on speed of innovation rather than protecting market territory
  - Open Source community collaboration has led many areas

Big Data Discussion

Big Data Definition
- Big data involves the collection, management, and processing of data sets whose extreme size (tera- and petabytes) outstrips the capabilities of conventional databases and data warehouses
- Whereas conventional legacy databases handle structured data (records), Big Data must also handle unstructured data (text)
- Input domains for Big Data include web commerce, data center and cloud logging/metrics, mobile data, and the Internet of Things

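The difference between structured records and unstructured text can be shown with a small Python sketch. The `OrderRecord` fields and the log line are invented for illustration and are not taken from any real system.

```python
import re
from dataclasses import dataclass


@dataclass
class OrderRecord:
    """Structured data: fixed, named fields, as in a database row."""
    order_id: int
    customer: str
    total_cents: int


# A structured record can be queried directly by field name.
structured = OrderRecord(order_id=1001, customer="acme", total_cents=2599)

# Unstructured data: free text (a made-up log line) with no schema.
unstructured = "2014-03-02 10:15:04 user acme checkout failed: card declined (order 1001)"

# Getting the same fact out of the text means parsing it first.
match = re.search(r"order (\d+)", unstructured)
order_id_from_text = int(match.group(1)) if match else None

print(structured.order_id, order_id_from_text)
```

The record answers "which order?" with a field lookup, while the log line needs a parsing rule that someone has to write and maintain for each text format.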
Big Data Platforms
- Hadoop
  - Hadoop is a Java-based, open source Big Data management system built from clusters of commodity computers (blades), and it leads the market as a general-purpose Big Data platform
- NoSQL Databases
  - NoSQL ("Not Only SQL") databases are optimized for storing and managing entire documents and large amorphous data records
  - NoSQL databases are often used in support of Hadoop and can also serve as Big Data engines in their own right
  - Examples: Cassandra, CouchDB, HBase, MongoDB
- Cloud Big Data Platform
  - Instead of running on local clusters, Cloud Big Data is hosted on virtual clusters as a Cloud service, usually to handle highly variable volumes of data traffic (elastic Cloud)
  - Examples: Google Bigtable, Amazon Elastic Web Service
- Super-scaled legacy databases and data warehouses
  - Legacy database and data warehouse technologies can match Hadoop or NoSQL Big Data capabilities, but at high price points

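The slide does not name a processing model, but Hadoop is commonly associated with MapReduce, in which work is split into a map step, a grouping step, and a reduce step that can each be spread across a cluster. The following is a minimal single-process Python sketch of that idea using word counting. It does not use Hadoop or its Java API; the function names and sample documents are assumptions made up for illustration.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key (the word)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: collapse each group of values into a single count."""
    return {key: sum(values) for key, values in groups.items()}


# Made-up input; on a real cluster each document could live on a different node.
documents = [
    "cloud apps run in the cloud",
    "big data runs on clusters",
    "clusters run in the cloud",
]

mapped = chain.from_iterable(map_phase(doc) for doc in documents)
counts = reduce_phase(shuffle(mapped))
print(counts["cloud"], counts["clusters"])  # prints: 3 2
```

Because each document is mapped independently and each word is reduced independently, both steps can run in parallel on many commodity machines, which is the property the Hadoop bullet describes.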
Big Data Commercial Ecosystem
[Figure. Recoverable category labels: Services Providers; Development, Analytics and Visualization; Data Management and Storage]

Big Data Open Source Projects
[Figure. Recoverable project labels: Ambari, ZooKeeper, Pig]

The Open Source Big Data Stack

Operationalized Big Data

Why Open Source for Big Data?
- Most compute clusters are built on OSS to reduce cost
  - Linux, Xen, etc.
- Open Source is advantageous for any software that interoperates with other software
  - Distributed data collection
  - Multiple access options to support analytic approaches
- Evolution is still very rapid, placing emphasis on speed of innovation rather than protecting market territory
  - Open Source communities currently lead all major areas of big data technology

Q and A