Figure 1. Accessing via External Tables with in-database MapReduce
The simplest way to access external files or external data on a file system from within an Oracle database is through an external table. See here for an introduction to external tables. External tables present data stored in a file system in a table format and can be used in SQL queries transparently. External tables could thus potentially be used to access data stored in HDFS (the Hadoop Distributed File System) from inside the Oracle database.

Unfortunately, HDFS files are not directly accessible through the normal operating system calls that the external table driver relies on. The FUSE (Filesystem in Userspace) project provides a workaround in this case. There are a number of FUSE drivers that allow users to mount an HDFS store and treat it like a normal file system. By using one of these drivers and mounting HDFS on the database instance (on every instance in the case of a RAC database), HDFS files can be easily accessed through the external table infrastructure.
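To make this concrete, the sketch below shows what such a setup could look like. It is an illustration only and not part of the paper's example code: the mount command, mount point, directory path, file name, and column layout are all assumptions (the two NUMBER columns mirror the hadoop_row_obj payload used later in this paper).

# Hypothetical FUSE mount of HDFS on the database host; the exact
# tool and URI syntax vary by FUSE driver and Hadoop distribution.
hadoop-fuse-dfs dfs://namenode:8020 /mnt/hdfs

-- Hypothetical external table over a mapper output file in the
-- mounted HDFS directory
CREATE OR REPLACE DIRECTORY hdfs_dir AS '/mnt/hdfs/user/hadoop/output';

CREATE TABLE hdfs_ext
(
  a NUMBER,
  b NUMBER
)
ORGANIZATION EXTERNAL
(
  TYPE ORACLE_LOADER
  DEFAULT DIRECTORY hdfs_dir
  ACCESS PARAMETERS
  (
    RECORDS DELIMITED BY NEWLINE
    FIELDS TERMINATED BY ','
  )
  LOCATION ('part-00000')
)
REJECT LIMIT UNLIMITED
PARALLEL;

-- The HDFS data can now be queried like any other table:
SELECT max(a), avg(b) FROM hdfs_ext;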
Figure 1. Accessing via External Tables with in-database MapReduce

In Figure 1 we are utilizing Oracle Database 11g to implement in-database MapReduce as described in this article. In general, the parallel execution framework in Oracle Database 11g is sufficient to run most of the desired operations in parallel directly from the external table.

The external table approach may not be suitable in some cases (say, if FUSE is unavailable). Oracle table functions provide an alternate way to fetch data from Hadoop. Our attached example outlines one way of doing this. At a high level, we implement a table function that uses the DBMS_SCHEDULER framework to asynchronously launch an external shell script that submits a Hadoop MapReduce job. The table function and the mapper communicate using Oracle's Advanced Queuing feature: the Hadoop mapper enqueues data into a common queue while the table function dequeues data from it. Since this table function can be run in parallel, additional logic is used to ensure that only one of the parallel slaves submits the external job.

Figure 2. Leveraging Table Functions for parallel processing

The queue gives us load balancing: the table function can run in parallel, while the Hadoop streaming job also runs in parallel with a different degree of parallelism, outside the control of Oracle's query coordinator.
As an example, we translated the architecture shown in Figure 2 into a working implementation. Note that our example only shows a template implementation of using a table function to access data stored in Hadoop. Other, possibly better, implementations are clearly possible. The following diagrams are a technically more accurate and more detailed representation of the original schematic in Figure 2, explaining where and how we use the pieces of actual code that follow.

Figure 3. Starting the Mapper jobs and retrieving data

In step 1 we determine which invocation gets to be the query coordinator (QC). For this we use a simple mechanism that writes records with the same key value into a table. The first insert wins, and that invocation functions as the query coordinator for the process. Note that the QC table function invocation plays a processing role as well. A minimal sketch of this arbitration follows.
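The standalone sketch below illustrates the insert-wins pattern; it is assumed for illustration and is not part of the paper's code. It uses the run_hdfs_read table created in the setup script later in this paper, and the key value 42 is hypothetical (the real code derives the key from the instance number and session id).

DECLARE
  i_am_qc BOOLEAN := TRUE;
BEGIN
  -- Every parallel slave attempts the same insert; the primary key on
  -- pk_id guarantees that exactly one of them succeeds.
  INSERT INTO run_hdfs_read VALUES (42, 'RUNNING');
  COMMIT;
EXCEPTION
  WHEN DUP_VAL_ON_INDEX THEN
    i_am_qc := FALSE;  -- a sibling already won; do not submit the job
END;
/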
In step 2 this table function invocation (the QC) starts an asynchronous job using DBMS_SCHEDULER (the Job Controller in Figure 3), which then runs the synchronous bash script on the Hadoop cluster. This bash script, called the launcher in Figure 3, starts the mapper processes (step 3) on the Hadoop cluster.

The mapper processes process data and write it to a queue in step 5. In the current example we chose a queue because it is available cluster-wide. For now we simply write any output directly into the queue; you can achieve better performance by batching up the outputs and then moving them into the queue, as sketched below. Obviously, you can choose various other delivery mechanisms, including pipes and relational tables.
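One possible shape for that batching is sketched here. This is an assumption layered on the JDBC mapper shown later in this paper, not code from the original; BATCH_SIZE and the helper name are made up. Note that some Oracle JDBC driver versions execute batched callable statements one call at a time, in which case binding a whole collection to a single enqueue call is the more effective variant.

// Hypothetical batched variant of the mapper's enqueue loop; pstmt is
// the CallableStatement prepared on the enqueue PL/SQL block and stdin
// the reader from the StreamingEq mapper shown later.
static void enqueueBatched(java.io.BufferedReader stdin,
                           java.sql.CallableStatement pstmt) throws Exception {
  final int BATCH_SIZE = 100;            // assumed tuning knob
  int pending = 0;
  String line;
  while ((line = stdin.readLine()) != null) {
    String[] data = line.split(",");
    if (data.length != 2) continue;
    pstmt.setInt(1, Integer.parseInt(data[0]));
    pstmt.setInt(2, Integer.parseInt(data[1]));
    pstmt.addBatch();                    // buffer locally instead of executing
    if (++pending == BATCH_SIZE) {
      pstmt.executeBatch();              // ship the whole batch in one call
      pending = 0;
    }
  }
  if (pending > 0) pstmt.executeBatch(); // flush the remainder
}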
Step 6 is the dequeuing process, which is done by parallel invocations of the table function running in the database. As these parallel invocations process data, it is served up to the query that requested it. The table function leverages both the Oracle data and the data from the queue, and thereby integrates data from both sources in a single result set for the end user(s).

Figure 4. Monitoring the process

After the Hadoop-side processes (the mappers) are kicked off, the job monitor process keeps an eye on the launcher script. Once the mappers have finished processing data on the Hadoop cluster, the bash script finishes, as is shown in Figure 4. The job monitor watches a database scheduler queue and will notice that the shell script has finished (step 7). The job monitor then checks the data queue for remaining data elements (step 8). As long as data is present in the queue, the table function invocations keep on processing that data (step 6).

Figure 5. Closing down the processing

When the queue is fully dequeued by the parallel invocations of the table function, the job monitor terminates the queue (step 9 in Figure 5), ensuring that the table function invocations in Oracle stop. At that point all the data has been delivered to the requesting query.
The solution implemented in Figure 3 through Figure 5 uses the following code. All code is tested on Oracle Database 11g and a 5-node Hadoop cluster. As with most whitepaper text, copy these scripts into a text editor and ensure the formatting is correct before running them.

This first script contains the setup components. For example, the initial part of the script shows the creation of the arbitrator table used in step 1 of Figure 3. In the example we use the always popular OE schema.

connect oe/oe

-- Table to use as locking mechanism for the hdfs reader as
-- leveraged in Figure 3 step 1
DROP TABLE run_hdfs_read;
CREATE TABLE run_hdfs_read
  (pk_id  number,
   status varchar2(100),
   primary key (pk_id));

-- Object type used for AQ that receives the data
CREATE OR REPLACE TYPE hadoop_row_obj AS OBJECT (
  a number,
  b number);
/

connect / as sysdba

-- System job to launch the external script. This job is used to
-- eventually run the bash script described in Figure 3 step 3.
CREATE OR REPLACE PROCEDURE launch_hadoop_job_async
  (in_directory IN VARCHAR2, id NUMBER) IS
  cnt number;
BEGIN
  -- Drop a leftover job of the same name, if any
  begin
    DBMS_SCHEDULER.DROP_JOB('ExtScript' || id, TRUE);
  exception when others then null;
  end;

  -- Run a script
  DBMS_SCHEDULER.CREATE_JOB(
    job_name            => 'ExtScript' || id,
    job_type            => 'EXECUTABLE',
    job_action          => '/bin/bash',
    number_of_arguments => 1);
  DBMS_SCHEDULER.SET_JOB_ARGUMENT_VALUE('ExtScript' || id, 1, in_directory);
  DBMS_SCHEDULER.ENABLE('ExtScript' || id);

  -- Wait till the job is done. This ensures the hadoop job is completed.
  loop
    select count(*) into cnt
      from DBA_SCHEDULER_JOBS
     where job_name = 'EXTSCRIPT' || id;
    dbms_output.put_line('scheduler Count is ' || cnt);
    if (cnt = 0) then
      exit;
    else
      dbms_lock.sleep(5);
    end if;
  end loop;

  -- Wait till the queue is empty and then drop it, as shown in Figure 5.
  -- The table function will get an exception and finish quietly.
  loop
    select sum(c) into cnt
      from (select enqueued_msgs - dequeued_msgs c
              from gv$persistent_queues
             where queue_name = 'HADOOP_MR_QUEUE'
            union all
            select num_msgs + spill_msgs c
              from gv$buffered_queues
             where queue_name = 'HADOOP_MR_QUEUE'
            union all
            select 0 c from dual);
    if (cnt = 0) then
      -- Queue is done. Stop and drop it.
      DBMS_AQADM.STOP_QUEUE('HADOOP_MR_QUEUE');
      DBMS_AQADM.DROP_QUEUE('HADOOP_MR_QUEUE');
      return;
    else
      -- Wait for a while
      dbms_lock.sleep(5);
    end if;
  end loop;
END;
/

-- Grants needed to make the hadoop reader package work
grant execute on launch_hadoop_job_async to oe;
grant select on v_$session to oe;
grant select on v_$instance to oe;
grant select on v_$px_process to oe;
grant execute on dbms_aqadm to oe;
grant execute on dbms_aq to oe;

connect oe/oe

-- Simple reader package to read a file containing two numbers
CREATE OR REPLACE PACKAGE hdfs_reader IS
  -- Return type of the pl/sql table function
  TYPE return_rows_t IS TABLE OF hadoop_row_obj;

  -- Checks if the current invocation is serial
  FUNCTION is_serial RETURN BOOLEAN;

  -- Function to actually launch a Hadoop job
  FUNCTION launch_hadoop_job(in_directory IN VARCHAR2, id IN OUT NUMBER)
    RETURN BOOLEAN;

  -- Table function to read from Hadoop. This is the main processing
  -- code reading from the queue in Figure 3 step 6. It also contains
  -- the code to insert into the table in Figure 3 step 1.
  FUNCTION read_from_hdfs_file(pcur IN SYS_REFCURSOR, in_directory IN VARCHAR2)
    RETURN return_rows_t
    PIPELINED PARALLEL_ENABLE(PARTITION pcur BY ANY);
END;
/
CREATE OR REPLACE PACKAGE BODY hdfs_reader IS

  -- Checks if the current process is a px_process (a parallel slave)
  FUNCTION is_serial RETURN BOOLEAN IS
    c NUMBER;
  BEGIN
    SELECT COUNT(*) INTO c
      FROM v$px_process
     WHERE sid = SYS_CONTEXT('USERENV', 'SESSIONID');
    IF c <> 0 THEN
      RETURN false;
    ELSE
      RETURN true;
    END IF;
  exception when others then
    RAISE;
  END;

  FUNCTION launch_hadoop_job(in_directory IN VARCHAR2, id IN OUT NUMBER)
    RETURN BOOLEAN IS
    PRAGMA AUTONOMOUS_TRANSACTION;
    instance_id NUMBER;
    jname       varchar2(4000);
  BEGIN
    if is_serial then
      -- Get an id by mixing the instance number and the session id
      id := SYS_CONTEXT('USERENV', 'SESSIONID');
      SELECT instance_number INTO instance_id FROM v$instance;
      id := instance_id * id;
    else
      -- Get the id of the QC
      SELECT ownerid INTO id
        FROM v$session
       WHERE sid = SYS_CONTEXT('USERENV', 'SESSIONID');
    end if;

    -- Create a row to 'lock' it so only one invocation schedules the
    -- job. Everyone else will get an exception.
    -- This is Figure 3 step 1.
    INSERT INTO run_hdfs_read VALUES (id, 'RUNNING');

    jname := 'Launch_hadoop_job_async';
    -- Launch a job to start the hadoop job
    DBMS_SCHEDULER.CREATE_JOB(
      job_name            => jname,
      job_type            => 'STORED_PROCEDURE',
      job_action          => 'sys.launch_hadoop_job_async',
      number_of_arguments => 2);
    DBMS_SCHEDULER.SET_JOB_ARGUMENT_VALUE(jname, 1, in_directory);
    DBMS_SCHEDULER.SET_JOB_ARGUMENT_VALUE(jname, 2, TO_CHAR(id));
    DBMS_SCHEDULER.ENABLE('Launch_hadoop_job_async');
    COMMIT;
    RETURN true;
  EXCEPTION
    -- One of my siblings launched the job. Get out quietly.
    WHEN dup_val_on_index THEN
      dbms_output.put_line('dup value exception');
      RETURN false;
    WHEN OTHERS THEN
      RAISE;
  END;

  FUNCTION read_from_hdfs_file(pcur IN SYS_REFCURSOR, in_directory IN VARCHAR2)
    RETURN return_rows_t
    PIPELINED PARALLEL_ENABLE(PARTITION pcur BY ANY) IS
    PRAGMA AUTONOMOUS_TRANSACTION;
    cleanup BOOLEAN;
    payload hadoop_row_obj;
    id      NUMBER;
    dopt    dbms_aq.dequeue_options_t;
    mprop   dbms_aq.message_properties_t;
    msgid   raw(100);
  BEGIN
    -- Launch a job to kick off the hadoop job
    cleanup := launch_hadoop_job(in_directory, id);

    dopt.visibility    := DBMS_AQ.IMMEDIATE;
    dopt.delivery_mode := DBMS_AQ.BUFFERED;
    loop
      payload := NULL;
      -- Get the next row
      DBMS_AQ.DEQUEUE('HADOOP_MR_QUEUE', dopt, mprop, payload, msgid);
      commit;
      pipe row(payload);
    end loop;
  exception when others then
    -- The queue was dropped (Figure 5): clean up the lock row and finish
    if cleanup then
      delete run_hdfs_read where pk_id = id;
      commit;
    end if;
  END;

END;
/

This short script is the outside-of-the-database controller, as shown in Figure 3 steps 3 and 4. It is a synchronous step, remaining on the system for as long as the Hadoop mappers are running.

#!/bin/bash
cd $HADOOP_HOME
A="/net/scratch/java/jdk1.6.0_16/bin/java -classpath /home/hadoop:/home/hadoop/ojdbc6.jar StreamingEq"
bin/hadoop fs -rmr output
bin/hadoop jar ./contrib/streaming/hadoop-streaming.jar -input input/nolist.txt \
  -output output -mapper "$A" -jobconf mapred.reduce.tasks=0

For this example we wrote a small and simple mapper process to be executed on our Hadoop cluster. Undoubtedly more comprehensive mappers exist. This one converts a string into two numbers and provides those to the queue in a row-by-row fashion.
// Simplified mapper example for Hadoop cluster
import java.sql.*;
import oracle.jdbc.pool.*;

public class StreamingEq {

  public static void main(String args[]) throws Exception {
    // Connect through the thin JDBC driver
    OracleDataSource ods = new OracleDataSource();
    Connection conn = null;
    ResultSet rset = null;
    java.io.BufferedReader stdin;
    String line;
    int countArr = 0;

    try {
      ods.setURL("jdbc:oracle:thin:oe/oe@$ORACLE_INSTANCE");
      conn = ods.getConnection();
    } catch (Exception e) {
      System.out.println("Got exception " + e);
      if (rset != null) rset.close();
      if (conn != null) conn.close();
      System.exit(0);
      return;
    }
    System.out.println("connection works conn " + conn);
    stdin = new java.io.BufferedReader(new java.io.InputStreamReader(System.in));

    // Anonymous PL/SQL block that enqueues one hadoop_row_obj per call
    String query =
        "declare "
      + "  dopt  dbms_aq.enqueue_options_t; "
      + "  mprop dbms_aq.message_properties_t; "
      + "  msgid raw(100); "
      + "begin "
      + "  dopt.visibility := DBMS_AQ.IMMEDIATE; "
      + "  dopt.delivery_mode := DBMS_AQ.BUFFERED; "
      + "  dbms_aq.enqueue('HADOOP_MR_QUEUE', dopt, mprop, "
      + "                  hadoop_row_obj(?, ?), msgid); "
      + "  commit; "
      + "end;";

    CallableStatement pstmt = null;
    try {
      pstmt = conn.prepareCall(query);
    } catch (Exception e) {
      System.out.println("Got exception " + e);
      if (rset != null) rset.close();
      if (conn != null) conn.close();
    }

    // Read stdin line by line, convert each line into two numbers,
    // and enqueue them
    while ((line = stdin.readLine()) != null) {
      if (line.replaceAll(" ", "").length() < 2) continue;
      String[] data = line.split(",");
      if (data.length != 2) continue;
      countArr++;
      pstmt.setInt(1, Integer.parseInt(data[0]));
      pstmt.setInt(2, Integer.parseInt(data[1]));
      pstmt.executeUpdate();
    }

    if (conn != null) conn.close();
    System.exit(0);
  }
}

An example of running a select on the system as described above, leveraging the table function processing, would be:

-- Set up phase for the data queue
execute DBMS_AQADM.CREATE_QUEUE_TABLE('HADOOP_MR_QUEUE', 'HADOOP_ROW_OBJ');
execute DBMS_AQADM.CREATE_QUEUE('HADOOP_MR_QUEUE', 'HADOOP_MR_QUEUE');
execute DBMS_AQADM.START_QUEUE('HADOOP_MR_QUEUE');

-- Query being executed by an end user or BI tool.
-- Note the hint is not a real hint, but a comment
-- to clarify why we use the cursor.
select max(a), avg(b)
  from table(hdfs_reader.read_from_hdfs_file(
         cursor(select /*+ FAKE cursor to kick start parallelism */ * from orders),
         '/home/hadoop/eq_test4.sh'));
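Because the table function returns ordinary rows, the Hadoop-sourced data can also be joined with regular Oracle data in the same statement. A hypothetical variation follows; the join column and grouping are assumptions about the mapper's output, not part of the original example.

select o.customer_id, sum(h.b)
  from orders o,
       table(hdfs_reader.read_from_hdfs_file(
               cursor(select * from orders),
               '/home/hadoop/eq_test4.sh')) h
 where o.order_id = h.a
 group by o.customer_id;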
As the examples in this paper show, integrating a Hadoop system with Oracle Database 11g is straightforward. The approaches discussed here allow customers to stream data directly from Hadoop into an Oracle query, avoiding the need to first fetch the data into a local file system or materialize it into an Oracle table before accessing it in a SQL query.