Programme SURF Research Boot Camp 21 April 2016 Registration at
Time: h
PROGRAMMING: Introduction to UNIX. Jeroen Engelberts, SURFsara
DATA MANAGEMENT: Data cleaning with OpenRefine. Mateusz Kuzak, Netherlands eScience Center
COMPUTE: Introduction to compute infrastructures. Jan Bot, SURFsara
MISCELLANEOUS: Good practices for research data sharing and transfer. Niek Bosch, SURFsara; Paul van Dijk, SURFnet
BIG DATA part 1: Scalable data analysis with Apache Spark and Hadoop. Mathijs Kattenberg, Jeroen Schot & Machiel Jansen, SURFsara

Time: 12:30-14:30 h
PROGRAMMING: Python crash course. Ad Thiers & Maurice Verheesen, AT Computing
DATA MANAGEMENT part 1: PID and iRODS web tools to manage and share research data. Ton Smeele, Utrecht University; Christine Staiger, SURFsara
COMPUTE: High performance computing in the cloud. Markus van Dijk & Ander Astudillo, SURFsara
MISCELLANEOUS: Local and remote data visualisation. Paul Melis & Casper van Leeuwen, SURFsara
BIG DATA part 2: see BIG DATA part 1

Time: 15:00-17:00 h
PROGRAMMING: Scientific computing with Python. Ad Thiers & Maurice Verheesen, AT Computing
DATA MANAGEMENT part 2: PID and iRODS command line tools to securely store and manage research data. Ton Smeele, Utrecht University; Christine Staiger, SURFsara
COMPUTE: Cluster computing. Jeroen Engelberts, SURFsara
MISCELLANEOUS: Version control with Git. Carlos Martinez Ortiz, Netherlands eScience Center
BIG DATA part 3: see BIG DATA part 1

PROGRAMMING TRACK

# Introduction to UNIX
Time: h
Jeroen Engelberts, SURFsara

In this part of the Programming track, you will learn the history of UNIX and get acquainted with some basic commands you will need to start using data, compute, and network facilities.

At the end of this session, you will be able to:
o work with a UNIX terminal/shell
o use some basic UNIX commands
o log in to a UNIX cluster
o follow the Introduction to Cluster Computing and Introduction to HPC Cloud sessions from the Compute track

Prerequisites: Some basic knowledge of working with a computer.

Software needed: Bring along your own laptop. Make sure that you have installed Firefox on your laptop. Please install the FireSSH plugin, which can be obtained free of charge here: US/firefox/addon/firessh/ You will receive a username and password to log in to a UNIX system.

# Python crash course
Time: 12:30-14:30 h
Ad Thiers & Maurice Verheesen, AT Computing

In this session you will learn the basics of the Python programming language. In the last couple of years, Python has received much attention, specifically in the realm of technical computing. Its benefits (superb integration with existing C, C++ and Fortran code, and a very simple but powerful syntax) obviously contribute to this popularity. This workshop will cover basic language constructs like loops, if-then-else statements and exceptions, and it will briefly deal with data types.
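
To give a flavour of the constructs named in the description, here is a minimal sketch. It is not taken from the workshop material; the function name `classify_readings`, the threshold and the sample values are hypothetical. It combines a loop, an if/else statement, exception handling and a few built-in data types (strings, floats, lists and a dictionary).

```python
def classify_readings(raw_values, threshold=10.0):
    """Sort raw text readings into 'low', 'high' and 'invalid' groups.

    Hypothetical example: names and threshold are illustrative only.
    """
    # A dictionary that maps a group name (str) to a list of values.
    result = {"low": [], "high": [], "invalid": []}

    for raw in raw_values:  # loop over a list of strings
        try:
            value = float(raw)  # raises ValueError for text like "n/a"
        except ValueError:
            result["invalid"].append(raw)
            continue  # skip to the next reading

        if value < threshold:
            result["low"].append(value)
        else:
            result["high"].append(value)

    return result


if __name__ == "__main__":
    groups = classify_readings(["3.5", "12", "n/a", "10"])
    for name, values in groups.items():
        print(name, values)
```

Reading the function from top to bottom covers most of what a first script needs: iterate over input, convert it, handle the cases where conversion fails, and branch on the result.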

At the end of this session, you will be able to:
o read and understand existing Python code
o create a simple Python script
o determine whether Python should be part of your 'research toolbox'

Prerequisites: This workshop assumes some programming / scripting familiarity.

Software needed: Bring your own laptop. A basic Python installation is required. This can be downloaded from [link missing]. A more complete installation for scientific computing with Python is Anaconda, available from [link missing]. Be sure to install the Python 3 versions.

# Scientific computing with Python
Time: 15:00-17:00 h
Ad Thiers & Maurice Verheesen, AT Computing

You will be introduced to the world of scientific Python. Plain Python, due to its versatility, is not well suited for number crunching. There is, however, a wealth of Python software available (modules and packages) allowing you to perform computationally intensive jobs. All these packages are built around NumPy and matplotlib. Their basic functionality (working with n-dimensional arrays and visualizing data) will be covered in this workshop to some level. You will also get an overview of the major modules in SciPy, an enormous toolkit for scientific calculations.

At the end of the track, you will be able to:
o create a simple NumPy script and manipulate arrays
o create plots with matplotlib
o find your way in the SciPy / NumPy toolkit

Prerequisites: This workshop assumes a basic knowledge of the Python programming language.

Software needed: Bring your own laptop. An installation for scientific computing with Python, with NumPy, SciPy and matplotlib, is required. The Anaconda distribution from [link missing] contains all prerequisites (and more). Be sure to install the Python 3 versions.
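
As an illustration of what "working with n-dimensional arrays" looks like in practice, the following minimal sketch assumes NumPy is installed (for example through Anaconda). The function name `column_zscores` and the sample data are hypothetical and not part of the workshop material; plotting with matplotlib is left out to keep the example short.

```python
import numpy as np


def column_zscores(data):
    """Standardise each column of a 2-D array to zero mean and unit spread.

    Hypothetical helper for illustration. Assumes no column is constant,
    otherwise the division below would divide by zero.
    """
    data = np.asarray(data, dtype=float)
    mean = data.mean(axis=0)  # one mean per column
    std = data.std(axis=0)    # one standard deviation per column
    # Broadcasting: the 1-D mean and std are applied to every row
    # without an explicit Python loop.
    return (data - mean) / std


if __name__ == "__main__":
    samples = np.array([[1.0, 10.0],
                        [2.0, 20.0],
                        [3.0, 30.0]])
    print(samples.shape)   # array dimensions as (rows, columns)
    print(samples[:, 1])   # slicing: the second column
    print(column_zscores(samples))
```

The whole-array expression in `column_zscores` replaces a nested loop over rows and columns, which is the main reason NumPy is faster than plain Python for number crunching.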

DATA MANAGEMENT TRACK

# Data cleaning with OpenRefine
Time: h
Mateusz Kuzak, Netherlands eScience Center

OpenRefine is a powerful tool for working with messy data, e.g. to:
o clean data
o transform data from one format into another
o extend it with web services and external data

With OpenRefine you will get a better picture of your dataset. You will learn how to use the faceting, clustering and filtering features and how to correct errors in a dataset. You will also learn how to find and remove whitespace errors and how to split columns. In addition, you will learn how to move forward and backward on the timeline of changes you applied to the dataset and how to script changes for future reuse. At the end, you will export a clean dataset to a new file. See also: [link missing] for introduction videos on OpenRefine.

At the end of this session, you will be able to:
o import tabular data to OpenRefine
o find and correct errors in the dataset
o find and clean whitespace errors for whole columns
o script cleaning steps

Prerequisites: This workshop assumes only familiarity with tabular data formats, like comma- or tab-separated files.

Software needed: Bring your own laptop. You will need the Firefox web browser installed and the OpenRefine browser plugin. Here is the guide to install OpenRefine: Instructions

# Data management part 1: PID and iRODS web tools to manage and share research data
Time: 12:30-14:30 h
Ton Smeele, Utrecht University; Christine Staiger, SURFsara

The workshop explains the services of EUDAT, the European common data infrastructure, including persistent identifiers. You will use web tools such as B2FIND to discover research data sets, B2SHARE to publish your own research data and the Handle/EPIC websites to inspect persistent identifiers. In addition, you will securely store and manage your data using web tools to work with an iRODS data grid.

At the end of the workshop, you will be able to:
o understand the common EUDAT B2 suite data services
o use web tools to find and share research data sets across Europe
o use web tools to store and manage research data
o understand the concept of persistent identifiers for research data and know how to use them

Prerequisites: This is an introductory workshop open to all disciplines.

Software needed: No specific software has to be installed. Bring your own laptop.

# Data management part 2: PID and iRODS command line tools to securely store and manage research data
Time: 15:00-17:00 h
Ton Smeele, Utrecht University; Christine Staiger, SURFsara

During the workshop you will create persistent identifiers for your data. The workshop also introduces the key functions of a data grid and allows you to experiment with an existing iRODS grid: save files to the grid and retrieve them again, use advanced search techniques to find your files based on their metadata context, and build a pipeline to automate (post)processing of data files.

At the end of the workshop, you will be able to:
o understand how to create EPIC persistent identifiers using Python programs
o understand the benefits of using data grids for storing research data
o understand the architecture of data grids
o use the basic set of commands to interact with a data grid
o automate pipelines using a data grid

Prerequisites: This workshop assumes some basic familiarity with the terminal command line (Linux or DOS).
It also assumes the participant is familiar with the general concept of persistent identifiers such as DOI and EPIC (these concepts are introduced in the workshop "Data management part 1: PID and iRODS web tools to manage and share research data"). While knowledge of the Python programming language is an advantage, it is not a prerequisite.

Software needed: Bring your own laptop. SSH/PuTTY tools are required in order to access the data grid server.
o Linux and Mac users don't have to install anything; an SSH client is installed.
o Windows users: download and install PuTTY or SSH US/firefox/addon/firessh/

COMPUTE TRACK

# Introduction to compute infrastructures
Time: h
Jan Bot, SURFsara

We will provide you with a basic understanding of the different compute infrastructures that are available and whether you should consider using one. This module is a prerequisite for the cluster and cloud compute hands-on sessions. We will explain in which situations these infrastructures are useful and provide you with enough background knowledge to decide which infrastructure is best suited for your research.

At the end of this session, you will be able to:
o choose between the different computational infrastructures
o understand the basics of how cluster computing systems work
o understand the basics of how cloud computing works

Prerequisites: None

Software needed: None

# High performance computing in the cloud
Time: 12:30-14:30 h
Ander Astudillo & Markus van Dijk, SURFsara

Computing in the cloud gives you flexible and easy access to computing and data resources that you would otherwise have to host yourself. SURFsara runs the HPC Cloud, providing an Infrastructure as a Service (IaaS) model (as will be explained in the workshop "Introduction to High Performance Computing"). This workshop provides a general introduction to cloud computing, teaches the characteristics of the HPC Cloud and shows how to use it hands-on.

At the end of the workshop, you will be able to:
o use the HPC Cloud
o understand and apply different scaling models for parallel computing
o build (clusters of) Virtual Machines

Prerequisites: This workshop assumes familiarity with the Unix command line and SSH (can be learned from Introduction to UNIX, in the first hour of the Programming track).

Software needed: Bring your own laptop with a browser (Chrome or Firefox will do fine) and an SSH client:
o Linux and Mac users don't have to install anything; an SSH client is installed.
o Windows users: download and install "git for windows": for windows.github.io

# Cluster computing
Time: 15:00-17:00 h
Jeroen Engelberts, SURFsara

In this part of the Compute track, you will learn how the national cluster Lisa and the national supercomputer Cartesius are set up. This presentation will be followed by a hands-on session with some small and easy-to-follow examples on both systems.

At the end of this session, you will be able to:
o log in to a UNIX cluster
o prepare, submit and analyze a batch job on the national cluster Lisa / supercomputer Cartesius

Prerequisites: Some basic knowledge of the UNIX operating system (can be learned from Introduction to UNIX, in the first hour of the Programming track).

Software needed: Bring your laptop. Make sure that you have installed Firefox on your laptop. Please install the FireSSH plugin, which can be obtained free of charge here: US/firefox/addon/firessh/

MISCELLANEOUS TRACK

# Good practices for research data sharing and transfer
Time: h
Niek Bosch, SURFsara; Paul van Dijk, SURFnet

You will obtain some basic knowledge of tooling and protocols that will help you to share your data fast, securely and easily! Transferring data to colleagues worldwide can sometimes be a real hassle. Whether you are dealing with portable hard disks or transfers via the internet, slow transfer times often occur when using suboptimal protocols or solutions. By following some simple guidelines and tricks, these problems will soon belong to the past. We will also touch on topics like legal requirements, encryption of data and the prevention of research data disasters.

At the end of the track, you will be able to:
o assess which tools meet legal guidelines
o choose the most suitable file systems and protocols to transfer data
o select suitable data sharing tools and services and know their pros and cons
o use basic encryption tools like PGP

Prerequisites: None

Software needed: No specific software has to be installed. Bring your own laptop.

# Local and remote data visualisation
Time: 12:30-14:30 h
Paul Melis & Casper van Leeuwen, SURFsara

Visualisation can play an important role in research, but also in communicating research and results to stakeholders. We will give a practical introduction to both scientific visualisation and information visualisation using open source tools (ParaView, Jupyter, matplotlib).

At the end of this session, you will be able to:
o give an overview of different visualisation domains
o create basic scientific visualisations using ParaView
o perform basic data visualisation with matplotlib and Jupyter

Prerequisites: No specific prior knowledge is needed.

Software needed: Bring your own laptop, with a web browser (preferably Chrome). ParaView (version 5.0) needs to be installed in advance to work on the exercises. It is available for Windows, Mac OS X and Linux. See: [link missing]

# Version control with Git
Time: 15:00-17:00 h
Carlos Martinez Ortiz, Netherlands eScience Center

Version control is extremely useful for managing collaborative work. Nothing committed to version control (git) will ever be lost; it is always possible to go back in time to see exactly who wrote what on a particular day. When several people collaborate on the same project, git automatically notifies users in case of a conflict between changes made by two people. Lone researchers can also benefit immensely from keeping a record of what was changed, when, and why. Git is extremely useful for all researchers if they ever need to come back to the project later on (e.g., a year later, when memory has faded).

At the end of this session, you will be able to:
o set up git for tracking changes in a project
o use git for working in parallel on the same set of files
o go back to previous versions of a document

Prerequisites: Familiarity with working in the command line (either in Windows, OS X or Linux) is recommended.

Software needed: Installation of a git client will be required: scm.com/downloads or for windows.github.io/ Creating a GitHub account is also advisable.

BIG DATA TRACK

# Scalable data analysis with Apache Spark and Hadoop
Time: :00 h
Mathijs Kattenberg, Jeroen Schot & Machiel Jansen, SURFsara

You will be introduced to the Apache Hadoop and Spark frameworks for processing big data. These frameworks offer a novel way of creating data analysis applications that easily scale over hundreds to thousands of machines. This data-parallel approach has been pioneered in industry by tech companies such as Google and Facebook, and is applicable to many scientific workloads. We introduce you to the key concepts and features of the Apache Hadoop and Spark stacks. In addition, you will work on hands-on Spark exercises in a Jupyter notebook environment. The presentations, exercises and demos will provide a basic understanding of Hadoop and Spark and teach you about fundamental concepts in big data processing.

At the end of this session, you will be able to:
o understand Spark and Hadoop concepts and fundamentals
o understand requirements for scalable applications
o run and create basic Spark code in a notebook environment

Prerequisites: This workshop is for anyone who would like to get started with Apache Spark and Hadoop to build robust and scalable applications. You should be familiar with the basics of programming (preferably Python) and the Unix command line. Most scientific programmers and technically minded researchers will feel right at home.

Software needed: No specific software has to be installed. Bring your own laptop.
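
The data-parallel idea behind Hadoop and Spark can be illustrated without either framework. The sketch below is plain Python using only the standard library; it is not Spark or Hadoop code, and the function names and sample text are hypothetical. It splits the input into independent chunks, counts words per chunk (the "map" step) and merges the partial counts (the "reduce" step). On a real cluster each chunk would be handled by a different machine; here the chunks are simply processed one after another.

```python
from collections import Counter
from functools import reduce


def map_chunk(lines):
    """Map step: count the words in one chunk, independently of all others."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def reduce_counts(left, right):
    """Reduce step: merge two partial word counts into one."""
    return left + right


def word_count(lines, n_chunks=2):
    """Split the input into chunks, map each chunk, then reduce the results."""
    # Round-robin split; a framework would split by file blocks instead.
    chunks = [lines[i::n_chunks] for i in range(n_chunks)]
    # Each call is independent, so it could run on a separate machine.
    partials = [map_chunk(chunk) for chunk in chunks]
    return reduce(reduce_counts, partials, Counter())


if __name__ == "__main__":
    text = ["big data", "Big compute", "data data"]
    print(word_count(text))
```

Because the map step never looks outside its own chunk, adding machines only means creating more chunks, which is what lets this style of program scale.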
More informationWeb Class Configuration and Test Guide
Web Class Configuration and Test Guide Web class visual material is accessed via your web browser via the URL provided for each web class. The new Engage web class system supports most operating systems:
More informationCloudCIX Bootcamp. The essential IaaS getting started guide. http://www.cix.ie
The essential IaaS getting started guide. http://www.cix.ie Revision Date: 17 th August 2015 Contents Acronyms... 2 Table of Figures... 3 1 Welcome... 4 2 Architecture... 5 3 Getting Started... 6 3.1 Login
More informationThe big data revolution
The big data revolution Friso van Vollenhoven (Xebia) Enterprise NoSQL Recently, there has been a lot of buzz about the NoSQL movement, a collection of related technologies mostly concerned with storing
More informationLSKA 2010 Survey Report Job Scheduler
LSKA 2010 Survey Report Job Scheduler Graduate Institute of Communication Engineering {r98942067, r98942112}@ntu.edu.tw March 31, 2010 1. Motivation Recently, the computing becomes much more complex. However,
More informationDevOps Course Content
DevOps Course Content INTRODUCTION TO DEVOPS What is DevOps? History of DevOps Dev and Ops DevOps definitions DevOps and Software Development Life Cycle DevOps main objectives Infrastructure As A Code
More informationDocDokuPLM Innovative PLM solution
PLM DocDokuPLM Innovative PLM solution DocDokuPLM: a business solution Manage the entire lifecycle of your products from ideas to market and setup your information backbone. DocDokuPLM highlights Anywhere
More informationAPP DEV. We build your ideas into web and mobile applications. steicho. Technological Solutions
We build your ideas into web and mobile applications. steicho Technological Solutions Automate your processes, through a commercial custom made application We offer software solutions to automate, streamline,
More informationPROGRAMMING FOR BIOLOGISTS. BIOL 6297 Monday, Wednesday 10 am -12 pm
PROGRAMMING FOR BIOLOGISTS BIOL 6297 Monday, Wednesday 10 am -12 pm Tomorrow is Ada Lovelace Day Ada Lovelace was the first person to write a computer program Today s Lecture Overview of the course Philosophy
More informationTips for getting started! with! Virtual Data Center!
Tips for getting started with Virtual Data Center Last Updated: 1 July 2014 Table of Contents Safe Swiss Cloud Self Service Control Panel 2 Please note the following about for demo accounts: 2 Add an Instance
More informationScientific Programming, Analysis, and Visualization with Python. Mteor 227 Fall 2015
Scientific Programming, Analysis, and Visualization with Python Mteor 227 Fall 2015 Python The Big Picture Interpreted General purpose, high-level Dynamically type Multi-paradigm Object-oriented Functional
More informationBuilding a Continuous Integration Pipeline with Docker
Building a Continuous Integration Pipeline with Docker August 2015 Table of Contents Overview 3 Architectural Overview and Required Components 3 Architectural Components 3 Workflow 4 Environment Prerequisites
More informationDeployment Guide: Unidesk and Hyper- V
TECHNICAL WHITE PAPER Deployment Guide: Unidesk and Hyper- V This document provides a high level overview of Unidesk 3.x and Remote Desktop Services. It covers how Unidesk works, an architectural overview
More informationABOUT TOOLS4EVER ABOUT DELOITTE RISK SERVICES
CONTENTS About Tools4ever... 3 About Deloitte Risk Services... 3 HelloID... 4 Microsoft Azure... 5 HelloID Security Architecture... 6 Scenarios... 8 SAML Identity Provider (IDP)... 8 Service Provider SAML
More informationSSH Connections MACs the MAC XTerm application can be used to create an ssh connection, no utility is needed.
Overview of MSU Compute Servers The DECS Linux based compute servers are well suited for programs that are too slow to run on typical desktop computers but do not require the power of supercomputers. The
More informationAutodesk Revit 2016 Product Line System Requirements and Recommendations
Autodesk Revit 2016 Product Line System Requirements and Recommendations Autodesk Revit 2016, Autodesk Revit Architecture 2016, Autodesk Revit MEP 2016, Autodesk Revit Structure 2016 Minimum: Entry-Level
More informationConnecting to the School of Computing Servers and Transferring Files
Connecting to the School of Computing Servers and Transferring Files Connecting This document will provide instructions on how to connect to the School of Computing s server. Connect Using a Mac or Linux
More informationPearson Onscreen Platform (POP) Using POP Offline testing system guide
Pearson Onscreen Platform (POP) Version 1.0 October 2014 02 What s in this guide? Contents 1 Before you start 2 Download a test 3 Play test 4 Upload response Read more Read more Read more Read more 03
More informationWeb Hosting. E-Mail Hosting. Cloud File Hosting. The Genio Group (214) 732-7411 info@thegeniogroup.com www.thegeniogroup.com
Web Hosting E-Mail Hosting Cloud File Hosting Genio Hosting Servers All of Genio s Hosting Servers run on Apple hardware running Mac OS X Server. Mac OS X Server leverages the computing power of 64-bit
More informationHow To Use Senior Systems Cloud Services
Senior Systems Cloud Services In this guide... Senior Systems Cloud Services 1 Cloud Services User Guide 2 Working In Your Cloud Environment 3 Cloud Profile Management Tool 6 How To Save Files 8 How To
More informationCourse 20533: Implementing Microsoft Azure Infrastructure Solutions
Course 20533: Implementing Microsoft Azure Infrastructure Solutions Overview About this course This course is aimed at experienced IT Professionals who currently administer their on-premises infrastructure.
More informationEnhanced Research Data Management and Publication with Globus
Enhanced Research Data Management and Publication with Globus Vas Vasiliadis Jim Pruyne Presented at OR2015 June 8, 2015 Presentations and other useful information available at globus.org/events/or2015/tutorial
More informationC T D W C O N F E R E N C E J U N E 1 7, 1 8 2 0 1 4 C O L L I E R A N D C L A Y S T E V E N S 1
C O L L I E R A N D C L A Y S T E V E N S 1 CHROMEBOOK C O L L I E R A N D C L A Y S T E V E N S 2 Overview Constant internet connection Synced to the cloud server so everything you do is automatically
More informationSSH to BeagleBone Black over USB
SSH to BeagleBone Black over USB Created by Simon Monk Last updated on 2015-06-01 12:50:09 PM EDT Guide Contents Guide Contents Overview You Will Need Preparation Installing Drivers (Windows) Installing
More informationVCL Access. VCL provides access to Linux and Windows 7 Virtual Machines. Users will only see those images that they are authorized to access.
What is VCL? VCL (Virtual Computer Lab) is a service running on servers in IIT s datacenter that enables users to schedule and connect to virtual desktops running specific academic software applications
More informationA Sales Strategy to Increase Function Bookings
A Sales Strategy to Increase Function Bookings It s Time to Start Selling Again! It s time to take on a sales oriented focus for the bowling business. Why? Most bowling centres have lost the art and the
More informationTableau Online. Understanding Data Updates
Tableau Online Understanding Data Updates Author: Francois Ajenstat July 2013 p2 Whether your data is in an on-premise database, a database, a data warehouse, a cloud application or an Excel file, you
More informationVirtual Machines and Cloud Cluster. Dan Thanh Ton University of Colorado Denver 2010 SIParCS Internship Mentor: Irfan Elahi
Virtual Machines and Cloud Cluster Dan Thanh Ton University of Colorado Denver 2010 SIParCS Internship Mentor: Irfan Elahi Overview Installed two operadng systems on one computer Installed two virtual
More informationAn Introduction to High Performance Computing in the Department
An Introduction to High Performance Computing in the Department Ashley Ford & Chris Jewell Department of Statistics University of Warwick October 30, 2012 1 Some Background 2 How is Buster used? 3 Software
More informationAPACHE WEB SERVER. Andri Mirzal, PhD N28-439-03
APACHE WEB SERVER Andri Mirzal, PhD N28-439-03 Introduction The Apache is an open source web server software program notable for playing a key role in the initial growth of the World Wide Web Typically
More informationOpen Cloud System. (Integration of Eucalyptus, Hadoop and AppScale into deployment of University Private Cloud)
Open Cloud System (Integration of Eucalyptus, Hadoop and into deployment of University Private Cloud) Thinn Thu Naing University of Computer Studies, Yangon 25 th October 2011 Open Cloud System University
More informationVMware vsphere Data Protection 6.1
VMware vsphere Data Protection 6.1 Technical Overview Revised August 10, 2015 Contents Introduction... 3 Architecture... 3 Deployment and Configuration... 5 Backup... 6 Application Backup... 6 Backup Data
More information