Introduction to Big Data Analysis for Scientists and Engineers
|
|
|
- Kerry Owen
- 10 years ago
- Views:
Transcription
1 Introduction to Big Data Analysis for Scientists and Engineers About this white paper: This paper was written by David C. Young, an employee of CSC. It was written as supplemental documentation for use by the HPC account holders at the Alabama Supercomputer Center (ASC). This was originally written in 2012, and updated in There have been a number of pushes over the years to make the most of existing data. These have come with corporate buzzwords like "Big Data", "Analytics", "Data Mining", and "Decision Support". Although the media has focused on the successes of data rich companies like Facebook, LinkedIn, Amazon, and Google, there are also some good ideas here that can be useful to people in science and engineering fields. The following is a general discussion of these ideas. For HPC users at ASC: The subdirectories underneath /opt/asn/doc/data_analysis contain a number of interesting examples. Commercial Examples Companies try to use the data available to them to improve their business. Google attempts to identify when a search term is usually looking for a local business and will thus show results for businesses near you. LinkedIn improved their volume by suggesting "People You May Know". Amazon suggests package deals of bundling product A with product B, or telling you what other products were purchased by people that got the one you just added to your cart. These simple improvements have been responsible for millions if not billions of dollars of additional revenue. In addition to the customer-facing examples mentioned in the previous paragraph, many companies analyze their old sales data to see how their business has historically fluctuated with their pricing, the economy, and their competitor s activities. This analysis can be used for future planning, and current pricing. At the time this was being written, the current buzzword was Big Data. This movement points to Internet companies as examples, but it is more generally the use of available data, which is often unstructured data. Unstructured data is information that is not already organized into a relational database. It might be in text files, publicly available on the Internet, or (perhaps worst of all) in some proprietary format. Applying Big Data Analysis Ideas to Science and Engineering Certainly the use of statistics is nothing new to those in scientific and technical fields. What may be new to many is the use of a wealth of tools, technologies and data sources that are relative new comers to the world. For example, social scientists can now access an incredibly large amount of data generated on social network sites. While there is data there, it is not as neatly organized, controlled, or structured as a scientifically conducted survey. Thus the opportunity comes with a whole new set of issues to be addressed. 1
2 This document is meant to be an introduction to these ideas. If successful, it will give you some things to think about and some examples of how to perform various functions using tools available on many High Performance Computing (HPC) systems and available for desktop computers. Obtaining Data Most users of the HPC systems transfer files between their computer on campus and the login node with commands like "sftp" and "scp" or some Windows based graphic front end that runs these same protocols. Often people start with "sftp" as they are attracted to the interactive nature of it. A typical sftp session may involve typing the following commands sftp [email protected] type your password cd calc cd gaussian cd tests get test002.com get test002.log : : quit Within sftp, the process of getting each file could be made easier by typing a command like this. mget test002* This command results in getting all of the files that have names starting with test002 People that use systems heavily, particularly Linux and Macintosh systems, often switch to using "scp" in place of "sftp" most of the time. This is because it takes far less typing to transfer a large number of files. For example, the following command transfers a directory including all files, subdirectories, and files in subdirectories from a remote computer to the local system. scp -r [email protected]:~/calc/gausian/test. This command asks you to type your password, then transfers all of those files. The period indicates that the local copy of the test directory should be placed in the directory you are in when you type the command. 2
3 Example of using wget Not all data is sitting in your directories on machines where you have accounts. It is sometimes desirable to download files off of a web site. You could do this by clicking on files in your web browser, but this can get tedious if there are hundreds of files and impossible if there are tens of thousands. The following is an example of using the wget command to automate this task. For HPC users at ASC: The directory /opt/asn/doc/data_analysis/wget-example has sample files to accompany this wget example. However, you can generate the same files by following the steps below. wget is a Linux command for downloading files from the web. It is non-interactive, meaning that it can be put in a script or left to do it's work without having to type things or click on anything. To see the manual page for wget, type "man wget" For this example, let's say that you would like to analyze trends in the computational chemistry field by looking at all of the messages posted to the Computational Chemistry List. It is an list server, which as an archive of previous messages available on the web at The file and directory structure is not published, so the following paragraphs walk you through how to figure it out. If you browse to one of the archived messages from the list, you find that it has a URL like this The web page shows the message, which is the data you want, but it has other information and navigation links above it and to the left. You can download this web page with the command; wget This puts a file named message-new? in your directory. This is a text file, so you can look at the file contents with a number of Linux utilities such as "more" or "nano". This file is not what you really wanted. It defines the frames that divide the web page into a top section, left section, and the area with the message you really wanted. Looking at those frame tags, it would be a good guess that the message you really want is defined by the following tag. <frame name="message" src="/chemistry/resources/messages/1994/08/ dir/index.html" /> This has the location of a web page, but as a relative location in the directory tree with no "http" out front. You can add the site's address and try loading the resulting web address in a different tab of your web browser using a URL like this This gives the desired data in the web browser, so you can download that file with the command. wget This puts a file named index.html in your directory. Looking at this text file, you see that it does have the message text that you wanted, but also a bunch of html tags that you don't really want to look at. We'll discuss manipulating the contents of files in another example. 3
4 Now we want to get all of the messages. The simplest idea is to use the "-r" flag to wget to recursively get all of the messages, say from August of 1994, as shown below. WARNING: DO NOT RUN THIS COMMAND. wget -r This command does not do what you expected. It does get the messages posted to the CCL in August of 1994, but it also recursively follows the links in those html pages. By following those links, it downloads the index of messages in August of 1994, which is linked to the index of all months in all years, which is linked to the site main page with links to archives of other types of software, data, images, etc. Many hours and gigabytes later you will have downloaded everything on the CCL web site that has any link pointing to it. You also will have had information about what wget is doing continuously scrolling in the window. Taking the wget recursive flag a bit more cautiously, we add the "-np" flag which tells it not to go into the parent directories, thus preventing it from picking up the rest of it's own web site. We also add "-D ccl.net" to specify the domain so that it will not follow links to any other web sites and thus attempt to download the entire web. We also add "-o log.txt" to put all of those scrolling messages into a file named log.txt Now our command to download messages from August of 1994 is as follows. wget -r -np -D ccl.net -o log.txt NOTE: If you don't include the slash after 08 you will get all of 1994 because the syntax told it no parents of 1994 instead of no parents of 08. The command above will take a few minutes to download 22 megabytes of data. It maintains the directory structure by creating the directory tree but the 08 directory is the only one that contains something other than the one subdirectory. A bit of looking at the resulting files and directories shows that each message is in raw form with a name like , has a little html file pointing to it with a name like index and has the message with html tags included in a directory with a name like dir/index.html We can quickly remove the extra files and directories with a few command like this rm -r *dir rm *index* Now we have a directory with just the raw messages from August Our 22 MB of downloaded data has now been reduced to the 5 MB that we really wanted. Macintoshes come with "curl" in place of wget. Many Linux distributions include both wget and curl. There are many other utilities for transferring data over networks, each giving improvements in specific situations. Most notable are multithreaded transfer programs, which can move data significantly faster than wget. The trade off is that wget is designed to be robust and reliable even if various types of network problems are encountered. If you are moving large amounts of data every day for months, it might be worth investigating other tools, most of which do not come with the operating system. 4
5 Structuring Data The wget example shown above pulled down a bunch of messages from a list server devoted to computational chemistry. This data isn't tabulated, sorted, or annotated. The only existing structure consists of things common to all messages, which could be about anything. Linux is the dominant operating system for servers and high performance computing systems. One big reason for this is that the Linux command set provides an incredibly flexible toolbox that can be used to manipulate information, particularly data that is available as text in files or as the output from other programs. The following examples are based on working with the files in the 08 directory that was created in the wget example above. Examining File Types Change directories into the 08 subdirectory. First consider that files may contain text, programs, or binary data. Sometimes the file name suggests what is in the file, but that is not a guarantee. Also, there are thousands of file names that used for text files with various types of data in them. Use the Linux "file" command to see what type of files these are like this file * The "file" command looks inside the files and gives it's best description of what type of data is in them. Most say "ASCII mail text", but a few say "ASCII English text", or "ASCII Pascal program text", or simply "data". Take a look at the content of one of the files identified as "ASCII mail text". Use your favorite tool for viewing text files such as "more" or "nano". The first few lines of the file contain things you would expect in an message like "From:", "To:", "Date:", and "Subject:". These are followed by the typical text of an message. Now take a look at a few of the files that the "file" command identified as being something other than messages. These are also messages, but they are different somehow. Some don't contain all of the header information that most programs put on messages. Some have stray binary characters. If the message contained a programming example, the "file" program may have classified it as source code rather than . One reason for using the Linux data handling commands is that they can robustly handle this variation in input data. 5
6 Extracting Data Now let's do something with this data. Perhaps we want to know which software packages the computational chemists are talking about this month. Let's look for discussions about the Gaussian program. The "grep" command extracts lines that contain a given sequence of characters. For example, we can look for the word "Gaussian" with the command grep Gaussian * This shows lines from all of the files (the wild card "*" meaning all was put where we give grep a file name). However, our grep command did not show instances of the word "gaussian". We can look for both capitalizations like this. grep [Gg]aussian * The "grep" command supports a list of options and regular expression matching patterns. There are books and web sites with examples of using grep in ways that have never occurred to you. We may also want to look for instances of GAUSSIAN or other capitalizations. Case insensitive searching is common enough that grep has a special flag for it, like this. grep -i gaussian * If we want to determine how many lines contain the word Gaussian, we can combine this with the "wc" command, which stands for word count. Counting lines is done like this grep -i gaussian * wc -l This found 68 occurrences of the word Gaussian. The vertical bar (shift-backslash) is called a pipe and tells Linux to send the output from one command into the next command as it's input. We may want to distinguish between messages where Gaussian is the main topic of discussion from ones where it was mentioned in passing. In messages, the topic is usually on a line that starts with the word "Subject:". More specifically, the word "Subject:" should start at the left most letter of a line. The beginning of a line is denoted by a carat (^) like this. grep ^Subject: * grep -i gaussian wc -l Now we find out that five messages contained gaussian in the subject line. 6
7 Sorting Perhaps we would like to see the list of subjects discussed this month. We can extract the Subject lines like this grep ^Subject: * The first 16 characters of each line contain the file name and the word Subject: which is unneeded. Keep the information from the 17th character on in each line, like this. grep ^Subject: * cut -c17- These topics would be easier to look through if they were alphabetized like this. grep ^Subject: * cut -c17- sort Some topics such as "This is a test" are listed multiple times. We can remove the duplicates by asking the sort command for unique entries only with the following command. grep ^Subject: * cut -c17- sort -u Extracting Columns We may want to collect metrics on how long the messages are. We can see the file sizes in bytes with the command ls -l The file sizes are in the fifth column. The numbers in that column may be three, four, five, or more digits long. Thus the numbers don't always start at the same position, so the "cut" command is a poor choice. The awk more robustly handles variable numbers of white space characters, like this. ls -l awk '{print $5}' Unlike the "expr" command, awk can also be used to do floating point mathematics. For example, we might extract the file sizes, then compute the average file size like this. ls -l awk '{sum+=$5} END { print "Average = ",sum/nr}' awk is an interpretive programming language in it's own right. However, it is often seen filling in the gaps that other shell scripting functions can't handle. 7
8 Creating Files of Formatted Data The tr command translates one character into another. This can be a handy tool for reformatting data. For example, we can make a list of the files in the directory and their file sizes like this. ls -l awk '{print $8,$5}' The two columns are separated by a space. That space can be changed into a comma with the tr command like this. ls -l awk '{print $8,$5}' tr ' ' ',' Then the resulting information can be put into a file called files.csv like this ls -l awk '{print $8,$5}' tr ' ' ',' >files.csv The extension.csv is a common way of denoting the contents of the file as "comma separated values". Many programs, such as spreadsheets, can import data in this format. Visualization Once you have data, the first order of business is often to get some sort of understanding of the data. The term "visualization" refers to a wide range of techniques for displaying data as pictures or graphs of some sort. See the white paper "Getting Started with Visualization" for a discussion of various visualization techniques and software package used to generate those images. For HPC users at ASC: See the /opt/asn/doc/data_analysis/heatmap-example directory for an example of how geographically distributed data can be used to create a colorized map, in this case a map of the state of Alabama showing where the supercomputer users are located. This gives another example of extracting usable data from a data source that, at first glance, doesn't seem to give you the correct information. The Getting Stared with Visualization paper is at /opt/asn/doc/visualization/visualization.pdf Statistical Analysis Shell scripts are handy as a way to quickly do simple mathematical calculations. However, they aren't the best option for advanced mathematical and statistical analysis of data. For small data sets, a spreadsheet program such as Excel can be a very convenient way to generate statistical descriptions of rows or columns of data. However, spreadsheets have a limit on the number of rows and columns they can handle. Depending upon the version of Excel, the maximum number of rows ranges from 65,536 to over 1 million. 8
9 A popular tool is "R". R is an environment for data handling, storage, analysis and display (providing you logged in via X-Windows). R can be used interactively through a non-graphical ssh session. It can display a variety of graphs provided you logged in through and X-Window capable ssh session. It can also be run in batch mode and run through the queue system. R implements a programming language called "S". R can be used for a variety of programming tasks, but it's strength is it's utility for mathematical, and in particular statistical, analysis of data. R is well documented in the form of a number of pdf files. The R-intro.pdf document is a good place to start. These can be found on the R web site at For HPC users at ASC: The R documentation is in the directory /opt/asn/doc/r Octave is a public domain clone of Matlab. Octave will run Matlab.m files, but will not run precompiled Matlab programs and does not have the Matlab toolboxes. For HPC users at ASC: The octave documentation is in the directory /opt/asn/doc/octave R and Octave are interpretive languages. This means that they can be used interactively, and commands typed into them are converted into machine language for execution on the fly. The disadvantage of interpretive languages is that they run slower and require more memory than programs written with compiled languages, such as C++. We have had a case in the past where a researcher that had been using Matlab rewrote their program in C++ and found that it ran far faster and used 1/10 the memory, thus allowing them to do much larger simulations with the C++ version. When the performance of an interpretive language becomes unacceptable, it is time to switch to a compiled language, specifically C, C++, or FORTRAN. The current generation of compilers tends to give the fastest executing program with C++, although the difference between the three is fairly small. Further Information For a discussion of how big data is used to increase profits in the business world, see the October 2012 issue of the Harvard Business Review. Some examples of manipulating data with Unix commands are at There is much more that can be done simply with shell scripts calling programs that come with most Linux distributions. There are some more examples in Chapter 7 of the HPC User Manual at Also, see the following books. Burtch, Ken. Linux Shell Scripting with Bash. Indianapolis, IN, Sams Publishing Robbins, Arnold, Nelson Beebe. Classic Shell Scripting. Sebastopol, CA, O Reilly & Associates, Inc.,
10 Cooper, Mendel. Advanced Bash-Scripting Guide. Kochan, Stephen, Patrick Wood. Unix Shell Programming. Carmel, IN, Hayden Books, Some Linux commands are rich enough in functionality that there is an entire book devoted to that one command. There are also quite a few examples that can be found by a simple web search. However, it is highly advisable that you take the time to understand commands you find on web searches. Many will do something similar to what you want, but not exactly correct. 10
Introduction to UNIX and SFTP
Introduction to UNIX and SFTP Introduction to UNIX 1. What is it? 2. Philosophy and issues 3. Using UNIX 4. Files & folder structure 1. What is UNIX? UNIX is an Operating System (OS) All computers require
The Power Loader GUI
The Power Loader GUI (212) 405.1010 [email protected] Follow: @1010data www.1010data.com The Power Loader GUI Contents 2 Contents Pre-Load To-Do List... 3 Login to Power Loader... 4 Upload Data Files to
Hypercosm. Studio. www.hypercosm.com
Hypercosm Studio www.hypercosm.com Hypercosm Studio Guide 3 Revision: November 2005 Copyright 2005 Hypercosm LLC All rights reserved. Hypercosm, OMAR, Hypercosm 3D Player, and Hypercosm Studio are trademarks
Make a folder named Lab3. We will be using Unix redirection commands to create several output files in that folder.
CMSC 355 Lab 3 : Penetration Testing Tools Due: September 31, 2010 In the previous lab, we used some basic system administration tools to figure out which programs where running on a system and which files
NASA Workflow Tool. User Guide. September 29, 2010
NASA Workflow Tool User Guide September 29, 2010 NASA Workflow Tool User Guide 1. Overview 2. Getting Started Preparing the Environment 3. Using the NED Client Common Terminology Workflow Configuration
File Transfer Examples. Running commands on other computers and transferring files between computers
Running commands on other computers and transferring files between computers 1 1 Remote Login Login to remote computer and run programs on that computer Once logged in to remote computer, everything you
User Guide to the Content Analysis Tool
User Guide to the Content Analysis Tool User Guide To The Content Analysis Tool 1 Contents Introduction... 3 Setting Up a New Job... 3 The Dashboard... 7 Job Queue... 8 Completed Jobs List... 8 Job Details
MATLAB on EC2 Instructions Guide
MATLAB on EC2 Instructions Guide Contents Welcome to MATLAB on EC2...3 What You Need to Do...3 Requirements...3 1. MathWorks Account...4 1.1. Create a MathWorks Account...4 1.2. Associate License...4 2.
CEFNS Web Hosting a Guide for CS212
CEFNS Web Hosting a Guide for CS212 INTRODUCTION: TOOLS: In CS212, you will be learning the basics of web development. Therefore, you want to keep your tools to a minimum so that you understand how things
Unix Sampler. PEOPLE whoami id who
Unix Sampler PEOPLE whoami id who finger username hostname grep pattern /etc/passwd Learn about yourself. See who is logged on Find out about the person who has an account called username on this host
CE 504 Computational Hydrology Computational Environments and Tools Fritz R. Fiedler
CE 504 Computational Hydrology Computational Environments and Tools Fritz R. Fiedler 1) Operating systems a) Windows b) Unix and Linux c) Macintosh 2) Data manipulation tools a) Text Editors b) Spreadsheets
ICS 351: Today's plan
ICS 351: Today's plan routing protocols linux commands Routing protocols: overview maintaining the routing tables is very labor-intensive if done manually so routing tables are maintained automatically:
Efficiency of Web Based SAX XML Distributed Processing
Efficiency of Web Based SAX XML Distributed Processing R. Eggen Computer and Information Sciences Department University of North Florida Jacksonville, FL, USA A. Basic Computer and Information Sciences
EVALUATION ONLY. WA2088 WebSphere Application Server 8.5 Administration on Windows. Student Labs. Web Age Solutions Inc.
WA2088 WebSphere Application Server 8.5 Administration on Windows Student Labs Web Age Solutions Inc. Copyright 2013 Web Age Solutions Inc. 1 Table of Contents Directory Paths Used in Labs...3 Lab Notes...4
Linux Overview. Local facilities. Linux commands. The vi (gvim) editor
Linux Overview Local facilities Linux commands The vi (gvim) editor MobiLan This system consists of a number of laptop computers (Windows) connected to a wireless Local Area Network. You need to be careful
Getting Started with KompoZer
Getting Started with KompoZer Contents Web Publishing with KompoZer... 1 Objectives... 1 UNIX computer account... 1 Resources for learning more about WWW and HTML... 1 Introduction... 2 Publishing files
INASP: Effective Network Management Workshops
INASP: Effective Network Management Workshops Linux Familiarization and Commands (Exercises) Based on the materials developed by NSRC for AfNOG 2013, and reused with thanks. Adapted for the INASP Network
WWA FTP/SFTP CONNECTION GUIDE KNOW HOW TO CONNECT TO WWA USING FTP/SFTP
WWA FTP/SFTP CONNECTION GUIDE KNOW HOW TO CONNECT TO WWA USING FTP/SFTP Table OF Contents WWA FTP AND SFTP CONNECTION GUIDE... 3 What is FTP:... 3 What is SFTP:... 3 Connection to WWA VIA FTP:... 4 FTP
ithenticate User Manual
ithenticate User Manual Updated November 20, 2009 Contents Introduction 4 New Users 4 Logging In 4 Resetting Your Password 5 Changing Your Password or Username 6 The ithenticate Account Homepage 7 Main
DiskPulse DISK CHANGE MONITOR
DiskPulse DISK CHANGE MONITOR User Manual Version 7.9 Oct 2015 www.diskpulse.com [email protected] 1 1 DiskPulse Overview...3 2 DiskPulse Product Versions...5 3 Using Desktop Product Version...6 3.1 Product
1 Recommended Readings. 2 Resources Required. 3 Compiling and Running on Linux
CSC 482/582 Assignment #2 Securing SimpleWebServer Due: September 29, 2015 The goal of this assignment is to learn how to validate input securely. To this purpose, students will add a feature to upload
MobaXTerm: A good gnome-terminal like tabbed SSH client for Windows / Windows Putty Tabs Alternative
MobaXTerm: A good gnome-terminal like tabbed SSH client for Windows / Windows Putty Tabs Alternative Author : admin Last 10+ years I worked on GNU / Linux as Desktop. Last 7 years most of my SSH connections
Backup & Restore Guide
The email Integrity Company Sendio Email Security Platform Appliance Backup & Restore Guide Sendio, Inc. 4911 Birch St. Suite 150 Newport Beach, CA 92660 USA +1.949.274.4375 www.sendio.com 2010 Sendio,
Lesson 7 - Website Administration
Lesson 7 - Website Administration If you are hired as a web designer, your client will most likely expect you do more than just create their website. They will expect you to also know how to get their
Office of Information Technologies (OIT) Network File Shares
Office of Information Technologies (OIT) Network File Shares October 13, 2008 Contents 1 Introduction... 1 1.1 Quick Overview of Microsoft s Distributed File System (DFS)... 1 1.2 Web-based Distributed
Exclaimer Mail Archiver User Manual
User Manual www.exclaimer.com Contents GETTING STARTED... 8 Mail Archiver Overview... 9 Exchange Journaling... 9 Archive Stores... 9 Archiving Policies... 10 Search... 10 Managing Archived Messages...
Eucalyptus 3.4.2 User Console Guide
Eucalyptus 3.4.2 User Console Guide 2014-02-23 Eucalyptus Systems Eucalyptus Contents 2 Contents User Console Overview...4 Install the Eucalyptus User Console...5 Install on Centos / RHEL 6.3...5 Configure
Search help. More on Office.com: images templates
Page 1 of 14 Access 2010 Home > Access 2010 Help and How-to > Getting started Search help More on Office.com: images templates Access 2010: database tasks Here are some basic database tasks that you can
PuTTY/Cygwin Tutorial. By Ben Meister Written for CS 23, Winter 2007
PuTTY/Cygwin Tutorial By Ben Meister Written for CS 23, Winter 2007 This tutorial will show you how to set up and use PuTTY to connect to CS Department computers using SSH, and how to install and use the
Code Estimation Tools Directions for a Services Engagement
Code Estimation Tools Directions for a Services Engagement Summary Black Duck software provides two tools to calculate size, number, and category of files in a code base. This information is necessary
Hadoop Basics with InfoSphere BigInsights
An IBM Proof of Technology Hadoop Basics with InfoSphere BigInsights Part: 1 Exploring Hadoop Distributed File System An IBM Proof of Technology Catalog Number Copyright IBM Corporation, 2013 US Government
Web File Management with SSH Secure Shell 3.2.3
Web File Management with SSH Secure Shell 3.2.3 June 2003 Information Technologies Copyright 2003 University of Delaware. Permission to copy without fee all or part of this material is granted provided
CONNECTING TO DEPARTMENT OF COMPUTER SCIENCE SERVERS BOTH FROM ON AND OFF CAMPUS USING TUNNELING, PuTTY, AND VNC Client Utilities
CONNECTING TO DEPARTMENT OF COMPUTER SCIENCE SERVERS BOTH FROM ON AND OFF CAMPUS USING TUNNELING, PuTTY, AND VNC Client Utilities DNS name: turing.cs.montclair.edu -This server is the Departmental Server
Tutorial Guide to the IS Unix Service
Tutorial Guide to the IS Unix Service The aim of this guide is to help people to start using the facilities available on the Unix and Linux servers managed by Information Services. It refers in particular
W3Perl A free logfile analyzer
W3Perl A free logfile analyzer Features Works on Unix / Windows / Mac View last entries based on Perl scripts Web / FTP / Squid / Email servers Session tracking Others log format can be added easily Detailed
Web Hosting: Pipeline Program Technical Self Study Guide
Pipeline Program Technical Self Study Guide Thank you for your interest in InMotion Hosting and our Technical Support positions. Our technical support associates operate in a call center environment, assisting
MFCF Grad Session 2015
MFCF Grad Session 2015 Agenda Introduction Help Centre and requests Dept. Grad reps Linux clusters using R with MPI Remote applications Future computing direction Technical question and answer period MFCF
Adobe Marketing Cloud Using FTP and sftp with the Adobe Marketing Cloud
Adobe Marketing Cloud Using FTP and sftp with the Adobe Marketing Cloud Contents File Transfer Protocol...3 Setting Up and Using FTP Accounts Hosted by Adobe...3 SAINT...3 Data Sources...4 Data Connectors...5
Setting Up Dreamweaver for FTP and Site Management
518 442-3608 Setting Up Dreamweaver for FTP and Site Management This document explains how to set up Dreamweaver CS5.5 so that you can transfer your files to a hosting server. The information is applicable
Computing Service G72. File Transfer Using SCP, SFTP or FTP. many leaflets can be found at: http://www.cam.ac.uk/cs/docs
Computing Service G72 File Transfer Using SCP, SFTP or FTP many leaflets can be found at: http://www.cam.ac.uk/cs/docs 50 pence April 2003 Contents Introduction Servers and clients... 2 Comparison of FTP,
ithenticate User Manual
ithenticate User Manual Version: 2.0.2 Updated March 16, 2012 Contents Introduction 4 New Users 4 Logging In 4 Resetting Your Password 5 Changing Your Password or Username 6 The ithenticate Account Homepage
Cloud Server powered by Mac OS X. Getting Started Guide. Cloud Server. powered by Mac OS X. AKJZNAzsqknsxxkjnsjx Getting Started Guide Page 1
Getting Started Guide Cloud Server powered by Mac OS X Getting Started Guide Page 1 Getting Started Guide: Cloud Server powered by Mac OS X Version 1.0 (02.16.10) Copyright 2010 GoDaddy.com Software, Inc.
2 Advanced Session... Properties 3 Session profile... wizard. 5 Application... preferences. 3 ASCII / Binary... Transfer
Contents I Table of Contents Foreword 0 Part I SecEx Overview 3 1 What is SecEx...? 3 2 Quick start... 4 Part II Configuring SecEx 5 1 Session Profiles... 5 2 Advanced Session... Properties 6 3 Session
CASHNet Secure File Transfer Instructions
CASHNet Secure File Transfer Instructions Copyright 2009, 2010 Higher One Payments, Inc. CASHNet, CASHNet Business Office, CASHNet Commerce Center, CASHNet SMARTPAY and all related logos and designs are
Hands-on Network Traffic Analysis. 2015 Cyber Defense Boot Camp
Hands-on Network Traffic Analysis 2015 Cyber Defense Boot Camp What is this about? Prerequisite: network packet & packet analyzer: (header, data) Enveloped letters inside another envelope Exercises Basic
WS_FTP Professional 12
WS_FTP Professional 12 Tools Guide Contents CHAPTER 1 Introduction Ways to Automate Regular File Transfers...5 Check Transfer Status and Logs...6 Building a List of Files for Transfer...6 Transfer Files
7 Why Use Perl for CGI?
7 Why Use Perl for CGI? Perl is the de facto standard for CGI programming for a number of reasons, but perhaps the most important are: Socket Support: Perl makes it easy to create programs that interface
Prepare the environment Practical Part 1.1
Prepare the environment Practical Part 1.1 The first exercise should get you comfortable with the computer environment. I am going to assume that either you have some minimal experience with command line
ithenticate User Manual
ithenticate User Manual Version: 2.0.8 Updated February 4, 2014 Contents Introduction 4 New Users 4 Logging In 4 Resetting Your Password 5 Changing Your Password or Username 6 The ithenticate Account Homepage
Tutorial 2: Reading and Manipulating Files Jason Pienaar and Tom Miller
Tutorial 2: Reading and Manipulating Files Jason Pienaar and Tom Miller Most of you want to use R to analyze data. However, while R does have a data editor, other programs such as excel are often better
HPCC - Hrothgar Getting Started User Guide
HPCC - Hrothgar Getting Started User Guide Transfer files High Performance Computing Center Texas Tech University HPCC - Hrothgar 2 Table of Contents Transferring files... 3 1.1 Transferring files using
Cisco Networking Academy Program Curriculum Scope & Sequence. Fundamentals of UNIX version 2.0 (July, 2002)
Cisco Networking Academy Program Curriculum Scope & Sequence Fundamentals of UNIX version 2.0 (July, 2002) Course Description: Fundamentals of UNIX teaches you how to use the UNIX operating system and
Git - Working with Remote Repositories
Git - Working with Remote Repositories Handout New Concepts Working with remote Git repositories including setting up remote repositories, cloning remote repositories, and keeping local repositories in-sync
Using the AVR microcontroller based web server
1 of 7 http://tuxgraphics.org/electronics Using the AVR microcontroller based web server Abstract: There are two related articles which describe how to build the AVR web server discussed here: 1. 2. An
Bioinformatics Resources at a Glance
Bioinformatics Resources at a Glance A Note about FASTA Format There are MANY free bioinformatics tools available online. Bioinformaticists have developed a standard format for nucleotide and protein sequences
Penetration Testing Lab. Reconnaissance and Mapping Using Samurai-2.0
Penetration Testing Lab Reconnaissance and Mapping Using Samurai-2.0 Notes: 1. Be careful about running most of these tools against machines without permission. Even the poorest intrusion detection system
TECHNICAL REFERENCE GUIDE
TECHNICAL REFERENCE GUIDE SOURCE TARGET Kerio Microsoft Exchange/Outlook (PST) (versions 2010, 2007) Copyright 2014 by Transend Corporation EXECUTIVE SUMMARY This White Paper provides detailed information
File transfer clients manual File Delivery Services
File transfer clients manual File Delivery Services Publisher Post CH Ltd Information Technology Webergutstrasse 12 CH-3030 Berne (Zollikofen) Contact Post CH Ltd Information Technology Webergutstrasse
The Einstein Depot server
The Einstein Depot server Have you ever needed a way to transfer large files to colleagues? Or allow a colleague to send large files to you? Do you need to transfer files that are too big to be sent as
Terminal Server Guide
Terminal Server Guide Contents What is Terminal Server?... 2 How to use Terminal Server... 2 Remote Desktop Connection Client... 2 Logging in... 3 Important Security Information... 4 Logging Out... 4 Closing
Database Studio is the new tool to administrate SAP MaxDB database instances as of version 7.5.
1 2 3 4 Database Studio is the new tool to administrate SAP MaxDB database instances as of version 7.5. It replaces the previous tools Database Manager GUI and SQL Studio from SAP MaxDB version 7.7 onwards
Bootstrap guide for the File Station
Bootstrap guide for the File Station Introduction Through the File Server it is possible to store files and create automated backups on a reliable, redundant storage system. NOTE: this guide considers
Introduction to Linux and Cluster Basics for the CCR General Computing Cluster
Introduction to Linux and Cluster Basics for the CCR General Computing Cluster Cynthia Cornelius Center for Computational Research University at Buffalo, SUNY 701 Ellicott St Buffalo, NY 14203 Phone: 716-881-8959
How To Test Your Web Site On Wapt On A Pc Or Mac Or Mac (Or Mac) On A Mac Or Ipad Or Ipa (Or Ipa) On Pc Or Ipam (Or Pc Or Pc) On An Ip
Load testing with WAPT: Quick Start Guide This document describes step by step how to create a simple typical test for a web application, execute it and interpret the results. A brief insight is provided
SSL Tunnels. Introduction
SSL Tunnels Introduction As you probably know, SSL protects data communications by encrypting all data exchanged between a client and a server using cryptographic algorithms. This makes it very difficult,
FTP Service Reference
IceWarp Server FTP Service Reference Version 10 Printed on 12 August, 2009 i Contents FTP Service 1 V10 New Features... 2 FTP Access Mode... 2 FTP Synchronization... 2 FTP Service Node... 3 FTP Service
Learning Management System (LMS) Guide for Administrators
Learning Management System (LMS) Guide for Administrators www.corelearningonline.com Contents Core Learning Online LMS Guide for Administrators Overview...2 Section 1: Administrator Permissions...3 Assigning
User Guide. Making EasyBlog Your Perfect Blogging Tool
User Guide Making EasyBlog Your Perfect Blogging Tool Table of Contents CHAPTER 1: INSTALLING EASYBLOG 3 1. INSTALL EASYBLOG FROM JOOMLA. 3 2. INSTALL EASYBLOG FROM DIRECTORY. 4 CHAPTER 2: CREATING MENU
How to use the UNIX commands for incident handling. June 12, 2013 Koichiro (Sparky) Komiyama Sam Sasaki JPCERT Coordination Center, Japan
How to use the UNIX commands for incident handling June 12, 2013 Koichiro (Sparky) Komiyama Sam Sasaki JPCERT Coordination Center, Japan Agenda Training Environment Commands for incident handling network
Infoview XIR3. User Guide. 1 of 20
Infoview XIR3 User Guide 1 of 20 1. WHAT IS INFOVIEW?...3 2. LOGGING IN TO INFOVIEW...4 3. NAVIGATING THE INFOVIEW ENVIRONMENT...5 3.1. Home Page... 5 3.2. The Header Panel... 5 3.3. Workspace Panel...
Version control. with git and GitHub. Karl Broman. Biostatistics & Medical Informatics, UW Madison
Version control with git and GitHub Karl Broman Biostatistics & Medical Informatics, UW Madison kbroman.org github.com/kbroman @kwbroman Course web: kbroman.org/tools4rr Slides prepared with Sam Younkin
Getting Started with PRTG Network Monitor 2012 Paessler AG
Getting Started with PRTG Network Monitor 2012 Paessler AG All rights reserved. No parts of this work may be reproduced in any form or by any means graphic, electronic, or mechanical, including photocopying,
InfiniteInsight 6.5 sp4
End User Documentation Document Version: 1.0 2013-11-19 CUSTOMER InfiniteInsight 6.5 sp4 Toolkit User Guide Table of Contents Table of Contents About this Document 3 Common Steps 4 Selecting a Data Set...
MassTransit vs. FTP Comparison
MassTransit vs. Comparison If you think is an optimal solution for delivering digital files and assets important to the strategic business process, think again. is designed to be a simple utility for remote
AnzioWin FTP Dialog. AnzioWin version 15.0 and later
AnzioWin FTP Dialog AnzioWin version 15.0 and later With AnzioWin version 15.0, we have included an enhanced interactive FTP dialog that operates similar to Windows Explorer. The FTP dialog, shown below,
Version Control with. Ben Morgan
Version Control with Ben Morgan Developer Workflow Log what we did: Add foo support Edit Sources Add Files Compile and Test Logbook ======= 1. Initial version Logbook ======= 1. Initial version 2. Remove
Customer Control Panel Manual
Customer Control Panel Manual Contents Introduction... 2 Before you begin... 2 Logging in to the Control Panel... 2 Resetting your Control Panel password.... 3 Managing FTP... 4 FTP details for your website...
White paper: Google Analytics 12 steps to advanced setup for developers
White paper: Google Analytics 12 steps to advanced setup for developers We at Core work with a range of companies who come to us to advises them and manage their search and social requirements. Dr Jess
So today we shall continue our discussion on the search engines and web crawlers. (Refer Slide Time: 01:02)
Internet Technology Prof. Indranil Sengupta Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Lecture No #39 Search Engines and Web Crawler :: Part 2 So today we
DATA MINING TOOL FOR INTEGRATED COMPLAINT MANAGEMENT SYSTEM WEKA 3.6.7
DATA MINING TOOL FOR INTEGRATED COMPLAINT MANAGEMENT SYSTEM WEKA 3.6.7 UNDER THE GUIDANCE Dr. N.P. DHAVALE, DGM, INFINET Department SUBMITTED TO INSTITUTE FOR DEVELOPMENT AND RESEARCH IN BANKING TECHNOLOGY
Novell ZENworks Asset Management 7.5
Novell ZENworks Asset Management 7.5 w w w. n o v e l l. c o m October 2006 USING THE WEB CONSOLE Table Of Contents Getting Started with ZENworks Asset Management Web Console... 1 How to Get Started...
Linux command line. An introduction to the Linux command line for genomics. Susan Fairley
Linux command line An introduction to the Linux command line for genomics Susan Fairley Aims Introduce the command line Provide an awareness of basic functionality Illustrate with some examples Provide
JOOMLA 2.5 MANUAL WEBSITEDESIGN.CO.ZA
JOOMLA 2.5 MANUAL WEBSITEDESIGN.CO.ZA All information presented in the document has been acquired from http://docs.joomla.org to assist you with your website 1 JOOMLA 2.5 MANUAL WEBSITEDESIGN.CO.ZA BACK
FTP Guide - Main Document Secure File Transfer Protocol (SFTP) Instruction Guide
FTP Guide - Main Document Secure File Transfer Protocol (SFTP) Instruction Guide Version 5.8.0 November 2015 Prepared For: Defense Logistics Agency Prepared By: CACI Enterprise Solutions, Inc. 50 North
SSH and Basic Commands
SSH and Basic Commands In this tutorial we'll introduce you to SSH - a tool that allows you to send remote commands to your Web server - and show you some simple UNIX commands to help you manage your website.
MEDIAplus administration interface
MEDIAplus administration interface 1. MEDIAplus administration interface... 5 2. Basics of MEDIAplus administration... 8 2.1. Domains and administrators... 8 2.2. Programmes, modules and topics... 10 2.3.
Table of Contents. Introduction: 2. Settings: 6. Archive Email: 9. Search Email: 12. Browse Email: 16. Schedule Archiving: 18
MailSteward Manual Page 1 Table of Contents Introduction: 2 Settings: 6 Archive Email: 9 Search Email: 12 Browse Email: 16 Schedule Archiving: 18 Add, Search, & View Tags: 20 Set Rules for Tagging or Excluding:
Version Control Script
Version Control Script Mike Jackson, The Software Sustainability Institute Things you should do are written in bold. Suggested dialog is in normal text. Command- line excerpts and code fragments are in
My Secure Backup: How to reduce your backup size
My Secure Backup: How to reduce your backup size As time passes, we find our backups getting bigger and bigger, causing increased space charges. This paper takes a few Newsletter and other articles I've
Lab 2: Secure Network Administration Principles - Log Analysis
CompTIA Security+ Lab Series Lab 2: Secure Network Administration Principles - Log Analysis CompTIA Security+ Domain 1 - Network Security Objective 1.2: Apply and implement secure network administration
INTEGRATING MICROSOFT DYNAMICS CRM WITH SIMEGO DS3
INTEGRATING MICROSOFT DYNAMICS CRM WITH SIMEGO DS3 Often the most compelling way to introduce yourself to a software product is to try deliver value as soon as possible. Simego DS3 is designed to get you
Using Dedicated Servers from the game
Quick and short instructions for running and using Project CARS dedicated servers on PC. Last updated 27.2.2015. Using Dedicated Servers from the game Creating multiplayer session hosted on a DS Joining
Understand for FORTRAN
Understand Your Software... Understand for FORTRAN User Guide and Reference Manual Version 1.4 Scientific Toolworks, Inc. Scientific Toolworks, Inc. 1579 Broad Brook Road South Royalton, VT 05068 Copyright
HPC at IU Overview. Abhinav Thota Research Technologies Indiana University
HPC at IU Overview Abhinav Thota Research Technologies Indiana University What is HPC/cyberinfrastructure? Why should you care? Data sizes are growing Need to get to the solution faster Compute power is
Creating your personal website. Installing necessary programs Creating a website Publishing a website
Creating your personal website Installing necessary programs Creating a website Publishing a website The objective of these instructions is to aid in the production of a personal website published on
