No black magic: Text processing using the UNIX command line


1 No black magic: Text processing using the UNIX command line. Barbara Plank, Nov 6, 2014

2 Motivation (1994)

3 What is UNIX? Operating system (OS), 1969, AT&T / Bell Labs. Used loosely to refer to any OS sharing the same basic design (Linux, Solaris, Mac OS). Unix philosophy: build functionality out of small programs that do one thing and do it well. Slide inspired by:

4 What is the command line? $ is the command prompt. This window is called the terminal, which is the program that allows us to interact with the shell. The shell is an environment that executes the commands we type in at the command prompt.

5 What is the command line? INPUT, OUTPUT. REPL: read-eval-print loop. Very different from the well-known graphical user interface.

6 Input/output process model. Shell programs do I/O (input/output) with the terminal, using three streams: stdin (INPUT, from the keyboard), stdout (OUTPUT, printed to the display) and stderr (error output, also printed to the display). The programs run inside the shell environment (e.g. Bash shell). Interactively, you rarely notice there's separate stdout and stderr (today we won't worry about stderr).

7 Unix philosophy: combine many small programs for more advanced functionality. [Diagram: keyboard -> stdin -> shell program -> stdout -> shell program -> stdout -> shell program -> stdout -> display (print), all inside the shell environment (e.g. Bash shell)]

8 Why (still) the command line? Advantages: allows you to be agile (REPL vs edit-compile-run-debug cycle); the command line is extensible and complementary; automation and reproducibility; to run jobs on big clusters of computers (HPC computing). Disadvantage: takes some time to get acquainted.

9 Start a terminal on Mac OS: Applications, then Utilities, then Terminal

10 On Linux

11 Note: Windows. The Windows Command Prompt cmd (or PowerShell) is fundamentally different and incompatible with the commands we will see today! For today: download PuTTY.

12 Getting started: start terminal. Connect to the server for today's workshop (see handout): ssh username@hostname. Type yes when you see this: The authenticity of host.(10.1.) can't be established. ECDSA key fingerprint is c0:7b:40:5f:c9:d4:97:6f:33:27:76:8f:5e:b9:25:92. Are you sure you want to continue connecting (yes/no)? yes. Enter the password. You now have a prompt. Windows users use PuTTY: hostname

13 now we are all connected to a shell where we can issue commands

14 First shell commands Type text (your command) after the prompt ($), followed by ENTER: pwd: print working directory (shows the current location in the filesystem)

15 Shell command: Structure. A shell command (or shell program) usually takes parameters: arguments (required) and options (optional). Shell program with argument(s): cat text.txt; cat text1.txt text2.txt text3.txt. With argument and option: cat -n text.txt (prefix every line with its line number).

16 Note: shell commands are CaSE SeNsItIVe: pwd is not the same as PWD or Pwd. Spaces have special meanings (do not use them in file names or folder names).

17 Where to find help. To know what options and arguments a command takes, consult the man (manual) pages: man whoami; man cat. Use q to exit.

18 Tips: m<tab> (use auto-completion); use the arrow up key to reload a command from your command history (or, more advanced, search the history of commands with <CTRL>+r); <CTRL>+d or <CTRL>+c or just q to quit.
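The difference between arguments and options can be tried out on throwaway files. This is a minimal sketch; the scratch directory and file contents are made up for illustration.

```shell
# Create two small sample files in a scratch directory (illustrative names).
workdir=$(mktemp -d)
printf 'first line\nsecond line\n' > "$workdir/text1.txt"
printf 'third line\n' > "$workdir/text2.txt"

# Command with two arguments: print both files one after the other.
cat "$workdir/text1.txt" "$workdir/text2.txt"

# Same arguments plus the -n option: prefix every output line with its number.
cat -n "$workdir/text1.txt" "$workdir/text2.txt"
```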

19 Word frequency list

20 Prerequisite: Copy file. Copy the text file from my home directory to yours: cp /home/bplank/text.txt . (command name: cp, copy; arg1: what?; arg2: where to? Here the dot stands for the current directory.) Check if the file is in your directory with ls.

21 Inspect files. head text.txt prints out the first ten lines of the file. Try out the following commands - what do they do? tail text.txt; cat text.txt; less text.txt (continue with SPACE or arrow UP/DOWN; quit by typing q).

22 line-based processing. head text.txt prints out the first (by default) ten lines of the file. head -4 text.txt prints out the first 4 lines of the file.

23 I/O redirection to files. Shell commands can be redirected to write to files instead of to the screen, or to read from files instead of the keyboard. Append to any command: > myfile (send stdout to a file called myfile); < myfile (send the content of myfile as input to the program); 2> myfile (send stderr to a file called myfile).

24 line-based processing and I/O redirection. head text.txt is equivalent to head < text.txt. head -1 text.txt > tmp takes the first line of the file and stores it in the file tmp instead of printing it. Exercise: store the last 4 lines of the file text.txt in a file called footer.txt
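A quick way to see the line-based behaviour is a file whose lines are just numbers. A small sketch, assuming a made-up 12-line sample file; head -n 4 is the long form of the head -4 shorthand.

```shell
workdir=$(mktemp -d)
# Hypothetical sample: the numbers 1..12, one per line.
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 11 12 > "$workdir/lines.txt"

head "$workdir/lines.txt"        # first 10 lines (the default)
head -n 4 "$workdir/lines.txt"   # first 4 lines
tail -n 3 "$workdir/lines.txt"   # last 3 lines
```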
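The three redirections can be seen side by side in a short sketch. The file names here are illustrative, and ls on a missing file is used only as a convenient way to produce an error message.

```shell
workdir=$(mktemp -d)
printf 'one\ntwo\nthree\n' > "$workdir/text.txt"

# > sends stdout to a file (overwriting it) instead of the screen.
head -n 2 "$workdir/text.txt" > "$workdir/out.txt"

# < feeds a file to the program's stdin instead of the keyboard.
wc -l < "$workdir/out.txt"

# 2> sends stderr to a file; ls fails here on purpose, so we report that.
ls "$workdir/no-such-file" 2> "$workdir/errors.txt" || echo "ls failed; its message is in errors.txt"
cat "$workdir/errors.txt"
```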

25 Recipe for counting words An algorithm: a. split text into one word per line (tokenize) b. sort words c. count how often each word appears

26 a) split text: word per line. tr translates A into B, where A = set of characters and B = single character (\n, newline); -s squeezes repeated output characters into one, -c takes the complement of A. tr -sc 'A-Za-z' '\n' < text.txt. More examples: tr -sc 'A-Za-z0-9' '\n' < text.txt; tr -sc '[:alnum:]' '\n' < text.txt; tr -sc '[:alnum:]@#' '\n' < tweets.txt

27 b) sorting lines of text: sort FILE; sort -r (reverse sort); sort -n (numeric); sort -nr (reverse numeric sort). Exercise: try out the sort command with the different options above on the file: /home/bplank/numbers

28 c) count words = count duplicate lines in a sorted text file: uniq -c SORTEDFILE. uniq assumes a SORTED file as input! Exercise: frequency list of numbers in a file: sort the numbers file and save it (> redirect to file) in a new file called numsorted; now use uniq -c to count how often each number appears. Solution: sort -n /home/bplank/numbers > numsorted; uniq -c numsorted
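To see what the tokenizing step does, it helps to run it on a single short line. A minimal sketch with a made-up sentence; the character sets are quoted so the shell passes them to tr unchanged.

```shell
# Hypothetical one-line sample text.
text='Hello, world! Hello UNIX.'

# -c: use the complement of the set (everything that is NOT a letter),
# -s: squeeze each run of replaced characters into a single newline.
printf '%s\n' "$text" | tr -sc 'A-Za-z' '\n'
```

Each word ends up on its own line, with punctuation and blanks gone.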
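The effect of -n is easiest to see on numbers of different lengths. A small sketch on a made-up numbers file (not the one in /home/bplank):

```shell
workdir=$(mktemp -d)
printf '10\n9\n100\n9\n' > "$workdir/numbers"

sort "$workdir/numbers"       # text order:            10, 100, 9, 9
sort -n "$workdir/numbers"    # numeric order:         9, 9, 10, 100
sort -nr "$workdir/numbers"   # reverse numeric order: 100, 10, 9, 9
```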
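Why uniq needs sorted input shows up on a tiny example. This sketch uses a made-up file of letters; uniq only merges duplicate lines that are directly adjacent.

```shell
workdir=$(mktemp -d)
printf 'b\na\nb\na\nb\n' > "$workdir/letters"

# Unsorted input: no two equal lines are adjacent, so every count is 1.
uniq -c "$workdir/letters"

# Sorted input: equal lines are adjacent, so the counts are right (2 a, 3 b).
sort "$workdir/letters" > "$workdir/sorted"
uniq -c "$workdir/sorted"
```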

29 Now we have seen all necessary ingredients for our recipe on counting words An algorithm: a. split text into one word per line (tokenize) b. sort words c. count how often each word appears

30 The UNIX game: commands ~ bricks; building more powerful tools by combining bricks using the pipe: |

31 The Pipe. Unix philosophy: combine many small programs, using | as glue. [Diagram: keyboard -> stdin -> tr -sc '[:alnum:]' '\n' -> stdout -> sort -> stdout -> uniq -c -> stdout -> display (print), all inside the shell environment (e.g. Bash shell)]

32 Word frequency list: combining the three single commands (tr, sort, uniq): tr -sc '[:alnum:]' '\n' < text.txt | sort | uniq -c. Specify the input for the first program, then combine the commands using the pipe (the | symbol), i.e., the stdout of the previous command is the stdin for the next command.

33 The Pipe: tr -sc '[:alnum:]' '\n' < text.txt | sort | uniq -c [Diagram: keyboard -> stdin -> tr -sc '[:alnum:]' '\n' -> stdout -> sort -> stdout -> uniq -c -> stdout -> display (print), all inside the shell environment (e.g. Bash shell)]

34 Using the pipe to avoid extra files. Without pipe (2 commands = 2 REPLs): With pipe (no intermediate file necessary! 1 REPL):

35 alternative to split text: sed. sed (replace) command: sed 's/what/with/g' FILE, e.g. sed 's/ /\n/g' text.txt. What happens if you leave out g? Try the following (with and without g): sed 's/i/**you**/g' /home/bplank/short.txt
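The two routes can be compared directly. A sketch on a made-up two-line text, running the whole word-count recipe once with intermediate files and once as a single pipeline; the file names are illustrative.

```shell
workdir=$(mktemp -d)
printf 'the cat saw the dog\nthe dog ran\n' > "$workdir/text.txt"

# Without the pipe: every step writes an intermediate file.
tr -sc '[:alnum:]' '\n' < "$workdir/text.txt" > "$workdir/words"
sort "$workdir/words" > "$workdir/sorted"
uniq -c "$workdir/sorted" > "$workdir/freq_files"

# With the pipe: the stdout of each command is the stdin of the next.
tr -sc '[:alnum:]' '\n' < "$workdir/text.txt" | sort | uniq -c > "$workdir/freq_pipe"

# Both routes produce the same frequency list.
cmp "$workdir/freq_files" "$workdir/freq_pipe" && cat "$workdir/freq_pipe"
```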
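The role of the g flag can be checked on a single line. A minimal sketch with a made-up sentence instead of the short.txt file:

```shell
line='this is it'

# Without g: only the first match on each line is replaced.
printf '%s\n' "$line" | sed 's/i/**you**/'     # th**you**s is it

# With g: every match on the line is replaced.
printf '%s\n' "$line" | sed 's/i/**you**/g'    # th**you**s **you**s **you**t
```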

36 tr. Another use of tr: tr '[:upper:]' '[:lower:]' < text.txt. Extra exercise: merge upper and lower case by downcasing everything.

37 Exercise. Extract the 10 most frequent hashtags from the file /home/bplank/tweets.txt (hint: create a word frequency list first and then use sort and head). Also, use the command grep '^#' in your pipeline (to extract words that start with a hashtag); we will see grep again later.
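One possible way to merge case before counting is to put the downcasing step in front of the tokenizer. This is a sketch on a made-up line, not necessarily the intended solution:

```shell
# Downcase first, then tokenize, sort and count: The, the and THE become one entry.
printf 'The the THE cat\n' |
  tr '[:upper:]' '[:lower:]' |
  tr -sc '[:alnum:]' '\n' |
  sort |
  uniq -c
```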

38 File system and navigation

39 File system. Usual system with files, folders, paths to files. The root of the file system hierarchy is always: /. Paths can be absolute or relative, e.g. /home/bplank/data vs data/. Commonly used directories: . (current working directory), .. (parent directory), ~ (home directory of the user; for me: /home/bplank == ~bplank).

40 Navigating the file system. cd: change directory, e.g. cd data/001/. mkdir project: creates a directory called project. ls: list the content of a directory, e.g. ls /home/bplank. pwd: print working directory.

41 What we have seen so far: What is UNIX, what is the command line, why. Inspecting a file on the command line. Creating a word frequency list (sed, sort, uniq, tr, and the pipe), extracting the most frequent words. File system and navigation.

42 Overview: Bigrams, working with tabular data. Searching files with grep. A final tiny story.
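These navigation commands can be tried safely in a scratch directory. A small sketch; the directory names are illustrative, and mkdir -p (create missing parent directories too) is an addition not shown on the slide.

```shell
workdir=$(mktemp -d)
cd "$workdir"

mkdir project         # create a directory called project
mkdir -p data/001     # -p also creates the missing parent directory data
ls                    # lists: data  project

cd data/001           # relative path: go two levels down
pwd                   # shows the absolute path of the current directory
cd ../..              # .. is the parent directory: go back up twice
pwd
```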

43 Bigram = word pairs. Algorithm: tokenize by word; print word_i and word_i+1 next to each other; count.

44 Print words next to each other: paste command. paste FILE1 FILE2: if your two files contain lists of words, prints them next to each other.

45 get next word. Create a file with one word per line. Create a second file from the first, but which starts at the second line: tail -n +2 file > next [start with the second line and output all until the end]

46 Bigrams Exercise: find the 5 most frequent bigrams of text.txt

47 Solution: Find the 5 most common bigrams Extra: Find the 5 most common trigrams
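The steps from the previous slides (tokenize, shift by one line, paste, count) can be sketched as follows. This is one possible solution on a made-up text; the intermediate file names words, next and bigrams are illustrative.

```shell
workdir=$(mktemp -d)
printf 'the cat saw the cat\nthe cat ran\n' > "$workdir/text.txt"

# One word per line.
tr -sc '[:alnum:]' '\n' < "$workdir/text.txt" > "$workdir/words"

# Same list, starting at the second line (the "next word" column).
tail -n +2 "$workdir/words" > "$workdir/next"

# Put word_i and word_i+1 side by side, separated by a tab.
# The last line has no partner, so it ends up as an incomplete pair.
paste "$workdir/words" "$workdir/next" > "$workdir/bigrams"

# Count the pairs and show the 5 most frequent ones.
sort "$workdir/bigrams" | uniq -c | sort -nr | head -n 5
```

For trigrams, a third file starting at line 3 (tail -n +3) can be pasted in as well.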

48 Tabular data. paste FILES (in contrast to cat); cut -f1 FILE (cut out the first column from FILE). Exercise: create a frequency list from column 4 in the file parses.conll: cut -f 4 parses.conll | sed '/^$/d' | sort | uniq -c | sort -nr
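The pipeline can be tried on a miniature table. A sketch assuming a made-up, tab-separated CoNLL-style file with a tag in column 4 and a blank line between sentences; the real parses.conll may look different.

```shell
workdir=$(mktemp -d)
# Hypothetical mini file: id, word, lemma, tag; blank line between sentences.
printf '1\tdogs\tdog\tNOUN\n2\tbark\tbark\tVERB\n\n1\tcats\tcat\tNOUN\n' > "$workdir/parses.conll"

# cut keeps column 4, sed deletes the empty lines, then the usual counting recipe.
cut -f 4 "$workdir/parses.conll" | sed '/^$/d' | sort | uniq -c | sort -nr
```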

49 grep grep finds lines that match a given pattern grep star text.txt

50 grep. grep finds patterns specified as regular expressions (globally search for regular expression and print). grep is a filter - you only keep certain lines of the input, e.g., words that end with -ing: grep -w "[a-z]*ing" text.txt. Exercises: try the above command without the -w option; with the -o and -w options (or -ow for shorthand); what do the -v and -i options do? Use man grep to find out.

51 grep. grep gh: keep lines containing gh. grep -i gh: keep lines containing gh independent of casing (gh, GH, ...). grep '^ch': keep lines beginning with ch. grep 'ing$': keep lines ending with ing. grep -v gh: do NOT keep lines containing gh.

52 More on regular expressions: see Lindberg [1] or chapter 2 on regular expressions of Jurafsky & Martin [4]
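Each of these patterns can be checked against a handful of words. A small sketch with a made-up word list, one word per line:

```shell
workdir=$(mktemp -d)
printf 'ghost\nLAUGH\nchair\nsinging\nmuch\n' > "$workdir/words.txt"

grep gh "$workdir/words.txt"       # ghost
grep -i gh "$workdir/words.txt"    # ghost, LAUGH
grep '^ch' "$workdir/words.txt"    # chair
grep 'ing$' "$workdir/words.txt"   # singing
grep -v gh "$workdir/words.txt"    # LAUGH, chair, singing, much
```

The patterns with ^ and $ are quoted so that the shell hands them to grep unchanged.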

53 Counting: wc Counting lines (-l), words and characters in a file: wc FILE Why is the number of words different?
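One possible reason for differing word counts is that wc splits on whitespace only, while the tr tokenizer also splits at punctuation. A sketch on a made-up two-line text (this may not be the comparison the slide had in mind):

```shell
workdir=$(mktemp -d)
printf "Don't stop now\nPlease\n" > "$workdir/text.txt"

wc "$workdir/text.txt"        # line, word and byte counts in one row
wc -l < "$workdir/text.txt"   # 2 lines
wc -w < "$workdir/text.txt"   # 4 whitespace-separated words

# The tr tokenizer cuts Don't into Don and t, so it yields 5 tokens.
tr -sc '[:alnum:]' '\n' < "$workdir/text.txt" | wc -l
```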

54 Exercises with grep & wc. How many uppercase words are in text.txt? How many 4-letter words? How many 1-syllable words are there (with exactly one vowel)?

55 stop words

56 Removing bigrams that contain stop words Exercise: Use grep to filter out stop words from the text.bigram file
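A possible approach, sketched on a made-up bigram file: the three-word stop list (the, of, a) and the tab-separated layout of text.bigram are assumptions for illustration only.

```shell
workdir=$(mktemp -d)
# Hypothetical bigram file: one tab-separated word pair per line.
printf 'the\tcat\nblack\tcat\nof\tthe\ncat\tsat\n' > "$workdir/text.bigram"

# -v drops matching lines, -w matches whole words only (so the a in cat is safe),
# -i ignores case, -E allows alternatives separated by |.
grep -v -w -i -E 'the|of|a' "$workdir/text.bigram"
```

Only the pairs without a stop word remain (black cat, cat sat).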

57 Most frequent bigrams w/o stop words: towards more useful bigrams. Pre-processing matters!

58 Shell scripts. Basically, a shell script is a text file with shell commands in it. To automate and avoid repetition. Example: backup.sh. Make executable: chmod +x backup.sh. Run: ./backup.sh (or sh backup.sh)

59 Shell scripts: example. Create a text file called bigram.sh. Execute it on a text file: ./bigram.sh | head -5; sh bigram.sh | sort | uniq -c | sort -nr | head
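The create, chmod, run cycle can be sketched end to end. The script body below is an illustrative stand-in, not the backup.sh shown on the slide: it simply copies the file given as its first argument to a .bak copy.

```shell
workdir=$(mktemp -d)
printf 'some notes\n' > "$workdir/notes.txt"

# Write a tiny script into a text file (hypothetical content).
cat > "$workdir/backup.sh" <<'EOF'
#!/bin/sh
# Copy the file given as first argument to <file>.bak
cp "$1" "$1.bak"
echo "backed up $1"
EOF

chmod +x "$workdir/backup.sh"                  # make it executable
"$workdir/backup.sh" "$workdir/notes.txt"      # run it directly
sh "$workdir/backup.sh" "$workdir/notes.txt"   # or hand it to sh
```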

60 a tiny story (real-world example) in the end..

61 I never seem to remember when the New York Fashion Week takes place

62 New York Fashion week: we'll consult the New York Times (web API) to find out. Step 1: get the data <your-key>

63 New York Fashion week Step 2: combine the results

64 Extract year-month Extract year and month and sort by frequency to get a first impression


66 References
[1] Nikolaj Lindberg. egrep for Linguists. stts.se/egrep_for_linguists/egrep_for_linguists.pdf
[2] Ken W. Church (1994). Unix for Poets. cst.dk/bplank/refs/unixforpoets.pdf
[3] Jeroen Janssens (2014). Data Science at the Command Line. O'Reilly.
[4] Jurafsky & Martin. Speech and Language Processing. 2nd edition (2009).


More information

Training Day : Linux

Training Day : Linux Training Day : Linux Objectives At the end of the day, you will be able to use Linux command line in order to : Connect to «genotoul» server Use available tools Transfer files between server and desktop

More information

Instructions for Accessing the Advanced Computing Facility Supercomputing Cluster at the University of Kansas

Instructions for Accessing the Advanced Computing Facility Supercomputing Cluster at the University of Kansas ACF Supercomputer Access Instructions 1 Instructions for Accessing the Advanced Computing Facility Supercomputing Cluster at the University of Kansas ACF Supercomputer Access Instructions 2 Contents Instructions

More information

CS 103 Lab Linux and Virtual Machines

CS 103 Lab Linux and Virtual Machines 1 Introduction In this lab you will login to your Linux VM and write your first C/C++ program, compile it, and then execute it. 2 What you will learn In this lab you will learn the basic commands and navigation

More information

Using a login script for deployment of Kaspersky Network Agent to Mac OS X clients

Using a login script for deployment of Kaspersky Network Agent to Mac OS X clients Using a login script for deployment of Kaspersky Network Agent to Mac OS X clients EXECUTIVE SUMMARY This document describes how an administrator can configure a login script to deploy Kaspersky Lab Network

More information

TS-800. Configuring SSH Client Software in UNIX and Windows Environments for Use with the SFTP Access Method in SAS 9.2, SAS 9.3, and SAS 9.

TS-800. Configuring SSH Client Software in UNIX and Windows Environments for Use with the SFTP Access Method in SAS 9.2, SAS 9.3, and SAS 9. TS-800 Configuring SSH Client Software in UNIX and Windows Environments for Use with the SFTP Access Method in SAS 9.2, SAS 9.3, and SAS 9.4 dsas Table of Contents Overview... 1 Configuring OpenSSH Software

More information

UNIX / Linux commands Basic level. Magali COTTEVIEILLE - September 2009

UNIX / Linux commands Basic level. Magali COTTEVIEILLE - September 2009 UNIX / Linux commands Basic level Magali COTTEVIEILLE - September 2009 What is Linux? Linux is a UNIX system Free Open source Developped in 1991 by Linus Torvalds There are several Linux distributions:

More information

Introduction to UNIX and SFTP

Introduction to UNIX and SFTP Introduction to UNIX and SFTP Introduction to UNIX 1. What is it? 2. Philosophy and issues 3. Using UNIX 4. Files & folder structure 1. What is UNIX? UNIX is an Operating System (OS) All computers require

More information

CMSC 216 UNIX tutorial Fall 2010

CMSC 216 UNIX tutorial Fall 2010 CMSC 216 UNIX tutorial Fall 2010 Larry Herman Jandelyn Plane Gwen Kaye August 28, 2010 Contents 1 Introduction 2 2 Getting started 3 2.1 Logging in........................................... 3 2.2 Logging

More information

SSH with private/public key authentication

SSH with private/public key authentication SSH with private/public key authentication In this exercise we ll show how you can eliminate passwords by using ssh key authentication. Choose the version of the exercises depending on what OS you are

More information

Installing Hadoop. You need a *nix system (Linux, Mac OS X, ) with a working installation of Java 1.7, either OpenJDK or the Oracle JDK. See, e.g.

Installing Hadoop. You need a *nix system (Linux, Mac OS X, ) with a working installation of Java 1.7, either OpenJDK or the Oracle JDK. See, e.g. Big Data Computing Instructor: Prof. Irene Finocchi Master's Degree in Computer Science Academic Year 2013-2014, spring semester Installing Hadoop Emanuele Fusco (fusco@di.uniroma1.it) Prerequisites You

More information

Shell Scripts (1) For example: #!/bin/sh If they do not, the user's current shell will be used. Any Unix command can go in a shell script

Shell Scripts (1) For example: #!/bin/sh If they do not, the user's current shell will be used. Any Unix command can go in a shell script Shell Programming Shell Scripts (1) Basically, a shell script is a text file with Unix commands in it. Shell scripts usually begin with a #! and a shell name For example: #!/bin/sh If they do not, the

More information

CPSC 226 Lab Nine Fall 2015

CPSC 226 Lab Nine Fall 2015 CPSC 226 Lab Nine Fall 2015 Directions. Our overall lab goal is to learn how to use BBB/Debian as a typical Linux/ARM embedded environment, program in a traditional Linux C programming environment, and

More information

HDFS Installation and Shell

HDFS Installation and Shell 2012 coreservlets.com and Dima May HDFS Installation and Shell Originals of slides and source code for examples: http://www.coreservlets.com/hadoop-tutorial/ Also see the customized Hadoop training courses

More information

Remote Access to Unix Machines

Remote Access to Unix Machines Remote Access to Unix Machines Alvin R. Lebeck Department of Computer Science Department of Electrical and Computer Engineering Duke University Overview We are using OIT Linux machines for some homework

More information

Extending Remote Desktop for Large Installations. Distributed Package Installs

Extending Remote Desktop for Large Installations. Distributed Package Installs Extending Remote Desktop for Large Installations This article describes four ways Remote Desktop can be extended for large installations. The four ways are: Distributed Package Installs, List Sharing,

More information

Hadoop Basics with InfoSphere BigInsights

Hadoop Basics with InfoSphere BigInsights An IBM Proof of Technology Hadoop Basics with InfoSphere BigInsights Part: 1 Exploring Hadoop Distributed File System An IBM Proof of Technology Catalog Number Copyright IBM Corporation, 2013 US Government

More information

USEFUL UNIX COMMANDS

USEFUL UNIX COMMANDS cancel cat file USEFUL UNIX COMMANDS cancel print requested with lp Display the file cat file1 file2 > files Combine file1 and file2 into files cat file1 >> file2 chgrp [options] newgroup files Append

More information

Recommended File System Ownership and Privileges

Recommended File System Ownership and Privileges FOR MAGENTO COMMUNITY EDITION Whenever a patch is released to fix an issue in the code, a notice is sent directly to your Admin Inbox. If the update is security related, the incoming message is colorcoded

More information

Cloud Server powered by Mac OS X. Getting Started Guide. Cloud Server. powered by Mac OS X. AKJZNAzsqknsxxkjnsjx Getting Started Guide Page 1

Cloud Server powered by Mac OS X. Getting Started Guide. Cloud Server. powered by Mac OS X. AKJZNAzsqknsxxkjnsjx Getting Started Guide Page 1 Getting Started Guide Cloud Server powered by Mac OS X Getting Started Guide Page 1 Getting Started Guide: Cloud Server powered by Mac OS X Version 1.0 (02.16.10) Copyright 2010 GoDaddy.com Software, Inc.

More information

INT322. By the end of this week you will: (1)understand the interaction between a browser, web server, web script, interpreter, and database server.

INT322. By the end of this week you will: (1)understand the interaction between a browser, web server, web script, interpreter, and database server. Objective INT322 Monday, January 19, 2004 By the end of this week you will: (1)understand the interaction between a browser, web server, web script, interpreter, and database server. (2) know what Perl

More information

How To Use The Librepo Software On A Linux Computer (For Free)

How To Use The Librepo Software On A Linux Computer (For Free) An introduction to Linux for bioinformatics Paul Stothard March 11, 2014 Contents 1 Introduction 2 2 Getting started 3 2.1 Obtaining a Linux user account....................... 3 2.2 How to access your

More information

TP1: Getting Started with Hadoop

TP1: Getting Started with Hadoop TP1: Getting Started with Hadoop Alexandru Costan MapReduce has emerged as a leading programming model for data-intensive computing. It was originally proposed by Google to simplify development of web

More information

CLC Server Command Line Tools USER MANUAL

CLC Server Command Line Tools USER MANUAL CLC Server Command Line Tools USER MANUAL Manual for CLC Server Command Line Tools 2.5 Windows, Mac OS X and Linux September 4, 2015 This software is for research purposes only. QIAGEN Aarhus A/S Silkeborgvej

More information

Introduction to Linux and Cluster Basics for the CCR General Computing Cluster

Introduction to Linux and Cluster Basics for the CCR General Computing Cluster Introduction to Linux and Cluster Basics for the CCR General Computing Cluster Cynthia Cornelius Center for Computational Research University at Buffalo, SUNY 701 Ellicott St Buffalo, NY 14203 Phone: 716-881-8959

More information

TIBCO ActiveMatrix BusinessWorks Plug-in for TIBCO Managed File Transfer Software Installation

TIBCO ActiveMatrix BusinessWorks Plug-in for TIBCO Managed File Transfer Software Installation TIBCO ActiveMatrix BusinessWorks Plug-in for TIBCO Managed File Transfer Software Installation Software Release 6.0 November 2015 Two-Second Advantage 2 Important Information SOME TIBCO SOFTWARE EMBEDS

More information

This presentation explains how to monitor memory consumption of DataStage processes during run time.

This presentation explains how to monitor memory consumption of DataStage processes during run time. This presentation explains how to monitor memory consumption of DataStage processes during run time. Page 1 of 9 The objectives of this presentation are to explain why and when it is useful to monitor

More information

Syntax: cd <Path> Or cd $<Custom/Standard Top Name>_TOP (In CAPS)

Syntax: cd <Path> Or cd $<Custom/Standard Top Name>_TOP (In CAPS) List of Useful Commands for UNIX SHELL Scripting We all are well aware of Unix Commands but still would like to walk you through some of the commands that we generally come across in our day to day task.

More information

CSIL MiniCourses. Introduction To Unix (I) John Lekberg Sean Hogan Cannon Matthews Graham Smith. Updated on: 2015-10-14

CSIL MiniCourses. Introduction To Unix (I) John Lekberg Sean Hogan Cannon Matthews Graham Smith. Updated on: 2015-10-14 CSIL MiniCourses Introduction To Unix (I) John Lekberg Sean Hogan Cannon Matthews Graham Smith Updated on: 2015-10-14 What s a Unix? 2 Now what? 2 Your Home Directory and Other Things 2 Making a New Directory

More information

There are many different ways in which we can connect to a remote machine over the Internet. These include (but are not limited to):

There are many different ways in which we can connect to a remote machine over the Internet. These include (but are not limited to): Remote Connection Protocols There are many different ways in which we can connect to a remote machine over the Internet. These include (but are not limited to): - telnet (typically to connect to a machine

More information

Tour of the Terminal: Using Unix or Mac OS X Command-Line

Tour of the Terminal: Using Unix or Mac OS X Command-Line Tour of the Terminal: Using Unix or Mac OS X Command-Line hostabc.princeton.edu% date Mon May 5 09:30:00 EDT 2014 hostabc.princeton.edu% who wc l 12 hostabc.princeton.edu% Dawn Koffman Office of Population

More information

How to Tunnel Remote Desktop using SSH (Cygwin) for Windows XP (SP2)

How to Tunnel Remote Desktop using SSH (Cygwin) for Windows XP (SP2) How to Tunnel Remote Desktop using SSH (Cygwin) for Windows XP (SP2) The ssh server is an emulation of the UNIX environment and OpenSSH for Windows, by Redhat, called cygwin This manual covers: Installation

More information

Hadoop Shell Commands

Hadoop Shell Commands Table of contents 1 DFShell... 3 2 cat...3 3 chgrp...3 4 chmod...3 5 chown...4 6 copyfromlocal... 4 7 copytolocal... 4 8 cp...4 9 du...4 10 dus... 5 11 expunge... 5 12 get... 5 13 getmerge... 5 14 ls...

More information

TELNET CLIENT 5.11 SSH SUPPORT

TELNET CLIENT 5.11 SSH SUPPORT TELNET CLIENT 5.11 SSH SUPPORT This document provides information on the SSH support available in Telnet Client 5.11 This document describes how to install and configure SSH support in Wavelink Telnet

More information

Hadoop Shell Commands

Hadoop Shell Commands Table of contents 1 FS Shell...3 1.1 cat... 3 1.2 chgrp... 3 1.3 chmod... 3 1.4 chown... 4 1.5 copyfromlocal...4 1.6 copytolocal...4 1.7 cp... 4 1.8 du... 4 1.9 dus...5 1.10 expunge...5 1.11 get...5 1.12

More information