Linux command line An introduction to the Linux command line for genomics Susan Fairley
Aims Introduce the command line Provide an awareness of basic functionality Illustrate with some examples Provide some information on how to find out more
What we will not achieve Immediate proficiency in the command line As with learning a language, it takes time and use A comprehensive survey of the command line There is a vast array of commands; this session can only cover a small fraction
Format Series of short talks followed by exercises Should not need to listen and type at the same time Suggested reading listed at end
Overview Introduction to Linux Navigating the filesystem Basic commands Linking commands and directing output Additional commands Shells and shell scripts
Introduction to Linux What is Linux and what is an operating system? Unix and Linux: what's the difference? Why consider Unix/Linux? The command prompt
Linux is an operating system Operating systems enable applications and users to make use of computer hardware Layers, from top to bottom: user, applications, operating system, hardware
Operating systems Operating systems act as resource managers for the machine on which they are installed They wrap, and provide access to, hardware functionality The OS kernel controls the hardware Access to kernel services is provided to higher level applications and system utilities via system calls
Operating systems and shells Applications and system utilities can be started via a shell or GUI A shell is a textual command line interface A variety of shells, with slightly different features, exist Examples of shells include bash, the Bourne shell (sh), csh and tcsh Using the shell can provide useful functionality
Unix and Linux From http://www.doc.ic.ac.uk/~wjk/unixintro/lecture1.html
Unix and Linux There are many variations of these systems They have some differences but many similarities Examples of Unix and Unix-like systems include Sun Solaris, GNU/Linux and Mac OS X Popular Linux distributions (packaging a Linux kernel with system utilities, GUI and applications) include Red Hat and Debian, among others
Why Linux? Linux systems are commonplace in bioinformatics Large variety of software is developed by academic groups for these platforms Free and open source software
The command prompt The command line (shell) and GUI enable interaction with applications and system utilities A command prompt, where commands are entered at the command line, is accessed via software that provides a terminal window
The command prompt A terminal window can be opened when logged in to a machine There are also options to open terminals on remote machines: ssh, telnet and PuTTY Today, we will use PuTTY to connect from the classroom Windows machines to a Linux machine
Commands The command prompt 1) The command 2) Options 3) What the command is to run on Example: prompt$ command option thing_to_run_on
Commands White space Quotes Special characters We'll return to some of these topics later Typically best to avoid white space in file names
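To illustrate why white space in file names is awkward, here is a minimal sketch, assuming a bash-like shell. The file names are made up for the example, and everything happens in a throwaway directory created with mktemp.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Without quotes the shell splits on white space,
# so this creates TWO files: "my" and "file.txt".
touch my file.txt

# With quotes the shell passes the whole name as ONE argument,
# so this creates a single file called "my other file.txt".
touch "my other file.txt"

# List one name per line to see the three files.
ls -1
```

Quoting works, but every later command has to quote the name too, which is why names without white space are easier to live with.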
Important points No need to be afraid, but type with care You will NOT be asked if you really mean it Some commands are powerful and can remove many files at once Sometimes a command will run and run and run because something is wrong Can use Ctrl-C to kill processes in most cases If in doubt, ask!
In Exercise 1 Open PuTTY Connect to a remote machine Copy material to the remote machine
Navigating the filesystem Where am I? What is here (and permissions)? Moving around Searching
Navigating the filesystem In Exercise 1 we used some commands: ls listed the contents of the directory and tar unpacked linux_course.tar.gz These commands followed the pattern we described earlier: command option thing_to_run_on The command is given along with any necessary additional information
Navigating the filesystem Now we are going to look at commands related to navigating the filesystem At any point in time, the command prompt is somewhere within the filesystem The filesystem is similar to the directory structure you will be familiar with in graphical interfaces, where you navigate by clicking on folders and documents
Where am I?
The current directory is also called the working directory
pwd: print working directory
Gives the path from the top of the file system (or root) to the current directory
Root can be written as /
[s08sf2@login1(maxwell) ~]$ pwd
/users/s08sf2
What is here?
We've already used ls
ls lists the contents of the current directory
We used the simple form of ls
Options can also be specified, including -l, or combinations of options such as -lh
-l gives the long version of output and -h converts file sizes to human-readable form
What is here?
[s08sf2@login1(maxwell) spades_dec14]$ pwd
/users/s08sf2/ken_forbes/spades_dec14
[s08sf2@login1(maxwell) spades_dec14]$ ls
mv.sh      process_quast.pl  quast_summary.txt
notes.txt  quast.sh          spades_listeria_dec14.sh
What is here?
[s08sf2@login1(maxwell) spades_dec14]$ ls -l
total 136
-rw-r--r-- 1 s08sf2 clsm 50028 Dec 10 16:53 mv.sh
-rw-r--r-- 1 s08sf2 clsm 45948 Dec 12 14:55 notes.txt
-rw-r--r-- 1 s08sf2 clsm  1932 Dec 10 16:51 process_quast.pl
-rw-r--r-- 1 s08sf2 clsm   634 Dec 10 16:13 quast.sh
-rw-r--r-- 1 s08sf2 clsm 23025 Dec 10 16:53 quast_summary.txt
-rw-r--r-- 1 s08sf2 clsm   977 Dec 10 11:19 spades_listeria_dec14.sh
[s08sf2@login1(maxwell) spades_dec14]$ ls -lh
total 136K
-rw-r--r-- 1 s08sf2 clsm  49K Dec 10 16:53 mv.sh
-rw-r--r-- 1 s08sf2 clsm  45K Dec 12 14:55 notes.txt
-rw-r--r-- 1 s08sf2 clsm 1.9K Dec 10 16:51 process_quast.pl
-rw-r--r-- 1 s08sf2 clsm  634 Dec 10 16:13 quast.sh
-rw-r--r-- 1 s08sf2 clsm  23K Dec 10 16:53 quast_summary.txt
-rw-r--r-- 1 s08sf2 clsm  977 Dec 10 11:19 spades_listeria_dec14.sh
Permissions
-rw-r--r--
Reading left to right: file type (1 character), then permissions for the owner, the group and all users (3 characters each)
First character is file type: - for file, d for directory
Then groups of three characters describing permissions for the user (u), group (g) and others (o)
Each set of three characters is read, write and execute
r = read, w = write, x = execute, - = no permission
Here, we have a file (not a directory) where the owner can read and write, the group and all other users can read, and nobody can execute the file
Permissions Linux cares about permissions Permissions (including who the owner of a file is) can, in some cases, get carried over when moving or copying files Permissions can be changed using chmod
chmod chmod can be used in various ways We'll look at one where a number is supplied for each of user, group and other How do we know what number to supply for each category?
chmod
--- 0
--x 1
-w- 2
-wx 3
r-- 4
r-x 5
rw- 6
rwx 7
chmod
chmod 600 private_file.txt
chmod 777 everything_file.txt
chmod 644 my_rw_otherwise_read.txt
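The numeric form can be checked directly with ls -l. Each digit is the sum of read (4), write (2) and execute (1) for user, group and others in turn. The following is a small sketch run in a throwaway directory; private_file.txt is created empty purely for the demonstration.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir"
touch private_file.txt

# 6 = 4 + 2 = rw- for the user; 0 = --- for group; 0 = --- for others.
chmod 600 private_file.txt
ls -l private_file.txt

# 6 = rw- for the user; 4 = r-- for group; 4 = r-- for others.
chmod 644 private_file.txt
ls -l private_file.txt
```

The first ten characters of each ls -l line show the file type followed by the three permission groups, so the effect of each chmod can be read off directly.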
Moving around
We've said the command prompt is in a directory
How do we change that directory?
cd: change directory
cd directions/to/where/we/want/to/go
Moving around
We start in our home directory
We can return there using ~
cd ~
There are some other directories we can refer to easily
. is the directory we are in
.. is the parent directory of our current location (the level above where we are)
Moving around
We can move to a directory by specifying it and its location relative to root or our current location
cd ../linux_course/text_files
cd /users/s08sf2/linux_course/text_files
cd linux_course/text_files
NB: once you have started typing the path, you can press tab to autocomplete
NB: you can use the up arrow to get the previous command and then edit it
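The difference between relative and absolute paths can be tried out safely with a sketch like the one below. It builds a stand-in linux_course/text_files tree inside a temporary directory (the variable name base is just for this example) rather than using the real course location.

```shell
# Build a small stand-in directory tree in a throwaway location.
base=$(mktemp -d)
mkdir -p "$base/linux_course/text_files"

cd "$base"                           # start at the top of the example tree
cd linux_course/text_files           # relative path: interpreted from where we are
pwd

cd ..                                # .. is the parent directory (linux_course)
pwd

cd "$base/linux_course/text_files"   # absolute path: works from anywhere
pwd
```

Running pwd after each cd is a cheap way to confirm that you ended up where you expected.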
Exercise 2 We've looked at how you establish what is in a directory and how to move about In exercise 2, we'll use the contents of linux_course to try out some of this material using pwd, ls and cd We'll also try out chmod
Basic commands We've now used a few commands, including some with options, and have the basic skills to navigate through the file system Now, we'll look at some additional commands, enabling you to make directories, move files, copy files, remove files, view files and find them
man
man: manual
You can use the man command by supplying the name of the command whose manual entry you want to see, e.g. man ls
The manual entry provides information on the command and its usage
Information can also be found online
mkdir
mkdir: make directory
This creates the specified directory as a subdirectory of the current directory
Multiple levels can be created at once using the -p option
mkdir my_dir
mkdir -p my_dir/new_dir/another_new_dir
cp
cp: copy file
cp existing.txt copy.txt
cp existing.txt ../different_location/copy.txt
We can also use the -r option to recursively copy a directory and all of its contents
cp -r dir copy_of_dir
mv
mv: move
Moves instead of copying
Can be used to rename something in the same location
mv old_name.txt new_name.txt
mv old.txt ../new_location/new.txt
Can be applied to files and directories
rm
Type with care
rm: remove (this means delete; it will NOT move the file to trash)
rm file_to_remove.txt
The -r option is needed to remove a directory, because its contents must be removed too
rm -r directory_to_remove
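The four commands above can be tried together without risk by working in a throwaway directory. This is a sketch only: the file and directory names reuse those from the slides, and existing.txt is created empty just so there is something to copy.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir"

mkdir -p my_dir/new_dir        # create nested directories in one step
touch existing.txt             # an empty file to experiment with

cp existing.txt copy.txt       # copy a file
cp -r my_dir copy_of_dir       # copy a directory and all of its contents

mv copy.txt new_name.txt       # rename: a move within the same directory
mv new_name.txt my_dir/        # move the file into another directory

rm existing.txt                # delete a file: no trash, no confirmation
rm -r copy_of_dir              # delete a directory and everything in it

ls -R                          # list what is left, recursively
```

Because rm gives no second chance, practising in a directory made by mktemp is a sensible habit while the commands are still new.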
less
less can be used to view files
When viewing a file, press q to quit
With less, you can search the file by typing / followed by the pattern to search for
less shakespeare/romeo_and_juliet.txt
See also head, tail and more
wc
wc: word count
By default, reports the number of lines, words and bytes in a file
Has options that can be used, for example, to count only the number of lines in a file
wc -l file.txt
sort
sort does what it says
By default, sorts lexicographically
Can sort numerically (-n) and can output only unique lines (-u)
sort file.txt
sort -u file.txt
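A short sketch of wc and sort on a made-up four-line file of numbers. The file is built with printf and the > redirection, which is covered later in the session; the contents are invented for illustration.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Four lines: 10, 9, 2 and a repeated 9.
printf '10\n9\n2\n9\n' > file.txt

wc -l file.txt      # count the lines in the file
sort file.txt       # lexicographic: "10" sorts before "2" because "1" < "2"
sort -n file.txt    # numeric: 2 comes first and 10 comes last
sort -u file.txt    # unique lines only: the repeated 9 appears once
```

Comparing the output of sort and sort -n on the same file shows why numeric data usually needs the -n option.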
grep
grep: global regular expression print
grep options pattern files
grep Juliet romeo_and_juliet.txt
grep -r tide shakespeare
Can supply patterns or regular expressions, which describe what to look for
NB: we don't have time to discuss regular expressions today
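The two forms of grep shown above can be tried on a tiny stand-in for the course material. In this sketch the shakespeare directory and its two-line file are created on the spot and are not the real course files; the search word light is chosen only because it occurs in the stand-in text.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir"

# A two-line stand-in file (NOT the real course text).
mkdir shakespeare
printf 'But soft, what light\nIt is the east, and Juliet is the sun\n' \
    > shakespeare/romeo_and_juliet.txt

# Print the lines of one file that contain the pattern.
grep Juliet shakespeare/romeo_and_juliet.txt

# -r searches every file under a directory and prefixes
# each matching line with the name of the file it came from.
grep -r light shakespeare
```

With -r the pattern comes first and the directory second, mirroring the pattern-then-files order of the simple form.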
find
find: search for things
This command has many options
find options path expression
Path says where to look and expression what to look for
find . -name 'going*'
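As a sketch of find in action, the example below creates a few empty files with invented names at different depths and then searches for those whose names start with going. The quotes around the pattern stop the shell from expanding the * itself before find sees it.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Invented files at two different depths.
mkdir -p a/b
touch a/going_out.txt a/b/going_home.txt a/staying.txt

# Start at . (the current directory) and report every
# entry below it whose name matches going*.
find . -name 'going*'
```

Unlike ls, find descends into subdirectories by default, so both matching files are reported even though they sit at different levels.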
Exercise 3 In Exercise 3 we'll try out some of the commands we've just looked at
Linking commands and directing output
So far, any output from our commands has been printed in our terminal
However, we can redirect output to files or pipe it to another command
| pipes output from one command to another
> writes output to a file (overwriting any existing contents)
>> appends output to a file
stdout and stderr In most cases, output is written to standard output (stdout) Some errors are written to standard error (stderr) Both are, by default, written to the terminal We'll look at redirecting stdout but stderr can also be redirected
Examples
ls > dir_contents.txt
ls | sort > sorted_dir_contents.txt
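The operators can be seen side by side in the sketch below, which runs in a throwaway directory with three invented empty files. The last command uses 2> to redirect stderr, which goes slightly beyond what this session covers; the trailing || true is only there so the sketch keeps going in shells configured to stop on errors.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir"
touch b.txt a.txt c.txt

# > sends stdout to a file, replacing anything already in it.
ls > dir_contents.txt

# | feeds the output of ls to sort; > then saves what sort prints.
ls | sort > sorted_dir_contents.txt

# >> adds to the end of a file instead of replacing it.
echo "one more line" >> dir_contents.txt

# 2> redirects stderr: the error message about the missing
# file goes into errors.txt instead of the terminal.
ls no_such_file 2> errors.txt || true

cat dir_contents.txt
cat errors.txt
```

Note that dir_contents.txt lists itself: the shell creates the output file before ls runs, so ls sees it in the directory.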
Exercise 4 In Exercise 4 we'll use the output of one command as input for another by piping We'll also try redirecting output from standard out (stdout) to a file
Additional commands Many and varied commands can be used (including when handling genomic data) These are a few arbitrary examples
Grep for FASTA headers
FASTA files have headers for each sequence
Headers start with the character >
grep '>' proteins.fa
Note the use of quotes around >, enabling the command line to differentiate it from redirection
Identify unique FASTA headers
grep '>' proteins.fa
grep '>' proteins.fa | wc -l
grep '>' proteins.fa | sort -u
grep '>' proteins.fa | sort -u | wc -l
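The pipeline can be built up one step at a time on a toy FASTA file. In this sketch proteins.fa is a made-up three-record file in which the header >seq1 appears twice; the sequences are placeholders, not real proteins.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir"

# A made-up FASTA file: three records, one header duplicated.
printf '>seq1\nMKV\n>seq2\nMLA\n>seq1\nMKV\n' > proteins.fa

grep '>' proteins.fa                     # every header line
grep '>' proteins.fa | wc -l             # how many headers there are
grep '>' proteins.fa | sort -u           # each distinct header once
grep '>' proteins.fa | sort -u | wc -l   # how many distinct headers
```

If the quotes around > are forgotten, the shell treats it as a redirection and overwrites proteins.fa, which is a good reason to practise on a copy.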
Retain part of FASTA header
grep '>' protein_2.fa
>sp|P09922|MX1_MOUSE Interferon-induced GTP-binding protein Mx1 OS=Mus musculus GN=Mx1 PE=1 SV=1
sed -e 's/>\(\S*\).*/\1/'
-e: the expression to execute
s/substitute this/for this/
\(\) capture the contents of the brackets (using \ to escape) and reuse the contents using \1
\S non-whitespace (a GNU sed extension), * match zero or more times
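The substitution can be tried on a single header without needing a FASTA file. This sketch assumes the header uses the UniProt style with | separators, and it writes the non-whitespace class as [^[:space:]] instead of \S so that it also works with sed versions that lack the GNU shorthand.

```shell
# The example header, assumed to use | as the field separator.
header='>sp|P09922|MX1_MOUSE Interferon-induced GTP-binding protein Mx1 OS=Mus musculus GN=Mx1 PE=1 SV=1'

# s/pattern/replacement/ where the pattern is:
#   >                    the leading > (matched but not captured)
#   \([^[:space:]]*\)    capture the run of non-whitespace characters
#   .*                   the rest of the line
# and the replacement \1 keeps only the captured part.
printf '%s\n' "$header" | sed -e 's/>\([^[:space:]]*\).*/\1/'
```

This prints sp|P09922|MX1_MOUSE: everything from the first white space onwards is dropped, along with the leading >.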
Compare sorted lists
comm compares sorted lists
Options -1, -2 and -3 each suppress one column of output: -1 lines unique to file1, -2 lines unique to file2, -3 lines that appear in both files
Also diff
diff -y --suppress-common-lines file1 file2
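Because the comm options remove columns rather than select them, combining two of them leaves exactly one column. The sketch below uses two small invented files that are already sorted, which comm requires.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Two small, already sorted lists sharing the lines b and c.
printf 'a\nb\nc\n' > file1
printf 'b\nc\nd\n' > file2

comm file1 file2       # all three columns: only in file1, only in file2, in both
comm -12 file1 file2   # suppress columns 1 and 2: lines in both files
comm -23 file1 file2   # suppress columns 2 and 3: lines only in file1
comm -13 file1 file2   # suppress columns 1 and 3: lines only in file2
```

For unsorted input, sort each file first (for example with sort file > file.sorted) before handing it to comm.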
Exercise 5 In Exercise 5 we'll work with some FASTA files We'll try some of the examples that we've discussed
Shells and shell scripts
We briefly discussed shells earlier in the session
There are different shells that differ slightly in how they operate
Often, you can identify the shell you are using by typing:
echo $SHELL
$SHELL is a variable
Scripts
Scripts let us put together a sequence of commands that can then be run
We can run a script by typing:
source script.sh
source runs the commands in your current shell environment
Alternatively, you can make the shell file (.sh) executable
Scripts
#!/bin/bash
echo Hello World
Scripts
#!/bin/bash
echo Hello World
echo Goodbye World
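Both ways of running a script can be compared with the sketch below, which assumes bash. It writes the two-line script to script.sh using a here-document (a construct not covered in this session, used here only to create the file) and then runs it with source and as an executable.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Create script.sh; everything up to the closing EOF goes into the file.
cat > script.sh <<'EOF'
#!/bin/bash
echo "Hello World"
echo "Goodbye World"
EOF

# Option 1: run the commands in the current shell.
source script.sh

# Option 2: give the owner execute permission, then run the file itself.
# 7 = rwx for the user; 5 = r-x for group and others.
chmod 755 script.sh
./script.sh
```

With option 2 the #!/bin/bash line tells the system which program should interpret the file, so the script runs in a new bash process and not in your current shell.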
Exercise 6 In this exercise, we'll look at scripts that run commands we've already discussed We'll also review checking permissions to see if files are executable
More information http://www.doc.ic.ac.uk/~wjk/unixintro/ http://www.ee.surrey.ac.uk/teaching/unix/ Also many books and online resources
Feedback Please complete and return the feedback form before you leave
Acknowledgements Naveed Khan Tony Travis Eduardo Alves Mel McCann