Using Unix
Introduction
The Unix family of operating systems provides users with a command line interface (CLI) for executing commands and getting things done. These systems typically also provide graphical user interfaces (GUIs), but we won’t go into those here.
The Unix family includes all varieties of Linux as well as macOS, which is built on Darwin, a system derived in part from BSD Unix (including FreeBSD).
The program that you actually interact with at the command line, which reads and runs the commands you type, is called a shell, and there are several shells that you can run on your system. The most common shell in use today is bash, which stands for Bourne Again Shell, since it is an improved version of sh (the Bourne shell). Recent versions of macOS use the Z shell (zsh) by default. The commands in these two shells are mostly the same, but there are subtle differences.
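If you are not sure which shell you have, a quick check along these lines may help. This is only a sketch: the variable names `login_shell` and `candidate` are made up for illustration, and it assumes your system sets the standard `SHELL` environment variable (most do, but it is not guaranteed).

```shell
# SHELL normally holds the path of your login shell; fall back to
# "unknown" if the variable is not set on this system.
login_shell="${SHELL:-unknown}"
echo "Login shell: $login_shell"

# Report which of a few common shells are installed.
for candidate in bash zsh sh; do
  if command -v "$candidate" >/dev/null 2>&1; then
    echo "$candidate is installed at $(command -v "$candidate")"
  fi
done
```

On a recent Mac the first line will usually end in zsh, and on most Linux systems in bash.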
Windows has shells for its command line interface too. The traditional default is the Command Prompt (cmd.exe), which descends from DOS, but it also has PowerShell as an advanced (and very capable) option.
For more information, check out these resources:
- UVA Research Computing’s Unix tutorial.
- Cameron Newham, 2005, Learning the bash Shell, O’Reilly Media.
- Jeroen Janssens, 2021, Data Science at the Command Line, O’Reilly Media.
- Neal Stephenson, 1999, In the Beginning Was The Command Line. (PDF version.)
Basic Commands
In this course, you don’t need to know very many Unix shell commands, but you should be comfortable working from the command line to perform basic tasks. This is because some things can only be performed from the command line, such as installing some essential software. Here is a list of basic commands.
Navigating filesystems and managing directories:
- cd – change directory
- pwd – show the current directory
- ln – make links and symlinks to files and directories
- mkdir – make new directory
- rmdir – remove directories
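Here is a short session that exercises these directory commands. It is a sketch, not a recipe: the names `demo`, `notes`, `project`, `raw`, and `latest` are made up, and everything happens inside a throwaway directory created with `mktemp -d` so your real files are not touched. It also uses `ls`, which is described in the next list.

```shell
# Create a scratch directory and move into it.
demo=$(mktemp -d)
cd "$demo"
pwd                        # prints the path of the scratch directory

mkdir notes                # make a new, empty directory
mkdir -p project/data/raw  # -p also creates any missing parent directories

cd project/data            # paths are relative to the current directory
pwd

ln -s raw latest           # create a symlink "latest" that points at "raw"
ls -l                      # the listing shows: latest -> raw

cd "$demo"                 # go back to the top of the scratch directory
rmdir notes                # rmdir only removes directories that are empty
```

Running `cd` with no argument takes you back to your home directory.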
Navigating filesystems and managing files and access permissions:
- ls – list files and directories
- cp – copy files
- rm – remove files and directories
- mv – rename or move files and directories to another location
- chmod – change file/directory access permissions
- chown – change file/directory ownership
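The file commands above can be tried out in the same way. Again, this is a minimal sketch with made-up file names (`notes.txt`, `backup.txt`, `archive.txt`, `old`) in a scratch directory. The `chown` line is left as a comment because changing ownership usually requires administrator rights, and the user name `alice` is hypothetical.

```shell
# Work in a throwaway directory.
demo=$(mktemp -d)
cd "$demo"

echo "hello" > notes.txt     # create a small file to work with
cp notes.txt backup.txt      # copy a file
mv backup.txt archive.txt    # rename a file
mkdir old
mv archive.txt old/          # move a file into another directory
ls -l                        # list what is here, with permissions and owners

chmod 600 notes.txt          # owner may read and write; no access for others
ls -l notes.txt              # the permissions column now reads -rw-------

# sudo chown alice notes.txt # hypothetical: hand the file to user "alice"

rm old/archive.txt           # remove a file (there is no undo)
rm -r old                    # -r removes a directory and everything inside it
```

Be careful with `rm`, especially `rm -r`: deleted files do not go to a trash folder.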
Text file commands
Most important configuration in Unix lives in plain text files. These commands let you quickly inspect files or view logs:
- cat – concatenate files and print their contents to standard output
- more – basic pagination for viewing text files or the output of other commands
- less – an improved pagination tool for viewing text files (better than more)
- head – show the first 10 lines of a text file (you can specify any number of lines)
- tail – show the last 10 lines of a text file (any number can be specified)
- grep – search for patterns in text files
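To see these commands in action, the sketch below builds a small sample log and inspects it. The file name `app.log` and its contents are invented for the example. `more` and `less` are interactive, so they appear only as a comment.

```shell
# Work in a throwaway directory.
demo=$(mktemp -d)
cd "$demo"

# Build a sample log: 20 ordinary lines followed by one error line.
for i in $(seq 1 20); do
  echo "line $i: status ok"
done > app.log
echo "line 21: status ERROR disk full" >> app.log

cat app.log            # print the whole file
head -n 3 app.log      # first 3 lines instead of the default 10
tail -n 2 app.log      # last 2 lines
grep "ERROR" app.log   # only the lines that contain ERROR
grep -c "ok" app.log   # count the lines that contain "ok": prints 20

# less app.log         # page through the file; press q to quit
```

`tail` is especially handy for logs, since the most recent entries are usually at the end.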
Miscellaneous
- clear – clear screen
- history – show history of previous commands
Command Line Cool
Although we will not be using the command line to this degree, you should know that it is a powerful environment for doing data science work. The book Data Science at the Command Line makes the case for using the command line to perform many tasks that we often perform with more resource-intensive (i.e. bloated) tools such as Python and R. At some point early in your data science career, you may want to look at it. The book itself is also a great introduction to data science!
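To give a flavor of what this looks like, here is a small sketch that counts how often each value appears in one column of a CSV file. The file `visits.csv` and its data are invented for illustration. The example also relies on `cut`, `sort`, and `uniq`, which are standard Unix tools but are not in the command lists above.

```shell
# Work in a throwaway directory with a tiny, made-up data set.
demo=$(mktemp -d)
cd "$demo"
cat > visits.csv <<'EOF'
page,visitor
home,ann
about,bob
home,cy
home,ann
docs,bob
EOF

# Skip the header row, keep the first column, then count each distinct
# value and sort so the most frequent one comes first.
counts=$(tail -n +2 visits.csv | cut -d, -f1 | sort | uniq -c | sort -rn)
echo "$counts"         # the first line of output shows 3 visits for "home"
```

The vertical bar (a pipe) sends the output of one command to the next, which is what lets small tools be combined into a complete analysis.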
One last thing: for fun, you may want to read Neal Stephenson’s “In the Beginning Was The Command Line”, a kind of cyberpunk history of the topic. Stephenson, by the way, is the author who coined the term “metaverse” in the novel Snow Crash.