Lecture 4, 4/5/10
Administrative stuff
- Overloads: anyone waiting? running into trouble?
- Grading: Peer grading is a go! Handout Wednesday, who to submit to by tonight.
- Homework: It's due Wednesday. How's it going for everyone?
UNIX
The modern command line really began with UNIX, so I'll start by saying a few words about UNIX itself. It's hard to overstate how important UNIX has been to the development of computing. As a random data point, here's a list of some of the modern OSes based on UNIX:
- Linux
- Mac OSX, NeXTSTEP
- FreeBSD, OpenBSD, NetBSD
- Solaris, HP-UX, AIX
- Android, Maemo, Palm's webOS
- ...
The name UNIX has a long and sordid history, but nowadays generally refers to anything in this family of operating systems. Here are some operating systems that aren't "direct" descendants:
- Windows 7
- Windows Vista
- Windows XP
- Windows 2000
- Windows NT
- Windows Mobile, Windows Phone 7 Series ...
UNIX succeeded for a lot of reasons, one of which was definitely being in the right place at the right time. However, there's an underlying core philosophy that has made UNIX appealing since its very inception. Doug McIlroy (one of the original UNIX wizards) summed it up as follows:
- This is the Unix philosophy: Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface.
It's hard to get all of the UNIX philosophy into three sentences, but this comes darn close. In particular, one of the key things that you should take away from this is the following: a UNIX environment consists of a number of small and easily composable tools.
Some great quotes
- The number of UNIX installations has grown to 10, with more expected. -- UNIX Programmer's Manual, 1972
- Unix is simple. It just takes a genius to understand its simplicity. -- Dennis Ritchie
- UNIX was not designed to stop its users from doing stupid things, as that would also stop them from doing clever things. -- Doug Gwyn
- Unix never says "please." -- Rob Pike
- Unix is user-friendly. It just isn't promiscuous about which users it's friendly with. -- Steven King
- Those who don't understand UNIX are condemned to reinvent it, poorly. -- Henry Spencer
Getting to the command line on your machine
If you're using Linux, just open a terminal.
If you're using Mac OSX, you just open a terminal -- but if you've never done this before, go to Applications -> Utilities -> Terminal.app, and you're good to go.
If you're using Windows, your best bet is to delete it and install something else -- er, I mean, install Cygwin. (I'm mostly kidding -- Windows actually has a number of amazing features. It just doesn't shine when I'm trying to spend an hour talking about using the command line.)
Let's go!
Okay, so now we're all at a command line. In general, the prompt at your terminal will be something complicated -- I've customized mine, and it looks like this:
[craigcitro@sharma ~] $
(If you'd like to play with customizing your prompt later, it's controlled by the environment variable PS1 -- if this doesn't mean anything, either ask me, google it, or just don't worry about it.) But that's way too cumbersome to type, and doesn't look like quite what you'll see -- so I'm going to abbreviate it to just
$
Two quick notes for talking about traversing filesystems: . always refers to the current directory, and .. always refers to the directory one level above where we are. Also, / refers to the root directory -- that is, the root of the filesystem. Here's an example of how this is relevant:
$ pwd
/sage/devel/sage
$ cd .
$ pwd
/sage/devel/sage
$ cd ..
$ pwd
/sage/devel
Play around a little, and this'll feel like second nature soon enough.
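If you'd like a safe place to play, here's a minimal sketch that builds a throwaway directory tree and walks around in it. The directory names (a, b) and the variable names are made up for illustration, and it assumes mktemp is available on your machine.

```shell
# Build a throwaway tree to walk around in (names are hypothetical).
scratch=$(mktemp -d)
mkdir -p "$scratch/a/b"

cd "$scratch/a/b"
here=$(pwd)       # we are in .../a/b

cd .              # . is the current directory, so nothing changes
same=$(pwd)

cd ..             # .. is one level up
parent=$(pwd)     # now in .../a

cd /              # / is the root of the filesystem
root=$(pwd)

echo "$here"
echo "$same"
echo "$parent"
echo "$root"
```

The first two lines printed should be identical, the third should be one level shorter, and the last should be a lone /.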
The shell is just like any other interpreter you've ever used -- you input commands, and it performs them and reports back as necessary. In the case of a shell, the most common commands are of two types: running programs, or moving around in the file hierarchy.
First, and most importantly, let's talk about how the shell finds the commands you ask it to execute. If you're at the command line, and you type
$ awesome_new_program
the shell will look in a specific list of directories for a file called awesome_new_program that it's allowed to run. That list is stored in an environment variable called PATH, which you can print as follows:
$ echo $PATH
/Users/craigcitro/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/texbin:/usr/X11/bin:/usr/local/cuda/bin:/Library/Frameworks/HaskellPlatform.framework/bin
(Your shell stores a whole bunch of environment variables for you; you can type env to see them all, or set to see them along with the shell's own variables and functions. We'll talk more about them at some later point.)
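That colon-separated PATH is hard to read as one long line. Here's a small sketch that prints one entry per line, in the order the shell searches them; the variable name entries is just for illustration.

```shell
# Print each PATH entry on its own line, in search order.
echo "$PATH" | tr ':' '\n'

# Count the entries (tr -d strips the padding some versions of wc add).
entries=$(echo "$PATH" | tr ':' '\n' | wc -l | tr -d ' ')
echo "PATH has $entries entries"
```

This is also a first taste of pipes, which come up again later in these notes.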
Now, here's an interesting thing that you may have noticed: the current directory (.) didn't appear anywhere on that list. Of course, you can add it, but this is generally deemed a bad idea, mostly for security reasons: a file named like a common command, sitting in whatever directory you happen to be in, could get run instead of the real thing. That means that if you're trying to run a program in the current directory, you can't just use its name, even though it's "right there in front of you." You have to be explicit -- like this:
$ ls
foo foo.c
$ foo
-bash: foo: command not found
$ ./foo
Hello world!
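You can reproduce this without a C compiler. The sketch below uses a tiny shell script as a stand-in for the compiled program; the file name hello_from_here is made up, and it assumes your temporary directory allows running scripts.

```shell
scratch=$(mktemp -d)
cd "$scratch"

# A tiny stand-in for a compiled program (hypothetical name).
printf '#!/bin/sh\necho "Hello world!"\n' > hello_from_here
chmod +x hello_from_here          # mark it as something we're allowed to run

# The bare name is only found if some PATH entry contains it.
if ! command -v hello_from_here >/dev/null 2>&1; then
    echo "hello_from_here: not found on PATH"
fi

# Being explicit about the directory always works.
out=$(./hello_from_here)
echo "$out"
```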
Now, let's say there's more than one program called foo in your path -- how do you tell which one is being run? The shell will go through the entries in PATH in order, so you could check them yourself -- or just ask the shell to tell you:
$ which ls
/bin/ls
$
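For the curious, "checking them yourself" looks roughly like the sketch below. It assumes a Bourne-style shell such as sh or bash, and it's a simplification of what the shell really does (for instance, a real shell treats an empty PATH entry as the current directory). The variable names are made up.

```shell
# Walk the PATH entries in order and stop at the first executable named ls.
found=""
old_ifs=$IFS
IFS=:                              # split $PATH on colons
for dir in $PATH; do
    if [ -x "$dir/ls" ] && [ ! -d "$dir/ls" ]; then
        found="$dir/ls"
        break                      # first match wins, just like in the shell
    fi
done
IFS=$old_ifs

echo "first ls on PATH: $found"
```

On most machines this should agree with what which ls reports.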
Ten commands you must know
There are a million UNIX commands out there, and no one ever remembers every single one. Here's my personal list of commands I couldn't go a day without:
ls
cd
cp/rm
man
emacs (or your editor of choice -- vim, pico, ed ...)
cat/less
grep
find/locate/mdfind
ssh/scp
which
and a few more I couldn't last a week without:
wget/curl
ps
kill
top
head/tail
screen
sed/awk
alias
xargs
sort
uniq
apropos
I can't get to all of these today, but if you're curious, I'll give you a lead on finding out more.
RTFM
By far the most important command on that list is the one command that helps you find out more: man, which stands for manual. It's simple to use -- just pick your favorite command, and call man with that as the only argument:
$ man ls
<stuff>
$ man python
<other stuff>
$ man man
<stuff about finding more stuff>
Any time you have a question, or you're confused, the first thing you should do is check the man page.
Filesystem stuff
Here are the basic commands for snooping around the filesystem:
ls: list the files in this directory
cd: change directory
pwd: where am I?
mkdir: make new directory
cp: make a copy of a file
mv: move a file
rm: remove a file
These are all pretty straightforward, at least for basic use. Of course, they've all picked up quite a bit of sophistication over the years. For instance, here's the basic ls:
[craigcitro@sharma /sage] $ /bin/ls
COPYING.txt     install.log     sage-README-osx.txt
README.txt      ipython         sage-python
data            local           spkg
devel           makefile        test.log
examples        sage            tmp
and here's what I see on my machine:
[craigcitro@sharma /sage] $ l
total 14660
   72 COPYING.txt       4 makefile
   12 README.txt        4 sage*
    0 data/             4 sage-README-osx.txt
    0 devel/            4 sage-python*
    0 examples/         0 spkg/
14372 install.log     188 test.log
    0 ipython/          0 tmp/
    0 local/
If you want to know about more options for ls, check out the man page. For the curious, here's what I use: on Mac/BSD, it's ls -sFG, and on Linux, it's ls -BhFvs --color=auto.
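To get a feel for the basic filesystem commands listed above, here's a minimal sketch you can run in a scratch directory. The file and directory names (notes, a.txt, and so on) are made up for illustration.

```shell
scratch=$(mktemp -d)
cd "$scratch"

mkdir notes                     # make a new directory
echo "hello" > notes/a.txt      # create a file to play with
cp notes/a.txt notes/b.txt      # make a copy of it
mv notes/b.txt notes/c.txt      # move (rename) the copy
rm notes/a.txt                  # remove the original

listing=$(ls notes)             # only c.txt should be left
echo "$listing"
```

Note that none of mkdir, cp, mv, or rm prints anything when it succeeds.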
If you call cd with no arguments, it takes you back to your home directory. I find that I often do this by accident -- but here's a really cool trick: cd - changes back to the directory you were most recently in. This is incredibly useful:
$ pwd
/sage/devel/sage-main/sage/rings/polynomial
$ cd
$ pwd
/Users/craigcitro
$ cd -
/sage/devel/sage-main/sage/rings/polynomial
$
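Here's the same trick as a small script you can run anywhere. The directory names are made up, and it hops to / rather than your home directory so that it doesn't depend on how your account is set up.

```shell
scratch=$(mktemp -d)
mkdir -p "$scratch/deep/inside/a/tree"

cd "$scratch/deep/inside/a/tree"
start=$(pwd)

cd /                  # wander off somewhere else (a bare cd would go home)
cd - >/dev/null       # jump back; cd - prints the directory, so silence it
back=$(pwd)

echo "started in: $start"
echo "back in:    $back"
```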
You'll notice that most UNIX commands are fairly quiet -- they often don't produce output if everything goes right. This is disconcerting at first if you're not expecting it. However, much of UNIX is built on the idea of chaining things together -- if everything produced output, it'd just be too unwieldy.
Composability
As mentioned above, one of the most important parts of the UNIX philosophy is the ability to compose the small programs that each do one thing well. The most fundamental way to compose them is by using pipes, which are written |. Pipes are really just function composition: in principle, saying foo | bar runs the command foo, records the output, and then calls bar -- passing the output from foo as the input to bar.
Here's a simple example, using another wildly useful UNIX utility: grep. The basic use of grep is easy to explain: doing grep "def" myfile.py will show you all lines in the file myfile.py that contain the string def. So let's say you have a program that prints out a bunch of output, and you only care about the lines that contain the string pickle:
$ pantry_contents
... lots of stuff ...
$ pantry_contents | grep pickle
1 jar of pickles
1 jar of pickled pig's feet
$
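If you want to try this yourself, here's a sketch with a stand-in for pantry_contents. The shell function and the non-pickle items in it are invented for illustration; grep -c is a standard option that counts matching lines instead of printing them.

```shell
# A made-up stand-in for the pantry_contents program.
pantry_contents() {
    printf '%s\n' \
        "2 cans of beans" \
        "1 jar of pickles" \
        "3 boxes of pasta" \
        "1 jar of pickled pig's feet"
}

# Keep only the lines that contain the string pickle.
matches=$(pantry_contents | grep pickle)
echo "$matches"

# Count the matching lines instead of printing them.
count=$(pantry_contents | grep -c pickle)
echo "$count lines mention pickle"
```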
Of course, once you start combining a bunch of these, it gets way more exciting:
[craigcitro@sharma /sage/devel/sage-main] $ find . | grep -E "(py|pyx)$" | wc -l
3899
[craigcitro@sharma /sage/devel/sage-main] $ find . | grep -E "(py|pyx)$" | xargs -n 1 cat | sort | uniq | wc -l
403291
The first of those is the number of files in the Sage codebase whose names end in py or pyx, and the second is the number of unique lines of code across those files.
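One detail in that second pipeline is worth seeing in miniature: uniq only collapses duplicate lines that are next to each other, which is why sort comes first. Here's a sketch on six made-up lines; the function name sample and its contents are purely illustrative.

```shell
# Six made-up lines, with duplicates that are not adjacent.
sample() {
    printf '%s\n' "import os" "x = 1" "import os" "" "x = 1" "y = 2"
}

# tr -d ' ' strips the padding some versions of wc add.
total=$(sample | wc -l | tr -d ' ')
adjacent_only=$(sample | uniq | wc -l | tr -d ' ')
unique=$(sample | sort | uniq | wc -l | tr -d ' ')

echo "total lines:        $total"           # 6
echo "uniq without sort:  $adjacent_only"   # 6, no duplicates are adjacent
echo "sort then uniq:     $unique"          # 4, counting the blank line
```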
There are two other related things worth mentioning here. Often, you have a bunch of output spewing from some program, and you want to just save it to a file -- foo >output.txt will redirect the output of foo to the file output.txt. Similarly, doing foo <input.txt will run the program foo, passing the contents of input.txt as input to the program. (The first is insanely useful, and the second is often useful, too.)
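Here's a minimal sketch of both kinds of redirection together, using sort as the program. The file names input.txt and output.txt follow the text above; the fruit names are made up.

```shell
scratch=$(mktemp -d)
cd "$scratch"

# > saves a program's output to a file instead of the terminal.
printf '%s\n' banana apple cherry > input.txt

# < feeds a file to a program as its input; the two can be combined.
sort < input.txt > output.txt

first=$(head -n 1 output.txt)
echo "first line of output.txt: $first"
```

Be aware that > overwrites output.txt if it already exists.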
References
These are two fairly old, but classic, books on the UNIX environment:
The UNIX Programming Environment, by Kernighan and Pike
The Design of the UNIX Operating System, by Bach