Lecture 2: Our first Python program
Some quick comments:
- This definitely isn't a perfect Python program. In fact, one of the first homework problems is to find some problems with it.
- It'll work for us, though, and it's (hopefully!) fairly readable and straightforward. For instance, here's a one-line and significantly less readable version of the same function:
    def short_collatz(n):
        import math
        return (n == 1 and [1]) or \
            ([n] + short_collatz(int(abs(math.sin(math.pi*n/2)))*(3*n+1) + int(abs(math.cos(math.pi*n/2)))*(n/2)))
First walkthrough
Let's look at a few things that jump out at us right away:
- the first line looks scary, so we'll ignore it.
- there seem to be two big chunks: the "def collatz(n)" and "if __name__ == '__main__':" sections.
- stuff between """s and after a # isn't code
- the % seems to do several different things
- [5] + ... doesn't do normal addition.
(Ask everyone what else jumps out at them.)
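To make two of those observations concrete, here is a small sketch (not taken from the lecture's program; the values are made up for illustration) of % playing two different roles and of + joining lists instead of adding numbers:

```python
n = 7

remainder = n % 2              # between numbers, % gives the remainder
message = "%d is odd" % n      # after a string, % fills in the placeholder
joined = [5] + [16, 8]         # between lists, + glues them end to end

print(remainder)   # 1
print(message)     # 7 is odd
print(joined)      # [5, 16, 8]
```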
If you've programmed in other languages, here are some other things that probably jump out at you:
- There aren't any braces around blocks or parentheses around conditions
- == is probably equality testing
- def is used to define functions
- no datatypes anywhere
- no explicit variable declarations
Now let's go through some of these in more detail ...
Functions
- def is used to define functions. The syntax is def <name>(<arguments>):, which is all we need to know for now. The statements that make up the function are called the body of the function.
- You specify the body of the function by simply indenting -- there are no {} or begin/end to worry about.
- Save yourself some trouble: indents should be 4 spaces, always. No tabs.
- You use the return keyword to tell the function what to send back as an answer.
- Functions can call themselves, which is called recursion. (Side note: you recur, not recurse. The latter means "to curse again," though maybe it's common enough that both definitions are accepted nowadays.)
- Even if you don't tell the function what you want to return, it will return something for you. There's a special value in Python called None that serves several purposes -- one of which is as a return value for functions that don't specify one. (Note that this is different from many other languages, which make a distinction between procedures, which do not return a value, and functions, which do.)
- You execute (or call) a function by using its name and parentheses after the name, with the arguments to the function in the parentheses, separated by commas.
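Here is a minimal sketch that pulls those points together. It is written in the spirit of the collatz function but is not the lecture's actual program; the names collatz_sketch and no_return are made up for this example.

```python
def collatz_sketch(n):
    """Return the Collatz sequence starting at n, as a list ending in 1."""
    if n == 1:
        return [1]
    if n % 2 == 0:
        # Even: halve it. // is integer division in both Python 2 and 3.
        return [n] + collatz_sketch(n // 2)
    # Odd: triple it and add one. The function calls itself (recursion).
    return [n] + collatz_sketch(3 * n + 1)


def no_return(n):
    # The body has no return statement, so a call hands back None.
    n + 1


print(collatz_sketch(6))  # [6, 3, 10, 5, 16, 8, 4, 2, 1]
print(no_return(3))       # None
```

Each call uses the function's name followed by its arguments in parentheses, and the body is marked only by its 4-space indent.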
Variables
Notice that you don't have to declare a variable before you use it -- you just start using it by assigning it a value. If you try to refer to a variable before it has been assigned one, Python yells at you (with a NameError). (Don't do that.)
Now, there's another difference between variables in Python and most other languages that's more fundamental. In most languages, when you declare a variable, you also declare the type of that variable. For instance, in C, saying int foo; means that until we say otherwise, the name foo will always point to an object of type int. On the other hand, one can think of the type of a Python object as changing much more fluidly. (This isn't literally true, but we'll get into the nitty-gritty later.) So for example, something like this:
    def guess_my_type():
        if the_hills_are_alive_with_the_sound_of_music():
            x = 5
        else:
            x = "hi mom"
        return x
makes it really hard to pin down exactly what type is going to get returned.
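A small sketch of both points, assuming nothing beyond the built-ins: the same name can refer to values of different types over time, and using a name that was never assigned raises a NameError. The name never_assigned is hypothetical and deliberately left undefined.

```python
x = 5
first_type = type(x).__name__    # 'int'
x = "hi mom"                     # the same name now refers to a string
second_type = type(x).__name__   # 'str'
print("%s, then %s" % (first_type, second_type))

try:
    never_assigned   # hypothetical name that has not been given a value
    complained = False
except NameError:
    complained = True
print(complained)   # True
```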
if/else
The if/else is an extremely fundamental construction, and luckily, one that's fairly easy to understand. The general shape is this:
    if <some condition>:
        <do this stuff>
    elif <some other condition>:
        <do a different thing>
    else:
        <do this other stuff>
(where the <> do not literally appear in the source code). elif is an abbreviation for else if, and you can have as many or as few of them as you'd like. The else clause is also optional.
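As a concrete instance of that shape, here is a short sketch; the function name classify and the strings it returns are made up for illustration.

```python
def classify(n):
    """Say something about a positive integer n."""
    if n == 1:
        return "one"
    elif n % 2 == 0:
        return "even"
    else:
        return "odd"


print(classify(1))  # one
print(classify(4))  # even
print(classify(7))  # odd
```

Only the first branch whose condition is true runs, which is why 1 is reported as "one" rather than "odd".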
Intuitively, you should think of the conditions as being a question with a "yes" or "no" answer -- something like "Is n == 1?" Instead of "yes" and "no," we use the values True and False, which are called Boolean values (named for George Boole, hence the capital "B"). We can test this from the command line (and introduce a few new operators at the same time):
    >>> n = 5
    >>> n == 1
    False
    >>> n == 5
    True
    >>> n > 3
    True
    >>> n > 5
    False
    >>> n >= 5
    True
    >>> 2*n < 100
    True
    >>> n != 3
    True
Most of the operators you would expect are there: == for equal, != for not equal, < for less than, <= for less than or equal to, etc. The Python documentation lists all of them.
Of course, people usually want to ask compound questions: so we have the usual gamut of logical operators, namely and, or, and not. There are clever tricks one can play with these (generally referred to as "short-circuiting"), which we'll talk about more at some point.
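As a preview, here is a rough sketch of what short-circuiting means: Python stops evaluating an and/or expression as soon as the answer is settled, and hands back the operand that settled it. The helper noisy is made up for this example; it just records that it was called.

```python
def noisy(value, log):
    """Record that this call happened, then hand value back unchanged."""
    log.append(value)
    return value


log = []

# "and" stops at the first false operand, so the second call never runs.
result_and = noisy(False, log) and noisy("never reached", log)

# "or" stops at the first true operand, so again the second call never runs.
result_or = noisy([1], log) or noisy("also never reached", log)

print(result_and)  # False
print(result_or)   # [1]
print(log)         # [False, [1]]
```

This is the trick the one-line short_collatz leans on: (n == 1 and [1]) gives [1] when n is 1, and otherwise gives False so that the part after "or" gets evaluated.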
There's another aspect of Booleans in Python that's particularly useful: Python objects can decide for themselves whether they're true or false. For instance:
    >>> if 2: print "hi"
    ...
    hi
    >>> if 0: print "bye"
    ...
    >>>
This is partially a historical thing -- C, for example, has no native Boolean type (well, didn't until recently), and it simply uses 0 for false and anything else for true. Most native Python objects have a pretty sane set of defaults for what's true and what's not, and as we'll see soon enough, it's completely customizable -- for new kinds of objects you create yourself, you can decide what's true and what's not.
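To see those defaults for a few built-in values, the built-in bool function reports how a value behaves in an if. This is a small sketch; the particular sample values are chosen just for illustration.

```python
samples = [2, 0, "hi", "", [0], [], None]
verdicts = [bool(value) for value in samples]

for value, verdict in zip(samples, verdicts):
    print("%r is %s" % (value, verdict))
```

Zero, the empty string, the empty list, and None come out false. Everything else in the list comes out true, including [0], because that list is not empty.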
Objects
I've been using the term "object," because it has a fairly intuitive meaning. In fact, it also has a technical meaning in the world of computer science. The notion of object-oriented programming (OOP) is a way of structuring programs that packages together related functions and data into classes and objects -- classes are the templates, and objects are the actual instantiations that exist in the program. We'll talk a lot more about this soon, but I want to stop and mention now that just about everything in Python is an object. (Well, not quite at the Smalltalk/Squeak level, but close.) Numbers, functions, strings, lists -- they're all objects. In particular, it means that the way we customize various kinds of Python behavior is uniform across the board. ("There should be one, and preferably only one, obvious way to do it.")
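A quick way to convince yourself of this is to ask a few very different values what they are. This is only a sketch; the function double and the list things are made up for the example.

```python
def double(x):
    return 2 * x


things = [5, "hi mom", [3, 7, 8], double]
type_names = [type(thing).__name__ for thing in things]
all_objects = all(isinstance(thing, object) for thing in things)

print(type_names)   # ['int', 'str', 'list', 'function']
print(all_objects)  # True

# Because they are objects, even plain values carry their own methods.
shouted = "hi mom".upper()
print(shouted)      # HI MOM
```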
Lists
So what's with the [ and ] around 1 in the first two lines of the program? The brackets denote a list, which is one of the most basic Python types. If you've used arrays in other programming languages, they're probably similar but not exactly the same. If you've never used another programming language, they're probably exactly what you think -- just long collections of things. You specify them by just listing the elements, separated by commas. There's an important built-in function called range:
    >>> [3, 7, 8]
    [3, 7, 8]
    >>> range(10)
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    >>> range(3,5)
    [3, 4]
    >>> range(2,7,3)
    [2, 5]
You can ask Python for more info about a function:
    >>> help(range)
    Help on built-in function range in module __builtin__:

    range(...)
        range([start,] stop[, step]) -> list of integers

        Return a list containing an arithmetic progression of integers.
        range(i, j) returns [i, i+1, i+2, ..., j-1]; start (!) defaults to 0.
        When step is given, it specifies the increment (or decrement).
        For example, range(4) returns [0, 1, 2, 3].  The end point is omitted!
        These are exactly the valid indices for a list of 4 elements.
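One caveat if you are following along in Python 3 (an assumption; the sessions in these notes come from Python 2): there, range gives back a lazy range object rather than a list. Wrapping it in list() produces the same lists in either version, as this sketch shows.

```python
first_ten = list(range(10))
three_to_five = list(range(3, 5))
by_threes = list(range(2, 7, 3))

print(first_ten)      # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(three_to_five)  # [3, 4]
print(by_threes)      # [2, 5]
```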
Lists are handy -- you can add them together, ask if something is a member of a list, or repeat a list several times.
    >>> [3,5] + [2,4]
    [3, 5, 2, 4]
    >>> 5 in range(10)
    True
    >>> [3,5] * 4
    [3, 5, 3, 5, 3, 5, 3, 5]
Now here's a real difference from (many) other programming languages -- lists can have all different kinds of stuff in them. (So they're "heterogeneous.") They can also have nothing in them, and empty lists are "false."
    >>> [3, range(7), 'the moon']
    [3, [0, 1, 2, 3, 4, 5, 6], 'the moon']
    >>> []
    []
    >>> if []: print "stuff!"
    ...
    >>>
Since lists are such a "core" type in Python, they have a ton of available functionality built-in. You can see it as follows:
    >>> ls = range(5)
    >>> dir(ls)
    ['__add__', '__class__', '__contains__', '__delattr__', '__delitem__',
     '__delslice__', '__doc__', '__eq__', '__format__', '__ge__',
     '__getattribute__', '__getitem__', '__getslice__', '__gt__', '__hash__',
     '__iadd__', '__imul__', '__init__', '__iter__', '__le__', '__len__',
     '__lt__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__',
     '__repr__', '__reversed__', '__rmul__', '__setattr__', '__setitem__',
     '__setslice__', '__sizeof__', '__str__', '__subclasshook__', 'append',
     'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort']
For now, you can just ignore the things that start with __ (called "dunder" methods, short for "double underscore"), and in fact IPython and Sage will hide them for you by default.
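The names without underscores are the ones you'll use day to day. Here is a short sketch that tries a handful of them on a small list; the particular sequence of calls is just for illustration.

```python
ls = list(range(5))   # [0, 1, 2, 3, 4]
ls.append(10)         # add to the end:      [0, 1, 2, 3, 4, 10]
ls.reverse()          # reverse in place:    [10, 4, 3, 2, 1, 0]
last = ls.pop()       # remove and return the final element, 0
ls.sort()             # sort in place:       [1, 2, 3, 4, 10]

print(ls)             # [1, 2, 3, 4, 10]
print(last)           # 0
print(ls.index(3))    # position of the value 3: 2
print(ls.count(10))   # how many times 10 appears: 1
```

Note that append, reverse, and sort change the list itself and return None, so there is nothing useful to assign from them.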