Math 480 -- 2008-04-18: Prime and Factorization system:sage

Prime Numbers

A prime number is a positive integer divisible by only itself and 1. {{{id=8| for n in [1..20]: print '%-4s%-7s'%(n, is_prime(n)) /// 1 False 2 True 3 True 4 False 5 True 6 False 7 True 8 False 9 False 10 False 11 True 12 False 13 True 14 False 15 False 16 False 17 True 18 False 19 True 20 False }}} {{{id=10| @interact def pr(max=(10..1000)): v = prime_range(max) print v show(points([(x,0) for x in v], pointsize=10), xmin=0, ymin=-0.1, ymax=0.1, figsize=[10,2]) ///
max
}}} Finding the next prime after a given prime is quite fast {{{id=11| time next_prime(10^100, proof=False) /// 10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000267 CPU time: 0.01 s, Wall time: 0.01 s }}} {{{id=12| time next_prime(10^100, proof=True) /// 10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000267 CPU time: 0.28 s, Wall time: 0.33 s }}} {{{id=14| # Let's make a ``personal prime'' time next_prime(101010101010101010101010101010101010101010101010) /// 101010101010101010101010101010101010101010101039 Time: CPU 0.02 s, Wall: 0.02 s }}}

By the way, it was an open problem for a very long time to give a deterministic polynomial time algorithm for primality. This was done a few years ago finally. See http://en.wikipedia.org/wiki/AKS_primality_test.

This is not a function in Sage though, since there are better in practice nondeterministic primality tests, especially for numbers that aren't too big.

The distribution of prime numbers is a problem that has fascinated people for a very long time.

But what the heck does distribution mean?! There are many possible interpretations.

One way to measure the distribution is to simply count the primes up to $x$: $$ \pi(x) = \#\{\text{primes } p : p \leq x\}. $$ {{{id=17| @interact def primecount(m=(1..1000)): n = 100*m print "Primes up to %s"%n show(plot(prime_pi, 1, n), xmin=0) ///
m
}}}

That curve above looks pretty smooth. It's natural to wonder what it is. Maybe it's just really complicated since there isn't a known easy way to find primes?!

In fact, there is a one hundred thousand dollar (100,000) cash prize for finding a prime with at least ten million (10,000,000) decimal digits. See http://w2.eff.org/awards/coop-prime-rules.php. (Note: a 50,000 dollar award was given in 2000 for a million digit prime.)

Surprisingly $$ \pi(x) \sim x/\log(x) $$ {{{id=21| @interact def primecount(m=(1..1000)): n = 100*m; var('x') print "Primes up to %s"%n show(plot(prime_pi, 2, n) + plot(x/log(x), 4, n, color='red') , xmin=0, figsize=[9,3]) ///
m
}}}

In fact it is a theorem (the prime number theorem) that $$ \lim_{x\to\infty} \pi(x) / (x/\log(x)) = 1. $$

Actually $x/\log(x)$ isn't that good, though it looks superb in the picture above... {{{id=28| prime_pi(10^7) /// 664579 }}} {{{id=29| float(10^7/log(10^7)) /// 620420.68843321688 }}}

Let $$ \text{Li}(x) = \int_{2}^x \frac{1}{\log(t)} dt $$ This is just the area under the curve of the graph of $1/log(t)$, which is somehow a more precise way of counting. {{{id=35| @interact def primecount(m=(1..1000)): n = 100*m; var('x') print "Primes up to %s"%n show(plot(prime_pi, 2, n) + plot(Li, 4, n, color='red') , xmin=0, figsize=[9,3]) ///
m
}}}

This is really good! {{{id=34| prime_pi(10^7) /// 664579 }}} {{{id=33| Li(10^7) /// 664917.35989 }}}

It is a conjecture (the Riemann Hypothesis!) that for all $x\geq 2.01$, we have $$ |\pi(x) - \text{Li}(x)| \leq \sqrt{x}\cdot \log(x). $$ That is square root accurate, which is amazingly good.

Oh, by the way, you get a million dollars if you can prove the above inequality... http://www.claymath.org/millennium/ {{{id=39| /// }}} {{{id=38| /// }}}

THEOREM: Every positive integer can be written uniquely as product of primes. Proof: Surprisingly not trivial. Basically use the Euclidean algorithm to prove a special divisibility property, namely "if $p$ divides $a*b$, then $p$ divides a or $p$ divides $b$". The rest is straightforward. {{{id=42| import random def ftree(rows, v, i, F): if len(v) > 0: # add a row to g at the ith level. rows.append(v) w = [] for i in range(len(v)): k, _, _ = v[i] if k is None or is_prime(k): w.append((None,None,None)) else: d = random.choice(divisors(k)[1:-1]) w.append((d,k,i)) e = k//d if e == 1: w.append((None,None)) else: w.append((e,k,i)) if len(w) > len(v): ftree(rows, w, i+1, F) def draw_ftree(rows,font): g = Graphics() for i in range(len(rows)): cur = rows[i] for j in range(len(cur)): e, f, k = cur[j] if not e is None: if is_prime(e): c = (1,0,0) else: c = (0,0,.4) g += text(str(e), (j*2-len(cur),-i), fontsize=font, rgbcolor=c) if not k is None and not f is None: g += line([(j*2-len(cur),-i), ((k*2)-len(rows[i-1]),-i+1)], alpha=0.5) return g @interact def factor_tree(n=100, font=(10, (8..20)), redraw=['Redraw']): n = Integer(n) rows = [] v = [(n,None,0)] ftree(rows, v, 0, factor(n)) show(draw_ftree(rows, font), axes=False) ///
n
font
redraw
}}} {{{id=44| /// }}} {{{id=48| n = next_prime(10^28) * next_prime(10^29) /// }}} {{{id=45| time factor(n) /// 10000000000000000000000000331 * 100000000000000000000000000319 CPU time: 10.84 s, Wall time: 11.67 s }}} Integer factorization in Sage:
  1. The factor command. This calls PARI's factor command, which implements the standard modern factoring algorithms including the elliptic curve method and the quadratic sieve.
  2. The qsieve command. This calls Sage's (Bill Hart's) quadratic sieve implementation of the quadratic sieve. The sieve reduces factoring to an linear algebra problem. This implementation is better than the one in PARI.
  3. The ecm command. This calls Sage's GMP-ECM (Paul Zimmerman), which is better than the ECM implementation in PARI.


The Sage factor command never calls qsieve or ecm. It should. Fixing this would be a superb student project. {{{id=47| time qsieve(n) /// ([10000000000000000000000000331, 100000000000000000000000000319], '') CPU time: 0.01 s, Wall time: 6.83 s }}} {{{id=43| n = next_prime(10^14) * next_prime(10^50) /// }}} {{{id=40| time factor(n) /// 100000000000031 * 100000000000000000000000000000000000000000000000151 CPU time: 3.81 s, Wall time: 3.95 s }}} {{{id=50| time ecm.factor(n) /// [100000000000031, 100000000000000000000000000000000000000000000000151] CPU time: 0.01 s, Wall time: 0.45 s }}}

Parallel computation is extremely relevant fast integer factorization, though it's not used at all above. See the DistributedFactor command in Sage... {{{id=53| /// }}} {{{id=51| /// }}} {{{id=52| /// }}} {{{id=49| /// }}}