Math 480 -- 2008-04-18: Prime and Factorization
system:sage
Prime Numbers
A prime number is a positive integer divisible by only itself and 1.
{{{id=8|
for n in [1..20]:
print '%-4s%-7s'%(n, is_prime(n))
///
1 False
2 True
3 True
4 False
5 True
6 False
7 True
8 False
9 False
10 False
11 True
12 False
13 True
14 False
15 False
16 False
17 True
18 False
19 True
20 False
}}}
{{{id=10|
@interact
def pr(max=(10..1000)):
v = prime_range(max)
print v
show(points([(x,0) for x in v], pointsize=10), xmin=0, ymin=-0.1, ymax=0.1, figsize=[10,2])
///
}}}
Finding the next prime after a given prime is quite fast
{{{id=11|
time next_prime(10^100, proof=False)
///
10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000267
CPU time: 0.01 s, Wall time: 0.01 s
}}}
{{{id=12|
time next_prime(10^100, proof=True)
///
10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000267
CPU time: 0.28 s, Wall time: 0.33 s
}}}
{{{id=14|
# Let's make a ``personal prime''
time next_prime(101010101010101010101010101010101010101010101010)
///
101010101010101010101010101010101010101010101039
Time: CPU 0.02 s, Wall: 0.02 s
}}}
By the way, it was an open problem for a very long time to give a deterministic
polynomial time algorithm for primality. This was done a few years ago finally.
See http://en.wikipedia.org/wiki/AKS_primality_test.
This is not a function in Sage though, since there are better in practice
nondeterministic primality tests, especially for numbers that aren't too big.
The distribution of prime numbers is a problem that has fascinated people for
a very long time.
But what the heck does distribution mean?! There are many possible
interpretations.
One way to measure the distribution is to simply count the primes up to $x$:
$$
\pi(x) = \#\{\text{primes } p : p \leq x\}.
$$
{{{id=17|
@interact
def primecount(m=(1..1000)):
n = 100*m
print "Primes up to %s"%n
show(plot(prime_pi, 1, n), xmin=0)
///
}}}
That curve above looks pretty smooth. It's natural to wonder what it is.
Maybe it's just really complicated since there isn't a known easy
way to find primes?!
In fact, there is a one hundred thousand dollar (100,000) cash prize for
finding a prime with at least ten million (10,000,000) decimal digits.
See http://w2.eff.org/awards/coop-prime-rules.php. (Note: a 50,000 dollar award was given in 2000 for a million digit prime.)
Surprisingly
$$
\pi(x) \sim x/\log(x)
$$
{{{id=21|
@interact
def primecount(m=(1..1000)):
n = 100*m; var('x')
print "Primes up to %s"%n
show(plot(prime_pi, 2, n) + plot(x/log(x), 4, n, color='red') , xmin=0, figsize=[9,3])
///
}}}
In fact it is a theorem (the prime number theorem) that
$$
\lim_{x\to\infty} \pi(x) / (x/\log(x)) = 1.
$$
Actually $x/\log(x)$ isn't that good, though it looks superb in the picture above...
{{{id=28|
prime_pi(10^7)
///
664579
}}}
{{{id=29|
float(10^7/log(10^7))
///
620420.68843321688
}}}
Let
$$
\text{Li}(x) = \int_{2}^x \frac{1}{\log(t)} dt
$$
This is just the area under the curve of the graph of $1/log(t)$, which is somehow
a more precise way of counting.
{{{id=35|
@interact
def primecount(m=(1..1000)):
n = 100*m; var('x')
print "Primes up to %s"%n
show(plot(prime_pi, 2, n) + plot(Li, 4, n, color='red') , xmin=0, figsize=[9,3])
///
}}}
This is really good!
{{{id=34|
prime_pi(10^7)
///
664579
}}}
{{{id=33|
Li(10^7)
///
664917.35989
}}}
It is a conjecture (the Riemann Hypothesis!) that
for all $x\geq 2.01$, we have
$$
|\pi(x) - \text{Li}(x)| \leq \sqrt{x}\cdot \log(x).
$$
That is square root accurate, which is amazingly good.
Oh, by the way, you get a million dollars if you can prove the above inequality...
http://www.claymath.org/millennium/
{{{id=39|
///
}}}
{{{id=38|
///
}}}
THEOREM: Every positive integer can be written uniquely as product of primes.
Proof: Surprisingly not trivial. Basically use the Euclidean algorithm to
prove a special divisibility property, namely "if $p$ divides $a*b$, then $p$ divides a
or $p$ divides $b$". The rest is straightforward.
{{{id=42|
import random
def ftree(rows, v, i, F):
if len(v) > 0: # add a row to g at the ith level.
rows.append(v)
w = []
for i in range(len(v)):
k, _, _ = v[i]
if k is None or is_prime(k):
w.append((None,None,None))
else:
d = random.choice(divisors(k)[1:-1])
w.append((d,k,i))
e = k//d
if e == 1:
w.append((None,None))
else:
w.append((e,k,i))
if len(w) > len(v):
ftree(rows, w, i+1, F)
def draw_ftree(rows,font):
g = Graphics()
for i in range(len(rows)):
cur = rows[i]
for j in range(len(cur)):
e, f, k = cur[j]
if not e is None:
if is_prime(e):
c = (1,0,0)
else:
c = (0,0,.4)
g += text(str(e), (j*2-len(cur),-i), fontsize=font, rgbcolor=c)
if not k is None and not f is None:
g += line([(j*2-len(cur),-i), ((k*2)-len(rows[i-1]),-i+1)],
alpha=0.5)
return g
@interact
def factor_tree(n=100, font=(10, (8..20)), redraw=['Redraw']):
n = Integer(n)
rows = []
v = [(n,None,0)]
ftree(rows, v, 0, factor(n))
show(draw_ftree(rows, font), axes=False)
///
}}}
{{{id=44|
///
}}}
{{{id=48|
n = next_prime(10^28) * next_prime(10^29)
///
}}}
{{{id=45|
time factor(n)
///
10000000000000000000000000331 * 100000000000000000000000000319
CPU time: 10.84 s, Wall time: 11.67 s
}}}
Integer factorization in Sage:
- The factor command. This calls PARI's factor command, which implements the standard modern factoring algorithms including the elliptic curve method and the quadratic sieve.
- The qsieve command. This calls Sage's (Bill Hart's) quadratic sieve implementation of the quadratic sieve. The sieve reduces factoring to an linear algebra problem. This implementation is better than the one in PARI.
- The ecm command. This calls Sage's GMP-ECM (Paul Zimmerman), which is better than the ECM implementation in PARI.
The Sage factor command never calls qsieve or ecm. It should. Fixing this would be a superb student project.
{{{id=47|
time qsieve(n)
///
([10000000000000000000000000331, 100000000000000000000000000319], '')
CPU time: 0.01 s, Wall time: 6.83 s
}}}
{{{id=43|
n = next_prime(10^14) * next_prime(10^50)
///
}}}
{{{id=40|
time factor(n)
///
100000000000031 * 100000000000000000000000000000000000000000000000151
CPU time: 3.81 s, Wall time: 3.95 s
}}}
{{{id=50|
time ecm.factor(n)
///
[100000000000031, 100000000000000000000000000000000000000000000000151]
CPU time: 0.01 s, Wall time: 0.45 s
}}}
Parallel computation is extremely relevant fast integer factorization,
though it's not used at all above. See the DistributedFactor command
in Sage...
{{{id=53|
///
}}}
{{{id=51|
///
}}}
{{{id=52|
///
}}}
{{{id=49|
///
}}}