For my final project, I (...attempted) to mimick a neat feature on wolfram alpha where you punch in a sequence of numbers such as (1, 2, 3, 4, 5), and it gives you a possible closed form representation of the sequence (in this example, it would return a_n = n)
One major feature missing from my implementation is the fact that it only takes integer values. I think that this code is actually not that far off from working with lists of rational numbers and even floating point numbers (if you round it properly), but I did not get a good enough head start on this to get everything to work smoothly. I will probably try to beef this up a bit over the summer, but a lot of the functionality is there.
INPUT: a list (of any length >= 1) of integers. Examples) [1] , [1, 1, 1, 1] , [1, 3, 7, 13, 21] , [12,12356123512,5]...
OUTPUT: depends on the list you give it. One thing it spits out regardless of what you give it is a possible closed form representation of the sequence as a polynomial. This is 1-indexed, as demonstrated in the following examples.
guess_gen_seq([1,2,3,4,5]) == > "a_n = n" (i.e, the nth entry is n)
guess_gen_seq([1,3,7,13,21]) ==> "a_n = n^2 - n + 1" (i.e, the 3rd term is 3^2 - 3 + 1 = 9 - 3 + 1 = 7)
After spitting these equations out (in a lovely LaTex font), it also returns the next few terms of the sequence, assuming what it produced was in fact the closed form. As before,
guess_gen_seq([1,2,3,4,5]) ==> "Possible continuation: 6, 7, 8, 9, 10..."
guess_gen_seq([1, 3, 7, 13, 21]) ==> 31, 43, 57, 73, 91...
Using the algorithm implemented here, it is possible to create a polynomial of at most degree n-1, call it f, where the i-th term of the given sequence is given by f(i). However, consider the following input.
guess_gen_seq([1,2,4,8,16]) ==> ??
Although there does exist a polynomial f where f(1) = 1, f(2) = 2, f(3) = 4, f(4) = 8, f(5) = 16, it's a bit awkward and probably not what you were looking for. (the polynomial is a_n= 1/24n^4 - 1/4n^3 + 23/24n^2 -3/4n + 1). A geometric fit works much better here, something like "a_n = 2^n" would be a better output.
Either my brain is getting tired(very likely), or this is harder than just spitting out polynomials. guess_gen_seq() will always spit out the generating polynomial for a sequence, and will additionatly try to find a geometric fit only if the polynomial seems "fishy" (like in the [1,2,4,8,16] example). If it seems like the geometric fit will be just as "fishy", it doesn't bother with it.
As of me turning this in, the types of functions this will correctly identify are polynomials of any degree (provided the number of entires in the list is greater than the degree), simple geometric functions of the form a^n-1 (1-indexed!) and simple geometric forms + polynomials. (ex: 2^n + 3).
This function prints beneath the cell it's called on, using a LaTeX font
{{{id=1| def guess_gen_seq(numlist): #check to make sure it's an integer sequence for entry in numlist: if type(entry) != type(1): raise TypeError("Entries must be integers") #Assumes a polynomial fit, finds degree degree = 0 templist = copy(numlist) while (constant_sequence(templist) == False): templist = move_tier(templist) degree += 1 #Sets up a system of equations of the form Ax = B, where #A is based on the index and B is the first few terms from the given sequence. A = matrix(degree+1,degree+1) B = matrix(degree+1, 1) temp = copy(degree) for row in range(degree+1): B[row] = numlist[temp] for col in range(degree+1): A[row,col] = (temp+1)^(degree-col) temp += -1 #Solves the system C = A.inverse()*B #create an easy to read display of the function from C func = make_string_func(C.column(0), degree) nice_view = "Possible closed form: $a_n =" + func + "$" #LaTeX view(nice_view) #find the next few entries in the sequence using the calculated polynomial continuation_string = "" for x in range(len(numlist)+1, len(numlist)+4+1): continuation_string += str(get_nth_term(x, C.column(0))) + ", " print "Possible continuation: " + continuation_string + str(get_nth_term(x+1, C.column(0))) + "..." #Now let's see if a geometric fit is better #ONLY if it seems like poly is iffy #I call a function "iffy" if the degree of the polynomial is exactly 1 less than len(numlist) #(i.e., the max degree it can be) if degree == len(numlist) - 1: templist = copy(numlist) count = len(numlist) while (geo_sequence(templist) == False and count > 2): templist = move_tier(templist) count += -1 if count > 2: print "" print "We can also find a (better?) geometric fit" base = templist[1]/templist[0] base_list = [] for j in range(len(numlist)): base_list.append(numlist[j]-base^j) #Assumes a polynomial fit, finds degree degree = 0 templist = copy(numlist) while (constant_sequence(templist) == False): templist = move_tier(templist) degree += 1 #Sets up a system of equations of the form Ax = B, where #A is based on the index and B is the first few terms from the given sequence. A = matrix(degree+1,degree+1) B = matrix(degree+1, 1) temp = copy(degree) for row in range(degree+1): B[row] = base_list[temp] for col in range(degree+1): A[row,col] = (temp+1)^(degree-col) temp += -1 #Solves the system C = A.inverse()*B #create an easy to read display of the function from C func = make_string_func(C.column(0), degree) nice_view = "Possible closed form: $a_n =%s^{n-1}" %(base) nice_view += func nice_view += "$" #LaTeX view(nice_view) continuation_string = "" for x in range(len(numlist)+1, len(numlist)+4+1): continuation_string += str(get_nth_term(x, C.column(0), base)) + ", " print "Possible continuation: " + continuation_string + str(get_nth_term(x+1, C.column(0), base)) + "..." #Helper method#1 to determine the degree needed to model the sequence #Creates a list of the differences between successive terms from the given list def move_tier(numlist): difflist = [] for i in range(0, len(numlist)-1): difflist.append(numlist[i+1]-numlist[i]) return difflist #Helper method #2 to determine degree. Tests to see if a sequence is constant #If a sequence is constant, we can model a polynomial fit def constant_sequence(numlist): for i in numlist: if i != numlist[0]: return False return True #Helper function to create a string of the calculated coefficients #NOTE: finally got it to work with polynomials, sometimes wonky with geo... def make_string_func(coeffs, power): temp = copy(power) string_func = "" if power == 0: string_func += "%s" %(coeffs[0]) return string_func else: for j in coeffs: if temp >= 2: power_term = "n^%s" %(temp) elif temp == 1: power_term = "n" else: power_term = "" if abs(j) == 1 and temp != 0: coeff_term = "" else: coeff_term = "%s" %(abs(j)) if j > 0: sign = " +" else: sign = " -" if j != 0: if power == temp: if j > 0: sign = "" else: sign = "-" string_func += "%s%s%s" %(sign, coeff_term, power_term) temp += -1 return string_func #Allows to find a given term in the sequence #Input: n = term wanted # coeffs = a vector representing the coefficients of the polynomial #output = the nth term of a sequence def get_nth_term(n, coeffs, base = 0): temp = 0 if base != 0: temp = base^(n-1) power = len(coeffs)-1 while power >= 0: temp += coeffs[(len(coeffs)-1) - power] * n^power power += -1 return temp #Checks to see if the sequence is likely to be geometric... def geo_sequence(numlist): if len(numlist) > 2: for j in range(len(numlist)-1): if numlist[j] == 0 or numlist[j+1]/numlist[j] != numlist[1]/numlist[0]: return False return True return False #Defines how to subtract two arrays def array_subtract(numlist, base_list): temp = [] for j in range(len(numlist)): temp.append(numlist[j]-base_list[j]) return temp /// }}} {{{id=8| #a stupid example... guess_gen_seq([1, 1, 1]) /// \newcommand{\Bold}[1]{\mathbf{#1}}\hbox{Possible closed form: $a_n =1$} Possible continuation: 1, 1, 1, 1, 1... }}} {{{id=9| #a slightly less stupid example... guess_gen_seq([1,2,3,4]) /// \newcommand{\Bold}[1]{\mathbf{#1}}\hbox{Possible closed form: $a_n =n$} Possible continuation: 5, 6, 7, 8, 9... }}} {{{id=10| #Now we're cookin! guess_gen_seq([3, 5, 7, 9, 11]) /// \newcommand{\Bold}[1]{\mathbf{#1}}\hbox{Possible closed form: $a_n =2n +1$} Possible continuation: 13, 15, 17, 19, 21... }}} {{{id=11| #NICE EXAMPLES print "[1, 3, 7, 13, 21]" guess_gen_seq([1, 3, 7, 13, 21]) print "" print "[-4, 104, 414, 1010, 1976]" guess_gen_seq([-4, 104, 414, 1010, 1976]) /// [1, 3, 7, 13, 21] \newcommand{\Bold}[1]{\mathbf{#1}}\hbox{Possible closed form: $a_n =n^2 -n +1$} Possible continuation: 31, 43, 57, 73, 91... [-4, 104, 414, 1010, 1976] \newcommand{\Bold}[1]{\mathbf{#1}}\hbox{Possible closed form: $a_n =14n^3 +17n^2 -41n +6$} Possible continuation: 3396, 5354, 7934, 11220, 15296... }}} {{{id=12| #Geometric guess_gen_seq([1, 2, 4, 8, 16]) /// \newcommand{\Bold}[1]{\mathbf{#1}}\hbox{Possible closed form: $a_n =1/24n^4 -1/4n^3 +23/24n^2 -3/4n +1$} Possible continuation: 31, 57, 99, 163, 256... We can also find a (better?) geometric fit \newcommand{\Bold}[1]{\mathbf{#1}}\hbox{Possible closed form: $a_n =2^{n-1}$} Possible continuation: 32, 64, 128, 256, 512... }}} {{{id=13| #Geometric + Polynomial a = [2, 3, 4, 5, 6] #(a_n = n+1) b = [1,3,9,27,81] #(b_n = 3^(n-1)) c = [] for j in range(len(a)): c.append(a[j]+b[j]) #c = a + b added term by term guess_gen_seq(c) #should spit out a_n = 3^(n-1) + n + 1 /// \newcommand{\Bold}[1]{\mathbf{#1}}\hbox{Possible closed form: $a_n =2/3n^4 -16/3n^3 +52/3n^2 -65/3n +12$} Possible continuation: 218, 481, 948, 1707, 2862... We can also find a (better?) geometric fit \newcommand{\Bold}[1]{\mathbf{#1}}\hbox{Possible closed form: $a_n =3^{n-1} +n +1$} Possible continuation: 250, 737, 2196, 6571, 19694... }}} {{{id=15| #However, even if the polynomial is wonky, if it cant find a good looking geometric fit #It wont bother with that step guess_gen_seq([1,5, -17, 4, 15]) /// \newcommand{\Bold}[1]{\mathbf{#1}}\hbox{Possible closed form: $a_n =-61/12n^4 +187/3n^3 -3119/12n^2 +1271/3n -220$} Possible continuation: -159, -815, -2372, -5371, -10475... }}}So one thing I learned from doing this (other than possibly not to start a final project ~ 3 days before it's due) is that for any list of integers, there exists an infinite number of polynomials f where f(1) = the 1st term given in the sequence, f(2) = the 2nd term, etc...
because guess_gen_seq([1, 3, 2, 3]) gets a different result than guess_gen_seq([1, 3, 2, 3, 9 ]), which is different from adding any other number on, etc... Yet if f is the function produced by the first entry and g is the function produced by the 2nd entry, then f(1) = g(1). f(2) = g(2)... f(length of initial entry) = g(length of initial entry).
I suppose this isn't too hard to reason why this is the case (after all, there exists a line that connects any two points, there exists a circle in which any 3 points lie on the perimeter, etc...) but it still was a bit surprising to me when I was trying random data entries and it spit out polynomials that made sense. That's what's awesome about math software, or even academic software in general: it can teach you things. ( http://xkcd.com/519/ )
As a example of a consequence of this, consider the sequence of the first n primes. There exists a polynomial in which f(1) = 2, f(2) = 3, f(3) = 5...f(n) = the nth prime that you give it.
{{{id=16| #Starting with 2 primes, adding 1, up to 10 for i in range (2, 11): print "For the first %s primes..." %(i) print primes_first_n(i) guess_gen_seq(primes_first_n(i)) print "" /// For the first 2 primes... [2, 3] \newcommand{\Bold}[1]{\mathbf{#1}}\hbox{Possible closed form: $a_n =n +1$} Possible continuation: 4, 5, 6, 7, 8... For the first 3 primes... [2, 3, 5] \newcommand{\Bold}[1]{\mathbf{#1}}\hbox{Possible closed form: $a_n =1/2n^2 -1/2n +2$} Possible continuation: 8, 12, 17, 23, 30... For the first 4 primes... [2, 3, 5, 7] \newcommand{\Bold}[1]{\mathbf{#1}}\hbox{Possible closed form: $a_n =-1/6n^3 +3/2n^2 -7/3n +3$} Possible continuation: 8, 7, 3, -5, -18... For the first 5 primes... [2, 3, 5, 7, 11] \newcommand{\Bold}[1]{\mathbf{#1}}\hbox{Possible closed form: $a_n =1/8n^4 -17/12n^3 +47/8n^2 -103/12n +6$} Possible continuation: 22, 48, 100, 192, 341... For the first 6 primes... [2, 3, 5, 7, 11, 13] \newcommand{\Bold}[1]{\mathbf{#1}}\hbox{Possible closed form: $a_n =-3/40n^5 +5/4n^4 -187/24n^3 +91/4n^2 -437/15n +15$} Possible continuation: -6, -89, -312, -793, -1701... We can also find a (better?) geometric fit \newcommand{\Bold}[1]{\mathbf{#1}}\hbox{Possible closed form: $a_n =-2^{n-1}39/20n^5 -65/2n^4 +2431/12n^3 -1165/2n^2 +11407/15n -349$} Possible continuation: 723, 2827, 9165, 21077, 46413... For the first 7 primes... [2, 3, 5, 7, 11, 13, 17] \newcommand{\Bold}[1]{\mathbf{#1}}\hbox{Possible closed form: $a_n =23/720n^6 -179/240n^5 +985/144n^4 -1501/48n^3 +13433/180n^2 -5129/60n +38$} Possible continuation: 72, 332, 1139, 3129, 7361... For the first 8 primes... [2, 3, 5, 7, 11, 13, 17, 19] \newcommand{\Bold}[1]{\mathbf{#1}}\hbox{Possible closed form: $a_n =-53/5040n^7 +47/144n^6 -595/144n^5 +3953/144n^4 -36883/360n^3 +3829/18n^2 -4681/21n +91$} Possible continuation: -92, -769, -3231, -10129, -26507... For the first 9 primes... [2, 3, 5, 7, 11, 13, 17, 19, 23] \newcommand{\Bold}[1]{\mathbf{#1}}\hbox{Possible closed form: $a_n =23/8064n^8 -163/1440n^7 +1085/576n^6 -1229/72n^5 +105385/1152n^4 -423877/1440n^3 +1108061/2016n^2 -12851/24n +206$} Possible continuation: 266, 1944, 8846, 30418, 86894... For the first 10 primes... [2, 3, 5, 7, 11, 13, 17, 19, 23, 29] \newcommand{\Bold}[1]{\mathbf{#1}}\hbox{Possible closed form: $a_n =-79/120960n^9 +65/2016n^8 -4579/6720n^7 +145/18n^6 -336347/5760n^5 +77005/288n^4 -23194097/30240n^3 +663029/504n^2 -126622/105n +443$} Possible continuation: -426, -4189, -21722, -82561, -257228... }}}Which is kind of cool. Of course, this is not nearly as cool as generating a function where given n primes, you can generate the n+1th prime, but i'm pretty sure I would be rich and famous if I could make a function that does that. I wonder if this sequence of polynomials approximate anything as you stretch the list to infinity? It's *almost* like a taylor series...kind of.
This ends the bulk of my project.
One last little neat thing SAGE has taught me when I was just fiddling around with it, which has nothing to do with this project. I was messing around with converting integers into different bases, and discovered that if you take an integer (in base 10) mod n, this is congruent to taking the sum of the digits of the integer, in base n+1, mod n. It's not that hard to prove, which I did, but only after accidently discovering it messing around with SAGE.
Here's some examples:
{{{id=18| def check_mod_base_thingy(num, modulus): a = num % modulus print "%s mod %s is %s" %(num, modulus, a) b = num.digits(modulus + 1) print "%s in base %s (=%s+1) is %s (little endian ordering)" %(num, modulus+1, modulus, b) sum = 0 for c in b: sum += c print "the sum of the digits of %s is %s" %(b, sum) d = sum % modulus print "and %s mod %s is %s" %(sum, modulus, d) print d == a /// }}} {{{id=22| check_mod_base_thingy(17, 7) print "" check_mod_base_thingy(57,12) print "" check_mod_base_thingy(123, 11) /// 17 mod 7 is 3 17 in base 8 (=7+1) is [1, 2] (little endian ordering) the sum of the digits of [1, 2] is 3 and 3 mod 7 is 3 True 57 mod 12 is 9 57 in base 13 (=12+1) is [5, 4] (little endian ordering) the sum of the digits of [5, 4] is 9 and 9 mod 12 is 9 True 123 mod 11 is 2 123 in base 12 (=11+1) is [3, 10] (little endian ordering) the sum of the digits of [3, 10] is 13 and 13 mod 11 is 2 True }}}I'm sure this has a name somewhere, but if it doesn't, I'm calling it "One of the 17 Alex's awesome theorem of awesomeness".
Here's the proof: Skip ahead to the latex printed cell
Let the notation "sum(a, b)" mean "write out a in base b, then take the sum of those digits".
{{{id=20| %latex Consider integers a and b. Let a mod b = c. We want to show that sum(a, b+1) also = c Writing out an integer n in base m is equivalent to finding a solution to \\ $a = a_nm^n + a_{n-1}m^{n-1} + a_{n-2}m^{n-2} + ... + a_1m + a_0 $ \\ Where the coefficients $a_n$ are strictly less than the power you are raising m to, and are equivalent to the digits in the given base. Since we have an equality here, mod both sides by b. \\ $a\;mod\;b = (a_nm^n + a_{n-1}m^{n-1} + a_{n-2}m^{n-2} + ... + a_1m + a_0)\;mod\;b$ \\ In general, we know that $(a+b)\;mod\;c = (a\;mod\;c)+(b\;mod\;c)$. Thus, $a\;mod\;b = (a_nm^n)\;mod\;b + (a_{n-1}m^{n-1})\;mod\;b + ... + (a_1m)\;mod\;b + (a_0)\;mod\;b$ \\ We also know that $(ab)\;mod\;c = (a\;mod\;c)(b\;mod\;c)$. Thus, $a\;mod\;b = (a_n\;mod\;b)(m^n\;mod\;b) + (a_{n-1}\;mod\;b)(m^{n-1}\;mod\;b) + ... + (a_0\;mod\;b) \\ Now back to the point. Let a mod b = c, and write out a in base b+1. We have... $a = a_n(b+1)^n + a_{n-1}(b+1)^{n-1} + a_{n-2}(b+1)^{n-2} + ... + a_1(b+1) + a_0 $ \\ Following the same steps as before, we have $a\;mod\;b = (a_n\;mod\;b)((b+1)^n\;mod\;b) + (a_{n-1}\;mod\;b)((b+1)^{n-1}\;mod\;b) + ... + (a_0\;mod\;b)$ \\ (b+1) mod b is certainly 1. This implies that (b+1)^n$ for any n, mod b, is also 1. Therefore, $a\;mod\;b = (a_n\;mod\;b)*1 + (a_{n-1}\;mod\;b)*1 + ... + (a_0\;mod\;b)$ \\ Since we know a mod b = c, we at last have, \\ $c = (a_n)\;mod\;b + (a_{n-1})\;mod\;b + ... (a_0)\;mod\;b$ \\ which proves the "One of the 17 Alex's awesome theorem of awesomeness" /// }}} {{{id=24| /// }}}