## How explicit is the Explicit Formula?

### Barry Mazur and William Stein

#### Notes for a talk at the AMS Special Session on Arithmetic Statistics


## The Explicit Formulas

The "Explicit Formulas" in analytic number theory deal with arithmetically interesting quantities, often given as partial sums, with summands corresponding to primes $p$ up to some cutoff value $X$. We'll call them "Sums of local data".

### Again:

A "Sum of local data" is a sum of contributions for each prime $p\leq X$:
$$\delta(X) := \sum_{p\le X}g(p),$$
where the rules of the game require the values $g(p)$ to be determined by only local considerations at the prime $p$.

## Sums of Local Data

We will be concentrating on sums of local data attached to elliptic curves without CM over $\mathbf{Q}$,
$$\delta_E(X):=\sum_{p\le X}g_E(p),$$
where the weighting function
$$p \mapsto g_E(p)$$
is a specific function of $p$ and $a_E(p)$, the $p$-th Fourier coefficient of the eigenform of weight two parametrizing the elliptic curve.

## Weighted Biases

We will be interested in issues of bias.

### Our Aim

Examine computations of these biases, following the classical "Explicit Formula," and the work of:
Sarnak, Granville, Rubinstein, Watkins, Martin, Fiorilli, Conrey-Snaith, ...

## Sign of $a_E(p)$

### For our elliptic curves $E$:

ROUGHLY: half the Fourier coefficients $a_E(p)$ are positive and half are negative.

That is: there are roughly as many primes $p$ for which the number of rational points of $E$ over $\mathbf{F}_p$ is

greater than $p+1$

as there are primes for which it is

less than $p+1$.

## Sign of the $a_E(p)$ - a table

| Curve | Positive $a_E(p)$ for $p<10^7$ | Negative $a_E(p)$ for $p<10^7$ |
|---|---|---|
| 11a (rank 0) | 332169 | 332119 |
| 32a (rank 0; CM) | 166054 | 166126 |
| 37a (rank 1) | 332127 | 332240 |
| 389a (rank 2) | 332317 | 332022 |
| 5077a (rank 3) | 331706 | 332632 |
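These sign counts are easy to reproduce at a (much) smaller cutoff in plain Python. The sketch below uses $y^2 = x^3 - 16x + 16$, a short Weierstrass model of the rank-1 curve 37a obtained from $y^2 + y = x^3 - x$ by completing the square and scaling (good reduction away from $p = 2$ and $37$), and computes $a_E(p) = p + 1 - \#E(\mathbf{F}_p)$ as a sum of Legendre symbols; the cutoff $2000$ is chosen only to keep the run fast, far below the $10^7$ of the table.

```python
# Count positive vs. negative a_E(p) for y^2 = x^3 - 16x + 16, a short
# Weierstrass model of 37a (good reduction away from p = 2 and 37).

def primes_up_to(n):
    """Sieve of Eratosthenes."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [p for p in range(2, n + 1) if sieve[p]]

def legendre(a, p):
    """Legendre symbol (a|p) for an odd prime p, via Euler's criterion."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1

def a_p(p):
    """a_E(p) = p + 1 - #E(F_p) = -sum_x ((x^3 - 16x + 16)|p), p odd, p != 37."""
    return -sum(legendre(x * x * x - 16 * x + 16, p) for x in range(p))

X = 2000
coeffs = [a_p(p) for p in primes_up_to(X) if p not in (2, 37)]
pos = sum(1 for a in coeffs if a > 0)
neg = sum(1 for a in coeffs if a < 0)
print(pos, neg)  # roughly half and half, as in the table
```

The first few values ($a_3 = -3$, $a_5 = -2$, $a_7 = -1$) match the $q$-expansion of the newform attached to 37a.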

## Finer Statistical Issues

So let's study finer statistical issues related to this symmetric distribution. For example, we can ask the raw question: which of these classes of primes are winning the race, and how often?
I.e., what can one say about:
$$\Delta_E(X) = \frac{\log(X)}{\sqrt{X}}\,\Big(\#\{p\le X : |E(\mathbf{F}_p)| > p+1\} \;-\; \#\{p\le X : |E(\mathbf{F}_p)| < p+1\}\Big)\,?$$

## Equivalently, putting:

$$\gamma_E(p) = \begin{cases} 0 & \text{if $p$ is a bad or supersingular prime for $E$},\\ -1 & \text{if $E$ has more than $p+1$ points rational over $\mathbf{F}_p$},\\ +1 & \text{if it has fewer,} \end{cases}$$
$$\Delta_E(X) = \frac{\log(X)}{\sqrt{X}} \sum_{p\leq X} \gamma_E(p).$$
Rank 0 curve 11a ($p<1000$):
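A minimal sketch of computing $\Delta_E(X)$ itself. For self-containedness it uses the model $y^2 = x^3 - 16x + 16$ of the rank-1 curve 37a (rather than 11a, the curve plotted here), counting points by summing Legendre symbols; note $\gamma_E(p)$ is just the sign of $a_E(p)$.

```python
from math import log, sqrt

# Delta_E(X) = (log X / sqrt X) * sum_{p <= X} gamma_E(p) for the model
# y^2 = x^3 - 16x + 16 of 37a (good reduction away from p = 2 and 37).

def primes_up_to(n):
    """Sieve of Eratosthenes."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [p for p in range(2, n + 1) if sieve[p]]

def legendre(a, p):
    """Legendre symbol (a|p) for an odd prime p."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1

def a_p(p):
    """a_E(p) = p + 1 - #E(F_p) for y^2 = x^3 - 16x + 16 (p odd, p != 37)."""
    return -sum(legendre(x * x * x - 16 * x + 16, p) for x in range(p))

def Delta(X):
    """(log X / sqrt X) * sum of gamma_E(p) = sign(a_E(p)) over good p <= X."""
    total = 0
    for p in primes_up_to(X):
        if p in (2, 37):
            continue  # bad primes for this model
        a = a_p(p)
        total += (a > 0) - (a < 0)  # gamma_E(p); 0 at supersingular primes
    return log(X) / sqrt(X) * total

print(Delta(1000))
```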

## Graphs of $\Delta_E(X) = \frac{\log(X)}{\sqrt{X}} \sum_{p\leq X}\gamma_E(p)$

Rank 0 curve 11a ($p < 10^6$):
Rank 1 curve 37a ($p < 10^6$):

## More graphs of $\Delta_E(X) = \frac{\log(X)}{\sqrt{X}} \sum_{p\leq X}\gamma_E(p)$

Rank 2 curve 389a ($p < 10^6$):
Rank 3 curve 5077a ($p < 10^6$):

## "Means" and "Percentages of positive (or negative) support"

Recall that to say that
$$\delta(X) = \sum_{p\le X}g(p)$$
possesses a limiting distribution $\mu_\delta$ with respect to the multiplicative measure $dX/X$ means that for continuous bounded functions $f$ on $\mathbf{R}$ we have:
$$\lim_{X \to \infty}\ \frac{1}{\log(X)}\int_2^X f(\delta(x))\,\frac{dx}{x} = \int_{\mathbf{R}}f(x)\,d\mu_\delta(x).$$
The mean of $\delta(X)$ is by definition:
$$\mathcal{E} := \lim_{X \to \infty}\ \frac{1}{\log(X)}\int_2^X \delta(x)\,\frac{dx}{x} = \int_{\mathbf{R}} x\,d\mu_\delta(x).$$

In the work of Sarnak and Fiorilli, another measure for understanding "bias behavior" is given by what one might call the percentage of positive support (relative to the multiplicative measure $dX/X$). Namely:
$$\begin{aligned} \mathcal{P} &:= \liminf_{X\to \infty}\frac{1}{\log(X)}\int_{2\le x \le X;\ \delta(x) > 0}\frac{dx}{x}\\ &= \limsup_{X\to \infty}\frac{1}{\log(X)}\int_{2\le x \le X;\ \delta(x) > 0}\frac{dx}{x}. \end{aligned}$$
It is indeed a conjecture, in specific instances interesting to us, that these limits ${\mathcal E}$ and ${\mathcal P}$ exist.
(Discuss a beautiful result of Fiorilli about ${\mathcal P}$)
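Numerically, such logarithmic means are pleasant to approximate: since $\delta(x)$ is a step function jumping by $g(p)$ at each prime $p$, the integral $\int_2^X \delta(x)\,dx/x$ evaluates exactly to $\sum_{p\le X} g(p)(\log X - \log p)$. A sketch, using the classical mod-4 race as a stand-in weighting (purely illustrative, not one of the elliptic-curve races above):

```python
from math import log

def primes_up_to(n):
    """Sieve of Eratosthenes."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [p for p in range(2, n + 1) if sieve[p]]

def log_mean(g, X):
    """(1/log X) * integral_2^X delta(x) dx/x, with delta(x) = sum_{p<=x} g(p).
    Exact, since each prime p contributes g(p)*(log X - log p) to the integral."""
    return sum(g(p) * (1 - log(p) / log(X)) for p in primes_up_to(X))

# illustration: the classical Chebyshev race between p = 1 and p = 3 mod 4
g = lambda p: 0 if p == 2 else (1 if p % 4 == 1 else -1)
print(log_mean(g, 10**5))  # negative: the familiar bias toward p = 3 mod 4
```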

## More General Weighting Functions

Consider weighting functions $p\mapsto g_E(p)$ that have the property that:
• for all primes $p$, the number $g_E(p)$ is an odd function of the value $a_E(p)$
• the sum of local data
$$\delta_E(X) := \sum_{p\leq X} g_E(p)$$
has—or can be convincingly conjectured to have—a finite mean.

Any such $p \mapsto g_E(p)$ represents a version of a "bias race".

To illustrate specific features of the "Explicit Formula" we focus on three examples of such races for an elliptic curve $E$.

## Sums of Local Data

#### RAW

$\Delta_E(X): = \frac{\log(X)}{\sqrt{X}} \sum_{p\le X}\gamma_E(p)$

#### MEDIUM-RARE

${\mathcal D}_E(X):= {\frac{\log(X)}{\sqrt X}}\sum_{p \le X}{\frac{a_E(p)}{\sqrt p}}$

#### WELL-DONE

${D}_E(X):= {\frac{1}{\log(X)}} \sum_{p \le X}{\frac{a_E(p)\log p}{ p}}$
The fun here is that there are clean conjectures for the values of the means (relative to $dX/X$), i.e., the biases, of these three sums of local data, together with clean expectations for their variances.
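The three statistics can be computed side by side. The sketch below does so for the model $y^2 = x^3 - 16x + 16$ of the rank-1 curve 37a, at a cutoff far smaller than the ranges used for the table further on, so the printed values will only loosely reflect the conjectured means.

```python
from math import log, sqrt

# RAW, MEDIUM-RARE, and WELL-DONE sums for y^2 = x^3 - 16x + 16, a model
# of the rank-1 curve 37a (good reduction away from p = 2 and 37).

def primes_up_to(n):
    """Sieve of Eratosthenes."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [p for p in range(2, n + 1) if sieve[p]]

def legendre(a, p):
    """Legendre symbol (a|p) for an odd prime p."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1

def a_p(p):
    """a_E(p) = p + 1 - #E(F_p) for y^2 = x^3 - 16x + 16 (p odd, p != 37)."""
    return -sum(legendre(x * x * x - 16 * x + 16, p) for x in range(p))

def races(X):
    """Return (Delta_E(X), script-D_E(X), D_E(X)) at cutoff X."""
    raw = medium = well = 0.0
    for p in primes_up_to(X):
        if p in (2, 37):
            continue  # bad primes for this model
        a = a_p(p)
        raw += (a > 0) - (a < 0)   # gamma_E(p)
        medium += a / sqrt(p)
        well += a * log(p) / p
    L = log(X)
    return L / sqrt(X) * raw, L / sqrt(X) * medium, well / L
print(races(2000))
```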

#### The well-done data—the mean is (conjecturally)

$$-r,$$
where $r=r_E$ is the analytic rank of $E$.

#### The medium-rare data—the mean is (conjecturally)

$$1-2r$$

#### The raw data—the mean is (conjecturally)

$$\frac{2}{\pi} - \frac{16}{3\pi}\,r + \frac{4}{\pi} \sum_{k=1}^{\infty} (-1)^{k+1}\left[\frac{1}{2k+1} + \frac{1}{2k+3}\right] r(2k+1),$$
where $r(n) := r_{f_E}(n)$ is the order of vanishing of $L(\text{symm}^n f_E, s)$ at the central point $s=1/2$, with $f_E$ the newform corresponding to $E$.

1. The (conjectured) distinction in the variances of the three formats:
• The raw data has infinite variance
• The medium-rare and well-done data have finite variance
2. The numbers
$$n\mapsto r_E(n) = \text{ the order of vanishing of }L(\text{symm}^n f_E, s) \text{ at }s=1/2$$
(for $n$ odd) conjecturally determine all biases!
3. We have the beginnings of some data for those numbers, $n\mapsto r_E(n)$, but nothing systematic.
4. And no firm conjectures yet.
Numerically, rather than simply looking at examples of curves of various ranks, we look for curves with interesting $r_E(n)$ and focus on the mean...

## For example...

If $g(t)$ is a continuous function on $[-1,+1]$ with—appropriately defined—Fourier coefficients $\{c_n\}_n$, then the mean of the sum of local data
$$\delta(X) := \sum_{p\leq X} g\!\left(\frac{a_E(p)}{2\sqrt{p}}\right)$$
is conjecturally
$$\sum_{n=1}^{\infty} c_n\left(2 r_E(n) + (-1)^n\right).$$
Thus
$$\left\{\text{means of }\delta(X)\text{'s}\right\} \longleftrightarrow \left\{ r_E(n)\text{'s} \right\}$$
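The "appropriately defined" coefficients here are plausibly those with respect to the Chebyshev polynomials $U_n$ (with $U_n(\cos\theta) = \sin((n+1)\theta)/\sin\theta$), which are orthonormal for the Sato-Tate measure $\frac{2}{\pi}\sin^2\theta\,d\theta$; that convention is our assumption. A numerical sketch of extracting such coefficients:

```python
from math import pi, sin, cos

def chebU_coeff(g, n, steps=20000):
    """Coefficient c_n of g in g = sum_n c_n U_n, computed as
    c_n = (2/pi) * int_0^pi g(cos t) sin((n+1)t) sin(t) dt
    (midpoint rule); the U_n are orthonormal for Sato-Tate measure."""
    h = pi / steps
    return (2.0 / pi) * h * sum(
        g(cos(h * (k + 0.5))) * sin((n + 1) * h * (k + 0.5)) * sin(h * (k + 0.5))
        for k in range(steps)
    )

# example: g(t) = 2t is exactly U_1, so c_1 = 1 and all other c_n vanish
print(chebU_coeff(lambda t: 2 * t, 1))
```

With the $c_n$ in hand, the conjectural mean would be $\sum_n c_n(2 r_E(n) + (-1)^n)$ as on this slide.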
| $E$ | rank | RAW mean | MEDIUM mean | $\to 1-2r$? | WELL mean | $\to -r$? |
|---|---|---|---|---|---|---|
| 11a | 0 | 0.647 | 0.598 | 1 | 0.155 | 0 |
| 14a | 0 | 0.752 | 0.554 | 1 | 0.114 | 0 |
| 37a | 1 | -1.412 | -1.967 | -1 | -0.816 | -1 |
| 43a | 1 | -0.366 | -1.906 | -1 | -0.792 | -1 |
| 389a | 2 | -2.658 | -4.293 | -3 | -1.663 | -2 |
| 433a | 2 | -4.055 | -4.167 | -3 | -1.617 | -2 |
| 5077a | 3 | -5.228 | -6.598 | -5 | -2.507 | -3 |
| 11197a | 3 | -4.428 | -6.289 | -5 | -2.360 | -3 |

## Qualitative look at the Explicit Formula

#### Sum of local data = the "bias" + Oscillatory term + Error term


## Qualitative look at the Explicit Formula

##### For example, for the Well-done data,
$$D_E(X) := \frac{1}{\log(X)}\sum_{p\leq X} \frac{a_{E}(p)\log p}{p},$$
the Explicit Formula gives $D_E(X)$ as a sum of three contributions:
$$-r_E + S_E(X) + O(1/\log(X)),$$
where the "Oscillatory term" $S_E(X)$ is the wild card (even assuming GRH), and we take it to be the limit ($Y\to\infty$) of these generalized trigonometric sums:
$$S_E(X,Y) = \frac{1}{\log(X)} \sum_{0<|\gamma|\leq Y} \frac{X^{i\gamma}}{i\gamma}$$
(the sum running over the nonzero imaginary parts $\gamma$ of the complex zeros $\tfrac{1}{2}+i\gamma$ of $L(f_E,s)$).
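Because the zeros come in conjugate pairs $\tfrac12 \pm i\gamma$, the terms of $S_E(X,Y)$ pair up into real oscillations: $X^{i\gamma}/(i\gamma) + X^{-i\gamma}/(-i\gamma) = 2\sin(\gamma\log X)/\gamma$. The sketch below evaluates the sum this way; the ordinates $\gamma$ used are placeholders, not the actual zeros of any $L(f_E,s)$ (those would have to be computed separately, e.g. with Sage or lcalc).

```python
from math import log, sin

def S(X, gammas):
    """S_E(X, Y) = (1/log X) * sum over zeros 1/2 +/- i*gamma (0 < gamma <= Y)
    of X^{i gamma}/(i gamma).  Pairing gamma with -gamma, each pair contributes
    the real quantity 2 sin(gamma log X)/gamma, so the sum is real."""
    return sum(2.0 * sin(g * log(X)) / g for g in gammas) / log(X)

# placeholder ordinates -- NOT actual zeros of an L-function
gammas = [2.0, 4.5, 6.3]
print(S(1000.0, gammas))
```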

## A Tentative Conjecture

It has been tentatively conjectured that
$$\lim_{X,Y\to\infty} S_E(X,Y) = 0,$$
but for computations it would be good to know something more explicit.


## Three issues needing conjectures, and computations:

#### What should be conjectured about:

1. the distribution of the $r_{E}(n)$'s
2. the convergence of $S_E(X,Y)$ to $0$ as $X,Y\to\infty$
3. the conditional biases—and multivariate distributions—related to the zeroes of $L$-functions of tensor products of symmetric powers of two (or more) automorphic forms