Search

Equidistribution of Sidon Sums in Residue Classes II

In the last post, we introduced what we are trying to achieve in this post. However, we will write this note in a way that is independent of the last one.

A set of positive integers $A\subset \mathbb N$ is called a Sidon Set if the equation $a+b=c+d$ does not have any non-trivial solutions in $A$. A Sidon subset $A\subset [n]:=\{1,2,\dots , n\}$ is called dense (or extremal) if $\left\lvert A\right\rvert = \max \left\lvert S\right\rvert$ where the maximum is taken over all Sidon subsets of $[n]$. If $A\subset [n]$ is a dense Sidon set with $|A|=n^{1/2}-L$, then $-n^{\frac 14}\le L\ll n^{\frac{21}{80}}$ (see this and this for a proof).

It is well known that dense Sidon sets are surprisingly symmetric in profound ways. For example, it is known that they are uniformly distributed in intervals and well distributed in residue classes. In a previous work, Ding asked whether the sums of elements of a dense Sidon set over residue classes are also equidistributed. We will prove a stronger result here.

Take a dense Sidon set $A\subset [N]:=\{1,2,\dots, N\}$ with $|A| = N^{1/2}-L^\prime$ and $L=\max\{0,L^\prime\}\le N^{\frac{21}{80}}$. For a fixed $k\in \mathbb N$ and some $m\in \mathbb N$, define
\begin{align*} &S^{(k)} := \sum_{a\in A} a^k\\ &S_b^{(k)} := \sum_{\substack{a\in A\\}} a^k\cdot \mathbf{1}_{\{a\,\equiv\, b\,\pmod m\}} \end{align*}
for each $b\in \{0, 1, 2,\dots , m-1\}$.

For $f:\mathbb Z\to\mathbb C$ supported on $[N]$, we define the Fourier transform
$$\widehat f(\beta):=\sum_{n\in\mathbb Z} f(n)e(\beta n)$$
and for $X\in\{0,1,\dots,N\}$, consider the one-sided Dirichlet kernel
$$D_X(\theta):=\sum_{n=1}^{X} e(\theta n)$$
where $e(t) := e^{2\pi it}$. Additionally, we let $\mathbb T := \mathbb R / \mathbb Z$ equipped with the Haar measure.

We will use the following results in our proof.
Theorem A [Ortega-Prendiville] : We have
$$\left\|\widehat{\mathbf 1_A} - \frac{|A|}{N}\,\widehat{\mathbf 1_{[N]}}\right\|_{\infty} \ll\ \sqrt N\, \left(\left|\frac{|A|}{\sqrt N}-1\right| + N^{-\frac 14}\right)^{\frac 12}$$
for a Sidon set $S\subset [N]$.

Theorem B [Balasubramanian-Dutta, Ding] : Let $A=\left\{a_1,\dots ,a_{|A|}\right\}\subset [n]$ be a Sidon set so that $|A|=n^{1/2}-L^\prime$. Then,
$$\sum_{a\in A} a^k = \frac {1}{k +1} \cdot n^{\frac{2k +1}{2}} + \mathcal O \left( n^{\frac{8k +3}{8}} \right) + \mathcal O\left( L^{1/2}\cdot n^{\frac{4k +1}{4}}\right)$$
for $L\le n^{\frac {21}{80}}$.

Similar results can also be proven for Sidon sets in $\mathbb Z_m$ (see this for example).

Theorem C [Folklore] : We have $\|D_n\|_1 \asymp \log n$.

The main result we prove in this note is the following.
Theorem : Fix any $k\in \mathbb N_{\ge 1}$. For an $m=o\left(N^{\frac{19}{160}}\,(\log N)^{-1}\right)$, we have
$$S_b^{(k)}\ =\ \frac{1}{m(k+1)}\,N^{k+\frac 12} \ +\ \mathcal O_k\left(\frac{N^{k+\frac 38}}{m}+\frac{N^{k+\frac 14}\, \sqrt L}{m} \,+\, mN^{k-\frac 12} \,+\, N^{k+\frac{61}{160}}\,\log N\right)$$
for all $b\in \{0, 1, 2,\dots , m-1\}$.

The following proposition lies in the heart of our proof.
Proposition 1 : For $0\le X\le N$, define
$$F_r(X) := \sum_{\substack{a\in A\\ a\le X}} e\left(\frac{ra}{m}\right)$$
for a dense Sidon set $A\subset [N]$ and any integer $m>1$. Then, we have
$$\big|F_r(X)\big|\ll\ \frac{m}{\sqrt N} \,+\, N^{\frac{61}{160}}\log N$$
uniformly for $1\le r\le m-1$ and $0\le X\le N$.

The proof follows immediately from the following three lemmas.

Lemma 1 : We have
$$\sum_{n=1}^{X} f(n)\,e(\alpha n)\ =\ \int_{\mathbb T}\widehat f(\beta)\,D_X(\alpha-\beta)\,\mathrm d\beta$$
for every $\alpha\in\mathbb T$.

Proof : Follows from expanding the integral
\begin{align*} \int_{\mathbb T}\widehat f(\beta)\, D_X(\alpha-\beta)\, \mathrm d\beta &=\int_{\mathbb T}\left(\sum_{u}f(u)\,e(\beta u)\right)\left(\sum_{n=1}^X e((\alpha-\beta)n)\right)\mathrm d\beta\\ &=\sum_{u}\sum_{n=1}^X f(u)\, e(\alpha n)\int_{\mathbb T} e(\beta(u-n))\, \mathrm d\beta\\ &=\sum_{n=1}^X f(n)\, e(\alpha n) \end{align*}
using orthogonality. $\square$

Define
$$f(n):= \mathbf 1_A(n)-\frac{|A|}{N}\mathbf 1_{[N]}(n)$$
for $n\in\mathbb Z$. Then, $f$ is supported on $[N]$. Also,
$$\widehat f(\beta)=\widehat{\mathbf 1_A}(\beta)-\frac{|A|}{N}\widehat{\mathbf 1_{[N]}}(\beta)$$
and we define $U:=\|\widehat f\|_\infty$.

Lemma 2 : For each $r\in\{1,2,\dots,m-1\}$ and each $X\in\{0,1,\dots,N\}$, we have
$$\big|F_r(X)\big| \ll\ \frac{m}{\sqrt N} + U\log N$$
where the implied constant is absolute.

Proof : Put $\alpha:=\frac rm\in\mathbb T$. We have
\begin{align*} F_r(X)=\sum_{n=1}^X \mathbf 1_A(n)e(\alpha n) &= \frac{|A|}{N}\sum_{n=1}^X e(\alpha n) \,+\, \sum_{n=1}^X f(n)e(\alpha n)\\ &= \frac{|A|}{N}D_X(\alpha) \,+\, \sum_{n=1}^X f(n)\, e(\alpha n) \end{align*}
for $X\le N$. The last sum is
$$\sum_{n=1}^X f(n)e(\alpha n) = \int_{\mathbb T}\widehat f(\beta)\, D_X(\alpha-\beta)\, \mathrm d\beta$$
using Lemma 1. This implies
$$\left|\sum_{n=1}^X f(n)e(\alpha n)\right| \le\ U\int_{\mathbb T}\big|D_X(t)\big|dt \ll\ U\log N$$
using $\left|\widehat f(\beta)\right|\le U$ and Theorem C.

On the other hand, we have
$$\left|D_X\left(\frac rm\right)\right| = \left|\frac{\sin(\pi X \alpha)}{\sin(\pi \alpha)}\right| \le \frac{1}{|\sin(\pi \alpha)|}$$
for $\alpha=\frac rm$ with $1\le r\le m-1$. This implies
$$\left|\frac{|A|}{N}D_X(\alpha)\right|\ll \frac{|A|}{N}m \asymp \frac{m}{\sqrt N}$$
since $|\sin(\pi \alpha)|\ge \sin(\pi/m)\gg 1/m$. Combining the two bounds completes the proof. $\square$

Lemma 3 : For $|A|= \sqrt N-L'$ with $L:=\max\{0,L'\}\le N^{\frac{21}{80}}$, we have $U\ll N^{\frac{61}{160}}$.

Proof : We have
$$U\ll N^{1/2}\left(\left|1-\frac{|A|}{N^{1/2}}\right|+N^{-1/4}\right)^{1/2}$$
using Theorem A. Plugging $L\le N^{\frac{21}{80}}$ and using the fact that $A$ is dense, we have
$$U\ll\ \sqrt N \left(\frac{L}{\sqrt N}+N^{-\frac 14}\right)^{\frac 12} \ll\ \sqrt N\left(N^{-\frac{19}{80}}+N^{-\frac{1}{4}}\right)^{\frac 12}$$
hence completing the proof. $\square$


From Proposition 1, there are essentially two main steps to our main theorem. The first one is the next lemma, and the second one is given in the body of the proof. To begin with, define
$$T_{r,k}:=\sum_{a\in A} a^k\, e\left(\frac{ra}{m}\right)$$
for $r\in\{0,1,\dots,m-1\}$.

Lemma 4 : We have
$$\big|T_{r,k}\big|\le\ 2N^k\cdot \max_{0\le X\le N}\big|F_r(X)\big|$$
for each $1\le r\le m-1$.

Proof : Write
$$T_{r,k}=\sum_{n=1}^N n^k\ \mathbf 1_A(n)\,e(\alpha n)$$
and allow $F_r(0):=0$. We have
$$T_{r,k}=\ N^k F_r(N) - \sum_{n=1}^{N-1}F_r(n)\big((n+1)^k-n^k\big)$$
using Abel summation. Define
$$M_r:=\max_{0\le X\le N} \big|F_r(X)\big|$$
to get
$$\big|T_{r,k}\big| \le\ N^kM_r+\sum_{n=1}^{N-1}M_r\big((n+1)^k-n^k\big) \le\ 2N^kM_r$$
hence completing the proof. $\square$

Equipped with all this, we finally present our promised proof.

Proof of Theorem : Begin by noticing
$$\mathbf 1_{{a\equiv b \pmod m}} = \frac1m\sum_{r=0}^{m-1} e\left(\frac{(a-b)r}{m}\right)$$
for integers $a,b$ and $m$. This gives
\begin{align*} S_b^{(k)} &=\sum_{a\in A} a^k \mathbf 1_{{a\equiv b\pmod m}}\\ &=\frac1m\sum_{r=0}^{m-1} e\left(-\frac{br}{m}\right)\sum_{a\in A} a^k e\left(\frac{ar}{m}\right)\\ &=\frac1m\sum_{r=0}^{m-1} e\left(-\frac{br}{m}\right)\, T_{r,k}\\ &=\frac1mT_{0,k} \,+\, \frac1m\sum_{r=1}^{m-1}e\left(-\frac{br}{m}\right)T_{r,k}\\ &=\frac1m\sum_{a\in A}a^k \,+\, \mathcal R_b \end{align*}
where
$$\mathcal R_b:=\frac1m\sum_{r=1}^{m-1}e\left(-\frac{br}{m}\right)T_{r,k}$$
so that
$$\big|\mathcal R_b\big| \le \frac1m\sum_{r=1}^{m-1} \big|T_{r,k}\big| \le \max_{1\le r\le m-1}|T_{r,k}| \ll\ 2N^k\left(\frac{m}{\sqrt N}+N^{\frac{61}{160}}\log N\right)$$
using Lemma 4 and Proposition 1. This proves
$$S_b^{(k)} =\ \frac1m\sum_{a\in A} a^k \,+\, \mathcal O_k\left(mN^{k-\frac12}+N^{k+\frac{61}{160}}\log N\right)$$
from which Theorem B completes the proof. $\square$

-----------

Now, it is not difficult to see that for a fixed non-principal character $\chi$, one has
$$\sum_{a\in A} \chi(a)\ =\ \sum_{b=0}^{q-1}\chi(b)\,r_b$$
where $r_b := \left|\left\{a\in A :\ a\equiv b\,\pmod q\right\}\right|$. This gives
$$\sum_{a\in A}\chi(a)=\sum_{b=0}^{q-1}\chi(b)\left(r_b-\frac{|A|}{q}\right)$$
implying
$$\left|\sum_{a\in A}\chi(a)\right|\ \ll\ N^{\frac{61}{160}}\ =\ o\left(\sqrt{N}\right)$$
using Theorem A. This now makes us wonder about how dense Sidon sets are expected to behave against cancellations. While I do not have any ideas to contribute to this discussion, I would like to take this opportunity to make the following conjecture.

Conjecture 1 : Let $A\subset [n]$ be a dense Sidon set. Then, we have
$$\sum_{a\in A} \mu(a) = o(\sqrt n)$$
where $\mu$ is the Möbius function.

In fact, Conjecture 1 is probably true more generally with $\mu$ replaced by some suitable function that admits sufficient cancellations. The tools developed in the discussed paper (from which Theorem A is taken) are most certainly not strong enough to handle this question.

In case Conjecture 1 is not true, I would like to know if the following weaker statement is true.

Conjecture 2 : Let $\mathrm D_n$ be the collection of all dense Sidon subsets of $[n]$. Then, we have
$$\sum_{A\in \mathrm D_n}\sum_{a\in A} \mu(a) = o\left(\left\lvert \mathrm D_n\right\rvert \sqrt n\right)$$
where $\mu$ is the Möbius function.

On the other hand, the question at the heart of Conjecture 1 (and 2) is whether dense Sidon sets are multiplicatively independent. This invites us to ask another question of a similar flavour.

Conjecture 3 : Let $A\subset [n]$ be a dense Sidon set. Then, we have
$$\pi(A) \asymp \frac{\sqrt n}{\log n}$$
where $\pi(A)$ is the number of primes in $A$.

Whether order can be replaced with asymptote (with a suitable constant) in Conjecture 3 is an interesting question, but also one that probably asks a bit too much. An upper bound of $\pi(A)(\log n) \ll \sqrt n$ follows from an application of the Selberg sieve. From what I understand, the other direction might be very difficult to achieve. For example, I do not know whether it is easy to even show $\pi(A)>100$ for sufficiently large $n$.

Equidistribution of Sidon Sums in Residue Classes

A set of positive integers $A\subset \mathbb N$ is called a Sidon Set if the equation $a+b=c+d$ does not have any non-trivial solutions in $A$. A Sidon subset $A\subset [n]:=\{1,2,\dots , n\}$ is called dense (or extremal) if $\left\lvert A\right\rvert = \max \left\lvert S\right\rvert$ where the maximum is taken over all Sidon subsets of $[n]$. If $A\subset [n]$ is a dense Sidon set with $|A|=n^{1/2}-L$, then $-n^{\frac 14}\le L\ll n^{\frac{21}{80}}$ (see this and this for a proof).

It is well known that dense Sidon sets are surprisingly symmetric in profound ways. For example, it is known that they are uniformly distributed in intervals and well distributed in residue classes. In a previous work, Ding asked whether the sums of elements of a dense Sidon set over residue classes are also equidistributed. We will prove a stronger result here.

Take a dense Sidon set $A\subset [n]:=\{1,2,\dots, n\}$ with $t:=|A| = n^{1/2}-L^\prime$ and $L=\max\{0,L^\prime\}\le n^{\frac{21}{80}}$. For a fixed $k\in \mathbb N$ and some $m\in \mathbb N$, define
$$S_b^{(k)}(A;m) := \sum_{\substack{a\in A\\a\,\equiv\, b\,\pmod m}} a^k$$
for each $b\in \{0, 1, 2,\dots , m-1\}$.

We will use the following result in our proof.
Theorem A [Balasubramanian-Dutta, Ding] : Let $A=\left\{a_1,\dots ,a_{|A|}\right\}\subset [n]$ be a Sidon set so that $|A|=n^{1/2}-L^\prime$. Then,
$$\sum_{a\in A} a^k = \frac {1}{k +1} \cdot n^{\frac{2k +1}{2}} + \mathcal O \left( n^{\frac{8k +3}{8}} \right) + \mathcal O\left( L^{1/2}\cdot n^{\frac{4k +1}{4}}\right)$$
for $L\le n^{\frac {21}{80}}$.

Similar results can also be proven for Sidon sets in $\mathbb Z_m$ (see this for example).

The main result we prove in this note is the following.
Theorem : For fixed $k,m\in \mathbb N$, we have
$$S_b^{(k)}(A;m)\ =\ \frac{1}{m(k+1)}\,n^{k+\frac 12} \,+\, \mathcal O_{k,m}\left(n^{k+\frac 38}+n^{k+\frac 14}\,\sqrt{L}\right)$$
for all $b\in \{0, 1, 2,\dots , m-1\}$.

Proof of Theorem : The proof will require a lemma that will be proven after the body of the proof.

Let $\omega:=e^{2\pi i/m}$. We have
$$\mathbf 1_{\{x\equiv b \pmod m\}} := \frac1m\sum_{r=0}^{m-1}\omega^{r(x-b)}$$
for any integers $x,b$. Also, we have
$$\Delta_r^{(k)}:=\ \sum_{a\in A} a^k\,\omega^{ra}$$
for $r\in \{0,1,\dots,m-1\}$. This gives
$$S_b^{(k)}(A;m) =\sum_{a\in A}a^k\mathbf 1_{a\equiv b\pmod m} =\frac1m\sum_{r=0}^{m-1}\omega^{-rb}\Delta_r^{(k)}$$
implying
$$S_b^{(k)}(A;m)-\frac1m\sum_{a\in A}a^k \ =\ \frac1m\sum_{r=1}^{m-1}\omega^{-rb}\Delta_r^{(k)}$$
using definition of $\Delta_0^{(k)}$. Hence,
$$\left|S_b^{(k)}(A;m)-\frac1m \sum_{a\in A} a^k\right| \le\ \max_{1\le r\le m-1}\left|\Delta_r^{(k)}\right|$$
and therefore, it suffices to bound $\left|\Delta_r^{(k)}\right|$ for a fixed $r\in\{1,\dots,m-1\}$.

Define
$$d(h):=\ \#\left\{(a,a')\in A\times A:\ a-a'=h\right\}\le 1$$
since $A$ is Sidon. Now, extend $f:\mathbb Z \to \mathbb C$ by
$$f(x):=\omega^{rx}\,\mathbf 1_A(x)$$
so that $f(x)=0$ for $x\notin[1,n]$. Let $H$ be an integer parameter with $1\le H\le n$. Define block sums
$$B(t):=\sum_{i=1}^H f(t+i)$$
for $t\in \mathbb Z$. Also, define the energy
$$E:=\sum_{t=-H}^{n-1} \big|B(t)\big|^2$$
and
$$G:=\left|\sum_{h=1}^{H-1}(H-h)\omega^{-rh}\right|, \qquad M:=\sum_{h=1}^{H-1}(H-h)\,\left(1-d(h)\right)$$
so that
$$E\le Ht + 2G + 2M$$
using triangle inequality and the lemma we prove below. Also, by a simple geometric series argument, we have $G\ll_m H$.

Define the window count
$$X_q:=\#\big(A\cap(q-H,q]\,\big)=\sum_{x=q-H+1}^{q}\mathbf 1_A(x)$$
for $q\in\{1,2,\dots,n+H\}$. Therefore,
$$\sum_{q=1}^{n+H} X_q = Ht$$
since each $a\in A$ belongs to exactly $H$ such windows, namely those with $q\in\{a,a+1,\dots,a+H-1\}$. Also,
\begin{align*} \sum_{q=1}^{n+H} X_q^2 &=\sum_{q}\ \sum_{x,y\,\in(q-H,q]}\mathbf 1_A(x)\mathbf 1_A(y)\\ &=\sum_{x,y\in A}\#\big\{q:\ x,y\in(q-H,q]\big\}\\ &=Ht + 2\sum_{h=1}^{H-1}(H-h)d(h)\\ &\ge \frac{1}{n+H} \left(\sum_{q=1}^{n+H} X_q\right)^2 =\frac{H^2t^2}{n+H} \end{align*}
since for fixed $x,y\in\mathbb Z$, the number of $q$ such that both $x$ and $y$ lie in $(q-H,q]$ equals $H-|x-y|$ if $|x-y|\le H-1$, and equals $0$ otherwise.

Combining all of these now yields
$$2\ \sum_{h=1}^{H-1}(H-h)d(h)\ \ge\ \frac{H^2t^2}{n+H}-Ht$$
implying
\begin{align*} M &=\frac{H(H-1)}2-\sum_{h=1}^{H-1}(H-h)d(h)\\ &\le \frac{H(H-1)}2-\frac12\left(\frac{H^2t^2}{n+H}-Ht\right)\\ &\le \frac12\left(Ht+H^2\left(1-\frac{t^2}{n+H}\right)\right)\\ &\ll\ Ht+\frac{H^3}{n}+\frac{H^2L}{n^{1/2}} \end{align*}
using $t=\sqrt n -L$.

On the other hand, we also have
$$E\ \ll_m\ Hn^{1/2}+\frac{H^3}{n}+\frac{H^2L_+}{n^{1/2}}$$
using the geometric bound mentioned before.

Now, define
\begin{align*} T_k: &=\sum_{t=-H}^{n-1} t^k\, B(t) =\sum_{t=-H}^{n-1} t^k\ \sum_{i=1}^H f(t+i)\\ &=\sum_{x=1}^n f(x)\ \sum_{u=1}^H (x-u)^k = \sum_{\ell=0}^k c_{k,\ell}(H)\,\Delta_r^{(\ell)} \end{align*}
where
$$\Delta_r^{(\ell)}:=\sum_{a\in A}a^\ell\omega^{ra}$$
and
$$c_{k,\ell}(H):=\binom{k}{\ell}(-1)^{k-\ell}\sum_{u=1}^H u^{k-\ell}$$
coming from the Binomial expansion of $(x-u)^k$. It will enough for us to use the crude bounds
$$\big|c_{k,\ell}(H)\big|\le\ \binom{k}{\ell}\sum_{u=1}^H u^{k-\ell}\ \ll_k\ H^{k-\ell+1}$$
and
$$\left|\Delta_r^{(\ell)}\right|\le\ \sum_{a\in A}a^\ell\le\ t\,n^\ell\ll\ n^{\ell+\frac 12}$$
for $\ell\le k-1$.

We now have
$$\Delta_r^{(k)}=\frac1H\left(T_k-\sum_{\ell=0}^{k-1}c_{k,\ell}(H)\, \Delta_r^{(\ell)}\right)$$
since $c_{k,k}(H)=H$. On the other hand,
$$|T_k|^2 =\left|\sum_{t=-H}^{n-1} t^k B(t)\right|^2 \le \left(\sum_{t=-H}^{n-1} t^{2k}\right)\left(\sum_{t=-H}^{n-1}\big|B(t)\big|^2\right) =\left(\sum_{t=-H}^{n-1} t^{2k}\right)\cdot E\ \ll_k\ n^{k+\frac 12}E^{\frac 12}$$
by Cauchy-Schwarz. Combining all of these, and plugging in our crude bounds, we get
$$\left|\Delta_r^{(k)}\right| \le\ \frac{|T_k|}{H} + \frac1H\sum_{\ell=0}^{k-1}\big|c_{k,\ell}(H)\big|\,\left|\Delta_r^{(\ell)}\right| \ll_k\ \frac{n^{k+\frac 12}}{H}E^{\frac 12} + \sum_{\ell=0}^{k-1} H^{k-\ell} n^{\ell+\frac 12}$$
implying
$$E^{\frac 12} \ll_m\ n^{\frac 58}+n^{\frac 12}\sqrt{L}$$
on making the explicit choice $H:=\left\lfloor n^{\frac 34}\right\rfloor$.

The first term becomes
$$\frac{n^{k+\frac 12}}{H}E^{\frac 12} = \mathcal O_m \left(n^{k+\frac 38}+n^{k+\frac 14}\sqrt{L}\right)$$
and the second term simplifies to
$$\sum_{\ell=0}^{k-1} H^{k-\ell}\, n^{\ell+\frac 12}\ \ll_k\ n^{k+\frac 14}$$
on putting $H:=\left\lfloor n^{\frac 34}\right\rfloor$.

Combining all of these yields
$$\left|\Delta_r^{(k)}\right|\ \ll_{k,m}\ n^{k+3/8}+n^{k+1/4}\sqrt{L_+ +1}$$
uniformly for $1\le r\le m-1$. This completes the proof using Theorem A. $\square$

Finally, we prove the lemma we used in the previous proof.

Lemma : We have
$$E =\ H t + 2\Re\sum_{h=1}^{H-1}(H-h)\,\omega^{-rh}\,d(h)$$
where $E$ is the energy defined above.

Proof : Expand
$$E=\sum_{t}\sum_{1\le i,j\le H} f(t+i)\overline{f(t+j)}$$
so that
$$f(t+i)\overline{f(t+j)}=\ \omega^{r(t+i)}\omega^{-r(t+j)}\, \mathbf 1_A(t+i)\mathbf 1_A(t+j) =\ \omega^{-rh}\, \mathbf 1_A(t+i)\mathbf 1_A(t+i+h)$$
letting $h=j-i$ so that $\overline{f(t+j)}=\omega^{-r(t+j)}\,\mathbf 1_A(t+j)$.
    
Summing over $t$ with $f\equiv 0$ outside $[1,n]$ gives
$$\sum_{t=-H}^{n-1}\mathbf 1_A(t+i)\mathbf 1_A(t+i+h)=d(h)$$
for $h\ge 0$ and similarly, $d(-h)$ for $h<0$. The weight $(H-|h|)$ counts the number of pairs $(i,j)$ with $j-i=h$. Collecting terms yields
$$E=\sum_{h=-(H-1)}^{H-1} \big(H-|h|\big)\, \omega^{-rh}\, d(h) =\ Ht + 2\Re\sum_{h=1}^{H-1}(H-h)\,\omega^{-rh}\, d(h)$$
because $d(0)=t$ and $d(-h)=d(h)$. This completes the proof. $\square$


Notice that the equidistribution proof we presented was completely elementary, and does not use any outside results. In the next post, we will show that using more advanced tools (namely the fact that dense Sidon sets are Fourier-pseudorandom), we can prove slightly better results. In particular, we will show that the main theorem of this post is true with a better error term, allowing us to replace the fixed $m$ with an $m$ that can vary upto a power of $n$.

A subcase of Problem #42

Problem #42 on Bloom's website of Erdős Problems asks about a fascinating property of Sidon sets. While the problem itself is wide open, one can still salvage something when the set is "smaller than dense". This is what we will see here.

Theorem: Fix an integer $M\ge 1$. Then there exists $N_0=N_0(M)$ such that for every $N\ge N_0$ and every Sidon set $A\subset [N]:=\{1,2,\dots, N\}$ with $|A|=o(\sqrt N)$, there exists a Sidon set $B\subset [N]$ with $|B|=M$ such that $(A-A)\cap(B-B)=\{0\}$.

To begin with, for a finite set $S\subset \mathbb Z$, define $S-S := \{s-s': s,s'\in S\}$ and $\Delta(S) := \{s-s': s,s'\in S,\ s>s'\}\subset \mathbb N$. Then, $S-S \,=\, \{0\} \cup \Delta(S) \cup (-\Delta(S))$.

First consider the first two lemmas, which are obvious from definition, and then a third one whose proof is provided.

Lemma 1: If $S\subset\mathbb Z$ is Sidon, then the map
$$\phi:\{(s,s')\in S^2: s>s'\}\to \mathbb N,\quad \phi(s,s')=s-s'$$
is injective. In particular, $|\Delta(S)|=\binom{|S|}{2}$.

Lemma 2: Let $S\subset\mathbb Z$ be finite. If all elements of $\Delta(S)$ are distinct (equivalently, $\phi$ above is injective), then $S$ is Sidon.

Lemma 3: Let $N\ge 1$, $F\subset \{1,2,\dots, N-1\}$, and let $M\ge 1$. Let
$$C_M \,:=\, M \,+\, \sum_{r=1}^{M-1} r\binom{r}{2} \,=\, M+\frac12\sum_{r=1}^{M-1}(r^3-r^2)$$
be a constant depending only on $M$. If $N \ge \binom{M}{2} |F| + C_M$, then there exists a set $B=\{b_1<b_2<\cdots<b_M\}\subset [N]$ such that $\Delta(B)\cap F=\emptyset$, and all elements of $\Delta(B)$ are distinct (implying $B$ is Sidon by Lemma 2).

Proof: We build $b_1<b_2<\dots<b_M$ inductively, always choosing the new element to the right of the previous ones. Let $b_1:=1$. For $t\ge 2$, suppose $b_1<\dots<b_{t-1}$ have been chosen so that $\Delta({b_1,\dots,b_{t-1}})\cap F=\emptyset$ and all positive differences among $\{b_1,\dots,b_{t-1}\}$ are distinct. Let $E_{t-1}:=\Delta({b_1,\dots,b_{t-1}})$. Then, $|E_{t-1}|=\binom{t-1}{2}$.

We seek an integer $b_t>b_{t-1}$ such that for every $i\in\{1,\dots,t-1\}$, $b_t-b_i \notin F$ and $b_t-b_i \notin E_{t-1}$. For fixed $i$, the condition $b_t-b_i\in F$ is equivalent to $b_t\in b_i+F$. Similarly $b_t-b_i\in E_{t-1}$ is equivalent to $b_t\in b_i+E_{t-1}$. Therefore the set of "bad" values of $b_t$ is contained in
$$\mathcal B_t:=\bigcup_{i=1}^{t-1}(b_i+F)\ \cup\ \bigcup_{i=1}^{t-1}(b_i+E_{t-1})$$
implying
$$|\mathcal B_t|\le \sum_{i=1}^{t-1}|b_i+F| + \sum_{i=1}^{t-1}|b_i+E_{t-1}| = (t-1)|F| + (t-1)\binom{t-1}{2}$$
by a union bound.

Now consider the consecutive integers $b_{t-1}+1, b_{t-1}+2,\dots$. Among these, at most $|\mathcal B_t|$ are forbidden. Hence the least integer $>b_{t-1}$ satisfying the two required properties obeys $b_t \,\le\, b_{t-1} \,+\,  |\mathcal B_t| \,+\, 1 \,\le\, b_{t-1} \,+\, (t-1)|F| \,+\, (t-1)\binom{t-1}{2} \,+\, 1$. Define an explicit upper-bound sequence $U_t$ by $U_1:=1$ and
$$U_t:=U_{t-1} + (t-1)|F| + (t-1)\binom{t-1}{2}+1$$
for $t\ge 2$. Then, we can choose $b_t\le U_t$ at each step, so in particular $b_M\le U_M$. This gives
$$U_M = 1 + \sum_{t=2}^M \left((t-1)|F| + (t-1)\binom{t-1}{2}+1\right) = \binom{M}{2}|F| + C_M$$
on summing the recurrence. So, $U_M\le N$ and hence $b_M\le N$, so $B:=\{b_1,\dots,b_M\} \subset[N]$ exists. Finally, by construction, no new positive difference equals an old one, hence completing the proof. $\square$

Equipped with all these, we can finally provide a proof of the advertised theorem.

Proof of Theorem: Let $A\subset [N]$ be Sidon and let $|A|=o(\sqrt{N})$. Let $F:=\Delta(A)\subset{1,\dots,N-1}$. By Lemma 1, $|F| = |\Delta(A)| = \binom{|A|}{2} \le \frac{|A|^2}{2}=o(N)$. For a fixed $M$, we have
$$N \ge \binom{M}{2}|F| + C_M$$
for all sufficiently large $N$. So, there exists $B\subset[N]$ with $|B|=M$ and $B$ Sidon, so that
$$\Delta(B)\cap\Delta(A)=\emptyset$$
using Lemma 3.

It remains to show $(A-A)\cap(B-B)=\{0\}$. Let $x\neq 0$ and $x\in(A-A)\cap(B-B)$. Then $|x|\in\Delta(A)$ and $|x|\in\Delta(B)$, a contradiction. This gives
$$(A-A)\cap(B-B)=\{0\}$$
hence completing the proof. $\square$

On the Sidon Tails of $\left\{\left\lfloor x^n\right\rfloor\right\}$

A set of positive integers $A\subset \mathbb N$ is called a Sidon Set or a Sidon Sequence if the equation $a+b=c+d$ does not have any non-trivial solutions in $A$. It is known (and fairly easy to prove) that the sets
$$\mathbf S_x := \big\{\left\lfloor x^n\right\rfloor : n\in \mathbb N\big\}$$
are Sidon for $x\ge 2$. In a recent preprint, I investigated the tails of these sequences in the range $x\in (1,2)$. In particular, one of the results I proved is that $\mathbf S_x$ has Sidon tail for almost all $x\in (1,2)$. It was only after I uploaded the preprint, and it was only after I sent the paper to a journal, that I realised that an even stronger statement is true. This is what I will prove here.

To keep this post self-contained, we will call an $\mathbf S_x$ tail Sidon if there exists $N_0$ such that the set
$$\big\{\left\lfloor x^n\right\rfloor : n\ge N_0\big\}$$
is a Sidon set.

Theorem: If $\mathbf S_x$ is not tail Sidon, then $x$ is algebraic. In particular, one can produce integers $u> w\ge v\ge 1$ such that
$$1+y^u-y^v-y^w=0 \qquad\text{or}\qquad 1-y^v-y^w=0$$
for $y:=\frac 1x \in (0,1)$.

Proof: Assume an $x$ for which $\mathbf S_x$ is not tail Sidon. Note that $\mathbf S_x$ is eventually increasing - so, from now on, we will only work in this increasing tail. Let $\mathbf x_n:=\left\lfloor x^n\right\rfloor$. It is easy to show that if there are indices $a$, $b$, $c$ and $d$ such that $\mathbf x_d+ \mathbf x_a = \mathbf x_b+ \mathbf x_c$, then $a<b\le c<d$.

Now, write $\theta_n := x^n-\lfloor x^n\rfloor \in [0,1)$. If $\mathbf x_d+\mathbf x_a=\mathbf x_b+\mathbf x_c$, then
$$x^d+x^a-x^b-x^c = (\theta_d+\theta_a)-(\theta_b+\theta_c)$$
implying $\bigl|x^d+x^a-x^b-x^c\bigr|<2$.

Now, define
$$y:=\frac1x\in(0,1),\qquad u:=d-a,\quad w:=d-b,\quad v:=d-c$$
so that
$$\bigl|1+y^u-y^w-y^v\bigr|<2y^d$$
for infinitely many quadruples $(u,v,w,d)$ with $d\to\infty$.

Choose $D_0$ for which $4y^{D_0}<1$. Consider any collision $-2y^d < 1+y^u-y^w-y^v < 2y^d$ for $d\ge D_0$. This gives
$$y^v+y^w > 1+y^u-2y^d \ge 1-2y^d > \frac12$$
since $y^u\ge 0$ and $2y^d<\tfrac12$. Since $v\le w$ and $0<y<1$, we have $y^v\ge y^w$, hence $y^v+y^w \le 2y^v$. Therefore $2y^v>\tfrac12$, i.e. $y^v>\tfrac14$. Because $v\mapsto y^v$ is strictly decreasing, there is a bound $V=V(y)$ such that for every collision with $d\ge D_0$, one has $v\le V$.

Using $y^v+y^w > 1-2y^d$ in a similar way as in the last argument, we also get a bound $W=W(y)$ such that for every collision with $d\ge D_1$, one has $w\le W$. In other words, all sufficiently large collisions lie in the finite set $\{(v,w)\in\mathbb N^2:\ 1\le v\le w\le W\}$. Since there are infinitely many collisions with $d\to\infty$, by the pigeonhole principle there exist fixed integers $v_0$, $w_0$ and an infinite subsequence of collisions along which $(v,w)=(v_0,w_0)$ and $d\to\infty$.

Let $C:=C(y):=y^{v_0}+y^{w_0}$ so that
$$\bigl|1+y^{u}-C\bigr|<2y^{d}\xrightarrow[]{d\to\infty}0$$
so that $y^{u}\to C-1$.

If $C=1$, then $y^{v_0}+y^{w_0}=1$, i.e., $1-y^{v_0}-y^{w_0}=0$.

If $C>1$, then $y^{u}\to C-1>0$. Since $y^n\to 0$, the exponents $u$ cannot tend to $\infty$; more formally, choose $M$ such that $2y^M<C-1$. For all sufficiently large terms of the subsequence, $2y^{u}>C-1$ implying $u<M$. Thus, $u$ takes values in a finite set $\{1,2,\dots,M-1\}$, so there exists $u_0$ and a further infinite subsequence with $u=u_0$ constant.On that further subsequence, we have
$$\bigl|1+y^{u_0}-C\bigr|<2y^{d}\xrightarrow[]{d\to\infty}0$$
implying $1+y^{u_0}-C = 0$, and hence, $1+y^{u_0}-y^{v_0}-y^{w_0}=0$.

This completes the proof in both cases.

Dissociated sets avoiding geometric progressions

In a recent preprint, I dealt with sets with distinct subset sums. A set $\mathcal S\subset \mathbb N$ is said to be a subset-sum-distinct or dissociated (when studied in other contexts) if all of its finite subsets have different sums. Alternately, an equivalent classification is if any equality of the form
$$\sum_{s\in \mathcal S} \varepsilon_s \cdot s =0$$
where $\varepsilon_s \in \{-1,0,+1\}$ implies that all the $\varepsilon_s$'s are $0$.

Some two and a half years ago, I was playing around with the greedy algorithm on these sets, and noticed something strange. This starts with two given integers $a>0$ and $b>a$. It sets $\gamma_1=a$, $\gamma_2=b$ and in the $r$-th step it chooses $\gamma_r$ to be the smallest integer greater than $\gamma_{r-1}$ such that the sequence $\{\gamma_k\}_{k=1}^r$ is dissociated. For any given $a,b$, let $\Gamma:=\Gamma_{a,b}=\{\gamma_1=a, \gamma_2=b, \gamma_3, \dots \}$ be the dissociated sequence that the greedy algorithm produces. From purely numerical experiments, I noticed that there is always an $n_0=n_0(a,b)$ such that
$$\gamma_n = 2\cdot \gamma_{n-1}$$
for all $n\ge n_0$. I had to wait all this time to finally be able to prove this. One can see the proof in the preprint.

From this result (and some other observations I made in and outside the preprint), I thought that a large dissociated set must contain a lot of geometric progressions. In the preprint, I answered this in the negative by constructing a large collection of dissociated sets which also do not contain any three $a,b,c$ with $b^2=ac$. In this post, I will give an explicit nice construction of such a set.

First of all, note that for a dissociated set $\mathcal S$, we have
$$\left\vert\mathcal S(n)\right\vert \le \log_2 n + \mathcal{O}( \log_2\log_2 n)$$
where $\mathcal S(n) := \mathcal S \cap [1,n]$. This is because the map $f : \mathcal P\left(\mathcal S(n)\right) \to \mathbb N$ that takes a subset of $\mathcal S(n)$ to the sum of its elements is injective, hence giving $2^{\left\vert\mathcal S(n)\right\vert} \leq n\cdot \left\vert\mathcal S(n)\right\vert +1$, hence completing the proof.

So, we will now construct a dissociated set of size $\sim \log_2 n$ that does not contain any GP.

Theorem : The sequence $\{\psi_n\}_{n=1}^\infty$ defined as
$$\psi_n = \frac{2^{n+2}}{7} + \frac{1}{7\sqrt 3}\cdot\sin \left(\frac{2\pi n}{3}\right) + \frac 37 \cdot \cos \left(\frac{2\pi n}{3}\right)$$
is dissociated and does not contain any three term geometric progression.

The proof of this follows from a series of simple and straightforward observations that we will list as Lemmas. We should of course begin with the obvious
Lemma 1 : The values of $\{\psi_n\}_{n=1}^\infty$ are given by
\begin{align*}&\psi_{3k} = \frac{4\cdot 8^k}{7} + \frac 37\\&\psi_{3k+1} = \frac{8^{k+1}}{7} - \frac 17\\&\psi_{3k+2} = \frac{2\cdot 8^{k+1}}{7} - \frac 27\end{align*}
for all $k$. So, $\psi_n$ is a positive integer, and can be expressed as
$$\psi_n = \left\lVert \frac{2^{n+2}}{7} \right\rVert$$
where $\lVert \cdot \rVert$ represents rounding to the nearest integer.

Proof : The expressions for $\psi_{3k+i}$, $i=0,1,2$ are a matter of straightforward computation. The fact that they are all integers follows from the simple observation that
$$8^k = \left(7+1\right)^k \equiv 1 \pmod 7$$
for all $k$. $\square$


Lemma 2 : The sequence $\{\psi_n\}_{n=1}^\infty$ is strictly increasing. In fact, we have
\begin{align*}&\psi_{3k} = 2\cdot \psi_{3k-1} + 1\\&\psi_{3k+1} = 2\cdot \psi_{3k} - 1\\&\psi_{3k+2} = 2\cdot \psi_{3k+1}\end{align*}
for all $k\in \mathbb N$. So,
$$\psi_n = \psi_{n-1}+\psi_{n-2}+2\cdot \psi_{n-3}$$
which admits
$$\frac 1{1 - x - x^2 - 2x^3} = \frac 1 {(1-2x)(1+x+x^2)}$$
as the generating function.

Proof : The expressions for $\psi_{3k+i}$, $i=0,1,2$ follow immediately from Lemma 1. This, used in an Induction argument, proves the recursion. The generating function is a simple consequence of it. $\square$


Lemma 3 : The sequence admits a nice modularity property. In particular
\begin{align*}&\psi_{3k} \equiv 5 \pmod 8\\&\psi_{3k+1} \equiv 1 \pmod 8\\&\psi_{3k+2} \equiv 2 \pmod 8\end{align*}
for all $k\in \mathbb N$.

Proof : We begin by calculating that $\psi_1=1$, $\psi_2=2$ and $\psi_3=5$. The rest of the proof is immediate from Lemma 2. $\square$

Lemma 4 : We have
$$\sum_{i=1}^m \psi_i < \psi_{m+1}$$
for all $m\in \mathbb N$, $m\ge 2$.

Proof :  Notice that the lemma is trivial to check for $m=2,3,4$.
Now, let us assume
$$\sum_{i=1}^k \psi_i < \psi_{k+1}$$
for some $k$. This gives
$$\sum_{i=1}^{k+1} \psi_i < 2\cdot\psi_{k+1}$$
and hence
$$\sum_{i=1}^{k+3} \psi_i < \psi_{k+4}$$
using the recursion discussed in Lemma 2.
So, an induction argument completes the proof. $\square$


Equipped with these lemmas, it is now possible to give a proof of the Theorem.

Proof : We will first show that the set is dissociated. To do so, let us assume that it's not. If possible, let there be indexing sets $\{i_k\}_{k=1}^n$ and $\{j_k\}_{k=1}^m$ such that
$$\psi_{i_1}+\dots \psi_{i_n} = \psi_{j_1}+\dots \psi_{j_m}$$
where all the $\psi_i$'s are distinct.

Without loss of generality, let $\psi_{i_1}$ be the largest of these $\psi_k$'s. By Lemma 4, this implies
$$\psi_{i_1} > \psi_{j_1}+\dots \psi_{j_m}$$
and hence
$$\psi_{i_1}+\dots \psi_{i_n} > \psi_{j_1}+\dots \psi_{j_m}$$
which is a contradiction.

Now, we wish to show that there are no three term geometric progressions in the sequence. On the contrary, let us assume
$$\psi_i^2 = \psi_j\cdot \psi_k$$
for some $\psi_i$, $\psi_j$, $\psi_k$ in our sequence. We begin by noticing that Lemma 3 implies that either
$$i,j,k \equiv 0,1,1 \pmod 3$$
or $i\equiv j\equiv k\pmod 3$.

For the first case, let us assume that there are integers $j,k,\ell\ge 2$ (the other cases can be checked by hand) such that
$$\psi_{3k}^2 = \psi_{3j+1} \cdot \psi_{3\ell +1}$$
which implies
$$\left(4\cdot 8^k + 3\right)^2 = \left(8^j-1\right)\cdot \left(8^\ell-1\right)$$
by Lemma 1. This is a contradiction since the LHS and the RHS are $9$ and $1$ modulo $64$ respectively.

Now, let us consider the case $i\equiv j\equiv k \equiv 0\pmod 3$. Again as before, we assume
$$\psi_{3k}^2 = \psi_{3j} \cdot \psi_{3\ell}$$
where without loss of generality, it may be assumed that $\ell<k<j$. By Lemma 1, we have
$$8^{2k+1}-8^{\ell+j+1} = 12\cdot \left(8^j + 8^\ell -8^k\right)$$
and hence
$$8^{2k+1-\ell} - 8^{j+1} = 12\times \left(8^{j-\ell} - 8^{k-\ell} + 1\right)$$
which is a contradiction since the LHS and RHS do not match modulo $8$.

The cases $i\equiv j\equiv k \equiv 1\pmod 3$ or $i\equiv j\equiv k \equiv 2\pmod 3$ are to be dealt similarly as the previous case. $\square$

Some Expressions of $\pi$

The idea is to derive some expressions for describing $\pi$ starting from the basic definition. The steps will be outlined and the reader will need to fill most of the details up. Of course, convergence questions will be ignored since checking them is routine. Take this as a late pi-day post if you wish to.

Start with the circle described by
$$\left(x=\frac{1-t^2}{1+t^2}, y=\frac{2t}{1+t^2}\right)$$
to get
$$\frac{\pi}{4} = \int_0^1\frac{dt}{1+t^2} = \frac12\int_0^1\frac{s^{-1/2}}{1+s}\,ds$$
using elementary calculus.

Now define
$$G(z)\;:=\;\frac12\int_0^1\frac{s^{-1/2}}{1+z s}\,ds =\frac12\sum_{n\ge0}(-z)^n\int_0^1 s^{n-1/2}\,ds =\sum_{n\ge0}\frac{(-z)^n}{2n+1}$$
and note that using $z=1$ gives
$$\boxed{\frac{\pi}4=\sum_{n\ge0}\frac{(-1)^n}{2n+1}}$$
which is the famous Madhava-Gregory-Leibnitz-(enter your favourite mathematician) series for $\pi$. Squaring both sides of this "carefully", one gets
$$\boxed{\frac{\pi^2}{6}=\sum_{n\ge 1} \frac 1{n^2}}$$
which is a famous result of Euler. This "careful squaring" is an exercise in Borwein & Borwein’s Pi and the AGM (Wiley, 1987).

Now, consider the character $\chi$ mod $4$ and define
$$F(s) := \frac{1}{1 - 2^{-s}} \cdot \prod_{p\in \mathcal P} \frac{1}{1 + \frac{\chi(p)}{p^s}}$$
where $\mathcal P$ is the set of odd primes.

A little algebra shows
$$F(s) = \frac{\zeta(2s)(1 + 2^{-s})}{L(s, \chi)}$$
where
$$\zeta(s) := \sum_{n\ge 1} \frac{1}{n^s}, \qquad L(s, \chi) := \sum_{n\ge 1} \frac{\chi(n)}{n^s} = \prod_{p} \frac{1}{1 - \frac{\chi(p)}{p^s}}$$
for our chosen character.

Taking $s\to 1^+$ proves
$$\boxed{\pi = \sum_{m=1}^{\infty}\frac{(-1)^{s(m)}}{m}}$$
where $s(m)$ counts the number of appearances of primes of the form $4k+1$ in the prime decomposition of $m$.

We return back to $G$ and note that $G$ can also be expressed as the Stieltjes integral
$$G(z)=\int_0^1\frac{d\mu(s)}{1+zs}$$
for the positive measure $\mu$ on $[0,1]$ with density $d\mu(s)=\tfrac12 s^{-1/2}ds$. Now, show that any function of this type admits an expression of the form
$$G(z)=\cfrac{1}{\,1+\cfrac{\beta_1 z}{3+\cfrac{\beta_2 z}{5+\cfrac{\beta_3 z}{7+\ddots}}}\,}$$
by looking at the Taylor series and uniquely determining the finite continued fraction convergents successively so that the $k$-th convergent matches the Taylor series up to $z^k$.

The crucial part is to show $\beta_n = n^2$. If we can prove this, then we get
$$\boxed{\frac{\pi}{4} = \cfrac{1}{\,1+\cfrac{1^2}{3+\cfrac{2^2}{5+\cfrac{3^2}{7+\ddots}}}\,}}$$
from which one also gets
$$\boxed{\frac{\pi}{4}=1+ \cfrac{1}{3- \cfrac{3\cdot 4}{1- \cfrac{2 \cdot 3}{3- \cfrac{5 \cdot 6}{1-\cfrac{4 \cdot 5}{\ddots}}}}}}$$
after some simple algebra.

Indeed, start with the first form and write
$$\begin{align*}\frac 4{\pi} -1 &= \frac{1}{3+\dfrac{4}{5+\dfrac{9}{7+\cdots}}} = \frac{1}{3+\dfrac{4}{\,\dfrac{\left(5+\dfrac{9}{7+\cdots}\right)}{1}\,}} = \frac{1}{\,3+\dfrac{4}{\dfrac{5+\dfrac{9}{7+\cdots}}{1}\,}}\\&= \frac{1}{\,3+\dfrac{4}{5+\dfrac{9}{7+\cdots}}\,} = \frac{1}{\,3-\dfrac{12}{\,1-\dfrac{6}{3-\dfrac{20}{1-\cdots}}\,}\,}\end{align*}$$
and continue in this manner at each step to get the second form.

Finally, to show $\beta_n=n^2$, we define
$$m_k=\int_0^1 s^k\frac{1}{2}s^{-1/2}ds=\int_0^1 u^{2k}\,du=\frac{1}{2k+1}$$
and let
$$\Delta_n=\det\bigl(m_{i+j}\bigr)_{0\le i,j\le n}$$
with the convention $\Delta_{-1}:=1$ and $\Delta_0=m_0$. By writing the orthogonality linear systems for the polynomial coefficients and solving them by Cramer’s rule, one gets
$$\beta_n=\frac{\Delta_n\Delta_{n-2}}{\Delta_{n-1}^2}$$
for $n\ge 1$. On the other hand, it is also possible to show (using Induction or otherwise) that
$$\Delta_n =\frac{2^{\,n(n+1)}\left(\prod_{m=1}^n m!\right)^{2}}{\prod_{k=0}^{2n}(2k+1)^{\min(k+1,2n-k+1)}}$$
for $n\ge 1$. Plugging this into the equation of $\beta_n$ now completes the proof.

A special case of Problem #355

In the online database of Erdős Problems, the following is listed as problem #355. A sequence of natural numbers $A\subset \mathbb N$ is called a lacunary sequence if $a_1<a_2<\dots$ and there is a $\lambda_A>1$ so that
$$\frac{a_{n+1}}{a_n} \ge \lambda_A$$
for all $n\ge 1$. Consider the set
$$\mathcal S_A := \left\{ \sum_{a\in A'}\frac{1}{a} : A'\subseteq A\textrm{ finite}\right\}$$
for such a sequence. The question is whether there exists a lacunary sequence $A$ for which $\mathcal S_A$ contains all rationals in some open interval.

While this question is quite a difficult one to answer, a special case is pretty elementary. I recently came to know that the results discussed in this post are pretty well known and are often attributed to Kakeya from more than a hundred years ago. I would like to thank Prof. Vjekoslav Kovač for this information.


Theorem 1 : Let $A = \{a_1 < a_2 < \cdots\} \subset \mathbb{N}$ be a lacunary sequence with $\lambda:=\lambda_A>2$. Then $\mathcal S_A$ does not contain all rationals in an open interval.

Proof : Set $u_n:=1/a_n$ and $r=1/\lambda<1/2$. By lacunarity, there is a constant $C=u_1/r$ such that
$$u_n\le Cr^n$$
for all $n\ge 1$. Note that every $s\in \mathcal S_A$ is of the form
$$s=\sum_{n\ge 1} \varepsilon_n u_n$$
for $\varepsilon_n\in \{0,1\}$ and $\varepsilon_n\ne 0$ for finitely many $n$.
    
Now, fix an $N\in \mathbb N$ and for each $\mathbf \varepsilon = (\varepsilon_1,\varepsilon_2,\dots , \varepsilon_N) \in \{0,1\}^N$, define
$$s_{\mathbf{\varepsilon}} := \sum_{n=1}^N \varepsilon_n u_n, \qquad R_N := \sum_{n>N} u_n$$
to be the first $N$ digits and the tail after that. But for each sequence $\{\varepsilon_n\}_{n\ge 1}$, we have
$$\sum_{n\ge 1} \varepsilon_n u_n \in \left[s_{\mathbf\varepsilon}, s_{\mathbf{\varepsilon}}+R_N\right]$$
for all $N\in \mathbb N$. So, define
$$\mathcal U_N := \bigcup_{\mathbf{\varepsilon}\in \{0,1\}^N} \left[s_{\mathbf\varepsilon}, s_{\mathbf{\varepsilon}}+R_N\right]$$
and note that $\mathcal S_A\subset \mathcal U_N$.

Now, $\mathcal U_N$ being a finite union of closed sets is closed. Also, it is clear that
$$\overline{\mathcal S_A}\subset \bigcap_{n\ge 1} \mathcal U_N$$
where $\overline{\mathcal S_A}$ is the closure of $\mathcal S_A$.

But by definition, we have
$$R_N \le \sum_{n\le N} Cr^n \le C' r^N$$
for a suitable constant $C'$. Now, $\mathcal U_N$ is the union of at most $2^N$ intervals. This implies
$$\left\lvert \mathcal U_N\right\rvert \le 2^N R_N \le C' (2r)^N \xrightarrow{N\to \infty} 0$$
since $\lambda>2$. This means that the Lebesgue measure
$$\mu\left(\bigcap_{n\ge 1} \mathcal U_N\right) = 0$$
and hence
$$\mu\left(\overline{\mathcal S_A}\right) = 0$$
implying $\overline{\mathcal S_A}$ has empty interior.

So, if there was an open interval $I$ such that $I\cap \mathbb Q \subset \mathcal S_A$, then this would imply
$$I = \overline{I\cap \mathbb Q} \subset \overline{\mathcal S_A}$$
which is a contradiction.


In fact, we can prove a stronger statement.

Theorem 2 : For a lacunary sequence $A$, we have
$$\dim_H\left(\overline{\mathcal S_A}\right) \le \frac{\log 2}{\log \lambda}$$
where $\dim_H(X)$ is the Hausdorff dimension of a set $X$.

Proof : With notations as in the previous proof, the $s$ dimensional Hausdorff content at scale $R_N$ satisfies
$$\mathcal H^s_{R_N} \left(\overline{\mathcal S_A}\right) \le 2^N (2R_N)^s$$
for $s>\frac{\log 2}{\log \lambda}$.

Using the bound for $R_N$, we have
$$\mathcal H^s_{R_N} \left(\overline{\mathcal S_A}\right) \le \left(\frac{2C r^{\,}}{1-r}\right)^s\cdot \left(2 r^s\right)^N$$
and hence
$$\lim_{N\to \infty} \mathcal H^s_{R_N} \left(\overline{\mathcal S_A}\right) = 0$$
since $2r^s=2\lambda^{-s}<1$.

Hence, $\dim_H\left(\overline{\mathcal S_A}\right) \le s$. Letting $s\downarrow \frac{\log2}{\log\lambda}$ completes the proof.


Remark : If we also impose $\lambda>2$ in Theorem 2, then $\dim_H\left(\overline{\mathcal S_A}\right)<1$ and hence $\overline{\mathcal S_A}$ has empty interior, hence proving Theorem 1.

Equidistribution of Sidon Sums in Residue Classes II

In the last post , we introduced what we are trying to achieve in this post. However, we will write this note in a way that is independent o...