Search

Another Beautiful Pigeon-Hole Argument

I first encountered this question in an assignment in our Graph Theory and Combinatorics course. Later, I also saw this in an MSE post and answered immediately. I was extremely happy to be able to solve this problem and now seems to be a good time to put it on this blog.

Problem Statement -
Let $S=\{1,2,\dots ,280\}$. Find the smallest natural number $n$ such that every $n$-element subset of $S$ contains $5$ pairwise relatively prime numbers.

Solution -
Before we begin with the real question, we note that $\lfloor\sqrt{280}\rfloor=16$. This means that all composite numbers below $280$ must have one prime factor in the set $\{2,3,5,7,11,13\}$. We will use this fact in our answer.

First, we will consider the subset $A$ of $S$ that only contains multiples of $2,3,5$ and $7$. We can use Principle of Inclusion and Exclusion to find the cardinality of $A$ as
\begin{align*} |A|=&\left\lfloor\tfrac{280}{2}\right\rfloor+\left\lfloor\tfrac{280}{7}\right\rfloor+\left\lfloor\tfrac{280}{3}\right\rfloor+\left\lfloor\tfrac{280}{5}\right\rfloor\\ &-\left\lfloor\tfrac{280}{14}\right\rfloor-\left\lfloor\tfrac{280}{6}\right\rfloor-\left\lfloor\tfrac{280}{10}\right\rfloor-\left\lfloor\tfrac{280}{15}\right\rfloor-\left\lfloor\tfrac{280}{21}\right\rfloor-\left\lfloor\tfrac{280}{35}\right\rfloor\\ &+\left\lfloor\tfrac{280}{70}\right\rfloor+\left\lfloor\tfrac{280}{105}\right\rfloor+\left\lfloor\tfrac{280}{42}\right\rfloor+\left\lfloor\tfrac{280}{30}\right\rfloor\\ &-\left\lfloor\tfrac{280}{210}\right\rfloor \end{align*}
which gives $|A|=216$.

Now, note that, $A$ only has multiples of four prime numbers. So, using Pigeon Hole Principle, we can say that any five elements of $A$ must have at least two multiples of a particular prime, which proves that any $5$ element subset of $A$ is NOT pairwise relatively prime.

This proves that if we want to have an $n$-element subset of $S$ such that any five of them will be relatively prime, then we must have $n>216$.

Now, notice that the cardinality of $A$ is $216$ and $A$ contains exactly $4$ prime numbers (namely, $2,3,5$ and $7$). So, the other $212$ numbers in $A$ (and hence in $S$) are composite.

Let us try to find the other composite numbers in $S$. These must be of the form $11m$ or $13n$ for some $m,n\in \mathbb N$ such that $m,n>10$ (since all multiples till $10$ have already been counted). It is easy to check that the only such composite numbers are  $121,143,169,187,209,221,247,253$. That gives precisely $8$ more composite numbers in $S$. So, $S$ has $220$ composite numbers and $59$ primes (with a $1$ which neither).

Now, consider a subset $X$ of $S$ such that $|X|=217$. We claim that there are five elements in $X$ such that any two of them are pairwise relatively prime. To prove that, let us assume on the contrary that our claim is false. So, any five element subset of $X$ must have two relatively non-prime elements. This forces the number of prime numbers in $X$ to be less than or equal to $4$ since if there are more than four prime numbers, then the set of these primes has all pairwise relatively prime numbers. So, there are at most four primes in $X$ which means that there are at least $213$ composite numbers in $X$. This means that the number of composite numbers outside $X$ (i.e., in $S\backslash X$) is at most $7$.

Now, consider the following eight sets
\begin{align} &\{2^2,3^2,5^2,7^2,13^2\}\\ &\{2\times31,3\times29,5\times23,7\times19,11\times17\}\\ &\{2\times47,3\times43,5\times41,7\times37,13\times19\}\\ &\{2\times41,3\times37,5\times31,7\times29,11\times23\}\\ &\{2\times23,3\times19,5\times17,7\times13,11^2\}\\ &\{2\times43,3\times41,5\times37,7\times31,13\times17\}\\ &\{2\times37,3\times31,5\times29,7\times23,11\times19\}\\ &\{2\times29,3\times23,5\times19,7\times17,11\times13\} \end{align}
and note that the $40$ elements in these $8$ sets are all distinct because of the Unique Factorization Theorem. Also, since there are eight sets, and at most seven composite numbers can be outside $X$, by the Pigeon Hole Principle, at least one of these sets must be in $X$. Also, clearly, in all of the above sets, the elements are mutually co-prime.

This completes the proof. $\square$

Four Proofs of a Theorem

A set of positive integers $A\subset \mathbb N$ is called a Sidon Set or a Sidon Sequence if the equation $a+b=c+d$ does not have any non-trivial solutions in $A$. They were named after Hungarian mathematician Simon Sidon who was inspired by certain problems in Fourier series to ask Erdős about the possible growth of such sequences.

Since then, there has been an extensive amount of work on this topic exploring a plethora of different questions about finite and infinite Sidon sets, and various generalizations of them. The classic result regarding bounds on sizes of Sidon sets was given by Erdős and Turan who also provided a first proof of their theorem. However, here we will discuss four other proofs which I find are easier than the original one. They all yield the same bound, with the exception of Ruzsa's proof which when refined can change the final constant to $1/2$ instead of $1$.


Theorem (Erdős, Turan) : Let $A\subset [n]$ be Sidon Set. Then

$$\left\lvert A\right\rvert < n^{1/2}+n^{1/4}+1$$

where $[n] := \{1,2,\dots , n\}$.


Proof 1 (Lindström) : Let $A=\{a_1<a_2\dots <a_k\}\subset [n]$, $a_j-a_i$ are all distinct. Indeed this property of all the differences (except $0$) being distinct is a defining feature of Sidon sets and is often one of the most important ingredients in proofs of quite strong results.

Define

$$S_1 := \sum_{1\le j\le k-1} (a_{j+1}-a_j) = a_k-a_1$$

so that

$$\frac{(k-1)\cdot k}{2}\le S_1\le n$$

as $S_1$ is the sum of $(k-1)$ distinct integers, and is therefore bigger than the sum of the first $(k-1)$ integers. This yields the bound $k\lesssim \sqrt{2n}$ and was already observed by Erdős. But to refine this further, we need to test the limits of this idea.

Define

$$S_2 := \sum_{1\le j\le k-1} (a_{j+2}-a_j) = a_k+a_{k-1}-a_2-a_1\le 2n$$

so that

$$\frac{(2k-3)\cdot (2k-2)}{2}\le S_1+S_2\le 3n$$

which again follows from the fact that the differences considered in $S_1$ are all different from those considered in $S_2$ and hence $S_1+S_2$ is the sum of $(2k-2)$ distinct integers. This by itself improves the bound to $k\lesssim \sqrt{1.5 \; n}$. Of course we do not need to stop here and can indeed continue to the $\ell$-th iteration. After iterating this $\ell$ times, we have

$$\frac{D\cdot (D+1)}{2}\le S_1+\dots +S_\ell \le (1+2+\dots +\ell)n$$

where $D := (k-1)+\dots +(k-\ell)=k\ell -\frac{\ell\cdot (\ell+1)}{2}$.

Rearranging, we get

$$k-\frac{\ell +1}{2}\le \sqrt{\frac{n\cdot (\ell+1)}{\ell}}$$ 

which gives

$$k\le \sqrt n + \frac{\sqrt n}{2\ell} + \frac{\ell}{2} + \frac 12$$

using the inequality $\sqrt{1+x}\le 1+x/2$.

It is easy to check that $\ell =\left\lfloor n^{1/4}\right\rfloor$ is the optimal choice, hence completing the proof. $\square$


Proof 2 (Ruzsa) : This proof uses a lemma which is fairly well known and appears in different texts and in a variety of different contexts. It is usually attributed to Johnson although it has been rediscovered several times.

Lemma : Let $\mathcal A$ be a family of $k$-sets $A_1,\dots ,A_m$ so that $A_i\cap A_j$ has cardinality atmost $t$ for any $i,j$. Then,

$$v \le \frac{k^2m}{tm+k-t}$$

where $v:= \left\lvert \cup A_i \right\rvert$.

Proof of Lemma : For an element $x\in \cup A_i$, let $d_x= \left\lvert \{i : x\in A_i\}\right\rvert$ so that

\begin{align*} tm(m-1)+mk&\ge \sum_{i}\left(\sum_{j} \left\lvert A_i\cap A_j\right\rvert\right)\\ &=\sum_{x} d_x^2\ge \frac{\left(\sum d_x\right)^2}{v}=\frac{k^2m^2}{v} \end{align*}

hence completing the proof. $\square$

Now let $A\subset [n]$ be a Sidon set and let $k := \left\lvert A\right\rvert$. Take a positive integer $m$ and construct

$$\mathcal A := \{A_i := A+(i-1) : i\in [m]\}$$

where $A+(i-1):=\{a+i-1 : a\in A\}$. Notice that $\cup A_i \subseteq [n+m-1]$.

The key observation is that $\left\lvert A_i\cap A_j\right\rvert\le 1$ for $i\neq j$ as if $x,y\in A_i\cap A_j$ for some $i<j$, then we have $a,b,c,d\in A$ with $a+i=x=b+j$ and $c+i=y=d+j$ implying $a+d=(x-i)+(y-j)=(x-j)+(y-i)=b+c$, and hence $x=y$.

Applying Lemma with $t=1$ and $v\le n+m-1$, we get

$$(n+m-1)(m+k-1)\ge k^2 m$$

hence completing the proof. $\square$


Proof 3 (Graham) : This is the shortest proof in the list. In fact, this is probably one of those proofs that are so short and immediate that they feel like magic. The proof is essentially a direct application of a well known lemma.

Recall the Weyl - van der Corput Lemma. This states that if $\xi(n)$ is a real valued function supported on the interval $[1, N]$, then

$$\left\lvert \sum_{n} \xi(n)\right\rvert ^2 \le \frac{N+H-1}{H} \sum_{\left\lvert h\right\rvert <H} \left(1-\frac{\left\lvert h\right\rvert}{H}\right)\sum_{n} \xi(n)\xi(n+h)$$

for any positive integer $H$.

Now, let $A$ be a Sidon set and $\xi(n)$ be the characteristic function of $A$, i.e., $\xi(n)=1$ if $n\in A$ and $\xi(n)=0$ otherwise. We have

$$\left\lvert A\right\rvert^2 \le \frac{N+H-1}{H} \left(\left\lvert A\right\rvert +H-1\right)$$

since $\sum_{n} \xi(n)\xi(n+h)=\left\lvert A\right\rvert$ if $h=0$ and $\sum_{n} \xi(n)\xi(n+h)\le 1$ otherwise. This is due to the fact that for a fixed constant $h$, there can be atmost one $n$ for which both $n$ and $n+h$ are in $A$ (since all differences in $A$ are distinct).

Using $H=\left\lfloor N^{3/4}\right\rfloor+1$ completes the proof. $\square$


Proof 4 (Cilleruelo) : This proof will require the introduction of some new definitions, including set additions. For $A,B\subset \mathbb N$ we define $A+B:=\{a+b : a\in A, b\in B\}$ to be the collection of all sums of $A$ and $B$. The set $A-B$ is defined analogously.

Now, for $A,B \subset \mathbb N$, let us define

$$r_{A+B} (z) := \left\lvert\{z=a+b : a\in A, b\in B\}\right\rvert$$

to be the number of representations of an integer $z$ as sum of an element from $A$ and another from $B$. We also define

$$E := \sum_{x\in A+B} r_{A+B}^2 (x)$$

and call it the additive energy of $A+B$.

We immediately have $r_{A-A}(0) = \left\lvert A\right\rvert$. It is straightforward to check that

\begin{align*} &\sum_{x\in A+B} r_{A+B}^2 (x) = \sum_{x\in A+B} r_{A-A}(x) r_{B-B}(x)\\ &\sum_{x\in A+B} r_{A+B} (x) = \left\lvert A\right\rvert \left\lvert B\right\rvert \le \sqrt{\left\lvert A+B\right\rvert \sum_{x\in A+B} r_{A+B}^2} = \sqrt{\left\lvert A+B\right\rvert \left(\left\lvert A\right\rvert \left\lvert B\right\rvert + \sum_{x\neq 0} r_{A-A}(x) r_{B-B}(x)\right)} \end{align*}

for any $A,B\subset \mathbb N$.

Now, let $A\subset [N]$ be a Sidon set and $B\subset [\ell]$ so that $A+B\subset [N+\ell]$. So, using the second property above, we have

$$\left\lvert A\right\rvert ^2 (\ell+1)^2 \le (N+\ell)\left(\left\lvert A\right\rvert (\ell +1) + \sum_{x\neq 0} r_{B-B}(x)\right)$$

as $r_{A-A}(x)\le 1$ for all $x$ since $A$ is Sidon (and hence all differences are distinct).

Also,

$$\sum_{x\neq 0} r_{B-B}(x) = \left\lvert \{b_1-b_2 : b_1\neq b_2 ; b_1,b_2\in B\}\right\rvert = \ell (\ell +1)$$

so that

$$\left\lvert A\right\rvert ^2 (\ell+1) \le (N+\ell) \left(\left\lvert A\right\rvert +\ell\right)$$

and hence putting $\ell = \left\lfloor \sqrt{n\left(\left\lvert A\right\rvert -1\right)}\right\rfloor$ completes the proof. $\square$


A keen reader might try to make a case for Proof 4 being essentially the same as Proof 2, only written in a different language. While this may be true to some extent, it should be noted that Cilleruelo's goal (in this paper) was to consider Sidon sets in $\mathbb N^d$ and to prove an analogous result for this case using the method of additive energy. So, this proof is almost a byproduct of his actual work.

It is also worth mentioning that the linked document to Graham's proof is an excellent article which tries to show how the Weyl - van der Corput Lemma alone can prove some really strong results in the field. It is quite surprising how one inequality by itself can describe so many results!


The theorem we described remained the best result until very recently when an improvement was provided by Balogh, Füredi and Roy proving

\begin{equation} \left\lvert A\right\rvert < n^{1/2} + (1-\gamma) n^{1/4} \end{equation}

for some $\gamma\ge 0.002$. In the same spirit, O'Bryant improved the constant from $0.998$ to $0.99703$, and then again to $0.98183$ with Carter and Hunter.

It should however be noted that we are still quite far away from proving what we expect to be true. Addressing the original question of Sidon, Erdős conjectured (and offered $500 for a proof or disproof) that 

$$\left\lvert A\right\rvert < n^{1/2} + o(n^\varepsilon)$$

for all $\varepsilon >0$. This is considered to be a pretty hard problem and it will probably require some time before we can answer it. Maybe one of the readers will be able to help someday!


Acknowledgement : I want to express my heartfelt thanks to my supervisor, Prof. Ramachandran Balasubramanian, who introduced me to the beautiful topic of Sidon sets. His straightforward approach to understanding mathematics has greatly influenced my own mathematical intuition and helped me greatly in understanding these proofs.

A Dense Infinite Sidon Sequence

A set of positive integers $A\subset \mathbb N$ is called a Sidon Set or a Sidon Sequence if the equation $a+b=c+d$ does not have any non-trivial solutions in $A$. They were named after Hungarian mathematician Simon Sidon who was inspired by certain problems in Fourier series to ask Erdős about the possible growth of such sequences.

Since then, there has been an extensive amount of work on this topic exploring a plethora of different questions about finite and infinite Sidon sets, and various generalizations of them. In this post, we are mainly concerned about a breakthrough theorem in the field of infinite Sidon sets.

First consider the following example.

Example : (Greedy Algorithm) If $S\subset [1,n]$ is already a Sidon set, we can find $x$ such that $x$ does not equal $a$, $b$, $c$ or $a+b-c$ for any $a,b,c\in S$ and $S\cup \{x\}\subset \mathbb N$ is Sidon. Since there are at most $n$ choices for each of $a$, $b$ and $c$ and $a,b,c\le n$, choosing $x>n^3$ is enough.

It is clear that this sequence satisfies $S(x)\gg x^{1/3}$ where $S(x) := \left\lvert S\cap [1,x] \right\rvert$. It was conjectured by Erdős that there is a Sidon sequence $S$ with $S(x)\gg x^{1/2-\varepsilon}$ for all $\varepsilon>0$ (recall that $f(x)\gg g(x)$ means that there exists a constant $C$ such that $f(x)\ge C g(x)$ for all large $x$ and $f(x)\ll g(x)$ is defined similarly). He also proved that any arbitrary Sidon sequence will have the close upper bound of

$$S(x)\ll x^{1/2} \left(\log x\right)^{-1/2}$$

for infinitely many $x$. This exhibits that finite segments of an infinite Sidon set must sometimes be much smaller than it is expected in finite Sidon sets.

The lower bound was first improved by Ajtai, Komlós and Szemerédi who proved the existence of $S$ satisfying

$$S(x)\gg x^{1/3}\left(\log x\right)^{1/3}$$

and then finally Ruzsa gave a breakthrough by finding another $S$ satisfying

$$S(x)\gg x^{\sqrt 2 -1 - o(1)}$$

which has remained the best result in this area since $1998$. It is this proof that I will present in this post.

For those asking why we care, it is known that the number of elements in the biggest (usually called dense in literature) Sidon subset of the interval $[1,x]$ is of the order $\sqrt x$ (see here) and what Erdős' conjecture claims is that $\mathbb N$ is quite similar. In other words, we can find Sidon subsets of $\mathbb N$ with (logarithmic) density arbitrarily close to $\sqrt x$ (although $\sqrt x$ is not achievable, and this was proven by Erdős).


Theorem : There exists an infinite Sidon sequence $B$ such that

$$B(N)=N^{\gamma+ o(1)}$$

where $\gamma=\sqrt 2-1$.


Idea of the Proof : Notice that the set of primes $\mathcal P$ form a multiplicative Sidon set, that is $p_1p_2=p_3p_4$ for $p_i\in \mathcal P$ implies $\{p_1,p_2\}=\{p_3,p_4\}$. So, $\{\ln p\}_{p\in \mathcal P}$ forms a Sidon set in $\mathbb R$.

This means that if we can cut the binary representation of $\ln p$ in a proper way, we still almost surely retain the Sidon property. In fact, all the observations we made so far are true for $\{\alpha \ln p\}_{p\in \mathcal P}$ for all $\alpha$. Indeed we will start with a bunch of $\alpha$ so that we can later argue probabilistically. 


Proof Setup : The first step is to look at the number $\alpha\ln p$ in binary. Choose $\alpha\in [1,2]$ and write

$$\alpha \ln p = \sum_{i=0}^{k_p}\varepsilon_{ip} 2^i + \sum_{i=1}^\infty \delta_{ip} 2^{-i}$$

where $k_p=\left\lfloor\log \left (\alpha \ln p\right )\right\rfloor$. So, the $\varepsilon$-string is the integral part of this number and the $\delta$-string is the part after the decimal point (or is it the binary point?). Here, and in the rest of the proof, $\log$ will represent $\log_2$ and $\ln$ will be a placeholder for logarithm to any pre-specified base - the reader can think of it as $\log_e$ for convinience.

Now, we set $\beta = \sqrt 2 +1$ and $\gamma=1/\beta$. Then, we define

$$K_p = \min \left\{i>2 : 2^{(i-1)^2}>p^\beta\right\}$$

so that $K_p=2+\lfloor \sqrt{\beta \log p}\rfloor$. A more natural way to present this proof could have been to start with an arbitrary $\beta$ and realize at the end of the proof that this is the value of $\beta$ that optimizes the density.

Now, we need to do the proper cutting. So, we define

$$\lambda_p = \sum_{i=0}^{k_p} \varepsilon_{ip} 2^i + \sum_{i=1}^{K_p^2} \delta_{ip} 2^{-i}$$

by cutting the binary expansion of $\alpha \ln p$ at the $K_p^2$-th place.

Now, we use the square indices to construct

$$\Delta_{ip}=\sum_{j=(i-1)^2+1}^{i^2} \delta_j 2^{i^2-j}$$

by cutting the bigger block into smaller pieces.

In binary, the numbers look like

\begin{align}\Delta_{1,p} &= \delta_1\\\Delta_{2,p} &=\delta_2\delta_3\delta_4\\\Delta_{3,p} &=\delta_5\delta_6\delta_7\delta_8\delta_9 \end{align}

and so on. From now on, we will often supress $p$ or $\alpha$ from our notation whenever it doesn't interfere with the clarity of the proof.

Finally, we define $b_p$ as

$$\dots \underline{\smash{\:0\:}}\;\underline{\smash{\:\textcolor{red}{1}\:}}\;\underline{\smash{\:0\:}} \;\Delta_K \dots \Delta_3\; \underline{\smash{\:0\:}}\;\underline{\smash{\varepsilon_2}}\;\underline{\smash{\:0\:}}\;\underline{\smash{\:0\:}}\;\underline{\smash{\:0\:}}\; \Delta_2 \;\underline{\smash{\:0\:}}\;\underline{\smash{\varepsilon_1}}\;\underline{\smash{\:0\:}}\;\underline{\smash{\:0\:}}\;\underline{\smash{\:0\:}} \;\Delta_1 \;\underline{\smash{\:0\:}}\;\underline{\smash{\varepsilon_0}}\;\underline{\smash{\:0\:}}\;\underline{\smash{\:0\:}}\;\underline{\smash{\:0\:}}$$

which was done in the following way : first write the $\Delta_i$'s in a piece of paper and put five blanks in between each of them, and after $\Delta_1$. Fill up the second blank to the right of $\Delta_i$ with $\varepsilon_{i-1}$. The $4$th blank always contains a $0$ except the one to the left of $\Delta_K$ - this is done to keep track of the size of $b_p$. The rest of the blanks are all $0$. 

More formally,

$$b_p = \sum_{i=1}^K \Delta_{ip}2^{(i-1)^2+5i} + \sum_{i=0}^k \varepsilon_{ip} 2^{i^2+5i+2} + t_p$$

where $t_p=2^{K^2+5K+4}$.

Further, we put $\Delta_{ip}=0$ for $i>K$ and $\varepsilon_{ip}=0$ for $i>k$. Our Sidon set will be a subset of the set $\{b_p\}_{p\in \mathcal P}$.


Plan of the Proof : Recall how we introduced a padding of $0$'s in the definition of $b_p$. This was done to ensure that there are no carry overs when we add $b_{p_1}+b_{p_2}$. This makes sure that

\begin{align} b_p + b_s = b_q + b_r \iff & t_p + t_s = t_q + t_r, \\ & \varepsilon_{ip} + \varepsilon_{is} = \varepsilon_{iq} + \varepsilon_{ir}, \\ & \Delta_{ip} + \Delta_{is} = \Delta_{iq} + \Delta_{ir} \end{align}

for all $i$. Now, consider

$$b_p+b_s=b_q+b_r,\qquad b_p>b_q\ge b_r>b_s \tag{*}$$

for some $p,q,r,s$.

Define $P_K=\{p : K_p=K\}$. These sets provide a proper partitioning of the primes with respect to their sizes, and hence have the following important property. If $(*)$ holds, then there are integers $K\ge L$ such that

$$\lambda_p+\lambda_s = \lambda_q+\lambda_r \tag{1}$$

and $p,q\in P_K$, $r,s\in P_L$.

The artistic part is now over and what's left of our work is the technical part. This is the part where we try to filter out the largest Sidon subset of $\{b_p\}_{p\in \mathcal P}$. The techniques are all mainly aimed towards estimating the number of solutions of $(*)$ in the set $\{b_p\}_{p\in \mathcal P}$.


Bounds : Notice that

$$\left\lvert\lambda_p-\alpha\ln p\right\rvert\le 2^{-K^2}<p^{-\beta}$$

follows from definitions. Also, if $(*)$ holds, then

$$\left\lvert ps-qr\right\rvert < 8qr\cdot 2^{-L^2},\qquad qr > 2^{L^2-2}, \qquad (K-1)^2 > (\beta-1)(L-1)^2 + \beta(2L-3)$$

where $K$ and $L$ are as in $(1)$.

Now, we define

$$J_{KL} = \left\lvert\left\{p,q,r,s : p\neq q\in P_K, \; r\neq s\in P_L,\; \left\lvert ps-qr\right\rvert<8qr\cdot 2^{-L^2}\right\}\right\rvert$$

and try to estimate it.


Lemma : We have

$$J_{KL}\ll 2^{2\gamma \left((K-1)^2+(L-1)^2\right)-L^2}$$

where $J_{KL}$ is defined as above.

Proof : The Diophantine equation $ps-qr=a$ has the general solution

$$p=p_0+tr,\qquad q=q_0+ts$$

where $p_0$, $q_0$ are the smallest solution.

As $p<2^{\gamma(K-1)^2}$, we have $t<\frac{2^{\gamma(K-1)^2}}{r}$. So, for fixed $a,r,s$, the number of solutions is $\mathcal{O}\left(\frac{2^{\gamma(K-1)^2}}{r}\right)$.

Again, $\left\lvert a\right\rvert \ll r\cdot 2^{\gamma(K-1)^2-L^2}$. So, for fixed $r,s$, the number of solutions is $\mathcal O\left(2^{2\gamma (K-1)^2 -L^2}\right)$ and $r,s<2^{\gamma (L-1)^2}$.


Lemma : We have

$$\left\lfloor 2^{K^2}\alpha \ln p\right\rfloor \equiv \left\lfloor 2^{K^2}\alpha \ln q\right\rfloor \pmod{2^{K^2-L^2}}$$

assuming $(*)$ holds.

Proof : It is clear that

$$\left\lfloor 2^{K^2}\alpha \ln p\right\rfloor \equiv \sum_{i=L^2+1}^{K^2} \delta_{ip}2^{K^2-i} \pmod{2^{K^2-L^2}}$$

from definition. Also for $L<i\le K$, we have $\Delta_{ip}=\Delta_{iq}$ so that $\delta_{jp}=\delta_{jq}$ for $L^2<j\le K^2$. So, RHS remains unchanged if we replace $p$ by $q$.


Lemma : Let $K>L$ and $p\neq q\in P_K$ be given. Let there be at least one pair $r,s\in P_L$ and an $\alpha\in [1,2]$ such that $(*)$ holds. Then, we have

$$\mu\left(\alpha : \left\lfloor 2^{K^2}\alpha \ln p\right\rfloor \equiv \left\lfloor 2^{K^2}\alpha \ln q\right\rfloor \pmod{2^{K^2-L^2}}\right)\ll 2^{L^2-K^2}$$

where $\mu$ is the Lebesgue measure.

Proof : Call $A=2^{K^2}\ln \frac pq$, $M=2^{K^2-L^2}$ and notice that the property we are after is

$$\left\lfloor A\alpha \right\rfloor \equiv 0 \text{ or } -1 \pmod{M}$$

and since $\left\lfloor A\alpha \right\rfloor$ is evenly distributed modulo $M$, we have

$$\mu\left(\alpha : \left\lfloor A\alpha \right\rfloor \equiv 0 \text{ or } -1 \pmod{M}\right)\ll \frac 1M$$

hence completing the proof.


The proofs of the next two lemmas are straightforward and hence we will refrain from exhibiting them.

Lemma : For $p\neq q\in P_K$ and $r,s\in P_L$, we have

$$\mu\left(\alpha : b_p+b_s = b_q+b_r\right)\begin{cases}=0 &\text{if} \left\lvert ps-qr\right\rvert\ge 8qr\cdot 2^{-L^2}\\ \ll 2^{L^2-K^2} &\text{if} \left\lvert ps-qr\right\rvert <8qr\cdot 2^{-L^2}\end{cases}$$

for $K>L$.

Lemma : If

$$G_{KL}(\alpha) := \left\lvert\{\;p,q,r,s : p\neq q\in P_K, r,s\in P_L, (*) \text{ holds }\}\right\rvert$$

then

$$\int_1^2 G_{KL}(\alpha) \;\text{d}\alpha \ll 2^{2\gamma\left((L-1)^2+(K-1)^2\right)-K^2}$$

for $K>L$.


Lemma : Define

$$G_{K}(\alpha) = \left\lvert\{\;p,q,r,s : p,q\in P_K, (*) \text{ holds }\}\right\rvert$$

so that

$$\int_1^2 G_{K}(\alpha) \;\text{d}\alpha \ll 2^{\gamma\left(K-1\right)^2-2K}$$

for all $K$.

Proof : By definition, we have $G_K(\alpha)=\sum_{L\le K} G_{KL}(\alpha)$. And since $G_{KL}(\alpha)\neq 0$ only if $(L-1)^2<(K-1)^2/(\beta-1)$, we obtain an estimate of $\ll 2^Z$ with

$$Z=\left(\frac 2{\beta -1}-1\right)(K-1)^2 -2K +1$$

on which we use the fact that $\frac 2{\beta -1}-1=\frac 1\beta$.

In fact, the value of $\beta$ was chosen in the first place to satisfy $\frac 2{\beta -1}-1=\frac 1\beta$.


Proof of Main Theorem : With all the setup in hand, the rest of the proof is essentially an exercise in probability theory. Define the event

$$E_K := \left[G_K\ge N_K\right]$$

where $N_K:= 2^{\gamma(K-1)^2-K}$. This gives

\begin{align*}\mathbb P(E_K) &= \mathbb P(G_K\ge N_K)\\&\le \frac{\mathbb E(G_K)}{N_K}\;\;\text{(using Markov Inequality)}\\&= \frac 1{N_K} \int_1^2 G_K(\alpha)\text{d}\alpha\\ &\ll 2^{-K}\end{align*}

and hence

$$\sum_{K} \mathbb P(E_K) < \infty \implies \mathbb P\left(\liminf_{K} E_K^C\right)=1$$

by Borel-Cantelli lemma.

This guarantees the existence of an $\alpha$ such that

$$G_K(\alpha) < 2^{\gamma (K-1)^2 - K}$$

for all (except at most finitely many) $K$. Fix this $\alpha$.

Using Prime Number Theorem, the cardinality of $P_K$ is

$$\left\lvert P_K\right\rvert = \pi \left(2^{\gamma (K-1)^2}\right) - \pi \left(2^{\gamma (K-2)^2}\right) \sim \frac{2^{\gamma (K-1)^2}}{\gamma K^2 \ln 2}$$

and hence $G_K(\alpha)<\left\lvert P_K\right\rvert /2$ for $K>K_0$.

Now, for each $p,q,r,s$ satisfying $(*)$ with $p\in P_K$, we omit $p$ and define $Q_K$ to be the set of remaining elements. Then we define

$$B = \bigcup_{K>K_0} Q_K$$

which, by definition, is a Sidon set.

Now, we have

$$b_p < 2^{K^2+5K+5} < 2^{\left(K+3\right)^2}$$

implying that for $K=\left\lfloor \sqrt{\log x} -3\right\rfloor$, $\max Q_K <x$. This gives

$$B(x)\gg \left(\frac 12 -\varepsilon\right) \pi\left(2^{\gamma \left(K-1\right)^2}\right) = x^{\gamma + o(1)}$$

hence completing the proof. The reader can check that a similar calculation yields an upper bound. $\square$


Although any improvements on the density is still beyond our reach, some analogues of this idea have been suggested and they have been able to yield similar results in more general settings. They have some excellent potential of their own, but that is a topic for a different post!


Acknowledgement : I want to express my heartfelt thanks to my supervisor, Prof. Ramachandran Balasubramanian, who introduced me to the beautiful topic of Sidon sets. His straightforward approach to understanding mathematics has greatly influenced my own mathematical intuition and helped me greatly in understanding this paper.

AMM problem 12451

This is AMM problem 12451 that I solved with my friends Satvik Saha and Sohom Gupta.


Problem Statement -
Let $A$ and $B$ be complex $n\times n$ and $n\times m$ matrices respectively. Let $0_{m, n}$ denote the $m\times n$ zero matrix and let $I_{m}$ denote the $m\times m$ identity matrix. Prove 

$$\exp\begin{bmatrix}A & B \\ 0_{m, n} & 0_{m, m}\end{bmatrix} = \begin{bmatrix}\exp(A) & \left(\int_0^1 \exp(tA) \:dt\right)\cdot B \\0_{m, n} & I_{m}\end{bmatrix}$$

where $\exp$ be the matrix exponential function.


Solution -
Setting 

$$P = \begin{bmatrix} A & B \\ 0_{m, n} & 0_{m, m}\end{bmatrix}$$

we observe that the (unique) solution of the IVP $\mathbf{z}' = P\mathbf{z}$ with $\mathbf{z}(0) = \mathbf{z}_0$ is precisely $\mathbf{z}(s) = \exp(sP)\,\mathbf{z}_0$ for all $\mathbf{z}_0 \in \mathbb{C}^{m + n}$.

Thus, it suffices to check that for all $\mathbf{x}_0 \in \mathbb{C}^n, \mathbf{y}_0 \in \mathbb{C}^m$, the curves

\begin{align*} \mathbf{x}(s) &= \exp(sA)\cdot\mathbf{x}_0 + \left(\int_0^1 \exp(stA) \:dt\right)\cdot sB\cdot \mathbf{y}_0, \\ \mathbf{y}(s) &= \mathbf{y}_0 \end{align*}

solve the system of differential equations

\begin{align*}\mathbf{x}' &= A\mathbf{x} + B\mathbf{y}, \\ \mathbf{y}' &= \mathbf{0}\end{align*}

and then put $s = 1$.

This is easily verified; with $\mathbf{x}(s)$ as above, we have

\begin{align*}\mathbf{x}'(s) &= A\exp(sA)\, \mathbf{x}_0 + \left(\int_0^1 (I_n + stA)\,\exp(stA) \:dt\right)B\mathbf{y}_0 \\ &= A\exp(sA)\, \mathbf{x}_0 + \exp(sA)\,B\mathbf{y}_0\end{align*}

via integration by parts, and

\begin{align*}A\mathbf{x}(s) &= A\exp(sA)\, \mathbf{x}_0 +  \left(\int_0^1 sA\,\exp(stA) \:dt\right)B\mathbf{y}_0 \\&= A\exp(sA)\, \mathbf{x}_0 + (\exp(sA) - I_n)\, B\mathbf{y}_0 \\&= \mathbf{x}'(s) - B\mathbf{y}(s)\end{align*}

hence completing the proof. $\square$

AMM problem 12450

This is AMM problem 12450 that I solved with my friends Satvik Saha and Sohom Gupta.


Problem Statement -
Let $n$ be an odd positive number, and suppose that $x_1, \dots, x_n$ are chosen randomly and uniformly from the interval $[0, 1]$. For $1 \leq i \leq n$, let $y_i = x_i - x_i^2$. What is the expected value of the median of $\{y_1, \dots, y_n\}$?


Solution -
Let $g(x) = x - x^2$ on $[0, 1]$, and let

$$h(y) = \frac{1 - \sqrt{1 - 4y}}{2}$$

be the inverse of the $g$ restricted to $[0, 1/2]$.

Let $X \sim \mathcal{U}(0, 1)$, and set $Y = g(X)$.

Then, by symmetry, $Y$ has density 

$$f_Y(y) = 2 f_X(h(y)) \cdot \frac{d}{dy}h(y) = 2 h'(y)$$

from which it follows that the cdf of $Y$ on $[0, 1/4]$ is $F_Y(y) = 2 h(y)$. From an iid sample $y_1, \dots, y_n \sim F_Y$ where $n = 2k + 1$, the sample median $\xi = y_{[k + 1]}$ has density

$$f_\xi(\xi) = \frac{n!}{(k!)^2} \, F_Y(\xi)^k(1 - F_Y(\xi))^k f_Y(\xi) = \frac{n!}{(k!)^2} (2h(\xi))^k (1 - 2h(\xi))^k\, 2h'(\xi)$$

and hence, the expected value of the median is

\begin{align*}\mathbb{E}[\xi] = \int \xi f_\xi(\xi) \:d\xi&= \int_0^{1/4} \frac{n!}{(k!)^2} \, \xi\, (2h(\xi))^k (1 - 2h(\xi))^k \cdot 2h'(\xi) \:d\xi \\&= \int_0^1 \frac{n!}{(k!)^2} \, g(x/2)\, x^k(1 - x)^k \:dx \\&= \int_0^1 \frac{n!}{(k!)^2} \,\left[\frac{x}{2} - \frac{x^2}{4}\right]\, x^k(1 - x)^k \:dx \\&= \frac{n!}{(k!)^2} \left[\frac{k! (k + 1)!}{2(n + 1)!} - \frac{k!(k + 2)!}{4(n + 2)!}\right] \\&= \frac{3n + 5}{16n + 32}\end{align*}

which is the final answer.

AMM problem 12449

This is AMM problem 12449 that I solved with my friends Satvik Saha and Sohom Gupta.


Problem Statement -

Let $n$ be a positive integer with $n \geq 2$. The squares of an $(n^2 + n - 1)$-by-$(n^2 + n - 1)$ grid are colored with up to $n$ colors. Prove that there exist two rows and two columns whose four squares of intersection have the same color.


Solution -

Define a monochromatic $i$-pair as an unordered pair of row indices of squares in the same column with the same color $i$.

Note that if the same $i$-pair appears in two distinct columns, we have found a monochromatic rectangle. If not, each column must contribute distinct $i$-pairs, making the maximum number of possible $i$-pairs $\binom{n^2 + n - 1}{2}$, hence the maximum number of monochromatic pairs $n\cdot\binom{n^2 + n - 1}{2}$.

But if the $k$-th column contains $k_i$ squares of color $i$, then it contributes

$$\sum_{i = 1}^n \binom{k_i}{2}$$

monochromatic pairs, which is minimized when one $k_i$ is $n$ and the remaining $n - 1$ many $k_i$ are $n + 1$.

Thus, the $k$-th column contributes at least

$$\binom{n}{2} + (n - 1)\binom{n + 1}{2} = \frac{n\cdot (n^2 + n - 2)}{2}$$

monochromatic pairs, giving a total of at least

$$\frac{(n^2 + n - 1)\cdot n \cdot (n^2 + n - 2)}{2} = n \cdot \binom{n^2 + n - 1}{2}$$

monochromatic pairs.

Since we have equality, each color $i$ must contribute the same number (i.e. $\binom{n^2 + n - 1}{2}$) of $i$-pairs, while simultaneously all $k_i \in \{n, n + 1\}$. The latter means that each color $i$ must contribute a multiple of $n$ many $i$-pairs, whereas

$$n \nmid \binom{n^2 + n - 1}{2}$$

which is a contradiction. $\square$

The $\ell$-th element of a Sidon set in $\mathbb Z_m$

Before we begin, a Sidon set is a set $\mathcal S\subset \mathbb N$ with no non-trivial solutions to the equation $a+b=c+d$ (trivial solutions being those satisfying $\{a,b\}=\{c,d\}$). A Sidon subset $A\subset [n]:=\{1,2,\dots , n\}$ is called dense if $\left\lvert A\right\rvert = \max \left\lvert S\right\rvert$ where the maximum is taken over all Sidon subsets of $[n]$. For a brief introduction to some of the work done in the field, see the first two sections of our paper. That being said, for all practical purposes of this particular post, it is enough to think of Sidon sets as sets with a certain property.


Recently I co-authored Prof. R. Balasubramanian in a work on Sidon sets (which has now been accepted in Journal of Number Theory) where we proved that if $A = \{a_1< a_2< \dots <a_{\left\lvert A\right\rvert}\}\subset [n]$ be a dense Sidon set with $|A|=n^{1/2}-L^\prime$, then

$$a_m = m\cdot n^{1/2} + \mathcal O\left( n^{7/8}\right) + \mathcal O\left(L^{1/2}\cdot n^{3/4}\right)$$

where $L=\max\{0,L^\prime\}$. This helped us to provide easier derivations of results previously also achieved by Ding (see this and this).

The main ingredient in our proof is a theorem due to Cilleruelo. Very informally, the theorem says that if we have a dense Sidon set $\mathcal S \subset [n]$ and if we choose an interval $I\subset [n]$ of length $m$, then $I$ will contain about these many elements of $\mathcal S$. In our proof, we specifically put $I=[1,a_m]$ and this gave us an asymptotic estimate for $a_m$.


It was only much after the paper was uploaded that it came to my notice that an analogue of Cilleruelo's theorem also exists for Sidon sets in $\mathbb Z_m$, due to Tomasz Schoen. This means that our proof technique can also achieve an analogous result for Sidon sets in $\mathbb Z_m$. And (due to technical reasons), this time the calculations will be easier than before. This is what I will show here.

Theorem : Let $A\subset \mathbb Z_m$ be a dense Sidon set with $|A|=m^{1/2}+O(1)$. Then

$$a_\ell = \ell\cdot m^{1/2} + \mathcal O\left( m^{3/4} \ln m\right)$$

where $a_\ell$ is the $\ell$-th element of $A$ (in increasing order).

Proof : Take Theorem 1 from Schoen's paper and put

$$\mathscr I = \{1,2,\dots , a_\ell\}$$

so that (by definition) $\left\lvert \mathscr I\right\rvert = a_\ell$ and $\left\lvert A\cap \mathscr I\right\rvert = \ell$. So, by Schoen's theorem,

$$\left\lvert \ell - \left\lvert A\right\rvert \cdot \frac{a_\ell}{m}\right\rvert = \mathcal O \left(\left\lvert A\right\rvert^{1/2} \ln m\right)$$

and hence

$$\frac{a_\ell}{\sqrt{m}} = \ell + \mathcal O\left(m^{1/4}\ln m\right)$$

thus giving

$$a_\ell = \ell \sqrt m + \mathcal O\left(m^{3/4}\ln m\right)$$

hence completing the proof.


Finding asymptotic form for $\sum_{a\in \left\lvert A\right\rvert} a^k$ (as done in our original paper) is now only a matter of computaion and is left as an exercise for the reader. It should be noted (as was also done in the original paper) that this theorem is technically a valid formula for all $\ell$, but it's only useful as an asymptotic formula when $\ell$ is close to $\sqrt m$, more precisely when $\ell > m^{1/4}\ln m$.

AMM problem 12448

This is AMM problem 12448 that I solved with my friends Satvik Saha and Sohom Gupta.


Problem Statement -

Given real numbers $a_1, \dots, a_n$. Prove

$$S_1\sqrt{nS_2} \leq S_1^2 + \frac{1}{2}\left\lfloor\frac{n^2}{4}\right\rfloor\left(\max_i a_i - \min_i a_i\right)^2$$

where $S_1 = \sum_{i = 1}^n a_i$ and $S_2 = \sum_{i = 1}^n a_i^2$.


Solution -

Without loss of generality, let $a_1 \leq \dots \leq a_n$ and $S_1 > 0$.

Setting $\mu = S_1 / n$, we need to prove 

$$\mu \sqrt{\sigma^2 + \mu^2} \leq \mu^2 + \frac{1}{2}\left\lfloor\frac{n^2}{4}\right\rfloor \frac{(a_n - a_1)^2}{n^2}$$

where $\sigma^2 = \frac{1}{n}\sum_{i = 1}^n (a_i - \mu)^2 = \frac{S_2}{n} - \mu^2$.

Squaring both sides, this is equivalent to

$$\sigma^2 \leq \left\lfloor\frac{n^2}{4}\right\rfloor \frac{(a_n - a_1)^2}{n^2} + \frac{1}{4\mu^2}\left(\left\lfloor\frac{n^2}{4}\right\rfloor \frac{(a_n - a_1)^2}{n^2}\right)^2 \tag{$\star$}$$

since $\mu > 0$.

Let $\mu_{\mathbf{x}} = \frac{1}{n} \sum_{i} x_i$ and $\sigma^2_{\mathbf{x}} = \frac{1}{n}\sum_i (x_i - \mu_{\mathbf{x}})^2$ for real numbers $x_1, \dots, x_n$.

Since $\sigma^2_{\mathbf{x}}$ is a smooth convex function of $\mathbf{x}$, it attains its maximum on $[a, b]^n$ at an extreme point, i.e. when all $x_i \in \{a, b\}$.

If exactly $k$ many $x_i$ are equal to $a$ and $n - k$ many are equal to $b$, we have

$$\sigma^2_{\mathbf{x}} = \frac{k(n - k)}{n^2}(b - a)^2.$$

Now, $(n - 2k)^2 \geq 0$ gives $k(n - k) \leq n^2/4$, forcing $k(n - k) \leq \lfloor n^2/4\rfloor$.

Thus, 

$$\max_{\mathbf{x} \in [a, b]^n} \sigma^2_{\mathbf{x}} \leq \left\lfloor\frac{n^2}{4}\right\rfloor \frac{(b - a)^2}{n^2}$$

and $\sigma^2 \leq \lfloor n^2/4\rfloor (a_n - a_1)^2/n^2$, from which ($\star$) follows directly. $\square$

Another Needlessly Sophisticated Proof of a Simple Result

This post belongs solely to a genre that should be called "joke proofs". The only thing a joke proof provides (except an awfully sophisticated argument for a very simple result) is intellectual stimulation. Some very well known contributions to this field are a proof of irrationality of $2^{1/n}$ using Fermat's Last Theorem or a proof of infinitude of primes using irrationality of $\zeta(3)$ (although infinitude of primes have received some more suspicious entries as well).

The following theorem (if you can call it a theorem) follows directly from looking at the integers $\{(k+1)! +\ell\}_{\ell = 2}^{k+1}$. But our proof instead uses two other well known results, namely the fact that the sum of reciprocals of primes diverges (proof) and something slightly stronger than the fact that the sum of reciprocals of twin primes converges (proof). It is well known that an exact same proof technique runs through to show that if $\mathcal S_k = \{p\in \mathcal P : p+k \in \mathcal P\}$ where $\mathcal P$ is the set of primes, then $\sum_{s\in \mathcal S_k} \frac 1s$ converges. This is what we will use in the proof.

As far as my knowledge goes (which is not too far), this proof is completely original, although that's not something to be very proud of!


Theorem : For any given $k$, there exists $k$ consecutive composite numbers.

Proof : If possible, let

$$\sup_{p_1, p_2 \text{ are consecutive primes}} \left\vert p_1-p_2\right\vert = B$$

for some integer $B$.

Then

$$\sum_{i=1}^B \;\;\sum_{p, p+i\in \mathcal P} \frac 1p \ge \sum_{p\in \mathcal P} \frac 1p$$

where $\mathcal P$ is the set of primes.

The LHS converges by Brun sieve and the RHS diverges, hence giving a contradiction! $\square$

AMM problem 12435

This is AMM problem 12435 that I solved with my friends Satvik Saha and Sohom Gupta.


Problem Statement -

For a positive integer $n$, let $d(n)$ be the number of positive divisors of $n$, let $\phi(n)$ be Euler's totient function, and let $q(n) = d(\phi(n)) / d(n)$. Find $\inf_n q(n)$ and $\sup_n q(n)$.


Solution -

We claim that $\inf_n q(n) = 0$ and $\sup_n q(n) = \infty$.

To show that $\sup_n q(n) = \infty$, note that $q(p) = d(p - 1) / 2$ for primes $p$, which can be made arbitrarily large. Indeed, for any $\ell \in \mathbb{N}$, the arithmetic progression $\{\ell!\, k + 1\}_{k \in \mathbb{N}}$ must contain a prime $p'$ (due to Dirichlet) whence $p' - 1 = \ell!\,k'$ has at least $M$ divisors so that $d(p' - 1) \geq \ell$.

Next, we show that $\inf_n q(n) = 0$. To do so, we choose arbitrary $m \in \mathbb{N}$ and find $N$ such that $q(N) \leq 1/2^{m - 1}$. Pick a sequence $\mathcal{Q}_m$ of $m$ primes $q_1 \leq \dots \leq q_m$ such that $q_m < 2q_1$; we will show later that this is always possible via the Prime Number Theorem. Let $\mathcal{P} = \{p_1, \dots, p_n\}$  be all the primes less than $q_1$. Note that each $q_i - 1$ and each $p_j - 1$ can be factored solely in terms of the primes from $\mathcal{P}$.

For $k_1, \dots, k_n \in \mathbb{N}$, consider

$$N = N(k_1, \dots, k_n) = p_1^{k_1}\cdots p_n^{k_n} \times q_1\cdots q_m,$$

With this, we have 

$$d(N) = (k_1 + 1)\cdots (k_n + 1)\times 2^m$$

and

\begin{align}\phi(N) &= p_1^{k_1 - 1} \cdots p_n^{k_n - 1} \times (p_1 - 1)\cdots (p_n - 1)\times (q_1 - 1)\cdots (q_m - 1)\\ &= p_1^{k_1 + \alpha_1 - 1} \cdots p_n^{k_n + \alpha_n - 1}\end{align}

where the integers $\alpha_j \geq 0$ depend only on $\mathcal{Q}_m$. Thus,

$$d(\phi(N)) = (k_1 + \alpha_1)\cdots (k_n + \alpha_n)$$

and

$$q(N(k_1, \dots, k_n)) = \left(\frac{k_1 + \alpha_1}{k_1 + 1}\right)\cdots \left(\frac{k_n + \alpha_n}{k_n + 1}\right) \times \frac{1}{2^m}.$$

Taking $\min\{k_1, \dots, k_n\} \to \infty$, we can make $q(N(k_1, \dots, k_n))$ arbitrarily close to $1/2^m$, in particular less than $1/2^{m - 1}$.

To prove the existence of $\mathcal{Q}_m$ for each $m \in \mathbb{N}$, it is enough to show that there is $n \in \mathbb{N}$ such that $[n, 2n]$ contains at least $m$ primes. By the Prime Number Theorem, the number of primes in $[n, 2n]$ is of the order

$$\frac{2n}{\log(2n)} - \frac{n}{\log(n)} \sim \frac{n}{\log(n)}$$

which grows arbitrarily large as $n \to \infty$.



Note : Here's an easier proof of $\inf_n q(n) = 0$ assuming that there are infinitely many Fermat primes. Consider the sequence

$$N=2^k\cdot F_1\dots F_n$$

where $F_1,\dots ,F_n$ are the first $n$ Fermat primes. This gives

$$q(N)=\frac{k+\alpha_1+\dots +\alpha_n}{k+1}\times \frac 1{2^n}$$

where $F_i=2^{\alpha_i}+1$. Taking $k$ and $n$ to infinity, we complete the proof.


Thanks to this argument, if $\inf_n q(n) \neq 0$, then that implies that there are only finitely many Fermat primes, which resolves a long standing conjecture about Fermat primes. Since that is highly unlikely, the $\inf$ has to be $0$ - this is a classic case of proof by existence of proof!

AMM problem 12377

This is my first post (or second, depending on how you look at the world). For this, I decide to post the first AMM problem I solved with my friends Satvik Saha and Sohom Gupta.


Problem Statement -

We define a one-drop number as in problem 12377 from Problems and Solutions, The American Mathematical Monthly, Volume 130, 2023, Issue 3.

An integer is a one-drop number if its decimal digits \(d_1, \dots, d_n\) satisfy

$$1 \leq d_1 \leq d_2 \dots \leq d_i > d_{i + 1} \leq \dots \leq d_n$$

for some \(i\), with \(1 \leq i < n\).

For example, the numbers \(13802\) and \(49557\) are one-drop numbers, while the numbers \(12345\) and \(352712\) are not.

We answer the following question: for \(n \geq 2\), how many \(n\)-digit one-drop numbers are there?


Solution -

There are

$$\binom{n + 18}{18} - \binom{n + 9}{9} - n\cdot\binom{n + 8}{8}$$

one-drop numbers of length \(n \geq 2\).

The digits of every \(n\)-digit one-drop number \(d_1, \dots, d_n\) can be uniquely split into two zero-drop blocks \(A\) and \(B\), with digits \(1 \leq d_1 \leq \dots \leq d_i\) and \(0 \leq d_{i + 1} \leq \dots \leq d_n\). Here, is a zero-drop block consists of a sequence of non-decreasing decimal digits; we allow the leading digit to be \(0\).

Consider all \(19\)-tuples \( (a_1, \dots, a_9, b_0, b_1, \dots, b_9) \) of non-negative integers such that

$$a_1 + \dots + a_9 + b_0 + b_1 + \dots + b_9 = n.$$

Each tuple represents an \(n\)-digit number obtained by concatenating two zero-drop blocks \(A\) and \(B\), where \(a_d\) and \(b_d\) denote the number of occurrences of the digit \(d\) in the respective blocks. Clearly, there are \(\binom{n + 18}{18}\) such tuples. Note that each tuple of this kind represents either a zero-drop or a one-drop number, and each \(n\)-digit one-drop number is uniquely represented by such a tuple.

Discard all tuples where the block \(A\) is empty, with all \(a_d = 0\); the corresponding number is zero-drop. There are as many such tuples as there are non-negative integer solutions of 

$$b_0 + b_1 + \dots + b_9 = n,$$

that is, \(\binom{n + 9}{9}\). This leaves \(\binom{n + 18}{18} - \binom{n + 9}{9}\) tuples; call this set of tuples \(S\). Note that for any element in \(S\), the block \(A\) is non-empty. Thus, it represents a proper \(n\)-digit number since the leftmost digit is non-zero.

Suppose that a tuple \( (a_1, \dots, a_9, b_0, b_1, \dots, b_9) \in S\) represents a zero-drop number. Since some \(a_d > 0\), the digit \(0\) cannot appear in block \(B\), hence the number consists only of digits \(1\) through \(9\). There are \(\binom{n + 8}{8}\) such \(n\)-digit zero drop numbers, via counting the non-negative integer solutions of

$$c_1 + \dots + c_9 = n.$$

Each of these numbers has been represented by precisely \(n\) tuples from \(S\), because the first \(i\) digits of such a number can be placed in block \(A\) with the remaining digits in block \(B\) for all \(1 \leq i \leq n\).

For example, the \(5\)-digit zero-drop number \(11234\) can be broken into blocks \(A, B\) as \(1, 1234\) or \(11, 234\) or \(112, 34\) or \(1123, 4\) or \(11234, \emptyset\), each of which correspond to distinct tuples from \(S\).

Thus, precisely \(n\cdot\binom{n + 8}{8}\) tuples from \(S\) represent zero-drop numbers. The remaining $\binom{n + 18}{18} - \binom{n + 9}{9} - n\cdot\binom{n + 8}{8}$ tuples uniquely represent all \(n\)-digit one-drop numbers. $\square$

Hello World

Hey there!

I am Sayan Dutta, a student of mathematics interested in Number Theory and Combinatorics. I completed my BS-MS from IISER Kolkata and I will be joining a PhD program in 2025 in University of Montreal under Prof. Dimitris Koukoulopoulos. Before that, I completed my Masters' thesis under the supervision of Prof. Ramachandran Balasubramanian and Prof. Soumya Bhattacharya. My thesis was on Sidon sets and can be found in my website.

A far less boring description of me was given by my dear friend Sohom Gupta

Sayan Dutta is an undergraduate student of mathematics, interested in discrete mathematics and vibrant patterns hidden in the simplest of structures. His appreciation of beauty stems from his cinephile present, while his chess moves are just nuances of perfection. He aspires to paint the world in the language of mathematics and the blossoms of the silver screen.


Inspired by What's new, Joni's Math Notes and others, I decided to keep a blog of my own. As of now, I don't have any particular plans and will post anything I have in mind (only related to mathematics). We will see where this goes!

On the Neighbour Sum Problem

As a kid, I remember being utterly fascinated by this deceptively simple Olympiad problem. The 64 squares of an 8 × 8 chessboard are filled ...