
Introduction to Sumsets II

This is a sequel to a previous post. However, one can technically read it without reading the previous post.

Given A, B ⊆ Z for a commutative ring Z, we define
A + B := {a + b : a ∈ A, b ∈ B}
to be the sumset of A and B.

Similarly, we define
A − B := {a − b : a ∈ A, b ∈ B}
kA := A + ⋯ + A (k times)
b + A := {b} + A
for A, B ⊆ Z and b ∈ Z.
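These definitions are easy to experiment with. Below is a minimal sketch (ours, not part of the original post) of the operations just defined, taking finite sets of integers as one concrete choice of the ambient ring Z; the helper names `sumset`, `difference_set` and `k_fold` are our own.

```python
def sumset(A, B):
    # A + B = {a + b : a in A, b in B}
    return {a + b for a in A for b in B}

def difference_set(A, B):
    # A - B = {a - b : a in A, b in B}
    return {a - b for a in A for b in B}

def k_fold(A, k):
    # kA = A + A + ... + A (k times); 0A = {0}
    S = {0}
    for _ in range(k):
        S = sumset(S, A)
    return S

A = {0, 1, 3}
assert sumset(A, A) == {0, 1, 2, 3, 4, 6}
assert difference_set(A, A) == {-3, -2, -1, 0, 1, 2, 3}
assert k_fold(A, 3) == sumset(sumset(A, A), A)
```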

Theorem 1 : We have
|A − C||B| ≤ |A − B||B − C|
where A, B and C are finite subsets of R.

Proof : For each d ∈ A − C, we fix a representation d = a_d − c_d where a_d ∈ A and c_d ∈ C. Now, we define
f : (A − C) × B → (A − B) × (B − C),  (d, b) ↦ (a_d − b, b − c_d)
Note that if f(d, b) = (u, v), then we must have d = u + v, and then b is determined by the equations b = a_d − u and b = c_d + v. So, an image determines its unique preimage, that is, f is injective, and comparing cardinalities gives the inequality.

This completes the proof.
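Since the proof is just an injection between finite sets, the inequality is also easy to confirm numerically. A quick randomized sanity check (ours, not part of the original argument):

```python
import random

def diff(A, B):
    # A - B = {a - b : a in A, b in B}
    return {a - b for a in A for b in B}

random.seed(0)
for _ in range(200):
    A = set(random.sample(range(30), 6))
    B = set(random.sample(range(30), 6))
    C = set(random.sample(range(30), 6))
    # Theorem 1: |A - C| |B| <= |A - B| |B - C|
    assert len(diff(A, C)) * len(B) <= len(diff(A, B)) * len(diff(B, C))
```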

Remark : Theorem 1 can be interpreted as a triangle inequality Δ(A, C) ≤ Δ(A, B) + Δ(B, C) for the Ruzsa distance
Δ(A, B) = log ( |A − B| / √(|A||B|) )
although this Δ is not really a distance, as Δ(A, A) ≠ 0 in general.

Putting C = A, and then also B = −A, in Theorem 1 we have

Corollary 2 : For all finite A, B ⊆ R, we have
|A − A| ≤ |A − B|²/|B|  and  |A − A| ≤ |A + A|²/|A|

The last inequality basically says that if A has small doubling, then A − A is also small. Now, we want to show that A − A being small also implies that A + A is small. For that, we proceed with

Lemma 3 : Let A, B ⊆ R be finite. If there is a c > 0 so that
|A + B| = c|A|  and  |A′ + B| ≥ c|A′| for all A′ ⊆ A,
then, we have
|A + B + C| ≤ c|A + C|
for any finite C ⊆ R.

Proof : We use induction on the size of C. For |C| = 1, i.e., C = {x}, we have
|A + B + x| = |A + B| = c|A| = c|A + x|
and hence the statement holds.

Now, let us assume that the statement is true for a non-empty set C, and define C′ := C ∪ {x} for some x ∉ C. We define
A′ := {a ∈ A : a + x ∈ A + C} = A ∩ (A + C − x) ⊆ A
so that
|A + C′| = |A + C| + |A| − |A′|
since A′ + x = (A + x) ∩ (A + C) and A + C′ = (A + C) ∪ (A + x).

Now, A′ + x ⊆ A + C implies A′ + B + x ⊆ A + B + C, so that
A + B + C′ = (A + B + C) ∪ (A + B + x) = (A + B + C) ∪ [(A + B + x) ∖ (A′ + B + x)]
and
|A + B + C′| ≤ |A + B + C| + |A + B| − |A′ + B|
and hence, using the induction hypothesis together with |A + B| = c|A| and |A′ + B| ≥ c|A′|, we have
|A + B + C′| ≤ c|A + C| + c|A| − c|A′| = c|A + C′|
hence completing the proof.

This lemma will help us prove the following important consequences.

Theorem 4 : We have
|A + C||B| ≤ |A + B||B + C|
for all finite A, B, C ⊆ R.

Proof : Let ∅ ≠ U ⊆ B be such that c := |U + A|/|U| is minimal. This implies
|U + A + C| ≤ c|U + C|
using the previous lemma (applied with U in place of A and A in place of B).

Again, since B itself is considered when the minimum is selected, we have |A + C| ≤ |U + A + C|, |U + C| ≤ |B + C| and c ≤ |A + B|/|B|. These four inequalities complete the proof.

Corollary 5 : We have
|A + A| ≤ |A + B|²/|B|  and  |A + A| ≤ |A − A|²/|A|
for all finite A, B.

Proof : Put C = A, and then also B = −A, in the previous theorem.

Corollary 6 : We have
|A − B| ≤ |A + B|³/(|A||B|)  and  |A + B| ≤ |A − B|³/(|A||B|)
for all finite A, B.

Proof : Use the previous corollary in Theorem 1, and then replace B by −B.

Corollary 7 : Let A, B ⊆ R be finite. If we have
|A + B| = c|A|  and  |A′ + B| ≥ c|A′| for all A′ ⊆ A
for some c > 0, then
|A + hB| ≤ c^h |A|
for all positive integers h.

Proof : Set C = (h − 1)B in Lemma 3 to get |A + hB| ≤ c|A + (h − 1)B| and use induction.

Theorem 8 (Plünnecke-Ruzsa Inequality) : Let B ⊆ R be finite satisfying |B + B| ≤ c|B| (or |B − B| ≤ c|B|). Then
|mB − nB| ≤ c^{m+n} |B|
for all m, n ∈ ℕ.

Proof : Let the non-empty set A ⊆ B be one of the sets that minimizes |A′ + B|/|A′| over non-empty A′ ⊆ B, and write |A + B| = c′|A|. Obviously c′ ≤ c. Also, |A′ + B| ≥ c′|A′| for all A′ ⊆ A, and hence the conditions of Corollary 7 are satisfied. This gives
|hB| ≤ |A + hB| ≤ (c′)^h |A| ≤ c^h |B|
hence proving the statement for the case m = 0 or n = 0. (For the hypothesis |B − B| ≤ c|B|, run the same argument with −B in place of B.)

Now, if m, n ≠ 0, we use Theorem 1 with A → mB, B → −A and C → nB to get
|mB − nB| ≤ |A + mB||A + nB|/|A| ≤ ((c′)^m |A|)((c′)^n |A|)/|A| ≤ c^{m+n} |B|
hence completing the proof.
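Theorem 8 can be sanity-checked on a small set by computing both sides directly (a sketch of ours; `h_fold` computes hB):

```python
def sumset(A, B):
    return {a + b for a in A for b in B}

def h_fold(B, h):
    # hB = B + ... + B (h times); 0B = {0}
    S = {0}
    for _ in range(h):
        S = sumset(S, B)
    return S

B = {0, 1, 2, 4, 9}
c = len(sumset(B, B)) / len(B)   # doubling constant: |B + B| = c|B|
for m in range(3):
    for n in range(3):
        mB_minus_nB = {x - y for x in h_fold(B, m) for y in h_fold(B, n)}
        # Plünnecke-Ruzsa: |mB - nB| <= c^(m+n) |B|
        assert len(mB_minus_nB) <= c ** (m + n) * len(B)
```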

Introduction to Sumsets

After finding a question on Math Stack Exchange, I decided to take a large chunk out of my MS thesis and present it.

Given A, B ⊆ Z for a commutative ring Z, we define
A + B := {a + b : a ∈ A, b ∈ B}
to be the sumset of A and B.

Similarly, we define
A − B := {a − b : a ∈ A, b ∈ B}
kA := A + ⋯ + A (k times)
b + A := {b} + A
for A, B ⊆ Z and b ∈ Z.

Theorem 1 : If A and B are two finite subsets of R, then
|A| + |B| − 1 ≤ |A + B| ≤ |A||B|
and the lower bound is attained iff A and B are arithmetic progressions with the same common difference.

Proof : The upper bound is obvious, since every element of A + B is of the form a + b. It is attained when all the sums a + b are distinct, for example for
A = {0, 1, …, r − 1},  B = r·{0, 1, …, s − 1} = {0, r, …, (s − 1)r}
so that
A + B = {0, 1, …, rs − 1}
has exactly rs elements.
  
Denoting A = {a_1 < a_2 < ⋯ < a_r} and B = {b_1 < b_2 < ⋯ < b_s}, and noting that the chain of inequalities
a_1 + b_1 < a_1 + b_2 < a_1 + b_3 < ⋯ < a_1 + b_s < a_2 + b_s < ⋯ < a_r + b_s
exhibits r + s − 1 different elements of A + B immediately completes the proof of the lower bound.

Now, let us assume that the lower bound is tight. Then, the chain of inequalities
a_1 + b_1 < a_2 + b_1 < a_2 + b_2 < a_2 + b_3 < ⋯ < a_2 + b_s < a_3 + b_s < ⋯ < a_r + b_s
and the previous one must be the same, since each lists all |A + B| = r + s − 1 elements of A + B in increasing order.

The first term and the last r − 1 terms are clearly equal; the other s − 1 terms determine the equations
a_1 + b_2 = a_2 + b_1,  a_1 + b_3 = a_2 + b_2,  …,  a_1 + b_s = a_2 + b_{s−1}
which on rearranging give
b_2 − b_1 = a_2 − a_1,  b_3 − b_2 = a_2 − a_1,  …,  b_s − b_{s−1} = a_2 − a_1
and hence B is an arithmetic progression with difference a_2 − a_1. Since the roles of A and B are symmetric in the statement, A must be one as well.

In fact, a more general statement is true.

Theorem 2 : If A_1, A_2, …, A_k, k ≥ 2, are finite subsets of R, then
|A_1| + |A_2| + ⋯ + |A_k| − k + 1 ≤ |A_1 + A_2 + ⋯ + A_k| ≤ |A_1||A_2|⋯|A_k|
and the lower bound is attained iff A_1, A_2, …, A_k are arithmetic progressions with the same common difference.

Proof : We proceed by Induction.

The case k=2 is precisely the last theorem.

Now, let us assume that the statement is true for k − 1.

Let A = A_1 + A_2 + ⋯ + A_{k−1} and B = A_k, so that A + B = A_1 + A_2 + ⋯ + A_k. Using the previous theorem and the induction hypothesis, we get
(|A_1| + |A_2| + ⋯ + |A_{k−1}| − k + 2) + |A_k| − 1 ≤ |A| + |B| − 1 ≤ |A + B| ≤ |A||B| ≤ |A_1||A_2|⋯|A_k|
and hence the proof follows.
  
Now, we just notice that the left inequality can be an equality only if
|A_1| + |A_2| + ⋯ + |A_{k−1}| − k + 2 = |A_1 + A_2 + ⋯ + A_{k−1}| = |A|
and
|A| + |B| − 1 = |A + B|.

The first relation implies that A_1, …, A_{k−1}, and hence A_1 + ⋯ + A_{k−1} = A, are arithmetic progressions with the same common difference, and the second relation implies that B is an arithmetic progression with that same difference. This completes the proof.


Corollary 3 : Let A be a finite subset of R. Then, for an integer k ≥ 2, we have
k|A| − k + 1 ≤ |kA| ≤ |A|^k
and the lower bound is attained iff A is an arithmetic progression.


However, it should be noted that the upper bound in the last corollary is not sharp. The sharp upper bound is given by the following theorem.

Theorem 4 : Let A be a finite subset of R. Then,
|kA| ≤ C(|A| + k − 1, k)
for any integer k ≥ 1, where C(·, ·) denotes the binomial coefficient.

Proof : Every element of kA is a sum over an unordered k-tuple of elements of A, and the number of unordered k-tuples from a set of cardinality |A| is the number of non-negative integer solutions of the equation
x_1 + ⋯ + x_{|A|} = k
which is C(|A| + k − 1, k).

This upper bound is indeed sharp; a set attaining it for k = 2 (equivalently, a set whose pairwise sums are all distinct) is called a Sidon set.
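The binomial bound, and its sharpness at k = 2 for a Sidon set, can be checked directly (our sketch; {0, 1, 3, 7} is a standard small Sidon set):

```python
from math import comb
from itertools import combinations_with_replacement

def k_fold_size(A, k):
    # |kA|, computing kA as sums over unordered k-tuples of A
    return len({sum(t) for t in combinations_with_replacement(sorted(A), k)})

A = {0, 1, 3, 7}   # Sidon: all pairwise sums a + a' are distinct
for k in range(1, 5):
    assert k_fold_size(A, k) <= comb(len(A) + k - 1, k)
assert k_fold_size(A, 2) == comb(len(A) + 1, 2)   # sharp at k = 2
```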


The moral of Corollary 3 is that if |A + A| is the smallest possible, then A has a well defined structure, namely it is an arithmetic progression. We want to understand whether |A + A| being merely small, say |A + A| ≤ C|A|, still implies some structure in A. To do that, we investigate some examples now.

Clearly, if A is a set of r members of an arithmetic progression of length 2r, then A + A is part of an arithmetic progression of length 4r − 1 and hence |A + A| ≤ 4r − 1.

Now, considering A = {0, 1, …, r − 2, N} for some N > r − 2, we see that
A + A = {0, 1, …, 2r − 4} ∪ {N, N + 1, …, N + r − 2} ∪ {2N}
and so, if r − 2 < N ≤ 2r − 4, then |A + A| = N + r ≤ 3r − 4, and if N > 2r − 4, then |A + A| = 3r − 3. The first one is an example of having r members of an arithmetic progression of size N + 1 (≤ 2r − 3) such that |A + A| ≤ 3r − 4. And the second one is an example of the same where |A + A| = 3r − 3 but A cannot be covered by any arithmetic progression of reasonable size.

And next, we consider
Q = {a_0 + a_1 n_1 + ⋯ + a_d n_d : 0 ≤ n_j < N_j,  j = 1, …, d}
where a_0, a_1, …, a_d are fixed, and N_1, …, N_d are integers ≥ 2. This is a generalized arithmetic progression of dimension dim(Q) = d and volume vol(Q) = N_1 ⋯ N_d.

Obviously, |Q| ≤ vol(Q), and equality holds only if all the elements are distinct, in which case we call Q proper.

It was Gregori Freiman in the 1960s who recognized that if |A + A| is small, say |A + A| ≤ C|A|, then A still has a special structure, though not as strong. Freiman's theorem is one of the deepest results in the theory of sumsets, and it is obligatory to state it in a discussion of Additive Combinatorics.

Theorem 5 (Freiman, Ruzsa) : If A is a set of positive integers and C ≥ 2 is such that |A + A| ≤ C|A|, then there is a generalized arithmetic progression Q with A ⊆ Q, vol(Q) ≤ F(C)|A| and dim(Q) ≤ d(C), for some functions F(C) and d(C) depending only on C.

A Surprising Proof of an Inequality

Problem : Prove that for real numbers x_1, …, x_n,
Σ_{i=1}^n Σ_{j=1}^n |x_i − x_j|^r ≤ Σ_{i=1}^n Σ_{j=1}^n |x_i + x_j|^r
for any r ∈ [0, 2].

Solution : The cases r = 0, 2 can be checked by hand. So, we assume r ∈ (0, 2) and begin by noting that the integral
I(α) = ∫_0^∞ (1 − cos(αx)) / x^{r+1} dx
is positive and finite. Also, it is easy to prove, using the substitution y = |α|x, that I(α) = |α|^r I(1).

Now, using this, we have
|a + b|^r − |a − b|^r = (1/I(1)) ∫_0^∞ [cos((a − b)x) − cos((a + b)x)] / x^{r+1} dx = (1/I(1)) ∫_0^∞ 2 sin(ax) sin(bx) / x^{r+1} dx
and hence
Σ_{i=1}^n Σ_{j=1}^n |x_i + x_j|^r − Σ_{i=1}^n Σ_{j=1}^n |x_i − x_j|^r = (2/I(1)) ∫_0^∞ ( Σ_{i=1}^n sin(x_i x) )² / x^{r+1} dx ≥ 0
hence completing the proof.
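The inequality itself is easy to test numerically (a randomized check of ours, sampling several exponents in (0, 2]):

```python
import random

random.seed(2)
for _ in range(100):
    xs = [random.uniform(-5.0, 5.0) for _ in range(6)]
    for r in (0.3, 1.0, 1.7, 2.0):
        lhs = sum(abs(a - b) ** r for a in xs for b in xs)
        rhs = sum(abs(a + b) ** r for a in xs for b in xs)
        assert lhs <= rhs + 1e-9   # small tolerance for float rounding
```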

Prokhorov’s Theorem

The aim is to give a proof of Prokhorov’s Theorem.

Theorem : Let {μ_n : n ≥ 1} be a sequence of probability measures on (R^d, B(R^d)). If {μ_n : n ≥ 1} is tight, then every subsequence has a further subsequence that converges weakly to some probability measure.

We will use the following (well known) fact in our proof.
Fact : Let {μ_n : n ≥ 1} be a sequence of probability measures on (R^d, B(R^d)) and let μ be a probability measure. Then, weak convergence μ_n ⇒ μ is equivalent to ∫ g dμ_n → ∫ g dμ for all compactly supported continuous functions g.

With this at hand, we give the proof of the theorem.

Proof of Theorem : Let K_N = [−N, N]^d. Then, μ_n induces a subprobability ν_n^N := μ_n|_{K_N} on K_N. By weak compactness, we can find a subsequence ν_{n_k}^N such that ν_{n_k}^N ⇒ ν^N for a subprobability measure ν^N on K_N. Repeating the argument, starting with this subsequence, on K_{N+1} gives a new limit ν^{N+1}. Repeating this procedure countably many times and passing to the diagonal subsequence (and noticing that ν^N = ν^{N+1} on K_N), we may define
μ(A) := lim_{N→∞} ν^N(A ∩ K_N)
for A ∈ B(R^d).

Now, we will show that μ is a probability measure. From here, it follows using the mentioned fact that our subsequence converges weakly to μ.

Clearly μ(R^d) ≤ 1, as each ν^N is a subprobability. Given ε > 0, tightness yields N_ε such that μ_n(K_{N_ε}) ≥ 1 − ε for all n ≥ 1. It follows that μ(R^d) ≥ μ(K_{N_ε}) ≥ 1 − ε. This shows that μ(R^d) = 1. It remains to prove that μ is σ-additive. Clearly μ is monotone. Let first
A = ⊔_{i=1}^I A_i
be a disjoint union of finitely many measurable sets. Then
| μ(⊔_{i=1}^I A_i) − Σ_{i=1}^I μ(A_i) | ≤ | μ(⊔_{i=1}^I A_i ∩ K_{N_ε}) − Σ_{i=1}^I μ(A_i ∩ K_{N_ε}) | + (I + 1)ε = (I + 1)ε
since the restriction of μ to K_{N_ε} (which is ν^{N_ε}) is σ-additive. Thus μ is finitely additive. Now let A = ⊔_{i=1}^∞ A_i be a countable disjoint union. It follows that μ is σ-superadditive, as
μ(A) ≥ μ(⊔_{i=1}^I A_i) = Σ_{i=1}^I μ(A_i) → Σ_{i=1}^∞ μ(A_i)
by letting I → ∞.

Conversely, using again that the restriction of μ to K_{N_ε} is σ-additive, we have
μ(A) = μ(A ∖ K_{N_ε}) + μ(⊔_{i=1}^∞ A_i ∩ K_{N_ε}) = μ(A ∖ K_{N_ε}) + Σ_{i=1}^∞ μ(A_i ∩ K_{N_ε}) ≤ ε + Σ_{i=1}^∞ μ(A_i)
thus completing the proof.
thus completing the proof.

Kolmogorov’s Continuity Theorem

The aim is to give a proof of Kolmogorov's Continuity Theorem, also known as the Kolmogorov-Chentsov Theorem. The main reference we are following is Probability Theory, An Analytic View by Daniel W. Stroock.

Theorem : Suppose a stochastic process X = {X(t) : 0 ≤ t ≤ T} takes values in a Banach space B and that, for some p > 1, C > 0, r > 0,
E[ ‖X(t) − X(s)‖_B^p ]^{1/p} ≤ C |t − s|^{1/p + r}
for all s, t ∈ [0, T]. Then there exists a family {X̃(t) : t ∈ [0, T]} of random variables such that X(t) = X̃(t) almost surely for each t ∈ [0, T] and t ∈ [0, T] ↦ X̃(t, ω) ∈ B is continuous for all ω ∈ Ω. In fact, for each α ∈ (0, r), we have
E[ sup_{0≤s<t≤T} ( ‖X̃(t) − X̃(s)‖_B / (t − s)^α )^p ]^{1/p} ≤ K T^{1/p + r − α}
for a suitable constant K = K(α, r, C).

Proof : By rescaling time, we can assume without loss of generality, that T=1.

Now, for n ≥ 0, we define
M_n := max_{1≤m≤2^n} ‖X(m 2^{−n}) − X((m − 1) 2^{−n})‖_B
and observe that
E[ M_n^p ]^{1/p} ≤ E[ Σ_{m=1}^{2^n} ‖X(m 2^{−n}) − X((m − 1) 2^{−n})‖_B^p ]^{1/p}
which is bounded by 2^{n/p} · C 2^{−n(1/p + r)} = C 2^{−rn}.

Next, consider the path t ↦ X(t) on each interval [(m − 1) 2^{−n}, m 2^{−n}] and linearize it to t ↦ X_n(t). It is easy to see that
max_{t∈[0,1]} ‖X_{n+1}(t) − X_n(t)‖_B = max_{1≤m≤2^n} ‖ X((2m − 1) 2^{−n−1}) − [X((m − 1) 2^{−n}) + X(m 2^{−n})]/2 ‖_B
which is bounded by M_{n+1}.

So,
E[ sup_{t∈[0,1]} ‖X_{n+1}(t) − X_n(t)‖_B^p ]^{1/p} ≤ C 2^{−rn}
and so the X_n converge uniformly in L^p, giving a measurable X̃ : [0,1] × Ω → B satisfying
E[ sup_{t∈[0,1]} ‖X̃(t) − X_n(t)‖_B^p ]^{1/p} ≤ C 2^{−rn} / (1 − 2^{−r})
and t ↦ X̃(t, ω) is continuous for all ω ∈ Ω.

Next, notice that X̃(τ) = X(τ) almost surely for every dyadic rational τ, and for any t ∈ [0, 1], as dyadic τ → t,
‖X̃(τ) − X(t)‖_B → 0 in L^p
while X̃(τ) → X̃(t) by path continuity; hence X̃(t) = X(t) almost surely.

Finally, note that for 2^{−n−1} ≤ t − s ≤ 2^{−n}, we have
‖X̃(t) − X̃(s)‖_B ≤ ‖X̃(t) − X_n(t)‖_B + ‖X_n(t) − X_n(s)‖_B + ‖X_n(s) − X̃(s)‖_B ≤ 2 sup_{τ∈[0,1]} ‖X̃(τ) − X_n(τ)‖_B + 2^n (t − s) M_n
and hence
‖X̃(t) − X̃(s)‖_B / (t − s)^α ≤ 2^{αn+α+1} sup_{τ∈[0,1]} ‖X̃(τ) − X_n(τ)‖_B + 2^{αn+α} M_n
which implies
E[ sup_{0≤s<t≤1} ( ‖X̃(t) − X̃(s)‖_B / (t − s)^α )^p ]^{1/p} ≤ C Σ_{n=0}^∞ ( 2^{αn+α+1−rn} / (1 − 2^{−r}) + 2^{αn+α−rn} ) ≤ 5C / ((1 − 2^{−r})(1 − 2^{α−r}))
hence completing the proof.

Tangent Space of SL2(R)

This was a project I did about three years ago when I was in my third year - I thought that it would be a good fit for this blog. I didn't change much from the original (except what was necessary to turn an Overleaf document into a Blogger post).


Introduction

The group SL₂(R) consists of 2×2 real matrices of determinant 1. So, it is the collection of all points (a, b, c, d) satisfying ad − bc = 1.

Of course, SL₂(R) ⊆ M₂(R), as M₂(R) consists of all the points of the form (a, b, c, d). And since M₂(R) ≅ R⁴, and R⁴ has the Euclidean metric on it, we can induce that metric on SL₂(R) and consider it as a metric space.


Two properties of SL₂(R) immediately stick out:

1. SL2(R) is closed in M2(R).

2. SL2(R) is unbounded.


Proof of 1 : Consider the function

det : M₂(R) → R,  (a, b, c, d) ↦ ad − bc

Clearly, SL₂(R) = det⁻¹({1}). And since the det map is continuous, the result immediately follows.


Proof of 2 : From the definition, (a, b, c, d) ∈ SL₂(R) ⟺ ad − bc = 1. So, fixing b = b₀ and d = d₀ ≠ 0, we have

a = (1 + b₀c) / d₀

which means that any arbitrarily large value of a can be obtained by using a suitably large value of c.


Problem 1 :

Let A ∈ SL₂(R). Show that there exists ε > 0 such that B_ε(A) ∩ SL₂(R) is homeomorphic to an open subset of R³.

Solution :

First let us only look at the points of the form p = (a, b, c, (1 + bc)/a) with a ≠ 0. So, consider the set

U = {(a, b; c, d) ∈ SL₂(R) : a ≠ 0}

where (a, b; c, d) denotes the matrix with rows (a, b) and (c, d). Let us consider the map

ϕ : U → ϕ(U) ⊆ R* × R × R,  (a, b; c, d) ↦ (a, b, c)

where R* = R ∖ {0}.

Clearly, U is an open subset of SL₂(R), as a ≠ 0 is an open condition. Also, the map ϕ is clearly continuous.

Now, the inverse of ϕ is given by

ϕ⁻¹ : ϕ(U) → U,  (a, b, c) ↦ (a, b; c, (1 + bc)/a)

which is continuous, since each entry is a rational function of (a, b, c) whose denominator a is non-zero on ϕ(U).

So, ϕ is a homeomorphism from U to ϕ(U).

Now, let a = 0. But, since ad − bc = 1, if a = 0, then b and c must be non-zero. So, now let us consider the set

V = {(a, b; c, d) ∈ SL₂(R) : b ≠ 0}

and the map

ψ : V → ψ(V) ⊆ R × R* × R,  (a, b; c, d) ↦ (a, b, d)

This map is a homeomorphism from V to ψ(V) by the same arguments as before.

So, SL2(R) is a 3-dimensional topological manifold with the atlas {(U,ϕ),(V,ψ)}.

So, for every A ∈ SL₂(R), there exists ε > 0 such that B_ε(A) ∩ SL₂(R) is homeomorphic to an open subset of R³.


Problem 2 :

Let γ : (α, β) → SL₂(R) be a continuous curve. We say γ is smooth if γ is a smooth curve in R⁴. Show that Bγ is a smooth curve for any B ∈ SL₂(R).

Solution :

Let

B = (a₁, a₂; b₁, b₂)

and

γ(t) = (p₁(t), q₁(t); p₂(t), q₂(t))

Then,

Bγ(t) = ( a₁p₁(t) + a₂p₂(t), a₁q₁(t) + a₂q₂(t); b₁p₁(t) + b₂p₂(t), b₁q₁(t) + b₂q₂(t) )

Now, since γ is a smooth curve, each of the pᵢ's and qᵢ's is smooth. So, clearly, Bγ is also smooth.

We discuss this a little bit more in Appendix Note 1.


Problem 3 :

Let γ, η be smooth curves in SL₂(R) passing through I = (1, 0; 0, 1), i.e., γ(0) = η(0) = I. We say γ ∼ η ⟺ γ′(0) = η′(0). Show that the set of equivalence classes of smooth curves passing through I is a vector space, denoted by sl₂, of dimension 3.

Solution :

First, we will try to prove that sl₂ is the set of all traceless matrices. To do that, let us note that if

x(t) = (x₁(t), x₂(t); x₃(t), x₄(t))

is a parametrized curve in SL₂(R) with

x(0) = (x₁(0), x₂(0); x₃(0), x₄(0)) = (1, 0; 0, 1)

then, since x(t) ∈ SL₂(R),

x₁(t)x₄(t) − x₂(t)x₃(t) = 1 ⟹ x₁′(t)x₄(t) + x₁(t)x₄′(t) − x₂′(t)x₃(t) − x₂(t)x₃′(t) = 0 ⟹ x₁′(0) + x₄′(0) = 0

This shows that any element in sl₂ is traceless.

Now, we want to show that if A is a traceless matrix, then A is in sl2. This, we will prove in the answer of the next question. For now, we will assume that this is true.

So, the question boils down to proving that the set of traceless matrices has dimension three. To do so, let us prove that

span{ (1, 0; 0, −1), (0, 1; 0, 0), (0, 0; 1, 0) } = T

where T is the set of all traceless matrices.

But, this is clearly true, as given A = (a, b; c, −a), we can write

A = a·(1, 0; 0, −1) + b·(0, 1; 0, 0) + c·(0, 0; 1, 0)

and since the linear independence of the given vectors is trivially true, this completes our proof.


Problem 4 :

Show that sl₂ can be identified, in a natural way, with the traceless 2×2 matrices. Moreover, the exponential map

exp : {A ∈ M₂(R) : tr(A) = 0} → SL₂(R),  A ↦ e^A

is well defined and produces a smooth curve

γ_A : R → SL₂(R),  γ_A(t) = exp(tA)

passing through I with γ_A′(0) = A.

Solution :

This map is clearly well defined as the matrix exponential map is well defined. Some details about this can be found in Appendix Problem 1.

Now, we want to show that, for any complex matrix A,

det(exp(A)) = e^{tr(A)}

To do that, let us recall that every complex matrix has a Jordan normal form A = S⁻¹JS, and that the determinant of a triangular matrix is the product of its diagonal entries. So,

exp(A) = exp(S⁻¹JS) = S⁻¹ exp(J) S

and hence,

det(exp(A)) = det(S⁻¹ exp(J) S) = det(S⁻¹) det(exp(J)) det(S) = det(exp(J)) = Π_{i=1}^n e^{J_{ii}} = e^{Σ_{i=1}^n J_{ii}} = e^{tr(J)} = e^{tr(A)}

which completes the proof.
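The identity det(exp(A)) = e^{tr(A)} is also easy to check numerically with a small power-series implementation of the matrix exponential (a pure-Python sketch of ours, for 2×2 matrices with small entries only):

```python
import math

def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_exp(A, terms=30):
    # exp(A) = sum_{n>=0} A^n / n!, truncated; accurate for small entries
    S = [[1.0, 0.0], [0.0, 1.0]]
    P = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        P = mat_mul(P, A)
        S = [[S[i][j] + P[i][j] / math.factorial(n) for j in range(2)]
             for i in range(2)]
    return S

def det(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

A = [[0.3, 1.2], [0.7, -0.3]]                  # traceless
assert abs(det(mat_exp(A)) - 1.0) < 1e-9       # e^{tr A} = e^0 = 1

B = [[0.5, 0.1], [0.2, 0.4]]                   # tr B = 0.9
assert abs(det(mat_exp(B)) - math.exp(0.9)) < 1e-9
```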

So, if trace of A is zero, that immediately gives that the determinant of exp(A) will be 1, which (combined with our proof in the last problem) proves that the traceless matrices can be naturally identified with sl2.

Now, we have

γA(t):=exp(tA)

This γ_A is clearly smooth, as

exp(tA) = Σ_{n=0}^∞ tⁿAⁿ/n!

which gives

(exp(tA))′ = ( Σ_{n=0}^∞ tⁿAⁿ/n! )′ = Σ_{n=1}^∞ n tⁿ⁻¹Aⁿ/n! = A · Σ_{n=1}^∞ tⁿ⁻¹Aⁿ⁻¹/(n − 1)! = A · exp(tA)

which is clearly well defined. We add some rigour to this argument in Appendix Problem 2.

Now,

γ_A(t) = exp(tA)

which gives

γ_A(0) = exp(0) = I

and

γ_A′(t) = A · exp(tA)

which gives

γ_A′(0) = A · exp(0) = A

hence completing the proof.


Problem 5 :

What is the fundamental group of SL2(R)?

Solution :

We claim that the fundamental group of SL2(R) is Z.

To prove our claim, first we will prove that SL₂(R) deformation retracts onto SO₂(R). To do that, let us define the retraction map (Gram-Schmidt applied to the columns v₁, v₂),

φ : SL₂(R) → SO₂(R),  (v₁, v₂) ↦ (v₁/‖v₁‖, e₂/‖e₂‖)

where e₂ = v₂ − (u₁·v₂)u₁ and u₁ = v₁/‖v₁‖.

To prove that this is indeed a deformation retraction, let us note the homotopy

H : SL₂(R) × I → SL₂(R),  (A, t) ↦ ((1 − t)A + tφ(A)) / √(det((1 − t)A + tφ(A)))

(the matrix (1 − t)A + tφ(A) always has positive determinant, so the normalization is well defined and keeps H inside SL₂(R)) so that H(A, 0) = A and H(A, 1) = φ(A), while H fixes SO₂(R) pointwise, hence completing the proof.

Now, we want to show that SO₂(R) is homeomorphic to S¹. To do this, let us note that the map

f : SO₂(R) → S¹,  (cos θ, −sin θ; sin θ, cos θ) ↦ e^{iθ} = (cos θ, sin θ)

is one-one, onto and continuous.

Proving that this map is an injection is trivial, as (cos θ₁, sin θ₁) = (cos θ₂, sin θ₂) implies θ₁ = θ₂ (mod 2π), i.e., the two rotation matrices are equal.
Surjectivity is also trivial, as any point in S¹ is of the form (cos θ, sin θ), which has the preimage (cos θ, −sin θ; sin θ, cos θ).
To prove that the map is a homeomorphism, we simply need to note that it is component-wise continuous, and the inverse

f⁻¹ : S¹ → SO₂(R),  (cos θ, sin θ) ↦ (cos θ, −sin θ; sin θ, cos θ)

is also component-wise continuous.

Now, we know that the fundamental group of S¹ is Z. So, the fundamental group of SO₂(R), and hence of SL₂(R), is also Z, since homotopy equivalences preserve the fundamental group. This completes the proof.


Problem 6 :

What does SL₂(R) look like topologically?

Solution :

We claim that SL2(R) is homeomorphic to S1×R2.

Let us recall that any invertible matrix M has a unique decomposition of the form M = QR, where Q is orthogonal and R is upper triangular with positive diagonal entries. So, given M ∈ SL₂(R), we have

M = (cos θ, −sin θ; sin θ, cos θ) R

where R is upper triangular.

So,

det M = det(cos θ, −sin θ; sin θ, cos θ) · det R

hence giving det R = 1 (the positivity of the diagonal of R forces det Q = 1, i.e., Q is a rotation).

So,

M = (cos θ, −sin θ; sin θ, cos θ)(a, b; 0, 1/a)

where a > 0, b ∈ R. The parameters (θ, a, b) range over S¹ × (0, ∞) × R ≅ S¹ × R².

This proves our claim.


Alternately, we may also take a different and much more beautiful approach and note that the group SL₂(R) acts transitively on R² ∖ {(0, 0)}, which is clearly homeomorphic to R × S¹. The stabilizer of the vector

(1, 0)ᵀ

is the set of matrices of the form

(1, a; 0, 1)

which is clearly homeomorphic to R; one checks that the resulting fibration of SL₂(R) over R² ∖ {(0, 0)} with fibre R is trivial, hence completing the proof.


Appendix :

Appendix Problem 1 :

Prove that the matrix exponential is well defined.

Proof :

First, we will show that, if X and Y are two n×n matrices, then

‖XY‖ ≤ ‖X‖ · ‖Y‖

where

‖X‖ = ( Σ_{i,j=1}^n x_{ij}² )^{1/2}

and x_{ij} is the ij-th entry of X.

So, let X = {x_{ij}} and Y = {y_{ij}}. Using the Cauchy-Schwarz inequality, we have

‖XY‖² = Σ_{i,j} (XY)_{ij}² ≤ Σ_{i,j} ( Σ_k x_{ik}² )( Σ_k y_{kj}² ) = ( Σ_{i,k} x_{ik}² )( Σ_{j,k} y_{kj}² ) = ‖X‖²‖Y‖²

hence completing the proof.

Now, let us consider the sequence of partial sums

S_n = Σ_{k=0}^n X^k/k!

We will show that {S_n}_{n=1}^∞ is a Cauchy sequence. To do so, let us note that for n > m, we have

‖S_n − S_m‖ = ‖ Σ_{k=m+1}^n X^k/k! ‖ ≤ Σ_{k=m+1}^n ‖X^k‖/k! ≤ Σ_{k=m+1}^n ‖X‖^k/k!

which of course goes to 0, as the sequence of partial sums Σ_{k=1}^n x^k/k! converges (as a sequence of reals) to e^x. So, {S_n}_{n=1}^∞ is a Cauchy sequence.

Now, we know that (R^{n×n}, ‖·‖) is complete. So, {S_n}_{n=1}^∞, being a Cauchy sequence, must converge.

This completes the proof.


Appendix Problem 2 :

Prove that γ_A as defined in Problem 4 is a smooth curve.

Proof :

Let us consider the partial sums of exp(tA) given by

S_n = Σ_{k=0}^n (tA)^k/k! = Σ_{k=0}^n t^k A^k/k!

So, we have

S_n′ = ( Σ_{k=0}^n t^k A^k/k! )′ = Σ_{k=1}^n k t^{k−1} A^k/k! = A · Σ_{k=0}^{n−1} t^k A^k/k!

which (using the result of Appendix Problem 1, uniformly on compact t-intervals) converges.


The result that we have used in the last argument is from Rudin's Principles of Mathematical Analysis, Theorem 7.17:
Suppose {f_n} is a sequence of functions, differentiable on [a, b] and such that {f_n(x₀)} converges for some point x₀ in [a, b]. If {f_n′} converges uniformly on [a, b], then {f_n} converges uniformly on [a, b] to a function f, and

f′(x) = lim_{n→∞} f_n′(x)  (a ≤ x ≤ b).


Arguing inductively, we can show that γ_A is C^∞ and

γ_A^{(n)} = Aⁿ · γ_A

which completes the proof.


Alternately, we could have also noted that each entry of γ_A, being a power series in t, is C^∞, and hence γ_A is also C^∞.


Appendix Problem 3 :

We want to understand what a smooth curve in SL2(R) actually means. This we will do in the next appendix notes. For now, we wish to set the stage and look at SL2(R) as a differentiable manifold.

Proof :

We know that, to obtain a C1-differentiable manifold from a topological manifold with atlas A, we only need to check that every transition map between charts in A is differentiable in the usual sense.
In this case, we have the atlas A = {(U, x), (V, y)} (from the solution of Problem 1, writing x = ϕ and y = ψ). It is easy to see that

(y ∘ x⁻¹)(a, b, c) = y( (a, b; c, (1 + bc)/a) ) = (a, b, (1 + bc)/a)

This gives us the transition map

y ∘ x⁻¹ : x(U ∩ V) → y(U ∩ V),  (a, b, c) ↦ (a, b, (1 + bc)/a)

Similarly,

(x ∘ y⁻¹)(a, b, d) = x( (a, b; (ad − 1)/b, d) ) = (a, b, (ad − 1)/b)

giving us

x ∘ y⁻¹ : y(U ∩ V) → x(U ∩ V),  (a, b, d) ↦ (a, b, (ad − 1)/b)

Clearly, the transition maps are differentiable, as a ≠ 0 and b ≠ 0 on the respective domains.


This proves that A is a differentiable atlas. So, we see that SL2(R) is a differentiable manifold.


Appendix Note 1 :

Let us recall that a function f : Rⁿ → M into a manifold M is called smooth if for all charts (U, ϕ) the function

ϕ ∘ f : f⁻¹(U) → ϕ(U)

is smooth (in the Euclidean sense).

In our case (in Problem 2), we were already given the definition that γ is smooth if γ is a smooth curve in R4. We want to unify the two definitions which will establish that SL2(R) is a good enough manifold. So, given the two charts {(U,ϕ),(V,ψ)}, let us look at

ϕ ∘ γ : γ⁻¹(U) → R³,  (ϕ ∘ γ)(t) = ϕ( (p₁(t), q₁(t); p₂(t), q₂(t)) ) = (p₁(t), q₁(t), p₂(t))

and

ψ ∘ γ : γ⁻¹(V) → R³,  (ψ ∘ γ)(t) = ψ( (p₁(t), q₁(t); p₂(t), q₂(t)) ) = (p₁(t), q₁(t), q₂(t))

Now, these, being maps between Euclidean spaces, are smooth in the Euclidean sense, as their components are smooth.

Once this unification is done, we can now say that component-wise smoothness is enough to guarantee manifold-wise smoothness. This justifies the argument we used in Problem 2 to show that Bγ is smooth.


Appendix Problem 4 :

Prove that SL2(R) is a topological space.

Proof :

Let us recall that if N is a subset of M, and O is a topology on M, then

O|_N := {U ∩ N : U ∈ O}

equips N with the subspace topology inherited from M.

So, let us take the standard topology O_R on R, given by

B_r(x) := {y ∈ R : |x − y| < r},  U ∈ O_R ⟺ ∀x ∈ U ∃r > 0 : B_r(x) ⊆ U

Now, we can equip R⁴ with the product topology O_{R⁴}, so that we can finally define

O := O_{R⁴}|_{SL₂(R)}

so that (SL₂(R), O) is a topological space.

This added to Appendix Problem 3 and Problem 1 shows that SL2(R) is a differentiable topological manifold.


Appendix Problem 5 :

Prove that SL2(R) forms a group under the usual definition of multiplication.

Proof :

The usual definition of multiplication is of course given by

· : SL₂(R) × SL₂(R) → SL₂(R),  ((a, b; c, d), (e, f; g, h)) ↦ (ae + bg, af + bh; ce + dg, cf + dh)

This operation is closed, since det(MN) = det(M)det(N) = 1 and real numbers are closed under addition and multiplication. It is also straightforward to check that

1. this operation is associative,

2. it has the identity element (1, 0; 0, 1), and

3. each (a, b; c, d) admits the inverse (d, −b; −c, a).


This completes the proof that SL2(R) is a (non-commutative) group.
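These checks are easy to carry out concretely (a small sketch of ours, writing matrices as nested lists):

```python
def mul(M, N):
    # 2x2 matrix product, matching the definition of the group operation
    return [[M[0][0]*N[0][0] + M[0][1]*N[1][0], M[0][0]*N[0][1] + M[0][1]*N[1][1]],
            [M[1][0]*N[0][0] + M[1][1]*N[1][0], M[1][0]*N[0][1] + M[1][1]*N[1][1]]]

def det(M):
    return M[0][0]*M[1][1] - M[0][1]*M[1][0]

def inv(M):
    # inverse formula (d, -b; -c, a); valid only when det(M) = 1
    (a, b), (c, d) = M
    return [[d, -b], [-c, a]]

A = [[2, 3], [1, 2]]     # det = 1
B = [[1, 5], [0, 1]]     # det = 1
assert det(A) == det(B) == 1
assert det(mul(A, B)) == 1                     # closure
assert mul(A, inv(A)) == [[1, 0], [0, 1]]      # inverse
assert mul(A, B) != mul(B, A)                  # non-commutativity
```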


Appendix Problem 6 :

Prove that SL2(R) is connected.

Proof :

We will prove the more general case of SLn(R).

First, let us prove that the elementary matrices of the first type generate SLₙ(R). This is Exercise 2.4.8(b) of Artin's Algebra.

If E is an elementary matrix, then the map X ↦ EX can be described as follows: if E is Type 1, with off-diagonal entry a at position (i, j), it performs the row operation rᵢ → rᵢ + a·rⱼ; if E is Type 3, with modified diagonal entry c ≠ 0 at index i, it performs rᵢ → c·rᵢ.

Note that Type 1 matrices have determinant 1 and Type 3 matrices have determinant c.

In order to show M = E₁E₂⋯Eₖ for some (permitted) elementary matrices Eᵢ, it suffices to show Iₙ = FₖFₖ₋₁⋯F₁M for some elementary Fᵢ, since then M = F₁⁻¹⋯Fₖ⁻¹, as elementary matrices are invertible, and their inverses are elementary as well.

Now, we consider M ∈ SLₙ(R). Using the row operations corresponding to Type 1 elementary matrices, we turn each column i into eᵢ (1 at position i, 0 elsewhere), from left to right.

Take the leftmost column i with cᵢ ≠ eᵢ, if it exists (otherwise, we are done). Since det(M) = 1 ≠ 0, we cannot have cᵢ written as a linear combination of c₁ = e₁, …, cᵢ₋₁ = eᵢ₋₁; hence one of the entries in rows i, i + 1, …, n must be nonzero, say in row j.

Subtracting rⱼ from the other rows as necessary, we first clear out all of column i (except for row j). Note that none of this affects columns 1 through i − 1. If i = n, we have a diagonal matrix with determinant 1 whose first n − 1 entries are all 1, so we are done. Otherwise, if i < n, pick an arbitrary row k ≠ j from i to n, and add a suitable multiple of rⱼ to rₖ so that the (k, i) entry becomes 1. Now subtract a suitable multiple of rₖ from rⱼ so the (j, i) entry becomes 0. If k = i, we can proceed to column i + 1; otherwise, add rₖ to rᵢ and subtract rᵢ from rₖ, and then proceed to column i + 1.

Now, let ∼ be the relation "connected by a path in SLₙ(R)". Clearly, ∼ is an equivalence relation.

In order to show SLₙ(R) is path-connected, it suffices to show A ∼ Iₙ for all A ∈ SLₙ(R). But, we just proved that A can be written as a (possibly empty) product of elementary matrices of the first type, so it in fact suffices to prove that

E_{uv}(a)M ∼ M

for all M ∈ SLₙ(R) and Type 1 elementary matrices E_{uv}(a) (1 ≤ u ≠ v ≤ n) of the form

Iₙ + [ a·[(i, j) = (u, v)] ]_{i,j=1}^n

(where [·] is the Iverson bracket).

Yet

M ↦ E_{uv}(b)M

simply adds b times row v to row u, i.e. takes r_u to r_u + b·r_v. For fixed u, v, M, this map is continuous in b (and preserves the determinant), so the continuous function

X(t) = E_{uv}(ta)M

over [0, 1] takes

X(0) = M,  X(1) = E_{uv}(a)M

while remaining inside SLₙ(R), as desired.

Alternately, we can also first prove that GLₙ⁺(R) := {A ∈ Mₙ(R) : det(A) > 0} is connected, and then consider the continuous surjective map

Ψ : GLₙ⁺(R) → SLₙ(R),  A ↦ A / (det A)^{1/n}

and recall that the continuous image of a connected set is connected.

Now, we can prove GLₙ⁺(R) is connected by induction on n. Consider the map

p : Mₙ(R) = Rⁿ × M_{n×(n−1)}(R) → Rⁿ

given by p(A) = Ae₁. That is, p sends A ∈ Mₙ(R) to its first column. Note that p is a projection map, hence open and continuous.

Note that GL₁⁺(R) = (0, ∞). Now, let f : GLₙ⁺(R) → Rⁿ ∖ {0} be the restriction of p to GLₙ⁺(R), which is open in Mₙ(R). So, f is also open and continuous.

Next, f⁻¹(e₁) ≅ Rⁿ⁻¹ × GLₙ₋₁⁺(R), which is a connected set by induction. For y ∈ Rⁿ ∖ {0}, choose B ∈ GLₙ⁺(R) with f(B) = y. Then, f⁻¹(y) = {BC : C ∈ f⁻¹(e₁)}; hence each fibre of f is connected. But, we know that if Y is connected and f : X → Y is a surjective continuous open map having connected fibres, then X is also connected. This, we can prove by contradiction:

If possible, write X = U ⊔ V, where U, V are non-empty open subsets of X. Then, f(U), f(V) are open subsets of Y such that f(U) ∪ f(V) = Y. Now, Y being connected implies f(U) ∩ f(V) ≠ ∅. Take y ∈ f(U) ∩ f(V); then f⁻¹(y) ∩ U ≠ ∅ and f⁻¹(y) ∩ V ≠ ∅, contradicting the fact that the fibres are connected.

This proves that SLₙ(R), and in particular SL₂(R), is path connected, and hence connected.


Appendix Problem 7 :

Prove that SL2(R) forms a Lie group.

Proof :

We have equipped SL₂(R) with both a group and a manifold structure. In order to obtain a Lie group structure, we have to check that these two structures are compatible, i.e., we need to show that the two maps

μ : SL₂(R) × SL₂(R) → SL₂(R),  ((a, b; c, d), (e, f; g, h)) ↦ (ae + bg, af + bh; ce + dg, cf + dh)

and

i : SL₂(R) → SL₂(R),  (a, b; c, d) ↦ (a, b; c, d)⁻¹

are differentiable with respect to the differentiable structure on SL₂(R). For instance, for the inverse map i, we have to show that the map y ∘ i ∘ x⁻¹ : x(U) → y(V), between open subsets of R³, is differentiable in the usual sense for any pair of charts (U, x), (V, y) ∈ A with i(U) ⊆ V.

But, since SL₂(R) is connected, the differentiability of the transition maps in A implies that if y ∘ i ∘ x⁻¹ is differentiable for one pair of charts, then it is differentiable for all charts in A. Hence, we can simply let (U, x) and (V, y) be the two charts on SL₂(R) defined above. Then we have

(y ∘ i ∘ x⁻¹)(a, b, c) = (y ∘ i)( (a, b; c, (1 + bc)/a) ) = y( ((1 + bc)/a, −b; −c, a) ) = ((1 + bc)/a, −b, a)

which is clearly differentiable as a map between open subsets of R³, as a ≠ 0 on x(U).

To prove that μ is differentiable, we can proceed almost similarly once we have a differentiable structure on the product manifold SL₂(R) × SL₂(R). Or, we may also argue that matrix multiplication M₂(R) × M₂(R) → M₂(R) is given by smooth expressions in the entries (involving products and sums) and hence is a smooth map, from which it follows that the restriction SL₂(R) × SL₂(R) → SL₂(R) is also smooth.

This completes the proof that SL₂(R) is a 3-dimensional Lie group.

An Application of a Theorem I Just Learned

Question : Recall that a polygonal number is a number that counts dots arranged in the shape of a regular polygon. Prove that the n-th non-s-gonal number is given by
⌊ 1/2 + √(1/4 + 2(n − 1)/(s − 2)) ⌋ + n
where ⌊x⌋ represents the greatest integer less than or equal to x.

Answer : It is well known that
(s − 2)n(n − 1)/2 + n
is the n-th s-gonal number.

Now, we will use the Lambek-Moser Theorem with
f(n) = (s − 2)n(n − 1)/2
whose inverse in the sense of that theorem, f*(n) = #{m ≥ 1 : f(m) < n}, is given by
f*(n) = ⌊ 1/2 + √(1/4 + 2(n − 1)/(s − 2)) ⌋
(to see this, note that f(m) ≤ n − 1 iff m ≤ 1/2 + √(1/4 + 2(n − 1)/(s − 2))), and hence
F(n) = ⌊ 1/2 + √(1/4 + 2(n − 1)/(s − 2)) ⌋ + n
hence completing the proof.
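The closed form can be compared against a brute-force enumeration (a quick check of ours; we use the variant with 2(n − 1)/(s − 2) inside the square root, which is the one that matches the enumeration at the boundary cases where 1/4 + 2(n − 1)/(s − 2) is an exact square):

```python
import math

def sgonal(s, n):
    # n-th s-gonal number: (s - 2) n (n - 1) / 2 + n
    return (s - 2) * n * (n - 1) // 2 + n

def nth_non_sgonal(s, n):
    # closed form via Lambek-Moser, with 2(n - 1)/(s - 2) under the root
    return math.floor(0.5 + math.sqrt(0.25 + 2 * (n - 1) / (s - 2))) + n

for s in (3, 4, 5, 6):
    gonal = {sgonal(s, k) for k in range(1, 100)}
    non_gonal = [m for m in range(1, 300) if m not in gonal]
    assert [nth_non_sgonal(s, n) for n in range(1, 150)] == non_gonal[:149]
```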

Remark : A careful induction on n (keeping s fixed) would probably have also done the job.

Acknowledgement : I learnt about the Lambek-Moser Theorem from the book The Irrationals: A Story of the Numbers You Can't Count On by Julian Havil (section Theoretical Matters of chapter Does Irrationality Matter?), which I also reviewed for zbMATH.

The Law of the Iterated Logarithm

1. Introduction

The goal is to give a complete proof of the Law of the Iterated Logarithm. The setup we will work in is the following. Let {X_i}_{i=1}^∞ be a sequence of iid random variables with mean 0 and variance 1. Let
S_n := Σ_{i=1}^n X_i  and  S̄_n := S_n/√n
for n ∈ Z⁺. From now on, we will use X_n and S_n as defined here in our discussions.
for nZ+. From now on, we will use Xn and Sn as defined in our discussions.

We wish to prove the following theorem.
Theorem 1 (The Law of the Iterated Logarithm) : Let
Λ_n := √(2n log log n)
wherever it is defined. Then,
lim sup_{n→∞} S_n/Λ_n = 1
almost surely. By replacing X_j by −X_j, this also proves that
lim inf_{n→∞} S_n/Λ_n = −1
almost surely. In fact, for any f ∈ C(R; R),
lim sup_{n→∞} f(S_n/Λ_n) = sup_{t∈[−1,1]} f(t)
almost surely. That is, the set of limit points of
{S_n/Λ_n : n ≥ 1}
coincides with the interval [−1, 1] almost surely.

In what follows, we will use the notation
Λ_β := Λ_{⌊β⌋}  and  S̃_β := S_{⌊β⌋}/Λ_{⌊β⌋}
where ⌊β⌋ is the greatest integer less than or equal to β.

The main reference we are following is Probability Theory, An Analytic View by Daniel W. Stroock.

2. Lemmas

To prove Theorem 1, we need three lemmas.

Lemma 1 : For a ∈ (0, ∞) and β ∈ (1, ∞), if
Σ_{m=1}^∞ P( |S̃_{β^m}| ≥ a/√β ) < ∞
then
lim sup_{n→∞} |S̃_n| ≤ a
almost surely.
almost surely.

Lemma 2 : For any sequence {X_n : n ≥ 1} of iid random variables with mean 0 and variance σ², we have
lim sup_{n→∞} |S̃_n| ≤ 8σ
almost surely.

Lemma 3 : Let the sequence {X_n} have a common distribution μ. Further, let the moment-generating function
M_μ(ξ) := ∫_R e^{ξx} μ(dx) < ∞
for all ξ ∈ R. Then, for all R ∈ (0, ∞), there is an N(R) ∈ Z⁺ and a suitable constant K such that
P( |S̃_n| ≥ R ) ≤ 2 exp[ −(1 − 8KR √(log log n / n)) R² log log n ]
for all n ≥ N(R). Also, for each ε ∈ (0, 1], there is an N(ε) ∈ Z⁺ such that for all n ≥ N(ε) and |a| ≤ 1 − ε,
P( |S̃_n − a| < ε ) ≥ (1/2) exp[ −(a² + 4K|a|ε) log log n ]
for some suitable constant K.

3. Proof of Theorem
We will first give a proof assuming the three lemmas and then give a proof of the lemmas afterwards.

Proof of Theorem 1 : Observe first that for any X_n and ε > 0, it is easy to find a ψ ∈ C_b(R) so that Y_n := ψ ∘ X_n has mean 0 and variance 1, while Z_n := X_n − Y_n has variance less than ε² (what we want is to find a smooth truncation - one way to do this is by first considering X_n·1_{|X_n|<n} and then appropriately smoothening the edges). Applying Lemma 2 to this Z_n, we have
lim sup_{n→∞} |S̃_n(ω)| ≤ 1 + lim sup_{n→∞} | Σ_{m=1}^n Z_m(ω)/Λ_n | ≤ 1 + 8ε
and for a ∈ [−1, 1], we have
lim inf_{n→∞} |S̃_n(ω) − a| ≤ lim sup_{n→∞} | Σ_{m=1}^n Z_m(ω)/Λ_n | ≤ 8ε
for almost every ω ∈ Ω. So, it is enough to assume that the random variables are bounded, and that is what will be assumed in the rest of the proof.

Now, for a given β > 1, we can use Lemma 3 (with R = √β) to argue
P(|S̃_{β^m}| ≥ √β) ≤ 2 exp[−√β log log β^m]
for all sufficiently large m ∈ ℤ^+. The right-hand side equals 2(m log β)^{−√β}, which is summable in m since √β > 1. So, using Lemma 1 with a = β, we see that
limsup_{n→∞} |S̃_n| ≤ β
almost surely for all β ∈ (1,∞), and hence limsup_{n→∞} |S̃_n| ≤ 1 almost surely.

So, to complete the proof, we need to show
P(liminf_{n→∞} |S̃_n − a| < ε) = 1
for all a ∈ (−1,1) and all ε > 0. To do this, we wish to use the Borel-Cantelli Lemma. But for that, we need to make sure we are dealing with independent events. For this, we can use the previously proved upper bound to argue that
liminf_{n→∞} |S̃_n − a| ≤ inf_{k≥2} liminf_{m→∞} |S̃_{k^m} − a| ≤ liminf_{k→∞} liminf_{m→∞} |(S_{k^m} − S_{k^{m−1}})/Λ_{k^m} − a|
almost surely, since limsup_{m→∞} |S_{k^{m−1}}|/Λ_{k^m} ≤ limsup_{m→∞} |S̃_{k^{m−1}}|·Λ_{k^{m−1}}/Λ_{k^m} is of order 1/√k and so vanishes as k → ∞.

Now, since the events
A_{k,m} := {|(S_{k^m} − S_{k^{m−1}})/Λ_{k^m} − a| < ε},  m ∈ ℤ^+
are mutually independent for each k ≥ 2, it is enough to check that
∑_{m=1}^∞ P(A_{k,m}) = ∞
for sufficiently large k.

But, since
lim_{k→∞} sup_{m∈ℤ^+} |Λ_{k^m}/Λ_{k^m − k^{m−1}} − 1| = 0
and
P(A_{k,m}) = P(|S̃_{k^m − k^{m−1}} − a·Λ_{k^m}/Λ_{k^m − k^{m−1}}| < ε·Λ_{k^m}/Λ_{k^m − k^{m−1}})
for all k ≥ 2, it is enough to show
∑_{m=1}^∞ P(|S̃_{k^m − k^{m−1}} − a| < ε) = ∞
for every k ≥ 2, a ∈ (−1,1) and ε > 0. This follows by using the second part of Lemma 3: choosing ε̃ > 0 so small that ρ := a² + 4Kε̃|a| < 1, we conclude that whenever 0 < ε ≤ ε̃, we have
P(|S̃_n − a| < ε) ≥ (1/2) exp[−ρ log log n] = (1/2)(log n)^{−ρ}
for all sufficiently large n. Summing this over n = k^m − k^{m−1} gives a series comparable to ∑_m (m log k)^{−ρ}, which diverges because ρ < 1.

This completes the proof.

4. Proof of Lemma 1

We first introduce the idea of a median. Given a real valued random variable Y, we say α ∈ ℝ is a median of Y if
P(Y ≥ α) ≥ 1/2  and  P(Y ≤ α) ≥ 1/2
and write α ∈ med(Y). It is easy to see that
α := inf{t ∈ ℝ : P(Y ≤ t) ≥ 1/2}
is a median of Y. In other words, every Y has a median. Additionally, it can be shown (via Chebyshev's inequality) that if Y ∈ L²(P) and m is the mean of Y, then
|α − m| ≤ √(2 Var(Y))
for all α ∈ med(Y).
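As a quick sanity check on these definitions, here is a small Python sketch (illustrative only; the sample data is made up) that computes the median above for an empirical distribution via the infimum characterization, and verifies the bound |α − m| ≤ √(2 Var(Y)).

```python
import math

def median_inf(values):
    """Return inf{t : P(Y <= t) >= 1/2} for the empirical distribution of `values`."""
    xs = sorted(values)
    n = len(xs)
    for t in xs:
        if sum(1 for x in xs if x <= t) / n >= 0.5:
            return t

def mean_var(values):
    """Mean and (population) variance of the empirical distribution."""
    m = sum(values) / len(values)
    v = sum((x - m) ** 2 for x in values) / len(values)
    return m, v

data = [1.0, 2.0, 3.0, 10.0]  # a skewed toy sample
alpha = median_inf(data)       # P(Y <= 2) = 1/2, so alpha = 2
m, v = mean_var(data)          # m = 4, v = 12.5
assert abs(alpha - m) <= math.sqrt(2 * v)
print(alpha, m, v)             # prints: 2.0 4.0 12.5
```

Note how skewness pulls the mean away from the median, yet |α − m| = 2 stays well inside the bound √(2·12.5) = 5.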

We also have the following theorem.
Theorem 2 (Lévy Reflection Principle) : For 1 ≤ k ≤ N, choose α_{N,k} ∈ med(S_N − S_k). Then, for all ε > 0, we have
P(max_{1≤n≤N} |S_n + α_{N,n}| ≥ ε) ≤ 2 P(|S_N| ≥ ε)
for all N ∈ ℤ^+.

With this at hand, we move to the proof.

Proof of Lemma 1 : Let β ∈ (1,∞) be given. For each m ∈ ℤ^+ and 1 ≤ n ≤ β^m, let α_{m,n} be a median of S_{β^m} − S_n. Since these differences have variance at most β^m, the median bound above gives |α_{m,n}| ≤ √(2β^m) = o(Λ_{β^m}), so that
limsup_{n→∞} |S̃_n| = limsup_{m→∞} max_{β^{m−1} ≤ n ≤ β^m} |S̃_n| ≤ √β limsup_{m→∞} max_{n ≤ β^m} |S_n|/Λ_{β^m} ≤ √β limsup_{m→∞} max_{n ≤ β^m} |S_n + α_{m,n}|/Λ_{β^m}
which gives
P(limsup_{n→∞} |S̃_n| > a) ≤ P(limsup_{m→∞} max_{n ≤ β^m} |S_n + α_{m,n}|/Λ_{β^m} ≥ a/√β)
and hence on application of Theorem 2, we have
P(max_{n ≤ β^m} |S_n + α_{m,n}| ≥ (a/√β) Λ_{β^m}) ≤ 2 P(|S̃_{β^m}| ≥ a/√β)
from which the desired result follows using the Borel-Cantelli Lemma.

5. Proof of Lemma 2

Again, we need to introduce a few objects before we can move to our proof.

We define the Rademacher functions {R_n}_{n∈ℤ^+} on [0,1) as follows. Consider the 1-periodic function R : ℝ → {−1, 1} given by
R(t) := −1 if t − ⌊t⌋ ∈ [0, 1/2),  and  R(t) := 1 if t − ⌊t⌋ ∈ [1/2, 1)
and define R_n on [0,1) by
R_n(ω) := R(2^{n−1}ω)
for n ∈ ℤ^+ and ω ∈ [0,1). It can be shown that, under the Lebesgue measure on [0,1), the Rademacher functions are mutually independent, each taking the values ±1 with probability 1/2.
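A short Python sketch (illustrative, not part of the argument) of R and R_n. The point of the construction is that R_n(ω) just reads off the n-th binary digit of ω, which is what makes the sequence an iid family of fair signs under Lebesgue measure.

```python
import math

def R(t):
    """The 1-periodic generator: -1 on [0, 1/2), +1 on [1/2, 1) (mod 1)."""
    frac = t - math.floor(t)
    return -1 if frac < 0.5 else 1

def R_n(n, omega):
    """n-th Rademacher function on [0, 1): R_n(omega) = R(2^(n-1) * omega)."""
    return R(2 ** (n - 1) * omega)

# omega = 0.625 = 0.101 in binary, so R_1, R_2, R_3 = +1, -1, +1
# (with the convention that binary digit 1 gives +1 and digit 0 gives -1)
print([R_n(n, 0.625) for n in (1, 2, 3)])  # prints: [1, -1, 1]
```

Choosing ω dyadic (such as 0.625) keeps the float arithmetic exact; the same digit-reading picture holds for every ω ∈ [0,1).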

We will use these functions in our proof.

Proof of Lemma 2 : Without loss of generality, it can be assumed that σ = 1. We begin by considering the case where the X_n's are symmetric, i.e., −X_n has the same distribution as X_n itself. For this case, we will be able to prove the result with the constant 8 replaced by 4 once we can show
∑_{m=0}^∞ P(|S̃_{2^m}| ≥ 2^{3/2}) < ∞
since we can then use Lemma 1 with β = 2 and a = 4.

Now, let (Ω, F, P) be the probability space on which the X_n's are defined and let {R_n}_{n∈ℤ^+} be the sequence of Rademacher functions on [0,1). Set Q := λ_{[0,1)} × P on ([0,1) × Ω, B_{[0,1)} × F), where λ_{[0,1)} is the Lebesgue measure on [0,1). It is easy to check that the symmetry of the X_n's is equivalent to the statement that
ω ∈ Ω ↦ (X_1(ω), X_2(ω), …) ∈ ℝ^{ℤ^+}
has the same distribution under P as
(t, ω) ∈ [0,1) × Ω ↦ (R_1(t)X_1(ω), R_2(t)X_2(ω), …) ∈ ℝ^{ℤ^+}
has under Q.

Now, from Hoeffding's inequality, we know that if {σ_k}_{k=1}^n ⊆ ℝ and {X_k}_{k=1}^n are mutually independent random variables with common distribution μ, the centered Bernoulli distribution given by μ({−1}) = μ({1}) = 1/2, then the sum S := ∑_{k=1}^n σ_k X_k satisfies
P(|S| ≥ a) ≤ 2 exp[−a²/(2Σ²)]
for all a ∈ [0,∞), where Σ² := ∑_{k=1}^n σ_k². We use this with σ_k = X_k(ω) and note that
λ_{[0,1)}({t ∈ [0,1) : |∑_{n=1}^{2^m} R_n(t)X_n(ω)| ≥ a}) ≤ 2 exp[−a²/(2∑_{n=1}^{2^m} X_n(ω)²)]
for all a ∈ [0,∞) and ω ∈ Ω.
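To make the Hoeffding bound concrete, the following Python sketch (an illustration under the assumed toy choice σ_k ≡ 1 and n = 4) computes the exact tail probability of a Rademacher sum by enumerating all sign patterns and compares it with the bound 2 exp[−a²/(2Σ²)].

```python
import math
from itertools import product

def exact_tail(sigmas, a):
    """P(|sum sigma_k X_k| >= a) for iid Rademacher X_k, by full enumeration."""
    n = len(sigmas)
    hits = sum(1 for signs in product((-1, 1), repeat=n)
               if abs(sum(s * x for s, x in zip(signs, sigmas))) >= a)
    return hits / 2 ** n

def hoeffding_bound(sigmas, a):
    """The sub-Gaussian bound 2 exp(-a^2 / (2 Sigma^2)) with Sigma^2 = sum sigma_k^2."""
    Sigma2 = sum(s * s for s in sigmas)
    return 2 * math.exp(-a ** 2 / (2 * Sigma2))

sigmas = [1.0, 1.0, 1.0, 1.0]
a = 2.0
print(exact_tail(sigmas, a))       # 10/16 = 0.625
print(hoeffding_bound(sigmas, a))  # 2 e^(-1/2), about 1.213: a valid (here crude) upper bound
```

For such tiny n the bound exceeds 1 and is vacuous; its strength only shows for large n, which is exactly how it is used above with n = 2^m.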

So, if we define
A_m := {ω ∈ Ω : 2^{−m} ∑_{n=1}^{2^m} X_n(ω)² ≥ 2}
and
F_m(ω) := λ_{[0,1)}({t ∈ [0,1) : |∑_{n=1}^{2^m} R_n(t)X_n(ω)| ≥ 2^{3/2} Λ_{2^m}})
then we have
P({ω ∈ Ω : |S_{2^m}(ω)| ≥ 2^{3/2} Λ_{2^m}}) = ∫_Ω F_m(ω) P(dω) ≤ 2 ∫_Ω exp[−8Λ_{2^m}²/(2∑_{n=1}^{2^m} X_n(ω)²)] P(dω) ≤ 2 exp[−4 log log 2^m] + 2 P(A_m)
due to Tonelli's Theorem; the last step bounds the integrand by 2 exp[−4 log log 2^m] off A_m (where ∑_{n=1}^{2^m} X_n² < 2^{m+1}) and by 2 on A_m. Note that ∑_m exp[−4 log log 2^m] = ∑_m (m log 2)^{−4} < ∞.

So, it is enough to show ∑_{m=0}^∞ P(A_m) < ∞. To do so, let us set
T_m := ∑_{n=1}^{2^m} X_n²,  B_m := {(T_{m+1} − T_m)/2^m ≥ 2},  T̄_m := T_m/2^m
for m ∈ ℤ^+. Clearly the {B_m}_{m∈ℤ^+} are independent (they involve disjoint blocks of the X_n's) and P(A_m) = P(B_m), since T_{m+1} − T_m is, like T_m, a sum of 2^m iid copies of X_1². So, it is enough to show
P(limsup_{m→∞} B_m) = P((T_{m+1} − T_m)/2^m ≥ 2 infinitely often) = 0
since then the second Borel-Cantelli Lemma forces ∑_m P(B_m) < ∞.

But, by the Strong Law of Large Numbers, we know T̄_m → 1 almost surely, and hence
(T_{m+1} − T_m)/2^m = 2T̄_{m+1} − T̄_m → 1
almost surely, so B_m occurs only finitely often. This completes the proof for the symmetric case, with the constant 8 replaced by 4.
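The SLLN step above can be seen numerically. This Python sketch (illustrative; standard normal X_n is our example choice of distribution) computes T̄_m = 2^{−m} ∑_{n≤2^m} X_n² for increasing m.

```python
import random

def T_bar(m, rng):
    """T-bar_m = 2^-m * sum_{n=1}^{2^m} X_n^2 for standard normal X_n."""
    total = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(2 ** m))
    return total / 2 ** m

rng = random.Random(42)  # fixed seed for reproducibility
for m in (4, 8, 12, 16):
    print(m, T_bar(m, rng))  # should be close to E[X^2] = 1 for large m
```

Since the summands X_n² have mean 1, the block averages settle near 1, which is precisely why the events B_m eventually stop occurring.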

To eliminate the symmetry assumption, let (Ω, F, P) be the space on which the X_n's are defined and define {Y_n} on (Ω², F², P²) by
Y_n(ω₁, ω₂) := X_n(ω₁) − X_n(ω₂)
for n ∈ ℤ^+. It is easy to check that the Y_n's are symmetric with variance 2. Applying the symmetric case with σ = √2, this gives
limsup_{n→∞} |S_n(ω₁) − S_n(ω₂)|/Λ_n ≤ 4√2 = 2^{5/2}
for P² almost every (ω₁, ω₂) ∈ Ω².

Now, we wish to proceed by contradiction. If possible, let
limsup_{n→∞} |S_n(ω)|/Λ_n > 8
on a set of positive P measure, so that by Kolmogorov's 0-1 Law, there is an ε > 0 such that
limsup_{n→∞} |S_n(ω)|/Λ_n ≥ 8 + ε
for P almost every ω ∈ Ω. Using Fubini's Theorem, this gives that for P² almost every (ω₁, ω₂) ∈ Ω², there is a sequence {n_m(ω₁) : m ∈ ℤ^+} ⊆ ℤ^+ such that n_m(ω₁) increases to ∞, the limit
lim_{m→∞} |S_{n_m(ω₁)}(ω₁)|/Λ_{n_m(ω₁)} ≥ 8 + ε
exists, and
liminf_{m→∞} |S_{n_m(ω₁)}(ω₂)|/Λ_{n_m(ω₁)} ≥ lim_{m→∞} |S_{n_m(ω₁)}(ω₁)|/Λ_{n_m(ω₁)} − limsup_{m→∞} |S_{n_m(ω₁)}(ω₁) − S_{n_m(ω₁)}(ω₂)|/Λ_{n_m(ω₁)} ≥ 8 + ε − 2^{5/2} ≥ ε
so that by Fubini's Theorem again, we can fix a sequence {n_m : m ∈ ℤ^+} ⊆ ℤ^+ such that n_m increases to ∞ and
liminf_{m→∞} |S_{n_m}(ω₂)|/Λ_{n_m} ≥ ε
for P almost every ω₂ ∈ Ω. This contradicts Markov's Inequality: since
E[(S_n/Λ_n)²] = 1/(2 log log n) → 0,
we have S_n/Λ_n → 0 in probability, so no fixed sequence can stay bounded away from 0 almost surely in this way. This completes the proof.

6. Proof of Lemma 3

As usual, we need to discuss some new ideas before we can move to our proof. We define
Λ_μ(ξ) := log(M_μ(ξ))
and define the Legendre transform of Λ_μ as
I_μ(x) := sup{ξx − Λ_μ(ξ) : ξ ∈ ℝ}
for x ∈ ℝ. Since Λ_μ is strictly convex (μ is non-degenerate), Λ_μ′ is strictly increasing and maps ℝ onto an open interval (α, β) containing the mean m of μ, and we let Ξ_μ := (Λ_μ′)^{−1} : (α, β) → ℝ.
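As a concrete example (illustrative only; the Rademacher measure μ({±1}) = 1/2 is just a convenient test case, for which Λ_μ(ξ) = log cosh ξ), here is a Python sketch computing Λ_μ and a grid approximation of the Legendre transform I_μ. Near the mean m = 0, I_μ(x) behaves like the quadratic x²/2.

```python
import math

def Lambda_mu(xi):
    """Lambda_mu(xi) = log M_mu(xi) = log cosh xi for the Rademacher measure."""
    return math.log(math.cosh(xi))

def I_mu(x, lo=-5.0, hi=5.0, steps=20001):
    """Grid approximation of I_mu(x) = sup_xi (xi * x - Lambda_mu(xi))."""
    best = -math.inf
    for i in range(steps):
        xi = lo + (hi - lo) * i / (steps - 1)
        best = max(best, xi * x - Lambda_mu(xi))
    return best

print(I_mu(0.0))               # 0: the transform vanishes at the mean
print(I_mu(0.1), 0.1**2 / 2)   # about 0.00501 vs the quadratic approximation 0.005
```

The grid sup is crude but adequate here, since the true maximizer ξ = Ξ_μ(x) stays close to 0 for small x.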

Lemma 4 : There exist a δ ∈ (0,1] and a K ∈ (0,∞) such that [m−δ, m+δ] ⊆ (α, β),
|Λ_μ(Ξ_μ(x))| ≤ K|Ξ_μ(x)|  and  |Ξ_μ(x)| ≤ K|x − m|
and
|I_μ(x) − (x − m)²/(2σ²)| ≤ K|x − m|³
for all x ∈ [m−δ, m+δ]. In particular, if 0 < ε ≤ δ, then
P(|S̄_n − m| ≥ ε) ≤ 2 exp[−n(ε²/(2σ²) − Kε³)]
and if |a − m| < δ, then
P(|S̄_n − a| < ε) ≥ (1 − K/(nε²)) exp[−n(|a − m|²/(2σ²) + K|a − m|(ε + |a − m|²))]
for any ε > 0.

Proof : Recall that we started with the assumption that our random variables have mean 0 and variance 1, i.e., m = 0 and σ = 1. So, we have Λ_μ(0) = Λ_μ′(0) = 0 and Λ_μ″(0) = 1, and hence Ξ_μ(0) = 0 and Ξ_μ′(0) = 1. So, we can find an M ∈ (0,∞) and a δ ∈ (0,1] with α < −δ < δ < β for which
|Ξ_μ(x) − x| ≤ M|x|²
whenever |x| ≤ δ, and
|Λ_μ(ξ) − ξ²/2| ≤ M|ξ|³
whenever |ξ| ≤ (M+1)δ. This immediately proves
|Ξ_μ(x)| ≤ (M+1)|x|
for |x| ≤ δ, and the estimate for I_μ follows since I_μ(x) = Ξ_μ(x)x − Λ_μ(Ξ_μ(x)).

This gives us enough room to discuss our proof.

Proof of Lemma 3 : Begin by setting
λ_n := Λ_n/n = √(2 log log n / n)
wherever it is defined.

To prove the first part, we apply the upper bound on P(|S̄_n − m| ≥ ε) obtained in Lemma 4 (with m = 0, σ = 1 and ε = Rλ_n) to see that
P(|S̃_n| ≥ R) = P(|S̄_n| ≥ Rλ_n) ≤ 2 exp[−n(R²λ_n²/2 − KR³λ_n³)] = 2 exp[−(1 − KR√(8 log log n / n)) R² log log n]
for sufficiently large n ∈ ℤ^+.

To prove the next part of the lemma, we note
P(|S̃_n − a| < ε) = P(|S̄_n − a_n| < ε_n)
where a_n = aλ_n and ε_n = ελ_n. So, using the lower bound on P(|S̄_n − a| < ε) obtained in Lemma 4, we conclude
P(|S̃_n − a| < ε) ≥ (1 − K/(nε_n²)) exp[−n(a_n²/2 + K|a_n|(ε_n + a_n²))] = (1 − K/(2ε² log log n)) exp[−(a² + 2K|a|(ε + a²λ_n)) log log n]
for sufficiently large n. Since a²λ_n ≤ ε and K/(2ε² log log n) ≤ 1/2 for all sufficiently large n, the right-hand side is at least (1/2) exp[−(a² + 4K|a|ε) log log n], which is the claimed bound. This completes the proof.


Acknowledgement : I'd like to thank Satvik Saha, who has been my partner in reading this proof, patiently answering my barrage of inane questions and making clever observations that I've shamelessly passed off as my own in this write-up. I feel safe admitting it here because, let’s be real, no one’s reading this far anyway.
