The goal is to give a complete proof of the Law of Iterated Logarithm. The setup we will work in is the following. Let be a sequence of iid random variables with mean and variance . Let
for . From now on, we will use and as defined in our discussions.
We wish to prove the following theorem.
Theorem 1 (The Law of Iterated Logarithm) : Let
wherever it is defined. Then,
almost surely. By replacing by , this also proves that
almost surely. In fact, for any ,
almost surely. That is, the set of limit points of
coincides with the interval almost surely.
In what follows, we will use the notation
where is greatest integer less than or equal to .
The main reference we are following is Probability Theory, An Analytic View by Daniel W. Stroock.
2. Lemmas
To prove Theorem 1, we need three lemmas.
Lemma 1 : For and , if
then
almost surely.
Lemma 2 : For any sequence of iid random variables with mean and variance , we have
almost surely.
Lemma 3 : Let the sequence have a common distribution . Further, let the moment-generating function
for all . Then, for all , there is an and a suitable constant such that
for all . Also, for each , there is an such that for all for all and ,
for some suitable constant .
3. Proof of Theorem
We will first give a proof assuming the three lemmas and then give a proof of the lemmas afterwards.
Proof of Theorem 1 : Observe first that for any and , it is easy to find a so that has mean and variance while has variance less than (what we want is to find a smooth truncation - one way to do this is by first considering and then appropriately smoothening the edges). Applying Lemma 2 to this , we have
and for , we have
for almost every . So, it is enough to assume that the random variables are bounded, and that is what will be assumed in the rest of the proof.
Now, for a given , we can use Lemma 3 to argue
for all sufficiently large . So, using Lemma 1 with , we see that
almost surely for all .
So, to complete the proof, we need to show
for all and all . To do this, we wish to use the
Borel-Cantelli Lemma. But for that, we need to make sure we are dealing with independent events. For this, we can use the previously proved result to argue that
almost surely for all integers .
Now, since the events
are mutually independent for each , it is enough to check that
for sufficiently large .
But, since
and
for all , it is enough to show
for every and and . This follows by using the second part of Lemma 3 and choosing so small that from which it is possible to conclude that when , we have
for all sufficiently large .
This completes the proof.
4. Proof of Lemma 1
We first introduce the idea of a median. Give a real valued random variable , we say is a median of if
and write as . It is easy to see that
is a median of . In other words, every has a median. Additionally, it can be shown that if and is the mean of , then
for all .
We also have the following theorem.
Theorem 2 (Lévy Reflection Principle) : For , choose . Then, for all , we have
for all .
With this at hand, we move to the proof.
Proof of Lemma 1 : Let be given. For each and , let be a median of . This gives so that
which gives
and hence on application of Theorem 2, we have
from which the desired result follows using Borel-Cantelli Lemma.
5. Proof of Lemma 2
Again, we need to introduce a few objects before we can move to our proof.
We define the Rademacher functions on as follows. Consider the function given by
and define on by
for and . It can be shown that the Rademacher functions are mutually independent.
We will use these functions in our proof.
Proof of Lemma 2 : Without loss of generality, it can be assumed that . We begin by considering the case where the 's are symmetric, i.e., and has the same distribution as itself. For this case, we will be able to prove the result with the constant replaced by once we can show
since we can then use Lemma 1.
Now, let be the probability space on which the 's are defined and let be the sequence of Rademacher functions on . Set on where is the Lebesgue measure on . It is easy to check that the symmetry of is equivalent to the statement that
has the same distribution under as
has under .
Now, from
Hoeffding's inequality, we know that for a given and mutually independent random variables with common distribution , the centered Bernoulli distribution given by , if denotes the distribution , then for all , we have
where . We use this with and note that
for all and .
So, if we define
and
then we have
due to Tonelli's Theorem.
So, it is enough to show . To do so, let us set
for . Clearly are independent and . So, it is enough to show
and then use Borel-Cantelli Lemma.
But, by Strong Law of Large Numbers, we know almost surely, and hence
almost surely. This completes the proof for the symmetric case replacing the constant by .
To eliminate the symmetry assumption, let be the space on which 's are defined and define on as
for . It is easy to check that the 's are symmetric. This gives
for almost every .
Now, we wish to proceed by contradiction. If possible, let
for almost every . Using Fubini's Theorem, this gives that for almost every , there is a such that increases to so that the limit
exists and
so that by Fubini's Theorem, we have an such that increases to and
for almost every . This contradicts Markov's Inequality owing to
hence completing the proof.
Proof of Lemma 3
As usual, we need to discuss some new ideas before we can move to our proof. We define
and define the Legendre transform of as
for . Also, let .
Lemma 4 : There exists a and a such that and
and
for all . In particular, if , then
and if , then
for any .
Proof : Recall that we started with the assumption that our random variables have mean and variance . So, we have and and hence and . So, we can find an and a with for which
whenever and
whenever . This immediately proves
for and the estimate for follows since .
This gives us enough room to discuss our proof.
Proof of Lemma 3 : Begin by setting
wherever it is defined.
To prove the first part, we apply the upper bound of obtained in Lemma 4 to see that
for sufficiently large .
To prove the next part of the lemma, we note
where and .So, using the lower bound on obtained in Lemma 4 to conclude
for sufficiently large .
Acknowledgement : I'd like to thank Satvik Saha, who has been my partner in reading this proof, patiently answering my barrage of inane questions and making clever observations that I've shamelessly passed off as my own in this write-up. I feel safe admitting it here because, let’s be real, no one’s reading this far anyway.