# Deriving Chernoff Bounds
## Definitions
A sample space \\( \Omega \\) is a set of outcomes \\( \Omega = \\{a\_1, a\_2, \dots\\} \\) with associated probabilities \\( P(a\_1), P(a\_2), \dots \\).
Each time we sample from \\( \Omega \\) we receive an outcome, whose value we denote \\( V(\Omega) \\).
After \\( k \\) trials, let \\( G\_k = \sum\_{i=1}^k V\_i(\Omega) \\), where \\( V\_i(\Omega) \\) denotes the value received on the \\( i \\)-th trial.
In this post we'll develop methods to bound the tail probabilities of \\( G\_k \\).
## Example
Let's say there's a lottery that either wins you $60, $1010, or nothing at all.
It costs $10 to enter, so the net outcomes would be \\( \Omega = \\{ -10, 50, 1000 \\} \\), let's say with probabilities \\( P(-10)=90\%\\), \\( P(50)=9.9\%\\), and \\( P(1000)=0.1\%\\).
After \\( k \\) trials, what are the odds that we won't lose money? That we'll make, or lose, more than $1000?
To answer these questions, we'll be finding upper bounds on probabilities of the form \\( P(G\_k \geq Z) \\).
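To make this concrete, here's a minimal Python sketch of the lottery as a sample space we can draw from (the names `outcomes`, `probs`, and `sample_G` are my own, chosen for illustration):

```python
import random

# Net outcomes of one lottery ticket and their probabilities.
outcomes = [-10, 50, 1000]
probs = [0.90, 0.099, 0.001]

def sample_G(k, rng=random):
    """Simulate k independent tickets and return the total winnings G_k."""
    return sum(rng.choices(outcomes, weights=probs, k=k))
```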
## Deriving
The expected value of each trial is
$$
\mathbb{E}[V(\Omega)] = \sum\_{a \in \Omega} a \cdot P(a)
$$
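For the lottery example this works out to
$$
\mathbb{E}[V(\Omega)] = 0.9 \cdot (-10) + 0.099 \cdot 50 + 0.001 \cdot 1000 = -3.05,
$$
so on average each ticket loses about $3.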
We'll introduce an auxiliary variable, \\( t > 0 \\), to exponentially reweight the value of each outcome; the resulting quantity is the moment generating function of \\( V(\Omega) \\),
$$
\mathbb{E}[e^{t\cdot V(\Omega)}] = \sum\_{a \in \Omega} e^{a\cdot t} \cdot P(a)
$$
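For the lottery this reweighted expectation is
$$
\mathbb{E}[e^{t \cdot V(\Omega)}] = 0.9 e^{-10t} + 0.099 e^{50t} + 0.001 e^{1000t}
$$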
We'll leave the value of \\(t \\) floating for now and resolve it towards the end of the post.
We now show
\begin{align}
\mathbb{E}[e^{t \cdot G\_k}] &= \mathbb{E}[e^{t \sum\_{i=1}^k V\_i(\Omega)}] \\\\
&= \prod\_{i=1}^k \mathbb{E}[e^{t \cdot V\_i(\Omega)}] && \text{ because trials are independent} \\\\
&= \Big( \mathbb{E}[e^{t \cdot V(\Omega)}] \Big)^{k} && \text{ because trials are identically distributed} \\\\
&= \Big(\sum\_{a \in \Omega} e^{a\cdot t} \cdot P(a) \Big)^{k}
\end{align}
Bounding the probability on the upper tail, \\( G\_k \geq Z \\),
\begin{align}
P(G\_k \geq Z) &= P(e^{t G\_k} \geq e^{t Z}) && \text{ by monotonicity} \\\\
&\leq \frac{\mathbb{E}[e^{t G\_k}]}{e^{t Z}} && \text{ by Markov's inequality} \\\\
&= \frac{\Big(\sum\_{a \in \Omega} e^{a\cdot t} \cdot P(a) \Big)^{k}}{e^{t Z}}
\end{align}
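For the lottery example the bound reads
$$
P(G\_k \geq Z) \leq \frac{\Big(0.9 e^{-10t} + 0.099 e^{50t} + 0.001 e^{1000t}\Big)^{k}}{e^{t Z}}
$$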
To evaluate the bound we'll pick the \\( t > 0 \\) that minimizes the RHS.
Because the RHS is convex in \\( t \\), we can find the minimizer numerically with a binary-search-style (ternary search) algorithm, sketched below.
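Here's a sketch of that search in Python, reusing `outcomes` and `probs` from the example above. It minimizes the logarithm of the RHS (also convex) for numerical stability; `t_max` is a hand-picked cap that keeps \\( e^{1000t} \\) from overflowing, and all names are my own:

```python
import math

def log_bound(t, Z, k):
    """Logarithm of the RHS: k * log(E[e^{t V}]) - t * Z."""
    mgf = sum(p * math.exp(a * t) for a, p in zip(outcomes, probs))
    return k * math.log(mgf) - t * Z

def chernoff_bound(Z, k, t_max=0.01, iters=200):
    """Minimize the bound over t in (0, t_max] by ternary search on a convex function."""
    lo, hi = 0.0, t_max
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if log_bound(m1, Z, k) < log_bound(m2, Z, k):
            hi = m2
        else:
            lo = m1
    return math.exp(log_bound((lo + hi) / 2, Z, k))
```

Note that when \\( Z \leq k \cdot \mathbb{E}[V(\Omega)] \\) the minimizer sits at \\( t \to 0 \\) and the bound degenerates to the vacuous \\( P(G\_k \geq Z) \leq 1 \\).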
## Empirical Evaluation
Revisiting the example, if we buy \\(k\\) tickets, what are the odds we won't lose money, \\(P(G\_k \geq 0)\\)?
This is difficult to answer exactly, so we'll settle for an upper bound on this probability.
We plot the derived bound alongside probabilities estimated by Monte Carlo simulation, for three events: we won't lose money, we'll make at least $1000, and we won't lose more than $1000.

*(Plots omitted: derived bound vs. Monte Carlo estimate as a function of \\( k \\), one panel per event.)*
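For anyone who wants to reproduce the comparison, here's a sketch of the Monte Carlo side (plotting code omitted), reusing the helpers defined earlier:

```python
def monte_carlo(Z, k, trials=10_000, rng=random):
    """Estimate P(G_k >= Z) by direct simulation."""
    return sum(sample_G(k, rng) >= Z for _ in range(trials)) / trials

for k in (10, 50, 100):
    print(f"k={k}: P(G_k >= 0) <= {chernoff_bound(0, k):.3g}, "
          f"simulated ~ {monte_carlo(0, k):.3g}")
```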
## Final Thoughts
Although this may not seem awfully tight, asymptotically it's as good as it gets;
a simple analysis shows that the bound decays exponentially in \\( k \\): it is \\( \Theta(c^k) \\) for some \\( 0 < c < 1 \\).
I'm also realizing that in the example I promised bounds on the probability of losing more than $1000, \\( P(G\_k < -1000) \\), but I only gave an upper bound, \\( u \\), on the complement: \\( P(G\_k \geq -1000) \leq u \\).
A slight rearrangement converts this into a lower bound on \\( P(G\_k < -1000) \\),
$$ P(G\_k \geq -1000) \leq u \Longleftrightarrow P(G\_k < -1000) \geq 1 - u $$
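In code, using the `chernoff_bound` sketch from earlier, this is a one-liner (with \\( k = 500 \\) as an arbitrary illustration; for small \\( k \\) the bound is vacuous):

```python
# An upper bound u on P(G_k >= -1000) yields P(G_k < -1000) >= 1 - u.
u = chernoff_bound(-1000, k=500)
lower = max(0.0, 1.0 - u)  # clamp: for small k the bound is vacuous (u near 1)
```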