# Deriving Chernoff Bounds
## Definitions
A sample space \\( \Omega \\) is a set of outcomes \\( \Omega = \\{a\_1, a\_2, \dots\\} \\) with associated probabilities \\( P(a\_1), P(a\_2), \dots \\).
Each time we sample from \\( \Omega \\) we receive an outcome, whose value we denote \\( V(\Omega) \\).
After \\( k \\) trials, let \\( G\_k = \sum\_{i=1}^k V\_i(\Omega) \\), where \\( V\_i(\Omega) \\) denotes the value received on the \\( i \\)-th trial.
In this post we'll develop methods to bound the tail probabilities of \\( G\_k \\).
## Example
Let's say there's a lottery that either wins you $60, $1010, or nothing at all.
It costs $10 to enter, so the net outcomes would be \\( \Omega = \\{ -10, 50, 1000 \\} \\), let's say with probabilities \\( P(-10)=90\%\\), \\( P(50)=9.9\%\\), and \\( P(1000)=0.1\%\\).
After \\( k \\) trials, what are the odds that we won't lose money? That we'll make, or lose, more than $1000?
To answer these questions, we'll be finding upper bounds on probabilities of the form \\( P(G\_k \geq Z) \\).
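To make this concrete, here's a minimal Python sketch of the lottery as a sample space we can draw from (the names `outcomes`, `probs`, and `sample_G` are my own, chosen for illustration):

```python
import random

# Net outcomes of one lottery ticket and their probabilities.
outcomes = [-10, 50, 1000]
probs = [0.90, 0.099, 0.001]

def sample_G(k, rng=random):
    """Simulate k independent tickets and return the total winnings G_k."""
    return sum(rng.choices(outcomes, weights=probs, k=k))
```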
## Deriving
The expected value of each trial is
$$
\mathbb{E}[V(\Omega)] = \sum\_{a \in \Omega} a \cdot P(a)
$$
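For the lottery example this works out to
$$
\mathbb{E}[V(\Omega)] = 0.9 \cdot (-10) + 0.099 \cdot 50 + 0.001 \cdot 1000 = -3.05,
$$
so on average each ticket loses about $3.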
We'll introduce an auxiliary variable, \\( t > 0 \\), to exponentially reweight the value of each outcome; the resulting quantity is the moment generating function of \\( V(\Omega) \\),
$$
\mathbb{E}[e^{t\cdot V(\Omega)}] = \sum\_{a \in \Omega} e^{a\cdot t} \cdot P(a)
$$
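For the lottery this reweighted expectation is
$$
\mathbb{E}[e^{t \cdot V(\Omega)}] = 0.9 e^{-10t} + 0.099 e^{50t} + 0.001 e^{1000t}
$$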
We'll leave the value of \\(t \\) floating for now and resolve it towards the end of the post.
We now show
\begin{align}
\mathbb{E}[e^{t \cdot G\_k}] &= \mathbb{E}[e^{t \sum\_{i=1}^k V\_i(\Omega)}] \\\\
&= \prod\_{i=1}^k \mathbb{E}[e^{t \cdot V\_i(\Omega)}] && \text{ because trials are independent} \\\\
&= \Big( \mathbb{E}[e^{t \cdot V(\Omega)}] \Big)^{k} && \text{ because trials are identically distributed} \\\\
&= \Big(\sum\_{a \in \Omega} e^{a\cdot t} \cdot P(a) \Big)^{k}
\end{align}
Bounding the probability on the upper tail, \\( G\_k \geq Z \\),
\begin{align}
P(G\_k \geq Z) &= P(e^{t G\_k} \geq e^{t Z}) && \text{ by monotonicity} \\\\
&\leq \frac{\mathbb{E}[e^{t G\_k}]}{e^{t Z}} && \text{ by Markov's inequality} \\\\
&= \frac{\Big(\sum\_{a \in \Omega} e^{a\cdot t} \cdot P(a) \Big)^{k}}{e^{t Z}}
\end{align}
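For the lottery example the bound reads
$$
P(G\_k \geq Z) \leq \frac{\Big(0.9 e^{-10t} + 0.099 e^{50t} + 0.001 e^{1000t}\Big)^{k}}{e^{t Z}}
$$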
To evaluate the bound we'll pick the \\( t > 0 \\) that minimizes the RHS.
Because the RHS is convex in \\( t \\), we can find the minimizer numerically with a binary-search-style (ternary search) algorithm, sketched below.
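Here's a sketch of that search in Python, reusing `outcomes` and `probs` from the example above. It minimizes the logarithm of the RHS (also convex) for numerical stability; `t_max` is a hand-picked cap that keeps \\( e^{1000t} \\) from overflowing, and all names are my own:

```python
import math

def log_bound(t, Z, k):
    """Logarithm of the RHS: k * log(E[e^{t V}]) - t * Z."""
    mgf = sum(p * math.exp(a * t) for a, p in zip(outcomes, probs))
    return k * math.log(mgf) - t * Z

def chernoff_bound(Z, k, t_max=0.01, iters=200):
    """Minimize the bound over t in (0, t_max] by ternary search on a convex function."""
    lo, hi = 0.0, t_max
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if log_bound(m1, Z, k) < log_bound(m2, Z, k):
            hi = m2
        else:
            lo = m1
    return math.exp(log_bound((lo + hi) / 2, Z, k))
```

Note that when \\( Z \leq k \cdot \mathbb{E}[V(\Omega)] \\) the minimizer sits at \\( t \to 0 \\) and the bound degenerates to the vacuous \\( P(G\_k \geq Z) \leq 1 \\).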
## Empirical Evaluation
Revisiting the example, if we buy \\(k\\) tickets, what are the odds we won't lose money, \\(P(G\_k \geq 0)\\)?
This is difficult to answer exactly, so we'll settle for an upper bound on this probability.
We plot the derived bound alongside probabilities estimated by Monte Carlo simulation, for three events: we won't lose money, we'll make at least $1000, and we won't lose more than $1000.

*(Plots omitted: derived bound vs. Monte Carlo estimate as a function of \\( k \\), one panel per event.)*
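For anyone who wants to reproduce the comparison, here's a sketch of the Monte Carlo side (plotting code omitted), reusing the helpers defined earlier:

```python
def monte_carlo(Z, k, trials=10_000, rng=random):
    """Estimate P(G_k >= Z) by direct simulation."""
    return sum(sample_G(k, rng) >= Z for _ in range(trials)) / trials

for k in (10, 50, 100):
    print(f"k={k}: P(G_k >= 0) <= {chernoff_bound(0, k):.3g}, "
          f"simulated ~ {monte_carlo(0, k):.3g}")
```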
## Final Thoughts
Although this may not seem awfully tight, asymptotically it's as good as it gets;
a simple analysis shows that the bound decays exponentially in \\( k \\): it is \\( \Theta(c^k) \\) for some \\( 0 < c < 1 \\).
I'm also realizing that in the example I promised bounds on the probability of losing more than $1000, \\( P(G\_k < -1000) \\), but I only gave an upper bound, \\( u \\), on the complement: \\( P(G\_k \geq -1000) \leq u \\).
A slight rearrangement converts this into a lower bound on \\( P(G\_k < -1000) \\),
$$ P(G\_k \geq -1000) \leq u \Longleftrightarrow P(G\_k < -1000) \geq 1 - u $$
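In code, using the `chernoff_bound` sketch from earlier, this is a one-liner (with \\( k = 500 \\) as an arbitrary illustration; for small \\( k \\) the bound is vacuous):

```python
# An upper bound u on P(G_k >= -1000) yields P(G_k < -1000) >= 1 - u.
u = chernoff_bound(-1000, k=500)
lower = max(0.0, 1.0 - u)  # clamp: for small k the bound is vacuous (u near 1)
```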