## Sums of Random Variables

##### Nomenclature:

Upper case letters, \(X, Y\), are random variables; lower case letters, \(x, y\), are specific realizations of them. Upper case \(F\) is a cumulative distribution function, *cdf*, and lower case \(f\) is a probability density function, *pdf*.

**Sometimes** you need to know the distribution of some combination of things. The sum of two incomes, for example, or the difference between demand and capacity. If \(f_X(x)\) is the distribution (probability density function, *pdf*) of one item, and \(f_Y(y)\) is the distribution of another, what is the distribution of their sum, \(Z = X + Y\) ?

As a simple example, consider \(X\) and \(Y\) to be independent, each with a uniform distribution on the interval \((0, 1)\). The distribution of their sum is triangular on \((0, 2)\).

Why? To begin, consider the problem qualitatively. The minimum possible value of \(Z = X + Y\) is zero, when \(x=0\) and \(y=0\), and the maximum possible value is two, when \(x=1\) and \(y=1\). Thus the density of the sum is nonzero only on the interval \((0, 2)\), since the probability of \(z \lt 0\) or \(z \gt 2\) is zero, that is, \(P(Z \lt 0) = 0\) and \(P(Z \gt 2) = 0\).
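
The range argument above can be checked empirically. The following is a minimal simulation sketch, not part of the original derivation; the function names `simulate_sums` and `histogram`, the seed, the sample size, and the bin count are all illustrative assumptions. It draws many independent pairs, adds them, and prints a crude text histogram so the shape of the distribution of \(Z\) can be inspected.

```python
import random


def simulate_sums(n_samples, seed=0):
    """Draw n_samples values of Z = X + Y, with X and Y independent Uniform(0, 1)."""
    rng = random.Random(seed)  # seeded so repeated runs give the same draws
    return [rng.random() + rng.random() for _ in range(n_samples)]


def histogram(values, n_bins=8, lo=0.0, hi=2.0):
    """Count how many values fall into each of n_bins equal-width bins on [lo, hi]."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # Clamp so that a value exactly equal to hi lands in the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts


if __name__ == "__main__":
    z = simulate_sums(100_000)
    print("smallest sum:", min(z))
    print("largest sum: ", max(z))
    n_bins = 8
    width = 2.0 / n_bins
    for i, count in enumerate(histogram(z, n_bins=n_bins)):
        left = i * width
        # One '#' per 1000 samples in the bin.
        print(f"[{left:.2f}, {left + width:.2f})  {'#' * (count // 1000)}")
```

Every simulated sum should fall between 0 and 2, and the bars should be longest near the middle of the interval. A simulation like this only suggests the shape; it does not establish it.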

Further, it seems intuitive (see the note below) that the most probable value would be near \(z=1\), the midpoint of the interval, for several reasons. The summands are **iid** (independent, identically distributed) and the sum is a linear operation that doesn’t distort symmetry. So we would surmise that the probability density of \(Z = X + Y\) should start at zero at \(z=0\), rise to a maximum at mid-interval, \(z=1\), and then drop symmetrically to zero at the end of the interval, \(z=2\). We might expect the distribution of \(Z = X + Y\) to look like this:

[Figure: triangular density of \(Z\) on \((0, 2)\), peaking at \(z = 1\)]

Enough of visceral pseudo calculus. How do you prove that this result is correct?

##### Note:

- Statistical intuition can sometimes be misleading. See Joseph P. Romano, Andrew F. Siegel (1986), **Counterexamples in Probability and Statistics** (Wadsworth and Brooks/Cole Statistics/Probability Series).

#### Proof:

\(F_Z(z)\), the cdf of \(Z\), is the probability that the sum, \(Z\), is less than or equal to some value \(z\). The probability density that we’re looking for is \(f_Z(z) = d F_Z(z) /dz\), since the integral of the *pdf* is the *cdf*, as a consequence of the Fundamental Theorem of Calculus.

| Step | Expression | Justification |
|---|---|---|
| 1 | \(F_Z(z) = P(Z \le z) = P(X+Y \le z)\) | by definition. |
| 2 | \(= \displaystyle\int_{-\infty}^{\infty} P(X+Y \le z \mid X=x) \times f_X(x) \, dx\) | by the law of total probability, conditioning on \(X = x\). |
| 3 | \(F_Z(z) = \displaystyle\int_{-\infty}^{\infty} P(x+Y \le z) \times f_X(x) \, dx\) | letting \(X = x\) and using the independence of \(X\) and \(Y\). |
| 4 | Now, \(P(x+Y \le z) = P(Y \le z-x) = F_Y(z-x)\) | by the definition of \(F_Y\). |
| 5 | so that \(F_Z(z) = \displaystyle\int_{-\infty}^{\infty} F_Y(z-x) \times f_X(x) \, dx\) | by substitution of 4 into 3. |
| 6 | Also, \(f_Z(z) = \dfrac{d F_Z(z)}{dz}\) | by the relationship between a pdf and its cdf. |
| 7 | \(f_Z(z) = \dfrac{d}{dz} \Big( \displaystyle\int_{-\infty}^{\infty} F_Y(z-x) \times f_X(x) \, dx \Big)\) | by substituting 5 into 6. |
| 8 | \(f_Z(z) = \displaystyle\int_{-\infty}^{\infty} \dfrac{\partial F_Y(z-x)}{\partial z} \, f_X(x) \, dx\) | by Leibniz’s rule for differentiating an integral. |
| 9 | Since \(\dfrac{\partial F_Y(z-x)}{\partial z} = \dfrac{dF_Y(y)}{dy} \times \dfrac{\partial y}{\partial z} = f_Y(z-x)\) | by the relationship between a pdf and its cdf and the chain rule: with \(y = z - x\) and \(x\) held fixed, \(\partial y / \partial z = 1\). |
| 10 | Finally, \(f_Z(z) = \displaystyle\int_{-\infty}^{\infty} f_Y(z-x) \, f_X(x) \, dx\) | by substituting 9 into 8. This is a general result. Q.E.D. |
| 11 | \(f_X(x) = 1\) for \(0 \le x \le 1\), and 0 otherwise | Our specific example: sum of two standard uniform densities. |
| 12 | and \(f_Y(y) = 1\) for \(0 \le y \le 1\), and 0 otherwise | by the definition of a standard uniform distribution. |
| 13 | \(f_Z(z) = \displaystyle\int_{\max(0,\,z-1)}^{\min(1,\,z)} 1 \times 1 \, dx\) | from 10, 11 and 12 above; the integrand \(f_Y(z-x) \, f_X(x)\) equals 1 only where \(0 \le x \le 1\) and \(0 \le z-x \le 1\), and is zero elsewhere. |
| 14 | Breaking the integral into two parts depending on \(z\): \(f_Z(z) = \displaystyle\int_0^z dx = x \, \Big\rvert_0^z = z\) | if \(0 \le z \le 1\), and |
| 15 | \(f_Z(z) = \displaystyle\int_{z-1}^1 dx = x \, \Big\rvert_{z-1}^1 = 1-(z-1) = 2-z\) | if \(1 \le z \le 2\). |

These are seen to be the equations describing a triangular distribution on \((0, 2)\), shown in the figure above. Q.E.D.
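
As a quick sanity check, which is not part of the original derivation, the two pieces from steps 14 and 15 should together integrate to one, as any probability density must:

\[\int_0^2 f_Z(z) \, dz = \int_0^1 z \, dz + \int_1^2 (2-z) \, dz = \frac{1}{2} + \frac{1}{2} = 1\]

The two halves contribute equally, which agrees with the symmetry about \(z=1\) expected from the qualitative argument.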

Note that step 10, above, is called the *convolution* of the functions \(f_X\) and \(f_Y\). This result is general and is true for any independent continuous densities. A specific example is our original problem, where \(f_X(x)\) and \(f_Y(y)\) are both uniform on \((0, 1)\) and \(Z = X + Y\) is their sum.
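
The convolution integral can also be approximated numerically and compared with the closed-form triangle. The sketch below is a minimal illustration under stated assumptions: the helper names (`uniform_pdf`, `convolve_at`, `triangular_pdf`), the integration window \([-1, 3]\), and the number of subintervals are hypothetical choices made here, and the midpoint Riemann sum is just one simple way to approximate the integral.

```python
def uniform_pdf(t):
    """Density of the standard uniform distribution: 1 on [0, 1], 0 elsewhere."""
    return 1.0 if 0.0 <= t <= 1.0 else 0.0


def convolve_at(f_x, f_y, z, lo=-1.0, hi=3.0, n=4000):
    """Approximate f_Z(z) = integral of f_Y(z - x) * f_X(x) dx over [lo, hi].

    Uses a midpoint Riemann sum with n subintervals. The window [lo, hi]
    is assumed to contain every x where the integrand is nonzero.
    """
    dx = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * dx  # midpoint of the i-th subinterval
        total += f_y(z - x) * f_x(x)
    return total * dx


def triangular_pdf(z):
    """Closed-form result of the derivation: z on [0, 1], 2 - z on [1, 2], 0 elsewhere."""
    if 0.0 <= z <= 1.0:
        return z
    if 1.0 < z <= 2.0:
        return 2.0 - z
    return 0.0


if __name__ == "__main__":
    for z in (-0.5, 0.25, 0.5, 1.0, 1.5, 1.75, 2.5):
        approx = convolve_at(uniform_pdf, uniform_pdf, z)
        print(f"z = {z:5.2f}   numeric = {approx:.4f}   closed form = {triangular_pdf(z):.4f}")
```

Because `convolve_at` takes the two densities as arguments, the same function can be tried with other pairs of independent continuous densities, provided the integration window is widened to cover their support.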