{
    "id": "https://brandonrozek.com/blog/expectations-are-linear/",
    "url": "https://brandonrozek.com/blog/expectations-are-linear/",
    "title": "Expectations are Linear",
    "authors": [
        
            { "name": "Brandon Rozek" }
        
    ],
    "content_html": "\u003cblockquote\u003e\n\u003cp\u003eAs an example, he asked me, in more words, what the expected rank when flipping over the top card of a deck of cards was (A=1, J=11, Q=12,  K=13). This is easy to compute directly as 7. Then he asked me the expectation of the \u003cem\u003esum of the top two cards\u003c/em\u003e.\u003c/p\u003e\n\u003cp\u003e- \u003ca href=\"https://buttondown.com/jaffray/archive/expectation-and-copysets/\"\u003eFrom \u0026ldquo;Expectation and Copysets\u0026rdquo; by Justin Jaffray\u003c/a\u003e\u003c/p\u003e\u003c/blockquote\u003e\n\u003cp\u003eWhat does your intuition say the answer is? Justin continues by stating that computing this expectation is as easy as summing their individual expectations.\n$$\nE[X + Y] = E[X] + E[Y]\n$$\nIn other words, \u003cstrong\u003eexpectations are linear\u003c/strong\u003e. I recommend reading his entire blog post. It\u0026rsquo;s great and also talks about how this property is used in databases today. After a high-level explanation, he says:\u003c/p\u003e\n\u003cblockquote\u003e\n\u003cp\u003eThe fact that expectation is linear is easy to show if you just look at the definition, which we will not do here, but I trust you are capable of if you are interested and have not already seen it.\u003c/p\u003e\u003c/blockquote\u003e\n\u003cp\u003eIn this episode of \u003cem\u003eExercise for the Reader\u003c/em\u003e (\u003ca href=\"/blog/implications-prenex-normal-form/\"\u003elast episode\u003c/a\u003e), we\u0026rsquo;ll look at the definition and show why this property holds. This is true regardless of the underlying probability distribution and whether or not we\u0026rsquo;re sampling with replacement.\u003c/p\u003e\n\u003cp\u003eAs Justin stated, let\u0026rsquo;s start with the definition of expectation and then split the sum:\n$$\n\\begin{align*}\nE[X + Y] \u0026amp;= \\sum_{x \\in X} \\sum_{y \\in Y} (x + y) \\cdot P(X = x, Y=y) \\\\\n\u0026amp;= (\\sum_{x \\in X} \\sum_{y \\in Y} x \\cdot P(X = x, Y = y)) + (\\sum_{x \\in X} \\sum_{y \\in Y} y \\cdot P(X = x, Y = y))\n\\end{align*}\n$$\nNotice that the left-hand-side of the multiplication does not depend on both variables anymore. Also it doesn\u0026rsquo;t matter whether we do the summation over $X$ first or $Y$. Therefore, we can bring that variable out of the inner sum and simplify this to:\n$$\nE[X + Y] = (\\sum_{x \\in X} x \\sum_{y \\in Y} P(X = x, Y = y)) + (\\sum_{y \\in Y} y \\sum_{x \\in X} P(X = x, Y = y))\n$$\nWe can then perform \u003cem\u003emarginalization\u003c/em\u003e to substitute $\\sum_{y \\in Y} P(X = x, Y = y)$ with $P(X = x)$ and do the same for the right hand side of the sum.\n$$\n\\begin{align*}\nE[X + Y] \u0026amp;= (\\sum_{x \\in X}xP(X = x)) + (\\sum_{y \\in y}yP(Y=y))) \\\\\n\u0026amp;= E[x] + E[Y]\n\\end{align*}\n$$\u003c/p\u003e\n\u003chr\u003e\n\u003cp\u003eWhy can we marginalize? For me to show why, we need to peel back the curtain on the notation.\u003c/p\u003e\n\u003cp\u003eThe set $\\Omega$ contains the outcomes of all the events that we\u0026rsquo;re concerned about. So, if we are considering events $X$ and $Y$ with outcomes $x_i$ and $y_i$, respectively. Then, our event space $\\Omega$ is equal to $\\{ (x_i, y_i) \\mid x_i \\in X, y_i \\in Y\\}$.\u003c/p\u003e\n\u003cp\u003eTherefore when we say $X = x$, what we really mean is the set of outcomes where that is true. 
\u003chr\u003e\n\u003cp\u003eWhy can we marginalize? To show why, we need to peel back the curtain on the notation.\u003c/p\u003e\n\u003cp\u003eThe set $\\Omega$ contains all of the outcomes that we\u0026rsquo;re concerned about. If we are considering random variables $X$ and $Y$ with outcomes $x_i$ and $y_i$, respectively, then our sample space $\\Omega$ is equal to $\\{ (x_i, y_i) \\mid x_i \\in X, y_i \\in Y\\}$.\u003c/p\u003e\n\u003cp\u003eTherefore, when we say $X = x$, what we really mean is the set of outcomes where that is true. In mathematical terms, $\\{\\omega \\in \\Omega \\mid X(\\omega) = x\\}$.\u003c/p\u003e\n\u003cp\u003eNow, let\u0026rsquo;s show why $\\sum_{y \\in Y} P(X = x, Y = y) = P(X = x)$.\n$$\n\\sum_{y \\in Y}P(X = x, Y = y) = \\sum_{y \\in Y} P(\\{\\omega \\in \\Omega \\mid X(\\omega) = x\\} \\cap \\{\\omega \\in \\Omega \\mid Y(\\omega) = y\\})\n$$\nOne of the three Kolmogorov axioms of probability is \u003cstrong\u003ecountable additivity\u003c/strong\u003e: for a countable collection of pairwise disjoint events, the probability of their union is the sum of their probabilities. In our notation:\n$$\n\\sum_{x \\in A}P(X = x) = P(\\bigcup_{x \\in A}X = x)\n$$\nThe events $Y = y$ are pairwise disjoint for distinct values of $y$, and so are their intersections with $X = x$. Substituting that in and simplifying, we get:\n$$\n\\begin{align*}\n\\sum_{y \\in Y}P(X = x, Y = y) \u0026amp;= P(\\bigcup_{y \\in Y}(\\{\\omega \\in \\Omega \\mid X(\\omega) = x\\} \\cap \\{\\omega \\in \\Omega \\mid Y(\\omega) = y\\})) \\\\\n\u0026amp;= P(\\{\\omega \\in \\Omega \\mid X(\\omega) = x\\} \\cap \\bigcup_{y \\in Y}\\{\\omega \\in \\Omega \\mid Y(\\omega) = y\\})\n\\end{align*}\n$$\nNotice that the union on the right-hand side is just $\\Omega$, since every outcome assigns some value to $Y$. We can then simplify to:\n$$\n\\begin{align*}\n\\sum_{y \\in Y}P(X = x, Y = y) \u0026amp;= P(\\{\\omega \\in \\Omega \\mid X(\\omega) = x\\} \\cap \\Omega) \\\\\n\u0026amp;= P(\\{\\omega \\in \\Omega \\mid X(\\omega) = x\\}) \\\\\n\u0026amp;= P(X = x)\n\\end{align*}\n$$\nSince countable additivity is an axiom, we\u0026rsquo;ll stop our derivations there. See you next time.\u003c/p\u003e\n",
    "date_published": "2026.04.26",
    "tags": [],
    "_syndication": {
        "mastodon": {
            "enabled": false,
            "toot_id": null,
            "toot_text": ""
        },
        "medium": {
            "enabled": false,
            "post_id": null
        }
    }
}