Probability Theory in the Context of DFS
A random variable is a variable whose value is subject to variations due to chance (i.e. randomness, in a mathematical sense). As we’ll see shortly, random variables abound in daily fantasy sports.
A large part of the skill in DFS involves dealing with random variables. There isn’t room in this course for a complete discussion of probability theory, but there are certain parts a player absolutely must know to be successful.
H3. Discrete Random Variables
There are two types of random variables: discrete and continuous. A discrete random variable takes one of a countable, usually finite, set of values. For example, a roll of a pair of dice results in a total between 2 and 12.
A discrete random variable has a probability mass function, which specifies the probability of each possible outcome. For example, for the total X of a pair of fair dice, the probability mass function is

$$P(X = k) = \frac{6 - |k - 7|}{36}, \quad k = 2, 3, \ldots, 12.$$
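To make the dice example concrete, here is a minimal Python sketch that builds the probability mass function by counting all 36 equally likely rolls. It assumes two fair six-sided dice; the function name `two_dice_pmf` is illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def two_dice_pmf():
    """Return {total: probability} for the sum of two fair six-sided dice."""
    # Enumerate all 36 equally likely (die1, die2) outcomes and count each total.
    counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
    # Each outcome has probability 1/36, so a total's probability is count/36.
    return {total: Fraction(n, 36) for total, n in sorted(counts.items())}


pmf = two_dice_pmf()
for total, prob in pmf.items():
    print(total, prob)
```

The probabilities rise from 1/36 at a total of 2 to 1/6 at a total of 7, then fall back to 1/36 at 12, and they sum to 1.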
H3. Continuous Random Variables
A continuous random variable can take on any value in a range, usually a range of real numbers. For example, the height of an NBA player, measured in inches, is a continuous random variable.
A continuous random variable has a probability density function. For example, the familiar bell-shaped curve of the standard Gaussian (normal) distribution has the probability density function

$$f(x) = \frac{1}{\sqrt{2\pi}} \, e^{-x^2/2}.$$
Some random variables we see in daily fantasy sports:
- The number of fantasy points a player accrues in a game (continuous)
- The total fantasy points a lineup scores in a contest (continuous)
- The rank of a lineup among the entries in a contest (discrete)
- Whether the lineup cashed or not: 1 if it did, 0 if it didn’t (discrete)
H2. The Bernoulli and Binomial Distributions
The last entry in the list above — whether a lineup cashed or not — is an example of a Bernoulli distribution. A Bernoulli random variable has two possible outcomes, which in games we usually refer to as “win” and “lose”.
To make calculations easier, we’ll use “1” for win and “0” for lose. The probability of a win is usually denoted by the letter p. The probability of a loss is usually denoted by the letter q; p + q = 1 and q = 1 – p.
Bernoulli variables aren’t very interesting; we wouldn’t just enter one lineup in one contest and walk away forever. So we need a random variable that models how many times we cash over a number of contests. And that’s a binomial random variable.
A binomial random variable has an underlying Bernoulli random variable with parameters p and q. We ask the question, “If we enter N contests, what’s the probability that we win none, one, two, and so on up to N?” The answer is the probability mass function for the binomial.
If we know N and we know p, we can compute the probability of winning exactly k contests out of N tries. That probability is

$$P(k) = \binom{N}{k} \, p^k \, q^{N-k}$$

where the binomial coefficient (N choose k) is the number of combinations of N things taken k at a time. That’s interesting, but it doesn’t solve our problem. We know N (how many contests we entered). And we know k (how many we won). But we don’t know p, and we need p to calculate expected values.
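The binomial probability mass function can be sketched in a few lines of Python. This is a minimal illustration, assuming the contests are independent and each is won with the same probability p; the function name `binomial_pmf` and the example values N = 10, p = 0.5 are made up for the demonstration.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k wins in n independent contests, each won with probability p."""
    q = 1.0 - p  # probability of losing a single contest
    return comb(n, k) * p**k * q ** (n - k)


# Illustrative values only: 10 contests, each a coin flip.
n, p = 10, 0.5
probs = [binomial_pmf(k, n, p) for k in range(n + 1)]
for k, prob in enumerate(probs):
    print(f"P({k} wins out of {n}) = {prob:.4f}")
```

The probabilities for k = 0 through N cover every possible outcome, so they add up to 1.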
It turns out we can estimate p easily. The estimate of p is just the fraction of contests we won:

$$p_{\text{est}} = \frac{k}{N}.$$
So if I entered 100 50/50 contests and cashed 60 of them, the estimate of p is 0.6 and the estimate of q is 0.4.
H2. Confidence Interval For p
Before we move on to expectations, there’s one more tool we’ll need. It turns out that not only can we estimate p, we can compute a confidence interval for p.
We want to say, “there’s a 95% probability that the real value of p is between p_lower and p_upper”. As the Wikipedia article above notes, there are a number of options for doing this and all have certain limitations. For our purposes, the simplest one that we can copy and paste into a spreadsheet will do. In the equations, p_est is the estimate of p we computed above and q_est = 1 – p_est.
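As a minimal sketch, assuming the interval in question is the normal-approximation (Wald) interval, p_est ± 1.96 · sqrt(p_est · q_est / N), the calculation might look like this in Python. The function name `wald_interval`, the 1.96 constant, and the clipping to the range [0, 1] are illustrative choices, not taken from the course spreadsheets.

```python
from math import sqrt

Z_95 = 1.96  # approximate two-sided 95% critical value of the standard normal


def wald_interval(wins, contests, z=Z_95):
    """Return (p_est, p_lower, p_upper) using the normal-approximation interval.

    Assumption: this is the "simplest" interval the text refers to.
    """
    p_est = wins / contests
    q_est = 1.0 - p_est
    half_width = z * sqrt(p_est * q_est / contests)
    # Clip to [0, 1], since a probability cannot fall outside that range.
    p_lower = max(0.0, p_est - half_width)
    p_upper = min(1.0, p_est + half_width)
    return p_est, p_lower, p_upper


# The example from the text: 60 cashes in 100 contests.
p_est, p_lower, p_upper = wald_interval(60, 100)
print(f"p_est = {p_est:.3f}, 95% interval = [{p_lower:.3f}, {p_upper:.3f}]")
```

With 60 cashes in 100 contests, this interval runs from roughly 0.504 to roughly 0.696. The normal approximation is known to behave poorly for small N or for p close to 0 or 1.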

H3. Expectations
Now that we have an estimate and a confidence interval for p, we can estimate how much we expect to win or lose per dollar of entry fees. For a $1 50/50, we pay a dollar to enter. If we win, we get $1.80 back, so we win $0.80. If we lose, we lose the dollar. The estimated expectation per dollar is

$$EV_{\text{est}} = 0.80 \, p_{\text{est}} - 1.00 \, q_{\text{est}}.$$
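As a quick worked check, assuming the figures from the earlier example (60 cashes in 100 contests, so the estimate of p is 0.6 and the estimate of q is 0.4), the expectation is the win amount times the win probability minus the loss amount times the loss probability:

$$EV_{\text{est}} = 0.80 \times 0.6 - 1.00 \times 0.4 = 0.48 - 0.40 = 0.08.$$

That is, about 8 cents of expected profit per dollar of entry fees under those assumed figures.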
In general, if F is the entry fee in dollars and C is the cash paid for a win in dollars, then
- The winnings W per dollar is (C – F) / F
- The loss L per dollar is F / F = 1
- The win/loss ratio wlratio is W / L = (C – F) / F
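and the estimated expectation EV_est is

$$EV_{\text{est}} = \text{wlratio} \cdot p_{\text{est}} - q_{\text{est}}$$

with confidence interval

$$\left[\; \text{wlratio} \cdot p_{\text{lower}} - (1 - p_{\text{lower}}), \;\; \text{wlratio} \cdot p_{\text{upper}} - (1 - p_{\text{upper}}) \;\right].$$

In the spreadsheets, we’ll do this calculation for p_est, p_lower and p_upper, generating a 95 percent confidence interval for EV.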
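The same calculation can be sketched outside a spreadsheet. The Python below is a hedged illustration, not the course's spreadsheet formulas: it assumes the interval for p is the normal-approximation interval p ± 1.96 · sqrt(p · q / N), and the function names `ev_per_dollar` and `ev_with_interval` are hypothetical.

```python
from math import sqrt


def ev_per_dollar(p, entry_fee, payout):
    """Expected profit per dollar of entry fees for win probability p."""
    wl_ratio = (payout - entry_fee) / entry_fee  # winnings per dollar risked
    return wl_ratio * p - (1.0 - p)  # win term minus loss term


def ev_with_interval(wins, contests, entry_fee, payout, z=1.96):
    """Return (ev_lower, ev_est, ev_upper).

    Assumption: the interval for p is the normal-approximation interval.
    """
    p_est = wins / contests
    half_width = z * sqrt(p_est * (1.0 - p_est) / contests)
    p_lower, p_upper = p_est - half_width, p_est + half_width
    # EV increases with p, so the ends of the p interval map to the ends of the EV interval.
    return tuple(ev_per_dollar(p, entry_fee, payout) for p in (p_lower, p_est, p_upper))


# The example from the text: 60 cashes in 100 $1 50/50 contests that pay $1.80.
ev_lower, ev_est, ev_upper = ev_with_interval(60, 100, 1.00, 1.80)
print(f"EV per dollar: {ev_est:.3f} (95% interval {ev_lower:.3f} to {ev_upper:.3f})")
```

Under these assumptions the estimate is about 0.08 per dollar, with an interval from about -0.09 to about 0.25. Because the interval includes zero, 100 contests are not enough to rule out a break-even or slightly losing player.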