The problem is described in a report published earlier this year by the American Council of Trustees and Alumni. An excerpt of the really telling part is provided below (emphasis mine):

In late summer of 2015, ACTA commissioned the research firm GfK to survey recent American college graduates and the public at large about their understanding of our free institutions of government. Our questions were drawn from standard high school civics curricula. They emphasized the content of the U.S. Constitution and the basic workings of our government. A smaller number of questions also asked about prominent figures currently serving in the federal government.

The results were abysmal. For example:

- Only 20.6% of respondents could identify James Madison as the Father of the Constitution. More than 60% thought the answer was Thomas Jefferson—despite the fact that Jefferson, as U.S. ambassador to France, was not present during the Constitutional Convention.
- College graduates performed little better: Only 28.4% named Madison, and 59.2% chose Jefferson.
- How do Americans amend the Constitution? More than half of college graduates didn’t know. Almost 60% of college graduates failed to identify correctly a requirement for ratifying a constitutional amendment.
- We live in a dangerous world—but almost 40% of college graduates didn’t know that Congress has the power to declare war.
- College graduates were even confused about the term lengths of members of Congress. Almost half could not recognize that senators are elected to six-year terms and representatives are elected to two-year terms.
- Less than half of college graduates knew that presidential impeachments are tried before the U.S. Senate.
- And 9.6% of college graduates marked that Judith Sheindlin—“Judge Judy”—was on the Supreme Court!
**Many of the figures may actually understate how poorly our colleges are doing**, because older respondents performed significantly better than younger ones. For example, 98.2% of college graduates over the age of 65 knew that the president cannot establish taxes—but only 73.8% of college graduates aged 25-34 answered correctly. Most college graduates over age 65 knew how to amend the Constitution—76.7% answered correctly. But among college graduates aged 25-34, less than a third chose the right answer, and over half answered that the president must ratify an amendment, failing to comprehend how the division of powers among coequal branches protects citizens’ rights.

Notice that the questions in the survey were drawn from standard *high school* civics curricula. Kids should be *entering* college knowing this stuff, not graduating without knowing it.

Think about this the next time someone (particularly a millennial) tries to tell you that “educated people” think this or think that. The report excerpted below indicates that they don’t have the proper analytical skills to have an informed opinion (compared to most of their international peers or to previous generations), even the ones with the highest degrees.

Pay particular attention to the part that addresses “years of schooling” and “conferring of credentials and certificates.” Anyone who thinks that Sanders’s plan for “free college” would do anything to help the country or the economy is fooling himself.

Keep in mind that this is a report produced by ETS — the folks who do the SAT/GRE/etc. tests. I’ve included a few excerpts from their summary below:

Millennials may be on track to be our most educated generation ever, but they consistently score below many of their international peers in literacy, numeracy, and problem solving in technology-rich environments (PS-TRE).

… despite having the highest levels of educational attainment of any previous American generation, these young adults [those born after 1980 and between 16-34 years of age] on average demonstrate relatively weak skills in literacy, numeracy, and problem solving in technology-rich environments compared to their international peers. These findings hold true when looking at millennials overall, our best performing and most educated, those who are native born, and those from the highest socioeconomic background. Equally troubling is that these findings represent a decrease in literacy and numeracy skills for U.S. adults when compared with results from previous adult surveys.

The findings also offer a clear caution to anyone who believes that our policies around education should focus primarily on years of schooling or trusts that the conferring of credentials and certificates alone is enough. While it is true that, on average, the more years of schooling one completes, the more skills one acquires, this report suggests that far too many are graduating high school and completing postsecondary educational programs without receiving adequate skills.

- In literacy, U.S. millennials scored lower than 15 of the 22 participating countries. Only millennials in Spain and Italy had lower scores.
- In numeracy, U.S. millennials ranked last, along with Italy and Spain.
- Our best-educated millennials (those with a master’s or research degree) only scored higher than their peers in Ireland, Poland, and Spain.

Upon learning about Benford’s Law, my colleague decided to put it to the test. So he grabbed an ENDF file, `endf66a`, and pulled as many values as he could from it. (ENDF stands for “Evaluated Nuclear Data File.” It is a large database of nuclear data, such as cross-sections of various nuclides. For our purposes here, it is a large collection of real-world numerical data.) He collected the leading digit in the mantissa of each floating-point number in the file and examined the frequency of occurrence of each digit. The plot of the results that he produced is shown below.

After seeing his empirical verification of this law, I tried to explain why this law works in a couple of ways.

The first way is to consider the population of data, and here we’re talking about values that span multiple orders of magnitude. For example, if we were talking about lengths, the list could include values measured in centimeters, meters, and kilometers. Now for all of these units to be represented in our population we’d need the number of values expressed in centimeters to be roughly equivalent to the number of values expressed in kilometers. Naturally, such a population wouldn’t even come close to being distributed uniformly over the range of values—there are simply too many centimeters in a kilometer to sample from them in the same way that one would sample over centimeters in one meter. Mathematically, this means that, if \(f(x)\) is the probability density function (p.d.f.) of our population, then\[
\int_{10\,\text{cm}}^{1\,\text{m}} f(x)\,dx \approx
\int_{100\,\text{m}}^{1\,\text{km}} f(x)\,dx
\]which means that\[
\bar f_{(1\,\text{m})} \approx 1000\times\bar f_{(1\,\text{km})}
\]where \(\bar f_{(a)}\) is the average value of the p.d.f. over the neighborhood where \(x \approx a\).

Therefore, it makes sense to look at the value of \(f(x) \cdot x\) plotted versus \(x\) on a semi-log scale, keeping in mind that the probability of \(X\) (a random value drawn from the population) being between \(x_1\) and \(x_2\) is\[
P(x_1 < X < x_2) = \ln(10) \int_{\log_{10}(x_1)}^{\log_{10}(x_2)}
f(10^\xi)\cdot 10^\xi\,d\xi
\]That is, when \(f(x) \cdot x\) is plotted versus \(x\), with a logarithmic scale for \(x\), the area under the curve represents the relative probability of \(X\) being in a particular region. (\(\xi\) gives the linear distance along the logarithmic scale: \(\xi = \log_{10}x\).)

A (made-up) example of such a distribution is shown below.

Here the regions where the leading digit would be 1 are marked off. Compare this to the same figure in which the regions where the leading digit would be 5 are highlighted.

As we can see, the areas marked off in the first figure are significantly larger than the areas marked off in the second, which means that 1 is more likely than 5 to show up as the first digit of a number that is randomly drawn from this distribution, and this is the case for many collections of numbers that span multiple scales.

But perhaps someone objects to the p.d.f. that I used as an example. OK. Then consider this. To simplify the situation let’s consider only integers and let’s look at the range of integers from 0 to 9. If I have a population of these integers that is uniformly distributed, then any random variable that I produce is as likely to be one digit as another. But what happens if I double my range of possibilities by expanding to the right?

Now, I’m looking at the range of numbers from 0 to 19. Once again, if the distribution is uniform, I’m equally likely to draw any of the numbers, but now the probability of drawing a number with 1 as the leading digit has changed from 10% to 55%! Over half of the time I’m going to get a number that starts with 1.

This example is contrived, of course, but it readily generalizes. For example, consider what happens if I expand the range to extend from 0 to 49. Although the probability of getting a number that begins with a 1 is no longer 55%, it’s still larger than getting a number that begins with a 5, 6, 7, 8, or 9.

This phenomenon is an artifact of the way we *represent* real values, not the real values themselves. To see this, it is instructive to consider just one number—for example, the fine-structure constant:\[
\alpha \approx 7.297\times 10^{-3}
\]This is a physical constant. It’s dimensionless. There is no mathematical basis for this constant. As Richard Feynman once wrote,

It’s one of the greatest damn mysteries of physics: a magic number that comes to us with no understanding by man. You might say the “hand of God” wrote that number, and “we don’t know how He pushed his pencil.”

So this is about as arbitrary a number as one can find. It happens to begin with a 7, not a 1, but that’s just because we use a number system with ten digits. (Ignore, for the moment, that its reciprocal, \(\alpha^{-1} \approx 137.036\), an equally arbitrary number, *does* begin with a 1.) The number is what it is, regardless of how we write it. We can think of it as a single point on a line that represents all real numbers (the \(x\)-axis of the complex plane). What digits we use to write that number depend on where we lay down the grid lines that denote 1, 2, 3, and so on.

If the real values are scaled logarithmically, this line looks like the following: The point that is \(\alpha\) also is shown above and falls between the ticks for 7 and 8, resulting in a number that begins with 7.

But what happens when a different base (number of digits) is used to represent the same number? In octal (base 8, which is often used in computer science, because \(8 = 2^3\)), the real line looks like the following. (Keep in mind that now the octal “10” is really an “8” in decimal notation.) In this system of numbers, the numerical representation of \(\alpha\) begins with a 3.

If we double the number of digits to 16 (hexadecimal), we find the following: The leading digit is now 1.

It is important to note that the real line itself and all of the real values it represents, including \(\alpha\), haven’t changed. They are the same in all three diagrams. By changing the number of digits in our number system, we change only where the *grid lines* (tick marks) are located. All of these number systems agree on the location of 1 (and zero and infinity), but everything else changes. Below, I have tabulated the value of \(\alpha\) for number systems with 3 to 16 digits (base 3 to base 16):

| Base | \(\alpha\) |
|---|---|
| 3 | 1.210e-12 |
| 4 | 1.313e-10 |
| 5 | 4.240e-4 |
| 6 | 1.324e-3 |
| 7 | 2.334e-3 |
| 8 | 3.571e-3 |
| 9 | 5.278e-3 |
| 10 | 7.297e-3 |
| 11 | 9.793e-3 |
| 12 | 1.074e-2 |
| 13 | 1.305e-2 |
| 14 | 1.605e-2 |
| 15 | 1.996e-2 |
| 16 | 1.DE4e-2 |

(In each row, both the mantissa digits and the exponent are written in that row’s base; “e-12” in base 3, for example, means an exponent of \(-12_3 = -5_{10}\).)

In 8 out of the 14 number schemes (57%), the representation of \(\alpha\) begins with a 1. This is not surprising when you think about it, because although the locations of the grid lines change, the structure of the grid lines remains similar. The space along the (logarithmically scaled) real line where a 1 is the leading digit is always the largest space. Therefore, 1 is the most likely leading digit, regardless of the base.

For a truly arbitrary number, chosen without any restrictions, it is not difficult to predict that the likelihood of the digit \(n\) being the first digit is\[
P(n) = \log_b(n+1) - \log_b(n)
\]where \(b\) is the base of the number system being used. This can be deduced geometrically from the figures above.

As long as one uses a notation system for numbers consisting of a series of symbols in which each successive symbol in the series constitutes a value that is \(b\) times smaller than the symbol before it (where \(b\) is the number of symbols used), the symbol denoting the smallest (non-zero) value (in our case, 1) will be the most likely symbol to appear first in the series.

Since many “thirders” seem to be fond of relying on betting analogies to reason through the problem and explain their arguments, a useful substitute for the coin toss is a lottery. A typical lottery provides a very small chance of winning accompanied by a very large payoff (which is why lotteries are so popular). So here we shall examine what happens when Sleeping Beauty plays the lottery.

The events of the original Sleeping Beauty Problem occur as described before. That is, Beauty is put to sleep on Sunday and woken on Monday. She is then put to sleep with her memory erased and woken again on Tuesday *only* if a certain random event happens.

The difference between the two problems is that the coin toss in the original problem is replaced with a lottery. Before she goes to sleep on Sunday night, Beauty chooses a “lucky” set of lottery numbers. (Note: Each set of numbers constitutes one lottery ticket and one chance to win the lottery.) Then, sometime after she is asleep, the lottery’s winning numbers are drawn at random. If the numbers that Beauty chose match the winning numbers, she will be woken on Monday, have her memory erased, and woken again on Tuesday. If her numbers do not match the winning numbers, she will be woken only on Monday.

As in the original problem, the circumstances of her wakings are identical, so she cannot tell what day it is or the result of the lottery. The new question is the following: What should Beauty think about her probability of winning the lottery upon awakening?

Note that winning the lottery in this problem corresponds with the coin toss coming up tails in the original problem, and losing the lottery corresponds to the toss coming up heads.

To quantify the problem, assume that there are \(n\) possible sets of lottery numbers and each set has an equal chance of being drawn. Therefore, the probability of winning the lottery is 1 in \(n\). Typically, \(n\) is a large number—e.g., one million.

The following notation is used below to indicate the conditions of the experiment:\[
\begin{aligned}
W &{}= \text{Beauty picked the winning set of numbers}\\
L &{}= \text{Beauty did not pick the winning set of numbers}\\
D_1 &{}= \text{It is the first day (Monday)}\\
D_2 &{}= \text{It is the second day (Tuesday)}
\end{aligned}
\]

With the problem defined, it is useful to consider the typical approaches that are used by the “thirders” to tackle a problem such as this. To this purpose, I have divided the majority of the thirders into three categories:

**The Waking Thirders** – This group fixates on the datum that Sleeping Beauty wakes up

**The Vegas Thirders** – This group evaluates probabilities by turning them into some sort of gambling proposition

**The Monday-Morning Thirders** – This group fixates on what happens on Monday morning

Note: The fact that I have divided the “thirders” into three categories is not intended to imply that they are evenly divided into these categories. In fact, many “thirders” use, or at least explore, more than one of these approaches to this problem, so the process of categorization is somewhat nuanced.

Little changes in the analysis provided by the “waking thirders.” Once again, they observe three states that Beauty can find herself in when she wakes:

(1) Beauty lost the lottery and it’s Monday

(2) Beauty won the lottery and it’s Monday

(3) Beauty won the lottery and it’s Tuesday

Since it is impossible for her to distinguish between the three, they are all assumed to be equally likely:\[
P(L \cap D_1) = P(W \cap D_1) = P(W \cap D_2)
\]Because these are the only possibilities, their probabilities must sum to one. Therefore, each state has a probability of 1/3, and since two out of the three states are wins for Beauty, they conclude that Beauty should believe upon waking that she has a 2/3 chance of having won the lottery.

For the “Vegas” crowd to be able to analyze this situation, additional assumptions are necessary, because they are interested in the payoff. Their analysis requires that we know how much she won to determine how likely she was to win.

Therefore, let’s stipulate that the winner is paid according to the odds of winning. That is, the winner of a lottery with a one-in-a-million chance of winning is paid $1 million on a $1 lottery ticket. (The winnings will be taxed, of course, which is how the state will pull in its take, but that’s not relevant to this analysis.) Furthermore, we must assume that the possibility of two or more people selecting the same set of numbers is either so unlikely as to be negligible or that it is simply not allowed by the rules.

Beauty chooses her “lucky” numbers on Sunday night. Each time she wakes, she is given an opportunity to purchase a lottery ticket. To avoid the possibility of having to split the winnings, we can posit that the lotteries on Monday and Tuesday are separate lotteries, which use the same numbers that were drawn on Sunday. Therefore, it is possible for Beauty to win the entire lottery prize twice.

Note that if \(n = 2\), this problem reduces to the problem of the flipped coin, with Beauty betting on tails. Therefore, this lottery problem can be considered to be a generalization of the original Sleeping Beauty Problem, and the reasoning considered here is a generalization of the reasoning used by “thirders” for the original problem.

To make this problem concrete, a “Vegas thirder” would suggest something like a $1 lottery ticket for a $1 million prize with a 1-in-a-million chance of picking the winning number. Then, the “thirder” would consider the situation in which the “experiment” (lottery drawing) is repeated many, many times—say, one million times so that we can expect that Beauty picks the winning number once.

With the conditions in place, the totals can be compiled. If Beauty buys a lottery ticket every time she wakes, then she will have spent $1,000,001 on lottery tickets (because she would have bought an additional ticket on Tuesday the time that she picked the winning number), and she will have won $2,000,000. The “thirder” then calculates the odds of winning the lottery as\[
\begin{aligned}
\text{odds} &{}= (\text{amount won}):(\text{amount lost})\\
&{}= 2,000,000:999,999 \approx 2:1
\end{aligned}
\]This is then interpreted as 2:1 odds of picking the winning number or a probability of \(P(W) = 2/3\).

The answer given by the “Monday-morning thirders” is different from the answer given by the other two groups, because this group actually considers the probabilities of the random number generator (i.e., the coin toss in the original problem) in their arguments. They reason that, since Beauty must wake on Monday regardless of the result of the lottery, her chances of holding the winning number on Monday are\[
\begin{aligned}
P(W|D_1) &{} = \frac{1}{n}\\
P(L|D_1) &{} = \frac{n-1}{n}
\end{aligned}

\]They note that the probability of Beauty picking the wrong number is\[
P(L) = P(L|D_1)\cdot P(D_1) + P(L|D_2)\cdot P(D_2)
\]Since Beauty will not be woken on Tuesday (\(D_2\)) if she did not pick the winning number, \(P(L|D_2) = 0\). Therefore,\[
P(L) = P(L|D_1)\cdot P(D_1) = \frac{n-1}{n} P(D_1)

\]So the answer depends on the probability of the day being Monday. Since there are three indiscernible states in which Beauty can wake and two of these states occur on Monday, the “thirder” concludes that \(P(D_1) = 2/3\). Thus, the probabilities associated with Beauty picking the winning number are\[
\begin{aligned}
P(L) &{} = \frac{2n - 2}{3n}\\
P(W) &{} = 1 - P(L) = \frac{n + 2}{3n}
\end{aligned}
\]

In the limit that \(n\) becomes large,\[
\begin{aligned}
P(L) &{} \sim 2/3\\
P(W) &{} \sim 1/3
\end{aligned}
\]This is closer to the correct answer than the other two arguments, but it is still incorrect.

The correct set of probabilities is given below.\[
\begin{aligned}
P(W) &{}= \frac{1}{n}\\
P(L) &{}= \frac{n - 1}{n}\\
P(D_1|L) &{}= 1\\
P(D_2|L) &{}= 0\\
P(D_1|W) &{}= 1/2\\
P(D_2|W) &{}= 1/2\\
P(W \cap D_1) &{}= P(W) \cdot P(D_1|W) = \frac{1}{2n}\\
P(W \cap D_2) &{}= P(W) \cdot P(D_2|W) = \frac{1}{2n}\\
P(L \cap D_1) &{}= P(L) \cdot P(D_1|L) = \frac{n-1}{n}\\
P(L \cap D_2) &{}= P(L) \cdot P(D_2|L) = 0\\
P(D_1) &{}= P(W) \cdot P(D_1|W) + P(L) \cdot P(D_1|L) = \frac{2n-1}{2n}\\
P(D_2) &{}= P(W) \cdot P(D_2|W) + P(L) \cdot P(D_2|L) = \frac{1}{2n}\\
P(W|D_1) &{}= \frac{P(D_1|W) \cdot P(W)}{P(D_1)} = \frac{1}{2n - 1}\\
P(L|D_1) &{}= \frac{P(D_1|L) \cdot P(L)}{P(D_1)} = \frac{2n - 2}{2n - 1}\\
P(W|D_2) &{}= \frac{P(D_2|W) \cdot P(W)}{P(D_2)} = 1\\
P(L|D_2) &{}= \frac{P(D_2|L) \cdot P(L)}{P(D_2)} = 0
\end{aligned}
\]These results are consistent with the original problem, which corresponds to \(n = 2\), \(W = T\), and \(L = H\).

By changing the probabilities of the outcomes of the event that determines the number of times that Beauty wakes, the flaws in the reasoning commonly used by the “thirders” become obvious.

My advice to the “thirders” is the following: Try to find some drug that produces limited memory loss, then go buy a lottery ticket—the larger the payout the better. With the help of a friend, you can go to sleep on the night of the drawing confident that when you awake, you’ll have between a 33% and a 67% chance (depending on your reasoning) of being a lottery winner and becoming a new multimillionaire. Good luck!

Note: This entry uses some free clip art from openclipart.org and cliparts.co.


The Sleeping Beauty Problem is usually described as follows:

Beauty is going to be the subject of an experiment that will take place over three days. On Sunday, Beauty will be told the plan for the experiment, and then given a drug and put to sleep. The drug will cause her to sleep until Wednesday, but the experimenters plan to wake her up some number of times. To decide how many times to wake her up, they toss a coin. If the coin lands heads, they will wake her on Monday briefly, and then return her to sleep. If the coin lands tails, they will instead wake her briefly on both Monday and Tuesday. After they put her back to sleep on Monday, they will erase her memory, so that upon waking on Tuesday, her last memories will be having gone to sleep on Sunday. The circumstances of her wakings are identical, so she cannot tell from her environment what day it is or which way the coin landed. What should Beauty think about the probability of heads and tails upon awakening?

Alternatively, the question that is often asked is, “What is the probability that the coin came up heads?”

A popular, incorrect answer to this problem is that Beauty should believe that, because she woke up, the fair coin is twice as likely to have come up tails as to come up heads. Since this results in a probability of heads of 1/3, the people who argue this point of view are often called “thirders.”

In the most naive argument of this type that is put forward, the proponent points out that Beauty can wake in one of three states:

(1) The coin toss was heads and it’s Monday

(2) The coin toss was tails and it’s Monday

(3) The coin toss was tails and it’s Tuesday

The proponent then (implicitly or explicitly) assumes that all three states are equally likely to occur—i.e., each occurs with the same probability as the others. Since the coin came up heads in only one of the three states, he or she concludes that the probability of heads is 1/3.

More complicated, but equally wrong, arguments are addressed below.

Here we will not only provide the correct answer to the problem (which is trivial) but also rigorously explore the estimates of probabilities that Beauty should assign to other parts of the experiment based on what information she has available.

In the interest of conserving space, let’s introduce the following notation for conditions of the experiment:\[
\begin{aligned}
H &{}= \text{The coin came up heads}\\
T &{}= \text{The coin came up tails}\\
D_1 &{}= \text{It is the first day (Monday)}\\
D_2 &{}= \text{It is the second day (Tuesday)}
\end{aligned}
\]The notation used above for Monday and Tuesday readily generalizes to the scenario proposed by Nick Bostrom whereby Beauty wakes for many, many (e.g., a million) days if the coin comes up tails (more below).

The assumption that the coin is fair implies\[
\begin{aligned}
P(H) &{}= 1/2\\
P(T) &{}= 1/2
\end{aligned}
\]If the coin comes up heads, Beauty will be woken on Monday, but not on Tuesday. This means\[
\begin{aligned}
P(D_1|H) &{}= 1\\
P(D_2|H) &{}= 0
\end{aligned}
\]If the coin comes up tails, Beauty will be woken on both Monday *and* Tuesday. She does not know which day it is when she wakes, but she knows that if tails shows up, she will be woken on *both* days, so in this situation the frequency of Monday and Tuesday awakenings will be the same. Therefore, they have an equal probability:\[
\begin{aligned}
P(D_1|T) &{}= 1/2\\
P(D_2|T) &{}= 1/2
\end{aligned}
\]

From that, the grid of all four possibilities falls out:\[
\begin{alignedat}{2}
P(H \cap D_1) &{}= P(H) \cdot P(D_1|H) & & {}= 1/2\\
P(H \cap D_2) &{}= P(H) \cdot P(D_2|H) & & {}= 0\\
P(T \cap D_1) &{}= P(T) \cdot P(D_1|T) & & {}= 1/4\\
P(T \cap D_2) &{}= P(T) \cdot P(D_2|T) & & {}= 1/4
\end{alignedat}\]

Let’s get one thing out of the way. Beauty is *always* going to wake up:\[
P(\text{wake}) = P(\text{wake}|H) = P(\text{wake}|T) =
P(\text{wake}|D_1) = P(\text{wake}|D_2) = 1
\]So the fact that she wakes and is “in the moment” doesn’t give her or us any more information than she started with. Since Beauty has not been provided with any new information, the probability of the coin coming up heads remains \(P(H) = 1/2\), which is the *right* answer to the problem.

With that out of the way, we can use the information above to determine the probability of Beauty waking on a Monday,\[
P(D_1) = P(H) \cdot P(D_1|H) + P(T) \cdot P(D_1|T) = 3/4
\]and the probability of it being a Tuesday,\[
P(D_2) = P(H) \cdot P(D_2|H) + P(T) \cdot P(D_2|T) = 1/4
\]If Beauty is informed that she has risen on a Monday, then she can reevaluate the probability of the coin landing on heads or tails by applying Bayes’s Theorem:\[
\begin{alignedat}{2}
P(H|D_1) &{}= \frac{P(D_1|H) \cdot P(H)}{P(D_1)} & &{}= 2/3\\
P(T|D_1) &{}= \frac{P(D_1|T) \cdot P(T)}{P(D_1)} & &{}= 1/3
\end{alignedat}
\]Similarly, if she is told that it is a Tuesday, then she knows for certain that the coin toss was tails:\[
\begin{alignedat}{2}
P(H|D_2) &{}= \frac{P(D_2|H) \cdot P(H)}{P(D_2)} & &{}= 0\\
P(T|D_2) &{}= \frac{P(D_2|T) \cdot P(T)}{P(D_2)} & &{}= 1
\end{alignedat}
\]

Although the flaws in logic employed by the “thirders” encompass a wide range of fallacies, a couple of mistakes appear quite frequently. Here we’ll examine why they are wrong.

The mistake that most “thirders,” including the original guy who published the first article on this, make is to reason that, since she *must* wake up on Monday—regardless of the coin flip—we must consider her situation at *that* time. The usual explanation that is given is that the experimenters might not have flipped the coin until *after* Beauty had been woken on Monday, because the coin doesn’t affect that part of the experiment. The coin toss determines only whether Beauty has her memory erased and is put back to bed. Nothing in the experiment requires that the coin is flipped before Beauty is woken and asked about the result of the coin toss.

They then argue that it is ridiculous, in the situation where Beauty is asked on Monday about the probability of heads on a coin toss that has not yet happened, for her to say anything but 1/2. In one sense, they have a valid point. If Beauty is woken and told that it is Monday *and* that the coin toss hasn’t yet occurred, then she should reasonably conclude that the probability of the (future) coin toss coming up heads is 1/2. However, this is *not* the question that is being asked in the problem.

By focusing on the result of a coin toss on Monday—regardless of whether the coin has actually been tossed or not—the “thirders” have eliminated the possibility that the coin toss could have resulted in Beauty waking on Tuesday. In their line of reasoning, it’s simply not possible. Therefore, the coin toss that they are considering is *not* the coin toss that Beauty is asked about in the problem.

Following up on this mistake by applying it to probability calculations, they then (erroneously) reason that\[
\begin{aligned}
P(H|D_1) &{}= 1/2 \\
P(T|D_1) &{}= 1/2
\end{aligned}\rlap{\qquad\text{(wrong)}}
\]and work backward through Bayes’s Theorem to conclude the wrong answer:\[
P(H) = 1/3 \rlap{\qquad\text{(wrong)}}
\]

The source of their error is that they have changed the definition of the problem to a different problem in which the day is always Monday:\[
\begin{aligned}
P(D_1) &{}= 1 \\
P(D_2) &{}= 0
\end{aligned}
\]Naturally, the coin toss can have no effect on this:\[
\begin{alignedat}{2}
P(D_1|H) &{}= P(D_1|T) & &{}= 1 \\
P(D_2|H) &{}= P(D_2|T) & &{}= 0
\end{alignedat}
\]It is important to note that the claim made above about the probability of heads and tails on Monday, which is wrong in the original problem, is *correct* in this new, different problem,\[
\begin{alignedat}{2}
P(H|D_1) &{}= \frac{P(D_1|H) \cdot P(H)}{P(D_1)} & &{}= 1/2\\
P(T|D_1) &{}= \frac{P(D_1|T) \cdot P(T)}{P(D_1)} & &{}= 1/2
\end{alignedat}
\]but *only* if \(P(H) = P(T) = 1/2\), so even their modified problem doesn’t demonstrate their claim that the probability of heads is 1/3. When the assumptions are explicitly stated and the math is done correctly, it merely reinforces the proof that the probability is 1/2.

The other mistake that is commonly encountered in “thirder” reasoning is to treat the Sleeping Beauty problem as a betting proposition. There are several variants as to how the wager is presented—whether only one bet or multiple bets are considered, whether odds are given on the bet, the amount of the bet, etc.—but they all suffer from the same critical error. There is even confusion over what constitutes a “bet.” While it is generally agreed that Beauty bets on the result of the toss of the coin *every* time she wakes up, some people count the number of bets as the number of times the coin is tossed, \(n_{\text{t}}\), which can be controlled, since it’s the number of times Beauty goes to sleep on Sunday. Meanwhile, other people consider every waking to be a separate bet, even though this number is randomly determined by the coin tosses. The best that can be said is that the number of times Beauty wakes, \(N_{\text{w}}\), is expected to be\[
E(N_{\text{w}}) = n_{\text{t}} P(H) + 2 n_{\text{t}} P(T) = 3 n_{\text{t}} / 2
\]That is, Beauty can be expected to wake about 50% more times than the number of times that she was put to sleep on Sunday, but the exact number in each series of experiments depends on the outcome of the coin tosses.

To avoid this ambiguity, let’s define a “toss” to be every time the coin is tossed (and Beauty is put to sleep on Sunday) and a “bet” to be every time she has a chance to win or lose money (which is every time that Beauty wakes and is interviewed). Beauty has two options. She can either decide what she will bet when she wakes up each morning, or decide what to bet when she goes to sleep on Sunday and consistently place the same bet each time she wakes. Either way, her expected winnings will be the same. She can decide always to bet heads, always to bet tails, or to bet heads randomly some specified fraction of the time, \(f_{\text{H}}\).

Her expected winnings are\[
E(W) = f_{\text{H}}\cdot\bigl(P(H) - 2P(T)\bigr) +
(1 - f_{\text{H}})\cdot\bigl(2P(T) - P(H)\bigr)
\]which becomes\[
E(W) = (2f_{\text{H}} - 1)\cdot\bigl(P(H) - 2P(T)\bigr)
\]So if Beauty always bets heads (\(f_{\text{H}} = 1\)),\[
E(W) = -1/2
\]and if Beauty always bets tails (\(f_{\text{H}} = 0\)),\[
E(W) = 1/2
\]If she bets heads only part of the time (\(0 < f_{\text{H}} < 1\)), then the expected winnings vary linearly with \(f_{\text{H}}\) between these two extremes.

So, obviously, Beauty’s best strategy to maximize her winnings is always to bet tails. For every two dollars she offers to wager, she is expected to win one, which leads the “thirders” mistakenly to conclude that the odds of the coin coming up heads are 1 to 2, or that heads has a probability of \(P(H) = 1/3\).

However, this is not the case. If she seals in her bet (or bets, in case of tails) on Sunday by always betting either heads or tails, she is making a wager with equally likely results on the toss, but an *uneven* payoff on the result of the toss. The payoff table looks like the following:

| | Bet Heads | Bet Tails |
|---|---|---|
| Result Heads | Win 1 | Lose 1 |
| Result Tails | Lose 2 | Win 2 |

The “thirders” have confused betting odds with statistical odds. They are not the same thing. Therefore, a wise Beauty can make easy money by always betting tails and taking the two-to-one payoff, but she should not delude herself that the probabilities of the coin toss are anything but even.

Philosopher Nick Bostrom has proposed a scenario—possibly inspired by the movie *Groundhog Day*—in which Beauty, instead of being woken only on Monday and Tuesday when the coin comes up tails, is woken many, many times—perhaps as many as a *million* times. (Never mind that a person who lives to be 100 years old will have less than 37,000 days on which to wake up. This is a philosopher talking.)

The idea behind this argument is that the set of possible days on which Beauty can wake up is overwhelmed by all of these additional days, making it “more absurd” for someone to argue that the *one* day associated with heads, out of all of these possible days, should occur so frequently. This is nonsense, of course.

The coin will come up heads 1/2 of the time, and the experiment will end on Monday every time this happens. The other half of the time, Beauty will be woken on one of the days, each of which is no more likely than the rest without additional information. Thus, if there are \(n_{\text{T}}\) such days, then\[
P(D_i|T) = 1/n_{\text{T}}, \quad i = 1, 2, \ldots, n_{\text{T}}
\]So the problem generalizes as follows:\[
\begin{alignedat}{2}
P(D_1|H) &{}= 1\\
P(D_i|H) &{}= 0, \quad & i &{}= 2, 3, \dots, n_{\text{T}} \\
P(D_i|T) &{}= 1/n_{\text{T}}, \quad & i &{}= 1, 2, \dots, n_{\text{T}}
\end{alignedat}
\]and\[
\begin{alignedat}{3}
P(H \cap D_1) &{}= P(H) \cdot P(D_1|H) & & {}= 1/2 & & \\
P(H \cap D_i) &{}= P(H) \cdot P(D_i|H) & & {}= 0, & \quad i &{}= 2, 3, \dots, n_{\text{T}} \\
P(T \cap D_i) &{}= P(T) \cdot P(D_i|T) & & {}= \frac{1}{2n_{\text{T}}}, & \quad i &{}= 1, 2, \dots, n_{\text{T}}
\end{alignedat}\]

There is nothing in this more general problem to imply anything other than \(P(H) = 1/2\).

Finally, there are a few people who claim that the framing of the problem is too ambiguous to evaluate the probability of the coin landing on heads. I don’t quite understand how they have confused themselves enough to reach this conclusion, but they are clearly wrong, as has been demonstrated above.

The correct answer to the Sleeping Beauty Problem is that Beauty has received no additional information upon waking. Therefore the probability that the coin has or will come up heads is 1/2, which is exactly the same as it was when she went to sleep on Sunday. A fair coin is, after all, a fair coin.

In the spirit of Thanksgiving, and wanting to take a break from reading, thinking, and writing about nuclear energy, I’m offering my tried and true cooking instructions for something completely different.

By Sunday night you will be stuffed, fed up with turkey, literally and figuratively. Instead of food fit for pilgrims, try food invented in the wide-open West—chili. Cook this dish on Saturday. Eat it on Sunday.

These instructions take about an hour to complete. This chili has more vegetables and beans than some people might like, but we’re all trying to eat healthy. Although the name of this dish has the word “nuclear” in it, it isn’t that hot on the Scoville scale. If you want some other choices for nuclear chili, there are lots of recipes on Google.

The beer adds sweetness to the vegetables, as does the brandy, and is good for cooking generally. The beer is an essential ingredient, but only one can goes into the pot, so you’ll still have five cans or bottles left to share with friends. There’s always that.

However, I recommend Negra Modelo for drinking with this dish and Budweiser or any American pilsner for cooking it. Alternatives for drinking include local western favorites, Moose Drool or Black Butte Porter, and regional amber ales Alaskan Amber, Fat Tire, or Anchor Steam. Do not cook with “light” beer. It’s a very bad idea.

Scoville, Idaho, is the destination for Union Pacific rail freight for the Idaho National Laboratory (INL) way out on the Arco desert.

There is no town by that name, but legend has it that way back in the ’50s and ’60s, when the place was called the National Reactor Testing Station, back-shift workers on cold winter nights relished the lure of hot chili, hence the use of the name “Scoville” in shipping information.

Overnight temperatures on the Arco desert can plunge to −20 °F or lower. Unfortunately, the guys running the reactors couldn’t drink beer, but they did have coffee. It’s still that way today.

This is “second-day chili.” That means after you make it, you put it in the unheated garage to cool, then refrigerate it, and reheat it the next day. The flavors will have had time to mix with the ingredients, and on a cold Idaho night, what you need to warm the body and the soul is a bowl of hot chili with fresh, warm cornbread on the side.

If you make a double portion, you can serve it for dinner over a hot Idaho baked potato with salad. Enjoy.

- 1 lb chopped or ground beef (15% fat)
- 1 large onion
- 1 sweet red pepper
- 1 sweet green pepper
- 10–12 medium-size mushrooms
- 1 can pinto beans (plain, no “sauce”)
- 1 can black beans
- 1 can chopped tomatoes
- 1 can small, white “shoepeg” corn
- 1 12-oz can beer
- 1 cup hot beef broth
- 1 tablespoon cooking brandy
- 2 tablespoons finely chopped jalapeño peppers
- 2–4 tablespoons red chili powder
- 1/2 teaspoon black pepper
- 1/2 teaspoon salt
- 1/2 teaspoon coarse powdered garlic
- 1/2 teaspoon cumin

- Chop the vegetables into small pieces and brown them in cooking oil. Add 1 tablespoon of cooking brandy near the end. Drain thoroughly.
- Brown the meat separately and drain the fat.
- Combine all the ingredients in a large pot. Be sure to drain the beans and tomatoes before adding. Simmer slowly for 60–120 min. Stir occasionally.
- Set aside and refrigerate when cool.
- Reheat the next day. Serve with cornbread. Garnish with shredded sharp cheddar cheese.

Feeds 2–4 adults.

Although I recommend stirring the drink and serving it straight up, when I first enjoyed this drink, while living at 4 East Range in the “Academical Village,” I drank it on the rocks for practical reasons. Those venerable rooms at UVa don’t even provide a toilet, much less a kitchen or wet bar, so I had only a compact, dorm-sized fridge to provide the limited amount of ice that I had available for making drinks.

The recipe is as follows:

- 8 parts gin
- 1 part French vermouth
- a dash of orange bitters
- one tomolive

Mixed (preferred):

In a mixing glass half filled with cracked ice, add a dash of orange bitters. Then add the vermouth and gin. Stir until cold. Strain into a chilled cocktail glass. Add the tomolive and serve.

Shaken:

Fill a chilled cocktail glass with ice and vermouth. In a shaker filled with cracked ice, add the dash of orange bitters and the gin. Shake until cold. Dump the ice and vermouth from the glass, and strain the shaker into the glass. Add the tomolive and serve.

It is understandable that an activist would try to exploit this statistic. Most obviously, it emphasizes that recent global temperatures have been relatively high (where “high” corresponds to an increase of less than one degree Celsius over a 100-year period). The real purpose of repeating this factoid, however, is that it confuses and charms the numerically unsophisticated, leading them to assume that such a concentration of unprecedented, elevated temperatures in recent times is highly unlikely—unless some underlying cause is responsible.

This is quite misleading, however. In fact, it is not difficult to demonstrate that a relatively simple statistical model can account for this result, without requiring any bias toward warming.

The basic mistake that most people make when hearing about the recent number of warmest years is to assume that, unless there is some underlying trend causing the warming, all years in the record are the same, and each is equally likely to be one of the warmest. To have 10 out of 12 at the top of a list of 134 temperatures intuitively seems extremely unlikely, and indeed it is. Under these (incorrect) assumptions, the probability of getting 10 of the hottest years in the last 12 years is given by the hypergeometric distribution:\[ P(X=10)=\frac{{12\choose 10}{122\choose 2}}{{134\choose 12}} \]where the parentheses denote the binomial coefficient\[ {n\choose k}\equiv\frac{n!}{k!(n-k)!} \]The probability of 10 or more hottest years is 1 in 86 billion—very unlikely indeed!

The problem, however, is that each year’s temperature is not an independent random variable. This is clear from looking at the temperature record. For example, consider the annual global land and ocean surface temperature anomalies, which have been obtained from NOAA’s National Climatic Data Center. This temperature record is shown below. The points give the yearly global average temperature. The smooth curve is the five-year running average.

Although the temperatures in the record bounce up and down from year to year, each year’s temperature clearly depends on the temperature of the years immediately before it. That is, this time series is *autocorrelated*. In fact, the series clearly resembles a random walk.

This is even more clear when considering the change in temperature anomaly from year to year, which is shown below.

Unlike the temperature record itself, the differences show no obvious trend. They appear to be distributed as (mutually independent) white noise. A Q-Q plot, shown below, appears to indicate that these differences are normally distributed, and a Shapiro-Wilk test for normality is consistent with this (\(p=0.43\)).

Thus, the differences constitute a normally distributed set with a sample mean of \(0.0058\) and sample variance of \(0.0096\). The slightly positive mean corresponds to the upward trend that is observed in the temperature record. Over the entire record, it results in a \(0.0058\) °C/year average increase in temperature. Nevertheless, it should be kept in mind that this does not imply that the mean of the underlying distribution is this value or even that it is positive. The estimate for the population mean, including the standard error, is \(\bar x\pm s/\sqrt{n}\), where \(\bar x\) is the sample mean, \(s\) is the sample standard deviation, and \(n\) is the size of the sample. In this case, the estimate for the mean of the underlying distribution is \(0.0058\pm 0.0085\), a range that includes zero at the 1-\(\sigma\) level. Thus, we cannot reasonably conclude from these data that there is a positive bias in the random walk process describing this temperature record.

So let’s consider the possibility that the temperature record is a result of a random walk process with no bias. The statistical model that describes this process is\[ T_n = T_{n-1} + \epsilon^{(n)} \]where \(T_n\) is the temperature anomaly of the \(n\)-th year in the series, and \(\epsilon^{(1)},\ldots,\epsilon^{(n)}\) is a series of normally distributed random variables with \(\epsilon\sim\mathcal N(0,\sigma^2)\). Although this model can be studied analytically, it lends itself very well to Monte Carlo simulation. For the results discussed here, I use a model in which the variance of the random term is \(\sigma^2 = 0.0096\), the sample variance of the NOAA temperature anomaly data.

When a model of red noise—the random walk described above—is used to generate a simulated 134-year temperature record, 10 or more of the last 12 years in the record end up being the “hottest” years on record about 7% of the time. Thus, while it is still unlikely with this model to have such a high concentration of warm years in recent times, it is far more likely (by nine orders of magnitude) than what the intuitive, but naive, assumption of completely independent yearly temperatures indicates.

It is common, when applying the techniques of statistical inference, to judge a set of empirical data by calculating the probability that such a set of data could appear purely by chance under the assumption that no real relationship exists. The conventional point at which something is considered, not necessarily true, but merely “significant” and worthy of further investigation, is if the probability of chance producing the observed result is less than 5% (i.e., a one-in-twenty chance). By these standards, the claim that 10 of last 12 years are the hottest on record doesn’t qualify as statistically interesting.

A more reasonable stochastic model, however, is one that combines red noise and white noise. In other words, it is a random walk with some additional “error” added to the final result. The statistical model can be described as\[ T_n = a r_n + b \epsilon_{\text w}^{(n)} \]where \(\epsilon_{\text w}\sim\mathcal N(0,\sigma^2)\) is the white-noise random variable, \(r_n\) is the red-noise term,\[ r_n = r_{n-1} + \epsilon_{\text r}^{(n)} \]with \(\epsilon_{\text r}\sim\mathcal N(0,\sigma^2)\) as the red-noise random variable, and \(a\) and \(b\) are the coefficients that determine the relative importance of each term. With this model, the annual change in temperature is\[ \Delta T_n = T_n - T_{n-1} = a \epsilon_{\text r}^{(n)} + b\bigl[\epsilon_{\text w}^{(n)} - \epsilon_{\text w}^{(n-1)}\bigr] \]Since this is a sum of three mutually independent, normally distributed random variables, \(\Delta T_n\sim\mathcal N\bigl(0,(a^2+2b^2)\sigma^2\bigr)\), and so, for the variance of these temperature differences to be \(\sigma^2\), the coefficients \(a\) and \(b\) must satisfy the following relation:\[ a^2 + 2b^2 = 1 \]

Numerical experiments indicate that the combination of red and white noise that is most likely to produce a shape that is similar to NOAA’s temperature anomaly record has a value of \(a\) that is between \(0.2\) and \(0.4\). For example, not very many tries were required to produce the following series (shown in red), which was generated with \(a=0.4\):

This series of randomly generated points is rather similar to the temperature series, which is shown in black.

Above, I have demonstrated that claims such as “10 out of the last 12 years are the warmest on record” are not very impressive. Such a situation has a reasonable probability of resulting from a simple, unbiased random walk. Therefore, although such claims serve as a reminder that recent years have been warmer than the slightly less recent past, they indicate almost nothing about the recent *trends* in global temperatures or what is to be expected in the future. An unbiased random walk is equally as likely to trend down as it is to trend up, regardless of what it has done over the past 134 steps.

Naturally, it is quite possible that some steady upward trend does exist in the temperature record, biasing the random walk toward higher temperatures. Such a trend would be consistent with the observed temperature record and would make results like these more likely in the statistical models. Nor would it be surprising, since it is well known that temperatures in the modern era have been steadily trending upward from a minimum that occurred sometime in the seventeenth century—an era commonly referred to as the “little ice age.” (Whether this phenomenon was regional or global is a matter of debate.) In fact, there could be several naturally occurring cyclic trends affecting the record, but there is no way to tell definitively from the series itself.

One thing is certain, however. A dozen or so warm years in recent memory is very weak evidence of a deterministic warming trend. Without additional supporting evidence to say otherwise, such results could simply be the luck of the draw.

(Note: The R code used to generate the results discussed above is available for the reader’s amusement and edification.)
