Nearly everyone has a poor grasp of probability, including many professional game creators (and I very much include myself in this). I talked about procedurally generated content in games with some other designers tonight, and obviously randomness factors into that. And of course, chance in general plays a big part in many games. So rather than something more abstract, or about some game I've been playing lately, this is going to be a brief primer on the mathematics of probability. It's more that writing it will hopefully lodge it more firmly in my grey matter, but if anyone else gets something out of this, that's great too.
The difficulty in really understanding probability is compounded by at least two factors. One, pure randomness almost never exists in nature, so having an event so controlled and discrete as to have the probability of its outcomes even be calculable is tremendously rare. Even devices we manufacture to be random turn out to be quite biased (e.g. the rounded corners of Warhammer-style dice make them almost twice as likely to roll a 1).
Two, humans are really good at pattern matching. Honestly, too good. Our brains are wired to seek out patterns, even when they aren't there. Appropriately enough, one manifestation of this is called the Gambler's Fallacy. Basically this describes the belief that after a "streak" of one particular outcome, another outcome is more likely. So if you've been tossing a fair coin and it's come up heads twenty times in a row, tails is "due" to come up. But of course the coin has no statefulness or memory of past events. The likelihood of tails versus the 21st heads is the same as it always was, 50/50.
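The coin's lack of memory is easy to verify empirically. Here's a quick simulation (a sketch of mine, not anything from the post): it watches for runs of heads and tallies what the very next flip does. I use runs of five rather than twenty so the streaks occur often enough to count.

```python
import random

random.seed(42)

# After a streak of heads, is tails "due"? Look at the flip immediately
# following each run of five consecutive heads.
STREAK = 5
heads_run = 0
after_streak = {"H": 0, "T": 0}

for _ in range(1_000_000):
    flip = random.choice("HT")
    if heads_run >= STREAK:     # the previous flips were a heads streak
        after_streak[flip] += 1
    heads_run = heads_run + 1 if flip == "H" else 0

total = after_streak["H"] + after_streak["T"]
print(f"P(tails after {STREAK} heads) ~ {after_streak['T'] / total:.3f}")
```

Tails comes up about half the time after a streak, exactly as it does anywhere else.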
The reasoning behind most probabilistic math is actually quite simple. It's simply a matter of enumerating all the possible outcomes and either counting the ones you want or discarding the ones you don't. Using the perennial example then, what's the probability of rolling a six on a single d6? Obviously, 1/6. What's the probability of rolling two sixes in a row? It's just the likelihood of the first event (1/6) being followed by the second event (also 1/6): 1/6 * 1/6, or 1/36.
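That "enumerate everything and count" approach can be done literally in a few lines of code, which is a nice sanity check for small cases like this one:

```python
from itertools import product

# Enumerate all 36 ordered outcomes of rolling a d6 twice,
# then count the outcomes we want.
outcomes = list(product(range(1, 7), repeat=2))
double_sixes = [o for o in outcomes if o == (6, 6)]

print(len(double_sixes), "/", len(outcomes))  # 1 / 36
```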
Other possibilities are slightly more complicated to explain, so I'll just mention one more common case and leave the rest to one of countless Internet explanations better diagrammed than this. Often we care about the likelihood of some event occurring at least once in some set number of tries. So what's the probability of at least one six in two die rolls? Well, the probability of not getting a six on the first roll is 5/6, and the probability of that being followed by another non-six is also 5/6. Multiplying these together gives the probability of no sixes occurring, so subtracting that from 1 gives the probability of at least one six. Probability = 1 - (5/6 * 5/6), or about 30.6%.
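The complement trick and the brute-force count had better agree, and they do. A quick check, using exact fractions to avoid any rounding doubts:

```python
from fractions import Fraction
from itertools import product

# Direct count: two-roll outcomes containing at least one six.
outcomes = list(product(range(1, 7), repeat=2))
hits = [o for o in outcomes if 6 in o]
by_counting = Fraction(len(hits), len(outcomes))

# Complement rule: 1 - P(no six on either roll).
by_complement = 1 - Fraction(5, 6) ** 2

assert by_counting == by_complement == Fraction(11, 36)
print(float(by_counting))  # about 0.306, not 1/3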
Note that it is not 1/6 * 2, which would be 33.3%. That seems like a small difference, but imagine this relatively mundane situation. Creating some RPG, our wayward designer sets up a quest where the player must kill Greyfang Gnolls to retrieve a Stolen Senatorial Standard. Any gnoll has a 1/3 chance of dropping the standard. On average, then, it should take three kills, so thinking it would be safe to add "twice that many" gnolls, our designer puts 6 gnolls in the area, expecting almost all players to have the standard by the time the sixth gnoll is dead. But what's the actual probability of the standard dropping within six gnoll kills?
As above, the probability of the standard not dropping from any one gnoll is 2/3. Raise that to the sixth power (one factor per gnoll) and subtract it from 1: p(standard drop) = 1 - ((2/3) ^ 6) = ~91.2%. That seems good, except imagine our designer's game is successful, netting one million players. Of them, about 88,000 will kill all six gnolls and never receive the standard. To them, this will almost certainly be indistinguishable from a bug, or at best, a misleading quest description.
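The arithmetic for the gnoll quest is short enough to show in full:

```python
# Chance the Stolen Senatorial Standard drops within six gnoll kills,
# each kill being an independent 1/3 drop chance.
p_drop = 1 - (2 / 3) ** 6
print(f"{p_drop:.1%}")                 # 91.2%

# Scale to a million players: how many never see the standard at all?
unlucky = round(1_000_000 * (2 / 3) ** 6)
print(unlucky)                          # 87791 players out of luck
```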
That is exactly the danger of probability. Something with a high probability seems like "it should occur," except for some number of people, it won't. Setting an event to a very low probability still means some fraction of players will experience it. We want probability to behave as if we're setting bounds on the likelihood of something occurring, but that's not how straight random events work. They're stateless (again, see the Gambler's Fallacy), which means every particular sequence of outcomes is exactly as likely as any other. You're as likely to get 20 heads in a row as you are 20 tails.
This is nothing new either. Old Dungeons & Dragons random loot tables would often include one or two absurdly overpowered items in the far upper reaches of the chart, acting as if putting the "Ring of Three Wishes" at 1/1000 means it's rare. For someone rolling on that table once, getting the Ring of Three Wishes is just as likely as any other single result on the chart. So one out of a thousand gaming groups would have to deal with an extremely unbalancing item. Given that hundreds of thousands of people might be rolling on the table dozens of times, it will occur far more often than you might think just looking at the "1/1000" likelihood.
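To put some illustrative numbers on that (these group and roll counts are my own assumptions, not from any actual D&D data), the same at-least-once calculation from earlier shows how quickly "1/1000" stops being rare:

```python
# Hypothetical: 100,000 gaming groups each roll on the table two dozen times.
groups, rolls_per_group, p = 100_000, 24, 1 / 1000

# Chance any single group sees the Ring at least once across its rolls:
p_group = 1 - (1 - p) ** rolls_per_group
print(f"per-group chance: {p_group:.1%}")   # about 2.4%

# Expected number of groups stuck balancing around a Ring of Three Wishes:
print(round(groups * p_group))              # a couple thousand groups
```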
Even some very, very smart digital game developers have made this mistake. Team Fortress 2 had its random loot system revamped earlier this year, as they detail here. The key line: "Previously, we rolled randomly at intervals to see if you got an item drop. Now we roll to determine when your next item drop will occur." They changed their algorithm to provide what they really wanted in the first place, item drops to occur randomly within some bounds.
On DeathSpank, we initially made the same mistake. Loot drops were simply a flat random chance, but that meant a player was as likely to kill 50 greems and get a loot drop from every one as they were to kill 50 greems and never get a single piece of loot. (This isn't quite true; there were other factors in determining when to drop loot, but it was still a very real problem.)
The solution I proposed and implemented was to change the model of loot drops from a roll of the dice ("boxcars means you get loot") to drawing from a deck of cards ("ace of spades means you get loot"). The difference is that the cards don't go back into the deck after they're drawn. You don't know when you're going to get the ace of spades, but sometime in the next 52 draws it's guaranteed. The soonest you'll see loot twice is back-to-back (the last card of one deck, then the first card after the shuffle), and even then that pair is bracketed by a 51-card gulf on each side. In the inverse situation, the longest you'll go between loot drops is 102 cards.
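Here's a minimal sketch of the two models side by side. This is my illustration of the deck-of-cards idea, not the actual DeathSpank code; card 0 stands in for the ace of spades, and the gap statistics show why the deck model feels so much better:

```python
import random

random.seed(7)

def flat_gaps(n_draws, p=1 / 52):
    """Kills between loot drops under a flat per-kill probability."""
    gaps, since = [], 0
    for _ in range(n_draws):
        since += 1
        if random.random() < p:
            gaps.append(since)
            since = 0
    return gaps

def deck_gaps(n_draws):
    """Same, but drawing without replacement from a reshuffled 52-card deck."""
    gaps, since, deck = [], 0, []
    for _ in range(n_draws):
        if not deck:                 # deck exhausted: shuffle a fresh one
            deck = list(range(52))
            random.shuffle(deck)
        since += 1
        if deck.pop() == 0:          # card 0 stands in for the ace of spades
            gaps.append(since)
            since = 0
    return gaps

flat = flat_gaps(500_000)
deck = deck_gaps(500_000)
for name, gaps in (("flat", flat), ("deck", deck)):
    print(f"{name}: mean gap {sum(gaps) / len(gaps):.1f}, "
          f"min {min(gaps)}, max {max(gaps)}")
```

Both models average one drop per 52 kills, but the flat model's worst-case drought runs to hundreds of kills, while the deck model's is hard-capped at 103.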
Again, this is a gross simplification, but it's basically the change we (and by the sound of it, Valve) made to make our random distribution of loot match the player's expectations. And that's really the heart of it: only on the rarest of occasions do we actually want true randomness. Possessing at least a basic understanding of probability is an important skill for almost anyone interested in creating games. Used correctly, probability can be a fantastic tool, but used incorrectly, it can create some seriously unsatisfying experiences. Even with something as mundane as dice, things are not always what they seem. Many suddenly penniless Las Vegas visitors can attest to that.