
Predicting the World Series using Python

By Rathan Haran, October 29, 2009 7:45 am

Last week, I started learning Python through a peer-to-peer learning session set up through nextNY.  The material we’ve covered so far has made programming easy to wrap our heads around, and the cooperative learning environment has been awesome.  I’m looking forward to being a Python ninja* pretty soon.

With four and a half chapters of Python at my disposal, I wanted to put my skills to the test.  Since I’m a huge baseball fan, I thought I’d try my hand at simulating who would lose the World Series this year, a pillow-fight match-up between the New York Yankees and the Philadelphia Phillies.

The first thing to do was to crunch the numbers.  Crunching the numbers means exactly that: figuring out the probabilities of events occurring over a seven-game series.  I incorporated things like Ryan Howard’s immense strikeout rate, Derek Jeter’s incredible lack of range at shortstop, and Brad Lidge’s ninth-inning ERA.  I also made sure to incorporate correlations, or how closely the variables are related to one another.  Funnily enough, the highest correlation I found was between having a runner on first base with fewer than two outs from the seventh inning onwards and A-Rod weakly grounding into a double play.  Numbers never lie.

Now this got me a pretty good picture of who would lose the World Series, but I hadn’t taken into consideration the qualitative variables, the intangibles, the “Cole Hamels is a playoff pitcher” and the “Mariano is unhittable in the World Series” bullshit.  These are usually the ‘statistics’ that overzealous fans throw out (with no meaningful data except their distorted memories) in defense of a player’s immortality.

The classic intangible lies on the shoulders of the Yankee captain, Derek Jeter, a ball player who seems to find himself in the right place at the right time in the postseason.  Yankee fans constantly spout his ‘greatness’ and refuse to admit that he was horribly out of position on the Jeremy Giambi play at the plate, or that he doesn’t even register as having the highest batting average in a World Series (that designation goes to Billy Hatcher, who hit a sickening .750 for the Reds in 1990 in 12 ABs).  Heck, Jeter doesn’t even deserve the nickname “Mr. November” for his play in the 2001 World Series.  He had 1 HR, 1 RBI, and 2 runs scored in November, numbers that were almost matched by a pitcher for the Arizona Diamondbacks (1 RBI and 2 runs scored).  Oh, and that pitcher also won two potentially series-ending games in two days that November with a 2.22 ERA, .96 WHIP, and 8 Ks in 8.1 innings.  Derek Jeter, I’d like you to meet the real “Mr. November,” Randy Johnson.

Okay, so I wrote my little Python program to capture all of this.  The stats, the pseudo-stats, the Phillie Phanatic’s rants, and the countless times we’ll hear “26 World Series rings.”  With so many probabilities and interactions, this program chugged along for two days, and finally, yesterday before the first pitch, I got the result:  ValueError: Let’s Go Mets.

*Looking forward to the day when ninja is no longer used in start-up world employment searches and reverts to its original awesomeness of stealthy nighttime assassin.

Blackjack, Basic Strategy, Battle of Wits – Part III

By Rathan Haran, August 5, 2009 12:28 pm

Have you ever been at a blackjack table and accidentally hit a hard 14 with the dealer showing a 5 while playing basic strategy?  Replay it in your mind: you bust on the King, the dealer makes his 21 on a 6, and the entire table gives you the death stare and curses your firstborn, all while mumbling under their breath, “Never hit on a hard 14 with the dealer showing a 5, idiot. That King was the dealer’s bust card. We all would have won.”  Tough room.

First things first, those people have no idea what they are talking about.  There is no such thing as “that was the dealer’s bust card.”  The deck doesn’t know whether the dealer or the player is hitting or staying, and the cards don’t change because of how someone plays their hand.  The probabilities that guide basic strategy haven’t been altered because someone did not make the optimal play, and theoretically the dealer still has the same likelihood of busting.  (In practice, though, since the deck has a fixed number of cards, the distribution of the remaining cards changes the underlying probabilities of basic strategy.  Card counting attempts to exploit this by identifying moments when the remaining deck happens to hold a large number of 10-value cards.)
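To put a number on that parenthetical, here is a minimal Python sketch of how the share of 10-value cards left in a shoe moves as cards are dealt.  The function name ten_value_fraction and its parameters are made up for illustration, and it only tracks deck composition; it is not a counting system.

```python
def ten_value_fraction(num_decks, tens_seen=0, others_seen=0):
    """Fraction of the undealt cards that are worth 10 (10, J, Q, K).

    Hypothetical helper for illustration: each 52-card deck holds
    16 ten-value cards and 36 other cards.
    """
    tens_left = 16 * num_decks - tens_seen
    others_left = 36 * num_decks - others_seen
    if tens_left < 0 or others_left < 0:
        raise ValueError("more cards seen than the shoe contains")
    if tens_left + others_left == 0:
        raise ValueError("no cards left in the shoe")
    return tens_left / (tens_left + others_left)


if __name__ == "__main__":
    # A fresh 6-deck shoe versus the same shoe after 40 low cards
    # and 10 ten-value cards have already been dealt.
    print(ten_value_fraction(6))
    print(ten_value_fraction(6, tens_seen=10, others_seen=40))
```

A fresh shoe always starts at 16/52 ten-value cards; once more low cards than tens have come out, the fraction creeps above that, which is the situation a counter is waiting for.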

The important thing to remember here is that basic strategy gives you the probabilistically best play given that the deck has a RANDOM DISTRIBUTION OF CARDS.  That means that if the deck is not random, basic strategy might not be the optimal play.  So what everyone should consider before baptizing themselves in the holy waters of basic strategy is what it takes to make a deck random (and who controls what it takes to make a deck random).

To make a single deck random, the deck must be riffle shuffled about 7 times.  Since suit doesn’t matter in blackjack, and K, Q, J, and 10 hold the same value, you actually need fewer shuffles, about 4, to make a single deck random for blackjack purposes.  Most casino blackjack tables play with 6 or 8 decks at once, which are shuffled together and played from a dealer’s shoe.  In order to randomize a shoe of 8 decks, it takes about 12 riffle shuffles.  Does your casino shuffle a shoe 12 times?  Probably not.  Most casinos shuffle a shoe 4 times, and that has some interesting implications when an entire table is playing basic strategy.
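A rough way to see why a handful of riffles falls short is to simulate one.  The sketch below assumes the Gilbert-Shannon-Reeds model of a riffle shuffle (cut the deck roughly in half, then drop cards from each half with probability proportional to how many cards that half still holds), which is the usual mathematical stand-in for a human riffle.  The function names are my own, cards are labeled 0 to 51 rather than by blackjack value, and a real casino shuffle procedure will differ.

```python
import random


def riffle_shuffle(deck, rng):
    """One riffle under the Gilbert-Shannon-Reeds model (an assumption)."""
    # Cut point: one fair coin flip per card, so the cut lands near the middle.
    cut = sum(rng.random() < 0.5 for _ in deck)
    left, right = deck[:cut], deck[cut:]
    shuffled = []
    i = j = 0
    while i < len(left) or j < len(right):
        left_remaining = len(left) - i
        right_remaining = len(right) - j
        # Drop the next card from a half with probability proportional
        # to the number of cards that half still holds.
        if rng.random() < left_remaining / (left_remaining + right_remaining):
            shuffled.append(left[i])
            i += 1
        else:
            shuffled.append(right[j])
            j += 1
    return shuffled


def rising_sequences(deck):
    """Count maximal runs of consecutive labels that appear in order."""
    position = {card: index for index, card in enumerate(deck)}
    breaks = sum(position[c + 1] < position[c] for c in range(len(deck) - 1))
    return 1 + breaks


if __name__ == "__main__":
    rng = random.Random(2009)
    deck = list(range(52))
    for shuffle_count in range(1, 8):
        deck = riffle_shuffle(deck, rng)
        print(shuffle_count, rising_sequences(deck))
```

Each riffle can at most double the number of rising sequences, so four riffles leave at most 16 of them, while a thoroughly mixed 52-card deck averages about 26.5.  That leftover order is the kind of structure the rest of this post is about.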

So let’s take a look at what happens to a shoe when the entire table is playing basic strategy.  The first thing is that anyone who has a strong hand on their first two cards (17+) is instructed to stay, and their cards remain on the table until the deal is over.  Players with weak hands play out their hands, and if they bust, their cards are removed from the table and placed in the discard shoe.  This begins to create layers of cards in the discard shoe: clusters of low cards go in first, followed by clusters of high cards that were left on the table.  Since most casinos do not shuffle the shoe enough times, these layers loosely survive in the new shoe, and are further propagated when the entire table plays basic strategy (some people attempt to exploit this by using a technique called cluster counting).

Clustering of cards creates decks that are not random, which breaks one of the critical assumptions that basic strategy is built on.  This creates opportunities for dealers to win or push more hands than basic strategy predicts.  During a high cluster deal, a dealer is likely to have high cards to push, or even beat, the table’s “strong” 19s and 20s.  In the case where low card clusters are being dealt, a dealer will likely have a low up-card, a situation where basic strategy dictates hitting only until about 14.  The thing is that since it’s a low cluster deck, the dealer has a better chance to make a hand!  The player also has a better chance to make a hand, but basic strategy actually advises them not to try.  INCONCEIVABLE!
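The claim that a dealer makes more hands out of a low cluster can be checked with a small Python sketch.  It rests on loud assumptions that are mine, not the casino’s: cards are drawn with replacement from fixed weights (an “infinite shoe”), the dealer stands on all 17s including soft 17, and the low cluster is modeled by simply doubling the weight of 2 through 6.  The names NEUTRAL_WEIGHTS, LOW_CLUSTER_WEIGHTS, and dealer_bust_probability are made up for illustration.

```python
from functools import lru_cache

# Card values: 1 = ace, 2-9 = pip cards, 10 = any ten-value card (10, J, Q, K).
NEUTRAL_WEIGHTS = {**{value: 1 for value in range(1, 10)}, 10: 4}
# Hypothetical low cluster: 2 through 6 are twice as common as usual.
LOW_CLUSTER_WEIGHTS = {**NEUTRAL_WEIGHTS, **{value: 2 for value in range(2, 7)}}


def dealer_bust_probability(up_card, weights):
    """Exact bust chance for a dealer up-card, drawing with replacement.

    Assumes the dealer stands on every 17 or better, soft 17 included.
    """
    total_weight = sum(weights.values())
    probabilities = {card: w / total_weight for card, w in weights.items()}

    @lru_cache(maxsize=None)
    def bust(hard_total, has_ace):
        # hard_total counts every ace as 1; one ace may be promoted to 11.
        if hard_total > 21:
            return 1.0
        best_total = hard_total + 10 if has_ace and hard_total + 10 <= 21 else hard_total
        if best_total >= 17:
            return 0.0  # the dealer stands without busting
        return sum(
            p * bust(hard_total + card, has_ace or card == 1)
            for card, p in probabilities.items()
        )

    return bust(up_card, up_card == 1)


if __name__ == "__main__":
    for label, weights in (("neutral", NEUTRAL_WEIGHTS), ("low cluster", LOW_CLUSTER_WEIGHTS)):
        print(label, round(dealer_bust_probability(5, weights), 3))
```

Under these assumptions a dealer showing a 5 busts a little over 40% of the time from a neutral shoe and noticeably less often from the low-heavy one.  It says nothing about whether hitting is the better play for the player, only that the dealer’s bust chance, which basic strategy leans on when it tells you to stand, has shrunk.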

Basic strategy is still by far the best way to reduce the house odds, but since decks are not completely random, there is certainly room for improvement in game play.  For example, in high cluster deck situations, it may be worthwhile to split face cards, while in low cluster situations, taking another card to try to make a better hand may be your best bet.  Playing this way may add a little bit more excitement to the rule-based approach of basic strategy, as you’d be trying to exploit the rest of the table playing the basic strategy system.  And if it pisses anyone off at your table, just turn to them and say “You fell victim to one of the classic blunders!  The most famous is never get involved in a land war in Asia, but only slightly less well-known is this: never play basic strategy against a dealer when the deck isn’t random!”  I’d get a kick out of that if I heard it at a blackjack table.
