Long Tall Sally: What are my chances?

Recently @calumvs remarked that, on an iPod with 2000 songs, he heard Long Tall Sally by the Beatles (LTSB), followed by Long Tall Sally by Little Richard (LTSR).

What, he mused, are the odds of that happening – the “Long Tall Sally Calumdrum”?

This is a more interesting problem than it first appears; the answer is very sensitive to small changes in the way you ask the question, and to apparently small changes in the circumstances.

I’ll start with the easiest first: how are the two tracks selected?

If you were to select one track at random, then the chance of it being LTSB is 1 / 2,000. To make the pair, you then need to draw LTSR, again with a chance of 1 / 2,000.

So, with two totally independent draws, your odds are

(1 / 2,000) x (1 / 2,000) = 1 / 4,000,000

Easy.

But the chances are a bit better than that if we don’t care about the order. We can pick either at the outset, giving us a 2 / 2,000 = 1 / 1,000 chance of getting the first one, but then we absolutely have to get the second one correct, for total odds of

(1 / 1,000) x (1 / 2,000) = 1 / 2,000,000
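
For anyone who wants to check the arithmetic for these two independent-draw cases, here is a minimal sketch using exact fractions. The variable names (N, ordered, either_order) are my own, chosen purely for illustration.

```python
from fractions import Fraction

N = 2000  # tracks on the iPod

# Two independent draws in a fixed order: LTSB first, then LTSR.
ordered = Fraction(1, N) * Fraction(1, N)

# Either version first (2 chances in N), then exactly the other one.
either_order = Fraction(2, N) * Fraction(1, N)

print(ordered)       # 1/4000000
print(either_order)  # 1/2000000
```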

And this seems justified in the overall context of the question; I dare say the feisty Scottish curler would be just as surprised to hear LTSR followed by LTSB.

So that’s the easiest case dispensed with. But it ignores something we all “know” about MP3 players; they don’t really do random.

By this, I don’t mean that their random number generation is dodgy (although it probably is), or that some clever “genius” algorithm is in play to generate interesting playlists (and in fact I’m assuming it isn’t – otherwise all bets are off).

What they almost all do is to create a shuffled playlist, and then proceed sequentially through that playlist; they don’t create a huge playlist and then skip around it at random. And these are two very different behaviours.

How does it affect the odds? Well again, “it depends” on the question you pose.

Simplest case first again. You create a random playlist of all 2,000 tracks. (Bear in mind that typically you don’t consciously do this. You usually ask for “Shuffle All” or something similar.)

Now, what are the odds that you’ll immediately hear LTSB followed by LTSR, or LTSR followed by LTSB?

For the first track, the odds are 2 / 2,000 = 1 / 1,000. But for the second track, we only have 1,999 eligible tracks left, reducing the odds for the second track from 1 / 2,000 to 1 / 1,999. This gives you an overall probability of

(1 / 1,000) x (1 / 1,999) = 1 / 1,999,000.

So it gets a bit less improbable, but not much.

And we can do much better; this is the point where the exact circumstances become very important to the question.

The artificial example above assumes that we want to know the odds of these two tracks following one another straight out of the trap on a fresh shuffle. But this isn’t usually how these things are noticed; you just hear two songs that apparently have a link when you’re listening to music at some point.

If you’re in the habit of sticking your MP3 player on “Shuffle All”, as you work your way through the track list there are multiple opportunities for the significant LTSB and LTSR pair to crop up, shortening the odds again.

At the start, with no songs played, the odds are 1 / 1,999,000 that the magic pair turn up immediately. But as long as neither individual song turns up, the odds of the pair turning up get better and better and better as time goes on.

As an illustrative example, if neither has been played and you’ve burnt through 1,500 songs on your playlist, effectively you’re left with a new, shorter playlist of 500 songs. With 500 songs left, the odds of the next two songs fitting the remarkable circumstance are

(2 / 500) x (1 / 499) = 1 / 124,750

A dramatic improvement.

Get to 50 songs without either appearing, and the chance is

(2 / 50) x (1 / 49) = 1 / 1,225.
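
The same sum applies at any point in the shuffled list, so it can be wrapped up in one small function. This is only a sketch: the function name next_two_odds is mine, and it assumes both versions are still among the unplayed tracks.

```python
from fractions import Fraction

def next_two_odds(remaining: int) -> Fraction:
    """Chance that the next two tracks are the magic pair, in either order,
    assuming both are still among the `remaining` unplayed tracks."""
    return Fraction(2, remaining) * Fraction(1, remaining - 1)

for remaining in (2000, 500, 50):
    print(remaining, next_two_odds(remaining))
# 2000 1/1999000
# 500 1/124750
# 50 1/1225
```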

So there are many chances for the magic pairing to occur as the playlist is processed, and you aren’t generally keeping track of where you are in the list until it runs out. None of which prevents you experiencing the same sense of surprise if it does pop up.

How can the chances of the pair cropping up at some point in the run be quantified?

Imagine we have a playlist N tracks in length. The number of possible orderings of those tracks is therefore N!. So for 5 tracks, there are 120 orderings, for 10 tracks there are 3,628,800, for 20 tracks there are 2,432,902,008,176,640,000. I could go on, but 2,000 tracks results in a number I don’t want to type, and you don’t want to read or check.
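
If you do want to check those factorials without typing them, a few lines of Python will do it. This is just an illustrative sketch; for 2,000 tracks it counts the digits rather than printing the number (the name digits_2000 is mine).

```python
import math

# Number of possible orderings for small playlists.
for n in (5, 10, 20):
    print(n, f"{math.factorial(n):,}")
# 5 120
# 10 3,628,800
# 20 2,432,902,008,176,640,000

# 2000! is far too long to read, so count its decimal digits instead.
# math.log10 accepts arbitrarily large integers.
digits_2000 = math.floor(math.log10(math.factorial(2000))) + 1
print(digits_2000)  # 5736
```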

Somewhere in amongst that lot is a subset of orderings that contain the two tracks side-by-side. And it’s the ratio of the size of that subset to the total number of orderings that gives you the odds.

For the magic to happen, the two tracks must be adjacent to one another, and there are (N – 1) positions where this can occur. For each of these positions, we have no interest in which tracks are played before or afterwards; we don’t care how the remaining (N – 2) tracks are arranged.

So the number of orderings that contain the magic pair in one particular order is (N – 1) * (N – 2)! – the number of possible two-slot positions multiplied by the number of ways of arranging the other tracks around them. And as we don’t care which way round the magic pair appears, we can double that number to give 2 * (N – 1) * (N – 2)!

That leads to a formula to express this ratio:

(2 * (N – 1) * (N – 2)!) / N!

and by a few algebraic manipulations (courtesy of http://www.wolframalpha.com/ if I’m honest) this can be further simplified to:

2 / N
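
For the record, the manipulation is short enough to do by hand. The only facts needed are that (N – 1) * (N – 2)! = (N – 1)! and that N! = N * (N – 1)!. Written out as a worked equation:

```latex
\frac{2\,(N-1)\,(N-2)!}{N!}
  = \frac{2\,(N-1)!}{N!}
  = \frac{2\,(N-1)!}{N\,(N-1)!}
  = \frac{2}{N}
```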

Hurrah. None of those terrifyingly large factorial expressions involved.
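
If you’d rather not take the algebra on trust, the result can be checked by brute force for small playlists: list every possible ordering and count those where the two tracks sit side-by-side. A minimal sketch, with tracks 0 and 1 standing in for the two Sallys; the function name adjacent_fraction is mine, and this is only feasible for small N because the number of orderings grows as N!.

```python
from fractions import Fraction
from itertools import permutations

def adjacent_fraction(n: int) -> Fraction:
    """Fraction of all orderings of n tracks in which tracks 0 and 1
    are next to each other, in either order."""
    hits = total = 0
    for order in permutations(range(n)):
        total += 1
        if abs(order.index(0) - order.index(1)) == 1:
            hits += 1
    return Fraction(hits, total)

# Compare the brute-force count against 2 / N for small playlists.
for n in range(2, 9):
    result = adjacent_fraction(n)
    assert result == Fraction(2, n)
    print(n, result)
```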

So to return to the actual example; before the playlist started the chance of @calumvs hearing the two magic tracks sequentially was 2 / 2,000 = 1 / 1,000.

Still a long-shot, but nowhere near the 1 / 1,999,000 odds we started out with.

This is a massive improvement in the odds, but it isn’t the biggest one available if we just reframe the question slightly.

The remarkable circumstance in this case is that the two tracks were different versions of the same song. But the level of, er, remarkability-ness is really a subjective judgement. In a playlist of 2,000 songs, how many pairs of songs would seem remarkable in some way if they were played side-by-side?

If I hear the first two songs that were played at my wedding reception, is that unusual? Probably only to me.

How about:

Vincent and Killing Me Softly with His Song?

Flowers in the Rain and Massachusetts?

Eyes Without a Face and The Night Has 1000 Eyes?

Fly Me to the Moon and Moon River?

All lumberingly obvious examples, and there’ll be hundreds of other more subtle connections, or connections that are contingent on a particular interest or specialism on the listener’s part.

Your subjective judgement of what constitutes significance in a link will always trump the apparent odds.

Given that, there is only a slim chance of getting through a playlist of 2,000 tracks without noticing at least one apparently bizarre coincidence.
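
To put a rough number on that, here is a sketch under a loud assumption: that your library contains some number of “remarkable” pairs, which I have simply made up for illustration. Each such pair is adjacent with probability 2 / N, and expectations add, so the expected number of coincidences in one full shuffle is (number of remarkable pairs) x 2 / N. The chance of at least one is then estimated with a Poisson-style approximation, which is a rule of thumb rather than an exact result. Both function names are mine.

```python
import math
from fractions import Fraction

def expected_coincidences(n_tracks: int, remarkable_pairs: int) -> Fraction:
    """Expected number of remarkable pairs that end up adjacent in one full
    shuffle: each pair is adjacent with probability 2 / N."""
    return remarkable_pairs * Fraction(2, n_tracks)

def chance_of_at_least_one(n_tracks: int, remarkable_pairs: int) -> float:
    """Rough Poisson-style approximation, not an exact probability."""
    mean = float(expected_coincidences(n_tracks, remarkable_pairs))
    return 1 - math.exp(-mean)

# Hypothetical counts of remarkable pairs in a 2,000-track library.
for pairs in (100, 1000, 5000):
    print(pairs,
          expected_coincidences(2000, pairs),
          round(chance_of_at_least_one(2000, pairs), 3))
# 100 1/10 0.095
# 1000 1 0.632
# 5000 5 0.993
```

For scale, 2,000 tracks can be paired up in 1,999,000 different ways, so even 5,000 remarkable pairs would be only about a quarter of one percent of all possible pairs.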