# Matching socks

Isn’t it annoying when trying to pair up a pile of socks, just how many single socks build up before you finally start finding matches for them?

So, how many single socks might we expect to have at various times in our pairing exercise?

We need first to ask this question in a more precise manner. There are two related but separate versions of this question.

*But before we start, some ground rules…*

*Just don’t get me started on odd socks – finding the planet where they go is another much more complicated problem, which I’ll leave to astrobiologists and ufologists. So, here we’re thinking only about sock piles with no odd socks. *

*We will also ignore our natural ability to easily see the two bright orange socks in the pile and pick them out as a pair straight away. That is, we assume each sock is chosen from the pile completely at random, then checked to see if it matches any other previously selected single sock.*

Suppose we have *N* *pairs* of socks. After choosing * s* socks, let the number of unpaired socks at any point be

**. We want to know the average value of**

*u***for any given value of**

*u***.**

*s*As the value of ** u** obviously depends on the value of

**, we say that**

*s**is a function of*

**u****, and so (formally speaking) we want a formula for the function**

*s**.*

**u(s)**A function * u(s)* should at best provide an exact value, but at least give us a good approximation.

We can then ask…

### Question 1

*After pulling s socks out of the pile, on average how many of these would be unpaired?*

We note that

- before we start pairing, there are no unpaired socks. That is, when
*s = 0*, then*u = 0*. - the first sock chosen must be unpaired. That is, when
, then*s = 1**u = 1*. - after all the
*2N*socks have been paired, there are again no unpaired socks. That is, when, then*s = 2N*again.*u = 0*

There are many mathematical functions which follow this pattern, starting small, increasing in the middle to some maximum value, then decreasing again. Let’s try the simplest function following this pattern – a quadratic.

A general quadratic may be written as

*u(s) = as ^{2} + bs + c*

with *a*, *b* and *c* being (as yet undetermined) constants.

Substituting the combinations of *s* and *u* as above – ** (s, u) = (0, 0)**,

**and**

*(1, 1)***respectively – into the above equation, we get**

*(2N, 0)**0 = c*

** 1 = a + b** + 0

** 0 = a(2N)^{2} + b(2N)** + 0

=> b = -2aN=> b = -2aN

from which

*a = -1/(2N – 1)*

*b = 2N/(2N – 1)*

So we then have

*u(s) = s(2N-s)/(2N-1)*

This function has a maximum average^{*} value for * u* when

**s=N => u = N ^{2}/(2N-1)**

This latter * u* is the average (also called the expected) number of unpaired socks after exactly half the socks have been sorted. (Note that this is

*not*the expected maximum number of unpaired socks for the entire sock sorting exercise – see Question 2 below.)

In fact, these formulas give the exact values for the average number of unpaired items and the maximum value of this average. This may come as a surprise, but formulas are often no more complicated than they need to be.

In any case, try it with a pack of cards, treating as pairs those cards with the same value and colour e.g. *7 spades and 7 clubs* or *Q diamonds and Q hearts*. Your count of unpaired cards should, on average, follow reasonably close to the following graph:

It is amazing that we can develop an exact formula for what might seem a very complicated scenario, simply by using three almost trivial data points, together with basic assumptions about simplicity and symmetry – even if the formula is for an average value rather than an exact value.

* Note that we said “*maximum average*“. We will deal with “*average maximum*” in an update to this page.