Personal Blogs

Solution To 'Dermatoglyphics' Question

Saturday, 31 Aug 2019, 15:28

Visible to anyone in the world

The question at

https://learn1.open.ac.uk/mod/oublog/viewpost.php?post=221531

asked for the average, i.e. the mean, number of letters that would be left unchanged in position if the letters of 'Dermatoglyphics' are scrambled at random. The answer is 1 and surprisingly, to me anyway, the answer would be exactly the same for any other word with no repeated letters, whatever its length. Even though I know why this is it still amazes me, it seems to go against intution.

We can try this out with short word.

1 letter - 'a'

There is 1 arrangement, with 1 letter in the right place.

So the average is 1.

2 letters - 'at'

Now there are 2 arrangements.

'at' with 2 letters in the right place.

'ta' with 0 letters in the right place.

So the average is 2 + 0 = 2, divided by 2, which is 1.

3 letters - 'cat'

Now there are 6 arrangements.

'cat' with 3 letters in the right place.

'cta' with 1 letter in the right place.

'act' with 1 letter in the right place.

'atc' with 0 letters in the right place.

'tca' with 0 letters in the right place.

'tac' with 1 letter in the right place.

So the average is 3 + 1 + 1 + 0 + 0 + 1 = 6, divided by 6, which is 1.

Doing it this way will rapidly become hard work, but luckily there is something called 'linearity of expectation' which provides a general argument why this average will always be 1. For anyone interested, I'll put something in the comments tomorrow.

Tags: linearity of expectation

Permalink Add your comment

Comments

Saturday, 31 Aug 2019, 15:36

by Judith McLean

New comment

I would be interested in this

Judith

Sunday, 1 Sept 2019, 15:19

by Richard Walker

New comment

OK here goes.

The 'expected value' of some random quantity is roughly speaking the average value we would expect over a long series of experiments and we can calculate it by multiplying each of the possible values by its probability of occurring.

For example if we roll a 6-sided die their are 6 possible scores, 1-6, each with probability 1/6. So the expectation is 1x(1/6) + 2x(1/6) + 3x(1/6) + 4x(1/6) + 5x(1/6) + 6x(1/6) = 15/6 = 3.5. The average score we would expect in the long run is 3.5.

Now suppose we want the expected score when we roll a pair of dice. We might consider all the possible scores, work out the probability of each, and do a similar calculation to the one above. But there is a much better way: we can use the linearity of expectation rule. In its simplest form this says that the expectation of the sum of two or more random quantities is the sum of their expectations. So in the case of a pair of dice the expected score when they are rolled is 3.5 + 3.5 = 7.

This is fairly intuitive perhaps.After all we can imagine independently rolling one die and then the other, which would naturally give an expectation of 3.5 + 3.5 = 7.

But what is amazing is that linearity applies even when the events are not independent. Imagine we put 6 tokens, numbered 1-6, into a bag and then pull out two of them at random. What is the expected total now? The choices are no longer independent; for example if the first token we pick is say 1, the second must be a different number from 1. But amazingly we can still add the expectations, and the answer is again 3.5 + 3.5 = 7.*

Now for the word puzzle. We have 15 different letters, we shuffle then at random, and we want the expected number that end up where they were originally. Imagine that with each position in the word we associate a value which is

1 if the letter at that position is the same one as the start
0 if it is different.

It's useful to have a word to refer to this value and the usual term is 'indicator variable', so we can say we have an indicator variable for eachof the 15 positions.

Consider the first position. When we shuffle the letters of the word at random the probability of this letter being the same as at the start is 1/15 (because there are 15 letters) and the probability of its being different is the remaining 14/15. So the expected value of the first indicator variable is 1x(1/15) + 0x(14/15)= 1/15.

Exactly the same argument applies to position 2, position 3, all the way up to position 15. The expected value of all 15 indicator variables is 1/15 in each case. They are certainly not independent, but that doesn't affect the linearity of expectation. Consequently the total expectation of a leter being in the same position as at the start is the 15 terms 1/15 + 1/15 +... + 1/15 = 15x(1/15) = 1.

The average number of letters that end up in the same place is 1, and equally amazing is that the number of letters is irrelevant. 15 is not special and the answer would be 1 whatever the word length.

This linearity of expectation rule is really powerful and can be used to find the solution to many problems in a fairly simple way.

*In case you doubt the extectation is still 7, here it is calculated the hard way, from first principles. There are 30 possible outcomes.

1 with 2, 3, 4, 5, 6, scores 3, 4, 5, 6, 7, total 25
2 with 1, 3, 4, 5, 6, scores 3, 5, 6, 7, 8, total 29
3 with 1, 2, 4, 5, 6, scores 4, 5, 7, 8, 9, total 33
4 with 1, 2 ,3, 5, 6, scores 5, 6, 7, 9, 10, total 37
5 with 1, 2 ,3, 4, 6, scores 6, 7, 8, 9, 11, total 41
6 with 1, 2 , 3, 4, 5, scores 7, 8, 9, 10, 11, total 45

The grand total is 210, which divided by the number of possible outcomes gives 210/30 = 7 as claimed.

Sunday, 1 Sept 2019, 17:18

by Richard Walker

New comment

An extension is the special case where we rotate the letters cyclically a random number of places. This is a bit like the so-called 'Caesar Cipher' but with a random shift. Now the independence is minimal; either every letter will be in its original position or none will. But the expectation of each indicator variable is still 1/n, because each letter has a 1/n chance or being in its original position, and the sum of the expectations still 1.

Sunday, 1 Sept 2019, 22:01

by Richard Walker

New comment

'Expectation' is the traditional term and it means what you can predict the long-term average to be. If it's payback say

Suppose someone offers to play a game, in which you will win £100 with probablility 1/2 and lose £160 with probability 1/3, your expected profit is

£100x(1/2) - £180x(1/3) = £50 - £60 = -£10.

There is no certainty involved, but if many punters play the game, most will lose, on average to the tune of £10.

Sunday, 1 Sept 2019, 22:04

by Richard Walker

New comment

And so the doubt is about what may happen in any individual case; the certainty about what will happen in the long term. On average punters will lose £10.