Playfair Cipher Cracking

Wheatston and Playfair - people behind eponymous cipher

Charles Wheatston proposed this cipher in the middle of XIX century, while lord Playfair was its devoted proponent, so eventually cipher got his name.

British inventor of electrical telegraph Charles Wheatstone also invented many other interesting things - one of them is this fancy encryption system, which is easy to work with, but harder to crack (compared with Caesar's or Vigenere's) since it encodes bigrams or pairs of letters. Its modifications were used up till WW2, spanning almost 100 years.

I come upon this while preparing lecture on telegraph for my electronics classes and after struggling for a week, found it gives a nice exercise on optimization approaches. So it's just it: we are given a piece of encrypted text - and the task is to decode it back to normal English.

Algorithm Description

There could be small differences among implementations of Playfair Cipher, so here are precise steps we use.

The cipher works on alphabet of 25 letters: any spaces and punctuations are omitted, also letter J is encoded as I (so two letters are not distinguished).

The key also consists of 25 letters, all different, arranged into 5*5 square. For example, suppose we pick distinct letters from the name William Shakespeare and then pad the remaining alphabetically, to create key which one can easily remember without writing it down:

W I L A M
S H K E P
R B C D F
G N O Q T
U V X Y Z

Encoding runs like this (suppose we encode the text PROGRAMMING):

Take the next pair of letters to encrypt. Most probably they are found in different rows and columns of key table, first has coordinates x1,y1, second x2,y2. They are encrypted with pair of letters found at positions x2,y1 and x1,y2. In other words, these two letters are in the opposite corners of some rectangle - and we take two letters from two other corners of this rectangle (corner in the same row with the first letter goes first). So PR becomes SF (coordinates 5,2 and 1,3 are converted to 1,2 and 5,3).

Sometimes both letters appear in the same row or the same column (so there is no rectangle) - then we use different rule: just shift them both 1 position right (in case of row) or down (in case of column). Just wrap over the edge if necessary. For example next pair from our text OG is found in the same row, with coordinates 3,4 and 1,4. They become QN.

Next we encode RA to DW without any trouble.

Troubles come when we encounter pair of identical letters, e.g. MM. Here the rule is: don't take the whole pair, instead take single letter and add Q to it. So we encode MQ to AT (and then next pair is MI).

We complete encryption converting MI to WL and NG to ON.

The full result is SFQNDWATWLON. Decoding is obviously almost the same and gives programqming. Since Q is almost never found without following U (and never in the end of word), it is easy to get rid of extra Qs.

More precaution is needed: if pair to encrypt consists of different letters, and the second letter is Q, let's do the same as with pair of same letters - encode only the first (with addition of Q) shifting original Q to the beginning of the next pair. If text ends with incomplete pair - also add Q to it. For example word AQUA we encode as pairs AQ, QU and AQ. On decoding we just threw away any second letter in the pair if it is Q. (we don't expect to meet duplicate QQ ever and make no provision for it)

Problem

We got encrypted text, about 5kb split into multiple lines for convenience. Total number of lines, as integer, precedes the text in its own separate line. There are only lowercase latin letters (without "j").

Decode the text and return 70 first letters as an answer.

Example:

input data:
63
pdcklhltgtyqyptrbgiwiktctmyowbhzmbbzuohltzrhxziwpstvvtyhbiwmlrhlhrkoywbdwgimcezg
imrxwdgbuhtwpccyhlyvlthsimichogfgvtkpswovyolpdlrdbmtgbhlkonlhlyvltwerixlkmwbmqxz
ltouslyowkyofgymwkthwokxlhmylthlkodntcwdpltegvrzfpnmyoyofgxhbxhuhvimetwxwnzxkoyk
...for brevity 60 more lines were omitted...

answer:
whichtheylaidmeonacouchimotionedforairtheywerecompelledtoopenthewindow

Here the key was quickbrownfxmpsvethlazydg (from the pangram about quick brown fox...). Note that fragment do not necessarily start at exact word boundary.