Describing Your Way to Decryption
Al’s talk to the ACM on set theoretic estimation (STE) and decryption.
History
- Encryption has been used for thousands of years.
- Written security took many forms
- Shaved heads (messages tattooed on couriers’ scalps), then codes
- Computers began to be used around WWII
- The “modern” era of cryptography dates to Shannon in the 1940s
Shannon Theory
- Comes from “Communication Theory of Secrecy Systems” (1949)
- Talks about “confusion” and “diffusion”
- Discusses “unicity”, “redundancy”, and “entropy”
- These key concepts have resulted in DES, RSA, and CipherSmith
- Also responsible for the direction of modern cryptography
Entropy
- Entropy is the uncertainty about a symbol before it is observed
- Based on Hartley’s function
- Gives an idea of how much uncertainty remains after randomly picking a member of a set
- Maximized when the probability distribution is uniform
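As a rough illustration (not code from the talk), the per-symbol entropy of a letter distribution can be computed like this:

```python
import math
from collections import Counter

def entropy(text):
    """Shannon entropy, in bits per symbol, of the letter frequencies in text."""
    counts = Counter(text)
    n = len(text)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# A uniform distribution over 26 letters hits the maximum,
# Hartley's log2(26) ≈ 4.70 bits
print(entropy("abcdefghijklmnopqrstuvwxyz"))  # ≈ 4.70
# A skewed distribution has lower entropy
print(entropy("aaab"))                        # ≈ 0.81
```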
Entropy and Redundancy
- Redundancy measures how much of a string is predictable from the rest (how often information is repeated)
- Languages are not uniformly random
- These measures are conditional: each symbol’s probabilities depend on the symbols that precede it
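Redundancy can be sketched as the fraction of each symbol that is predictable. The 1.3 bits/letter figure below is a commonly cited rough estimate for printed English, used here only for illustration:

```python
import math

def redundancy(bits_per_symbol, alphabet_size=26):
    """Fraction of each symbol that is predictable: 1 - H / log2(n)."""
    return 1.0 - bits_per_symbol / math.log2(alphabet_size)

# Estimates for printed English are commonly around 1.0-1.5 bits/letter;
# taking 1.3 bits/letter gives roughly 72% redundancy.
print(round(redundancy(1.3), 2))  # 0.72
```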
Unicity
- Unicity is the average number of ciphertext symbols needed to uniquely determine the key
- For a substitution cipher n = 26 symbols
- The RL measure (language redundancy) was empirically determined
- Minimum (lower limit) when RL = 1
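Shannon’s unicity estimate can be sketched as key entropy divided by redundancy per symbol. The 3.2 bits/letter redundancy figure below is an assumed textbook-style value, not a number from the talk:

```python
import math

def unicity_distance(key_entropy_bits, redundancy_bits_per_symbol):
    """Shannon's estimate of ciphertext symbols needed to pin down the key."""
    return key_entropy_bits / redundancy_bits_per_symbol

# General substitution over n = 26 symbols: the key space is 26! permutations
key_entropy = math.log2(math.factorial(26))  # ≈ 88.4 bits
# Assuming English redundancy of about 3.2 bits/letter:
print(round(unicity_distance(key_entropy, 3.2), 1))  # ≈ 27.6
```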
Set Theoretic Estimation
- STE has been used for years on many different problems
- The method is based largely on set theory and has certain requirements
- Uses a “Last Man Standing” approach: eliminate candidates until only one remains
- Requires:
- A function that resolves to a discrete point
- The function and any “noise” must be bounded
- The function and its inverse must be one-to-one and onto
- Must know elements of target set
- Must be continuous
Property Sets
- Describe some property of the solution given an input
- The property set maps back to the possible inputs
- Each property must be a subset of possible inputs
- The solutions must be a member of all the property sets
- Logical AND over properties corresponds to set intersection
- Applying one property set eliminates candidates from the other property sets
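The intersection logic can be sketched with toy candidate sets; the set names and members below are invented for illustration, not data from the talk:

```python
# Each property set maps back to the candidate plaintexts consistent with
# that property; the true solution must lie in every set.
is_english = {"amy", "mut", "cat", "dog"}
matches_cipher_pattern = {"amy", "mut", "dog", "rat"}
passes_mgram_check = {"amy", "mut", "cat"}

# Logical AND over properties is set intersection
candidates = is_english & matches_cipher_pattern & passes_mgram_check
print(sorted(candidates))  # ['amy', 'mut']
```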
Sets
Language -> English -> Letters -> Sentences -> Words -> M-grams
Why STE?
- A close examination of Shannon reveals he used a STE approach
- Shannon attempted to build up a solution with p(k) = 1
- Shannon used m-grams to find p(k)
More M-grams
- We have a list of forbidden m-grams: qqq, zxy, bbqxnk, etc.
- We also use lists of allowed m-grams
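A forbidden-m-gram check can be sketched as a simple sliding-window test; the set below reuses two of the talk’s examples, and the helper name is assumed:

```python
FORBIDDEN = {"qqq", "zxy"}  # illustrative forbidden 3-grams from the talk

def has_forbidden_mgram(text, forbidden, m=3):
    """Reject any candidate decryption containing a forbidden m-gram."""
    return any(text[i:i + m] in forbidden for i in range(len(text) - m + 1))

print(has_forbidden_mgram("theqqqx", FORBIDDEN))  # True
print(has_forbidden_mgram("thecat", FORBIDDEN))   # False
```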
Property Sets
- English language
- Doubled letters
- M-grams for 2 through 6 and 13
- Letter frequency for very uncommon and extremely common letters
- Words and sentences
Simple Example
- Encrypted data: xyz
- Possible decryptions: amy, mut, oop, lly
- oop and lly are out since their letter patterns (doubled letters) don’t match xyz, which has three distinct symbols
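The pattern test from this example can be sketched as follows (the `pattern` helper is an assumed implementation):

```python
def pattern(word):
    """Canonical letter pattern: first-occurrence order, e.g. 'oop' -> (0, 0, 1)."""
    seen = {}
    return tuple(seen.setdefault(ch, len(seen)) for ch in word)

ciphertext = "xyz"
candidates = ["amy", "mut", "oop", "lly"]
matches = [w for w in candidates if pattern(w) == pattern(ciphertext)]
print(matches)  # ['amy', 'mut']
```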
Where does the data come from?
- Training done by checking generally recognizable works in English
- Works are chosen from literature from various genres
- Each author is represented by several texts
- Removed non-alphabetic characters to count words, m-grams, etc.
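The preprocessing step can be sketched as below; this is an assumed minimal version, since the actual training pipeline is not shown in the talk:

```python
import re
from collections import Counter

def mgram_counts(raw_text, m):
    """Strip non-alphabetic characters, lowercase, and count m-grams."""
    letters = re.sub(r"[^a-z]", "", raw_text.lower())
    return Counter(letters[i:i + m] for i in range(len(letters) - m + 1))

sample = "Call me Ishmael."
print(mgram_counts(sample, 2).most_common(3))
```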
Testing
- Encrypt file
- Randomly select start point
- Read one character at a time
- If the substitution matrix changes, reapply the new constraints to the text already seen
- Continue until all letters are found (need 25)
- Continuously check against the known key to make sure no mistake has been made
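The incremental loop can be sketched for the simplest case, a shift cipher with a single “allowed bigrams” property set; the talk’s real matrices and property sets are far richer, and the toy corpus here is invented:

```python
def shift_decrypt(ciphertext, k):
    return "".join(chr((ord(c) - 97 - k) % 26 + 97) for c in ciphertext)

def bigrams(text):
    return [text[j:j + 2] for j in range(len(text) - 1)]

def crack(ciphertext, allowed):
    surviving = set(range(26))               # every shift key starts viable
    for i in range(2, len(ciphertext) + 1):  # read one symbol at a time
        prefix = ciphertext[:i]
        # Eliminate keys whose decryption violates the allowed-bigram property
        surviving = {k for k in surviving
                     if all(b in allowed for b in bigrams(shift_decrypt(prefix, k)))}
        if len(surviving) == 1:
            return surviving.pop(), i        # key recovered after i symbols
    return None, len(ciphertext)

# Allowed bigrams from the toy corpus "the cat sat on the mat"
ALLOWED = set(bigrams("thecatsatonthemat"))
print(crack("wkhfdw", ALLOWED))  # "thecat" shifted by 3 -> (3, 3)
```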
Results
- Shift cipher
- Using all property sets
- 1142 tests
- 100% success
- Average of 5.0 symbols; std. dev. = 0.5545 symbols
- Traditional unicity is 1.3 symbols
- Used only 4 and 5-grams
- General substitution cipher
- Used 12 runs per work
- Ran a composite set of all authors against each author
- Ran a total of 12792 tests
- Authors always correctly decrypt their own works
- Authors of similar style and era sometimes work
- The composite set decrypted all works correctly
- Unicity was 17.456 symbols
- Std. dev. was 5.53 symbols
- High was 91 symbols
- Low was 13 symbols
- Average time / test was less than 1 ms
- Removing the worst 5 cases gives a std. dev. of 3.36 symbols
Extensions and Future Work
- Add data from words (variable length m-grams)
- Add data from sentence structure
- Entropy needs a ton more attention
- Examine more ciphers to see if they’re vulnerable too (XOR, stream, block, etc.)
- Stylometry (Morton 1976)
- Language recognition via STE
- Parallel processing
Conclusions
- Theories assuming uniform, random distributions can’t be applied to language, since those assumptions don’t hold there
- Certain parts of a string are more vulnerable than others. How can we find them?
- In effect, we’re changing the probability distribution function
- STE can be a powerful tool for decryption
- STE provides a framework for explaining Shannon
- Need more people