Break Caesar Cipher with Z3

Z3 is usually given a list of hard constraints, and told to solve them, but it can do quite a few more things.

Let's use it unconventionally and break Caesar cipher.

The Challenge

The following plaintext from Wikipedia:

In cryptography, a Caesar cipher, also known as Caesar's cipher, the shift cipher, Caesar's code or Caesar shift, is one of the simplest and most widely known encryption techniques. It is a type of substitution cipher in which each letter in the plaintext is replaced by a letter some fixed number of positions down the alphabet. For example, with a left shift of three, D would be replaced by A, E would become B, and so on. The method is named after Julius Caesar, who used it in his private correspondence.

I replaced digit 3 by word "three", as Caesar cipher doesn't handle punctuation or digits - understandably as Romans used letters for digits.

It also explains what we're doing.

Random encryption

This program picks a random key, and runs Caesar cipher on it:

#!/usr/bin/env ruby

key = rand(0..25)
text = STDIN.read.downcase.gsub(/[^a-z]/, "")

puts text.bytes.map{|c| (((c-'a'.ord)+key) % 26 + 'a'.ord).chr}.join

I got the following output:

pujyfwavnyhwofhjhlzhyjpwolyhszvruvduhzjhlzhyzjpwolyaolzopmajpwolyjhlzhyzjvklvyjhlzhyzopmapzvulvmaolzptwslzahuktvzadpklsfruvdulujyfwapvualjoupxblzpapzhafwlvmzbizapabapvujpwolypudopjolhjoslaalypuaolwshpualeapzylwshjlkifhslaalyzvtlmpelkubtilyvmwvzpapvuzkvduaolhswohilamvylehtwsldpaohslmazopmavmaoyllkdvbskilylwshjlkifhldvbskiljvtlihukzvvuaoltlaovkpzuhtlkhmalyqbspbzjhlzhydovbzlkpapuopzwypchaljvyylzwvuklujl

How to break Caesar Cipher old fashioned way

It's actually really easy to do it by brute force, as we only have 26 possible keys. So there are 26 possible decryptions, and you can check which one is most likely by simple methods like checking most frequent letters, common words in output and so on.

How to break Caesar Cipher with Z3

The problem for a constraint solver like is that technically all 26 possible keys are "valid", so if we just tell it what the ciphertext is, and how Caesar's cipher works, it would give us one of 26 possible valid decodings, but which one?

Fortunately in addition to default "hard constraints" Z3 also supports "soft constraints". We can list a lot of them, and Z3 will try to give us a solution that fulfills as many as possible, in addition to every hard constraint. We could also give them weights, put them in different groups, and so on, but for now we'll just use the simplest solution. And our only constraint is that each letter is "E". So Z3 will try to give us solution with as many "E"s as possible.

Solver

#!/usr/bin/env ruby

require "z3"

class CaesarSolver
  def initialize(text)
    @key = Z3.Int("key")
    @ct = text.bytes.map{|c| to_number(c)}
  end

  def to_letter(i)
    (i + 'a'.ord).chr
  end

  def to_number(c)
    c.ord - 'a'.ord
  end

  def call
    @solver = Z3::Optimize.new
    # Variables for each plaintext letter
    @pt = (0...@ct.size).map{|i| Z3.Int("pt#{i}")}

    # Possible keys
    @solver.assert @key >= 0
    @solver.assert @key <= 25

    # This is how Caesar cipher works
    @pt.zip(@ct).each do |p,c|
      # modulo is a bit awkward
      @solver.assert p >= 0
      @solver.assert p <= 25
      @solver.assert Z3.Or(p == @key + c, p == @key + c - 26)
    end

    # Tell Z3 that there's probably a lot of common letters like E
    @pt.each do |p|
      @solver.assert_soft p == to_number('e')
    end

    if @solver.satisfiable?
      model = @solver.model
      puts @pt.map{|p| to_letter(model[p].to_i)}.join
    else
      puts "No solution found"
    end
  end
end

text = open(ARGV[0] || "encrypted.txt").read.chomp
CaesarSolver.new(text).call

Is it practical?

This should break a lot of encrypted messages, unless they're very short, very unusual, or not in English.

We could make it more powerful, by telling it about other common letters like T, A, O, I, N etc, with somewhat lower weights. We could even include common letter combinations etc.

It is quite slow, and brute forcing 26 solution, and counting ones with most Es in a lot faster. The difference is that Z3 can deal with situations where brute force isn't possible.

In this case would could probably improve Z3 performance by using bitvectors instead of integers, as then would have native modulo operations.

A good example of a cipher that would likely work quite well in Z3 but might be difficult to break by brute force would be a XOR cipher with short key. Of course for all such common ciphers there already exist fast tools to break them, so it's mostly useful if you're trying to solve a CTF challenge with a novel but simple cipher.

Code

All code for this post is available on GitHub.