100 Languages Speedrun: Episode 49: Crystal

Ruby is the best programming language created so far. The history of programming language design this century is mainly features from Ruby slowly diffusing into every other programming language, newly designed or existing. Hopefully someday a language better than Ruby will be created, but nothing's really attempted that yet.

Some languages like Python are really resisting this trend. Just about every Python release gets a few small things from Ruby, and it will likely continue getting small things from Ruby for as long as Python continues. For a recent one, it took decades of resistance for Python to finally give in and add string interpolation.

Other languages decided to embrace this obvious trend, and just wholesale copied big parts of Ruby as a starting point. Crystal took it the furthest. And it's great! Starting your language and getting 80% of it right by copying the best is an amazing idea. And the other 20%? Prepare yourself for a treat!

FizzBuzz

Let's write FizzBuzz in Ruby:

#!/usr/bin/env ruby

# FizzBuzz in Crystal. Also in Ruby.
(1..100).each do |n|
  if n % 15 == 0
    puts "FizzBuzz"
  elsif n % 3 == 0
    puts "Fizz"
  elsif n % 5 == 0
    puts "Buzz"
  else
    puts n
  end
end

It turns out this code works just as well in Crystal:

#!/usr/bin/env crystal

# FizzBuzz in Crystal. Also in Ruby.
(1..100).each do |n|
  if n % 15 == 0
    puts "FizzBuzz"
  elsif n % 3 == 0
    puts "Fizz"
  elsif n % 5 == 0
    puts "Buzz"
  else
    puts n
  end
end

But with Crystal you can also compile it:

$ crystal build fizzbuzz.cr
$ ./fizzbuzz
1
2
Fizz
4
Buzz
Fizz
7
8
Fizz
Buzz
11
Fizz
13
14
FizzBuzz
16
17
Fizz
19
Buzz
...

By default Crystal favors build speed over runtime performance, so a plain crystal build does not apply full optimizations. To get them, run crystal build --release. This will take a while.

Type System

Most languages are dynamically typed. Of static typing systems, there are these broad categories:

  • simple to understand, but prevent a lot of perfectly valid code (original pre-generic Java, Go)
  • allow most code, at the cost of the type system being ridiculously complicated (Haskell, OCaml, Scala)
  • fudge it both ways, accept not being fully correct (TypeScript)

If you're interested in why it's impossible to design a static type system that doesn't suck in one of these ways, check out this talk or the associated paper.

As a side note, there's a huge divide between the real world and academia here - academia is in love with ridiculously overcomplicated type systems, as that lets them endlessly churn out completely worthless papers about yet another type system extension. The real world either uses dynamic typing, or simple static type systems. Every single language with a complicated type system failed the real world test.

Well, Crystal introduced a whole new category to this.

Let's give it a try:

def add(a, b)
  puts "#{a} + #{b} = #{a + b}"
end

add 400, 20
add 6, 0.9
add "foo", "bar"

That works as expected:

$ crystal add.cr
400 + 20 = 420
6 + 0.9 = 6.9
foo + bar = foobar

Now let's try to write something incorrect:

def add(a, b)
  puts "#{a} + #{b} = #{a + b}"
end

add 400, 20
add 6, 0.9
add "foo", "bar"
add "y", 2000

$ crystal add2.cr
Showing last frame. Use --error-trace for full trace.

In add2.cr:2:27

 2 | puts "#{a} + #{b} = #{a + b}"
                             ^
Error: no overload matches 'String#+' with type Int32

Overloads are:
 - String#+(other : self)
 - String#+(char : Char)

This isn't terribly complicated code. The thing is, every other statically typed language I can think of just fails. Not only can they not infer the correct type automatically, there's usually no reasonable annotation you can provide.

Most statically typed languages are completely hopeless and don't even try. I was wondering if I could maybe set up some type classes in Haskell, but even enabling a lot of language extensions, and writing far more type code than actual code, I couldn't get it to work, and errors from the compiler were total gibberish. If anyone is up to the challenge, in any statically typed language, please let me know in the comments. I think it's possible to twist type classes to maybe support this, at least in something like Haskell, or Idris, or such, but it sure won't be as easy as Crystal here.

Of course this is a total toy example, but all existing static typing systems get in the way all the damn time and can't even handle extremely simple code like this. There's zero hope that they'd just let you write more complex code without getting in the way even more.

Crystal Type System

So how is Crystal doing this?

The key is global type inference. When you do def add(a, b), Crystal doesn't immediately try to assign the most generic possible type to it, like pretty much every other statically typed language does. Instead, it parks that question, and when add is actually used, it checks if it makes sense there.

If we want to see what ends up being created, it just happily creates three functions for these three situations, and each of these variants is perfectly typed:

$ crystal build add.cr
$ objdump -t add | grep '\*add<'
0000000100012dd0 l    d  *UND* _*add<Int32, Int32>:Nil
0000000100012e70 l    d  *UND* _*add<Int32, Float64>:Nil
0000000100012ea0 l    d  *UND* _*add<String, String>:Nil
0000000100012e70 g     F __TEXT,__text _*add<Int32, Float64>:Nil
0000000100012dd0 g     F __TEXT,__text _*add<Int32, Int32>:Nil
0000000100012ea0 g     F __TEXT,__text _*add<String, String>:Nil

With no templates or typeclass definitions, it Just Works.
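
If you don't feel like digging through objdump output, you can get a rough view of the same thing from inside the language. This is just a minimal sketch: twice is a made-up helper, not something from the examples above. The typeof operator reports the statically inferred return type for each call site separately:

```crystal
# Hypothetical helper for illustration: no annotations anywhere.
def twice(x)
  x + x
end

# typeof is resolved at compile time, once per call site.
puts typeof(twice(21))    # Int32
puts typeof(twice(1.5))   # Float64
puts typeof(twice("ab"))  # String
```

Each call gets its own fully typed instantiation, which is why the return type differs from line to line.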

Nils

If that wasn't magical enough, let's get to nils (or nulls in most other languages). nils are pretty much a necessary evil. A lot of code really wants data to be uninitialized for a while, and it would require a lot of twisting it around to always prevent this situation. On the other hand, a lot of other code wants to guarantee that it never gets nils, just actual values.

Traditionally most languages have "nullable" and "non-nullable" values, and it's overall a total mess. How about Crystal? It can do what no other language can. Don't believe me? Check this out:

ary = [1, 2, 3, nil]

puts "Type of array:"
puts typeof(ary)

puts ""
puts "Type of each element:"
ary.each do |x|
  puts typeof(x)
end

puts ""
puts "Type of each element:"
ary.each do |x|
  if x
    puts typeof(x)
  else
    puts typeof(x)
  end
end

$ crystal run ./types.cr
Type of array:
Array(Int32 | Nil)

Type of each element:
(Int32 | Nil)
(Int32 | Nil)
(Int32 | Nil)
(Int32 | Nil)

Type of each element:
Int32
Int32
Int32
Nil

First we define ary as an array with some stuff in it. We ask Crystal what it thinks the type of ary is, and it correctly says it's Array(Int32 | Nil). Int32 | Nil is a union type. So far so good.

Then we iterate the array. Just to be clear - the typeof operator is not any kind of hacky runtime check, it tells us what we know about the variable statically. Since all we know is that ary is Array(Int32 | Nil), Crystal can only tell us that each element is (Int32 | Nil).

So far so good. But what about the second loop? At if/else, Crystal checks which types can possibly get into each branch. So it knows for a fact in the positive branch, only Int32 can get there, and in the negative branch, only Nil! This of course isn't some magic, and only some kinds of syntax trigger this splitting, but it's totally amazing in practice.
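
One case where the splitting does not kick in is instance variables. As far as I know, a plain if @name does not narrow the type, because the compiler cannot be sure the variable stays unchanged between the check and the use. The usual workaround is to assign it to a local variable inside the condition. Here's a minimal sketch, with a made-up Greeter class:

```crystal
# Hypothetical class for illustration, not from the episode's repository.
class Greeter
  def initialize(@name : String? = nil)
  end

  def shout
    # Writing `if @name` and then `@name.upcase` would not compile,
    # as @name would still be typed (String | Nil) inside the branch.
    # Assigning to a local gives the compiler something it can narrow.
    if name = @name
      name.upcase   # name is String here
    else
      "NOBODY"
    end
  end
end

puts Greeter.new("crystal").shout
puts Greeter.new.shout
```

Local variables and block arguments, like the x in the loop above, narrow without any such ceremony.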

This works for nils, but it also works for any other union types.
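
For unions without any nil in them, a truthiness check is no use, but is_a? triggers the same kind of splitting. A small sketch, with an array I made up for the occasion:

```crystal
# Inferred as an array of a three-way union: Int32, String and Float64.
values = [1, "two", 3.5]

results = values.map do |v|
  if v.is_a?(String)
    v.upcase        # only String can get here
  else
    (v + 1).to_s    # only the numeric types are left here
  end
end

p results
```

In the else branch Crystal knows String has been ruled out, so v + 1 type checks against the remaining numeric types only.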

None of this is possible in any language with a traditional (unification-based) static type system.

Union Types

Let's have more fun, how about this program:

ary = [7, 210, "foo", nil]

puts "Type of array:"
puts typeof(ary)

puts ""
puts "Double each element:"
ary.each do |x|
  puts "Double of #{x} (#{typeof(x)}) is #{x * 2}"
end

Crystal correctly sees that it cannot run:

$ crystal run ./union.cr
Showing last frame. Use --error-trace for full trace.

In union.cr:9:46

 9 | puts "Double of #{x} (#{typeof(x)}) is #{x * 2}"
                                                ^
Error: undefined method '*' for Nil (compile-time type is (Int32 | String | Nil))

So what if we provide a default value:

ary = [7, 210, "foo", nil]

puts "Type of array:"
puts typeof(ary)

puts ""
puts "Double each element:"
ary.each do |x|
  x ||= "unknown"
  puts "Double of #{x} (#{typeof(x)}) is #{x * 2}"
end

$ crystal run ./union2.cr
Type of array:
Array(Int32 | String | Nil)

Double each element:
Double of 7 ((Int32 | String)) is 14
Double of 210 ((Int32 | String)) is 420
Double of foo ((Int32 | String)) is foofoo
Double of unknown ((Int32 | String)) is unknownunknown

Yeah, it knows that reassigned x cannot be nil, so it runs perfectly.
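
If you'd rather drop the nils than replace them, the standard library's Array#compact gets you to the same place. This is a sketch of an alternative, not what the example above does:

```crystal
ary = [7, 210, "foo", nil]

# compact removes the nils, and the element type loses Nil with them.
clean = ary.compact
puts typeof(clean)

clean.each do |x|
  puts "Double of #{x} is #{x * 2}"
end
```

The resulting array is typed Array(Int32 | String), so x * 2 compiles without any default value.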

This is total magic. We have full static typing, with zero annotations so far, and the kind of expressiveness that's so far been only reserved for dynamically typed languages.

Does it come at a cost? Yes it does. Compilation times are definitely slower than languages with simpler type systems would be. Eventually you'll need some type annotations, even if these are generally very simple concrete types. And until last week, it didn't have a proper REPL, as this kind of back and forth type inference wasn't very compatible with a REPL environment. Well now it does, but it's still in early stages.

Type Annotations

Probably the most common case is empty arrays and such. They require type annotations, as Crystal does not try to infer the element type from everything you put there later:

ary = [] of String
ary << "foo"
ary << "bar"
puts ary.map(&.upcase)

$ crystal run ./ary.cr
["FOO", "BAR"]

Notice the &.methodname syntax for a trivial block - it's part of the language syntax. In Ruby it's &:methodname (which started as a hack to turn Symbol into Proc).

If you define a non-empty array, Crystal will infer the type of the array from its initial elements, so if you want it to support some more types, you need to annotate those too.

For example, without the explicit annotation here, Crystal would think ary is Array(String) and treat ary << nil as a type error:

ary = ["foo"] of (String | Nil)
ary << "foo"
ary << nil
puts ary.map{|x| x || "unknown"}

Overall it's a sensible choice.
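
The same rule applies to other empty containers. Here's a minimal sketch with an empty hash, where the of clause names both the key and the value type. The word list is made up for illustration:

```crystal
# An empty hash literal needs its key and value types spelled out.
counts = {} of String => Int32

%w(foo bar foo).each do |word|
  # fetch with a default avoids a missing-key error on first sight of a word.
  counts[word] = counts.fetch(word, 0) + 1
end

p counts
```

Beyond that one annotation, everything else in the snippet is still inferred.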

Should you use Crystal?

Absolutely yes!

The language has about Ruby levels of programmer friendliness, the most expressive and completely revolutionary type system in existence, and performance in the same tier as Java or Go (so not quite as good as raw C, without a lot of additional work).

It's about the most exciting new language of the last decade, and it's infinitely better than the dumbed-down corporate crap Big Tech tries to shove down our throats. If you have a task for which Ruby is too slow, having Crystal as the first choice is a great idea.

To keep episodes short I focused only on Crystal's friendliness and type system, but it does a lot more really interesting things.

It's not completely without downsides, especially compilation times are not great, but Crystal is one of very few languages that get my full endorsement.

Code

All code examples for the series will be in this repository.

Code for the Crystal episode is available here.