Tomasz Węgrzanowski
Tomasz Węgrzanowski's Blog

Open Source Adventures: Episode 79: Exploring Crystal Regular Expression API

Tomasz Węgrzanowski
Sep 11, 2022

In the previous episode we took a look at Ruby's regular expression API. I want to try a few more languages, and the most obvious one to start with is Crystal.

A lot of solutions work exactly like in Ruby, but some of the differences are interesting.

Test case

Crystal doesn't have %W, which is one of my favorite Ruby features, but in this case its non-interpolating and much less awesome relative %w will do.

Here's the test case:

%w[
  2015-05-25
  2016/06/26
  27/07/2017
].each do |s|
  p parse_date(s)
end

And the expected result:

[2015, 5, 25]
[2016, 6, 26]
[2017, 7, 27]

Solution 1

def parse_date(s)
  case s
  when %r[(\d\d\d\d)-(\d\d)-(\d\d)]
    [$1.to_i, $2.to_i, $3.to_i]
  when %r[(\d\d\d\d)/(\d\d)/(\d\d)]
    [$1.to_i, $2.to_i, $3.to_i]
  when %r[(\d\d)/(\d\d)/(\d\d\d\d)]
    [$3.to_i, $2.to_i, $1.to_i]
  end
end

The most straightforward solution works exactly as it did in Ruby with no changes.

Solution 2

#!/usr/bin/env crystal

def parse_date(s)
  case s
  when %r[(\d\d\d\d)-(\d\d)-(\d\d)], %r[(\d\d\d\d)/(\d\d)/(\d\d)]
    [$1.to_i, $2.to_i, $3.to_i]
  when %r[(\d\d)/(\d\d)/(\d\d\d\d)]
    [$3.to_i, $2.to_i, $1.to_i]
  end
end

Grouping when options works just like in Ruby.

Solution 3

def parse_date(s)
  case s
  when %r[(\d\d\d\d)([/-])(\d\d)\2(\d\d)]
    [$1.to_i, $3.to_i, $4.to_i]
  when %r[(\d\d)/(\d\d)/(\d\d\d\d)]
    [$3.to_i, $2.to_i, $1.to_i]
  end
end

Back-references also work just like in Ruby.

Solution 4

Now this does not work:

def parse_date(s)
  case s
  when %r[(\d\d\d\d)-(\d\d)-(\d\d)|(\d\d\d\d)/(\d\d)/(\d\d)]
    [($1 || $4).to_i, ($2 || $5).to_i, ($3 || $6).to_i]
  when %r[(\d\d)/(\d\d)/(\d\d\d\d)]
    [$3.to_i, $2.to_i, $1.to_i]
  end
end

The reason is that in Ruby $1 can be either a String or nil. In Crystal $1 is always a String, so if its group didn't take part in the match, accessing it raises an exception.

Crystal also has nilable equivalents $1?, $2? etc. Notice that to make the whole expression not nilable, we don't put ? on the last one:

def parse_date(s)
  case s
  when %r[(\d\d\d\d)-(\d\d)-(\d\d)|(\d\d\d\d)/(\d\d)/(\d\d)]
    [($1? || $4).to_i, ($2? || $5).to_i, ($3? || $6).to_i]
  when %r[(\d\d)/(\d\d)/(\d\d\d\d)]
    [$3.to_i, $2.to_i, $1.to_i]
  end
end
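
To see the difference in isolation, here is a minimal sketch. The YEAR_MONTH pattern and both helper names are made up for illustration. The only thing it relies on is the behavior described above: $1 raises when group 1 did not take part in the match, while $1? returns nil.

```crystal
# Hypothetical two-alternative pattern: groups 1-2 belong to the dash form,
# groups 3-4 to the slash form.
YEAR_MONTH = %r[(\d\d\d\d)-(\d\d)|(\d\d\d\d)/(\d\d)]

# Nilable access: nil when the dash alternative was not the one that matched.
def dash_year(s : String) : String?
  if s =~ YEAR_MONTH
    $1?
  end
end

# Non-nilable access: raises when group 1 did not take part in the match.
def dash_year!(s : String) : String
  s =~ YEAR_MONTH
  $1
end

p dash_year("2015-05") # group 1 matched
p dash_year("2016/06") # group 1 did not match

begin
  dash_year!("2016/06")
rescue ex
  puts "raised: #{ex.message}"
end
```

This is why ($1? || $4) is safe: by the time $4 is evaluated, we already know the first alternative was not the one that matched.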

Solution 5

def parse_date(s)
  case s
  when %r[(\d\d\d\d)-(\d\d)-(\d\d)|(\d\d\d\d)/(\d\d)/(\d\d)|(\d\d)/(\d\d)/(\d\d\d\d)]
    [($1? || $4? || $9).to_i, ($2? || $5? || $8).to_i, ($3? || $6? || $7).to_i]
  end
end

Knowing what we know, we can use the same trick, rewriting ($1 || $4 || $9) into ($1? || $4? || $9) and so on.
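
The group numbers in that rewrite follow from how captures are numbered: left to right across the whole pattern, regardless of which alternative they sit in. Here's a small sketch to make that visible. It assumes MatchData#captures returns the numbered groups in order, with nil for the ones that did not take part. ALL_FORMATS and capture_slots are hypothetical names.

```crystal
# Same pattern as Solution 5: nine numbered groups across three alternatives.
ALL_FORMATS = %r[(\d\d\d\d)-(\d\d)-(\d\d)|(\d\d\d\d)/(\d\d)/(\d\d)|(\d\d)/(\d\d)/(\d\d\d\d)]

# Returns groups 1..9 as an array, or nil if the string does not match at all.
# Groups from alternatives that did not match show up as nil.
def capture_slots(s : String)
  s.match(ALL_FORMATS).try &.captures
end

p capture_slots("2015-05-25") # only the first three slots are filled
p capture_slots("2016/06/26") # only slots 4-6 are filled
p capture_slots("27/07/2017") # only slots 7-9 are filled, in day, month, year order
```

The last case shows why the third alternative maps to $9, $8, $7 rather than $7, $8, $9.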

Solution 6

def parse_date(s)
  case s
  when
    %r[(?<year>\d\d\d\d)-(?<month>\d\d)-(?<day>\d\d)],
    %r[(?<year>\d\d\d\d)/(?<month>\d\d)/(?<day>\d\d)],
    %r[(?<day>\d\d)/(?<month>\d\d)/(?<year>\d\d\d\d)]
    [$~["year"].to_i, $~["month"].to_i, $~["day"].to_i]
  end
end

Using named captures works identically to the Ruby version.

Solution 7

def parse_date(s)
  case s
  when %r[(?<year>\d\d\d\d)-(?<month>\d\d)-(?<day>\d\d)|(?<year>\d\d\d\d)/(?<month>\d\d)/(?<day>\d\d)|(?<day>\d\d)/(?<month>\d\d)/(?<year>\d\d\d\d)]
    [$~["year"].to_i, $~["month"].to_i, $~["day"].to_i]
  end
end

Having capture groups with the same name works just like in Ruby without changes.

Solution 8

def parse_date(s)
  case s
  when %r[
      (?<year>\d\d\d\d)-(?<month>\d\d)-(?<day>\d\d) |
      (?<year>\d\d\d\d)/(?<month>\d\d)/(?<day>\d\d) |
      (?<day>\d\d)/(?<month>\d\d)/(?<year>\d\d\d\d)
    ]x
    [$~["year"].to_i, $~["month"].to_i, $~["day"].to_i]
  end
end

And so does the //x flag. Everything just works.

Solution 9

def parse_date(s)
  if %r[
      (?<year>\d\d\d\d)-(?<month>\d\d)-(?<day>\d\d) |
      (?<year>\d\d\d\d)/(?<month>\d\d)/(?<day>\d\d) |
      (?<day>\d\d)/(?<month>\d\d)/(?<year>\d\d\d\d)
    ]x =~ s
    [year.to_i, month.to_i, day.to_i]
  end
end

This, on the other hand, is completely unsupported. The only side effect of a regular expression match is setting the $~ variable ($1 is just an alias for $~[1], $1? for $~[1]?, etc.). A regular expression match can't set other local variables.

I'm not really comfortable with this Ruby feature, so it's not surprising it didn't find its way here.
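
If you want local variables anyway, the closest thing is probably to bind them yourself. This is a sketch, not something from the original set of solutions. It assumes String#match returning a MatchData (or nil), and NAMED_DATE is a name I made up for the pattern from Solution 8.

```crystal
# Pattern from Solution 8, stored in a constant (hypothetical name).
NAMED_DATE = %r[
    (?<year>\d\d\d\d)-(?<month>\d\d)-(?<day>\d\d) |
    (?<year>\d\d\d\d)/(?<month>\d\d)/(?<day>\d\d) |
    (?<day>\d\d)/(?<month>\d\d)/(?<year>\d\d\d\d)
  ]x

def parse_date(s)
  # match returns nil when nothing matches, so the if doubles as a nil check.
  if md = s.match(NAMED_DATE)
    # Explicitly assign the locals that Ruby would have created implicitly.
    year, month, day = md["year"], md["month"], md["day"]
    [year.to_i, month.to_i, day.to_i]
  end
end

%w[
  2015-05-25
  2016/06/26
  27/07/2017
].each do |s|
  p parse_date(s)
end
```

It's one extra line compared to the Ruby version, and nothing gets assigned behind your back.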

Story so far

Everything just worked with no or minimal changes. This is my usual experience with Crystal. Things just work most of the time.

All the code is on GitHub.

Coming next

In the next episode we'll see how other languages handle this problem.

 