100 Languages Speedrun: Episode 99: Extending Esoteric Language Tuples

The series is heading towards completion, but I have one unfinished business.

Back in episode 79 I did some prototyping of new esoteric language Tuples. Then in episode 80 I investigated a minimalistic version of it.

Let's try to turn it into a real language. By that I mean one that could at least in principle run Wordle.

We'll need all these features:

  • parser for Tuple language
  • support for reading files
  • support for random number generation
  • interactive I/O
  • tuples with multiple dependencies

Special Tuples

We need a much richer interface to the runner than the one we had before. To let the runner evolve without messing up the program, only tuples starting with 0.* will have special meaning. Everything else can safely be used by the programs.

So here are special tuple types we need.

If any file names are passed to the runner at start, these are pre-filled before program starts. They don't do anything special afterwards. This is about the bare minimum file I/O, it will let us implement common Unix programs like cat or wc, or read data files like wordle-dictionary.txt. We can't create files, write files, or open files dynamically, but it's a start.

  • 0.2.f.i.c - i-th character of the file with id f has Unicode codepoint c (system-created, before start)
  • 0.2.f.i.0 - special value for first position after end of file (system-created, before start)

For random numbers, we need a request tuple and a response tuple:

  • 0.4.id.min.max - request to generate a random value - each request has unique id in case you need multiple numbers (program-created)
  • 0.4.id.min.max.value - random value between min and max (system-created, on demand)

So if you need to roll a bunch of d6s, you can create tuples 0.4.100.1.6, 0.4.101.1.6, 0.4.102.1.6, and the runner will respond by creating some numbers like let's say 0.4.100.1.6.5, 0.4.101.1.6.2, and 0.4.102.1.6.3.

I'm including id.min.max in the response, just to make it clear what happens if someone requestsn 0.4.69.1.6 and then 0.4.69.1.10 - these will be independent requests.

For input we need "request to read i-th character", and response with contents of that i-th character, or 0 if end of input was reached:

  • 0.0.i - request to get more input up to i-th character (program-created) - note that STDIN input will use full lines anyway
  • 0.0.i.c - i-th character of the input has Unicode codepoint c (system-created, only when requested)
  • 0.0.i.0 - special value for first position after end of input (system-created, only when requested)

And finally the output. I found it much easier to write Tuples programs if gaps in the output are allowed. If you don't need any interactivity, you can just create some 0.1.*.* tuples and be done with it. But if you want to flush the output during program run, you need to create 0.3.i tuples. Once the flush request has been fulfilled, corresponding 0.3.i.0 tuple will be created.

  • 0.1.i.c - i-th character of the output has Unicode codepoint c (program-created, gaps allowed, 0s ignored)
  • 0.3.i - request to print everything up to i (skipping gaps; program-created) - the rest will be printed anyway when program ends
  • 0.3.i.0 - request acknowledgements (system-created after request done)

There's nothing special about this arrangement, it's just the minimum functionality I needed to enable writing the kind of mini-programs I wanted to do, like infinite loop printing, concatenating multiple files, or Wordle.

Structure of the episode

First, I'll show a bunch of Tuples programs, then the interpreter. I originally planned to do Wordle in Tuples, but this post was already getting crazy long with 16 [ ] examples, so I'll leave that as an exercise for the reader.

I put a lot of comments inside Tuples code, but maybe it's not enough as it's an esoteric language. I'd love it if other people also tried to use it. Let me know if you write anything in it.

First I'll recap programs from old episodes that still work. The main difference is just some extra comments.

Hello, World!

Many programs from previous episodes still work just fine. This is the Hello, World!:

# Just the characters in "Hello, World!\n"
-> (0, 1, 0, 72)
-> (0, 1, 1, 101)
-> (0, 1, 2, 108)
-> (0, 1, 3, 108)
-> (0, 1, 4, 111)
-> (0, 1, 5, 44)
-> (0, 1, 6, 32)
-> (0, 1, 7, 87)
-> (0, 1, 8, 111)
-> (0, 1, 9, 114)
-> (0, 1, 10, 108)
-> (0, 1, 11, 100)
-> (0, 1, 12, 33)
-> (0, 1, 13, 10)
$ ./tuples.rb examples/hello.txt
Hello, World!

Loop

# Initial state
-> (1, 1, 9)

# Generate data
(1, a, b) -> (1, a+1, b-1)

# Generate output, number to string conversion
# But send it to extra stage so we can filter the parts we don't want
# if second arg is negative, it won't get added to the state
(1, a, b) -> (2, (a/10)-1, 10*a+0, 48+(a/10))
(1, a, b) -> (2, 0, 10*a+1, 48+(a%10))
(1, a, b) -> (2, 0, 10*a+2, 10)

# Filtering is done, output the result
(2, a, b, c) -> (0, 1, b, c)
$  ./tuples.rb examples/loop.txt
1
2
3
4
5
6
7
8
9
10

Minimalistic Loop

That's the version that uses no operations except +1 and -1.

# Initial state
-> (1, 1, 19)

# Generate data
(1, a, b) -> (1, a+1, b-1)
# Send it to digit splitter
(1, a, b) -> (2, a, a, 0, 0)

# Split digits, send to r<10 filter
(2, i, a, d, r) -> (2, i, a-1, d, r+1)
(2, i, a, d, 10) -> (2, i, a, d+1, 0)
(2, i, 0, d, r) -> (3, i, d, r, r, 9)

# m<10 filter
(3, i, d, r, ra, rb) -> (3, i, d, r, ra-1, rb-1)
(3, i, d, r, 0, rb) -> (4, i, 0, 0, d, r)

# OK, we split the number and we're ready to print
# Multiply i by 3 to get final position, then send to three printers
(4, i, ia, 0, r, m) -> (4, i-1, ia, 3, r, m)
(4, i, ia, ib, r, m) -> (4, i, ia+1, ib-1, r, m)
(4, 0, ia, 0, r, m) -> (10, ia, r, r-1)
(4, 0, ia, 0, r, m) -> (20, ia+1, m)
(4, 0, ia, 0, r, m) -> (30, ia+1)

# First digit printer, needs additional check arguments to filter out zeroes
(10, i, d, z) -> (11, i, d, 48)
(11, i, a, b) -> (11, i, a+1, b-1)
(11, i, a, 0) -> (0, 1, i, a)

# Second digit printer
(20, i, d) -> (21, i, d, 48)
(21, i, a, b) -> (21, i, a+1, b-1)
(21, i, a, 0) -> (0, 1, i, a)

# Newline printer
(30, i) -> (0, 1, i+1, 10)

FizzBuzz

# Initial state
-> (1, 1, 99)

# Generate data
(1, a, b) -> (1, a+1, b-1)

# Print all "\n"s
(1, i, _) -> (0, 1, 1000*i + 100, 10)

# Decide which print function to call
(1, i, a) -> (20, (i%3)-1, (i%5)-1, i)
(1, i, a) -> (21, -(i%3), (i%5)-1, i)
(1, i, a) -> (22, (i%3)-1, -(i%5), i)
(1, i, a) -> (23, -(i%3), -(i%5), i)

# Print number
(20, a, b, i) -> (11, 1000*i + 99, i)

# Print Fizz
(21, a, b, i) -> (0, 1, 1000*i + 0, 70)
(21, a, b, i) -> (0, 1, 1000*i + 1, 105)
(21, a, b, i) -> (0, 1, 1000*i + 2, 122)
(21, a, b, i) -> (0, 1, 1000*i + 3, 122)

# Print Buzz
(22, a, b, i) -> (0, 1, 1000*i + 0, 66)
(22, a, b, i) -> (0, 1, 1000*i + 1, 117)
(22, a, b, i) -> (0, 1, 1000*i + 2, 122)
(22, a, b, i) -> (0, 1, 1000*i + 3, 122)

# Print FizzBuzz
(23, a, b, i) -> (0, 1, 1000*i + 0, 70)
(23, a, b, i) -> (0, 1, 1000*i + 1, 105)
(23, a, b, i) -> (0, 1, 1000*i + 2, 122)
(23, a, b, i) -> (0, 1, 1000*i + 3, 122)
(23, a, b, i) -> (0, 1, 1000*i + 5, 66)
(23, a, b, i) -> (0, 1, 1000*i + 6, 117)
(23, a, b, i) -> (0, 1, 1000*i + 7, 122)
(23, a, b, i) -> (0, 1, 1000*i + 8, 122)

# Convert string to number and output it
(11, i, n) -> (0, 1, i, 48+(n%10))
(11, i, n) -> (12, (n/10)-1, i-1, n/10)
(12, _, i, n) -> (11, i, n)

Minimalistic FizzBuzz

That's the version that uses no operations except +1 and -1.

# Initial state
# There's a lot of countdown counters here
-> (1, 0, 0, 9, 0, 0, 0, 100)

# cnt -= 1
(1, i1, i0, ic, fcnt, bcnt, pos, cnt) -> (2, i1, i0, ic, fcnt, bcnt, pos, 10, cnt-1)

# loop to do pos += 10
(2, i1, i0, ic, fcnt, bcnt, pos, posplus, cnt) -> (2, i1, i0, ic, fcnt, bcnt, pos+1, posplus-1, cnt)
(2, i1, i0, ic, fcnt, bcnt, pos, 0, cnt) -> (3, i1, i0, ic, fcnt, bcnt, pos, cnt)

# update i1, i0, and ic
# i0 - lowest digit
# i1 - highest digit
# ic + i0 == 9 always
(3, i1, i0, 0, fcnt, bcnt, pos, cnt) -> (4, i1+1, 0, 9, fcnt, bcnt, pos, cnt)
(3, i1, i0, ic, fcnt, bcnt, pos, cnt) -> (4, i1, i0+1, ic-1, fcnt, bcnt, pos, cnt)

# update fizz countdown
(4, i1, i0, ic, fcnt, bcnt, pos, cnt) -> (5, i1, i0, ic, fcnt-1, bcnt, pos, cnt)
(4, i1, i0, ic, 0, bcnt, pos, cnt) -> (5, i1, i0, ic, 2, bcnt, pos, cnt)

# update buzz countdown
(5, i1, i0, ic, fcnt, bcnt, pos, cnt) -> (6, i1, i0, ic, fcnt, bcnt-1, pos, cnt)
(5, i1, i0, ic, fcnt, 0, pos, cnt) -> (6, i1, i0, ic, fcnt, 4, pos, cnt)

# decide what to print based on counters
(6, i1, i0, ic, 0, bcnt, pos, cnt) -> (10, pos)
(6, i1, i0, ic, fcnt, 0, pos, cnt) -> (11, pos)
(6, i1, i0, ic, fcnt, bcnt, pos, cnt) -> (12, fcnt-1, bcnt-1, i1, i0, pos)
(6, i1, i0, ic, fcnt, bcnt, pos, cnt) -> (13, pos)

# continue the loop
(6, i1, i0, ic, fcnt, bcnt, pos, cnt) -> (1, i1, i0, ic, fcnt, bcnt, pos, cnt)

# print fizz
(10, pos) -> (20, pos, 0, 70)
(10, pos) -> (20, pos, 1, 105)
(10, pos) -> (20, pos, 2, 122)
(10, pos) -> (20, pos, 3, 122)

# print buzz
(11, pos) -> (20, pos, 4, 66)
(11, pos) -> (20, pos, 5, 117)
(11, pos) -> (20, pos, 6, 122)
(11, pos) -> (20, pos, 7, 122)

# print number
(12, fcheck, bcheck, i1, i0, pos) -> (14, i1-1, pos, i1)
(12, fcheck, bcheck, i1, i0, pos) -> (14, 0, pos+1, i0)

# print newline
(13, pos) -> (20, pos, 8, 10)

# digit print check
(14, dcheck, pos, c) -> (15, pos, c, 48)

# convert digit to ASCII
(15, pos, c, cplus) -> (15, pos, c+1, cplus-1)
(15, pos, c, 0) -> (0, 1, pos, c)

# print character at specific offset
(20, pos, posplus, c) -> (20, pos+1, posplus-1, c)
(20, pos, 0, c) -> (0, 1, pos, c)

Fibonacci

# Initial state
-> (10, 1, 1, 1, 99)

# Generate data
(10, i, a, b, cnt) -> (10, i+1, b, a+b, cnt-1)

# print "fib(",  ")=", and "\n" parts
(10, i, a, b, cnt) -> (0, 1, 1000*i + 0, 102)
(10, i, a, b, cnt) -> (0, 1, 1000*i + 1, 105)
(10, i, a, b, cnt) -> (0, 1, 1000*i + 2, 98)
(10, i, a, b, cnt) -> (0, 1, 1000*i + 3, 40)
(10, i, a, b, cnt) -> (0, 1, 1000*i + 200, 41)
(10, i, a, b, cnt) -> (0, 1, 1000*i + 201, 61)
(10, i, a, b, cnt) -> (0, 1, 1000*i + 401, 10)

# Send i and fib(i) to conversion function to output at the right places
(10, i, a, b, cnt) -> (11, 1000*i + 199, i)
(10, i, a, b, cnt) -> (11, 1000*i + 399, a)

# Convert string to number and output it
(11, i, n) -> (0, 1, i, 48+(n%10))
(11, i, n) -> (12, (n/10)-1, i-1, n/10)
(12, _, i, n) -> (11, i, n)

Print File

OK, let's get to the new functionality.

Here's Tuples program for printing a single file. As you can see for esoteric language, it can be quite expressive:

# 0.2.0.*.* is contents of a file, just send them through
(0, 2, 0, i, b) -> (0, 1, i, b)
$ ./tuples.rb examples/print_file.txt spec/data/a.txt
One
Two
Three

Dice

The RNG system is very easy to use, we just generate some requests at start, and print in the right place when we get a response:

# Request some dice rolls
-> (0, 4, 1, 1, 6)
-> (0, 4, 2, 1, 6)
-> (0, 4, 3, 1, 6)
-> (0, 4, 4, 1, 6)

# Print the results
(0, 4, i, _, _, n) -> (0, 1, 2*i, 48+n)
(0, 4, i, _, _, n) -> (0, 1, 2*i+1, 10)
$ ./tuples.rb examples/dice.txt
6
1
4
2
$ ./tuples.rb examples/dice.txt
5
3
6
1

Yes

It's equivalent of yes Unix program that prints infinite y. We cannot actually generate infinite ys at once, we need to generate just some defined chunk (in this case one line), and call flush. Then in respone to flush succeeding, do the next chunk (in this case next line etc).

# Initial state
-> (1, 0)

# Print "y\n" and request callback
(1, i) -> (0, 1, i*2, 121)
(1, i) -> (0, 1, i*2+1, 10)
(1, i) -> (0, 3, i*2+1)

# If printed, get next state
(0, 3, ii, 0) -> (1, (ii/2) + 1)
$  ./tuples.rb examples/yes.txt | head
y
y
y
y
y
y
y
y
y
y

Echo

Echo program gets whatever's on STDIN, and echos it.

# Request first character
-> (0, 0, 0)

# Stop if EOF
(0, 0, pos, char) -> (1, pos, char, char-1)

# On character received, print it and request another
# (no flushing, just let it flush on exit)
(1, pos, char, _) -> (0, 1, pos, char)
(1, pos, char, _) -> (0, 0, pos+1)
$ ./tuples.rb examples/echo.txt
Żółw likes 🍨
Żółw likes 🍨

Print multiple files

Printing one file was easy, printing multiple files is more complicated, as we need to know where one file starts and another ends.

We could also just assume all files are smaller than let's say 1GB, and just offset each by 1_000_000_000 and that would totally work, but here's doing it the hard way:

# 2.i.size are file sizes
(0, 2, i, size, 0) -> (2, i, size)

# 3.i.size is total size of all files preceeding i
# check flag is == j
-> (3, 0, 0)
(3, i, presize), (2, j, size) -> (3, i+1, presize+size, -(i-j)**2)
(3, i, size, _) -> (3, i, size)

# output contents of each file at the right offset
# (we can safely add 0s to the output at duplicate position
# they're ignored by the output algorithm)
# check flag id1 == id2
(0, 2, id1, i, b), (3, id2, size) -> (4, size+i, b, -(id1-id2)**2)
(4, pos, b, _) -> (0, 1, pos, b)
$  ./tuples.rb examples/print_files.txt spec/data/{a,b,c}.txt
One
Two
Three
Four
Five
Six
Seven
Eight

Infinite Loop

Infinite loop is a mix of "yes" program and "loop" program.

# Initial state
-> (1, 0)

# Print number and "\n" and request flush
(1, i) -> (0, 1, 10*i+9, 10)
(1, i) -> (2, i, 10*i+8)
(1, i) -> (0, 3, 10*i+9)

# On flush, do next number
(0, 3, ii, 0) -> (1, (ii/10) + 1)

# (2, num, pos) prints number with last digit at pos
(2, num, pos) -> (0, 1, pos, 48+(num % 10))
(2, num, pos) -> (3, (num/10), pos-1, (num/10)-1)
(3, num, pos, _) -> (2, num, pos)

Hello, Name!

Nothing to complicated about it, except we want to keep going until we find either End-Of-File (0) or newline character (10).

# Request first character
-> (0, 0, 0)

# Print "Hello, "
-> (0, 1, 0, 72)
-> (0, 1, 1, 101)
-> (0, 1, 2, 108)
-> (0, 1, 3, 108)
-> (0, 1, 4, 111)
-> (0, 1, 5, 44)
-> (0, 1, 6, 32)

# Stop if newline or EOF
(0, 0, i, b) -> (1, i, b, b-1, ((b-10)**2)-1)
(0, 0, i, 0) -> (2, i)
(0, 0, i, 10) -> (2, i)

# On character received, print it and request another
# (no flushing, just let it flush on exit)
(1, pos, char, _, _) -> (0, 1, 10+pos, char)
(1, pos, char, _, _) -> (0, 0, pos+1)

# Print final parts
(2, i) -> (0, 1, 10+i, 33)
(2, i) -> (0, 1, 11+i, 10)
$ ./tuples.rb examples/hello_name.txt
Eve
Hello, Eve!

Count Lines

This program counts lines in a passed file:

# how many lines before N
-> (1, 0, 0)

# process file character by character
(0, 2, 0, pos1, 10), (1, pos2, cnt) -> (2, pos1+1, cnt+1, -(pos1-pos2)**2, 0, 0)
(0, 2, 0, pos1, c), (1, pos2, cnt) -> (2, pos1+1, cnt, -(pos1-pos2)**2, c-1, ((c-10)**2)-1)
(0, 2, 0, pos1, 0), (1, pos2, cnt) -> (3, cnt, -(pos1-pos2)**2)

# continue loop
(2, pos, cnt, _, _, _) -> (1, pos, cnt)

# print result
(3, cnt, _) -> (4, cnt, 100)
# always print "\n"
-> (0, 1, 101, 10)

(4, cnt, outpos) -> (0, 1, outpos, 48 + (cnt % 10))
(4, cnt, outpos) -> (4, (cnt/10), outpos-1, (cnt/10)-1)
(4, cnt, outpos, _) -> (4, cnt, outpos)
$ ./tuples.rb examples/count_lines.txt spec/data/a.txt
3

Random Word

I started writing Wordle program, but I didn't finish it, as this post was already far too long. But the first part is picking up random file from the Wordle dictionary.

Because we know every line is 6 characters (5 letters + newline), we can really simplify this. Random line of unknown length would be a lot more complicated.

# we know every line is 5 characters + newline, that makes it a lot easier
# as we can just start at 6*Nth character

# 10.x is count of words in the dictionary
(0, 2, 0, i, 0) -> (10, i/6)

# roll random index
# 11.x is character position of random word (6*random number)
(10, count) -> (0, 4, 0, 0, count-1)
(0, 4, 0, _, _, index) -> (11, index*6)

# 12.i.b is the random word, i-ith letter
(11, pos1), (0, 2, 0, pos2, b) -> (12, 0, b, -(pos1-pos2)**2)
(11, pos1), (0, 2, 0, pos2, b) -> (12, 1, b, -(pos1-pos2+1)**2)
(11, pos1), (0, 2, 0, pos2, b) -> (12, 2, b, -(pos1-pos2+2)**2)
(11, pos1), (0, 2, 0, pos2, b) -> (12, 3, b, -(pos1-pos2+3)**2)
(11, pos1), (0, 2, 0, pos2, b) -> (12, 4, b, -(pos1-pos2+4)**2)
(12, i, b, _) -> (12, i, b)

# print the random word
(12, i, b) -> (0, 1, i, b)
-> (0, 1, 5, 10)
$ ./tuples.rb examples/random_word.txt spec/data/wordle-answers-alphabetical.txt
sweep
$ ./tuples.rb examples/random_word.txt spec/data/wordle-answers-alphabetical.txt
honey
$ ./tuples.rb examples/random_word.txt spec/data/wordle-answers-alphabetical.txt
rumor

Sum

And finally a program that finds all the numbers in a file and sums them up.

# 10.i.b - at position i, there's digit b
(0, 2, 0, i, b) -> (10, i, b-48, -(48-b)*(57-b))
(10, i, b, _) -> (10, i, b)

# 11.i - at position i, there's a non-digit, non-EOF
(0, 2, 0, i, b) -> (11, i, (47-b)*(58-b), b-1)
(11, i, _, _) -> (11, i)

# 12.i - at position i, there's EOF
(0, 2, 0, i, 0) -> (12, i)

# 13.i.v - partial numbers coming into i
# if number ended at i, it would be v
# so "x123yz" would result in [0, 0, 1, 12, 123, 0]
-> (13, 0, 0)
(11, i) -> (13, i+1, 0)
(13, i, v), (10, j, b) -> (13, i+1, 10*v+b, -(i-j)**2)
(13, i, v, _) -> (13, i, v)

# 14.i.v is v if number ends just before i, or 0 otherwise
(13, i, v), (11, j) -> (14, i, v, -(i-j)**2)
(13, i, v), (12, j) -> (14, i, v, -(i-j)**2)
(10, i, _) -> (14, i, 0)
(14, i, v, _) -> (14, i, v)

# 15.i.v is sum of all numbers before i
-> (15, 0, 0)
(15, i, v), (14, j, u) -> (15, i+1, u+v, -(i-j)**2)
(15, i, v, _) -> (15, i, v)

# 16.v is the final sum
# (there's extra +1 as sums shifted)
(15, i, v), (12, j) -> (16, v, -(i-j-1)**2)
(16, v, _) -> (16, v)

# send to printer, also add "\n"
(16, v) -> (20, v, 100)
-> (0, 1, 101, 10)

# (20, num, pos) prints number with last digit at pos
(20, num, pos) -> (0, 1, pos, 48+(num % 10))
(20, num, pos) -> (21, (num/10), pos-1, (num/10)-1)
(21, num, pos, _) -> (20, num, pos)

Let's try it on some files:

$ cat spec/data/numbers.txt
15 38 9 007
$ ./tuples.rb examples/sum.txt spec/data/numbers.txt
69
$ cat spec/data/budget.txt
Food $200
Data $150
Rent $800
Candles $3600
Utility $150
$ ./tuples.rb examples/sum.txt spec/data/budget.txt
4900

Runner

OK, that's enough Tuples, let's see how it was implemented:

  • tuples.rb - main program
  • tuples_parser.rb - Tuples parser
  • tuples_program.rb - Tuples program runner
  • spec - integration tests
  • spec/data/* - sample data files
  • spec/integration/*.txt - expected output

There's nothing interesting about specs, you can check the sources if you want to.

Main Program

The main program has very few responsibilities, it just loads some dependencies, handles ARGV, and loads files:

#!/usr/bin/env ruby

require "ostruct"
require "pathname"
require "pry"
require "set"
require_relative "tuples_parser"
require_relative "tuples_program"

class TuplesRunner
  def initialize(program_path, *file_paths)
    @program = TuplesParser.new(program_path).call
    @files = file_paths.map { |file_path| Pathname(file_path).read }
  end

  def call
    @program.reset
    @files.each_with_index do |file, i|
      @program.load_file file.codepoints, i
    end
    @program.call
  end
end

Signal.trap("PIPE", "EXIT")

unless ARGV.size >= 1
  STDERR.puts "Usage: #{$0} <input.txt> [<additional files> ...]"
  exit 1
end

TuplesRunner.new(*ARGV).call

TuplesParser

The parser is half-implemented. Each line is parsed separately, with any empty, comment lines, or lines not containing -> ignored. The left side of -> is fully parsed. The right side of -> just wrapped in a block and passed to Ruby eval.

This means the languge is not fully defined, but the intention is that you should only use integer arithmetic expressions (+, -, *, /, %, **, (), <<, >>, etc.), and no logic like ?:/or/and.

It's a decent way to prototype the language, get the big picture right, and leave the uninteresting details for later.

class TuplesParser
  def initialize(path)
    @path = Pathname(path)
  end

  def parse_error(fragment)
    raise "Parse error in #{@path}:#{@lineno}: #{fragment}"
  end

  def parse_left_part(fragment)
    fragment.split(/\s*,\s*/).map do |frag|
      case frag
      when /\A\d+\z/
        frag.to_i
      when "_"
        nil
      else
        frag.to_sym
      end
    end
  end

  def parse_left(text)
    return [] if text.empty?
    parse_error text unless text =~ /\A\((.*)\)\z/
    parts = $1.split(/\)\s*,\s*\(/)
    parts.map { |part| parse_left_part(part) }
  end

  def parse_right(text)
    parse_error text unless text =~ /\A\((.*)\)\z/
    eval "proc{ [#{$1}] }", nil, @path.to_s, @lineno
  end

  def call
    rules0 = []
    rules1 = []
    rules2 = []
    @lineno = 0
    @path.each_line do |line|
      @lineno += 1
      next if line =~ /\A\s*#/
      next unless line =~ /(.*)->(.*)/
      right = parse_right($2.strip)
      left = parse_left($1.strip)
      case left.size
      when 0
        rules0 << right
      when 1
        rules1 << [left[0], right]
      when 2
        rules2 << [left[0], left[1], right]
      else
        parse_error "Only rules with 0-2 dependencies allowed: #{text}"
      end
    end
    TuplesProgram.new(rules0, rules1, rules2)
  end
end

TuplesProgram

And finally the runner. For rules with 0 or 1 dependencies it's actually reasonably performant, for rules with 2 dependencies it's ridiculously slow.

If you run the program with DEBUG=1, you'll see a lot of debug messages, so you can debug your Tuples programs this way.

class TuplesProgram
  def initialize(rules0, rules1, rules2)
    @rules0 = rules0
    @rules1 = rules1
    @rules2 = rules2
  end

  def debug
    STDERR.puts yield if ENV["DEBUG"]
  end

  def add(state)
    return if @states.include?(state)
    return if state.any?(&:negative?)
    debug{ "Adding #{state.inspect}" }
    @states << state
    @queue << state
  end

  def process(state)
    @rules1.each do |pattern, block|
      process_rule state, pattern, block
    end
  end

  def process_rule(state, pattern, block)
    return unless pattern.size == state.size
    args = OpenStruct.new
    pattern.size.times do |i|
      # _ => nil => ignore
      next unless pattern[i]
      # symbol => pass to block
      if pattern[i].is_a?(Symbol)
        args[pattern[i]] = state[i]
      # number => check for equality
      elsif pattern[i] != state[i]
        return
      end
    end
    add args.instance_eval(&block)
  end

  # This is very slow
  def process_rule2s
    @rules2.each do |pattern1, pattern2, block|
      part1s = []
      part2s = []
      @states.each do |state|
        next unless state.size == pattern1.size
        h = process_partial_rule(pattern1, state)
        part1s << h if h
      end
      @states.each do |state|
        next unless state.size == pattern2.size
        h = process_partial_rule(pattern2, state)
        part2s << h if h
      end
      debug{ "Rule2: #{part1s.inspect} v #{part2s.inspect}" }
      part1s.each do |part1|
        part2s.each do |part2|
          arg = OpenStruct.new(part1.merge(part2))
          add arg.instance_eval(&block)
        end
      end
    end
  end

  def process_partial_rule(pattern, state)
    args = {}
    pattern.zip(state).each do |patterni, statei|
      next unless patterni
      if patterni.is_a?(Symbol)
        args[patterni] = statei
      elsif patterni != statei
        return
      end
    end
    args
  end

  def load_file(data, file_index)
    data.each_with_index do |c, i|
      add [0, 2, file_index, i, c]
    end
    add [0, 2, file_index, data.size, 0]
  end

  def reset
    @queue = []
    @states = Set[]
    @requests_done = Set[]
    @input_done = -1
    @output_done = -1
    @input = []
    @no_more_input = false
    @rules0.each do |block|
      add block.call
    end
  end

  def process_requests
    active = false
    input_request = @input_done
    output_request = @output_done
    @states.select{|s| s[0] == 0}.each do |state|
      next if @requests_done.include?(state)
      if state.size == 3 and state[1] == 0
        input_request = [state[2], input_request].max
      elsif state.size == 3 and state[1] == 3
        output_request = [state[2], output_request].max
        add [0, 3, state[2], 0]
      elsif state.size == 5 and state[1] == 4
        # Can just process this right away
        add [*state, rand(state[3]..state[4])]
      else
        next
      end
      active = true
      @requests_done << state
    end
    process_output output_request if output_request != @output_done
    process_input input_request if input_request != @input_done
    active
  end

  def process_input(input_request)
    debug{ "Processing input request up to #{input_request}" }
    while input_request >= @input.size or @no_more_input
      line = STDIN.gets
      if line
        debug{ "Adding #{line.codepoints}" }
        @input.push *line.codepoints
      else
        debug{ "Adding 0 for EOF" }
        @no_more_input = true
        @input.push 0
        break
      end
    end
    @input.each_with_index do |c, i|
      add [0, 0, i, c]
    end
    @input_done = input_request
  end

  def process_output(output_request)
    output_todo = @states.filter{|s|
      s.size == 4 and s[0] == 0 and s[1] == 1 and s[2] > @output_done and s[3] != 0
    }
    if output_request
      output = output_todo.filter{|s| s[2] <= output_request}.sort.map(&:last)
      debug{ "Processing output request #{@output_done+1} to #{output_request}: #{output.inspect}" }
      @output_done = output_request
    else
      output = output_todo.sort.map(&:last)
    end
    print output.pack("U*")
  end

  def call
    loop do
      until @queue.empty?
        process @queue.pop until @queue.empty?
        process_rule2s
      end
      break unless process_requests
    end

    process_output(nil)

    if ENV["DEBUG"]
      @states.sort.each do |state|
        p state
      end
    end
  end
end

Future Work

There are quite a few limitations:

  • it is it is really slow, especially anything with 2-dependency rules
  • the parser is only half-implemented, what goes on the right side is not defined at all
  • it could be packaged in a proper gem

There are also some questions where the language could go:

  • Tuples language is quite friendly for libraries, so we could have a directive like use library to include libraries for let's say printing numbers etc.
  • We could extend pattern matching system to treat multiples of the same as equality so we could do (13, i, v), (10, i, b) -> (13, i+1, 10*v+b) instead of (13, i, v), (10, j, b) -> (13, i+1, 10*v+b, -(i-j)**2). This would only really make sesne if that brings significant performance improvements to the 2-dependency rules.
  • The whole 0.* interface feels quite messy, I'm not sure if it could be improved somehow

Code

All code examples for the series will be in this repository.

Code for the Extending Esoteric Language Tuples episode is available here.