Simple thread safe caching in Raku

Tags:

So I posted yesterday about the weekyl challenge and mentioned caching and how I'd not done it because I liked the mathematical simplicity. Still Markus Holzer who's been known to tweet complete solutions to the challenges, suggested it and I thought I'd take a look.

The first thought is to use the is cached trait the problem is it's experimental and when I just added it to the square function it crashed. This is possibly because I only cached the function part that does a check and not the other part of the multi or possibly due to threading. Still it got me thinking about how I would want to do a thread safe cache implementation (without pulling in an external module like OO::Monitors ).

I took a look at the Lock class and for what I was thinking about (a cache hash that gets updated in a thread safe fashion) it should work. The updated square function now looks like this :

# Not caching, not need a no calculations involved.
multi sub square(Int $ where * <= 0) returns Bool { False }

# Note these are generated outside the sub routine so shared across threads.
my %square-cache;
my $lock = Lock.new;

multi sub square(Int \i) returns Bool {
  return %square-cache{i} if defined %square-cache{i};
  my \r = i.sqrt.narrow ~~ Int;
  $lock.protect( { %square-cache{i} = r } );
  return r;
}

Note I've updated the test using the narrow method again as per Markus's thoughts. But the code is pretty simple. Return the value from %square-cache if it exists, this isn't locked as it's a read and the worst comes to the worst you may have two threads try reading it at once, as they always write the same values it's not the worst. We do wrap the cache setting in a protect call to ensure we don't have multiple threads trying to update the cache at once.

I did try the code without the Lock and sometimes it works, sometimes it seg faults. Threading is hard y'all. Still I found with this in place there was a small but noticable speed increase for the 6 digit run. If I wasn't currently using a VM I'd try the 9 digit run again. On the bright side it doesn't seem to use a huge amount of memory. The 6 digite run making roughly 6000 cache keys.

Anyway, possibly more soon but I hope this was interesting.

Update

After I posted this Elizabeth Mattijsen shared it on Reddit with a note that accessing a Hash can cause a race condition (I figured it would be safe but I defer to Liz in such things as she's really smart).

Her updated code is even simpler, I'd put it here but my blogging platform is being weird, go checkout the link. The protect returns the value inside the block and we use //= to set the cache value, and as she also mentioned the return is superfluous and in these kind of cases I generally leave it out. So that's even cleaner and works fine. Awesome.

Thanks again Liz.