Perl Weekly Challenge 13

Tags:

So I know I've fallen off the blogging about the weekly challenge. This is not because I've not been doing (I have) or I've not been enoying them (I also have). It's because my blogging platform (a neat little Perl based system called Statocles) kind of died while parsing the Markdown on a previous blog post and this got me wound up.

It also got me to restart my personal plan to make a similar system for building simple, fast to update, static sites using Perl6. The first part of which (that is mostly so I can learn Grammars) is a Pure Perl6 Markdown parser.

It's taking a while but I'm hoping in my upcoming week off I'll get a lot more of it done.

Anyway I just read Laurent Rosenfield's post about this week and I wanted to share one of my two solutions becuase I thought it was kind of neat.

Hofstadter Female and Male sequences

So the Hofstadter Female and Male sequences are defined by the following equation :

F(0) = 1 ; M(0) = 0
F(n) = n - M(F(n-1)), n > 0
M(n) = n - F(M(n-1)), n > 0

So it's a pair of recursive functions. Where to calculate the value for n in either sequnces you need to calulate the ones before in both sequences. The problem with doing it as a pair of recursive functions is that without some kind of caching you're going to be calculating the same values over and over again (especailly) the smaller values of n. Laurent covered some nice ways to cache the data but when I had a look the word sequence jumped out at me. Hence my final code looks like this :

sub MAIN ( UInt() $n ) {
    my @M;
    my @F = lazy gather {
        my $n = 0;
        take 1;
        loop {
            $n++;
            take $n - @M[@F[$n-1]];
        }
    }

    @M = lazy gather {
        my $n = 0;
        take 0;
        loop {
            $n++;
            take $n - @F[@M[$n-1]];
        }
    };

    say "F : {@F[0..$n].join(", ")}";
    say "M : {@M[0..$n].join(", ")}";
}

Here I define a couple of lazy sequences that refer to one another. Note I have to predefine @M before I do @F or it raises and error but it's fine with my updating what @M is after the fact. I added some notes in the code to note before a take happened and each value is only taken once.

Later today I may run some speed tests of this code against the code Laurent wrote to see how performant it is.

An aside

While doing the first challenge for this week I found a bug in Perl6 where is you create a Range or Sequence of date objects and you give the first a custom formatter after the 28th of the month the formatted drops off. This bug was fixed as soon as I raised it in the Perl6 IRC and will hopefully be resolved in the next release.

Which is pretty cool.

Anyway, more later. Hopefully.

Perl Weekly Challenge Part 8

Tags:

Perfect Numbers

So if nothing else the challenges are helping me learn new bits of number theory. For this we're looking for Perfect Numbers which are numbers where the sum of their proper divisors equals the number itself.

I'll admit my first attempt at this involved me reading that, not reading the rest of the wiki page and making a couple of simple functions :

sub proper-divisors ( Int $p ) is pure {
    (1..^$p).grep( $p %% * );   
}

sub is-perfect ( Int $p ) is pure {
    $p == [+] proper-divisors( $p );
}

Once you've got those is simply a case of checking numbers until you get 5 where is-perfect is True. Which technically works. But it takes a while. A very long while to get to number 4 (8128) and I then stopped the script and looked at the wiki page.

Time for a new plan because I'd be dead of old age before we hit the next value.

Once again I scanned the code and whipped up a nice little function.

sub perfect-from-prime( Int $p where *.is-prime ) is pure {
    return (2**($p-1)) * ((2**$p) -1 ); 
}

So with this in place the code to get the list of prefect numbers is now simple.

.say for (1..*).grep( *.is-prime ).map( { perfect-from-prime( $_ ) } )[^5];

... I'll break that down.

.say for # Call the .say method on each item in the given list 
(1..*). # An infinite range starting at one. This will be evaluated lazily
grep( *.is-prime ). # Only include prime numbers
map( { perfect-from-prime( $_ ) } ) # Get the perfect number for the prime
[^5] # Get the range of items from this list starting at 0 and up to but not including 5

Which is nice. But as the eagle eyed amoung you will note .... utterly wrong. Because perfect numbers are not generated from primes but from Mersine primes. Did I mention I'm not a mathematician.

Anyway once I worked that out the fix was simple enough. Add a new function called is-mersine. This is a simple enough thing. If you call it with a prime number ($p) where ((2**$p)-1).is-prime is true then it's True. For any other value it's False. So you could do this with a ternary operator (or just by returning $p.is-prime && ((2**$p)-1).is-prime ) but I felt like pushing the boat out...

multi sub is-mersine( Int $ ) is pure { False; }

multi sub is-mersine( Int $p where ((2**$p)-1).is-prime ) is pure { True; }

Because I love multi subs an abusing signatures. In this case I'm not checking for primeness because I'm nothing if not consistent as I know I'm going to only pass it primes... Of course here's the final code I submitted.

#!/usr/bin/env perl6

use v6;

sub perfect-from-prime( Int $p where { $p.is-prime && is-mersine( $p ) } ) is pure {
    return (2**($p-1)) * ((2**$p) -1 ); 
}

multi sub is-mersine( Int $ ) is pure { False; }

multi sub is-mersine( Int $p where ((2**$p)-1).is-prime ) is pure { True; }

#| Help
multi sub MAIN ( Bool :h($help) where *.so ) {
    say $*USAGE;
}

#| Give the first X perfect integers
multi sub MAIN ( 
    Int $x = 5 #= Number of perfect integers to return (default 5)
) { 
    .say for (0..*).grep( *.is-prime ).grep( { is-mersine( $_ ) } ).map( { perfect-from-prime($_) } )[^$x];
}

Note my wonderful consistency if is-mersine not checking for primality but perfect-from-prime is. I dunno, I guess it made sense to me at the time. Anyway this was fun, enough maths, time for some text munging.

Centering Text

So this challenge asks for a function to add spaces to a list of strings to center them. In order to do this you are going to need to know the length of the longest string. So you're going to need to loop over the list a couple of times. Once to get the length and a second to add the spaces.

Of course I found a weird way to do it :

sub center ( *@lines, :$len = max( @lines.map( *.codes ) ) ) {
    @lines.map( { "{ ' ' x ( ( $len - $_.codes ) div 2 )}$_" } );
}

So here I have a function that takes any number of positional arguments and put them in an array *@lines. I also accept an optional named parameter :$len that by default is set to the max value of getting the length of each item in @lines (yes, the one defined in the signature).

Then we use a map to create out new strings. At this point I use the upgraded string interpolation powers of Perl6 to just embed my spacing code in here ' ' x ( ( $len - $_.codes ) div 2) gives us the difference between the max length and the current string halved (rounded down) then we just use the x string repition operator.

Of course by itself that's a bit dull so lets wrap that in a MAIN block and then... run it like so cat ch-2.p6 | perl6 ch-2.p6 which gives us.

                        #!/usr/bin/perl6

                            #! Help
    multi sub MAIN ( :help($h) where *.so ) { say $*USAGE; }

            #| Reads a list of strings from STDIN 
                    #| Outputs them centered
                    multi sub MAIN () {
                .say for center( $*IN.lines );
                                }

sub center ( *@lines, :$len = max( @lines.map( *.codes ) ) ) {
    @lines.map( { "{ ' ' x ( ( $len - $_.codes ) div 2 )}$_" } );
                                }

Isn't it pretty? The centering is a bit off of course as some of the lines start with spaces. I could probably strip those out first. But this is neat. So lets go with it.

Perl Weekly Week 5 (Part 2)

Tags:

Technical note

My blogging platform was really not happy with some of this post. I tihnk I'm going to have to go back to my earlier plan of building my own in Perl6. So I've cut up my first couple of failed attempts, one was kind of neat as it involved threading but I'm sure I'll revist it in the future. Meanwhile he's the final result.

Back to the drawing board

Ok. Brute force and ignorance, even with threads was not enough. Time to have a think about a better solution. Also... time to re-read the spec. I was trying to do too much. I already have a script that can find Anagrams for a given string, so why am I not doing what the spec says and follow the Unix paradigm? Write a script that does one thing well.

So I had a think. I needed to avoid trying to cross reference things, at this point I had an idea that was based on trees. The core idea goes like this, normalise and sort the string, then store it in a hash and keep count of numbers of times that node has been reached. Then I can keep running count of the biggest number and it's strings. Here's an example. Lets say we've just started and out first word is ant this goes into out tree which becomes :

{ 
 'a' => {
         'n' => {
                 't' => {
                         'count' => 1
                        }
                }
        }
}

So our counter is at 1 and the list of strings is ['ant']. Then we get act :

{ 
 'a' => {
         'c' => {
                 't' => {
                         'count' => 1
                        }
                }
         'n' => {
                 't' => {
                         'count' => 1
                        }
                }
        }
}

The counter stays at 1 and the list is now ['ant', 'act']. Now we get tan which becomes act and the tree is now

{ 
 'a' => {
         'c' => {
                 't' => {
                         'count' => 1
                        }
                }
         'n' => {
                 't' => {
                         'count' => 2
                        }
                }
        }
}

The counter is upped to 2 and the list is reset to ['ant'] and this continues. It's a nice simple algorithm and the code for it turns out like so :

#!/usr/bin/env perl6

use v6;
use lib 'lib';
use Anagrams;

my %*SUB-MAIN-OPTS = :named-anywhere;

sub USAGE { say $*USAGE }

subset FilePath of Str where *.IO.f;

#| Display Help file
multi sub MAIN ( Bool :h($help) where *.so ) { USAGE(); }

So here's our boilerplate again

#| Find the strings(s) which has the most anagrams in the given dictionary
#| Prints out 1 string per line to STDOUT
multi sub MAIN (
    FilePath :$dict = "/etc/dictionaries-common/words" #= Dictionary file to use. Defaults to "/etc/dictionaries-common/words"
) {
    my %store;
    my $max-length = 0;
    my @values;
    for $dict.IO.words -> $word {
        my $string = order-string( $word );
        my @keys = $string.comb;
        my $current = %store;
        for @keys -> $key {
            $current{$key} //= {};
            $current = $current{$key};
        }
        $current<count>++;

This part handles the spidering through the tree. At the end $current is a reference to the end node for the string. We increment the count value (creating it if it doesn't exists).

        given $current<count> {
            when * > $max-length {
                $max-length = $current<count>;
                @values = [$string];
            }
            when * == $max-length {
                @values.push($string);
            }
        }

Given is the Perl 6 version of a switch if our count is over the max length we up it and create a new output array. If it's the same as max length we add our new string to the output array.

    }
    @values>>.say;
}

Then once the loop is down we print out the results.

While the previous script took 3 hours and got no where this took 3 seconds. So I claim that as a win. Roll on next week.

Perl Weekly Week 5 (Part 1)

Tags:

So this weeks challenge was all about Anagrams and was very interesting. Because both of the solutions were Anagram based I ended up making a library used by both (and in the case of the second test all 3 of my attempted solutions) and I'm going to cover that first.

Anagram Library

So as this is a simple script I don't plan to release as a module or anything I'm not going full out with a META6.json file just a lib folder in the same folder as the scripts and inside that a file called Anagrams.pm6. (Note Perl6 modules can have iether a .pm6 or a .p6 extension. I think the latter is now recommended... but I'm sometimes slow to catch up.)

Our module starts by declaring it's namespace. As everything in the file is considered part of the module I can just use a unit declaration to state that (if you want to have multiple namespaces in a file you need to use blocks for scoping).

unit package Anagrams;

Of course a module with just that in would be a bit pointless so let's have a simple function is-anagram-of that given two strings will return true if the first is an anagram of the second. Now I know I'm going to be doing this a lot and that are some cases where we can quickly see it's not going to be true. For instance a word is not an anagram of itself so if our two strings are the same return false :

multi sub is-anagram-of( Str \target, Str \word where * eq target ) is export is pure { 
    False; 
}

I'll go into this in a bit more detail as there's a lot here and once you get your head around it it's kind of awesome :

  • multi sub is-anagram-of : This is a multi sub which means there will be more than one code path for it with different arguments.
  • Str \target : I'm defining my variables as a immutable Strings. Immutability is important if you want to easily leverage threading.
  • where * eq target : So this where clause is True if the two strings are the same
  • is export : I want this function to be exported to the callers scope.
  • is pure : For any given pair of strings this function always returns the same result. Can help the VM to speed up the code.
  • { False; } : No need for a return. The last thing evaluated is the return value for the sub.

Another case that's going to turn up a lot is when the two strings are not the same length. Then there's now way they can be anagrams :

multi sub is-anagram-of( Str \target, Str \word where *.codes != target.codes ) is export is pure { 
    False; 
}

Here I use the .codes method to count unicode code points which is generally What you want to do TM.

With those two False cases out of the way I can now test my strings. It's at this point I want to have a small digression about English and theft.

Cafes, Faces and Cáfes

The concept of an Anagram can only have been invented by and English speaker. The English language is much like the English people, a bit light fingered when it comes to other cultures stuff. Historically we were really good at wandering to other peoples countries and going "Oh that looks nice I. It's ours now". And we do it to this day with other languages, English speakers see nothing wrong with appropriating words from other languages. (Except Americanisms, those are just wrong.)

But... technically we don't really use accents on letters. We've just got the 26 and get all confused about funny squiggles on top of letters or dangling off the bottom. And don't get me started on ß...

Anyway when I was testing my anagram code I tried looking for the anagram of cafe and got face. And then I tried looking for the anagram of face and got... nothing. Because cafe is actually spelt cáfe. Please note I can't event type á easily. As a native English speaker though I don't care about the accent, just the letters and I'd quite like my anagram checker to be the same, also it would be good if it wasn't case sensitive.

This first part is simple enough lc will give us the string in lowercase. But stripping of accents and other marks that's going to be a pain isn't it?

Itroduction samemark. Which allows you to match the accent information between two strings. And if the second string is shorter than the first then the last character is used for all matching. Which means :

samemark( lc( "Cáfe" ), "a" ) eq "cafe"

That's great and I've got a little function to do that (I call it normal).

sub normal ( Str \word ) is pure {
    samemark( lc( word ), "a" )
}

This function is only used internally so it's not exported.

Back to Anagrams

Ok so to test if a string is an anagram of another string if I take the string, split it into letters and sort the list and then compare the sorted lists they should be the same. As I'm doing that twice I'll make a function to do it :

sub order-string ( Str \word ) is export is pure {
    normal( word ).comb.sort( { $^a cmp $^b } ).join;
}

Note that we're using our normalised string. I'm also joining it again but it would be trivial to return a list and compare that... I just didn't think of it.

With that in place we've got our final is-anagram-of sub :

multi sub is-anagram-of( Str \target, Str \word ) is export is pure {
    normal( target ) ne normal( word ) && order-string( target ) ~~ order-string( word );
}

So you'll see that the first thing we do is compare the normalised strings are not the same. Otherwise cafe would be an anagram of cáfe. This isn't caught by our initial where clause. I thought about doing it there but I figured it's more of an edge case and I didn't want to normalise every string.

So if our two normal strings are different and the ordered versions are the same then they are anagrams. Awesome!

Putting it together

Now I've got this library the script (which weirdly I see I gave a .p6 extension, I'm nothing if not inconsistent).

Firstly some boilerplate :

#!/usr/bin/env perl6

use v6;
use lib 'lib';
use Anagrams;

This loads our Anagrams module and makes sure if we accidentally run this with perl5 it explodes asap.

Then I like to use Perl6's auto generation for usage text and the following gives us a help / h flag.

my %*SUB-MAIN-OPTS = :named-anywhere;

sub USAGE { say $*USAGE }

#| Display Help file
multi sub MAIN ( Bool :h($help) where *.so ) { USAGE(); }

The my %*SUB-MAIN-OPTS = :named-anywhere; allows you to mix and matched named and positional arguments on the command line. $*USAGE is an autogenerated string based on the signatures and comments on the MAIN functions. (#| being a comment tagged to the function below).

For more information why not check out the talk I gave at London Perl Mongers last year?

With that out of the way here's the meat of the script :

subset FilePath of Str where *.IO.f;

#| Find the anagrams for a given word
multi sub MAIN (
    Str $word, #= Word to check for. Case insensitive
    FilePath :$dict = "/etc/dictionaries-common/words" #= Dictionary file to use. Defaults to "/etc/dictionaries-common/words"
) {
    $dict.IO.words.grep( { is-anagram-of( $word, $_ ) } )>>.say;
}

FilePath is a simple subset to ensure that a given string is actually a file, but the main function is no quite simple. $dict.IO.words opens our files and reads string from it splitting on white space (in this case newlines) this is a Sequence that can be iterated over. grep applies our is-anagram-of function with out given word and keeps those that are then we use the hyper operator and .say to print out any that work.

Generally nice and simple. How hard could part 2 be?

Oh ho ho ho ho. Tune in later for Part two of the challenge and be ready to mock my child like ignorance.

Perl Weekly Week 4

Tags:

So this weeks Perl Challenge is coming down to the wire. This is mostly because it's the Easter Holidays and I've been visiting family. Still as I did propose one of the two challenges I probably should do them.

Pi to n digits

The first challenge was to calculate Pi to N digits. I had a plan to go to Wikipedia an look into how to calculate the digits of Pi. Then Laurent Rosenfeld posted about the Bailey–Borwein–Plouffe formula in the Perl6 user mailing list.

So that made life a bit easier. One thing I did want was to ensure I use FatRat's and I thought this would be a good time to create a constuctor for them giving us this:

use v6;

sub infix:<///> ( Int() $nu, Int() $de ) { FatRat.new( $nu, $de ) }

sub bbp-digit ( Int $k ) {
    my $k8 = $k*8;
    (1 /// (16 ** $k)) * ((4 /// ($k8 + 1)) - (2 /// ($k8 + 4)) - (1 /// ($k8 + 5)) - (1 /// ($k8+6)));
}

sub calc-pi( Int $num ) {
    my $p = [+] ( 0..$num ).map( &bbp-digit );
    return $p.Str.substr(0,$num+1);
}

#| Calculate PI to a given number of digits
sub MAIN (
    Int $digits #= Number of digits to calcuate
) {
    say calc-pi( $digits );
}

Word puzzle solver

So this was my idea for a challenge and it comes from my spending too much time playing random word games on my phone while on the train to work. Some of these games entail making words from a limited set of letters and when I get stumped I tend to dive into the /usr/share/dict folder and start grepping. I'd kept meaning to writing a Perl6 script to do it for me and suggested it to Mohammad.

As I'm writing this I've got 25 minutes to hand in my homework so I'll post this quickly and them maybe come back to it and explain it more later :

use v6;

subset FilePath of Str where *.IO.f;

#| Given a dictionary of words to check and a list of available letters
#| Print the words that can be made using just those letters
sub MAIN (
    FilePath $dict, #= Dictionary file to read
    *@letters #= List of letters allowed to be used
) {
    my $check-set = bag( @letters.map( *.lc ) );

    $dict.IO.words.race.grep( { bag( $_.lc.comb ) (<=) $check-set } )>>.say;
}

Using Bags and set operators plus the race command and it will check the 102,000 lines American English dictionary for a list of 6 letters in 3 seconds. That'll do.