Perl Weekly Challenge Part 8

Tags:

Perfect Numbers

So if nothing else the challenges are helping me learn new bits of number theory. For this we're looking for Perfect Numbers which are numbers where the sum of their proper divisors equals the number itself.

I'll admit my first attempt at this involved me reading that, not reading the rest of the wiki page and making a couple of simple functions :

sub proper-divisors ( Int $p ) is pure {
    (1..^$p).grep( $p %% * );   
}

sub is-perfect ( Int $p ) is pure {
    $p == [+] proper-divisors( $p );
}

Once you've got those is simply a case of checking numbers until you get 5 where is-perfect is True. Which technically works. But it takes a while. A very long while to get to number 4 (8128) and I then stopped the script and looked at the wiki page.

Time for a new plan because I'd be dead of old age before we hit the next value.

Once again I scanned the code and whipped up a nice little function.

sub perfect-from-prime( Int $p where *.is-prime ) is pure {
    return (2**($p-1)) * ((2**$p) -1 ); 
}

So with this in place the code to get the list of prefect numbers is now simple.

.say for (1..*).grep( *.is-prime ).map( { perfect-from-prime( $_ ) } )[^5];

... I'll break that down.

.say for # Call the .say method on each item in the given list 
(1..*). # An infinite range starting at one. This will be evaluated lazily
grep( *.is-prime ). # Only include prime numbers
map( { perfect-from-prime( $_ ) } ) # Get the perfect number for the prime
[^5] # Get the range of items from this list starting at 0 and up to but not including 5

Which is nice. But as the eagle eyed amoung you will note .... utterly wrong. Because perfect numbers are not generated from primes but from Mersine primes. Did I mention I'm not a mathematician.

Anyway once I worked that out the fix was simple enough. Add a new function called is-mersine. This is a simple enough thing. If you call it with a prime number ($p) where ((2**$p)-1).is-prime is true then it's True. For any other value it's False. So you could do this with a ternary operator (or just by returning $p.is-prime && ((2**$p)-1).is-prime ) but I felt like pushing the boat out...

multi sub is-mersine( Int $ ) is pure { False; }

multi sub is-mersine( Int $p where ((2**$p)-1).is-prime ) is pure { True; }

Because I love multi subs an abusing signatures. In this case I'm not checking for primeness because I'm nothing if not consistent as I know I'm going to only pass it primes... Of course here's the final code I submitted.

#!/usr/bin/env perl6

use v6;

sub perfect-from-prime( Int $p where { $p.is-prime && is-mersine( $p ) } ) is pure {
    return (2**($p-1)) * ((2**$p) -1 ); 
}

multi sub is-mersine( Int $ ) is pure { False; }

multi sub is-mersine( Int $p where ((2**$p)-1).is-prime ) is pure { True; }

#| Help
multi sub MAIN ( Bool :h($help) where *.so ) {
    say $*USAGE;
}

#| Give the first X perfect integers
multi sub MAIN ( 
    Int $x = 5 #= Number of perfect integers to return (default 5)
) { 
    .say for (0..*).grep( *.is-prime ).grep( { is-mersine( $_ ) } ).map( { perfect-from-prime($_) } )[^$x];
}

Note my wonderful consistency if is-mersine not checking for primality but perfect-from-prime is. I dunno, I guess it made sense to me at the time. Anyway this was fun, enough maths, time for some text munging.

Centering Text

So this challenge asks for a function to add spaces to a list of strings to center them. In order to do this you are going to need to know the length of the longest string. So you're going to need to loop over the list a couple of times. Once to get the length and a second to add the spaces.

Of course I found a weird way to do it :

sub center ( *@lines, :$len = max( @lines.map( *.codes ) ) ) {
    @lines.map( { "{ ' ' x ( ( $len - $_.codes ) div 2 )}$_" } );
}

So here I have a function that takes any number of positional arguments and put them in an array *@lines. I also accept an optional named parameter :$len that by default is set to the max value of getting the length of each item in @lines (yes, the one defined in the signature).

Then we use a map to create out new strings. At this point I use the upgraded string interpolation powers of Perl6 to just embed my spacing code in here ' ' x ( ( $len - $_.codes ) div 2) gives us the difference between the max length and the current string halved (rounded down) then we just use the x string repition operator.

Of course by itself that's a bit dull so lets wrap that in a MAIN block and then... run it like so cat ch-2.p6 | perl6 ch-2.p6 which gives us.

                        #!/usr/bin/perl6

                            #! Help
    multi sub MAIN ( :help($h) where *.so ) { say $*USAGE; }

            #| Reads a list of strings from STDIN 
                    #| Outputs them centered
                    multi sub MAIN () {
                .say for center( $*IN.lines );
                                }

sub center ( *@lines, :$len = max( @lines.map( *.codes ) ) ) {
    @lines.map( { "{ ' ' x ( ( $len - $_.codes ) div 2 )}$_" } );
                                }

Isn't it pretty? The centering is a bit off of course as some of the lines start with spaces. I could probably strip those out first. But this is neat. So lets go with it.

Perl Weekly Week 5 (Part 1)

Tags:

So this weeks challenge was all about Anagrams and was very interesting. Because both of the solutions were Anagram based I ended up making a library used by both (and in the case of the second test all 3 of my attempted solutions) and I'm going to cover that first.

Anagram Library

So as this is a simple script I don't plan to release as a module or anything I'm not going full out with a META6.json file just a lib folder in the same folder as the scripts and inside that a file called Anagrams.pm6. (Note Perl6 modules can have iether a .pm6 or a .p6 extension. I think the latter is now recommended... but I'm sometimes slow to catch up.)

Our module starts by declaring it's namespace. As everything in the file is considered part of the module I can just use a unit declaration to state that (if you want to have multiple namespaces in a file you need to use blocks for scoping).

unit package Anagrams;

Of course a module with just that in would be a bit pointless so let's have a simple function is-anagram-of that given two strings will return true if the first is an anagram of the second. Now I know I'm going to be doing this a lot and that are some cases where we can quickly see it's not going to be true. For instance a word is not an anagram of itself so if our two strings are the same return false :

multi sub is-anagram-of( Str \target, Str \word where * eq target ) is export is pure { 
    False; 
}

I'll go into this in a bit more detail as there's a lot here and once you get your head around it it's kind of awesome :

  • multi sub is-anagram-of : This is a multi sub which means there will be more than one code path for it with different arguments.
  • Str \target : I'm defining my variables as a immutable Strings. Immutability is important if you want to easily leverage threading.
  • where * eq target : So this where clause is True if the two strings are the same
  • is export : I want this function to be exported to the callers scope.
  • is pure : For any given pair of strings this function always returns the same result. Can help the VM to speed up the code.
  • { False; } : No need for a return. The last thing evaluated is the return value for the sub.

Another case that's going to turn up a lot is when the two strings are not the same length. Then there's now way they can be anagrams :

multi sub is-anagram-of( Str \target, Str \word where *.codes != target.codes ) is export is pure { 
    False; 
}

Here I use the .codes method to count unicode code points which is generally What you want to do TM.

With those two False cases out of the way I can now test my strings. It's at this point I want to have a small digression about English and theft.

Cafes, Faces and Cáfes

The concept of an Anagram can only have been invented by and English speaker. The English language is much like the English people, a bit light fingered when it comes to other cultures stuff. Historically we were really good at wandering to other peoples countries and going "Oh that looks nice I. It's ours now". And we do it to this day with other languages, English speakers see nothing wrong with appropriating words from other languages. (Except Americanisms, those are just wrong.)

But... technically we don't really use accents on letters. We've just got the 26 and get all confused about funny squiggles on top of letters or dangling off the bottom. And don't get me started on ß...

Anyway when I was testing my anagram code I tried looking for the anagram of cafe and got face. And then I tried looking for the anagram of face and got... nothing. Because cafe is actually spelt cáfe. Please note I can't event type á easily. As a native English speaker though I don't care about the accent, just the letters and I'd quite like my anagram checker to be the same, also it would be good if it wasn't case sensitive.

This first part is simple enough lc will give us the string in lowercase. But stripping of accents and other marks that's going to be a pain isn't it?

Itroduction samemark. Which allows you to match the accent information between two strings. And if the second string is shorter than the first then the last character is used for all matching. Which means :

samemark( lc( "Cáfe" ), "a" ) eq "cafe"

That's great and I've got a little function to do that (I call it normal).

sub normal ( Str \word ) is pure {
    samemark( lc( word ), "a" )
}

This function is only used internally so it's not exported.

Back to Anagrams

Ok so to test if a string is an anagram of another string if I take the string, split it into letters and sort the list and then compare the sorted lists they should be the same. As I'm doing that twice I'll make a function to do it :

sub order-string ( Str \word ) is export is pure {
    normal( word ).comb.sort( { $^a cmp $^b } ).join;
}

Note that we're using our normalised string. I'm also joining it again but it would be trivial to return a list and compare that... I just didn't think of it.

With that in place we've got our final is-anagram-of sub :

multi sub is-anagram-of( Str \target, Str \word ) is export is pure {
    normal( target ) ne normal( word ) && order-string( target ) ~~ order-string( word );
}

So you'll see that the first thing we do is compare the normalised strings are not the same. Otherwise cafe would be an anagram of cáfe. This isn't caught by our initial where clause. I thought about doing it there but I figured it's more of an edge case and I didn't want to normalise every string.

So if our two normal strings are different and the ordered versions are the same then they are anagrams. Awesome!

Putting it together

Now I've got this library the script (which weirdly I see I gave a .p6 extension, I'm nothing if not inconsistent).

Firstly some boilerplate :

#!/usr/bin/env perl6

use v6;
use lib 'lib';
use Anagrams;

This loads our Anagrams module and makes sure if we accidentally run this with perl5 it explodes asap.

Then I like to use Perl6's auto generation for usage text and the following gives us a help / h flag.

my %*SUB-MAIN-OPTS = :named-anywhere;

sub USAGE { say $*USAGE }

#| Display Help file
multi sub MAIN ( Bool :h($help) where *.so ) { USAGE(); }

The my %*SUB-MAIN-OPTS = :named-anywhere; allows you to mix and matched named and positional arguments on the command line. $*USAGE is an autogenerated string based on the signatures and comments on the MAIN functions. (#| being a comment tagged to the function below).

For more information why not check out the talk I gave at London Perl Mongers last year?

With that out of the way here's the meat of the script :

subset FilePath of Str where *.IO.f;

#| Find the anagrams for a given word
multi sub MAIN (
    Str $word, #= Word to check for. Case insensitive
    FilePath :$dict = "/etc/dictionaries-common/words" #= Dictionary file to use. Defaults to "/etc/dictionaries-common/words"
) {
    $dict.IO.words.grep( { is-anagram-of( $word, $_ ) } )>>.say;
}

FilePath is a simple subset to ensure that a given string is actually a file, but the main function is no quite simple. $dict.IO.words opens our files and reads string from it splitting on white space (in this case newlines) this is a Sequence that can be iterated over. grep applies our is-anagram-of function with out given word and keeps those that are then we use the hyper operator and .say to print out any that work.

Generally nice and simple. How hard could part 2 be?

Oh ho ho ho ho. Tune in later for Part two of the challenge and be ready to mock my child like ignorance.

Perl Weekly Week 5 (Part 2)

Tags:

Technical note

My blogging platform was really not happy with some of this post. I tihnk I'm going to have to go back to my earlier plan of building my own in Perl6. So I've cut up my first couple of failed attempts, one was kind of neat as it involved threading but I'm sure I'll revist it in the future. Meanwhile he's the final result.

Back to the drawing board

Ok. Brute force and ignorance, even with threads was not enough. Time to have a think about a better solution. Also... time to re-read the spec. I was trying to do too much. I already have a script that can find Anagrams for a given string, so why am I not doing what the spec says and follow the Unix paradigm? Write a script that does one thing well.

So I had a think. I needed to avoid trying to cross reference things, at this point I had an idea that was based on trees. The core idea goes like this, normalise and sort the string, then store it in a hash and keep count of numbers of times that node has been reached. Then I can keep running count of the biggest number and it's strings. Here's an example. Lets say we've just started and out first word is ant this goes into out tree which becomes :

{ 
 'a' => {
         'n' => {
                 't' => {
                         'count' => 1
                        }
                }
        }
}

So our counter is at 1 and the list of strings is ['ant']. Then we get act :

{ 
 'a' => {
         'c' => {
                 't' => {
                         'count' => 1
                        }
                }
         'n' => {
                 't' => {
                         'count' => 1
                        }
                }
        }
}

The counter stays at 1 and the list is now ['ant', 'act']. Now we get tan which becomes act and the tree is now

{ 
 'a' => {
         'c' => {
                 't' => {
                         'count' => 1
                        }
                }
         'n' => {
                 't' => {
                         'count' => 2
                        }
                }
        }
}

The counter is upped to 2 and the list is reset to ['ant'] and this continues. It's a nice simple algorithm and the code for it turns out like so :

#!/usr/bin/env perl6

use v6;
use lib 'lib';
use Anagrams;

my %*SUB-MAIN-OPTS = :named-anywhere;

sub USAGE { say $*USAGE }

subset FilePath of Str where *.IO.f;

#| Display Help file
multi sub MAIN ( Bool :h($help) where *.so ) { USAGE(); }

So here's our boilerplate again

#| Find the strings(s) which has the most anagrams in the given dictionary
#| Prints out 1 string per line to STDOUT
multi sub MAIN (
    FilePath :$dict = "/etc/dictionaries-common/words" #= Dictionary file to use. Defaults to "/etc/dictionaries-common/words"
) {
    my %store;
    my $max-length = 0;
    my @values;
    for $dict.IO.words -> $word {
        my $string = order-string( $word );
        my @keys = $string.comb;
        my $current = %store;
        for @keys -> $key {
            $current{$key} //= {};
            $current = $current{$key};
        }
        $current<count>++;

This part handles the spidering through the tree. At the end $current is a reference to the end node for the string. We increment the count value (creating it if it doesn't exists).

        given $current<count> {
            when * > $max-length {
                $max-length = $current<count>;
                @values = [$string];
            }
            when * == $max-length {
                @values.push($string);
            }
        }

Given is the Perl 6 version of a switch if our count is over the max length we up it and create a new output array. If it's the same as max length we add our new string to the output array.

    }
    @values>>.say;
}

Then once the loop is down we print out the results.

While the previous script took 3 hours and got no where this took 3 seconds. So I claim that as a win. Roll on next week.

Perl Weekly Week 4

Tags:

So this weeks Perl Challenge is coming down to the wire. This is mostly because it's the Easter Holidays and I've been visiting family. Still as I did propose one of the two challenges I probably should do them.

Pi to n digits

The first challenge was to calculate Pi to N digits. I had a plan to go to Wikipedia an look into how to calculate the digits of Pi. Then Laurent Rosenfeld posted about the Bailey–Borwein–Plouffe formula in the Perl6 user mailing list.

So that made life a bit easier. One thing I did want was to ensure I use FatRat's and I thought this would be a good time to create a constuctor for them giving us this:

use v6;

sub infix:<///> ( Int() $nu, Int() $de ) { FatRat.new( $nu, $de ) }

sub bbp-digit ( Int $k ) {
    my $k8 = $k*8;
    (1 /// (16 ** $k)) * ((4 /// ($k8 + 1)) - (2 /// ($k8 + 4)) - (1 /// ($k8 + 5)) - (1 /// ($k8+6)));
}

sub calc-pi( Int $num ) {
    my $p = [+] ( 0..$num ).map( &bbp-digit );
    return $p.Str.substr(0,$num+1);
}

#| Calculate PI to a given number of digits
sub MAIN (
    Int $digits #= Number of digits to calcuate
) {
    say calc-pi( $digits );
}

Word puzzle solver

So this was my idea for a challenge and it comes from my spending too much time playing random word games on my phone while on the train to work. Some of these games entail making words from a limited set of letters and when I get stumped I tend to dive into the /usr/share/dict folder and start grepping. I'd kept meaning to writing a Perl6 script to do it for me and suggested it to Mohammad.

As I'm writing this I've got 25 minutes to hand in my homework so I'll post this quickly and them maybe come back to it and explain it more later :

use v6;

subset FilePath of Str where *.IO.f;

#| Given a dictionary of words to check and a list of available letters
#| Print the words that can be made using just those letters
sub MAIN (
    FilePath $dict, #= Dictionary file to read
    *@letters #= List of letters allowed to be used
) {
    my $check-set = bag( @letters.map( *.lc ) );

    $dict.IO.words.race.grep( { bag( $_.lc.comb ) (<=) $check-set } )>>.say;
}

Using Bags and set operators plus the race command and it will check the 102,000 lines American English dictionary for a list of 6 letters in 3 seconds. That'll do.

Perl Weekly Week 3

Tags:

So I didn't blog last weeks challenge because it was a very busy week for me but this week I've got a bit of time so here's my solutions to this weeks challenges, in Perl6 for now.

Regular Numbers

I'm not a great mathematician, I've kind of got an instinct for a lot of it but formal math learning and me... have always been uneasy partners so I'd not heard of Regular Numbers before.

One nice thing about that article is it gives a nice algorithm which changes the problem to be "implement this algorithm" and that's something I can do. :)

So this break down to the following :

  1. Make a list with 1 element 1
  2. Take the first element from the list call it h.
  3. Add 2h, 3h, 5h to the list (There's a catch here).
  4. Return h
  5. When you want the next element return to 2

Ok that's nice and looking at it a thought came to mind, this seems a great time to use gather and take to make a sequence of Regular Numbers.

Here's iteration one :

my @h = gather {
    my @items = [1];
    loop {
        my $h = @items.shift;
        @items.push( 2*$h, 3*$h, 5*$h );
        take $h;
    }
}
say @h[0..10];

This.... did not work. It's got a few bugs. Firstly the VM doesn't tag it as a lazy Sequence so it runs the loop forever (Perl6 has for loops for iteration, while loops for testing a truth value and loop which can be used as an infinite loop or as a C style loop).

Fixing the first issue is easy add a lazy in front of the gather and we get some results. But they are a bit all over the place.

(1 2 3 5 4 6 10 6 9 15 10)

Ooooh. Of course when we take 2 from the list we append [4,6,10] to [3,5] and then when we take 3 we append [6,9,15] so on. Now at this point I thought of using a SetHash to ensure I had unique values. But then I'd have to worry about sorting the keys each time. Easier instead to just sort the list and remove unique values. On top of that this seems a good time to add a MAIN wrapper to find out how many numbers are required.

sub MAIN(Int() $count) {
    my @h = lazy gather {
        my @items = (1);
        loop {
            my $h = @items.shift;
            take $h;
            @items.push(2*$h,3*$h,5*h);
            @items = @items.unique.sort( { $^a <=> $^b } );
        }
    };
    say @h[0..$count-1]
}

That's nice and seems to do the trick. Which means it's time for the next challenge.

Pascal's Triangle

So Pascal's Triangle is pretty and I decided I wanted to get it nicely laid out like so:

                   1
                 1   1
               1   2   1
             1   3   3   1
           1   4   6   4   1
         1   5   10  10  5   1
       1   6   15  20  15  6   1
     1   7   21  35  35  21  7   1
   1   8   28  56  70  56  28  8   1
 1   9   36  84 126 126  84  36  9   1

So this challenge has two parts. Generating the numbers for the triangle and then laying it out. In this case I didn't read the article too much but had a neat idea. For a given row (eg [1,2,1] you can generate the next row by implementing doing the following):

# Here's our row
my @row = [1,2,1];
# Make a copy with a 0 at the start
my @row1 = [0, |@row]; # [0,1,2,1] (| breaks the array down so it doesn't get added as a single object in this case)
# Make a copy with a 0 at the end
my @row2 = [|@row, 0]; # [1,2,1,0]

# For each item in row1 add it to the same item in row2
my @row3 = [ 
    @row1[0] + @row2[0], # 0 + 1 = 1
    @row1[1] + @row2[1], # 1 + 2 = 3
    @row1[2] + @row2[2], # 2 + 1 = 3
    @row1[3] + @row2[3], # 1 + 0 = 1
];

So... that's a thing but it's a bit unwieldy if only there was some kind of way to simplify this. Oh yes, the Zip metaoperator Z. Place a Z between two lists and it will default to applying the , operator to each item in turn to make a new list of lists :

(0, 1, 2, 1) Z (1, 2, 1, 0) => ((0,1), (1,2), (2,1), (1,0))

But you can give Z and inline operator to apply list for instance + giving you :

    (0,1,2,1) Z+ (1,2,1,0) => (1, 3, 3, 1)

Which looks very nice. So if we just wanted our script to out lists of lists that make up Pascal's Triangle we could to this :

# Note the challenge says we need at least 3 lines so we catch that here.
multi sub MAIN( Int() $lines is copy where * > 2 ) {
    # Set up the initial state.
    # Decrement lines.
    my @row = (1);
    my @out = [ [1], ]; # Note we need a trailing comma or the array will be flattened
    $lines--;

    # Did I mention this earlier? Repeat / While is like the Perl5 do / while pair
    repeat { 
        @row = (0, |(@row)) Z+ (|(@row), 0); # Here's the Zip magic
        @out.push( @row.clone ); # We need to clone our row or will get clobbered in the next loop
        $lines--;
    } while ( $lines );

    @out.say;
}

And that works but... well I wanted it pretty. The simple thought would be to join each row with spaces and then join the lot with newlines but just doing that you end up with :

1
1 1
1 2 1
1 3 3 1
1 4 6 4 1
1 5 10 10 5 1
1 6 15 20 15 6 1
1 7 21 35 35 21 7 1
1 8 28 56 70 56 28 8 1
1 9 36 84 126 126 84 36 9 1

Which is.... a bit meh.

Of course you could make those lines up then find the length of the last line and left pad all the other by half that amount.

@out = @out.map( { $_.join(" ") } );

my $len = @out[*-1].codes;

@out.map( { ( " " x ( ($len - $_.codes) div 2) ) ~ "$_" } ).join("\n").say;

And that's a bit nicer :

             1
            1 1
           1 2 1
          1 3 3 1
         1 4 6 4 1
       1 5 10 10 5 1
     1 6 15 20 15 6 1
   1 7 21 35 35 21 7 1
  1 8 28 56 70 56 28 8 1
1 9 36 84 126 126 84 36 9 1

But it falls apart when larger numbers get added. So we need to pad our output based on the length of the biggers number. Here's what I came up with :

sub pad( Str $val, Int $len ) {
    my $diff = $len - $val.codes;
    my $rpad = " " x ( ( $diff div 2 ) + 1 );
    my $lpad = " " x ( ( $diff div 2 ) + ( $diff % 2 ) );
    return "{$lpad}{$val}{$rpad}";
}

Pass in the number to pad and the length of the largest number and it returns the number all laiud out nicely. With that here's my final code :

sub pad( Str $val, Int $len ) {
    my $diff = $len - $val.codes;
    my $rpad = " " x ( ( $diff div 2 ) + 1 );
    my $lpad = " " x ( ( $diff div 2 ) + ( $diff % 2 ) );
    return "{$lpad}{$val}{$rpad}";
}

multi sub MAIN( Int() $lines is copy where * > 2 ) {
    my @row = (1);
    my @out = [ [1], ];
    $lines--;
    my $max = 1;

    repeat {
        @row = (0, |(@row)) Z+ (|(@row), 0);
        @out.push( @row.clone );
        $max = @row.max;
        $lines--;
    } while ( $lines );

    my $len = $max.codes;

    @out = @out.map( -> @list { @list.map( { pad($_.Str,$len) } ).join("") } );

    $len = @out[*-1].codes;

    @out.map( { ( " " x ( ($len - $_.codes) div 2) ) ~ "$_" } ).join("\n").say;
}

multi sub MAIN($) {
    note "Please input a number of lines that must be at least 3";
}

And that seems to do the trick which is fun. (This is the code used to make the triangle at the top).