Loading ...
Sorry, an error occurred while loading the content.

Re: [PBML] Alphabetize a word

Expand Messages
  • Jenda Krynicky
    - OK then. - Because then it s all backwards. - Why is that? - Please don t top-post! ... Declare the %bucket hash. It s not initialized to anything, but will
    Message 1 of 15 , Sep 8, 2007
    • 0 Attachment
      - OK then.
      - Because then it's all backwards.
      - Why is that?
      - Please don't top-post!

      Ryan J Nauman <RJNauman@...> wrote:
      > Ah, excellent. However, allow my noobness to kick in. Alright so you
      > want me to create a hash that contains the entire list of words (values)
      > associated with their alphabetized version (keys)? Should I do this as
      > soon as I read in the input file? And should I continue with the line of
      > code "my @WORDS = <INFILE>;" and create the bucket directly after?
      > Now for the "dumping each bucket" portion I don't follow. Assuming I am
      > following correctly on the first part, you're sorting the hash we made in
      > the first step correct? I don't understand the print line. And is this
      > used solely for sorting our newly created hash and then using my
      > conditions against this new word list?
      > Once the "bucket" hash is created would it be beneficial to my programs
      > memory/execution time to close the INFILE and is there a way to delete the
      > original @WORDS array now that the improved hash is in place?
      > Thanks,
      > Ryan

      OK. Let me try to dissect the merlyn's code for you:

      > merlyn@... wrote:
      > Ooof, yeah. You're doing an exponential matching O(n squared)
      > instead of just stashing everything according to its anagram.
      > You need to use a "bucket" approach. As you compute each anagram,
      > dump the original word into a bucket keyed by anagram, using a hash
      > of arrayrefs...
      > my %bucket;

      Declare the %bucket hash. It's not initialized to anything, but will
      be used as a HashOfArrays.

      > for my $word (@words) {
      > my $alphaword = alphabetize($word);

      Let's sort the letters in the word

      > push @{$bucket{$alphaword}}, $word;

      And now let's the word at the the end of the array referenced by the
      value of the hash %bucket using the key $alphaword.

      When storing the first word containing a set of letters the
      $bucket{$alphaword} doesn't exist in the hash, but that doesn't
      matter. Perl will create it for us and as we are aparently expecting
      the value to be a reference to an array ( @$ref is the array pointed
      to by the reference stored in $ref) Perl even creates a brand new
      array for us and stores its reference in $bucket{$alphaword}.

      > }

      Once this loop completes all words in the @words array have been
      processed and are stored in the HashOfArrays named %bucket. They keys
      are the sorted sets of letters and the values of the hash are
      references to arrays containing the words.

      You may clear the @words array now.

      If your @words was created by
      my @WORDS = <INFILE>;

      You might actually get rid of the array completely and save memory.
      Just remove that statement and change the loop to

      while (my $word = <INFILE>) {
      my $alphaword = alphabetize($word);
      push @{$bucket{$alphaword}}, $word;

      > Now it's simply a matter of dumping each bucket:
      > for my $alphaword (sort keys %bucket) {
      > my $aref = $bucket{$alphaword};
      > print "@$aref\n";
      > }

      As I said, %bucket is a hash of arrayrefs so in the loop above we get
      the list of keys (the sorted sets of letters)
      keys %bucket
      sort it alphabetically
      sort keys %bucket
      and then loop through the sorted list, get the reference contained in
      the hash and print the values in the referenced array.

      HTH, Jenda

      ===== Jenda@... === http://Jenda.Krynicky.cz =====
      When it comes to wine, women and song, wizards are allowed
      to get drunk and croon as much as they like.
      -- Terry Pratchett in Sourcery
    Your message has been successfully submitted and would be delivered to recipients shortly.