Loading ...
Sorry, an error occurred while loading the content.

Re: FW: TWiki ( was RE: [XP] is the wiki open source and if so, wheredo I get it?)

Expand Messages
  • Tony Bowden
    ... Doh! To match your current code, remove that colon: + return map $count{$_} $_ , splice(@returnList, 0, 5); However, I was refactoring this some
    Message 1 of 4 , Jul 31, 2001
    • 0 Attachment
      On Tue, Jul 31, 2001 at 11:10:40PM +0100, Tony Bowden wrote:
      > return map " $count{$_}: $_", splice(@returnList, 0, 5);

      Doh!

      To match your current code, remove that colon:
      + return map " $count{$_} $_", splice(@returnList, 0, 5);

      However, I was refactoring this some more, as I saw that the function
      was only actually called twice. In processWeb, you currently do a lot
      of iterating over the logfile, using greps:

      my @list = grep( /^\|[^\|]*\|[^\|]*\| view \| $webName/, @theLogList );
      my $statViews = @list;
      my @topViews = getTopList( 0, $wiki::statsTopViews, @list );

      @list = grep( /^\|[^\|]*\|[^\|]*\| save \| $webName/, @theLogList );
      my $statSaves = @list;

      @list = grep( /^\|[^\|]*\|[^\|]*\| upload \| $webName/, @theLogList );
      my $statUploads = @list;

      @list = grep( /^\|[^\|]*\|[^\|]*\| (save|upload) \| $webName/, @theLogList );
      my @topContrib = getTopList( 1, $wiki::statsTopContrib, @list );

      For a long log this will be very slow, as it will have to iterate over it 4
      times, so I've changed it to only step through the log once.

      Those regular expressions, as with the ones in the function from earlier,
      are also quite difficult to read. From what I can see the logfile is
      pipe-delimited, so I've used a 'split' to extract the relevant fields[1]:

      my (%actionCount, %topView, %topContrib);
      foreach (@theLogList) {
      my @parts = split / \| /;
      my ($user, $action, $page) = @parts[1 .. 3];
      ++$actionCount{$action};

      if ($action eq "view") {
      ++$topView{$page};
      } elsif ($action eq "upload" or $action eq "save") {
      ++$topContrib{$user};
      }
      }

      This now has the added advantage that we have pre-populated 'count'
      hashes to pass to our 'getTopList' function, which doesn't need to know
      anymore whether it's getting a username or a pagename, and can become
      much simpler (and renamed to something more generic like _sortHash). Our
      old 42 line subroutine now reads:

      sub _sortHash
      {
      my( $theMaxNum, %count ) = @_;
      my @returnList = sort {$count{$b} <=> $count{$a}} keys %count;
      return map " $count{$_} $_", splice @returnList, 0, 5;
      }

      And then where you print out the resulting stats:

      printMsg( " - view: $statViews, save: $statSaves, upload: $statUploads" );
      my $statTopViews = "";
      my $statTopContributors = "";
      if( @topViews ) {
      printMsg( " - top view: $topViews[0]" );
      for( $x = 0; $x < @topViews; $x++ )
      {
      $statTopViews .= "$topViews[$x] <br>";
      }
      }
      if( @topContrib ) {
      printMsg( " - top contributor: $topContrib[0]" );
      for( $x = 0; $x < @topContrib; $x++ )
      {
      $statTopContributors .= "$topContrib[$x] <br>";
      }
      }

      we can drop it all to:
      my @topViews = _sortHash ( $wiki::statsTopViews, %topView );
      my @topContrib = _sortHash ( $wiki::statsTopViews, %topContrib );
      my $statTopViews = join " <br>", @topViews;
      my $statTopContributors = join " <br>", @topContrib;

      printMsg( sprintf " - view: %d, save: %d, upload: %d",
      $actionCount{'view'} || 0,
      $actionCount{'save'} || 0,
      $actionCount{'upload'} || 0
      );
      printMsg( " - top view: $topViews[0]" ) if( @topViews );
      printMsg( " - top contributor: $topContrib[0]" ) if( @topContrib );


      Although the resulting code in processWeb() isn't much shorter this time
      (although the helper function for it is considerably shorter), I think
      this is all a lot more readable, and again will be considerably faster.

      There's obviously further refactoring can happen (and there's a few more
      print statements later on in the complicated templating code that will
      need changed to the new variables), but that's fairly trivial to do now...

      Hope this is helpful,

      Tony

      [1] I'm not sure if you can have actually log lines that don't match
      $webName. If so just put a 'next unless matches' check at the top
      of this loop...

      --
      --------------------------------------------------------------------------
      Tony Bowden | tony@... | http://www.tmtm.com/
      every tool is a weapon - if you hold it right
      --------------------------------------------------------------------------
    Your message has been successfully submitted and would be delivered to recipients shortly.