RE: [PBML] limiting a regex search to the first two lines of a file.

  • Frankie
    Message 1 of 1 , Apr 30 8:41 AM
      disregard this.. I have bouts of stupidity occasionally...

      I'm afraid it's terminal :-)


      rgds

      Franki

      -----Original Message-----
      From: Frankie [mailto:frankieh@...]
      Sent: Wednesday, 30 April 2003 9:34 PM
      To: perl-beginner@yahoogroups.com
      Subject: [PBML] limiting a regex search to the first two lines of a
      file.


      Hi guys,

      My current script works fine.. but it's doing an enormous amount of
      unnecessary parsing..

      I have two dozen tutorial shtml files in a directory..
      The main menu page for these is an SSI page linked to a Perl script that
      looks in the tute directory and makes href links out of all the files in
      there (so we don't have to edit the menu to add new tutes).

      Anyway, I added:
      <!-- <desc A summary description of this page's contents desc> //-->

      directly under the doctype declaration of each page. (It doesn't stop them
      validating as XHTML.)

      And the Perl script parses each file to catch the text between <desc and
      desc> and uses it as the summary for the link. (Have I explained that
      clearly??)
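
To make the extraction step concrete, here is a minimal sketch of pulling the text out of a single line. The helper name extract_desc is hypothetical (it is not in the original script), and the pattern is a slightly tightened variant of the one in the post, assuming the marker sits on one line and that surrounding whitespace should be trimmed.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): returns the text
# between "<desc" and "desc>", or undef if the marker is absent.
sub extract_desc {
    my ($line) = @_;
    return unless defined $line;
    # \s+ and \s* trim the whitespace around the captured description;
    # .*? is non-greedy so it stops at the first closing "desc>".
    if ($line =~ /<desc\s+(.*?)\s*desc>/is) {
        return $1;
    }
    return;
}

my $sample  = '<!-- <desc A description of this page. desc> //-->';
my $summary = extract_desc($sample);
print defined $summary ? "$summary\n" : "no description found\n";
```

The word "description" inside the captured text does not end the match early, because the pattern only stops at "desc" immediately followed by ">".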

      Anyway, the regex would keep parsing even after the first match, so I put it
      in an if condition with a last; that fires when it gets a match... is that
      the correct thing to do to stop it parsing the HTML unnecessarily?? Or is
      there a better way?

      Here is the code:

      (Sample of SHTML file)
      <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
      "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
      <!-- <desc A description of this page. desc> //-->
      <html><head>// rest of the page.....
      ---------------------------------------------
      And here is the relevant bit of the perl script.


      # Skip .htaccess, .txt files and the menu page itself (dots escaped so they match literally).
      if (($file =~ /\.htaccess/) || ($file =~ /\.txt/) || ($file =~ /tute_menu/)) {
          $file = '';
          $path_and_file = '';
      }

      if ($file ne "") {
          $path_and_file = $directory_path . '/' . $file;
          open(HTML, '<', $path_and_file) || die "Can't open file $path_and_file: $!";
          flock(HTML, 1) or die "cannot lock file: $!";   # 1 = LOCK_SH (shared lock)
          while (<HTML>)
          {
              # THIS IS THE REGEX TO GET THE DESC..
              if ($_ =~ /<desc(.*?)desc>/is)
              { $summary = $1; last; }   # This is where I tried to stop after the first match.
          }
          flock(HTML, 8);   # 8 = LOCK_UN (unlock)
          close(HTML);
          print "<li><a href='$download_url/$file'>$name_file</a><br /><table width='420px'><tr><td>$summary</td></tr></table><br /></li>\n";
          $summary = '';
      }
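
Since the subject line asks about limiting the search to the first couple of lines, here is one possible sketch of that idea. It uses Perl's built-in $. variable (the current input line number) to stop reading after a fixed number of lines even when no match is found. The helper name summary_from_head and its $max_lines parameter are assumptions for illustration, not part of the original script, and the file locking from the post is left out to keep it short.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper (not from the original script): scan at most
# $max_lines lines of $path and return the description, or '' if none.
sub summary_from_head {
    my ($path, $max_lines) = @_;
    open(my $fh, '<', $path) or die "Can't open $path: $!";
    my $summary = '';
    while (my $line = <$fh>) {
        if ($line =~ /<desc\s+(.*?)\s*desc>/i) {
            $summary = $1;
            last;                      # stop at the first match
        }
        last if $. >= $max_lines;      # $. is the current input line number
    }
    close($fh);
    return $summary;
}

# Demo on a throwaway file whose marker is on line 2.
my ($tmp, $tmp_path) = tempfile(UNLINK => 1);
print {$tmp} "<!DOCTYPE html>\n<!-- <desc A description of this page. desc> //-->\n<html>\n";
close($tmp);
print summary_from_head($tmp_path, 2), "\n";
```

One caveat: in the sample SHTML above, the DOCTYPE appears wrapped over two lines, which would put the marker on line 3. If the real files look like that, a limit of 2 would miss it, so a slightly larger limit (say 5) may be the safer choice.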

      Any thoughts would be great.

      Many thanks..

      regards

      Franki











