Re: [PBML] Extracting from <body> to </body>

  • daymobrew
    Message 1 of 8 , May 2 6:30 AM
      --- In perl-beginner@y..., "Greg Krieser" <greg@k...> wrote:
      > Thanks for everyone's help. I've got a sample working, but not the
      > way I'd like. Could I get a little more help?
      >
      > Tried this parser sample code I found at:
      > http://www.gellyfish.com/htexamples.
      >
      > #!/usr/bin/perl -w
      > package Example;
      > use strict;
      > require HTML::Parser;
      > @Example::ISA = qw(HTML::Parser);
      > my $parser = Example->new;
      > $parser->parse_file('test.html');
      > print $parser->{TEXT};
      > sub text
      > {
      > my ($self,$text) = @_;
      > $self->{TEXT} .= $text;
      > }
      >  
      > This produces some javascript that is in the test.html file, but
      > not everything between <body> and </body>. How can I modify the code
      > to specify this requirement?
      >
      > Thanks A Lot,
      >
      > Greg
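
For the HTML::Parser route asked about above, one possible sketch is to keep the original source text of every event seen while inside <body>, rather than only the text events (a text handler never receives the tags themselves, which is probably why the sample printed the script content but not the markup). This assumes HTML::Parser 3.x and its handler-based API; the helper name body_via_parser and the sample string are made up for illustration.

```perl
use strict;
use warnings;
use HTML::Parser;

# Hypothetical helper: returns the source text found between <body ...> and </body>.
sub body_via_parser {
    my ($html) = @_;
    my $in_body = 0;     # true while the parser is between <body> and </body>
    my $body    = '';

    # Shared handler for start and end tags: toggle the flag on body tags,
    # otherwise keep the tag's original text when inside the body.
    my $tag_handler = sub {
        my ( $event, $tagname, $text ) = @_;
        if ( $tagname eq 'body' ) {
            $in_body = ( $event eq 'start' ) ? 1 : 0;
            return;
        }
        $body .= $text if $in_body;
    };

    my $parser = HTML::Parser->new(
        api_version => 3,
        start_h     => [ $tag_handler, 'event, tagname, text' ],
        end_h       => [ $tag_handler, 'event, tagname, text' ],
        # Text, comments and declarations all arrive here.
        default_h   => [ sub { $body .= $_[0] if $in_body && defined $_[0] }, 'text' ],
    );
    $parser->parse($html);
    $parser->eof;
    return $body;
}

# Usage with an illustrative document held in a string.
my $sample = '<html><body bgcolor="#ffffff"><p>Hello <b>world</b></p></body></html>';
print body_via_parser($sample), "\n";
```

To read from a file instead, the parse/eof pair could be swapped for $parser->parse_file('test.html'), as in the quoted sample.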

      I modified Jeff's (working) regexp. I got the modified regexp
      working. Here is the full code:

      #!/usr/local/bin/perl -w

      use strict;

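      # Slurp the whole file so the regexp can match across line breaks.
      if ( open( FH, 'body.html' ) )
      {
          my $whole_file = join( '', <FH> );
          close( FH );

          # Keep only what lies between <body> and </body>:
          # /s lets . match newlines, /i makes the tag match case-insensitive.
          $whole_file =~ s@.*<body>(.*)</body>.*@$1@si;
          print "$whole_file";
      }
      else
      {
          # Report the failure instead of exiting silently.
          die "Cannot open body.html: $!";
      }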
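
One caveat with the regexp approach: a literal <body> will not match an opening tag that carries attributes (for example bgcolor), and when nothing matches the substitution leaves the string unchanged. A slightly more tolerant variant might look like the sketch below. The helper name body_via_regex and the sample string are hypothetical, and it uses only core Perl.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the text between <body ...> and </body>,
# or undef when no such pair is found.
sub body_via_regex {
    my ($html) = @_;

    # \b[^>]* tolerates attributes on the opening tag.
    # (.*?) is non-greedy, so the capture ends at the first </body>.
    # /s lets . match newlines, /i ignores the case of the tag names.
    if ( $html =~ m{<body\b[^>]*>(.*?)</body\s*>}si ) {
        return $1;
    }
    return;
}

# Usage with an illustrative document held in a string.
my $sample = "<html><BODY bgcolor=\"#ffffff\">\n<p>Hello</p>\n</body></html>";
my $body   = body_via_regex($sample);
print defined $body ? $body : "no <body> found\n";
```

Returning undef on a miss lets the caller tell "no body tags" apart from an empty body, which the in-place substitution cannot do.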
    • Greg Krieser
      Message 2 of 8 , May 2 8:14 AM
        VERY IMPRESSIVE! Works like a champ! Thanks!

        This list is showing me the importance of regexps. Thanks for the help. Can't wait to implement this everywhere.

        Thanks!

        The following message was sent by "daymobrew" <daymobrew@...> on Thu, 02 May 2002 13:30:53 -0000.

        > I modified Jeff's (working) regexp. I got the modified regexp
        > working. Here is the full code:
        >
        > #!/usr/local/bin/perl -w
        >
        > use strict;
        >
        > if ( open( FH, 'body.html' ) )
        > {
        >     my $whole_file = join( '', <FH> );
        >     close( FH );
        >
        >     $whole_file =~ s@.*<body>(.*)</body>.*@$1@si;
        >     print "$whole_file";
        > }