
Re: [PBML] Regex help

  • Malcolm Mill
    Message 1 of 7 , Nov 2, 2004
      On Tue, 2 Nov 2004 16:57:42 -0500 (EST), Jeff 'japhy' Pinyan
      <japhy@...> wrote:
      > On Nov 2, Malcolm Mill said:
      >
      >
      >
      > ><tr>
      > > <td align="right" valign="top">
      > > <font size="2" color="#000000" class="listings">
      > > 11:30 am<br />
      > > </font>
      > > </td>
      > ></tr>
      >
      > > s/([\t][\d2][:][\d2][\s])/\1/;
      >
      > That regex has SEVERAL things wrong with it. First of all, backreferences
      > on the right-hand side should be $1, not \1. Second, quantifiers on parts
      > of a regex are done with {...}, so two digits is \d{2}. Third, NONE of
      > those character classes is necessary:
      >
      > s/(\t\d{2}:\d{2}\s)/$1/;

      Hi, Japhy.
      I wasn't sure how to use the backreference. The example from my book I
      based my regex on was:

      $name = "Ingelbert Inguishable";
      $name =~ s/([A-Z]\w+)\b ([A-Z]\w+)\b/\2, \1/;
      print "$name\n";
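For comparison, here is a minimal sketch of the same swap written with $1 and $2 in the replacement, the form japhy recommends above. Perl still accepts \1 and \2 on the right-hand side of s/// (under warnings it reports "\1 better written as $1"), which is why the book example runs. The variable names $swapped and $doubled are illustrative and do not come from the book.

```perl
use strict;
use warnings;

# Same swap as the book example, with $1/$2 in the replacement.
my $name = "Ingelbert Inguishable";
(my $swapped = $name) =~ s/([A-Z]\w+)\b ([A-Z]\w+)\b/$2, $1/;
print "$swapped\n";                  # Inguishable, Ingelbert

# \1 is still the right spelling inside the pattern itself,
# for example to find a doubled word.
my $doubled = "the the cat" =~ /\b(\w+) \1\b/ ? $1 : undef;
print "doubled word: $doubled\n";    # doubled word: the
```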

      >
      > But why are you matching something and replacing it with itself? Perhaps
      > you want to do:
      >
      > print $1 if /\t(\d{2}:\d{2})\s/;

      Yep, that is what I wanted to do. I made a test script

      if ("06:00" =~ /[\d2][:][\d2]/) {
      print "Time found.\n";
      }


      This worked, and I assumed I could use it in a larger expression. Why
      was I trying to match something and replace it with itself? I didn't
      know any other way. I am working from a basic book with limited
      examples, trying to extrapolate where it has no explicit example of
      what I want.
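A small sketch of why that test script reported a match even though the pattern does not mean "two digits": the pattern is unanchored, and [\d2] matches a single digit, so it finds "6:0" inside "06:00". The variable names $loose and $strict are made up for this illustration.

```perl
use strict;
use warnings;

# [\d2] is one character: any digit, or the digit 2. So it is a single digit.
my ($loose) = "06:00" =~ /([\d2][:][\d2])/;
print "loose pattern matched: $loose\n";     # loose pattern matched: 6:0

# \d{2} is exactly two digits, which is what was intended.
my ($strict) = "06:00" =~ /(\d{2}:\d{2})/;
print "strict pattern matched: $strict\n";   # strict pattern matched: 06:00
```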

      This code (from a response to this query submitted to another list)
      does what I ultimately wanted to do.

      foreach (<>) {
      print "$1\n" while ($_ =~ /(\d{1,2}\:\d{2} [ap]m)/gi);
      }
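To make the behaviour of that loop concrete, here is a hedged sketch that runs the same pattern over in-memory lines instead of <>. The first two lines come from the HTML sample at the top of the thread; the third line is a hypothetical addition to show several times on one line and the effect of /i.

```perl
use strict;
use warnings;

# Two lines from the sample row, plus one hypothetical line
# carrying two times in mixed case.
my @lines = (
    qq{<font size="2" color="#000000" class="listings">\n},
    qq{11:30 am<br />\n},
    qq{Doors 7:00 PM, show 8:15 pm<br />\n},
);

my @times;
foreach my $line (@lines) {
    # In scalar context /g resumes after the previous match,
    # so the while loop visits every time on the line.
    while ($line =~ /(\d{1,2}:\d{2} [ap]m)/gi) {
        push @times, $1;
    }
}
print "$_\n" for @times;    # 11:30 am, 7:00 PM, 8:15 pm (one per line)
```

The colon is written without a backslash here; it has no special meaning in this position, so \: and : behave the same.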

      Thanks,
      Malcolm.

      > --
      > Jeff "japhy" Pinyan % How can we ever be the sold short or
      > RPI Acacia Brother #734 % the cheated, we who for every service
      > http://japhy.perlmonk.org/ % have long ago been overpaid?
      > http://www.perlmonks.org/ % -- Meister Eckhart
    • Jenda Krynicky
      Message 2 of 7 , Nov 3, 2004
        From: Malcolm Mill <malcolm.mill@...>
        > Hi Jenda,
        > The book I'm working from is "Apache, MySQL, and PHP Web Development:
        > For Dummies". It is a 7-books-in-1 title, and has sections on Perl and
        > Regular Expressions. I looked for explicit examples of what I want to
        > do but couldn't find any, so I have been messing around with the given
        > examples to try to get the result I want.

        Try to read "perlretut".
        Either run
        perldoc perlretut
        or go to http://www.perldoc.com/perl5.8.4/pod/perlretut.html

        > > > print $_;
        > >
        > > If you are searching for something you should not mangle the data,
        > > just capture what you need:
        > >
        > > /\t(\d{2}:\d{2})\s/ and print "$1\n";
        > >
        > >
        > > > }
        > >
        >
        > This code, from another list, does what I eventually wanted to do,
        > which was capture not only \d{2} numbers, but \d{1,2} and [ap]m.
        >
        > foreach (<>) {
        > print "$1\n" while ($_ =~ /(\d{1,2}\:\d{2} [ap]m)/gi);
        > }
        >
        > What exactly do { } and g do?

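        The {} is a quantifier: the number(s) inside it say how many
        occurrences of the preceding character or group to match. The g
        modifier makes the match global, so it finds every match in the
        string instead of stopping at the first one.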
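As a rough illustration of the {} quantifier and of the g modifier asked about in the question, the sketch below uses a made-up test string. All variable names are hypothetical.

```perl
use strict;
use warnings;

my $text = "9:05 am and 11:30 pm";

# {2} means exactly two, {1,2} means one or two of the preceding item.
my @two_digit_hours = $text =~ /\b(\d{2}):\d{2}/g;      # ("11")
my @any_hours       = $text =~ /\b(\d{1,2}):\d{2}/g;    # ("9", "11")

# Without /g a match stops at the first hit; with /g in list
# context it returns every hit.
my @first = $text =~ /(\d{1,2}:\d{2})/;     # ("9:05")
my @all   = $text =~ /(\d{1,2}:\d{2})/g;    # ("9:05", "11:30")

print "two-digit hours: @two_digit_hours\n";
print "any hours: @any_hours\n";
print "first: @first\n";
print "all: @all\n";
```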

        > I understand \d{1,2} means capture digits of length 1 or 2. Why
        > couldn't I use [ ]? Are they only for characters?

        Inside [] most special characters are not special anymore. The []
        specifies a group of characters (a character class) that may be
        accepted at the current place. This means that
        [\d2]
        means "match either a digit or 2" and is equivalent to
        \d
        or
        [0-9]
        or
        [0123456789]
        Also
        [\d{1,2}]
        is equivalent to
        [}{,\d]
        that is, it means "match an opening or closing curly brace, a comma,
        or a digit".

        Also /[:]/ is the same as /:/, /[\t]/ is the same as /\t/, /[\s]/ is
        the same as /\s/.
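The character-class points above can be checked with a short sketch. The probe strings and variable names are invented for the illustration.

```perl
use strict;
use warnings;

# Every character that [\d{1,2}] accepts, picked out of a test string.
my $probe = "a{b}c,d7e";
my @hits  = $probe =~ /[\d{1,2}]/g;
print "class matched: @hits\n";    # class matched: { } , 7

# [\d2] accepts exactly the same characters as \d.
my @as_class = "x2y9z" =~ /[\d2]/g;
my @as_digit = "x2y9z" =~ /\d/g;
print "same result\n" if "@as_class" eq "@as_digit";
```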

        So if I just simplified your regexp
        s/([\t][\d2][:][\d2][\s])/\1/;
        I'd get
        s/(\t\d:\d\s)/\1/;
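A hedged sketch of what the simplification implies for a time such as 11:30: with the \t in front, a single \d on each side of the colon can no longer find a match, while \d{2} can. The input line is an assumption (a tab directly before the time, as the \t in the pattern suggests), and the variable names are illustrative.

```perl
use strict;
use warnings;

# Assumption: the time is preceded by a tab, as the \t in the pattern implies.
my $line = "\t11:30 am<br />";

my $one_digit = $line =~ /\t(\d:\d)\s/       ? $1 : "no match";
my $two_digit = $line =~ /\t(\d{2}:\d{2})\s/ ? $1 : "no match";

print "one digit each side: $one_digit\n";     # one digit each side: no match
print "two digits each side: $two_digit\n";    # two digits each side: 11:30
```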

        Jenda
        ===== Jenda@... === http://Jenda.Krynicky.cz =====
        When it comes to wine, women and song, wizards are allowed
        to get drunk and croon as much as they like.
        -- Terry Pratchett in Sourcery