
Re: [PBML] Parsing a LONG line more than once

  • Electron One
    Message 1 of 4 , Sep 4, 2003
      At 03:59 PM 9/4/2003 -0600, you wrote:
      > > If i had a VERY Long line, $Line, that contained data preceded by a
      > > header, is there a way to match multiple instances of that data on the
      > > same line?
      >
      >Use the 'g' option on your regex:
      >
      >m/(whatever it is you want to match)/g;

      I understand that. The problem comes in when I want to save it. Taking the
      example I provided,

      $Line = <THESE NUMBERS> 123 452 6847 293 <OTHER STUFF> nobody cares about
      this <THIS EITHER> 123 492 3845 <THESE NUMBERS> 563 9543 38945 284 <NOT
      THESE> 49595

      the numbers will always terminate with a "<". So this regex works,
      if ($Line =~ /<THESE NUMBERS>([^<]*)</g) {
          $Data = $1;
      }

      $Data would still only include the first set of numbers (123 452 6847 293),
      not the second set (563 9543 38945 284). It's not about the second set
      overwriting the first set (I could always push them to an array); it's that
      it didn't even get the second set.

      The only way I could think of to get around this is to do something like,

      @Array = $Line =~ /(<THESE NUMBERS>([^<]*)<)/g;

      then doing some kind of foreach,

      foreach (@Array) {
          if ($Line =~ /<THESE NUMBERS>([^<]*)</g) {
              push(@Data, $1);
          }
      }

      But this seems too verbose. Is there a more efficient way?






      >Perldoc is your friend: perldoc perlrequick should contain something
      >about this.
      >
      >Hope this helps,
      >
      >Jeff Eggen
      >IT Programmer Analyst
      >Saskatchewan Government Insurance
      >Ph (306) 751-1795
      >email jeggen@...
      >
      >Unsubscribing info is here:
      >http://help.yahoo.com/help/us/groups/groups-32.html
      >
      >Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/
      >
    • Paul Archer
      Message 2 of 4 , Sep 4, 2003
        This is almost certainly *not* the best way to do this, but it works:

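        # Built by concatenation so the string holds no embedded newlines,
        # which would stop ".*$" below from matching.
        $Line = "<THESE NUMBERS> 123 452 6847 293 <OTHER STUFF> nobody cares about "
        . "this <THIS EITHER> 123 492 3845 <THESE NUMBERS> 563 9543 38945 284 <NOT "
        . "THESE> 49595";

        # The first piece is empty because $Line starts with the header.
        @bits = split /<THESE NUMBERS>/, $Line;

        foreach (@bits) {
            # Keep only the text before the next " <", then print it.
            s/^(.*?) <.*$/$1/;
            print "$_\n";
        }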
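
The split approach above prints each piece as it goes. A minimal sketch of the same idea that collects trimmed values into an array instead, assuming the sample line from the thread joined onto one line (the name @Data is borrowed from the original question; the trimming regex is an illustration, not Paul's code):

```perl
use strict;
use warnings;

# Sample line from the thread, joined so it contains no newlines.
my $Line = '<THESE NUMBERS> 123 452 6847 293 <OTHER STUFF> nobody cares about this '
         . '<THIS EITHER> 123 492 3845 <THESE NUMBERS> 563 9543 38945 284 <NOT THESE> 49595';

# split returns an empty first piece because the line starts with the
# header, so discard it and keep the pieces that follow each header.
my (undef, @bits) = split /<THESE NUMBERS>/, $Line;

# From each piece keep the text before the next "<", without the
# surrounding whitespace; pieces with no "<" are skipped.
my @Data = map { /^\s*([^<]*?)\s*</ ? $1 : () } @bits;

print "$_\n" for @Data;
```

Skipping pieces that lack a "<" relies on the statement in the question that the numbers always terminate with a "<".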

        HTH,

        Paul




        ---------------------------------------------------
        Tech Support: "I need you to boot the computer."
        Customer: (THUMP! Pause.) "No, that didn't help."
        ---------(http://www.rinkworks.com/stupid)---------
      • Jeff Eggen
        Message 3 of 4 , Sep 5, 2003

          So, all that matters is that you catch all the numbers that follow the
          header into an array?

          my @Array = $Line =~ m/<THESE NUMBERS>([^<]+)</g;

          This seems to do it okay for me with some test data I threw together.
          This also works:

          push @Array, $Line =~ m/<THESE NUMBERS>([^<]+)</g;

          The second one is nicer if $Line is, say, the current line of a file
          you're iterating through. Either way, it's only a one-liner: I don't
          think you have to split up each line and then iterate through the
          pieces. But then, if I'm still misunderstanding the problem, I may be
          way off here. Here's my test script, in its entirety:

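          ### Begin Code
          use strict;
          use warnings;

          # Collect every run of numbers that follows a <THESE NUMBERS> header.
          my @array;
          open my $fh, '<', 'test.txt' or die "Cannot open test.txt: $!";
          while (<$fh>)
          {
              # In list context, m//g returns the captures from every match in $_.
              push @array, m/<THESE NUMBERS>([^<]+)</g;
          }
          close $fh;
          print join "\n", @array, "";
          ### End Code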

          Then I made test.txt contain 3 copies of the line you provided in your
          original example, and this picked up all instances of the numbers
          following the <THESE NUMBERS> header from every line.
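
The difference between the one-liner and the original `if` test comes down to context. Here is a minimal in-memory sketch (no test.txt needed), assuming the sample line from the thread joined onto one line; the names @all and @one_by_one are made up for the illustration:

```perl
use strict;
use warnings;

# Sample line from the thread, joined so it contains no newlines.
my $Line = '<THESE NUMBERS> 123 452 6847 293 <OTHER STUFF> nobody cares about this '
         . '<THIS EITHER> 123 492 3845 <THESE NUMBERS> 563 9543 38945 284 <NOT THESE> 49595';

# List context: m//g returns the capture from every match at once.
my @all = $Line =~ m/<THESE NUMBERS>([^<]+)</g;

# Scalar context: each evaluation of m//g finds the next match, so a
# "while" loop visits them all, whereas a single "if" sees only the first.
my @one_by_one;
while ($Line =~ m/<THESE NUMBERS>([^<]+)</g) {
    push @one_by_one, $1;
}

print "[$_]\n" for @all;
```

Both arrays end up with the same two entries. The captured strings keep the spaces around the numbers, so something like `split ' ', $all[0]` would be one way to get the individual values.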

          Hope this helps,

          Jeff Eggen
          IT Programmer Analyst
          Saskatchewan Government Insurance
          Ph (306) 751-1795
          email jeggen@...