
Re: [PBML] Make matching with grep faster

  • Gordon Stewart
    Message 1 of 4 , Jul 5, 2002
      At 13:16 5/07/02 -0400, Denny Malloy wrote :-
      >Hi,
      >
      >The following grep works correctly in my script but I was wondering if there
      >is a way I can make this more efficient. The @match_file can be 1 to over
      >100,000 elements. Is there a way every time I get a match I can delete that
      >value out of @match_file? The code is in a while loop and every value in
      >the array will eventually match but can only match once. I thought by
      >making the array smaller as the script ran it would speed up my matching.
      >If someone has any better way I can do my matching that would also be very
      >appreciated.
      >
      >my ($matched)=grep (/^$to_match/i,@match_file);
      >if (defined($matched)){
      > ...... do stuff
      >}


      Do you want something like :-

      my @new = grep { /^$to_match/i } @match_file;   # keep only the matching elements

      while (defined(my $line = shift @new)) {
          # do something useful with $line
      }


      Is that what you want ?

      This will allow you to do a 'loop' - But it ONLY does a loop on the
      records / lines that match your criteria - reducing the workload...

      G.
    • Denny Malloy
      Message 2 of 4 , Jul 5, 2002
        This is not exactly what I want. The $to_match variable should only pull
        one element from the array @match_file. The only way to get this to work
        is to pull the non-matches into the @new array. The UNIX grep would do
        this with the -v option. I do not need to use grep if someone has a
        better way.

        Thanks for your help.

        Denny
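
        A minimal sketch of the removal described above: find the first element
        that starts with the search text and splice it out, so later searches
        scan a shorter array. It assumes the match should be a literal,
        case-insensitive prefix (hence \Q...\E); the helper name
        take_first_match and the sample data are hypothetical, not from the
        thread.

```perl
use strict;
use warnings;
use List::Util qw(first);

# Hypothetical helper: remove and return the first element of @$list that
# starts with $prefix (case-insensitive). Returns undef when nothing matches.
sub take_first_match {
    my ($list, $prefix) = @_;

    # first() stops at the first index whose element matches, unlike grep,
    # which always scans the whole list.
    my $idx = first { $list->[$_] =~ /^\Q$prefix\E/i } 0 .. $#$list;
    return unless defined $idx;

    # splice removes exactly one element and returns it.
    return splice(@$list, $idx, 1);
}

my @match_file = ('alpha 1', 'Beta 2', 'gamma 3');   # sample data (assumed)

my $matched = take_first_match(\@match_file, 'beta');
if (defined $matched) {
    print "matched: $matched\n";
}
print scalar(@match_file), " elements left\n";
```

        The direct counterpart of grep -v is a negated grep,
        @match_file = grep { !/^\Q$to_match\E/i } @match_file; but that
        rescans and copies the whole array on every pass and drops every
        matching element, not just the first.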

      • Jeff 'japhy' Pinyan
        Message 3 of 4 , Jul 5, 2002
          On Jul 5, Denny Malloy said:

          >The following grep works correctly in my script but I was wondering if there
          >is a way I can make this more efficient. The @match_file can be 1 to over
          >100,000 elements. Is there a way every time I get a match I can delete that
          >value out of @match_file? The code is in a while loop and every value in
          >the array will eventually match but can only match once. I thought by
          >making the array smaller as the script ran it would speed up my matching.

          You should not be using an array for this. You should be using a hash
          instead. The only drawback would be that your hash can't have the same
          key twice, but your problem becomes EXPONENTIALLY simpler:

          my %hash;
          @hash{@data} = ();   # hash slice: every element of @data becomes a key

          if (exists $hash{$some_key}) {
              delete $hash{$some_key};   # remove it so it can only match once
              # ...
          }

          >my ($matched)=grep (/^$to_match/i,@match_file);

          WHY are you using a regex there? Do you have to? I doubt it.
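
          One way the hash approach might be combined with the original
          case-insensitive prefix match, as a sketch only: it assumes the text
          being matched is exactly the first whitespace-separated field of each
          line, which the thread does not confirm. The names %by_key and the
          sample data are hypothetical.

```perl
use strict;
use warnings;

my @match_file = ('alpha 1', 'Beta 2', 'gamma 3');   # sample data (assumed)

# Build the lookup once: lowercased first field => full line.
# Assumption: the search text equals the first whitespace-separated field.
my %by_key;
for my $line (@match_file) {
    my ($key) = split ' ', $line;
    next unless defined $key;     # skip empty lines
    $by_key{lc $key} = $line;     # a later duplicate key overwrites an earlier one
}

my $to_match = 'BETA';

# delete returns the removed value, so lookup and removal happen in one step
# and the same key cannot match a second time.
my $matched = delete $by_key{lc $to_match};
if (defined $matched) {
    print "matched: $matched\n";
}
```

          Each lookup is then a single hash access instead of a scan over up to
          100,000 elements. If the search text can be an arbitrary prefix rather
          than a whole field, this keying does not apply and a regex scan is
          still needed.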

          --
          Jeff "japhy" Pinyan japhy@... http://www.pobox.com/~japhy/
          RPI Acacia brother #734 http://www.perlmonks.org/ http://www.cpan.org/
          ** Look for "Regular Expressions in Perl" published by Manning, in 2002 **
          <stu> what does y/// stand for? <tenderpuss> why, yansliterate of course.
          [ I'm looking for programming work. If you like my work, let me know. ]