filter data

  • timmarkus498
    Message 1 of 5 , Aug 17, 2012
      Dear,

      I am a beginner, and I've been trying to compare the lines of two files on one specified column and write the lines they have in common (matched on the second column) to another file (output). For some reason the code below only ever goes through the 'if' branch, regardless of whether the two variables match.

      file1
      1
      2
      3

      file2
      1 1 0 2 -9
      3 8 0 2 2
      2 6 0 1 6
      1 3 0 2 -9
      1 4 0 2 2
      2 2 0 1 6

      my output (file3)
      1 1 0 2 -9
      2 2 0 1 6
      1 3 0 2 -9


      my code
      #!/usr/bin/perl
      use strict;
      use warnings;

      my $usage = "perl <script> <file1> <file2> <output>\n";

      my $file1 = $ARGV[0] || die $usage;
      my $file2 = $ARGV[1] || die $usage;
      my $file3 =

      my %dic_pat = (); #hashes

      open (PAT,$ani) || die "$!\n";

      while (<PAT>){
      chomp;
      $dic_pat{$_}=1;
      }
      close (PAT);

      open (IN,$fileplink);
      while (<IN>){
      chomp;
      my @fields = split /\s+/,$_;

      #### only prints if dont have the pattern...
      if ($dic_pat{$fields[1]}){
      next;
      }
      else{
      print $_."\n";
      }
      }
      close (IN);

      output()

      Thanks
    • Shlomi Fish
      Message 2 of 5 , Aug 18, 2012
        Hi timmarkus,

        Let me review your code.

        On Fri, 17 Aug 2012 22:28:04 -0000
        "timmarkus498" <timmarkus498@...> wrote:

        > Dear,
        >
        > I am beginning and Ive been trying to compare lines between two files
        > by one specify column in two files and put in other file (output)
        > lines common by second column. For some reason the code below only
        > ever goes through and prints the 'if' statement regardless of if the
        > two variables match.
        >
        > file1
        > 1
        > 2
        > 3
        >
        > file2
        > 1 1 0 2 -9
        > 3 8 0 2 2
        > 2 6 0 1 6
        > 1 3 0 2 -9
        > 1 4 0 2 2
        > 2 2 0 1 6
        >
        > my output (file3)
        > 1 1 0 2 -9
        > 2 2 0 1 6
        > 1 3 0 2 -9
        >
        >
        > my code
        > #!/usr/bin/perl
        > use strict;
        > use warnings;

        It's great that you're using strict and warnings.

        >
        > my $usage = "perl <script> <file1> <file2> <output>\n";
        >
        > my $file1 = $ARGV[0] || die $usage;
        > my $file2 = $ARGV[1] || die $usage;
        > my $file3 =

        "my $file3 = " what exactly? And see:

        * http://perl-begin.org/tutorials/bad-elements/#subroutine-arguments

        * http://perl-begin.org/tutorials/bad-elements/#calling-variables-file

        >
        > my %dic_pat = (); #hashes
        >
        > open (PAT,$ani) || die "$!\n";
        >

        1. Don't use 2-args open.

        2. Use lexical filehandles.

        3. Where does "$ani" come from? It's not declared anywhere previously.

        See:

        http://perl-begin.org/tutorials/bad-elements/#open-function-style
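
To illustrate points 1 and 2, here is a minimal sketch of three-argument open with a lexical filehandle. The helper name read_lines and the wording of the error messages are only illustrative; they are not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: read every line of a file into a list, without newlines.
sub read_lines {
    my ($path) = @_;

    # Three-argument open: the mode ('<') is separate from the file name,
    # and $fh is a lexical filehandle instead of a global bareword like PAT.
    open my $fh, '<', $path
        or die "Cannot open '$path' for reading: $!\n";

    my @lines = <$fh>;
    chomp @lines;

    close $fh
        or die "Cannot close '$path': $!\n";

    return @lines;
}
```

Note the use of "or" rather than "||" after open: because "||" binds more tightly than the comma, writing open my $fh, '<', $path || die ... would test $path and not the result of open.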

        > while (<PAT>){
        > chomp;
        > $dic_pat{$_}=1;
        > }
        > close (PAT);

        You should not abuse $_ too much. Use a lexical variable. Alternatively, in
        this case, you can also use the list-form of <...>:

        my @keys = <$pat_fh>;
        chomp(@keys);
        @dic_pat{@keys} = ((1) x @keys);
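
As a small self-contained sketch of the hash-slice idiom above, with in-memory keys instead of a filehandle (the helper names build_lookup and build_lookup_with_map are hypothetical), together with an equivalent spelling using map:

```perl
use strict;
use warnings;

# Hypothetical helper: turn a list of keys into a lookup hash.
sub build_lookup {
    my (@keys) = @_;
    my %lookup;
    @lookup{@keys} = (1) x @keys;    # hash slice: every key gets the value 1
    return \%lookup;
}

# The same result written with map, which some people find easier to read.
sub build_lookup_with_map {
    my (@keys) = @_;
    my %lookup = map { $_ => 1 } @keys;
    return \%lookup;
}
```

Either version gives a hash whose keys are the lines of the first file, which is all the later comparison needs.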

        >
        > open (IN,$fileplink);

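        Again: use three-argument open with a lexical filehandle, and check whether the open succeeded. And what is $fileplink? It is not declared anywhere either.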

        > while (<IN>){
        > chomp;
        > my @fields = split /\s+/,$_;
        >
        > #### only prints if dont have the pattern...
        > if ($dic_pat{$fields[1]}){

        Shouldn't it be $fields[0]? The first element in an array is indexed with 0. Also, it is better to
        test for the key with exists: http://perldoc.perl.org/functions/exists.html .
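
A short sketch of why exists is the safer test: it asks whether the key is present, while a plain truth test asks whether the stored value is true. The hash contents and the two helper names below are made up for illustration.

```perl
use strict;
use warnings;

# Illustrative data: key 'b' is present but holds a false value.
my %seen = ( a => 1, b => 0 );

# True if the key is in the hash, whatever its value.
sub key_present {
    my ($hash_ref, $key) = @_;
    return exists $hash_ref->{$key} ? 1 : 0;
}

# True only if the value stored under the key is itself true.
sub value_true {
    my ($hash_ref, $key) = @_;
    return $hash_ref->{$key} ? 1 : 0;
}
```

In the original script every value is 1, so both tests agree there; the difference only shows up once a key can hold 0, an empty string, or undef.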

        > next;

        Always use "next LABEL;":

        http://perl-begin.org/tutorials/bad-elements/#flow-stmts-without-labels
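
For example, a labelled loop might look like the sketch below. The sub name filter_records and the label RECORD are my own choices, and the loop works on an in-memory list so that it can be tried without any files.

```perl
use strict;
use warnings;

# Hypothetical helper: keep the records whose second column is a key of %$is_valid.
sub filter_records {
    my ($is_valid, @records) = @_;
    my @kept;

    RECORD:
    foreach my $record (@records) {
        my @fields = split ' ', $record;

        # Skip blank or too-short lines, then lines whose second column is unknown.
        next RECORD if @fields < 2;
        next RECORD if !exists $is_valid->{ $fields[1] };

        push @kept, $record;
    }

    return @kept;
}
```

The label makes it explicit which loop "next" applies to, which matters as soon as the loops are nested.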

        > }
        > else{
        > print $_."\n";
        > }

        This conditional will print the lines whose key is _not_ in %dic_pat. Do you want it the other
        way around?

        > }
        > close (IN);
        >

        BTW, you may get more help on beginners@... :

        http://learn.perl.org/faq/beginners.html

        You may also wish to peruse some beginners' resources at http://perl-begin.org/ .

        Regards,

        Shlomi Fish

        --
        -----------------------------------------------------------------
        Shlomi Fish http://www.shlomifish.org/
        "The Human Hacking Field Guide" - http://shlom.in/hhfg

        In Soviet Russia, every time you kill a kitten, God masturbates.
        http://linux.slashdot.org/comments.pl?sid=195378&cid=16009070

        Please reply to list if it's a mailing list post - http://shlom.in/reply .
      • Shawn H Corey
        Message 3 of 5 , Aug 18, 2012
          On Sat, 18 Aug 2012 13:01:38 +0300
          Shlomi Fish <shlomif@...> wrote:

          > On Fri, 17 Aug 2012 22:28:04 -0000
          > "timmarkus498" <timmarkus498@...> wrote:
          >
          > > Dear,
          > >
          > > I am beginning and Ive been trying to compare lines between two
          > > files by one specify column in two files and put in other file
          > > (output) lines common by second column.

          ...

          > > while (<IN>){
          > > chomp;
          > > my @fields = split /\s+/,$_;
          > >
          > > #### only prints if dont have the pattern...
          > > if ($dic_pat{$fields[1]}){
          >
          > Shouldn't it be $fields[0]? The first element in an array is indexed
          > with 0.

          No, he said he wanted the second column, which is indexed by 1.
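
A tiny sketch of the indexing, using a hypothetical helper named column that counts columns from 1 the way people usually talk about them:

```perl
use strict;
use warnings;

# Hypothetical helper: return the Nth whitespace-separated column, counting from 1.
sub column {
    my ($line, $number) = @_;
    my @fields = split ' ', $line;    # split ' ' ignores leading whitespace
    return $fields[ $number - 1 ];    # array indices start at 0
}
```

So column('3 8 0 2 2', 2) is the same as $fields[1], the second column. One thing to watch with the original split /\s+/: if a line starts with whitespace, the first field is an empty string and every index shifts by one, which split ' ' avoids.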


          --
          Just my 0.00000002 million dollars worth,
          Shawn

          Programming is as much about organization and communication
          as it is about coding.

          _Perl links_
          official site : http://www.perl.org/
          beginners' help : http://learn.perl.org/faq/beginners.html
          advance help : http://perlmonks.org/
          documentation : http://perldoc.perl.org/
          news : http://perlsphere.net/
          repository : http://www.cpan.org/
          blog : http://blogs.perl.org/
          regional groups : http://www.pm.org/
        • Charles K. Clarkson
          Message 4 of 5 , Aug 18, 2012
            You are writing this solution from the top down. You do not have to
            write it in that order. Write the core algorithm first, then add the UI
            and I/O after that is done.

            Eliminating the file handling and the user interface stuff allows you
            to write and test the record comparison algorithm on its own. Once you
            know that works, you can move on to making it look pretty for the user.

            use strict;
            use warnings;

            # Pull this info from file 1.
            my %is_valid = (
                1 => 1,
                2 => 1,
                3 => 1,
            );


            # Test data.
            # Replace this with a while loop over file 2 or a tied array.
            my @records = (
                '1 1 0 2 -9',
                '3 8 0 2 2',
                '2 6 0 1 6',
                '1 3 0 2 -9',
                '1 4 0 2 2',
                '2 2 0 1 6',
            );

            # Replace this with a while loop over file 2 or a tied array.
            while ( my $record = shift @records ) {

                my @fields = split ' ', $record;

                if ( $is_valid{ $fields[1] } ) {
                    print "$record\n";
                }
            }

            __END__
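
Once the core loop works, the file handling can be wrapped around it. The following is only a sketch of one way to do that, not a definitive version: the sub name filter_by_second_column and its three path arguments are my own assumptions, and it keeps the lines of the second file whose second column appears in the first file.

```perl
use strict;
use warnings;

# Hypothetical helper: copy to $out_path every line of $data_path whose
# second column is listed (one value per line) in $keys_path.
# Returns the number of lines written.
sub filter_by_second_column {
    my ($keys_path, $data_path, $out_path) = @_;

    open my $keys_fh, '<', $keys_path
        or die "Cannot open '$keys_path': $!\n";
    my %is_valid;
    while ( my $key = <$keys_fh> ) {
        chomp $key;
        next if $key eq '';          # ignore blank lines
        $is_valid{$key} = 1;
    }
    close $keys_fh;

    open my $in_fh, '<', $data_path
        or die "Cannot open '$data_path': $!\n";
    open my $out_fh, '>', $out_path
        or die "Cannot open '$out_path' for writing: $!\n";

    my $written = 0;
    RECORD:
    while ( my $record = <$in_fh> ) {
        chomp $record;
        my @fields = split ' ', $record;
        next RECORD if @fields < 2;
        next RECORD if !exists $is_valid{ $fields[1] };

        print {$out_fh} "$record\n";
        $written++;
    }

    close $in_fh;
    close $out_fh
        or die "Cannot close '$out_path': $!\n";

    return $written;
}
```

A script could then check that @ARGV holds three file names, print a usage message otherwise, and call filter_by_second_column(@ARGV).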



            Charles Clarkson
            --
            I'm not really a smart person. I just play one on the Internet.
            +1 (254) 434-2733
          • Markus Tim
            Message 5 of 5 , Aug 22, 2012
              Thank you for the help and suggestions!
