
Re: Trying to get File and Directory info off of external server quickly

  • Boysenberry Payne
    Message 1 of 16, Aug 1, 2005
      I'm not sure if HEAD would work.
      Basically, I'm trying to read a directory's files.
      After I confirm a file exists and doesn't have zero
      size, I check that it has the appropriate extension
      for the directory, then I add the directory address,
      file name and extension to a table in our database.
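
      In outline, that check is something like the sketch below (the
      %allowed_ext map and the file_ok name are made up for illustration;
      they are not the real code):

      my %allowed_ext = (
          '/images/' => [qw( jpg gif png )],
          '/docs/'   => [qw( pdf txt )],
      );

      sub file_ok {
          my ( $dir, $file, $size ) = @_;
          return 0 unless defined $size && $size > 0;    # no zero-size files
          my ($ext) = $file =~ /\.([^.]+)\z/;            # extension after the last dot
          return 0 unless defined $ext;
          return grep { lc($ext) eq $_ } @{ $allowed_ext{$dir} || [] };
      }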

      It used to be easy when I was using PHP on a single-server
      system (which is what I'm migrating from). Now that I'm using
      a two-server system it's a little trickier.

      Thanks,
      Boysenberry

      This message contains information that is confidential
      and proprietary to Humaniteque and / or its affiliates.
      It is intended only for the recipient named and for
      the express purpose(s) described therein.
      Any other use is prohibited.

      http://www.habitatlife.com
      The World's Best Site Builder
      On Aug 1, 2005, at 5:12 PM, Philippe M. Chiasson wrote:

      > Boysenberry Payne wrote:
      >> Hello All,
      >>
      >> I've got a two-server platform: one is a static server that serves the
      >> files and runs the MySQL server, and the other runs mod_perl. I'm trying
      >> to figure out the fastest way to get info on directories and files from
      >> the static server to the mod_perl server. Right now I'm using Net::FTP,
      >> which is really slow, especially when there are a lot of files.
      >> Unfortunately, I need to check the file info quite frequently. I was
      >> wondering if anyone knew the fastest way to get this info: LDAP, SSH, etc.?
      >
      > Wouldn't an HTTP HEAD request achieve this fairly nicely?
      >
      > I am not sure you have described the actual problem you are trying to
      > solve.
      > Why is that information needed and how is it being used?
      >
      > --
      > Philippe M. Chiasson m/gozer\@(apache|cpan|ectoplasm)\.org/
      > GPG KeyID : 88C3A5A5 http://gozer.ectoplasm.org/
      > F9BF E0C2 480E 7680 1AE5 3631 CB32 A107 88C3A5A5
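
      For reference, the HEAD idea above would look something like this with
      LWP::UserAgent ($base_url, $directory and $file_name are placeholders,
      and it assumes the static box also runs an httpd exposing those files):

      use LWP::UserAgent;

      my $ua  = LWP::UserAgent->new( timeout => 10 );
      my $res = $ua->head("$base_url$directory$file_name");   # placeholder names

      if ( $res->is_success ) {
          # the file exists on the static server
          my $size = $res->header('Content-Length');          # may be undef
          if ( defined $size && $size > 0 ) {
              # non-empty: check the extension and record it, as described above
          }
      }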
    • Philip M. Gollucci
      Message 2 of 16, Aug 1, 2005
        Boysenberry Payne wrote:
        > I'm not sure if HEAD would work.
        > Basically, I'm trying to read a directory's files.
        > After I confirm a file exists and doesn't have zero
        > size I check that it has the appropriate extension
        > for the directory then I add the directory address,
        > file name and extension to a table in our database.
        We actually do something very similar to this involving pictures being
        uploaded from a digital camera to eventually be published on a website.

        Cronjob1:
        Poll destination directory and move the files to a temp location
        The destination directory is where the camera puts them.

        Cronjob2:
        Poll temp directory and move the image into its permanent
        location and insert a row into our "images" table.

        It's split only so that if some part breaks, the uploading from cameras
        does not, and people can continue to upload from cameras. Digital
        cameras [at least the ones the government uses :)] upload with the same
        non-unique file names for each upload, so we have to process each batch
        rather quickly.

        I didn't write this, but I can say in 3 years it's only crashed once and
        makes us millions.

        [snipped for brevity, of course]
        Cronjob1:

        use Net::FTP;

        my $ftp = Net::FTP->new($PEERADDR, Debug => 0, Timeout => 30)
            || die "Connect to server failed\n";
        $ftp->login($USERNAME, $PASSWORD)
            || die "Cannot login to FTP server\n";
        $ftp->binary();

        my @files = $ftp->ls('-R');
        foreach my $file (@files) {
            # unless ( ...selection criteria, snipped... )
            $ftp->get("$dir/$file", $localFilename);
        }
        $ftp->quit();
      • Boysenberry Payne
        Message 3 of 16, Aug 1, 2005
          I've already got it working using Net::FTP. The problem is it runs
          slow using FTP. Here is an example of what I'm trying to do:

          my $h = $ftp->{handle};
          foreach my $directory ( @directories ) {
              $h->cwd( $directory )
                  or die "can't change to directory: $directory $!";
              my $dir_ls = $h->ls;
              foreach my $file_name ( @$dir_ls ) {
                  unless ( substr( $file_name, 0, 1 ) eq "." ) {
                      my $dir_nfo = $h->dir( $directory . $file_name );
                      $_ = $dir_nfo->[ 0 ];
                      s/(\s)+/ /g;
                      my @file_nfo = split / /, $_;
                      my $file_size = $file_nfo[ 4 ];
                      if ( $file_size != 0 ) {
                          # add to database
                      }
                  }
              }
          }
          $h->quit;

          I tried using $ftp->size( $directory . $file_name ),
          but it seems to only return a size for small files,
          at least on my OS X box.
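
          One guess (and only a guess) about that size() behaviour: Net::FTP's
          size() issues the FTP SIZE command, and some servers only answer it
          reliably once the transfer type is binary, so forcing binary mode
          first may be worth a try:

          $h->binary();    # switch to binary/image type before asking for SIZE
          my $file_size = $h->size( $directory . $file_name );
          if ( defined $file_size && $file_size > 0 ) {
              # proceed as before
          }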

          Thanks,
          Boysenberry

          This message contains information that is confidential
          and proprietary to Humaniteque and / or its affiliates.
          It is intended only for the recipient named and for
          the express purpose(s) described therein.
          Any other use is prohibited.

          http://www.habitatlife.com
          The World's Best Site Builder
          On Aug 1, 2005, at 5:54 PM, Philip M. Gollucci wrote:

          > Boysenberry Payne wrote:
          >> I'm not sure if HEAD would work.
          >> Basically, I'm trying to read a directory's files.
          >> After I confirm a file exists and doesn't have zero
          >> size I check that it has the appropriate extension
          >> for the directory then I add the directory address,
          >> file name and extension to a table in our database.
          > We actually do something very similar to this involving pictures being
          > uploaded from a digital camera to eventually be published on a
          > website.
          >
          > Cronjob1:
          > Poll destination directory and move the files to a temp location
          > The destination directory is where the camera puts them.
          >
          > Cronjob2:
          > Poll temp directory and move image into permenent
          > location andinsert a row into our "images" table.
          >
          > Its split only so that if some part breaks the uploading from camera's
          > does not and people can continue to upload from camera's. Digital
          > camera's [at least the ones the government uses :)] upload with the
          > same non-unique file names for each upload, so we have to process each
          > batch rather quickly.
          >
          > I didn't write this, but I can say in 3 years its only crashed once
          > and makes us millions.
          >
          > [snipped for breviety of course]
          > Cronjob1:
          >
          > use Net::FTP;
          > my $ftp = Net::FTP->new($PEERADDR, Debug => 0, Timeout => 30)
          > || die "Connect to server failed\n";
          > $ftp->login($USERNAME, $PASSWORD)
          > || die "Cannot login to FTP server\n";
          > $ftp->binary();
          >
          > my @files = $ftp->ls('-R');
          > foreach my $file (@files) {
          > unless some critera
          > $ftp->get("$dir/$file", $localFilename);
          > }
          > $ftp->quit();
          >
          >
        • Randy Kobes
          Message 4 of 16, Aug 1, 2005
            On Mon, 1 Aug 2005, Boysenberry Payne wrote:

            > I'm not sure if HEAD would work.
            > Basically, I'm trying to read a directory's files.
            > After I confirm a file exists and doesn't have zero
            > size I check that it has the appropriate extension
            > for the directory then I add the directory address,
            > file name and extension to a table in our database.

            Can you get someone on the remote server to do a
            cd top_level_directory
            ls -lR > ls-lR # or find -fls find-ls
            gzip ls-lR # or gzip find-ls
            periodically, and then you can grab and parse ls-lR.gz or
            find-ls.gz?
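
            Fetching and parsing that listing could look roughly like the sketch
            below, assuming the file is exposed over HTTP; the URL, the temp
            paths, and the assumption that file names contain no spaces are all
            illustrative, not part of the suggestion above.

            use LWP::Simple qw(getstore is_success);
            use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

            my $rc = getstore( 'http://static.example.com/ls-lR.gz', '/tmp/ls-lR.gz' );
            die "fetch failed: $rc" unless is_success($rc);
            gunzip '/tmp/ls-lR.gz' => '/tmp/ls-lR'
                or die "gunzip failed: $GunzipError";

            open my $fh, '<', '/tmp/ls-lR' or die $!;
            my $dir = '';
            while ( my $line = <$fh> ) {
                chomp $line;
                if ( $line =~ /^(\S.*):$/ ) { $dir = $1; next }   # "some/dir:" header
                my @f = split ' ', $line;
                next unless @f >= 9 && $f[4] =~ /^\d+$/;          # long-format entries only
                next if $f[0] =~ /^d/;                            # skip directories
                my ( $size, $name ) = ( $f[4], $f[8] );
                # $dir, $name and $size feed the existing database step
            }
            close $fh;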

            --
            best regards,
            randy kobes
          • Philip M. Gollucci
            Message 5 of 16, Aug 1, 2005
              Boysenberry Payne wrote:
              > my $h = $ftp->{handle};
              > foreach my $directory ( @directories ) {
              > $h->cwd( $directory ) or die "can't change to directory: $directory
              > $!";
              > my $dir_ls = $h->ls;
              > foreach my $file_name ( @$dir_ls ) {
              > unless ( substr( $file_name, 0, 1 ) eq "." ) {
              > my $dir_nfo = $h->dir( $directory . $file_name );
              > $_ = $dir_nfo->[ 0 ];
              > s/(\s)+/ /g;
              > my @file_nfo = split / /, $_;
              > my $file_size = $file_nfo[ 4 ];
              > if( $file_size != 0 ) {
              > add to database
              > }
              > }
              > }
              > }
              > $h->quit;
              What's this $ftp->{handle} stuff?
              Shouldn't it just be $ftp->xxx?
              That's not in perldoc Net::FTP.

              Are you using a relatively new version of it?
              We've got Net::FTP 2.75,
              Linux wickedwitch 2.6.12.3 #2 SMP Mon Jul 18 17:14:55 EDT 2005 i686 i686
              i386 GNU/Linux

              What is the file size check for? Why do you have files of size 0?

              I think it might be faster to do

              next if $file_name =~ /^\./;

              instead of
              > unless ( substr( $file_name, 0, 1 ) eq "." ) {

              It might be your database connection... Do you prepare the handle
              outside of the loop? Is the database connect/disconnect outside of
              the loop? What are you inserting into the database? If you're
              inserting a BLOB of the image/file data, it could be the bandwidth
              of the transfer now that it's not a local socket anymore.
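
              For the "prepare outside the loop" point, the usual shape is
              something like this; the DSN, table and column names are
              placeholders, not the actual schema:

              use DBI;

              my $dbh = DBI->connect( 'DBI:mysql:database=files;host=dbhost',
                                      $db_user, $db_pass, { RaiseError => 1 } );
              my $sth = $dbh->prepare(
                  'INSERT INTO files (directory, file_name, extension) VALUES (?, ?, ?)'
              );

              foreach my $file (@file_list) {
                  # ... existence / size / extension checks ...
                  $sth->execute( $directory, $name, $ext );   # reuse the prepared handle
              }
              $dbh->disconnect;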

              Luck

              --
              END
              -----------------------------------------------------------------------------
              Philip M. Gollucci
              Senior Developer - Liquidity Services Inc.
              Phone: 202.558.6268 (Direct)
              Cell: 301.254.5198
              E-Mail: pgollucci@...
              Web: http://www.liquidityservicesinc.com
              http://www.liquidation.com
              http://www.uksurplus.com
              http://www.govliquidation.com
              http://www.gowholesale.com
            • Boysenberry Payne
              Message 6 of 16, Aug 1, 2005
                $ftp->{handle} = Net::FTP->new( $ftp->{host}, Passive => 1 )
                    or die "Can't create new ftp with host: $ftp->{host}";

                It's part of my FTP module.

                Thanks,
                Boysenberry

                This message contains information that is confidential
                and proprietary to Humaniteque and / or its affiliates.
                It is intended only for the recipient named and for
                the express purpose(s) described therein.
                Any other use is prohibited.

                http://www.habitatlife.com
                The World's Best Site Builder
                On Aug 1, 2005, at 6:31 PM, Philip M. Gollucci wrote:

                > Boysenberry Payne wrote:
                >> my $h = $ftp->{handle};
                >> foreach my $directory ( @directories ) {
                >> $h->cwd( $directory ) or die "can't change to directory:
                >> $directory $!";
                >> my $dir_ls = $h->ls;
                >> foreach my $file_name ( @$dir_ls ) {
                >> unless ( substr( $file_name, 0, 1 ) eq "." ) {
                >> my $dir_nfo = $h->dir( $directory . $file_name );
                >> $_ = $dir_nfo->[ 0 ];
                >> s/(\s)+/ /g;
                >> my @file_nfo = split / /, $_;
                >> my $file_size = $file_nfo[ 4 ];
                >> if( $file_size != 0 ) {
                >> add to database
                >> }
                >> }
                >> }
                >> }
                >> $h->quit;
                > Whats this ftp->{handle} stuff ?
                > Shouldn't it just be $ftp->xxx
                > Thats not in perldoc Net::FTP.
                >
                > Are you using a relatively new version of it ?
                > We've got Net::FTP 2.75,
                > Linux wickedwitch 2.6.12.3 #2 SMP Mon Jul 18 17:14:55 EDT 2005 i686
                > i686 i386 GNU/Linux
                >
                > What is the file size check for .. why do you have files of size 0 ?
                >
                > I think it might be faster to do
                >
                > next if $file_name =~ /^\./;
                >
                > instead of
                > > unless ( substr( $file_name, 0, 1 ) eq "." ) {
                >
                > It might be your database connection... Do you prepare the handle
                > outside of the loop? Is the database connect/disconnect outside of
                > the loop? What are you inserting into the database ? If you're
                > inserting a BLOB of the image/file data it could be the bandwidth
                > transfer now thats its not a local socket anymore.
                >
                > Luck
                >
                > --
                > END
                > -----------------------------------------------------------------------
                > ------
                > Philip M. Gollucci
                > Senior Developer - Liquidity Services Inc.
                > Phone: 202.558.6268 (Direct)
                > Cell: 301.254.5198
                > E-Mail: pgollucci@...
                > Web: http://www.liquidityservicesinc.com
                > http://www.liquidation.com
                > http://www.uksurplus.com
                > http://www.govliquidation.com
                > http://www.gowholesale.com
                >
                >
                >
              • Goddard Lee
                Message 7 of 16, Aug 2, 2005
                  > From: Philip M. Gollucci [mailto:pgollucci@...]

                  > I didn't write this, but ...[it]... makes us millions.

                  I trust "you" are putting thousands into perl, mod_perl and other good
                  things, then ;)
                • Torsten Foertsch
                  Message 8 of 16, Aug 2, 2005
                    On Monday 01 August 2005 23:12, Boysenberry Payne wrote:
                    > Hello All,
                    >
                    > I've got a two-server platform: one is a static server that serves the
                    > files and runs the MySQL server, and the other runs mod_perl. I'm trying
                    > to figure out the fastest way to get info on directories and files from
                    > the static server to the mod_perl server. Right now I'm using Net::FTP,
                    > which is really slow, especially when there are a lot of files.
                    > Unfortunately, I need to check the file info quite frequently. I was
                    > wondering if anyone knew the fastest way to get this info: LDAP, SSH, etc.?

                    mod_dav may be an option.

                    Torsten
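
                    If mod_dav is enabled on the static server, a single PROPFIND
                    returns names and sizes in one round trip. A rough sketch with
                    HTTP::DAV from CPAN follows; the URL and credentials are
                    invented, and the accessor names are from memory of that
                    module's docs, so double-check them:

                    use HTTP::DAV;

                    my $url = 'http://static.example.com/files/';
                    my $dav = HTTP::DAV->new();
                    $dav->credentials( -user => 'user', -pass => 'pass', -url => $url );
                    $dav->open( -url => $url )
                        or die 'open failed: ' . $dav->message;

                    my $r = $dav->propfind( -url => $url, -depth => 1 )
                        or die 'PROPFIND failed: ' . $dav->message;

                    foreach my $res ( $r->get_resourcelist->get_resources ) {
                        next if $res->is_collection;                   # skip sub-directories
                        my $size = $res->get_property('getcontentlength');
                        # record $res->get_uri and $size
                    }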
                  • Boysenberry Payne
                    Message 9 of 16, Aug 2, 2005
                      Thank You Everyone,

                      Now that I know I can use $ftp->ls( "-lR" ), which I couldn't find
                      anywhere in the Net::FTP docs or the other O'Reilly books I have, I can
                      stick to Net::FTP without it being slow. What was causing my script to
                      take so long was the multiple $ftp->cwd( $directory ), $ftp->ls() and
                      $ftp->dir( $directory . $file ) calls for each directory in my
                      directory loop.

                      Now I use one cwd and ls("-lR") from my public html area, then process
                      the returned array, which is a lot faster. It would be nice to be able
                      to specify the directory as well as the "-lR" without using
                      cwd( $directory ); does anyone know how to do it?
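
                      Many (not all) FTP servers accept the path in the same argument as the
                      flags, so passing both may work; it is server-dependent, and the parsing
                      below also assumes standard ls long format with no spaces in the names
                      ($public_html is a placeholder):

                      my @listing = $ftp->ls("-lR $public_html");   # server-dependent

                      my $current_dir = '';
                      foreach my $line (@listing) {
                          if ( $line =~ /^(\S.*):$/ ) { $current_dir = $1; next }
                          my @f = split ' ', $line;
                          next unless @f >= 9 && $f[4] =~ /^\d+$/;   # long-format entries only
                          next if $f[0] =~ /^d/;                     # skip directories
                          next if $f[8] =~ /^\./;                    # skip dot files
                          my ( $size, $name ) = ( $f[4], $f[8] );
                          # "$current_dir/$name" and $size go to the database as before
                      }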

                      Thanks for the tips on making my code more efficient too.

                      Boysenberry

                      This message contains information that is confidential
                      and proprietary to Humaniteque and / or its affiliates.
                      It is intended only for the recipient named and for
                      the express purpose(s) described therein.
                      Any other use is prohibited.

                      http://www.habitatlife.com
                      The World's Best Site Builder
                      On Aug 1, 2005, at 6:28 PM, Randy Kobes wrote:

                      > On Mon, 1 Aug 2005, Boysenberry Payne wrote:
                      >
                      >> I'm not sure if HEAD would work.
                      >> Basically, I'm trying to read a directory's files.
                      >> After I confirm a file exists and doesn't have zero
                      >> size I check that it has the appropriate extension
                      >> for the directory then I add the directory address,
                      >> file name and extension to a table in our database.
                      >
                      > Can you get someone on the remote server to do a
                      > cd top_level_directory
                      > ls -lR > ls-lR # or find -fls find-ls
                      > gzip ls-lR # or gzip find-ls
                      > periodically, and then you can grab and parse ls-lR.gz or find-ls.gz?
                      >
                      > --
                      > best regards,
                      > randy kobes
                      >
                      >