Re: [PBML] Re: PDF to TEXT

  • Prasanna Goupal
    Message 1 of 19 , Jun 3, 2006
      Hi,

      There are 30,000 PDFs that have to be converted to text.
      I found a solution for this: the ps2ascii Unix command.

      Thanks for your reply.

      Regards,
      Prasanna A. Goupal
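
For a batch this size, the ps2ascii call can be driven from a small Perl loop. The following is only a sketch under a few assumptions: ps2ascii (part of Ghostscript) is on the PATH and accepts an input file followed by an output file, all the PDFs sit in one directory given on the command line, and the helper name txt_name_for is made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: report.pdf -> report.txt
sub txt_name_for {
    my ($pdf) = @_;
    (my $txt = $pdf) =~ s/\.pdf\z/.txt/i;
    return $txt;
}

# Directory holding the PDFs, taken from the command line.
my $dir = shift @ARGV;

if (defined $dir) {
    opendir my $dh, $dir or die "Cannot open directory $dir: $!";
    my @pdfs = sort grep { /\.pdf\z/i } readdir $dh;
    closedir $dh;

    my $failed = 0;
    for my $name (@pdfs) {
        my $pdf = "$dir/$name";
        my $txt = txt_name_for($pdf);

        # List form of system() bypasses the shell, so spaces and
        # other odd characters in file names are passed through safely.
        if (system('ps2ascii', $pdf, $txt) != 0) {
            warn "ps2ascii failed for $pdf\n";
            $failed++;
        }
    }
    printf "%d file(s) processed, %d failure(s)\n", scalar @pdfs, $failed;
}
```

Run it as "perl pdf2txt.pl /path/to/pdfs" (the script name is arbitrary). Reading the directory with readdir avoids the shell's argument-length limit that a *.pdf wildcard could hit with 30,000 files.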

      Damien Carbery <daymobrew@...> wrote:
      --- In perl-beginner@yahoogroups.com, Prasanna Goupal
      <perl_developer@...> wrote:
      >
      > Hi,
      >
      > I have to extract text from a PDF file using Perl. If anyone has
      > any idea about this, please reply to this mail.
      >
      > Also is there any unix command for the same?
      >
      I had a very quick look at http://search.cpan.org (searched "pdf to
      text") but didn't find anything useful. I might not have looked hard
      enough.

      It might be easier to run Acrobat Reader and choose File/Save as Text.





      Unsubscribing info is here: http://help.yahoo.com/help/us/groups/groups-32.html



    • Hetal Modi
      Message 2 of 19 , Jun 3, 2006
        There is a program called pdftotext that you can download and use on Unix. It does a pretty good job.

        If you want to know more options, let me know.
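
Since the original question asked for a way to do this from Perl, here is a rough sketch of reading pdftotext's output straight into a Perl string. It assumes pdftotext (from Xpdf or Poppler) is installed and on the PATH, and it relies on pdftotext writing to standard output when the output file is given as "-". The helper names pdftotext_command and pdf_text are invented for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build the command as a list; "-" as the output file asks
# pdftotext to write the text to standard output.
sub pdftotext_command {
    my ($pdf) = @_;
    return ('pdftotext', $pdf, '-');
}

# Return the extracted text of one PDF as a single string.
sub pdf_text {
    my ($pdf) = @_;

    # List-form pipe open: no shell is involved (Unix-like systems).
    open(my $fh, '-|', pdftotext_command($pdf))
        or die "Cannot run pdftotext: $!";

    local $/;                  # slurp mode: read everything at once
    my $text = <$fh>;
    close $fh or die "pdftotext exited with status $? for $pdf\n";

    return defined $text ? $text : '';
}

if (@ARGV) {
    my $text  = pdf_text($ARGV[0]);
    my @words = split ' ', $text;
    printf "%s: %d word(s) extracted\n", $ARGV[0], scalar @words;
}
```

Once the text is in a string, ordinary Perl regular expressions can be used on it.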

        -Hetal

        Prasanna Goupal <perl_developer@...> wrote:
        Hi,

        I have to extract text from a PDF file using Perl. If anyone has any idea about this, please reply to this mail.

        Also is there any unix command for the same?

        Thanks.
        Regards,
        Prasanna A. Goupal



      • Peter Dominey
        Message 3 of 19 , Jun 4, 2006
          You may already have had a reply to this; if so, sorry.

          Anyway, a PDF file is just a text file, so you can parse it for text just as you
          would any text file.

          Thanks

          Peter


          On Saturday 03 June 2006 01:36, Prasanna Goupal wrote:
          > Hi,
          >
          > I have to extract text from a PDF file using Perl. If anyone has any idea
          > about this, please reply to this mail.
          >
          > Also is there any unix command for the same?
          >
          > Thanks.
          > Regards,
          > Prasanna A. Goupal

          --
          +-------------------------------------------------------------------+
          | P J Dominey |
          | Independent UNIX Contractor |
          | |
          | E-Mail: pdominey@... |
          | Web Site: www.pdrinformationsolutions.com (www.pdris.com) |
          | |
          | Tel: 817-488-5957 |
          | Yahoo IM: pdominey |
          | AOL IM: peterdominey |
          +-------------------------------------------------------------------+
        • Mike Southern
          Message 4 of 19 , Jun 4, 2006
            Unless it's compressed.

            On 6/4/06 9:49 PM, Peter Dominey at pdominey@... wrote:

            > You may already have had a reply to this; if so, sorry.
            >
            > Anyway, a PDF file is just a text file, so you can parse it for text just as you
            > would any text file.
            >
            > Thanks
            >
            > Peter
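
Whether a particular file is affected can be checked roughly. In a PDF, page content lives in stream objects, and a stream that has been compressed or otherwise encoded declares a /Filter entry (commonly /FlateDecode) in its dictionary. The sketch below counts those declarations; it is a heuristic and not a PDF parser, and the helper names is_pdf and count_filtered_streams are made up for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# True if the data begins with the "%PDF-" header.
sub is_pdf {
    my ($bytes) = @_;
    return $bytes =~ /\A%PDF-/ ? 1 : 0;
}

# Count "/Filter" declarations, e.g. "/Filter /FlateDecode".
# Zero suggests the streams are stored as plain, greppable data.
sub count_filtered_streams {
    my ($bytes) = @_;
    my $count = () = $bytes =~ m{/Filter\b}g;
    return $count;
}

if (@ARGV) {
    my $path = $ARGV[0];
    open my $fh, '<', $path or die "Cannot open $path: $!";
    binmode $fh;                        # PDFs can contain binary data
    my $bytes = do { local $/; <$fh> };
    close $fh;
    $bytes = '' unless defined $bytes;  # empty file

    die "$path does not look like a PDF\n" unless is_pdf($bytes);
    printf "%s: %d filter declaration(s) found\n",
        $path, count_filtered_streams($bytes);
}
```

If the count is above zero, plain text matching on the raw file will most likely miss the page text, and a converter such as ps2ascii or pdftotext is the safer route.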
          • RR MISHRA
            Message 5 of 19 , Jun 5, 2006
              Hi everybody,
              Can anybody give me an idea about the use of subroutines and bugs? If anybody has tutorials or sample examples about them, please send them to me. I also want to know how subroutines and bugs are helpful in bioinformatics work. Please guide me; I need your guidance.
              Thanks in advance.
              Regards.



            • DigiDoc
              Message 6 of 19 , Jun 5, 2006
                I need to write some code that will take an ID from File-1 and see if it
                exists on File-2. If it does, then I want to write out the record from
                File-2 to another file.

                File-1 is just IDs, and File-2 is the ID plus a bunch of other fields
                comma delimited. The ID however is variable length. The files are
                ASCII. File-1 will only contain a small number of records (about 1K),
                while File-2 will be about 100K records.

                I have no idea how to do this (I'm a total Perl novice) and could
                greatly use help.


                Thanks!


                ~~Kevin~


              • a_z0_9_blah
                Message 7 of 19 , Jun 5, 2006
                  --- In perl-beginner@yahoogroups.com, DigiDoc <DigiDoc@...> wrote:
                  >
                  > I need to write some code that will take an ID from File-1 and see if it
                  > exists on File-2. If it does, then I want to write out the record from
                  > File-2 to another file.
                  >
                  > File-1 is just IDs, and File-2 is the ID plus a bunch of other fields
                  > comma delimited. The ID however is variable length. The files are
                  > ASCII. File-1 will only contain a small number of records (about 1K),
                  > while File-2 will be about 100K records.
                  >
                  > I have no idea how to do this (I'm a total Perl novice) and could
                  > greatly use help.
                  >
                  > Thanks!
                  >
                  > ~~Kevin~


                  Could you show some sample lines from the first file and from the
                  second file?
                • Charles K. Clarkson
                  Message 8 of 19 , Jun 5, 2006
                    RR MISHRA wrote:

                    :Can any body give me idea about the use of subroutines

                    Read the 'perlsub' file in the Perl documentation.


                    : and bugs.

                    Go to http://rt.perl.org/perlbug/ and click on the
                    Current Perl 5 Issues link.
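
As a small taste of what perlsub covers, here is a sketch of two subroutines of the kind that turn up in bioinformatics scripts. The subroutine names gc_content and reverse_complement and the sample sequence are invented for this illustration; they are not taken from perlsub.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Arguments arrive in @_; copy them into named lexical variables.
sub gc_content {
    my ($seq) = @_;
    return 0 unless length($seq);
    my $gc = ($seq =~ tr/GCgc//);   # tr returns the number of matches
    return $gc / length($seq);
}

sub reverse_complement {
    my ($seq) = @_;
    my $rc = reverse $seq;          # scalar context reverses the string
    $rc =~ tr/ACGTacgt/TGCAtgca/;   # swap each base for its complement
    return $rc;
}

my $dna = 'ATGCGC';
printf "GC content of %s: %.2f\n", $dna, gc_content($dna);
print  "Reverse complement: ", reverse_complement($dna), "\n";
```

Putting each calculation in its own subroutine means it can be written and checked once, then called on every sequence in a file.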



                    Charles K. Clarkson
                    --
                    Mobile Homes Specialist
                    Free Market Advocate
                    Web Programmer

                    254 968-8328

                    Don't tread on my bandwidth. Trim your posts.
                  • DigiDoc
                    Message 9 of 19 , Jun 5, 2006
                      File-1
                      -------
                      abc
                      bg
                      cd1234


                      File-2
                      -------
                      abc,john,doe,1234,9999
                      addathk,kathy,smith,3453,5629
                      bg,joe,shmo,4532,5343
                      cd1234,jane,madle,5432,0932
                      dkk32,marge,hasbro,2345,1234


                      Note: Data layout in File-2 may vary from time to time, but will always
                      start with the ID.


                      ~~Kevin~


                      a_z0_9_blah wrote:
                      > Could you show some sample lines from the first file and from the
                      > second file?
                    • a_z0_9_blah
                      Message 10 of 19 , Jun 5, 2006
                        --- In perl-beginner@yahoogroups.com, DigiDoc <DigiDoc@...> wrote:
                        >
                        >
                        > Note: Data layout in File-2 may vary from time to time, but will always
                        > start with the ID.
                        >
                        >
                        > ~~Kevin~
                        >
                        >
                        > File-1
                        > -------
                        > abc
                        > bg
                        > cd1234
                        >
                        >
                        > File-2
                        > -------
                        > abc,john,doe,1234,9999
                        > addathk,kathy,smith,3453,5629
                        > bg,joe,shmo,4532,5343
                        > cd1234,jane,madle,5432,0932
                        > dkk32,marge,hasbro,2345,1234
                        >

                        You could try the following code on your sample data.

                        If you will be 'massaging' the data in file-2,
                        you might consider treating your second
                        file as a database (using DBD::CSV).


                        #!/usr/bin/perl
                        use strict;
                        use warnings;

                        # IDs from the first file, stored as hash keys for fast lookup
                        my %ids;

                        open my $id, "<", "o33.txt" or die "Unable to open o33.txt $!";

                        while (<$id>) {
                            chomp;
                            $ids{$_} = 1;
                        }

                        close $id or die $!;

                        open my $data, "<", "o44.txt" or die "Unable to open o44.txt $!";
                        open my $out, ">", "o55.txt" or die "Couldn't write results $!";

                        # Copy each record whose first comma-separated field is a known ID
                        while (<$data>) {
                            my $key = (split /,/)[0];
                            if ($ids{$key}) {
                                print $out $_;
                            }
                        }

                        close $data or die $!;
                        close $out or die $!;
                      • DigiDoc
                        Message 11 of 19 , Jun 5, 2006
                          Great, thanks for the reply.

                          I think I understand the majority of this code. I'll research the
                          DBD::CSV as well. I definitely would not have known about that without
                          your help. THANK YOU!!!!!

                          I can read a fair amount of Perl code, but am just not up to quickly
                          putting code together yet (and probably not for some time). It takes me
                          forever.

                          This helps me out immensely.

                          Dumb question: if I read this correctly, you've hard-coded the file names.
                          How would I set it up to make these variables?

                          Thanks!

                          ~~Kevin~
                        • Alan_C
                          Message 12 of 19 , Jun 6, 2006
                            On Monday 05 June 2006 19:16, DigiDoc wrote:
                            [ . . ]
                            > Dumb question. If I read this correctly, you've hard coded the files,
                            > how would I set it up to make these variables?

                            http://groups.google.com/group/perl.beginners/browse_thread/thread/69865c81985c57f4/7d45c2b46cb7ba2b#7d45c2b46cb7ba2b

                            Take a look at the code in that thread; it uses shift.

                            A command-line example:

                            perlscript file1

                            In the command line above, file1 will be opened as filehandle OLD.

                            If you need to, the script can be set up to shift more than one file off
                            the command line.

                            Further search topics: shift, commandline, command line, ARGV, @ARGV,
                            $ARGV, $ARGV[0], $ARGV[1]

                            Try a search at <http://learn.perl.org/first-response> too; it will likely
                            turn up even more.
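
Applied to the file lookup script from earlier in this thread, the shift approach might look like the sketch below. It keeps the same hash-lookup logic but takes the three file names from the command line. The subroutine names load_ids and copy_matching_records are invented here, and skipping blank lines in the ID file is an added assumption.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read the ID file into a hash so each lookup is a single hash access.
sub load_ids {
    my ($id_file) = @_;
    my %ids;
    open my $in, '<', $id_file or die "Unable to open $id_file: $!";
    while (my $line = <$in>) {
        chomp $line;
        $ids{$line} = 1 if length $line;    # ignore blank lines
    }
    close $in or die $!;
    return \%ids;
}

# Copy records whose first comma-separated field is a known ID.
# Returns the number of records written.
sub copy_matching_records {
    my ($id_file, $data_file, $out_file) = @_;
    my $ids = load_ids($id_file);

    open my $data, '<', $data_file or die "Unable to open $data_file: $!";
    open my $out,  '>', $out_file  or die "Unable to write $out_file: $!";

    my $count = 0;
    while (my $line = <$data>) {
        my ($key) = split /,/, $line, 2;    # text before the first comma
        next unless defined $key && $ids->{$key};
        print {$out} $line;
        $count++;
    }

    close $data or die $!;
    close $out  or die $!;
    return $count;
}

# Outside a subroutine, shift with no argument takes the next
# value off @ARGV, i.e. the next command-line argument.
my $id_file   = shift;
my $data_file = shift;
my $out_file  = shift;

if (defined $out_file) {
    my $n = copy_matching_records($id_file, $data_file, $out_file);
    print "$n record(s) written to $out_file\n";
}
else {
    warn "Usage: $0 ID_FILE DATA_FILE OUT_FILE\n";
}
```

A command line such as "perl lookup.pl o33.txt o44.txt o55.txt" (the script name is arbitrary) then does what the hard-coded version did.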

                            --
                            Alan.
                          • DigiDoc
                            Message 13 of 19 , Jun 6, 2006
                              Thanks for all the help to this question.

                              I've got it working with the $ARGV stuff. I've got more tweaks I want
                              to make, but the base code you provided really helps me out.

                              I'm thinking I'll want to make a GUI out of it now. I was doing some
                              research and found "The GUI Loft". It's a front end that uses
                              Win32::GUI. I haven't used it yet, but it seems like what I need. If
                              anyone has a better suggestion, please let me know.

                              I can't tell you how much time you've saved me, not to mention that I've
                              learned a few things as well. Code examples help me out a great deal; I
                              tend to learn faster from them.

                              Thanks!



                            • Ken Shail
                              Message 14 of 19 , Jun 6, 2006
                                ----- Original Message -----
                                From: DigiDoc ; DigiDoc
                                To: perl-beginner@yahoogroups.com
                                Sent: Tuesday, June 06, 2006 8:45 PM
                                Subject: Re: [PBML] Re: File lookup?


                                Thanks for all the help to this question.

                                I've got it working with the $ARGV stuff. I've got more tweaks I want
                                to make, but the base code you provided really helps me out.

                                I'm thinking I'll want to make a GUI out of it now. I was doing some
                                research and found "The GUI Loft". It's a front end that uses
                                Win32::GUI. I haven't used it yet, but it seems like what I need. If
                                anyone has a better suggestion, please let me know.

                                I can't tell you how much time you've saved me, not to mention that I've
                                learned a few things as well. Code examples help me out a great deal, I
                                tend to learn faster from them.

                                Thanks!

                                Perl Tk - works on Linux and windoze
                                http://dc.pm.org/talks/tk/home.html
                                http://www.perltk.org/index.php?option=com_frontpage&Itemid=1
                                http://phaseit.net/claird/comp.lang.perl.tk/ptkFAQ.html
                              • DigiDoc
                                Message 15 of 19 , Jun 6, 2006
                                  Thanks for the links.

                                  I was looking at that, but couldn't find a nice front end that lets you
                                  work visually. The GUI Loft lets you do things visually, then
                                  generates the code for you. Is there something in Tk that does that, or
                                  am I missing something? Or at the very least, some application that has
                                  some kind of code snippets that are easy to insert?

                                  My goal is to develop code as rapidly as possible since I'll have a lot
                                  to crank out.


                                • Alan_C
                                  Message 16 of 19 , Jun 6, 2006
                                    On Tuesday 06 June 2006 13:24, DigiDoc wrote:
                                    > Thanks for the links.

                                    [ > > Perl Tk - works on Linux and windoze ]

                                    (If you quote a bit of context, it makes it easier for me, and probably some
                                    others, to track this thread and to then offer any help if I have any help to
                                    offer.)

                                    > I was looking at that, but couldn't find a nice front end that lets you
                                    > work visually. The GUI Loft lets you do things visually, then
                                    > generates the code for you. Is there something in tk that does that, or
                                    > am I missing something? Or at the very least, some application that has
                                    > some kind of code snippets that are easy to insert?
                                    >
                                    > My goal is to develop code as rapidly as possible since I'll have a lot
                                    > to crank out.

                                    There's wxWidgets. It has drag and drop (for the builder). It's
                                    cross-platform, i.e. Win, Linux, etc. There's an article on wxWidgets in the
                                    latest "The Perl Review" (for subscribers). The subscription is reasonable
                                    enough, and I look forward to each issue's arrival. (I've no affiliation;
                                    I'm just a subscriber.)

                                    http://www.theperlreview.com/

                                    The summer issue has the article on wxWidgets.

                                    Other than what I've shared, I know nothing about wxWidgets. I want to try
                                    it, but it'll be a few weeks until I do, due to other business at the moment.

                                    > > I'm thinking I'll want to make a GUI out of it now. I was doing some
                                    > > research and found "The GUI Loft". It's a front end that uses
                                    > > Win32::GUI. I haven't used it yet, but it seems like what I need. If
                                    > > anyone has a better suggestion, please let me know.

                                    --
                                    Alan.