
Re: [PBML] PDF to TEXT

  • Peter Dominey
    Message 1 of 19 , Jun 4, 2006
      You may already have had a reply to this; if so, sorry.

      Anyway, a PDF file is just a text file, so you can parse it for text just as
      you would any text file.

      Thanks

      Peter


      On Saturday 03 June 2006 01:36, Prasanna Goupal wrote:
      > Hi,
      >
      > I have to extract text from a PDF file using Perl. If anyone has any idea
      > about this, please reply to this mail.
      >
      > Also, is there any Unix command for the same?
      >
      > Thanks.
      > Regards,
      > Prasanna A. Goupal

      --
      +-------------------------------------------------------------------+
      | P J Dominey |
      | Independent UNIX Contractor |
      | |
      | E-Mail: pdominey@... |
      | Web Site: www.pdrinformationsolutions.com (www.pdris.com) |
      | |
      | Tel: 817-488-5957 |
      | Yahoo IM: pdominey |
      | AOL IM: peterdominey |
      +-------------------------------------------------------------------+
      ----------------------------------------
      Scanned for Viruses! mail.dominey.biz


    • Mike Southern
      Message 2 of 19 , Jun 4, 2006
        Unless it's compressed.

        On 6/4/06 9:49 PM, Peter Dominey at pdominey@... wrote:

        > You may already have had a reply to this; if so, sorry.
        >
        > Anyway, a PDF file is just a text file, so you can parse it for text just as
        > you would any text file.
        >
        > Thanks
        >
        > Peter
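
To illustrate the point about compression, here is a minimal sketch, not a real PDF parser. It assumes a made-up one-line content stream and a hypothetical helper named extract_text; the only library used is the core Compress::Zlib module. PDF streams stored with the /FlateDecode filter are zlib-compressed, so a plain regex search generally finds nothing in them until they are inflated. Real PDF files carry much more structure (objects, cross-reference tables, font encodings), so for real work a dedicated tool such as pdftotext or a CPAN PDF module is the safer route.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Compress::Zlib;    # core module; exports compress() and uncompress()

# Hypothetical helper: collect the literal strings passed to the PDF "Tj"
# (show text) operator in an uncompressed content stream.
sub extract_text {
    my ($stream) = @_;
    return $stream =~ /\(([^)]*)\)\s*Tj/g;
}

# A made-up content stream, as it might appear inside a PDF page object.
my $plain = "BT /F1 12 Tf (Hello, PDF) Tj ET\n";

# Roughly what a /FlateDecode filter would store in the file instead.
my $packed = compress($plain);

my @from_plain    = extract_text($plain);
my @from_packed   = extract_text($packed);
my @from_inflated = extract_text( uncompress($packed) );

print "plain:    @from_plain\n";
print "packed:   ", scalar(@from_packed), " match(es)\n";
print "inflated: @from_inflated\n";
```

The same regex is run three times: on the readable stream, on the compressed bytes, and on the bytes after inflating them again. Only the first and third runs see the text.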
      • RR MISHRA
        Message 3 of 19 , Jun 5, 2006
          Hi everybody,
          Can anybody give me an idea about the use of subroutines and bugs? If anybody has tutorials or sample examples about them, please send them to me. I also want to know how subroutines and bugs are helpful in bioinformatics work. Please guide me; I need your guidance.
          Thanks in advance.
          Regards.




          [Non-text portions of this message have been removed]
        • DigiDoc
          Message 4 of 19 , Jun 5, 2006
            I need to write some code that will take an ID from File-1 and see if it
            exists on File-2. If it does, then I want to write out the record from
            File-2 to another file.

            File-1 is just IDs, and File-2 is the ID plus a bunch of other fields
            comma delimited. The ID however is variable length. The files are
            ASCII. File-1 will only contain a small number of records (about 1K),
            while File-2 will be about 100K records.

            I have no idea how to do this (I'm a total Perl novice) and could
            greatly use help.


            Thanks!


            ~~Kevin~


          • a_z0_9_blah
            Message 5 of 19 , Jun 5, 2006
              --- In perl-beginner@yahoogroups.com, DigiDoc <DigiDoc@...> wrote:
              >
              > I need to write some code that will take an ID from File-1 and see if it
              > exists on File-2. If it does, then I want to write out the record from
              > File-2 to another file.
              >
              > File-1 is just IDs, and File-2 is the ID plus a bunch of other fields
              > comma delimited. The ID however is variable length. The files are
              > ASCII. File-1 will only contain a small number of records (about 1K),
              > while File-2 will be about 100K records.
              >
              > I have no idea how to do this (I'm a total Perl novice) and could
              > greatly use help.
              >
              >
              > Thanks!
              >
              >
              > ~~Kevin~
              >


              Could you show some sample lines from the first file and from the
              second file?
            • Charles K. Clarkson
              Message 6 of 19 , Jun 5, 2006
                RR MISHRA wrote:

                :Can any body give me idea about the use of subroutines

                Read the 'perlsub' file in the Perl documentation.


                : and bugs.

                Go to http://rt.perl.org/perlbug/ and click on the
                Current Perl 5 Issues link.
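
As a small taste of what perlsub covers, here is a hedged sketch of two subroutines of the kind that turn up in bioinformatics scripts. The names reverse_complement and gc_content and the sample sequence are invented for illustration; they are not taken from the Perl documentation or from this thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical example: return the reverse complement of a DNA string.
sub reverse_complement {
    my ($dna) = @_;
    my $rc = reverse $dna;            # scalar context: reverse the string
    $rc =~ tr/ACGTacgt/TGCAtgca/;     # swap each base for its complement
    return $rc;
}

# Hypothetical example: fraction of bases that are G or C.
sub gc_content {
    my ($dna) = @_;
    return 0 unless length $dna;      # avoid dividing by zero
    my $gc = ( $dna =~ tr/GCgc// );   # tr with no replacement only counts
    return $gc / length($dna);
}

my $seq = "ATGCGC";
print reverse_complement($seq), "\n";
printf "%.2f\n", gc_content($seq);
```

Each subroutine copies its argument out of @_ into a lexical variable and hands back a value with return, so the same logic can be reused for any sequence instead of being repeated inline.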



                Charles K. Clarkson
                --
                Mobile Homes Specialist
                Free Market Advocate
                Web Programmer

                254 968-8328

                Don't tread on my bandwidth. Trim your posts.
              • DigiDoc
                Message 7 of 19 , Jun 5, 2006
                  File-1
                  -------
                  abc
                  bg
                  cd1234


                  File-2
                  -------
                  abc,john,doe,1234,9999
                  addathk,kathy,smith,3453,5629
                  bg,joe,shmo,4532,5343
                  cd1234,jane,madle,5432,0932
                  dkk32,marge,hasbro,2345,1234


                  Note: Data layout in File-2 may vary from time to time, but will always
                  start with the ID.


                  ~~Kevin~


                • a_z0_9_blah
                  Message 8 of 19 , Jun 5, 2006
                    --- In perl-beginner@yahoogroups.com, DigiDoc <DigiDoc@...> wrote:
                    >
                    > File-1
                    > -------
                    > abc
                    > bg
                    > cd1234
                    >
                    > File-2
                    > -------
                    > abc,john,doe,1234,9999
                    > addathk,kathy,smith,3453,5629
                    > bg,joe,shmo,4532,5343
                    > cd1234,jane,madle,5432,0932
                    > dkk32,marge,hasbro,2345,1234
                    >
                    > Note: Data layout in File-2 may vary from time to time, but will always
                    > start with the ID.
                    >

                    You could try the following code on your sample data.

                    If you will be 'massaging' the data in file-2,
                    you might consider treating your second
                    file as a database (using DBD::CSV).


                    #!/usr/bin/perl
                    use strict;
                    use warnings;

                    # Remember every ID listed in the first file.
                    my %ids;

                    open my $id, "<", "o33.txt" or die "Unable to open o33.txt: $!";

                    while (<$id>) {
                        chomp;
                        $ids{$_} = 1;
                    }

                    close $id or die $!;

                    open my $data, "<", "o44.txt" or die "Unable to open o44.txt: $!";
                    open my $out, ">", "o55.txt" or die "Couldn't write results: $!";

                    # Copy each record whose first field is a known ID.
                    while (<$data>) {
                        my $key = (split /,/)[0];
                        if ($ids{$key}) {
                            print $out $_;
                        }
                    }

                    close $data or die $!;
                    close $out or die $!;
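
For anyone who wants to try the hash lookup without creating the three files first, the sketch below runs the same idea against the sample data from this thread through in-memory filehandles. The subroutine name filter_by_id and the variable names are assumptions made for illustration. It also strips trailing whitespace from each ID and skips empty lines, which the script above does not do.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: copy every record whose first comma-separated
# field appears in the ID list from $data_fh to $out_fh.
sub filter_by_id {
    my ($id_fh, $data_fh, $out_fh) = @_;

    my %ids;
    while (my $line = <$id_fh>) {
        $line =~ s/\s+\z//;                 # drop newline and trailing blanks
        $ids{$line} = 1 if length $line;    # ignore empty lines
    }

    my $kept = 0;
    while (my $record = <$data_fh>) {
        my ($key) = split /,/, $record, 2;  # only the leading ID is needed
        next unless defined $key && exists $ids{$key};
        print {$out_fh} $record;
        $kept++;
    }
    return $kept;
}

# Sample data from the thread, held in strings instead of files.
my $file1 = "abc\nbg\ncd1234\n";
my $file2 = <<'END';
abc,john,doe,1234,9999
addathk,kathy,smith,3453,5629
bg,joe,shmo,4532,5343
cd1234,jane,madle,5432,0932
dkk32,marge,hasbro,2345,1234
END
my $result = "";

open my $id_fh,   "<", \$file1  or die "Cannot read IDs: $!";
open my $data_fh, "<", \$file2  or die "Cannot read records: $!";
open my $out_fh,  ">", \$result or die "Cannot write results: $!";

my $count = filter_by_id($id_fh, $data_fh, $out_fh);
close $out_fh or die $!;

print "$count records kept\n$result";
```

Because the lookup is a hash, each of the roughly 100K records in File-2 costs one hash probe, however many IDs File-1 holds. Only the first field is split off, so the rest of the record layout is free to vary.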
                  • DigiDoc
                    Message 9 of 19 , Jun 5, 2006
                      Great, thanks for the reply.

                      I think I understand the majority of this code. I'll research the
                      DBD::CSV as well. I definitely would not have known about that without
                      your help. THANK YOU!!!!!

                      I can read a fair amount of Perl code, but am just not up to quickly
                      putting code together yet (and probably not for some time). It takes me
                      forever.

                      This helps me out immensely.

                      Dumb question: if I read this correctly, you've hard-coded the file
                      names. How would I set it up so that they are variables?

                      Thanks!

                      ~~Kevin~
                    • Alan_C
                      Message 10 of 19 , Jun 6, 2006
                        On Monday 05 June 2006 19:16, DigiDoc wrote:
                        [ . . ]
                        > Dumb question. If I read this correctly, you've hard coded the files,
                        > how would I set it up to make these variables?

                        http://groups.google.com/group/perl.beginners/browse_thread/thread/69865c81985c57f4/7d45c2b46cb7ba2b#7d45c2b46cb7ba2b

                        Take a look at the code in that thread; it uses shift.

                        A command line example:

                        perlscript file1

                        In the command line example above, file1 will be opened as filehandle OLD.

                        If need be, the script can be set up to shift more than one file name off
                        the command line.

                        Further search topics: shift, commandline, command line, ARGV, @ARGV,
                        $ARGV, $ARGV[0], $ARGV[1]

                        A search at <http://learn.perl.org/first-response> will likely turn up
                        even more.
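
A minimal sketch of the shift approach applied to the lookup script from earlier in this thread. The subroutine name file_names_from_args and the three-argument order (ID file, data file, output file) are assumptions made for illustration; the code in the linked thread may be organised differently.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: take the ID file, data file and output file names
# from an argument list (normally @ARGV) instead of hard-coding them.
sub file_names_from_args {
    my @args = @_;
    die "Usage: $0 id_file data_file out_file\n" unless @args == 3;

    my $id_file   = shift @args;    # first argument
    my $data_file = shift @args;    # second argument
    my $out_file  = shift @args;    # third argument
    return ($id_file, $data_file, $out_file);
}

# Only act when names were actually given on the command line.
if (@ARGV) {
    my ($id_file, $data_file, $out_file) = file_names_from_args(@ARGV);
    print "IDs from $id_file, records from $data_file, matches to $out_file\n";
}
```

Saved under a hypothetical name such as lookup.pl, it would be run as: perl lookup.pl o33.txt o44.txt o55.txt. The returned names can then replace the quoted "o33.txt", "o44.txt" and "o55.txt" strings in the three open calls.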

                        --
                        Alan.
                      • DigiDoc
                        Message 11 of 19 , Jun 6, 2006
                          Thanks for all the help to this question.

                          I've got it working with the $ARGV stuff. I've got more tweaks I want
                          to make, but the base code you provided really helps me out.

                          I'm thinking I'll want to make a GUI out of it now. I was doing some
                          research and found "The GUI Loft". It's a front end that uses
                          Win32::GUI. I haven't used it yet, but it seems like what I need. If
                          anyone has a better suggestion, please let me know.

                          I can't tell you how much time you've saved me, not to mention that I've
                          learned a few things as well. Code examples help me out a great deal; I
                          tend to learn faster from them.

                          Thanks!



                        • Ken Shail
                          Message 12 of 19 , Jun 6, 2006
                            ----- Original Message -----
                            From: DigiDoc
                            To: perl-beginner@yahoogroups.com
                            Sent: Tuesday, June 06, 2006 8:45 PM
                            Subject: Re: [PBML] Re: File lookup?

                            > I'm thinking I'll want to make a GUI out of it now. I was doing some
                            > research and found "The GUI Loft". It's a front end that uses
                            > Win32::GUI. I haven't used it yet, but it seems like what I need. If
                            > anyone has a better suggestion, please let me know.


                            Perl Tk - works on Linux and windoze
                            http://dc.pm.org/talks/tk/home.html
                            http://www.perltk.org/index.php?option=com_frontpage&Itemid=1
                            http://phaseit.net/claird/comp.lang.perl.tk/ptkFAQ.html
                          • DigiDoc
                            Message 13 of 19 , Jun 6, 2006
                              Thanks for the links.

                              I was looking at that, but couldn't find a nice front end that lets you
                              work visually. The GUI Loft lets you do things visually, then
                              generates the code for you. Is there something in Tk that does that, or
                              am I missing something? Or, at the very least, is there some application
                              that has code snippets that are easy to insert?

                              My goal is to develop code as rapidly as possible since I'll have a lot
                              to crank out.


                            • Alan_C
                              Message 14 of 19 , Jun 6, 2006
                                On Tuesday 06 June 2006 13:24, DigiDoc wrote:
                                > Thanks for the links.

                                [ > > Perl Tk - works on Linux and windoze ]

                                (If you quote a bit of context, it makes it easier for me, and probably some
                                others, to track this thread and then offer help if I have any to offer.)

                                > I was looking at that, but couldn't find a nice front end that lets you
                                > work visually. The GUI Loft let's you do things visually, then
                                > generates the code for you. Is there something in tk that does that, or
                                > am I missing something? Or at the very least, some application that has
                                > some kind of code snippets that are easy to insert?
                                >
                                > My goal is to develop code as rapidly as possible since I'll have a lot
                                > to crank out.

                                There's wxWidgets. It has drag and drop (for the builder). It's
                                cross-platform, i.e. Windows, Linux, etc. There's an article on wxWidgets
                                in the latest "The Perl Review" (for subscribers). The subscription is
                                reasonably priced, and I look forward to each issue's arrival. (I have no
                                affiliation; I'm just a subscriber.)

                                http://www.theperlreview.com/

                                The summer issue has the article on wxWidgets.

                                Other than what I've shared, I know nothing about wxWidgets. I want to try
                                it, but it'll be a few weeks until I do because of other business.

                                > > I'm thinking I'll want to make a GUI out of it now. I was doing some
                                > > research and found "The GUI Loft". It's a front end that uses
                                > > Win32::GUI. I haven't used it yet, but it seems like what I need. If
                                > > anyone has a better suggestion, please let me know.

                                --
                                Alan.