pdf converter to html

  • Adrien Verlee
    Message 1 of 3 , Feb 1, 2013
      Many converters exist.
      Before I wade through them (for once I have to convert two PDFs),
      is there perhaps someone with experience who can say which one to pick?

      Conversion to basic html is sufficient.
      Thanks.

      --
      Translation > www.adrien-verlee.be

      - Unshared information = lost information -
    • Alec Burgess
      Message 2 of 3 , Feb 1, 2013
        Hi Adrien:
        I use pdftotext (http://en.wikipedia.org/wiki/Pdftotext), which you can
        download from http://www.foolabs.com/xpdf/home.html
        I run this command:
        ^!replace ".*" >> "pdftotext.exe -nopgbrk -layout "$0"" rwais
        on a buffer containing the names of the PDF files to be converted, save
        that as a batch file, and run it in the folder containing the files to be
        converted.
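
        For anyone without NoteTab at hand, the same batch-file step can be sketched in Python. This is only a rough equivalent, not the clip itself: the function name `build_batch_lines` and the sample file names are made up for illustration, and it assumes `pdftotext.exe` is on the PATH or in the same folder as the PDFs.

```python
def build_batch_lines(pdf_names, exe="pdftotext.exe"):
    """Return one pdftotext command line per non-empty file name.

    Mirrors the replace above: each name is wrapped in quotes and
    prefixed with the -nopgbrk and -layout options.
    """
    lines = []
    for name in pdf_names:
        name = name.strip()
        if not name:
            continue  # skip blank lines in the file list
        lines.append(f'{exe} -nopgbrk -layout "{name}"')
    return lines


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    for line in build_batch_lines(["report.pdf", "my notes.pdf"]):
        print(line)
```

        Writing the returned lines to a .bat file and running it in the folder with the PDFs should give one TXT file per PDF, as with the NoteTab approach.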

        I then run this clip:
        H=pdfToTextToHtml
        ; --- use this on list of files ^!replace ".*" >> "pdftotext.exe -nopgbrk -layout "$0"" rwais
        ^!replace "^.{5,68}\R(?!\R)" >> "$0\r\n\r\n" rwais
        ^!replace "(?<=\.|"\?\!)\R" >> "\r\n\r\n\r\n" rwais
        ^!select all
        ^!toolbar "join lines"
        ^!toolbar "Document to HTML"
        ^!replace "(?<=[a-z])\</p\>\r\n\r\n\<p\>(?!chapter)" >> "\x20" rwais
        ^!replace "(?<=,|\-|\.\.\.|:)\</p\>\r\n\r\n\<p\>" >> "\x20" rwais
        ^!replace "(Mr|Mrs|Dr)\.\</p\>\r\n\r\n\<p\>\r\n" >> "$1.\x20" rwais
        ^!save as "^$getname(^$getdocname$)$.html"

        on the resulting TXT files, which creates the basic HTML files you requested.
        Note: the ^!replace lines above are just tweaks I use to get
        paragraph and line breaks the way I want them.
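
        As a rough illustration of what the clip does at its core (turning plain text into paragraphs of basic HTML), here is a minimal Python sketch. It is not a port of the clip: it assumes paragraphs are separated by blank lines and leaves out the tweaks for short lines, sentence endings, and "Mr./Mrs./Dr.". The function name `text_to_basic_html` and the default title are invented for the example.

```python
import html
import re


def text_to_basic_html(text, title="Converted document"):
    """Wrap blank-line-separated paragraphs of plain text in <p> tags."""
    # Normalize line endings and drop form feeds (page breaks).
    text = text.replace("\r\n", "\n").replace("\r", "\n").replace("\f", "")
    # One or more blank lines separate paragraphs.
    blocks = re.split(r"\n\s*\n", text.strip())
    paragraphs = []
    for block in blocks:
        # Join the wrapped lines of a paragraph with single spaces.
        joined = " ".join(
            line.strip() for line in block.split("\n") if line.strip()
        )
        if joined:
            # Escape &, <, > and quotes so the text is safe inside HTML.
            paragraphs.append(f"<p>{html.escape(joined)}</p>")
    body = "\n\n".join(paragraphs)
    return (
        "<html>\n<head><title>" + html.escape(title) + "</title></head>\n"
        "<body>\n" + body + "\n</body>\n</html>\n"
    )
```

        Reading each TXT file, passing its contents through this function, and saving the result under the same name with an .html extension would correspond to the last line of the clip.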


        On 2013-02-01 07:37, Adrien Verlee wrote:
        > Many converters exist.
        > Before I wade through them (for once I have to convert two PDFs),
        > is there perhaps someone with experience who can say which one to pick?
        >
        > Conversion to basic html is sufficient.
        --
        Regards ... Alec (buralex@gmail & WinLiveMess - alec.m.burgess@skype)
      • Adrien Verlee
        Message 3 of 3 , Feb 2, 2013
          On 2/02/2013 0:04, Alec Burgess wrote:
          > I use pdftotext http://en.wikipedia.org/wiki/Pdftotext download from
          > http://www.foolabs.com/xpdf/home.html

          Sometimes one is foolish enough to look far afield when the solution
          lies right next door.

          I only had to open the PDF in Acrobat Reader X and save it as text,
          which my Word macro can then run on.

          But thanks for your post.
          --
          Translation > www.adrien-verlee.be

          - Unshared information = lost information -