Re: [NH] Clipbook / program to convert Word junk -> HTML

  • Julie
    Message 1 of 9 , Jan 3, 2007
      At 1/3/2007 10:29 AM, Rudolf Horbas wrote:

      >Here I can help, my New Year holiday is over since yesterday:
      >http://tidy.sourceforge.net/docs/quickref.html#word-2000
      >
      >Instead of making a new config file for NoteTab, I'd suggest to use
      >TidyGUI (no longer maintained, but functional):
      >http://perso.orange.fr/ablavier/TidyGUI/index.html
      >
      >The tab "cleanup" has the option "Source document is from MS Word 2000".

      Are characters like this font dependent? I've tried all the encoding
      options, as well as checking the "source document is from MS Word
      2000" tab, but none of the tries has correctly converted the characters.

      from the http://textism.com/wordcleaner/ site my text from my other
      post translates as

      “Good and ill have not changed since yesteryear; nor are they
      one thing among Elves and Dwarves and another thing among Men. It is
      man’s part to discern them as much in the Golden Wood as in his
      own house.” Aragorn to Éomer

      Which looks fine on preview in the browser. Any helpful hints?
    • loro
      Message 2 of 9 , Jan 3, 2007
        Julie wrote:
        >from the http://textism.com/wordcleaner/ site my text from my other
        >post translates as
        >
        >“Good and ill have not changed since yesteryear; nor are they
        >one thing among Elves and Dwarves and another thing among Men. It is
        >man’s part to discern them as much in the Golden Wood as in his
        >own house.” Aragorn to Éomer
        >
        >Which looks fine on preview in the browser. Any helpful hints?

        It's Word's curly quotes that give you trouble. They are non-standard. You
        can turn them off in Word. I don't know what happens if you try to turn
        them off on a document that already has them. Maybe if you turn them off
        and then paste the text into a new document?

        I think it's this:
        Tools | AutoCorrect, then you have them on both the AutoFormat and
        AutoFormat As You Type tabs, "Replace straight quotes with smart quotes".

        Too bad about WordCleaner. It used to be free. :-(

        Lotta
      • Julie
        Message 3 of 9 , Jan 3, 2007
          Hey Lotta

          >It's Word's curly quotes that give you trouble. They are non-standard. You
          >can turn them off in Word. I don't know what happens if you try to turn
          >them off on a document that already has them. Maybe if you turn them off
          >and then paste the text into a new document?

          It's also accented letters like Éomer. Many of
          these are articles that have been posted in blogs
          that I've collected... I can't believe people
          posted that mess! A friend wants to repost them
          cleaned up, so I thought I'd see if there was an easy way to do this. :-)

          >Too bad about WordCleaner. It used to be free. :-(

          The site gives me six uses a day. The potential
          project isn't a rush at least... doesn't matter
          how long it takes, but I have a substantial
          number of articles to convert. Could take a while. LOL

          Julie


          --
          No virus found in this outgoing message.
          Checked by AVG Free Edition.
          Version: 7.5.432 / Virus Database: 268.16.4/615 - Release Date: 1/3/2007 1:34 PM
        • loro
          Message 4 of 9 , Jan 3, 2007
            I wrote:
            >It's Word's curly quotes that give you trouble.

            You can do it with Notetab too. Notetab can display the curly quotes and
            the Replace thingie recognizes them, so you can select one of each kind and
            do a "replace all" with the entity for the corresponding legit curly quote.
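
For anyone who would rather script the same "replace all" than do it by hand, here is a minimal sketch in Python. It is not from the thread: the function name and the mapping table are illustrative, and it assumes the text has already been read in as proper Unicode. Any other non-ASCII character (such as the É in Éomer) falls back to a numeric character reference.

```python
# Hypothetical helper: swap Word's curly quotes for named HTML entities,
# then turn any remaining non-ASCII character into a numeric reference.
CURLY_QUOTES_TO_ENTITIES = {
    "\u201c": "&ldquo;",  # left double quote
    "\u201d": "&rdquo;",  # right double quote
    "\u2018": "&lsquo;",  # left single quote
    "\u2019": "&rsquo;",  # right single quote / apostrophe
}


def curly_quotes_to_entities(text: str) -> str:
    """Return text with curly quotes and other non-ASCII characters escaped."""
    for char, entity in CURLY_QUOTES_TO_ENTITIES.items():
        text = text.replace(char, entity)
    # "xmlcharrefreplace" is a standard codec error handler that emits
    # &#NNN; for every character the ASCII codec cannot encode.
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


if __name__ == "__main__":
    print(curly_quotes_to_entities("\u201cman\u2019s part\u201d Aragorn to \u00c9omer"))
```

The explicit table keeps the readable named entities for the quotes; everything else is left to the codec so no character is silently dropped.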

            Lotta
          • loro
            Message 5 of 9 , Jan 3, 2007
              Julie wrote:
              > >It's Word's curly quotes that give you trouble.

              >It's also accented letters like Éomer.

              Ah. The first example came through all jumbled so I went by the second one.

              >The site gives me six uses a day. The potential
              >project isn't a rush at least... doesn't matter
              >how long it takes, but I have a substantial
              >number of articles to convert. Could take a while. LOL

              You could use a proxy. ;-o)

              Lotta
            • Julie
              Message 6 of 9 , Jan 3, 2007
                Hey Lotta -

                >You could use a proxy. ;-o)

                The thought has crossed my mind. <G>

                Julie


              • bruce.somers@web.de
                Message 7 of 9 , Jan 4, 2007
                  Julie <gleits@...> wrote:

                  I can't believe people
                  posted that mess! A friend wants to repost them
                  cleaned up, so I thought I'd see if there was an easy way to do this. :-)

                  No, you needn't believe that. It's much more likely that some component (program) used by the poster of the blog entry has replaced what it considered to be non-standard characters (curly quotes, accented characters, etc.) with their corresponding "escape-codes", because many viewers will not have the character sets needed to display them. Many systems recognize only the extremely provincial and badly limited ASCII character set.

                  It's probably the blog software that is not able to replace those escape-codes with the corresponding characters.
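
If the collected articles really do contain such escape-codes as HTML character references, turning them back into the characters is a one-liner in Python. This is only a sketch under that assumption: the sample string and the function name are made up for illustration, and the escapes in the actual blog posts may take a different form.

```python
import html


def decode_escape_codes(text: str) -> str:
    """Replace HTML character references (named or numeric) with the characters."""
    return html.unescape(text)


if __name__ == "__main__":
    # Hypothetical sample, not taken from the blog posts themselves.
    sample = "&#8220;man&#8217;s part&#8221; Aragorn to &Eacute;omer"
    print(decode_escape_codes(sample))
```

The result should then be saved in an encoding that can hold these characters (UTF-8, for example), otherwise the same mess comes back on the next round trip.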

                  Bruce