5498 Re: [NH] Code for HTML Extended Characters

  • Marcelo de Castro Bastos
    May 2, 2006
      On the last exciting episode, aired on 2/5/2006 17:53, Jody invited the
      wrath of the gods by saying:
      > Hi All,
      >
      > I know I have this somewhere, but does anybody have a list of the
      > extended characters converted to HTML that are not working
      > correctly. I was once told that it was from a certain number and
      > all the way up the list from there, if that makes any sense to
      > you. <g>
      >
      Well, not in table form, no, but...
      ... I made a couple of clips to solve this issue. For instance, numbered
      entities in the range 128-159 (&#128; through &#159;) are invalid in HTML
      and should never be used; since I sometimes receive an HTML file from
      somewhere with this problem (Tidy will scream bloody murder when trying
      to process it), I wrote this clip to convert these entities to the
      correct named entities:

      ^!Replace "&#131;" >> "&fnof;" TWSA
      ^!Replace "&#149;" >> "&bull;" TWSA
      ^!Replace "&#133;" >> "&hellip;" TWSA
      ^!Replace "&#140;" >> "&OElig;" TWSA
      ^!Replace "&#156;" >> "&oelig;" TWSA
      ^!Replace "&#138;" >> "&Scaron;" TWSA
      ^!Replace "&#154;" >> "&scaron;" TWSA
      ^!Replace "&#136;" >> "&circ;" TWSA
      ^!Replace "&#152;" >> "&tilde;" TWSA
      ^!Replace "&#150;" >> "&ndash;" TWSA
      ^!Replace "&#151;" >> "&mdash;" TWSA
      ^!Replace "&#145;" >> "&lsquo;" TWSA
      ^!Replace "&#146;" >> "&rsquo;" TWSA
      ^!Replace "&#147;" >> "&ldquo;" TWSA
      ^!Replace "&#148;" >> "&rdquo;" TWSA
      ^!Replace "&#132;" >> "&bdquo;" TWSA
      ^!Replace "&#134;" >> "&dagger;" TWSA
      ^!Replace "&#135;" >> "&Dagger;" TWSA
      ^!Replace "&#137;" >> "&permil;" TWSA
      ^!Replace "&#139;" >> "&lsaquo;" TWSA
      ^!Replace "&#155;" >> "&rsaquo;" TWSA
      ^!Replace "&#128;" >> "&euro;" TWSA
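
For anyone who wants to check the same mapping outside NoteTab, here is a rough Python sketch of the idea, not a translation of the clip. It assumes the stray references in the range 128-159 are really Windows-1252 byte values, decodes each one with the `cp1252` codec, and looks up the HTML 4 entity name in the standard library's `html.entities.codepoint2name` table. The function name `fix_c1_references` is made up for illustration. Hexadecimal references such as `&#x93;` are not handled.

```python
import re
from html.entities import codepoint2name

# Matches decimal references 120-159; the range check below narrows it to 128-159.
_C1_REF = re.compile(r"&#(1[2-5][0-9]);")


def fix_c1_references(html_text: str) -> str:
    """Rewrite &#128;..&#159; as named entities, assuming Windows-1252 origin."""

    def repl(match: re.Match) -> str:
        code = int(match.group(1))
        if not 128 <= code <= 159:
            return match.group(0)  # outside the invalid range, leave untouched
        try:
            char = bytes([code]).decode("cp1252")
        except UnicodeDecodeError:
            # 129, 141, 143, 144 and 157 have no Windows-1252 character
            return match.group(0)
        name = codepoint2name.get(ord(char))
        # Fall back to the real Unicode code point when HTML 4 has no name
        return f"&{name};" if name else f"&#{ord(char)};"

    return _C1_REF.sub(repl, html_text)


if __name__ == "__main__":
    print(fix_c1_references("&#147;Price: &#128;5&#148; &#133;"))
```

Characters with no HTML 4 name (for example 142, the Z with caron) come out as their proper Unicode numeric reference instead, which validators accept.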

      However, when using the "convert to HTML" feature in NoteTab, the
      problem is NOT limited to that; the sad truth is that NoteTab converts
      some characters to the WRONG numbered/named entity. For instance,
      smartquotes are converted wrongly. So, I developed a second clip that
      takes this into account (and does a couple of cosmetic things on the side):

      ^!Replace "&#146;" >> "&rdquo;" TWSA
      ^!Replace "&acute;" >> "&rsquo;" TWSA
      ^!Replace "&#131;" >> "&fnof;" TWSA
      ^!Replace "&#149;" >> "&bull;" TWSA
      ^!Replace "&#133;" >> "&hellip;" TWSA
      ^!Replace "&#140;" >> "&OElig;" TWSA
      ^!Replace "&#156;" >> "&oelig;" TWSA
      ^!Replace "&#138;" >> "&Scaron;" TWSA
      ^!Replace "&#154;" >> "&scaron;" TWSA
      ^!Replace "&#136;" >> "&circ;" TWSA
      ^!Replace "&#152;" >> "&tilde;" TWSA
      ^!Replace "&#150;" >> "&ndash;" TWSA
      ^!Replace "&#151;" >> "&mdash;" TWSA
      ^!Replace "&#145;" >> "&lsquo;" TWSA
      ^!Replace "&#147;" >> "&ldquo;" TWSA
      ^!Replace "&#132;" >> "&bdquo;" TWSA
      ^!Replace "&#134;" >> "&dagger;" TWSA
      ^!Replace "&#135;" >> "&Dagger;" TWSA
      ^!Replace "&#142;" >> "&eacute;" TWSA
      ^!Replace "&#137;" >> "&permil;" TWSA
      ^!Replace "&#139;" >> "&lsaquo;" TWSA
      ^!Replace "&#155;" >> "&rsaquo;" TWSA
      ^!Replace "&#128;" >> "&euro;" TWSA
      ^!Replace "&#130;" >> "&eacute;" TWSA

      Marcelo Bastos