5500 Re: [NH] Code for HTML Extended Characters

  • Kerry Coates
    May 2, 2006
      Marcelo,
      I still use &quot; for double quotes. Is that not correct anymore? I
      also use &reg; for Registered Trademark and &amp; for the symbol "&", etc.
      When I validate my pages I don't get any errors on these (just other stuff
      I can't seem to get right!)

      Kerry Coates
      Visit all my websites!
      http://www.GilaMountainDulcimers.com
      http://www.Amazing-Health-Products.com
      http://www.GamblingStrategyCards.com
      http://www.CharroCoats.com
      http://www.PaulCoatesGuitar.com


      ----- Original Message -----
      From: "Marcelo de Castro Bastos" <mcblista@...>
      To: <ntb-html@yahoogroups.com>
      Sent: Tuesday, May 02, 2006 6:37 PM
      Subject: Re: [NH] Code for HTML Extended Characters


      > On the last exciting episode, aired on 2/5/2006 17:53, Jody invited the
      > wrath of the gods by saying:
      >> Hi All,
      >>
      >> I know I have this somewhere, but does anybody have a list of the
      >> extended characters converted to HTML that are not working
      >> correctly. I was once told that it was from a certain number and
      >> all the way up the list from there, if that makes any sense to
      >> you. <g>
      >>
      >>
      > Just found an old message in which I gave the same info as in the
      > previous message, only in a more structured way:
      >
      > Marcelo
      >
      > -------- Original Message --------
      > Subject: Re: [NH] Extended Characters Not Getting Converted?
      > Date: Thu, 27 May 2004 16:37:49 -0300
      > From: Marcelo de Castro Bastos <mcblista@...>
      > Reply-To: ntb-html@yahoogroups.com
      > To: ntb-html@yahoogroups.com
      > References: <4.3.2.7.2.20040527085617.00caa690@...>
      >
      >
      >
      > On 27/5/2004 14:58, Jody invited the wrath of the gods by saying:
      >
      >>Hi All,
      >>
      >>(Please don't all reply here at once. ;) See what has been posted -
      >>thanks!)
      >>
      >>There were some reports awhile back about some extended
      >>characters not getting converted correctly. I believe it was when
      >>doing a Document to HTML. They were either incorrect or not
      >>converted at all. I need to know:
      >>
      >>What are the characters (e.g. Alt+0252, ü:
      >>252 FC 374 11111100 &#252; · &uuml; Latin small U, diæresis/umlaut)
      >>
      >>
      >>
      > The main problem stems from the fact that characters from 128 to 159 are
      > UNDEFINED in the Latin-1 (iso-8859-1) encoding. However, Windows has special
      > characters defined in this range, which can cause problems. The ones I
      > noticed, for instance, are all in this range. There are two in particular
      > that annoy the hell out of me, because NoteTab converts them to the wrong
      > entity:
      >
      > 146 92 U+2019 : RIGHT SINGLE QUOTATION MARK (gets converted to &acute;
      > which is NOT the same -- should be &rsquo; or &#8217;)
      > 148 94 U+201D : RIGHT DOUBLE QUOTATION MARK (gets converted to &rsquo;
      > which is plain wrong -- should be &rdquo; or &#8221;)
      >
      > Generally speaking, characters in the range 128-159 should NOT be
      > converted to numeric entities in the range &#128; - &#159; -- this range
      > is undefined and disallowed. Oh, it WORKS, kinda, IF you don't mind that
      > your page won't validate and if you don't mind that people with
      > non-Windows systems will get weird characters. HTMLTidy will fix it --
      > but will also generate a long error report, with the actual *important*
      > coding errors lost in the middle of the garbage. I got to the point that
      > I created a clip to fix the conversion...
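The mismatch described above can be seen by decoding the same byte under both encodings. This is only a rough sketch: the helper name `describe` is made up for illustration, and it relies on Python's `latin-1` codec mapping bytes 128-159 to C1 control characters, while its `cp1252` codec leaves five positions (129, 141, 143, 144, 157) undefined.

```python
import unicodedata


def describe(byte_value):
    """Return (Latin-1 general category, Windows-1252 character name) for one byte."""
    raw = bytes([byte_value])
    # Python's latin-1 codec maps 128-159 to the C1 control characters.
    latin1_char = raw.decode("latin-1")
    try:
        # Windows-1252 assigns printable characters to most of this range.
        cp1252_name = unicodedata.name(raw.decode("cp1252"))
    except UnicodeDecodeError:
        # A few positions are unassigned in Python's cp1252 codec.
        cp1252_name = "UNDEFINED"
    return unicodedata.category(latin1_char), cp1252_name


if __name__ == "__main__":
    for value in (146, 148, 129):
        print(value, describe(value))
```

Category "Cc" means a control character, which is why a page that ships these bytes (or numeric references to them) as Latin-1 cannot count on showing quotes outside Windows.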
      >
      >>What NoteTab does or does not do with them and what it should do...
      >>
      >>
      >>
      > Characters in the 128-159 range should be converted thus (at least for
      > Windows standard fonts, with encoding Windows-1252) -- some authors
      > prefer numeric entities (defined in
      > http://www.w3.org/TR/REC-html40/sgml/entities.html ) because not all
      > browsers will recognize all named entities. I didn't have the time to
      > look up all numeric references one by one, BUT Tidy will convert from
      > characters to numbered entities with NO error messages.
      >
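      > 128 to &euro;
      > 130 to &sbquo;
      > 131 to &fnof;
      > 132 to &bdquo;
      > 133 to &hellip;
      > 134 to &dagger;
      > 135 to &Dagger;
      > 136 to &circ;
      > 137 to &permil;
      > 138 to &Scaron;
      > 139 to &lsaquo;
      > 140 to &OElig;
      > 142 to &#381; (Ž has no named entity in HTML 4)
      > 145 to &lsquo;
      > 146 to &rsquo;
      > 147 to &ldquo;
      > 148 to &rdquo;
      > 149 to &bull;
      > 150 to &ndash;
      > 151 to &mdash;
      > 152 to &tilde;
      > 153 to &trade;
      > 154 to &scaron;
      > 155 to &rsaquo;
      > 156 to &oelig;
      > 158 to &#382; (ž has no named entity in HTML 4)
      > 159 to &Yuml;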
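The same table can be derived mechanically instead of typed by hand. The following is a minimal sketch, not NoteTab's or Tidy's actual logic: `to_entities` is a hypothetical helper that assumes the input bytes are Windows-1252 and uses the HTML 4 entity names shipped in Python's `html.entities` module, falling back to a numeric reference based on the Unicode code point when no name exists.

```python
from html.entities import codepoint2name


def to_entities(raw, named=True):
    """Convert Windows-1252 bytes to ASCII text with HTML character references."""
    text = raw.decode("cp1252")  # assumption: the source text is Windows-1252
    pieces = []
    for ch in text:
        code_point = ord(ch)
        if code_point < 128:
            # ASCII passes through untouched (note: "&", "<", ">" are NOT escaped here).
            pieces.append(ch)
        elif named and code_point in codepoint2name:
            pieces.append("&" + codepoint2name[code_point] + ";")
        else:
            # Numeric reference to the Unicode code point, never to 128-159.
            pieces.append("&#%d;" % code_point)
    return "".join(pieces)


if __name__ == "__main__":
    print(to_entities(b"\x93smart\x94 quotes"))
    print(to_entities(b"It\x92s", named=False))
```

Passing `named=False` gives the all-numeric form that some authors prefer for older browsers.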
      >
      >>Is it NoteTab Pro or NoteTab Standard/Light or all...
      >>
      >>
      >>
      >>
      > I have noticed it both in Light and in Pro.
      >
      >>What steps to take to see it happen...
      >>
      >>
      >>
      > 1. Write a plain text file containing characters in the range above
      > (especially "smart" quotes)
      > 2. Convert to HTML
      > 3. Run Tidy in order to see the error messages (not harmful)
      > 4. Open the HTML file in your browser and check the smart quotes.
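For a quick check of the converted file without wading through a full Tidy report, something along these lines could list only the numeric references that fall in the disallowed 128-159 range. It is an illustrative sketch; `find_c1_references` is a hypothetical name, and the regular expression handles only well-formed decimal and hexadecimal references.

```python
import re

# Matches &#NNN; (decimal) and &#xHH; (hexadecimal) character references.
_NUMERIC_REF = re.compile(r"&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));")


def find_c1_references(html_text):
    """Return (reference, value) pairs whose value lies in 128-159."""
    hits = []
    for match in _NUMERIC_REF.finditer(html_text):
        hex_digits, dec_digits = match.group(1), match.group(2)
        value = int(hex_digits, 16) if hex_digits else int(dec_digits)
        if 128 <= value <= 159:
            hits.append((match.group(0), value))
    return hits


if __name__ == "__main__":
    sample = "It&#146;s a &#147;test&#148; with a valid &#8217; too."
    for reference, value in find_c1_references(sample):
        print(reference, value)
```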
      >
      > --
      > Marcelo de Castro Bastos