Smart quotes + Characters to HTML = BAD!
- There's quite a major bug in the way that certain typesetting
characters-- specifically, the em and en dash and smart quotes-- are
handled by the Modify->Characters to HTML command. They're all
replaced with entities, but three of the entities is deprecated and
two are just plain wrong.
These are the replacements made by NoteTab, which seem to be
hard-coded into the program as far as I can tell:
left single quote (alt-0145) = ‘
This entity is correct, but not supported by all browsers.
right single quote (alt-0146) = ´
This is the WRONG character, though it looks somewhat like a right
single quote.
left double quote (alt-0147) =
This entity is deprecated, though supported by some browsers.
right double quote (alt-0148) = ’
This is a right *single* quote, not a right double quote!
en dash (alt-0150) =
em dash (alt-0151) =
Both dashes' entities are deprecated.
These really need to be fixed for those of us who wish to easily
publish HTML documents containing curly quotes and dashes. Even
deprecated entities would be better than those which are just plain
wrong.
The correct XHTML-compliant entities, for the record, are:
left single (alt-0145) = ‘
right single (alt-0146) = ’
left double (alt-0147) = “
right double (alt-0148) = ”
en dash (alt-0150) = –
em dash (alt-0151) = —
-- cody a.k.a. codeman38
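For anyone who wants to run the conversion outside NoteTab, the list above can be sketched in a few lines of Python. This is only an illustration, not how NoteTab does it. The names `build_table` and `curly_to_entities` are made up, and the sketch assumes two things: that the Alt codes are Windows-1252 byte values, and that the named HTML 4 entities (lsquo, rsquo, ldquo, rdquo, ndash, mdash) are the ones wanted. The numeric forms would serve just as well.

```python
from html.entities import codepoint2name

# Windows-1252 byte values behind the Alt codes in the list above
# (assumption: a Western-locale Windows setup).
ALT_CODES = [145, 146, 147, 148, 150, 151]

def build_table(alt_codes=ALT_CODES):
    """Map each typographic character to a named entity (assumed choice)."""
    table = {}
    for code in alt_codes:
        char = bytes([code]).decode("cp1252")
        table[char] = "&" + codepoint2name[ord(char)] + ";"
    return table

def curly_to_entities(text, table=None):
    """Hypothetical helper: replace curly quotes and dashes with entities."""
    table = build_table() if table is None else table
    # Characters not in the table pass through unchanged.
    return "".join(table.get(ch, ch) for ch in text)

if __name__ == "__main__":
    print(curly_to_entities("\u201cIt\u2019s 1990\u20132002\u201d"))
```

Building the table from `codepoint2name` rather than typing the entities by hand avoids exactly the kind of mix-up described above, where a right double quote ends up mapped to a right single quote.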
- --- In email@example.com, "Cody B." <cody@z...> wrote:
> ...three of the entities is deprecated and
> two are just plain wrong.
Erm. Three of the entities *are* deprecated, not is. I really ought to
proofread my posts after I've made changes to them... :-/
-- cody, compulsive proofreader
>Erm. Three of the entities *are* deprecated, not is. I really ought to
>proofread my posts after I've made changes to them... :-/
Bah, I do that all the time. :-)
Yeah, they are all illegal but I'm wondering, how did you get the curlies
and em and en dashes into NoteTab in the first place? At first I didn't
understand what you were talking about because NTB has never converted any
quotes for me but then I got clever and used the character map to insert
curlies and now NTB did what you say.
If this or something similar is how you do it, I guess it's debatable whether
NTB does it right or wrong. After all, those are the characters it's given. Or am I wrong
in thinking all curlies aren't the same curlies?
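On the "same curlies" question: they are all distinct characters. One quick way to check, assuming the Alt+0nnn codes are Windows-1252 byte values (true on a Western-locale Windows setup, an assumption elsewhere), is to ask Python for their Unicode names. `describe_alt_code` is just an illustrative name.

```python
import unicodedata

def describe_alt_code(code):
    """Return (code point, Unicode name) for a Windows-1252 byte value."""
    char = bytes([code]).decode("cp1252")
    return "U+%04X" % ord(char), unicodedata.name(char)

if __name__ == "__main__":
    # 180 is the acute accent, included for comparison with the
    # right single quote (146).
    for code in (145, 146, 147, 148, 150, 151, 180):
        print("alt-0%d" % code, *describe_alt_code(code))
```

The acute accent and the right single quote come out as two unrelated characters, so a converter has to treat each one separately.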
Talking about the character conversion feature, wasn't there a fix that
stopped NTB from converting spaces to &nbsp;? Something in the ini? Can
someone refresh my memory?
- Hi Lotta,
>Talking about the character conversion feature, wasn't there a fix that
>stopped NTB from converting spaces to &nbsp;? Something in the ini? Can
>someone refresh my memory?
I'm pretty sure you are right. It was one of those "silent"
fixes like "I'll fix it in the next release" (betas at the time).
For now you can just run a Clip for as many as you need such as:
^!Replace "�" >> "" WAS
^!Replace "�" >> "" WAS
or just type it as you go: Alt+0147 and Alt+0148
I notice NoteTab does OK on the 147, but renders a single curly closing for 148.
I don't use them, so the above is just a how-to, not a "this is
correct" thing. So don't get all geek'd out on me. :)
This was not it, I do not think. I'll try to get Eric to step in
here and let us know where we stand.
---- Date: Mon, 17 Jun 2002 14:07:18 +0200 ----
Subject: RE: [NRN] NTP 4.9: Characters to HTML !?
Content-Type: text/html; charset=US-ASCII
Ruben you wrote:
> Eric added this as new feature.
> Often spaces are used to align text, converting them to &nbsp;'s
> keeps that the way the originator intended. If you do not translate
> them, all spaces will be compressed to one by the browser.
Uhu...I'm with Knut. This is not good. In fact it will be a PITA. Both
"Extended characters" and "All special characters" work this way too it seems.
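A possible middle ground between the two positions, sketched in Python: leave single spaces alone and only protect runs of two or more, since those are the ones a browser would collapse. This is not what NoteTab does, and `preserve_alignment` is a made-up name. A single leading space on a line gets no special handling here.

```python
import re

def preserve_alignment(text):
    """Keep runs of two or more spaces from collapsing in a browser."""
    def pad(match):
        run = match.group(0)
        # Keep one ordinary space at the end so lines can still wrap.
        return "&nbsp;" * (len(run) - 1) + " "
    return re.sub(r" {2,}", pad, text)

if __name__ == "__main__":
    print(preserve_alignment("Name    Size"))
    print(preserve_alignment("ordinary prose stays untouched"))
```

Normal running text keeps its plain spaces, so the output stays readable and editable, while column-style alignment survives the trip to HTML.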
- Hi cody,
>There's quite a major bug in the way that certain typesetting
>characters-- specifically, the em and en dash and smart quotes-- are
>handled by the Modify->Characters to HTML command. They're all
>replaced with entities, but three of the entities is deprecated and
>two are just plain wrong.
Thanks for your feedback. Referring to this as a "major bug" is a slight
exaggeration in my opinion <g>.
The "Characters to HTML" command has been around for quite a while.
At the time of its development, the entities you mentioned were not
deprecated. I haven't made changes to the code because I believe there are
more browsers that will fail to correctly render the compliant entities
than the deprecated ones. I'm planning to make the entity conversion user
customizable in NoteTab 5, so this won't really be an issue anymore.
I've taken note of the two wrong entities and will correct them in the next
release.
Eric G.V. Fookes
Author of NoteTab, Mailbag Assistant, and Easy Imager
http://www.fookes.com/ and http://www.notetab.com/