
Strange characters...

  • Bob Gorman
    Message 1 of 28, Nov 9, 2008
      Friends,
      Just today I saw some strange characters when I uploaded a web page to my
      website.
      If I select View in Browser from NoteTab5, it displays OK.
      The characters are superscript 1 and 2 and are in the same font as the
      rest of my page, namely ComicSans.
      But after I upload it to my ISP I get these funny looking characters.
      It looks like
      FF
      FD
      with a box around it.
      Sounds like something to do with fonts, but what?
      If you want to see it, it's at:
      http://www.kncell.org/ILP.html
      at the bottom of the 1st table.
      Thanks

      Bob

      --
      'I am only one; but I am still one.
      I cannot do everything, but still I can do something.
      I will not refuse to do the something I can do.'
      -- Helen Keller


    • Axel Berger
      Message 2 of 28, Nov 9, 2008
        Bob Gorman wrote:
        > Sounds like something to do with fonts, but what?

        Simple. You don't ever bother to declare a character set and use illegal
        characters.

        After telling it to ignore the declaration (by the server) of UTF-8 and
        to use cp-1252 (an easy guess, it is always Windows adherents, who
        believe their way is the only way) your page still contains 20 errors.

        First write at least semantically correct code that at least validates.
        If there are problems left then, do go ahead and ask. Doctoring around
        in pages without correcting the mistakes first is a fool's errand.

        See:
        http://validator.w3.org/check?uri=http%3A%2F%2Fwww.kncell.org%2FILP.html&charset=windows-1252&doctype=Inline&group=0&user-agent=W3C_Validator%2F1.591

        Axel
      • loro
        Message 3 of 28, Nov 9, 2008
          Bob Gorman wrote:
          >Just today I saw some strange characters when I uploaded a web page to my
          >website.
          >If I select View in Browser from NoteTab5, it displays OK.
          >The characters are superscript 1 and 2 and are in the same font as the
          >rest of my page, namely ComicSans.
          >But after I upload it to my ISP I get these funny looking characters.
          >It looks like
          >FF
          >FD
          >with a box around it.
          >Sounds like something to do with fonts, but what?
          >If you want to see it, it's at:
          >http://www.kncell.org/ILP.html
          >at the bottom of the 1st table.

          I don't get anything fancy like that. Just question marks. :-(

          Character encoding mismatch. You declare UTF-8 on the server but I
          guess you wrote the page in NoteTab, which means ANSI. Characters
          other than the basic ASCII ones are encoded differently in UTF-8 and
          ANSI/iso latin, so browsers throw a fit. Use &sup1; and &sup2; if you
          want to declare UTF-8 on the server and all should be well. Or change
          to iso latin on the server if you want to type the superscript characters.
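
To see the mismatch concretely, here is a minimal Python sketch. It assumes the editor saved the two superscripts as single Windows-1252 bytes (0xB9 and 0xB2); that is an inference from this thread, not something checked against Bob's file. When those bytes are read as UTF-8 they are invalid, and a decoder substitutes the replacement character U+FFFD, which would fit the boxed "FF FD" Bob describes.

```python
# Assumption: the editor stored the superscripts as single Windows-1252 bytes.
raw = "\u00b9\u00b2".encode("cp1252")  # superscript 1 and superscript 2

# What a browser sees when told "this is UTF-8": neither byte is valid
# on its own, so each one is replaced by U+FFFD.
as_utf8 = raw.decode("utf-8", errors="replace")

# What the same bytes mean in the encoding they were written in.
as_cp1252 = raw.decode("cp1252")

print(raw)
print([f"U+{ord(c):04X}" for c in as_utf8])
print(as_cp1252)
```

Whether a browser draws U+FFFD as a question mark or as a small box of hex digits depends on the fonts installed, which may be why different readers of the page saw different things.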

          Lotta
        • loro
          Message 4 of 28, Nov 9, 2008
            Hold your horses, Axel. No need to be rude, is there? Everyone isn't
            at the same level at the same time, you know.

            Axel Berger wrote:
            >Simple. You don't ever bother to declare a character set and use illegal
            >characters.

            Yes, he does.

            >After telling it to ignore the declaration (by the server) of UTF-8

            There you see! He does declare a charset. In what way does he say
            that UTF-8 should be ignored? How is it even possible to do that?

            >and
            >to use cp-1252 (an easy guess, it is always Windows adherents, who
            >believe their way is the only way) your page still contains 20 errors.

            Easy maybe, but wrong. Superscript 1 and 2 are not in the illegal range.

            >First write at least semantically correct code that at least validates.
            >If there are problems left then, do go ahead and ask.

            I'm not so sure I would come back for more of this...

            > Doctoring around
            >in pages without correcting the mistakes first is a fool's errand.

            Well, it didn't take me many seconds and I didn't mind at all. :-)

            Excuses in advance,
            Lotta
          • Bob Gorman
            Message 5 of 28, Nov 9, 2008
              Lotta & Axel Thank you, Thank you, Thank you.

              loro wrote:
              > I don't get anything fancy like that. Just question marks. :-(
              >
              > Character encoding mismatch. You declare UTF-8 on the server
              How? Where? How do I change it, if that is what I need to do?
              Is there an explanation about such matters?

              > but I guess you wrote the page in Notetab,
              Absolutely!
              > which means ANSI. Characters
              > other than the basic ASCII ones are encoded differently in UTF-8 and
              > ANSI/iso latin, so browsers throw a fit. Use &sup1; and &sup2; if you
              > want to declare UTF-8 on the server and all should be well.
              I did that and it worked well, but I have no clue why.

              > Or change to iso latin on the server if you want to type the
              > superscript characters.
              Again, how do I do that?
              Something in the Head part of my html files?
              > Lotta

              Axel,
              I'll reply separately...

              Bob

              --
              If at first, you don't succeed; Parachuting is probably not for you!
              http://www.KnCell.org
              http://blog.KnCell.org
              For sale: Parachute. Only used once, never opened, small stain...


            • Axel Berger
              Message 6 of 28, Nov 9, 2008
                Bob Gorman wrote:
                > Something in the Head part of my html files?

                Yes, that too. First you need to shut up that server. You might need
                help from your provider, but this line in .htaccess ought to be the
                first step:

                AddDefaultCharset Off

                See: http://wsabstract.com/howto/htaccess.shtml

                You might even be able to make it declare the correct set, but few
                providers offer that.


                Then your header should contain the line:

                <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=US-ASCII">

                or

                <META HTTP-EQUIV="Content-Type"
                CONTENT="text/html; charset=windows-1252">

                And last you should use Firefox and the "Html Validator" extension for
                testing, the easiest and quickest way for flagging all those syntax
                errors that will creep in, however diligent you try to be.

                Axel
              • Axel Berger
                Message 7 of 28, Nov 9, 2008
                  loro wrote:
                  > There you see! He does declare a charset.

                  In a way yes, but it is in fact the provider doing it for him, and
                  doing it wrong for the actual content.

                  > Easy maybe, but wrong. Superscript 1 and 2 are not in the illegal range.

                  For UTF-8 the ones Bob uses are.

                  Axel
                • Bob Gorman
                  Message 8 of 28, Nov 9, 2008
                    Axel Berger wrote:

                    Thank you, but you raise other issues that concern me...

                    > Simple. You don't ever bother to declare a character set and use illegal
                    > characters.
                    How? Where did I do that, or fail to do that?

                    > After telling it to ignore the declaration (by the server) of UTF-8 and
                    > to use cp-1252 (an easy guess, it is always Windows adherents, who
                    > believe their way is the only way) your page still contains 20 errors.

                    I share your contempt for Windows. But how do I fix it?

                    > First write at least semantically correct code that at least validates.
                    > If there are problems left then, do go ahead and ask. Doctoring around
                    > in pages without correcting the mistakes first is a fool's errand.

                    Sorry, how do I mend my ways?


                    > See:
                    > http://validator.w3.org/check?uri=http%3A%2F%2Fwww.kncell.org%2FILP.html&charset=windows-1252&doctype=Inline&group=0&user-agent=W3C_Validator%2F1.591
                    >
                    > Axel
                    I did, and it seems like the 1 warning was more important than the 30
                    errors.

                    I did not wake up one morning and declare: I want to screw the Internet!

                    Most of my code is generated by using the 2 files:
                    C:\Program Files\NoteTab Pro 5\Libraries\HTML-1.ctb & HTML-2.ctb.

                    Is this the best available, or do I have some deep seated character flaw
                    for choosing these files?

                    Most of the errors seem to have come from using the Table Wizard.
                    Is it more illusive than the Wizard in "Alice in Wonderland"?

                    Bob

                    --
                    "The difference between fiction and reality?
                    Fiction has to make sense."
                    -- Tom Clancy



                  • Axel Berger
                    Message 9 of 28, Nov 9, 2008
                      Bob Gorman wrote:
                      > I share your contempt for Windows. But how do I fix it?

                      Ignore that, it was a snide remark and not very nice. When the charset
                      is wrong the validator stops right there and tells you no more. So I had
                      to use its "try this set instead" option and had to guess which one --
                      Windows default was the obvious choice.

                      > Most of the errors seem to have come from using the Table Wizard.
                      > Is it more illusive than the Wizard in "Alice in Wonderland"?

                      Possibly, I haven't looked. Unfortunately my best HTML tutorial and
                      syntax lookup is in German - hopefully the others here can suggest
                      something for you.
                      I use my own very heavily edited set of clips, but I believe the default
                      ones in NoteTab contain a lot of old deprecated and non-standardized
                      stuff.

                      Some will say that insisting on valid code is over the top purism, but
                      when things don't act as you expect them to, finding some elusive hidden
                      syntax error can take ages. I have found that eliminating all those and
                      ensuring valid HTML and valid CSS first and then going after my own
                      faulty logic about how stuff should behave is the fastest and easiest
                      way to get results.

                      Your main problem is that the headers sent by the server take precedence
                      over your own META headers in the code, so you need to make the server
                      switch those off. The line I quoted is from one of my own .htaccess
                      files. It ought to work. With that off the META declaration will be used
                      and that should be what you actually use in your files. For NoteTab and
                      Windows this is NOT iso 8859-1 by the way, it nearly is but not quite.
                      You may never actually use the € sign, but why not declare the correct
                      set anyway?
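
The "nearly but not quite" difference can be checked with a few lines of Python. This is only a sketch using Python's built-in codecs; the variable names are made up for illustration. It shows that byte 0x80 is the Euro sign in Windows-1252 but an invisible control character in ISO 8859-1, while the bytes from 0xA0 upwards mean the same thing in both.

```python
euro_byte = b"\x80"

in_cp1252 = euro_byte.decode("cp1252")      # the Euro sign, U+20AC
in_latin1 = euro_byte.decode("iso-8859-1")  # a C1 control character, U+0080

# From 0xA0 to 0xFF the two encodings assign identical characters.
same_upper = all(
    bytes([b]).decode("cp1252") == bytes([b]).decode("iso-8859-1")
    for b in range(0xA0, 0x100)
)

print(f"U+{ord(in_cp1252):04X}", f"U+{ord(in_latin1):04X}", same_upper)
```

So the two declarations only disagree about the range 0x80 to 0x9F, which is where Windows puts the Euro sign, curly quotes and similar characters.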

                      Purism and fussiness is for lazy people like me, it makes life so much
                      easier.

                      Axel
                    • loro
                      Message 10 of 28, Nov 9, 2008
                        Axel Berger wrote:
                        >loro wrote:
                        > > There you see! He does declare a charset.
                        >
                        >In a way yes,

                        In the only way that matters.

                        > but it is in fact the provider doing for him and doing it
                        >wrong for the actual content.

                        The host shouldn't send that header at all, but we didn't know it was
                        the host's doing. I still don't.

                        > > Easy maybe, but wrong. Superscript 1 and 2 are not in the illegal range.
                        >
                        >For UTF-8 the ones Bob uses are.

                        That's another matter than the illegal windows characters you were
                        referring to.

                        Lotta
                      • Axel Berger
                        Message 11 of 28, Nov 9, 2008
                          loro wrote:
                          > but we didn't know it was the host's doing. I still don't.

                          Where in Bob's source did you find UTF-8? If not there it must be in the
                          HTML headers. It has to be somewhere.

                          > That's another matter than the illegal windows characters you were
                          > referring to.

                          No. The browser and the validator first look what encoding to expect and
                          then parse the code using that. So what is illegal and what is not
                          solely depends on that declaration. In another context those very same
                          characters may be perfectly legal, but that doesn't matter.

                          Axel
                        • loro
                          Message 12 of 28, Nov 9, 2008
                            Axel Berger wrote:
                            >loro wrote:
                            > > but we didn't know it was the host's doing. I still don't.
                            >
                            >Where in Bob's source did you find UTF-8? If not there it must be in the
                            >HTML headers. It has to be somewhere.

                            I think you mean HTTP headers. That doesn't mean the host made that happen.

                            > > That's another matter than the illegal windows characters you were
                            > > referring to.
                            >
                            >No. The browser and the validator first look what encoding to expect and
                            >then parse the code using that. So what is illegal and what is not
                            >solely depends on that declaration. In another context those very same
                            >characters may be perfectly legal, but that doesn't matter.

                            Superscript 1 and 2, encoded the way they are, are not exclusive to
                            cp-1252, and that was what you were talking about. They aren't
                            "illegal", they just mean different things in ANSI and UTF-8. No
                            validator refuses to parse Bob's page because of those two
                            characters. They would have thrown an error had they been in the
                            so-called illegal range though.

                            Lotta
                          • loro
                            Message 13 of 28, Nov 9, 2008
                              Axel Berger wrote:
                              >Bob Gorman wrote:
                              > > Something in the Head part of my html files?
                              >
                              >Yes, that too. First you need to shut up that server. You might need
                              >help from your provider, but this line in .htaccess ought to be the
                              >first step:
                              >
                              > AddDefaultCharset Off

                              This is so backwards. If Bob can use .htaccess, he should of course
                              use it to make the server send the character encoding he prefers,
                              which may very well be UTF-8 for all we know. Why in the whole world
                              not use it as it is intended instead of relying solely on a fallback
                              mechanism like Meta?
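
The precedence being argued about here can be written down as a small Python sketch. It is deliberately simplified: it only models "the HTTP header wins, the META declaration is the fallback" as described in this thread, and ignores everything else a real browser looks at. The function names `charset_from_content_type` and `effective_charset` are hypothetical, invented for this illustration.

```python
import re


def charset_from_content_type(value):
    """Return the charset parameter of a Content-Type value, or None."""
    match = re.search(r"charset\s*=\s*[\"']?([\w.:-]+)", value or "", re.IGNORECASE)
    return match.group(1).lower() if match else None


def effective_charset(http_content_type, meta_content):
    """The HTTP header wins; the META value is used only as a fallback."""
    return (charset_from_content_type(http_content_type)
            or charset_from_content_type(meta_content))


# Server declares UTF-8, page declares windows-1252: the server's value is used.
print(effective_charset("text/html; charset=UTF-8",
                        "text/html; charset=windows-1252"))

# Server sends no charset: the META declaration takes effect.
print(effective_charset("text/html",
                        "text/html; charset=windows-1252"))
```

Either approach discussed above gives a consistent result: make the server send the charset the file is really written in, or stop it sending one so the META line is honoured.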

                              ><META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=US-ASCII">

                              Except ASCII doesn't cover superscript 1 and 2, so that's hardly an
                              improvement.

                              Bob, just use the entities (&sup1; and &sup2;) for now and be done
                              with it. You can read up about character encoding when you feel up to
                              it. I fear this will just confuse you and I'm sorry for my part in that.
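
For anyone wondering why the entities work when the typed characters did not, here is a short Python sketch. The sample string is made up. It shows that markup written with &sup1; and &sup2; consists of plain ASCII bytes, so it reads the same whichever of the charsets discussed in this thread is declared; the browser turns the entity into the character only after decoding.

```python
import html

markup = "x&sup1; and m&sup2;"  # hypothetical snippet of page source

# The source contains only ASCII, so every declared charset decodes it identically.
source_bytes = markup.encode("ascii")
decoded = {enc: source_bytes.decode(enc) for enc in ("utf-8", "cp1252", "iso-8859-1")}

# The entity is resolved to the real character after decoding.
rendered = html.unescape(markup)

print(decoded)
print(rendered)
```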

                              Lotta
                            • Bob Gorman
                              Message 14 of 28, Nov 9, 2008
                                loro wrote:

                                > Bob, just use the entities (&sup1; and &sup2;) for now and be done
                                > with it.

                                Yes, I did, & I'm happy for now.

                                > You can read up about character encoding when you feel up to
                                > it.

                                I will.
                                I obviously need to learn about character sets and this mysterious
                                .htaccess, but it can wait till I get a good night's sleep.

                                I have 30+ web pages now and plan over time to double that, so I want to
                                learn good practices now, to avoid excessive fix-up later.

                                Thanks, and good night.

                                Bob
                              • Axel Berger
                                Message 15 of 28, Nov 9, 2008
                                  loro wrote:
                                  > I think you mean HTTP headers.

                                  Absolutely. It was way past midnight here, when I wrote that.

                                  > That doesn't mean the host made that happen.

                                  Well, someone who has control over the server. And that usually is not
                                  the customer or only to the very limited degree .htaccess allows.

                                  > Superscript 1 and 2, encoded the way they are, are not exclusive to
                                  > cp-1252, and that was what you were talking about. They aren't
                                  > "illegal", they just mean different things in ANSI and UTF-8.

                                  I have to admit to not being firm in UTF-8. I do know that (nearly?)
                                  everything that's in the upper 128 for other encodings is a two
                                  character sequence in UTF-8. And I have tried this: The validator said
                                  "illegal, no UTF-8" first and was satisfied when I overrode that with
                                  telling it "use cp-1252". I have not checked which characters were the
                                  offenders, but the ones that showed up wrong in the browser is a good
                                  guess IMHO.
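
The "(nearly?)" can be pinned down with a quick Python check. This is only a sketch with three sample characters picked for illustration. Characters that are a single byte in Windows-1252 become multi-byte sequences in UTF-8: two bytes for most of them, three for some such as the Euro sign.

```python
samples = ["\u00b9", "\u00e4", "\u20ac"]  # superscript 1, a-umlaut, Euro sign

# Map each character to its byte length in (cp1252, UTF-8).
lengths = {
    ch: (len(ch.encode("cp1252")), len(ch.encode("utf-8")))
    for ch in samples
}

for ch in samples:
    print(f"U+{ord(ch):04X}", ch.encode("cp1252"), ch.encode("utf-8"))
```

A lone byte from the upper half therefore never forms a valid UTF-8 sequence by itself, which is consistent with what the validator reported.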

                                  Axel
                                • loro
                                  Message 16 of 28, Nov 9, 2008
                                    Axel Berger wrote:
                                    > > That doesn't mean the host made that happen.
                                    >
                                    >Well, someone who has control over the server. And that usually is not
                                    >the customer or only to the very limited degree .htaccess allows.

                                    But you suggested Bob would use .htaccess. I'd say declaring the
                                    charset is one of the most common uses people make of .htaccess.

                                    >UTF-8. And I have tried this: The validator said
                                    >"illegal, no UTF-8" first and was satisfied when I overrode that with
                                    >telling it "use cp-1252".

                                    You are absolutely right. The W3C validator does do that now (while
                                    the WDG one does not). My bad.

                                    Lotta
                                  • Axel Berger
                                    Message 17 of 28, Nov 9, 2008
                                      loro wrote:
                                      > This is so backwards. If Bob can use .htaccess, he should of course
                                      > use it to make the server send the character encoding he prefers,
                                      > which may very well be UTF-8 for all we know.

                                      We do know, Bob told us. He uses NoteTab and writes in his native
                                      Windows charset.
                                      Apart from that you're right of course, and I already said so.
                                      If Bob can make the server send the correct HTTP headers, that's best.
                                      Only his provider can tell him that. The two providers I'm using (one is
                                      my own choice and with the other I'm webmaster for someone else) don't,
                                      but at least allow me to stop them sending the wrong ones.

                                      > Bob, just use the entities (&sup1; and &sup2;) for now and be done
                                      > with it.

                                      That is a possibility. It is the one I use on my own site for maximum
                                      backwards compatibility and there I declare US-ASCII in step with what
                                      I'm actually doing.

                                      For the other site I made easy maintainability by others the priority and
                                      declare cp-1252, meaning that they can just type whatever their Windows
                                      computer allows them and need not bother about encoding.
                                      Unless you want to restrict yourself to the lowest common denominator on
                                      ideological grounds, like I do, that's the best choice. It means in
                                      essence "whatever you can type and display correctly in NoteTab, the
                                      server and browser will accept and display correctly too."

                                      > You can read up about character encoding when you feel up to it.

                                      Bob, there really is not much to it. Most computers use a 256 character
                                      alphabet - I'm ignoring extensions like UTF for the moment. In all these
                                      the first 128 characters are identical and standardized by ASCII. The
                                      top 128 ones, your ä ö ü é ê € µ and so on, can be all over the place.
                                      This used to be more of a problem when the Macs, Ataris, Amigas, DOS with
                                      cp-437, DOS with cp-850 and so on all had sizeable market shares. As
                                      long as you are using Windows and don't switch to cyrillic, greek,
                                      hebrew or something like that, everything you type and display will be
                                      encoded as cp-1252 (of which terms like AnsiNew and others are synonyms,
                                      but not ANSI, Latin-1 or ISO 8859-1). So if you go and tell that to the
                                      browsers rendering your pages, you'll be fine. If you don't, they or the
                                      server have to guess and may guess wrong. That's all there is to it.
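
The "all over the place" part is easy to try out in Python, which ships codecs for several of the code pages named above. The sketch below (variable names are illustrative only) checks that the lower 128 values agree with ASCII in each of them, and then decodes one upper-half byte under each code page to show that its meaning depends on the declaration.

```python
encodings = ["cp1252", "iso-8859-1", "cp437", "cp850"]

# The lower half (0-127) decodes to the same text in every one of these code pages.
ascii_bytes = bytes(range(128))
shared_lower = all(
    ascii_bytes.decode(enc) == ascii_bytes.decode("ascii") for enc in encodings
)

# One byte from the upper half, decoded under each code page.
byte = b"\xe4"
meanings = {enc: byte.decode(enc) for enc in encodings}

print(shared_lower)
for enc, ch in meanings.items():
    print(enc, f"U+{ord(ch):04X}")
```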

                                      Axel
                                    • loro
                                      Message 18 of 28, Nov 10, 2008
                                        >We do know, Bob told us. He uses NoteTab and writes in his native
                                        >Windows charset.

                                        Axel, it's only the so-called illegal range that's unique to the
                                        windows codepage. The rest, as the superscript characters at hand, are not.

                                        >If Bob can make the server send the correct HTTP headers, that's best.
                                        >Only his provider can tell him that.

                                        Trial and error works pretty well too. ;-)

                                        >The two providers I'm using (one is
                                        >my own choice and with the other I'm webmaster for someone else) don't,
                                        >but at least allow me to stop them sending the wrong ones.

                                        Do they let you use .htaccess but they don't let you use it to
                                        declare the character encoding? That sounds strange and unusual indeed.


                                        >encoded as cp-1252 (of which terms like AnsiNew and others are synonyms,
                                        >but not ANSI, Latin-1 or ISO 8859-1). So if you go and tell that to the
                                        >browsers rendering your pages, you'll be fine.

                                        So he will with an iso latin charset.

                                        I'll be quiet now. This doesn't lead anywhere and has very little to
                                        do with Bob's question. Again, I'm sorry for this bickering. It
                                        really wasn't my intention but that's how it turned out. I just
                                        wanted Bob to get an answer to his question.

                                        Lotta