10867 RE: [GTh] Coptic Unicode Headaches

  • Rick Hubbard
    Mar 25 2:55 AM
      Hi Mike

      I hadn’t recalled that I mentioned this to the list before, and even now
      I can only vaguely remember experimenting with the different versions of
      Excel.

      In any case, I think I may have identified the cause of the sorting
      issue; however, at this point I have no idea how to resolve it.

      First a bit of background. Every Unicode character is populated with
      a set of 14 properties assigned by the Unicode Technical Committee (UTC).
      Typical categories include things such as "letter", "combining mark",
      "symbol" and so forth, but the item relevant to the sorting issue seems
      to be what is called a “code point”. Every one of the many thousands of
      Unicode characters is assigned a unique code point, a number that is
      conventionally written in hexadecimal. These code points are grouped into
      ranges according to script or sometimes character type. These collections
      are called Code Tables (Unicode’s own term is “blocks”). The Code Table
      for Basic Latin runs from 0000 to 007F, with its printable characters
      starting at 0020. There are two Code Tables for Greek and Coptic. One of
      these is called “Greek and Coptic” (code range 0370 to 03FF) and the
      other is called “Coptic” (code range 2C80 to 2CFF).
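
      To make the idea of code points and ranges concrete, here is a minimal
      Python sketch (Python rather than Excel, purely for illustration). The
      BLOCKS table and the block_of helper are hypothetical names made up for
      this example; the three ranges are the ones quoted above, with Basic
      Latin taken as the full range including control characters.

```python
import unicodedata

# Hypothetical lookup table: only the three ranges discussed in this message.
BLOCKS = {
    "Basic Latin": (0x0000, 0x007F),
    "Greek and Coptic": (0x0370, 0x03FF),
    "Coptic": (0x2C80, 0x2CFF),
}

def block_of(ch):
    """Return the name of the range containing ch, or 'other'."""
    cp = ord(ch)  # ord() gives the code point as an integer
    for name, (lo, hi) in BLOCKS.items():
        if lo <= cp <= hi:
            return name
    return "other"

# A Latin letter, small Hori and small Alfa.
for ch in ("a", "\u03e9", "\u2c81"):
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}  [{block_of(ch)}]")
```

      The point to notice is that Hori and Alfa, although both Coptic letters,
      land in two different ranges that are far apart numerically.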

      The Demotic Coptic glyphs (Shei, Fei, Hori, Gangia, Shima and Dei) are
      located in the “Greek and Coptic” table and their Code Points are as
      follows.

      03E3 Shei
      03E5 Fei
      03E9 Hori
      03EB Gangia
      03ED Shima
      03EF Dei

      For reasons I don’t fully understand, the Coptic “Greek-like” letters from
      Alfa through Oou are located in the table called “Coptic”. I’ll not list
      all of the code points but notice the code points for these letters:

      2C81 Alfa
      2C83 Vida
      2C85 Gamma
      2C87 Dalda etc.

      Now here is the result of sorting these letters by code point:

      03E3 Shei
      03E5 Fei
      03E9 Hori
      03EB Gangia
      03ED Shima
      03EF Dei
      2C81 Alfa
      2C83 Vida
      2C85 Gamma
      2C87 Dalda etc.
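
      The same ordering can be reproduced outside Excel. The sketch below is
      Python, whose plain string sort compares code points; it says nothing
      about what Excel does internally. The letters dictionary is a made-up
      name for this example, and it uses the small-letter code point of each
      letter as given in the Unicode character database.

```python
# Small-letter forms only; names follow the Unicode character names.
letters = {
    "Alfa": "\u2c81", "Vida": "\u2c83", "Gamma": "\u2c85", "Dalda": "\u2c87",
    "Shei": "\u03e3", "Fei": "\u03e5", "Hori": "\u03e9",
    "Gangia": "\u03eb", "Shima": "\u03ed", "Dei": "\u03ef",
}

# Sorting on the character itself is a sort by code point in Python.
by_code_point = sorted(letters, key=lambda name: letters[name])

for name in by_code_point:
    print(f"{ord(letters[name]):04X} {name}")
```

      The Demotic letters come out first simply because 03xx is numerically
      smaller than 2Cxx.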

      As far as I can determine, Excel sorts Unicode characters by their code
      point values. If that is the case, then they are being sorted logically;
      however, the end result is an incorrect alphabetic sequence.
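
      One possible workaround, sketched here in Python and not tested in any
      version of Excel, is to sort on a custom key instead of the raw code
      point. The names COPTIC_ALPHABET, RANK and coptic_sort_key are invented
      for this example. It assumes the alphabetical order is the 25 Greek-like
      letters (2C80 to 2CB1) followed by the Demotic letters (03E2 to 03EF),
      that capital and small forms alternate within both ranges, and it
      ignores combining marks such as supralinear strokes.

```python
# Assumed alphabet: Alfa..Oou from the "Coptic" range, then Shei..Dei from
# the "Greek and Coptic" range. Each range alternates capital, small.
COPTIC_ALPHABET = (
    [chr(cp) for cp in range(0x2C80, 0x2CB2)]
    + [chr(cp) for cp in range(0x03E2, 0x03F0)]
)

# Position of each letter in the assumed alphabet.
RANK = {ch: i for i, ch in enumerate(COPTIC_ALPHABET)}

def coptic_sort_key(text):
    """Build a sort key: known letters by alphabet rank, anything else
    after them, ordered by code point."""
    return [(0, RANK[ch]) if ch in RANK else (1, ord(ch)) for ch in text]

# Small Shei, Alfa, Hori, Vida in deliberately scrambled order.
sample = ["\u03e3", "\u2c81", "\u03e9", "\u2c83"]
in_alphabet_order = sorted(sample, key=coptic_sort_key)
print([f"{ord(ch):04X}" for ch in in_alphabet_order])
```

      With this key Alfa and Vida sort ahead of Shei and Hori. Note that a
      capital letter sorts just before its own small form, so case is not
      folded. In a spreadsheet the same idea would presumably need a helper
      column holding the rank, but that is a guess on my part.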

      What to do?

      Rick


      From: gthomas@yahoogroups.com [mailto:gthomas@yahoogroups.com] On Behalf Of
      Mike Grondin
      Sent: Tuesday, March 25, 2014 2:13 AM
      To: gthomas@yahoogroups.com
      Subject: Re: [GTh] Coptic Unicode Headaches

       
      Hi Rick,
       
      The last time we talked about this problem (early Sept 2010), you
      determined that it had to do with the version of MS Ofc being used.
      We never figured out why, but at last word, the sort worked OK
      in MS Ofc 2001 (which I still use) and MS Ofc 2003 (which you
      tested). You determined, however, that it didn't work right in MS
      Ofc 2010, and maybe in no version since 2003. Sure would like
      to know why, though - and how to fix it. Maybe Judy or somebody
      can confirm your results or tell us more about it. Or maybe you can
      contact Christian Askeland or someone else who has developed
      these fonts. I see also that it's time to upgrade my Coptic Unicode
      fonts page. I'll add a note about the sort problem if we can get a
      little better grip on it.
       
      Mike