One rather off-topic contribution - The Unicode Character Database

  • Petr Pařízek
    Message 1 of 6 , Apr 1, 2007
      Hi there.

      Some time ago, one of you mentioned the possibility of adding new symbols
      like Sagittal to the Unicode chart. A few months later, I was browsing the
      Unicode server and I was surprised how much the used characters were
      scattered around the space, which made me think this might be a task of
      touching more than one block (or group, if you wish) of characters. You can
      find more here: www.unicode.org/Public/zipped/5.0.0/UCD.zip

      Petr
    • Danny Wier
      Message 2 of 6 , Apr 1, 2007
        ----- Original Message -----
        From: "Petr Pařízek" <p.parizek@...>
        To: "Tuning List" <tuning@yahoogroups.com>
        Sent: Sunday, April 01, 2007 3:42 AM
        Subject: [tuning] One rather off-topic contribution - The Unicode Character
        Database


        > Hi there.
        >
        > Some time ago, one of you mentioned the possibility of adding new symbols
        > like Sagittal to the Unicode chart. A few months later, I was browsing the
        > Unicode server and I was surprised how much the used characters were
        > scattered around the space, which made me think this might be a task of
        > touching more than one block (or group, if you wish) of characters. You can
        > find more here: www.unicode.org/Public/zipped/5.0.0/UCD.zip

        It would use a block of its own, I'd imagine. A large number of extensions
        added at once tends to get its own space somewhere in there, and it might
        not be continuous with the older block. If a few are added, they're added to
        the end, so some blocks end up being really piecemeal.

        But getting Sagittal encoded in Unicode would be a hard sell right now,
        wouldn't it? It's going to have to become a widely-used notation before
        they'll consider including it. They don't even have Tartini-Couper or
        Arel-Ezgi accidentals encoded. They do have sharps, flats and naturals
        with arrows attached, and sharps and flats with a 4-like attachment--but
        does anyone use these?

        Unicode has ancient Greek musical notation, however. They're much more open
        to including ancient scripts.

        ~D.
      • Herman Miller
        Message 3 of 6 , Apr 1, 2007
          Danny Wier wrote:

          > It would use a block of its own, I'd imagine. A large number of extensions
          > added at once tends to get its own space somewhere in there, and it might
          > not be continuous with the older block. If a few are added, they're added to
          > the end, so some blocks end up being really piecemeal.
          >
          > But getting Sagittal encoded in Unicode would be a hard sell right now,
          > wouldn't it? It's going to have to become a widely-used notation before
          > they'll consider including it. They don't even have Tartini-Couper or
          > Arel-Ezgi accidentals encoded. They do have sharps, flats and naturals
          > with arrows attached, and sharps and flats with a 4-like attachment--but
          > does anyone use these?

          It's strange that they picked such an obscure symbol for "quarter tone
          sharp" and "quarter tone flat", but a particular font could use a more
          familiar symbol for displaying those characters. I see they only have a
          single character for the quarter note rest, which is shown as the zigzag
          version in the chart, but there's another quarter note rest that's still
          occasionally seen (which looks like an eighth rest combined with a
          180-degree rotated eighth rest, or roughly the letter Z), but it doesn't
          have a Unicode character assigned to it. So they might be considering
          all the different quarter tone symbols as glyph variants representing
          the same abstract "character".

          Now, one interesting thing is that they have ornamentation characters
          that combine to form the different Baroque ornaments (U+1D19B-U+1D1A5). A
          similar thing could be done for Sagittal by having characters for the
          individual flags (up and down versions for each), which could be
          combined to form a single Sagittal symbol. New symbols from existing
          flags could be created without changing the Unicode definition. But a
          drawback of that is that the font rendering engine would have to support
          ligatures in the Musical Symbols block, which I guess probably isn't
          very high on the priority list at either Microsoft or Apple.
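To make the combining idea concrete, here is a minimal Python sketch that lists the ornament stroke characters in the range cited above and joins a few of them into one string. The `compose` helper is hypothetical and only concatenates code points; whether the result is drawn as a single ornament depends entirely on the font and rendering engine, which is the drawback mentioned above.

```python
import unicodedata

# U+1D19B..U+1D1A5 is the range cited in the message above.
ORNAMENT_STROKES = [chr(cp) for cp in range(0x1D19B, 0x1D1A5 + 1)]


def stroke_names():
    """Return the official Unicode name of each ornament stroke character."""
    return [unicodedata.name(ch) for ch in ORNAMENT_STROKES]


def compose(*strokes):
    """Hypothetical helper: join stroke characters into one string.

    This only concatenates code points. Drawing them as one ornament is
    left to the font (ligature support), which is not guaranteed.
    """
    return "".join(strokes)


if __name__ == "__main__":
    for ch, name in zip(ORNAMENT_STROKES, stroke_names()):
        print(f"U+{ord(ch):05X}  {name}")
    ornament = compose(ORNAMENT_STROKES[0], ORNAMENT_STROKES[1])
    print(len(ornament), "code points in the composed string")
```

A Sagittal encoding built from flags would work the same way at the text level: a short sequence of flag characters, with the visual combination left to the font.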

          It's odd how priorities are assigned. Recycling symbols for 7 different
          kinds of plastics have a nice spot in the Miscellaneous Symbols block,
          but when are those ever used in actual text? Yet the double sharp and
          double flat symbols are off in the Musical Symbols block in Plane 1,
          which is unusable in most older software, even though these would be
          very useful to have in text (so much so that "x" and "bb" are used as
          substitutes for these unavailable characters). If the plastics industry
          can have their technical symbols added to the character list, it seems
          that there ought to be a place for Sagittal and any other notations with
          an established history of usage.
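As a rough illustration of why Plane 1 placement matters, the Python sketch below compares the double sharp and double flat with one of the plastics recycling symbols. The helper names `plane` and `utf16_units` are made up for this example. Characters above U+FFFF need two 16-bit units (a surrogate pair) in UTF-16, which is what tends to trip up older software.

```python
import unicodedata

DOUBLE_SHARP = "\U0001D12A"   # in the Musical Symbols block, Plane 1
DOUBLE_FLAT = "\U0001D12B"    # in the Musical Symbols block, Plane 1
PLASTIC_TYPE_1 = "\u2673"     # in the Miscellaneous Symbols block, Plane 0


def plane(ch):
    """Return the Unicode plane number (0 = Basic Multilingual Plane)."""
    return ord(ch) >> 16


def utf16_units(ch):
    """Return the 16-bit code units used to encode ch in UTF-16."""
    data = ch.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


if __name__ == "__main__":
    for ch in (DOUBLE_SHARP, DOUBLE_FLAT, PLASTIC_TYPE_1):
        units = " ".join(f"{u:04X}" for u in utf16_units(ch))
        print(f"U+{ord(ch):05X}  plane {plane(ch)}  UTF-16: {units}  "
              f"{unicodedata.name(ch)}")
```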
        • Danny Wier
          Message 4 of 6 , Apr 1, 2007
            From: "Herman Miller" <hmiller@...>
            To: <tuning@yahoogroups.com>
            Sent: Sunday, April 01, 2007 6:38 PM
            Subject: Re: [tuning] One rather off-topic contribution - The Unicode
            Character Database


            > Danny Wier wrote:
            ...
            >> But getting Sagittal encoded in Unicode would be a hard sell right now,
            >> wouldn't it? It's going to have to become a widely-used notation before
            >> they'll consider including it. They don't even have Tartini-Couper or
            >> Arel-Ezgi accidentals encoded. They do have sharps, flats and naturals
            >> with arrows attached, and sharps and flats with a 4-like attachment--but
            >> does anyone use these?
            >
            > It's strange that they picked such an obscure symbol for "quarter tone
            > sharp" and "quarter tone flat", but a particular font could use a more
            > familiar symbol for displaying those characters. I see they only have a
            > single character for the quarter note rest, which is shown as the zigzag
            > version in the chart, but there's another quarter note rest that's still
            > occasionally seen (which looks like an eighth rest combined with a
            > 180-degree rotated eighth rest, or roughly the letter Z), but it doesn't
            > have a Unicode character assigned to it. So they might be considering
            > all the different quarter tone symbols as glyph variants representing
            > the same abstract "character".

            True, they could do that, use the slots for quarter-tone sharp and flat
            generically, but it's not really in the spirit of Unicode. Case in point:
            Turkish and Romanian use very similar letters for the sound of "sh". The
            preferred form in Turkish is s-cedilla, but in Romanian it's s-comma-below.
            No language uses both letters, but they still are encoded in different
            locations, not as glyph variants.
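A small Python sketch shows the distinction. The `decompose` helper is a name invented for this example. The two letters have separate code points and also decompose to different combining marks, so normalization does not merge them.

```python
import unicodedata

S_CEDILLA = "\u015F"       # LATIN SMALL LETTER S WITH CEDILLA
S_COMMA_BELOW = "\u0219"   # LATIN SMALL LETTER S WITH COMMA BELOW


def decompose(ch):
    """Return the code points of the canonical decomposition (NFD) of ch."""
    return [f"U+{ord(c):04X}" for c in unicodedata.normalize("NFD", ch)]


if __name__ == "__main__":
    for ch in (S_CEDILLA, S_COMMA_BELOW):
        print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}  ->  {decompose(ch)}")
    # The two letters stay distinct after normalization.
    print(unicodedata.normalize("NFC", S_CEDILLA)
          == unicodedata.normalize("NFC", S_COMMA_BELOW))
```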

            In fact, both the "Arabic" stroke-flat and Tartini-Couper reversed flat,
            both types of quarter-tone flat, are used in the Arel-Ezgi system for two
            different sizes of flat, so they can't be glyph variants anyway.

            I doubt I'd have the Z-type eighth rest encoded separately from the 7-type
            one, however. The printed Cyrillic lowercase T and the written one (which
            resembles a small Latin 'm') aren't, and shouldn't be.

            > It's odd how priorities are assigned. Recycling symbols for 7 different
            > kinds of plastics have a nice spot in the Miscellaneous Symbols block,
            > but when are those ever used in actual text? Yet the double sharp and
            > double flat symbols are off in the Musical Symbols block in Plane 1,
            > which is unusable in most older software, even though these would be
            > very useful to have in text (so much so that "x" and "bb" are used as
            > substitutes for these unavailable characters). If the plastics industry
            > can have their technical symbols added to the character list, it seems
            > that there ought to be a place for Sagittal and any other notations with
            > an established history of usage.

            For what it's worth, Plane 1 symbols can be viewed in Internet Explorer 7. I
            wish I knew if Firefox could.

            But I am thinking. If all else fails, what about encoding Sagittal in the
            ConScript Unicode Registry, or is it not for that purpose?

            ~D.
          • Graham Breed
            Message 5 of 6 , Apr 1, 2007
              Danny Wier wrote:

              > True, they could do that, use the slots for quarter-tone sharp and flat
              > generically, but it's not really in the spirit of Unicode. Case in point:
              > Turkish and Romanian use very similar letters for the sound of "sh". The
              > preferred form in Turkish is s-cedilla, but in Romanian it's s-comma-below.
              > No language uses both letters, but they still are encoded in different
              > locations, not as glyph variants.

              I don't think there is a consistent "spirit of Unicode".
              Rather, different compromises are made in different
              situations. There's certainly a lot of griping about
              unified CJK characters. And it turns out that you can't
              design a single font to satisfy all readers of Arabic. The
              general spirit of character sets, though, is that you're
              supposed to leave the appearance to the fonts. I don't
              think we should take the example quartertone symbols at all
              seriously.

              > In fact, both the "Arabic" stroke-flat and Tartini-Couper reversed flat,
              > both types of quarter-tone flat, are used in the Arel-Ezgi system for two
              > different sizes of flat, so they can't be glyph variants anyway.

              There's "flat down" and "flat up" as well, whatever they're
              supposed to mean.

              > I doubt I'd have the Z-type eighth rest encoded separately from the 7-type
              > one, however. The printed Cyrillic lowercase T and the written one (which
              > resembles a small Latin 'm') aren't, and shouldn't be.

              I'm sure we wouldn't have done it this way.

              >>It's odd how priorities are assigned. Recycling symbols for 7 different
              >>kinds of plastics have a nice spot in the Miscellaneous Symbols block,
              >>but when are those ever used in actual text? Yet the double sharp and
              >>double flat symbols are off in the Musical Symbols block in Plane 1,
              >>which is unusable in most older software, even though these would be
              >>very useful to have in text (so much so that "x" and "bb" are used as
              >>substitutes for these unavailable characters). If the plastics industry
              >>can have their technical symbols added to the character list, it seems
              >>that there ought to be a place for Sagittal and any other notations with
              >>an established history of usage.
              >
              > For what it's worth, Plane 1 symbols can be viewed in Internet Explorer 7. I
              > wish I knew if Firefox could.

              It can.

              http://www.i18nguy.com/unicode/unicode-example-intro.html

              Internet Explorer 5 works if you encode them the right way.
              Of course, whatever the browser, you won't see anything if
              you don't have the right fonts.

              > But I am thinking. If all else fails, what about encoding Sagittal in the
              > ConScript Unicode Registry, or is it not for that purpose?

              http://www.evertype.com/standards/csur/

              Looks like a perfect fit for Sagittal! I see Herman
              Miller's got a load of them as well.


              Graham
            • Herman Miller
              Message 6 of 6 , Apr 1, 2007
                Danny Wier wrote:

                > For what it's worth, Plane 1 symbols can be viewed in Internet Explorer 7. I
                > wish I knew if Firefox could.
                >
                > But I am thinking. If all else fails, what about encoding Sagittal in the
                > ConScript Unicode Registry, or is it not for that purpose?

                The Gothic Wikipedia page (http://got.wikipedia.org/) is a good test for
                Plane 1 support (you'll need to install a Gothic font, but the page has
                links to fonts). It works in the latest version of Firefox under Windows
                XP (more or less).

                I don't know if the ConScript Unicode Registry is still being
                maintained, but the range of Unicode characters it uses is in the
                Private Use Area, and there's no reason we couldn't come up with our own
                standards for assigning Private Use characters for microtonal notation
                symbols (including historical ones).
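As a sketch of what such a private agreement might look like, the Python snippet below hands out consecutive code points from the Basic Multilingual Plane's Private Use Area (U+E000 to U+F8FF). The symbol names, the `assign_private_use` function, and the choice of U+E000 as a starting point are all hypothetical; an actual assignment would have to be agreed on and avoid ranges already claimed by other private conventions.

```python
import unicodedata

PUA_FIRST, PUA_LAST = 0xE000, 0xF8FF  # BMP Private Use Area

# Hypothetical symbol names, for illustration only.
SYMBOLS = [
    "reversed flat",
    "stroke flat",
    "half sharp",
    "sesquisharp",
]


def assign_private_use(names, base=PUA_FIRST):
    """Map each name to a consecutive Private Use code point starting at base."""
    mapping = {}
    for offset, name in enumerate(names):
        cp = base + offset
        if not PUA_FIRST <= cp <= PUA_LAST:
            raise ValueError(f"U+{cp:04X} is outside the Private Use Area")
        mapping[name] = cp
    return mapping


if __name__ == "__main__":
    for name, cp in assign_private_use(SYMBOLS).items():
        # General category "Co" marks a private-use code point.
        print(f"U+{cp:04X}  {unicodedata.category(chr(cp))}  {name}")
```

Text using these code points only displays correctly with a font that follows the same table, so the mapping itself would need to be published alongside the font.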