Minor doc bug in digraph.txt

  • Tony Mechelynck
    Message 1 of 8 , Aug 6, 2008
      Under ":help digraph" (near the top of digraph.txt dated 2008 Jul 17),
      at the second sentence in the first paragraph

      there is:
      These are mostly accented characters which have the eighth bit set.

      there should be:
      In 8-bit encodings, these are mostly accented characters which have the
      eighth bit set; in 16-bit and multibyte encodings there can be a lot more.

      Alternative (maybe even better) wording:
      These are mostly accented and non-ASCII characters above 0x7F.

      Rationale: In 16-bit and multi-byte encodings, roughly half of the
      digraphs are for characters which actually have the 8th bit unset (but
      some higher bit[s] set); also, in these encodings there are often quite
      a number of non-Latin or even graphical characters which cannot be
      described as "accented".


      Best regards,
      Tony.
      --
      Two can Live as Cheaply as One for Half as Long.
      -- Howard Kandel

      --~--~---------~--~----~------------~-------~--~----~
      You received this message from the "vim_dev" maillist.
      For more information, visit http://www.vim.org/maillist.php
      -~----------~----~----~----~------~----~------~--~---
    • Ben Schmidt
      Message 2 of 8 , Aug 6, 2008
        Tony Mechelynck wrote:
        > Under ":help digraph" (near the top of digraph.txt dated 2008 Jul 17),
        > at the second sentence in the first paragraph
        >
        > there is:
        > These are mostly accented characters which have the eighth bit set.

        I think it's referring to encoded byte values, not codepoints; even in
        16-bit encodings the individual bytes tend to have the 8th bit set. I
        guess you could have '...characters whose encoded bytes have...'.
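To make the distinction between code points and encoded bytes concrete, here is a minimal Python sketch. The helper name `eighth_bit_report` is hypothetical and not part of Vim; it only checks bit 0x80 in a character's code point, in its UTF-8 bytes, and in its big-endian UTF-16 bytes, so the results depend on which encoding is assumed.

```python
def eighth_bit_report(ch):
    """Report where bit 0x80 shows up for a single character."""
    cp = ord(ch)
    utf8 = ch.encode("utf-8")
    utf16 = ch.encode("utf-16-be")  # big-endian, no byte-order mark
    return {
        # Is the eighth bit set in the code point number itself?
        "codepoint_bit8": bool(cp & 0x80),
        # Is the eighth bit set in every byte of the UTF-8 form?
        "utf8_all_bytes_bit8": all(b & 0x80 for b in utf8),
        # Is the eighth bit set in at least one byte of the UTF-16 form?
        "utf16_any_byte_bit8": any(b & 0x80 for b in utf16),
    }


# U+00E9 (e with acute) and U+0100 (A with macron) behave differently.
for ch in ("\u00e9", "\u0100"):
    print("U+%04X" % ord(ch), eighth_bit_report(ch))
```

For U+0100 the code point has the eighth bit unset, every UTF-8 byte has it set, and neither UTF-16 byte has it set, so whether "eighth bit set" holds depends on whether one means the code point or a particular encoded form.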

        > there should be:
        > In 8-bit encodings, these are mostly accented characters which have the
        > eighth bit set; in 16-bit and multibyte encodings there can be a lot more.

        I don't like this version. A lot more what? It isn't clear.

        > Alternative (maybe even better) wording:
        > These are mostly accented and non-ASCII characters above 0x7F.

        That's OK. Not sure I like 'non-ASCII' as a description, as ASCII is
        getting rare language-wise these days and is not broadly understood like
        it used to be. But the point that 'accented' doesn't really cut it is a
        valid one. Perhaps 'accented characters and symbols' is better.

        Thought I'd just up the pedantry.

        Grins,

        Ben.



      • Bram Moolenaar
        Message 3 of 8 , Aug 6, 2008
          Tony Mechelynck wrote:

          > Under ":help digraph" (near the top of digraph.txt dated 2008 Jul 17),
          > at the second sentence in the first paragraph
          >
          > there is:
          > These are mostly accented characters which have the eighth bit set.
          >
          > there should be:
          > In 8-bit encodings, these are mostly accented characters which have the
          > eighth bit set; in 16-bit and multibyte encodings there can be a lot more.
          >
          > Alternative (maybe even better) wording:
          > These are mostly accented and non-ASCII characters above 0x7F.
          >
          > Rationale: In 16-bit and multi-byte encodings, roughly half of the
          > digraphs are for characters which actually have the 8th bit unset (but
          > some higher bit[s] set); also, in these encodings there are often quite
          > a number of non-Latin or even graphical characters which cannot be
          > described as "accented".

          Yeah, the text is from before multi-byte support. I think this works
          better:

          Digraphs are used to enter characters that normally cannot be entered by
          an ordinary keyboard. These are mostly printable non-ASCII characters. The
          digraphs are easier to remember than the decimal number that can be entered
          with CTRL-V (see |i_CTRL-V|).
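As a rough illustration of the "easier to remember" point, the following Python sketch pairs three two-character digraphs with the decimal value one would otherwise type after CTRL-V. The `DIGRAPHS` dictionary and the `ctrl_v_decimal` helper are hypothetical names for this example; the three pairs are assumed to match Vim's default digraph table, and the helper assumes the three-digit decimal form only covers values up to 255.

```python
# Three digraph pairs assumed to be in Vim's default table (typed after CTRL-K).
DIGRAPHS = {
    "e:": "\u00eb",  # e with diaeresis
    "e'": "\u00e9",  # e with acute
    "a!": "\u00e0",  # a with grave
}


def ctrl_v_decimal(ch):
    """Return the three-digit decimal string one would type after CTRL-V."""
    cp = ord(ch)
    if cp > 255:
        # Assumption: larger values need a different CTRL-V form.
        raise ValueError("decimal CTRL-V entry is limited to 255")
    return "%03d" % cp


for pair, ch in DIGRAPHS.items():
    print("CTRL-K %s  vs  CTRL-V %s" % (pair, ctrl_v_decimal(ch)))
```

The mnemonic pair (base letter plus a mark that resembles the accent) carries more meaning than the bare number, which is the comparison the proposed help text draws.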

          --
          ARTHUR: What does it say?
          BROTHER MAYNARD: It reads ... "Here may be found the last words of Joseph of
          Aramathea." "He who is valorous and pure of heart may find
          the Holy Grail in the aaaaarrrrrrggghhh..."
          ARTHUR: What?
          BROTHER MAYNARD: "The Aaaaarrrrrrggghhh..."
          "Monty Python and the Holy Grail" PYTHON (MONTY) PICTURES LTD

          /// Bram Moolenaar -- Bram@... -- http://www.Moolenaar.net \\\
          /// sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
          \\\ download, build and distribute -- http://www.A-A-P.org ///
          \\\ help me help AIDS victims -- http://ICCF-Holland.org ///

        • Tony Mechelynck
          Message 4 of 8 , Aug 6, 2008
            On 06/08/08 17:53, Ben Schmidt wrote:
            > Tony Mechelynck wrote:
            >> Under ":help digraph" (near the top of digraph.txt dated 2008 Jul 17),
            >> at the second sentence in the first paragraph
            >>
            >> there is:
            >> These are mostly accented characters which have the eighth bit set.
            >
            > I think it's referring to encoded byte values, not codepoints; even in
            > 16-bit encodings the individual bytes tend to have the 8th bit set. I
            > guess you could have '...characters whose encoded bytes have...'.
            >
            >> there should be:
            >> In 8-bit encodings, these are mostly accented characters which have the
            >> eighth bit set; in 16-bit and multibyte encodings there can be a lot more.
            >
            > I don't like this version. A lot more what? It isn't clear.

            A lot more digraphs.

            >
            >> Alternative (maybe even better) wording:
            >> These are mostly accented and non-ASCII characters above 0x7F.
            >
            > That's OK. Not sure I like 'non-ASCII' as a description, as ASCII is
            > getting rare language-wise these days and is not broadly understood like
            > it used to be. But the point that 'accented' doesn't really cut it is a
            > valid one. Perhaps 'accented characters and symbols' is better.

            "accented characters and symbols" does not apply to Greek, Cyrillic,
            Hebrew, Arabic, Hindi, Chinese, etc., which are letters, not symbols,
            but mostly non-accented (with a few exceptions). They aren't ASCII or
            even Latin, BTW.

            >
            > Thought I'd just up the pedantry.
            >
            > Grins,
            >
            > Ben.

            Sure, but if we're going to be pedant, I won't accept any
            "wooden-tongue" fake pedantry from anyone.


            Best regards,
            Tony.
            --
            A Christian is a man who feels repentance on Sunday for what he did on
            Saturday and is going to do on Monday.
            -- Thomas Ybarra

          • Tony Mechelynck
            Message 5 of 8 , Aug 6, 2008
              On 06/08/08 19:16, Bram Moolenaar wrote:
              >
              > Tony Mechelynck wrote:
              >
              >> Under ":help digraph" (near the top of digraph.txt dated 2008 Jul 17),
              >> at the second sentence in the first paragraph
              >>
              >> there is:
              >> These are mostly accented characters which have the eighth bit set.
              >>
              >> there should be:
              >> In 8-bit encodings, these are mostly accented characters which have the
              >> eighth bit set; in 16-bit and multibyte encodings there can be a lot more.
              >>
              >> Alternative (maybe even better) wording:
              >> These are mostly accented and non-ASCII characters above 0x7F.
              >>
              >> Rationale: In 16-bit and multi-byte encodings, roughly half of the
              >> digraphs are for characters which actually have the 8th bit unset (but
              >> some higher bit[s] set); also, in these encodings there are often quite
              >> a number of non-Latin or even graphical characters which cannot be
              >> described as "accented".
              >
              > Yeah, the text is from before multi-byte support. I think this works
              > better:
              >
              > Digraphs are used to enter characters that normally cannot be entered by
              > an ordinary keyboard. These are mostly printable non-ASCII characters. The
              > digraphs are easier to remember than the decimal number that can be entered
              > with CTRL-V (see |i_CTRL-V|).
              >
              -----------------^ |i_CTRL-V_digit|


              Best regards,
              Tony.
              --
              Zymurgy's Law of Volunteer Labor:
              People are always available for work in the past tense.

            • Ben Schmidt
              Message 6 of 8 , Aug 7, 2008
                >>> Alternative (maybe even better) wording:
                >>> These are mostly accented and non-ASCII characters above 0x7F.
                >> That's OK. Not sure I like 'non-ASCII' as a description, as ASCII is
                >> getting rare language-wise these days and is not broadly understood like
                >> it used to be. But the point that 'accented' doesn't really cut it is a
                >> valid one. Perhaps 'accented characters and symbols' is better.
                >
                > "accented characters and symbols" does not apply to Greek, Cyrillic,
                > Hebrew, Arabic, Hindi, Chinese, etc., which are letters, not symbols,
                > but mostly non-accented (with a few exceptions). They aren't ASCII or
                > even Latin, BTW.

                True. (Well, except Chinese etc. characters are not letters.) But do
                these really fall into the category of 'normally cannot be entered by an
                ordinary keyboard'? To call international keyboards un-ordinary is
                bordering on derogatory, and their input methods abnormal likely to be
                at least debatable, if not provably wrong, or soon to be so. I don't
                think the point of digraphs is to enter such characters. With the
                built-in ones in the case of Hanzi, you mostly can't, either. Were I
                typing Greek, it would be to get accented characters that I would use
                digraphs (and I'd have to define them, as they're not built in). As it
                is, I use a keymap, as I tend to do for foreign languages unless I just
                use my system's input methods. I tend to use digraphs to get either
                diacritics for letters I otherwise can type without them, or symbols,
                and I think that's really their use.

                >> Thought I'd just up the pedantry.
                >
                > Sure, but if we're going to be pedant, I won't accept any
                > "wooden-tongue" fake pedantry from anyone.

                Yeah, well, perhaps it was a pretty lame excuse for pedantry, but one
                step at a time! The adjective is 'pedantic,' by the way.

                Winks,

                Ben.



              • Tony Mechelynck
                Message 7 of 8 , Aug 7, 2008
                  On 07/08/08 10:55, Ben Schmidt wrote:
                  >>>> Alternative (maybe even better) wording:
                  >>>> These are mostly accented and non-ASCII characters above 0x7F.
                  >>> That's OK. Not sure I like 'non-ASCII' as a description, as ASCII is
                  >>> getting rare language-wise these days and is not broadly understood like
                  >>> it used to be. But the point that 'accented' doesn't really cut it is a
                  >>> valid one. Perhaps 'accented characters and symbols' is better.
                  >> "accented characters and symbols" does not apply to Greek, Cyrillic,
                  >> Hebrew, Arabic, Hindi, Chinese, etc., which are letters, not symbols,
                  >> but mostly non-accented (with a few exceptions). They aren't ASCII or
                  >> even Latin, BTW.
                  >
                  > True. (Well, except Chinese etc. characters are not letters.) But do
                  > these really fall into the category of 'normally cannot be entered by an
                  > ordinary keyboard'? To call international keyboards un-ordinary is
                  > bordering on derogatory, and their input methods abnormal likely to be
                  > at least debatable, if not provably wrong, or soon to be so. I don't
                  > think the point of digraphs is to enter such characters. With the
                  > built-in ones in the case of Hanzi, you mostly can't, either. Were I
                  > typing Greek, it would be to get accented characters that I would use
                  > digraphs (and I'd have to define them, as they're not built in). As it
                  > is, I use a keymap, as I tend to do for foreign languages unless I just
                  > use my system's input methods. I tend to use digraphs to get either
                  > diacritics for letters I otherwise can type without them, or symbols,
                  > and I think that's really their use.
                  >
                  >>> Thought I'd just up the pedantry.
                  >> Sure, but if we're going to be pedant, I won't accept any
                  >> "wooden-tongue" fake pedantry from anyone.
                  >
                  > Yeah, well, perhaps it was a pretty lame excuse for pedantry, but one
                  > step at a time! The adjective is 'pedantic,' by the way.
                  >
                  > Winks,
                  >
                  > Ben.
                  >
                  >
                  >
                  > >
                  >


                • Tony Mechelynck
                  Message 8 of 8 , Aug 7, 2008
                    On 07/08/08 10:55, Ben Schmidt wrote:
                    >>>> Alternative (maybe even better) wording:
                    >>>> These are mostly accented and non-ASCII characters above 0x7F.
                    >>> That's OK. Not sure I like 'non-ASCII' as a description, as ASCII is
                    >>> getting rare language-wise these days and is not broadly understood like
                    >>> it used to be. But the point that 'accented' doesn't really cut it is a
                    >>> valid one. Perhaps 'accented characters and symbols' is better.
                    >> "accented characters and symbols" does not apply to Greek, Cyrillic,
                    >> Hebrew, Arabic, Hindi, Chinese, etc., which are letters, not symbols,
                    >> but mostly non-accented (with a few exceptions). They aren't ASCII or
                    >> even Latin, BTW.
                    >
                    > True. (Well, except Chinese etc. characters are not letters.)

                    Hm. What is a letter? I think I've seen treatises about Chinese writing
                    where ideograms are called "letters".

                    > But do
                    > these really fall into the category of 'normally cannot be entered by an
                    > ordinary keyboard'?

                    You have a point there. Calling US-QWERTY "ordinary" and everything else
                    "out of the ordinary" is the kind of parochialism which, when coming
                    from Usonians, often sets my nerves on edge. We just can't put that sort
                    of language into a help file which is aimed at people from all nations
                    and is supposed to have been written by a Dutchman currently living in
                    Switzerland.

                    > To call international keyboards un-ordinary is
                    > bordering on derogatory, and their input methods abnormal likely to be
                    > at least debatable, if not provably wrong, or soon to be so. I don't
                    > think the point of digraphs is to enter such characters. With the
                    > built-in ones in the case of Hanzi, you mostly can't, either. Were I
                    > typing Greek, it would be to get accented characters that I would use
                    > digraphs (and I'd have to define them, as they're not built in). As it
                    > is, I use a keymap, as I tend to do for foreign languages unless I just
                    > use my system's input methods. I tend to use digraphs to get either
                    > diacritics for letters I otherwise can type without them, or symbols,
                    > and I think that's really their use.

                    When I type Russian or Greek, I may use digraphs if what I need is a
                    letter or a word, because of the lesser overhead in not needing to
                    source a keymap; for a Russian sentence or more I will use my own-coded
                    russian-phonetic keymap (which I don't think is fit for public
                    consumption: in particular, for many people it has too many two-key
                    {lhs}es); for Greek I haven't yet tried the distributed Greek keymap but
                    by looking at it I think it might be usable, even for classical Greek.

                    >
                    >>> Thought I'd just up the pedantry.
                    >> Sure, but if we're going to be pedant, I won't accept any
                    >> "wooden-tongue" fake pedantry from anyone.
                    >
                    > Yeah, well, perhaps it was a pretty lame excuse for pedantry, but one
                    > step at a time! The adjective is 'pedantic,' by the way.
                    >
                    > Winks,
                    >
                    > Ben.

                    Oops! Hit "Send" too fast.

                    Strike one for my misguided sense of how English words are derived from
                    French. According to Oxford's, in English "pedant" is the noun,
                    "pedantic" is the adjective, even though both "pedant" and "-ic" are
                    derived from French and the French word is "pédant" for both noun and
                    adjective: "pédantique" doesn't exist.

                    I should have said: "If we're going to be pedants, ..."

                    Best regards,
                    Tony.
                    --
                    When a Banker jumps out of a window, jump after him -- that's where the
                    money is.
                    -- Robespierre
