[patch] matchaddpos(): fix for multibyte characters hl

  • Alexey Radkov
    Message 1 of 9 , Jul 3, 2014
      Now it accepts len in screen cells.

      Cheers, Alexey.

      --
      --
      You received this message from the "vim_dev" maillist.
      Do not top-post! Type your reply below the text you are replying to.
      For more information, visit http://www.vim.org/maillist.php

      ---
      You received this message because you are subscribed to the Google Groups "vim_dev" group.
      To unsubscribe from this group and stop receiving emails from it, send an email to vim_dev+unsubscribe@....
      For more options, visit https://groups.google.com/d/optout.
    • Bram Moolenaar
      Message 2 of 9 , Jul 4, 2014
        Alexey Radkov wrote:

        > Now it accepts len in screen cells.

        Hmm, that's confusing. Suppose a script isolates a word that it wants
        to highlight. Then it's easy to locate the start of the word and the
        length with various methods, e.g. using getline(), match() and
        matchend(). Then you have the position and size in bytes, not
        characters or screen characters. So let's stick to that.

        In the implementation it should be easy to round up, so as to include
        the screen cell that contains a highlighted byte.
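A minimal sketch of the two ideas above, written in Python rather than Vim script so it can be run outside the editor. The helper names `byte_span` and `round_to_chars` are hypothetical, UTF-8 is assumed, and rounding to whole characters is used here as a stand-in for rounding to the screen cell that contains a highlighted byte.

```python
import re


def byte_span(line: str, pattern: str):
    """Return (col, length) in UTF-8 bytes for the first match; col is 1-based."""
    m = re.search(pattern, line)
    if m is None:
        return None
    start = len(line[:m.start()].encode("utf-8"))
    length = len(m.group(0).encode("utf-8"))
    return start + 1, length


def round_to_chars(data: bytes, col: int, length: int):
    """Widen a 1-based byte span so that it covers whole UTF-8 characters."""
    start = min(col - 1, len(data))
    end = min(start + length, len(data))
    # Continuation bytes look like 0b10xxxxxx: back up to the lead byte.
    while 0 < start < len(data) and (data[start] & 0xC0) == 0x80:
        start -= 1
    # Extend forward past the continuation bytes of the last character.
    while end < len(data) and (data[end] & 0xC0) == 0x80:
        end += 1
    return start + 1, end - start
```

A span that starts or ends in the middle of a multibyte character is widened, so a caller passing a byte position and byte length never highlights half a character.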


        --
        There are 10 kinds of people: Those who understand binary and those who don't.

        /// Bram Moolenaar -- Bram@... -- http://www.Moolenaar.net \\\
        /// sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
        \\\ an exciting new programming language -- http://www.Zimbu.org ///
        \\\ help me help AIDS victims -- http://ICCF-Holland.org ///

      • Alexey Radkov
        Message 3 of 9 , Jul 4, 2014
          OK, probably so. If a script is able to calculate the end in bytes itself, then the current solution is good. But if it only knows how many screen cells it must highlight, calculating the end column becomes almost infeasible. If case 1 is more common than case 2, then the current solution is better :)


        • Павлов Николай Алекса
          Message 4 of 9 , Jul 4, 2014
            Are you sure you need exactly screen cells? There are the following possible ways to identify position inside a string:

            1. Byte offset.
            2. Unicode codepoints offset.
            3. Composed characters offset (one "composed character" is "one Unicode codepoint with attached composing characters (if any)").
            4. Screen cells offset.
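
The four measures in this list can be compared on one string. The sketch below is a Python approximation, not what Vim does internally: the function name `offsets` is hypothetical, ambiguous-width characters are counted as one cell (roughly 'ambiwidth' set to single), and a composed character is approximated as a codepoint with combining class 0 plus the combining marks that follow it.

```python
import unicodedata


def offsets(text: str, tabstop: int = 8):
    """Measure a string as (bytes, codepoints, composed characters, screen cells)."""
    n_bytes = len(text.encode("utf-8"))
    n_codepoints = len(text)
    # Count only base codepoints; combining marks attach to the previous one.
    n_composed = sum(1 for ch in text if unicodedata.combining(ch) == 0)
    cells = 0
    for ch in text:
        if ch == "\t":
            # A tab advances to the next multiple of tabstop.
            cells += tabstop - cells % tabstop
        elif unicodedata.combining(ch):
            continue  # combining marks take no cell of their own
        elif unicodedata.east_asian_width(ch) in ("W", "F"):
            cells += 2
        else:
            cells += 1
    return n_bytes, n_codepoints, n_composed, cells
```

Only the last value changes when `tabstop` changes, which is why a screen cell offset is meaningless without the settings of whoever produced it.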

            I doubt anybody will use 4. outside of an editor because it is hard to calculate. Two settings affect 4. and are only defined in an editor: &tabstop and &ambiwidth. *You must not apply this editor's settings to a screen cell offset that you received from another source.* That is incorrect: you need the settings of that other source, not those of this editor instance.

            And ***never use screen cells to count characters***. Code that assumes any fixed number of Unicode codepoints per cell is brain-damaged, broken and wrong.


            For this patch I heard the following use cases:

            1. matchparen. Will happily live with byte offset.
            2. Highlighting of errors from some source. May not use screen cells under any circumstances for the reasons explained above.
            3. I think that things like Conque may also benefit from this, but they do not need screen cells as well.

          • Alexey Radkov
            Message 5 of 9 , Jul 4, 2014

              2014-07-04 18:22 GMT+04:00 Павлов Николай Александрович <zyx.vim@...>:
              > Are you sure you need exactly screen cells? There are the following possible ways to identify position inside a string:
              >
              > 1. Byte offset.
              > 2. Unicode codepoints offset.
              > 3. Composed characters offset (one "composed character" is "one Unicode codepoint with attached composing characters (if any)").
              > 4. Screen cells offset.

              OK, I can imagine all the use cases, but I cannot map them against their usage frequency in the majority of Vim plugins. If I had known that 4 is very rare, I would not have suggested this patch :) ... I just remember that LCD asked for 4 in the Syntastic plugin.

              > For this patch I heard the following use cases:
              >
              > 1. matchparen. Will happily live with byte offset.

              Not really. Apart from the fact that it gets matching parens in a wrong way, like

              let c = getline(c_lnum)[c_col - 1]

              which always yields a single byte, it uses matchaddpos() (or 3match earlier), which relies on a 1-byte symbol too.

              I made an experiment:

              :set matchpairs+=в:д

              These symbols are not highlighted as a pair by matchparen, whereas '%' works just fine. The reason is simple: both the result of getline() and matchaddpos()/3match must know that the symbol under the cursor is longer than 1 byte, which means the script must calculate that itself.
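
The calculation a script would have to do here can be sketched as follows. This is a Python illustration under the assumption of UTF-8 text, not matchparen's code; `utf8_len` and `char_span` are hypothetical names.

```python
def utf8_len(lead: int) -> int:
    """Length in bytes of a UTF-8 sequence, judged from its lead byte."""
    if lead < 0x80:
        return 1
    if lead >> 5 == 0b110:
        return 2
    if lead >> 4 == 0b1110:
        return 3
    if lead >> 3 == 0b11110:
        return 4
    raise ValueError("not a UTF-8 lead byte")


def char_span(line: bytes, col: int):
    """Return (col, length) in bytes for the character at a 1-based byte column."""
    return col, utf8_len(line[col - 1])
```

For the line `(в)`, indexing a single byte at column 2 yields only the first half of `в`, while `char_span` reports a length of 2 bytes, which is the span a byte-based highlight would need.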
               
            • Павлов Николай Алекса
              Message 6 of 9 , Jul 4, 2014
                On July 4, 2014 6:55:28 PM GMT+03:00, Alexey Radkov <alexey.radkov@...> wrote:
                >> 4. Screen cells offset.
                >
                >Ok, i can imagine all the use cases but i cannot map all of them against
                >their usage frequency in majority of the vim plugins. If I have known that
                >4 is very rare i would not have suggested this patch :) ... I just remember
                >that LCD asked for 4 in Syntastic plugin.

                Most Vim functions return byte counts. No external tool can return a correct 4.; inside Vim, 4. is only returned by virtcol() and strdisplaywidth(), AFAIK.

                >> 1. matchparen. Will happily live with byte offset.
                >
                >Not really. Apart of the fact that it gets matching parens in a wrong way
                >like
                >
                >let c = getline(c_lnum)[c_col - 1]
                >
                >which will always mean 1 byte symbol, it uses matchaddpos() (or 3match
                >earlier) that rely on 1-byte symbol too.

                This code is not going to be fixed by matchaddpos() receiving screen cells. The regex engine does not rely on the number of bytes in a composed character. I cannot say the same for matchaddpos(), but I doubt it does.

                >I made an experiment:
                >
                >:set matchpairs+=в:д
                >
                >These symbols are not pairly highlighted with matchparen whereas '%' works
                >just fine: the reason is simple: both result of getline() and
                >matchaddpos()/3match must know that symbol under cursor is longer than 1
                >byte, it means that script must calculate it itself.

                3match does not need to know this. Neither does getline(). It is the script author who must know this; there are more correct ways to get the character at a given byte offset (depending on what "character" means).

                And no, indexing must not know about the length of characters. *Must not*. There are valid use cases for byte indexing. Special functions for taking a character (with different meanings) at a given byte offset would be very handy, but so far nobody has bothered to create them.
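
To illustrate what "taking a character (with different meanings) at a given byte offset" could look like, here is a small Python sketch. It is not a Vim function: `char_at` is a hypothetical name, UTF-8 is assumed, and "composed character" is approximated as a codepoint plus the combining marks that follow it.

```python
import unicodedata


def char_at(data: bytes, offset: int, composed: bool = False) -> str:
    """Return the character that starts at a 0-based byte offset in UTF-8 data.

    With composed=False the result is one codepoint; with composed=True the
    following combining marks are included as well.
    """
    # Decoding raises UnicodeDecodeError if the offset splits a character.
    tail = data[offset:].decode("utf-8")
    if not tail:
        return ""
    end = 1
    if composed:
        while end < len(tail) and unicodedata.combining(tail[end]):
            end += 1
    return tail[:end]
```

Plain byte indexing stays available for the cases that need it; the helper only adds an explicit choice of what "character" means.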
              • LCD 47
                Message 7 of 9 , Jul 4, 2014
                  On 4 July 2014, Павлов Николай Александрович <zyx.vim@...> wrote:
                  > Are you sure you need exactly screen cells? There are the following
                  > possible ways to identify position inside a string:
                  >
                  > 1. Byte offset.
                  > 2. Unicode codepoints offset.
                  > 3. Composed characters offset (one "composed character" is "one
                  > Unicode codepoint with attached composing characters (if any)").
                  > 4. Screen cells offset.
                  >
                  > I doubt anybody will use 4. outside of a editor because it is hard
                  > to calculate.

                  Sadly, the fact that you doubt it hasn't stopped anybody yet. :)
                  Here's a short list of syntax checkers that do that, in no particular
                  order:

                  * Haskell hdevtools
                  * Haskell HLint
                  * Haskell scan
                  * PHP_CodeSniffer
                  * HTML tidy
                  * GNU Bison
                  * sparse semantic parser for C
                  * Splint static checker for C
                  * msgfmt from GNU gettext
                  * Racket
                  * code-ayatollah linter for Racket
                  * R module svtools
                  * R module lint
                  * JSHint
                  * JSXHint

                  ... and probably others I don't remember right now.

                  > There are two settings that affect 4. and are only defined in a
                  > editor: &tabstop and &ambiwidth.

                  Yup. The checkers above either have their own config option for
                  &tabstop, or hardcode it (typically to 4 or 8). They also generally
                  don't care about &ambiwidth, and they can get away with that most of the
                  time, since source code is largely ASCII.
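
A rough Python sketch of what consuming such a column involves: converting a checker's screen column into a byte column using the checker's tab width, not the editor's. The names `cell_width` and `screen_col_to_byte_col` are hypothetical, UTF-8 is assumed, and ambiguous-width characters are counted as one cell, which mirrors the "don't care about &ambiwidth" behaviour described above.

```python
import unicodedata


def cell_width(ch: str, cells: int, tabstop: int) -> int:
    """Cells taken by ch when it starts after `cells` already-used cells."""
    if ch == "\t":
        return tabstop - cells % tabstop
    if unicodedata.combining(ch):
        return 0
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1


def screen_col_to_byte_col(line: str, screen_col: int, tabstop: int) -> int:
    """Map a 1-based screen column to the 1-based byte column of the
    character occupying it; returns one past the end if the line is shorter."""
    cells = 0
    byte_col = 1
    for ch in line:
        width = cell_width(ch, cells, tabstop)
        if screen_col <= cells + width:
            return byte_col
        cells += width
        byte_col += len(ch.encode("utf-8"))
    return byte_col
```

For the line `"\tfoo"`, screen column 9 maps to the `f` when the tab width is 8, but falls past the end of the line when the tab width is 4, so using the wrong setting silently moves the highlight.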

                  > *You must not use screen cells offset with editor settings if you
                  > received it from another source.* It is incorrect: you need settings
                  > from another source, not from this instance of editor.

                  True. Please suggest a better way to deal with the above.

                  > And ***do never use screen cells to count characters***. Code that
                  > assumes any fixed amount of Unicode codepoints per one cell is
                  > brain-damaged, broken and wrong.

                  True. Please suggest a better way to deal with the above.

                  > For this patch I heard the following use cases:
                  >
                  > 1. matchparen. Will happily live with byte offset.
                  > 2. Highlighting of errors from some source. May not use screen cells
                  > under any circumstances for the reasons explained above.

                  Yet Vim's errorformat already has %v. :)

                  > 3. I think that things like Conque may also benefit from this, but
                  > they do not need screen cells as well.

                  /lcd

                • LCD 47
                  Message 8 of 9 , Jul 4, 2014
                    On 4 July 2014, Alexey Radkov <alexey.radkov@...> wrote:
                    > 2014-07-04 18:22 GMT+04:00 Павлов Николай Александрович <zyx.vim@...>:
                    >
                    > > -----BEGIN PGP SIGNED MESSAGE-----
                    > > Hash: SHA512
                    > >
                    > > Are you sure you need exactly screen cells? There are the following
                    > > possible ways to identify position inside a string:
                    > >
                    > > 1. Byte offset.
                    > > 2. Unicode codepoints offset.
                    > > 3. Composed characters offset (one "composed character" is "one Unicode
                    > > codepoint with attached composing characters (if any)").
                    > > 4. Screen cells offset.
                    > >
                    >
                    > OK, I can imagine all the use cases, but I cannot map them against
                    > their usage frequency in the majority of Vim plugins. If I had known that
                    > 4 is very rare I would not have suggested this patch :) ... I just remember
                    > that LCD asked for 4 in the Syntastic plugin.
                    [...]

                    Oh, I wasn't *asking* for anything. You claimed in a previous
                    message that matchaddpos() would be good for syntastic, and I mentioned
                    some of the reasons why it can't be used. In particular, my point was
                    that *both* screen-cell calculations and byte-offset calculations
                    are useful in practice, not that the former should *replace* the latter.

                    /lcd
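
The four ways of measuring a position listed in the quoted message can be compared side by side. The following is a minimal sketch in Python (not Vim script), meant only to illustrate the arithmetic. The function name `measure` and its parameters `tabstop` and `ambiwidth_double` are hypothetical stand-ins for Vim's &tabstop and &ambiwidth, and the composing-character and width rules are approximations based on the Unicode database, not Vim's own tables.

```python
import unicodedata


def measure(text, tabstop=8, ambiwidth_double=False):
    """Return (bytes, codepoints, composed_chars, screen_cells) for text.

    Approximation: a codepoint with a non-zero combining class is treated
    as attached to the preceding base character and as occupying no cell.
    """
    n_bytes = len(text.encode("utf-8"))
    n_codepoints = len(text)
    n_composed = 0
    cells = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue  # composing character: no new character, no new cell
        n_composed += 1
        width_class = unicodedata.east_asian_width(ch)
        if ch == "\t":
            # A tab advances to the next multiple of tabstop.
            cells += tabstop - (cells % tabstop)
        elif width_class in ("W", "F"):
            cells += 2
        elif width_class == "A" and ambiwidth_double:
            # Ambiguous-width characters depend on the editor setting.
            cells += 2
        else:
            cells += 1
    return n_bytes, n_codepoints, n_composed, cells
```

The screen-cell count is the only one of the four that changes with `tabstop` and `ambiwidth_double`, which is why a column reported by an external tool cannot be reinterpreted safely with the editor's own settings.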

                  • Bram Moolenaar
                    Message 9 of 9 , Jul 6, 2014
                      Alexey Radkov wrote:

                      > 2014-07-04 18:22 GMT+04:00 Павлов Николай Александрович <zyx.vim@...>:
                      >
                      > > -----BEGIN PGP SIGNED MESSAGE-----
                      > > Hash: SHA512
                      > >
                      > > Are you sure you need exactly screen cells? There are the following
                      > > possible ways to identify position inside a string:
                      > >
                      > > 1. Byte offset.
                      > > 2. Unicode codepoints offset.
                      > > 3. Composed characters offset (one "composed character" is "one Unicode
                      > > codepoint with attached composing characters (if any)").
                      > > 4. Screen cells offset.
                      > >
                      >
                      > OK, I can imagine all the use cases, but I cannot map them against
                      > their usage frequency in the majority of Vim plugins. If I had known that
                      > 4 is very rare I would not have suggested this patch :) ... I just remember
                      > that LCD asked for 4 in the Syntastic plugin.
                      >
                      >
                      > > I doubt anybody will use 4. outside of an editor because it is hard to
                      > > calculate. There are two settings that affect 4. and are only defined in an
                      > > editor: &tabstop and &ambiwidth. *You must not use screen cells offset with
                      > > editor settings if you received it from another source.* It is incorrect:
                      > > you need settings from another source, not from this instance of editor.
                      > >
                      > > And ***never use screen cells to count characters***. Code that assumes
                      > > any fixed number of Unicode codepoints per cell is brain-damaged,
                      > > broken and wrong.
                      > >
                      > >
                      > > For this patch I heard the following use cases:
                      > >
                      > > 1. matchparen. Will happily live with byte offset.
                      > >
                      >
                      > Not really. Apart from the fact that it gets matching parens in a wrong way,
                      > like
                      >
                      > c = getline(c_lnum)[c_col - 1]
                      >
                      > which always yields a 1-byte symbol, it uses matchaddpos() (or 3match
                      > earlier), which relies on a 1-byte symbol too.

                      We already concluded that the code should round the size up to a full
                      character. You can't highlight one byte of a multi-byte character.

                      > I made an experiment:
                      >
                      > :set matchpairs+=в:д
                      >
                      > These symbols are not highlighted as a pair by matchparen, whereas '%' works
                      > just fine. The reason is simple: both the result of getline() and
                      > matchaddpos()/3match must know that the symbol under the cursor is longer than 1
                      > byte, which means that the script must calculate it itself.

                      There is no need to make calculations if we round up to a character.


                      --
                      When I look deep into your eyes, I see JPEG artifacts.
                      I can tell by the pixels that we're wrong for each other. (xkcd)

                      /// Bram Moolenaar -- Bram@... -- http://www.Moolenaar.net \\\
                      /// sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
                      \\\ an exciting new programming language -- http://www.Zimbu.org ///
                      \\\ help me help AIDS victims -- http://ICCF-Holland.org ///
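
The rounding described in this message (widening a byte range so that it never covers only part of a multi-byte character) can be sketched as follows. This is an illustration in Python under stated assumptions, not the code from the patch: the function name `round_to_chars` is hypothetical, offsets are 0-based, the line is assumed to be valid UTF-8, and composing characters are not merged with their base character.

```python
from typing import Tuple


def round_to_chars(line: bytes, start: int, length: int) -> Tuple[int, int]:
    """Widen the byte range [start, start + length) to whole UTF-8 characters.

    Returns the adjusted (start, length), both in bytes.
    """
    def is_continuation(byte: int) -> bool:
        # UTF-8 continuation bytes have the bit pattern 10xxxxxx.
        return (byte & 0xC0) == 0x80

    start = min(start, len(line))
    end = min(start + length, len(line))
    # Move the start back to the lead byte of the character it falls in.
    while 0 < start < len(line) and is_continuation(line[start]):
        start -= 1
    # Move the end forward past any trailing continuation bytes.
    while end < len(line) and is_continuation(line[end]):
        end += 1
    return start, end - start
```

With this kind of rounding, a caller that passes a length of 1 for the character under the cursor still gets the whole character covered, whether it is `%` or `в`.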
