Re: Matching chars in a regex by number

  • Eljay Love-Jensen
    Message 1 of 28, Jul 7, 2004
      Hi everyone,

      All languages have their pros and cons.

      Computer languages are like tools in a toolbox. Hammers can drive in
      screws, and screw drivers can pound in nails... but they're not the most
      suited tool to the task.

      As my old boss used to say, "When C++ is your hammer, all your problems
      look like thumbs." (Keep in mind, he worked at Metrowerks at the time, on
      their C++ PowerPlant code base.)

      Looking at Perl through Lisp glasses, Perl looks atrocious.

      Looking at C or C++ through Java glasses, C/C++ looks atrocious.

      Looking at Lisp through Perl glasses... well, that's not a good analogy
      since Perl gives you 5000 ways of doing something, and all of them are the
      right way.

      Bad programs can be written in any language.

      I admire the elegance and syntactic simplicity of Lisp. But I prefer the
      strong typing B&D of Ada -- although I dislike Ada's syntax.

      Sincerely,
      --Eljay
    • Ilya Sher
      Message 2 of 28, Jul 7, 2004
        Ciaran McCreesh wrote:
        [snip]
        | (If anyone knows how to make "cvs diff -N" handle new files rather
        | than giving a ?, please let me know...)

        If I understand your problem correctly you want
        to get something like:

        ####
        Index: test1
        ===================================================================
        RCS file: test1
        diff -N test1
        --- /dev/null 1 Jan 1970 00:00:00 -0000
        +++ test1 7 Jul 2004 16:11:00 -0000
        @@ -0,0 +1 @@
        +skdjfhbasjkdfh
        ####

        I achieved that by running:
            cvs add test1
            cvs diff -N

        (Do not check in before running the diff.)
      • Ilya Sher
        Message 3 of 28, Jul 7, 2004
          Bram Moolenaar wrote:
          | Ilya Sher wrote:
          |
          |
          |>To understand code easily requires that the language be simple
          |>and consistent. IMHO You can't get more simple/consistent than
          |>(op/func/special-form-name/whatever arg1 arg2 arg3 ... argN)
          |>for syntax.
          |>... which among other things puts Lisp as programming language
          |>#1 for me.
          |
          |
          | Thus:
          |
          | (smaller-than a b)
          |
          | Or even:
          |
          | (< a b)
          |
          | would be easier to read than:
          |
          | a < b
          |
          | Don't think so...
          Depends on how you look at it.
          Once you make the switch and see "<" not as an operator
          but as a function (btw: one that takes any number of arguments)
          that should not be called any differently from other
          functions, it starts to make sense. (*)

          One of the main concepts in Lisp is that most
          things are not _that_ different ;) That allows
          many wonderful things.

          For example, code is data that just happens to run.
          That lets macros generate code on the fly and, in contrast
          to many other scripting languages, the generated code does
          _not_ need to be parsed on eval(), simply because (generated) code
          and data share the same representation.


          (*) - Speaking here for myself only. The majority may well
          not see it that way. Hmm... Lisp is _not_ considered
          popular ... so one of the reasons might be that the majority
          probably doesn't ;)
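
The idea of "<" as an ordinary function that takes any number of arguments can be sketched outside Lisp as well. The Python below is only an illustration: the name less_than is hypothetical, and unlike Common Lisp (which requires at least one argument) this sketch does not reject an empty call.

```python
import operator


def less_than(*args):
    """Return True when the arguments are strictly increasing.

    Loosely mirrors Lisp's (< a b c ...): "<" is treated as a plain
    function taking any number of arguments, not a two-operand operator.
    """
    # Compare each neighbouring pair. With fewer than two arguments there
    # are no pairs to compare, so the result is True.
    return all(operator.lt(left, right) for left, right in zip(args, args[1:]))
```

Called as less_than(1, 2, 3), it checks 1 < 2 and 2 < 3 in one call, which is the property the prefix form gives for free.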

          |
          | Main problem with putting the operator first and then arguments is that
          | you don't know the order of the arguments without looking in the
          I don't get your point at all.
          How is f(x,y,z) better than (f x y z)
          or
          x binop z
          better than
          (binop x z)
          with regard to the order of _arguments_?
          | documentation. C also has that problem when using functions. Python
          | allows keyword arguments, which is nice to read:
          |
          | getentry(table = x, index = y)
          | strstr(haystack = s, needle = p)
          | strcpy(from = n, to = m)
          |
          | But then you discover it's a lot of typing, and you may rename the
          | arguments...
          Lisp also has keyword arguments ...
          |
          | A good mix of words and symbols is essential for being able to quickly
          | read back commands. Mostly words (e.g., Cobol) requires reading the
          | words. Mostly symbols (e.g., Perl) makes it cryptic.
          |
          Lisp can be balanced, as it has dynamic syntax
          (translated to the equivalent "(" and ")" representation just after
          the program/expression is read).
        • Ciaran McCreesh
          Message 4 of 28, Jul 7, 2004
            On Wed, 07 Jul 2004 19:15:07 +0300 Ilya Sher <ilya-vim@...>
            wrote:
            | Ciaran McCreesh wrote:
            | [snip]
            | | (If anyone knows how to make "cvs diff -N" handle new files rather
            | | than giving a ?, please let me know...)
            |
            | If I understand your problem correctly you want
            | to get something like:
            <snip>
            | I achieved that by
            | cvs add test1
            | cvs diff -N

            Yeah, that's the effect I'm after, except that a cvs add isn't an
            option. I guess a local CVS server doing mirroring would solve that,
            I was just wondering whether anyone happened to know a cleaner way.

            --
            Ciaran McCreesh : Gentoo Developer (Sparc, MIPS, Vim, Fluxbox)
            Mail : ciaranm at gentoo.org
            Web : http://dev.gentoo.org/~ciaranm
          • Ilya Sher
            Message 5 of 28, Jul 7, 2004
              Eljay Love-Jensen wrote:
              | Hi everyone,
              |
              | All languages have their pros and cons.
              100% Agreed.
              We would have one "super-language" otherwise.
              |
              | Computer languages are like tools in a toolbox. Hammers can drive in
              | screws, and screw drivers can pound in nails... but they're not the most
              | suited tool to the task.
              |
              | As my old boss used to say, "When C++ is your hammer, all your problems
              | look like thumbs." (Keep in mind, he worked at Metrowerks at the time,
              | on their C++ PowerPlant code base.)
              |
              | Looking at Perl through Lisp glasses, Perl looks atrocious.
              |
              | Looking at C or C++ through Java glasses, C/C++ looks atrocious.
              |
              | Looking at Lisp through Perl glasses... well, that's not a good analogy
              | since Perl gives you 5000 ways of doing something, and all of them are
              | the right way.
              |
              | Bad programs can be written in any language.
              Pick any code. Look at it. There's a 90% chance that's the case.
              |
              | I admire the elegance and syntactic simplicity of Lisp.
              Agreed.
              | But I prefer
              | the strong typing B&D of Ada -- although I dislike Ada's syntax.
              Strong typing - not agreed. Let's not argue about that - it would
              probably be a very long discussion, and I want to try to get back to
              my personal "urgent" things (though I can't resist discussing
              Lisp).
              |
              | Sincerely,
              | --Eljay
              |
              |

            • Ilya Sher
              Message 6 of 28, Jul 7, 2004
                Ciaran McCreesh wrote:
                | On Wed, 07 Jul 2004 19:15:07 +0300 Ilya Sher <ilya-vim@...>
                | wrote:
                | | Ciaran McCreesh wrote:
                | | [snip]
                | | | (If anyone knows how to make "cvs diff -N" handle new files rather
                | | | than giving a ?, please let me know...)
                | |
                | | If I understand your problem correctly you want
                | | to get something like:
                | <snip>
                | | I achieved that by
                | | cvs add test1
                | | cvs diff -N
                |
                | Yeah, that's the effect I'm after, except that a cvs add isn't an
                | option.
                Not 100% sure but I've looked at my CVS repository.
                It doesn't contain the not-yet-actually-added test1
                so maybe you don't have to have write permissions for
                add (of course you can't check-in in that case).
                Worth trying I guess.
                | I guess a local CVS server doing mirroring would solve that,
                | I was just wondering whether anyone happened to know a cleaner way.
                |
                Not me :(
              • Ilya Sher
                Message 7 of 28, Jul 7, 2004
                  David Brown wrote:
                  | On Wed, Jul 07, 2004 at 02:24:14PM +0200, Bram Moolenaar wrote:
                  |
                  |
                  |>>It would be (char<= #\0 c #\7) then, which is much
                  |
                  |
                  |>How do you do that for the Lisp code?
                  |
                  |
                  | Try pronouncing "char<=" as "between". Yes, the prefix ordering makes
                  | it weird.
                  Best suggestion.
                  It makes it read as
                  "c is between, or equal to one of,
                  the characters zero and seven".
                  You certainly can't say that about the C counterpart;
                  it's definitely not what is written there.
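
For comparison, the "between" reading of (char<= #\0 c #\7) can be written as a single chained comparison in Python, whereas C needs two separate comparisons joined with &&. This is a small sketch only; the function name is_octal_digit is an assumption made for the example.

```python
def is_octal_digit(c):
    """Return True when the single character c lies between '0' and '7'.

    The chained comparison reads left to right as "zero <= c <= seven",
    close to the "between" pronunciation suggested for char<=.
    """
    return "0" <= c <= "7"
```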
                  |
                  | Dave
                  |

                • Bram Moolenaar
                  Message 8 of 28, Jul 7, 2004
                    [ Last post in this off-topic thread. Promise! :-) ]

                    > As my old boss used to say, "When C++ is your hammer, all your problems
                    > look like thumbs." (Keep in mind, he worked at Metrowerks at the time, on
                    > their C++ PowerPlant code base.)

                    I'm not sure who said it first, but I have this quote from Steve Hoflich.

                    > Looking at Perl through Lisp glasses, Perl looks atrocious.

                    Larry Wall said:

                    "Lisp has all the visual appeal of oatmeal with nail clippings thrown in."


                    Of course, the best language will be Vim 7 script!

                    --
                    TERRY GILLIAM PLAYED: PATSY (ARTHUR'S TRUSTY STEED), THE GREEN KNIGHT
                    SOOTHSAYER, BRIDGEKEEPER, SIR GAWAIN (THE FIRST TO BE
                    KILLED BY THE RABBIT)
                    "Monty Python and the Holy Grail" PYTHON (MONTY) PICTURES LTD

                    /// Bram Moolenaar -- Bram@... -- http://www.Moolenaar.net \\\
                    /// Sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
                    \\\ Project leader for A-A-P -- http://www.A-A-P.org ///
                    \\\ Buy at Amazon and help AIDS victims -- http://ICCF.nl/click1.html ///
                  • Ilya Sher
                    Message 9 of 28, Jul 7, 2004
                      Bram Moolenaar wrote:
                      [snip]
                      | Larry Wall said:
                      |
                      | "Lisp has all the visual appeal of oatmeal with nail clippings
                      | thrown in."
                      Larry Wall IMHO has a very strange idea of what the syntax of
                      a language should be.
                      |
                      |
                      | Of course, the best language will be Vim 7 script!
                      |
                      Sure ! ;)

                      I just have a list of features that
                      should go in ... oops, that would be
                      a Lisp copy-cat then ;)


                      P.S.
                      Sorry I did not change the subject earlier.
                    • Ciaran McCreesh
                      Message 10 of 28, Jul 10, 2004
                        On Wed, 07 Jul 2004 17:52:58 +0200 Bram Moolenaar <Bram@...>
                        wrote:
                        | Thanks. I intend to include this patch in a few days. Please keep
                        | improving it as you see need for it.

                        Ok, one last tiny update:

                        * Add in a test for [\u] (as well as \%u) to test44
                        * Remove the todo.txt items

                        In case the patch gets mangled:

                        http://dev.gentoo.org/~ciaranm/patches/vim/vim-7.0aa-regexp-numbered-characters-r2.patch

                        Cheers,
                        --
                        Ciaran McCreesh : Gentoo Developer (Sparc, MIPS, Vim, Fluxbox)
                        Mail : ciaranm at gentoo.org
                        Web : http://dev.gentoo.org/~ciaranm
                      • Ilya Sher
                        Message 11 of 28, Jul 10, 2004
                          Ciaran McCreesh wrote:
                          [snip]
                          | + #ifdef FEAT_MBYTE
                          | + case 'u':
                          | + {
                          | + int i;
                          | + i = gethexchrs(4);
                          It would be nice to also have 'U' for multibyte:
                          i = gethexchrs(8);
                          for consistency with i_CTRL-V (i_CTRL-V_digit) and
                          as a way to specify the chars
                          | + if (i < 0)
                          | + EMSG_M_RET_NULL(
                          | + _("E71: Invalid character after %s%%u"),
                          | + reg_magic == MAGIC_ALL);
                          | + ret = regnode(EXACTLY);
                          | + if (i == 0)
                          | + regc(0x0a);
                          | + else
                          | + regmbc(i);
                          | + regc(NUL);
                          | + *flagp |= HASWIDTH;
                          | + break;
                          | + }
                          | + #endif
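
The difference between the 4-digit and the proposed 8-digit form can be sketched as follows. This assumes that gethexchrs(n) reads at most n hex digits and returns a negative value when none are found, which is what the "if (i < 0)" check in the quoted patch suggests. The Python names below are hypothetical and this is not the Vim implementation.

```python
HEX_DIGITS = "0123456789abcdefABCDEF"


def read_hex_chars(text, pos, max_len):
    """Read up to max_len hex digits from text, starting at index pos.

    Returns (value, new_pos). value is -1 and new_pos equals pos when no
    hex digit is found at pos.
    """
    value = 0
    end = pos
    while end < len(text) and end - pos < max_len and text[end] in HEX_DIGITS:
        # Shift the accumulated value by one hex digit and add the new one.
        value = value * 16 + int(text[end], 16)
        end += 1
    if end == pos:
        return -1, pos
    return value, end
```

With max_len set to 4 the reader stops after "1234" even if more hex digits follow; with 8 it can take a code point such as 0001F600 in one go.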

                          Regards,
                          Ilya
                        • Ilya Sher
                          Message 12 of 28, Jul 10, 2004
                            Ciaran McCreesh wrote:
                            | On Wed, 07 Jul 2004 17:52:58 +0200 Bram Moolenaar <Bram@...>
                            | wrote:
                            | | Thanks. I intend to include this patch in a few days. Please keep
                            | | improving it as you see need for it.
                            |
                            | Ok, one last tiny update:
                            |
                            | * Add in a test for [\u] (as well as \%u) to test44
                            | * Remove the todo.txt items
                            |
                            | In case the patch gets mangled:
                            |
                            | http://dev.gentoo.org/~ciaranm/patches/vim/vim-7.0aa-regexp-numbered-characters-r2.patch
                            |
                            | Cheers,

                            If I'm not mistaken scripts made for +multibyte
                            (using the new \u patterns) will generate an
                            error on -multibyte compiled vim.

                            I'm not sure it's correct behaviour...
                            Anyway, we should all think about that.

                            Suggested possibilities:
                            1) will always fail the pattern (this part of it;
                               matching "a" against "a\|\u1234" should succeed)
                            2) will match normal characters (unicode <= 255)
                               in the usual way and fail other matches
                            3) (very problematic imho) will convert unicode
                               to a sequence of chars to try (hard) to match the pattern
                               anyway.

                            (I'm for #2)
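
Option 2 can be sketched as a small decision function. Everything here is hypothetical (the function name, the tuple results, and the limit parameter); it only illustrates the proposed behaviour for a build without multibyte support and is not how Vim compiles patterns.

```python
def plan_numbered_char(code_point, multibyte, limit=255):
    """Decide how a numbered-character item would be handled.

    Returns ("multibyte", cp) when multibyte support is available,
    ("single_byte", cp) when the code point fits under the limit, and
    ("never_match", None) otherwise, so only that branch of the pattern
    fails instead of the whole pattern raising an error.
    """
    if multibyte:
        return ("multibyte", code_point)
    if code_point <= limit:
        return ("single_byte", code_point)
    return ("never_match", None)
```

The limit is a parameter because the right cut-off for "normal characters" is itself open to discussion.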

                            [snip]
                          • Ilya Sher
                            Message 13 of 28, Jul 10, 2004
                              Ilya Sher wrote:
                              | Ciaran McCreesh wrote:
                              | | On Wed, 07 Jul 2004 17:52:58 +0200 Bram Moolenaar <Bram@...>
                              | | wrote:
                              | | | Thanks. I intend to include this patch in a few days. Please keep
                              | | | improving it as you see need for it.
                              | |
                              | | Ok, one last tiny update:
                              | |
                              | | * Add in a test for [\u] (as well as \%u) to test44
                              | | * Remove the todo.txt items
                              | |
                              | | In case the patch gets mangled:
                              | |
                              | | http://dev.gentoo.org/~ciaranm/patches/vim/vim-7.0aa-regexp-numbered-characters-r2.patch
                              | |
                              | | Cheers,
                              |
                              | If I'm not mistaken scripts made for +multibyte
                              | (using the new \u patterns) will generate an
                              | error on -multibyte compiled vim.
                              |
                              | I'm not sure it's correct behaviour...
                              | Anyway, we should all think about that.
                              |
                              | Suggested possibilities:
                              | 1) will always fail the pattern (this part of it,
                              | ~ matching "a" against "a\|\u1234" should succeed)
                              | 2) will match normal characters (unicode <= 255)
                              hmm... or unicode <= 127 ?
                              | ~ in usual way and fail other matches
                              | 3) (very problematic imho) will convert unicode
                              | ~ to sequence of chars to try (hard) to match the pattern
                              | ~ anyway.
                              |
                              | (I'm for #2)
                              |
                              | [snip]
                            • Bram Moolenaar
                              Message 14 of 28, Jul 11, 2004
                                Ciaran McCreesh wrote:

                                > On Wed, 07 Jul 2004 17:52:58 +0200 Bram Moolenaar <Bram@...>
                                > wrote:
                                > | Thanks. I intend to include this patch in a few days. Please keep
                                > | improving it as you see need for it.
                                >
                                > Ok, one last tiny update:
                                >
                                > * Add in a test for [\u] (as well as \%u) to test44
                                > * Remove the todo.txt items

                                Thanks for the update.

                                --
                                "The future's already arrived - it's just not evenly distributed yet."
                                -- William Gibson

                                /// Bram Moolenaar -- Bram@... -- http://www.Moolenaar.net \\\
                                /// Sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
                                \\\ Project leader for A-A-P -- http://www.A-A-P.org ///
                                \\\ Buy at Amazon and help AIDS victims -- http://ICCF.nl/click1.html ///
                              • Ciaran McCreesh
                                Message 15 of 28, Jul 11, 2004
                                  On Sun, 11 Jul 2004 09:02:24 +0300 Ilya Sher <ilya-vim@...>
                                  wrote:
                                  | It would be nice something also 'U' for multibyte:
                                  | i = gethexchrs(8);
                                  | for consistency with i_CTRL-V (i_CTRL-V_digit) and
                                  | for way to specify the chars

                                  Wow, I didn't realise vim could even do that :)

                                  | If I'm not mistaken scripts made for +multibyte
                                  | (using the new \u patterns) will generate an
                                  | error on -multibyte compiled vim.

                                  Yeah. On the other hand, scripts made for +(insert feature here) will
                                  generally fail on a -feature vim.

                                  | 1) will always fail the pattern (this part of it,
                                  | ~ matching "a" against "a\|\u1234" should succeed)

                                  Hm. This could be handled by only emitting the regmbc if FEAT_MBYTE is
                                  on. However, it'd be rather strange behaviour.

                                  | 2) will match normal characters (unicode <= 255)
                                  | ~ in usual way and fail other matches

                                  This would be easy enough to implement, just check for <= 255 (or 127 as
                                  you suggest) and use regc instead. I could add this in if people think
                                  it's the 'right' option?

                                  | 3) (very problematic imho) will convert unicode
                                  | ~ to sequence of chars to try (hard) to match the pattern
                                  | ~ anyway.

                                  Hm. Tricky :)

                                  Here's another idea:

                                  4) If someone tries to use \%u or [\u], throw an E319.

                                  Personally, I kinda like the sound of that.

                                  Ok, updated patch (again...):
                                  * Add in [\U12345678] and \%U12345678
                                  * Update test44 and docs to include the above

                                  This thing's getting fairly long, so I'll stop attaching it. You can
                                  wget it from:

                                  http://dev.gentoo.org/~ciaranm/patches/vim/vim-7.0aa-regexp-numbered-characters-r3.patch

                                  Looks like there'll be at least one more revision after this which
                                  includes a solution to the features problem. I'd appreciate views on
                                  which to go for. Hm, and I thought this item would be a nice easy one to
                                  implement :)

                                  --
                                  Ciaran McCreesh : Gentoo Developer (Sparc, MIPS, Vim, Fluxbox)
                                  Mail : ciaranm at gentoo.org
                                  Web : http://dev.gentoo.org/~ciaranm
                                • Ilya Sher
                                  Message 16 of 28, Jul 11, 2004
                                    Ciaran McCreesh wrote:

                                    | Yeah. On the other hand, scripts made for +(insert feature here) will
                                    | generally fail on a -feature vim.

                                    My general thought is to try to run the script
                                    despite the "difficulties". I'm not sure that
                                    is the best thing to do. What I'm sure of is
                                    that trying to run a script that tries to build
                                    GUI menus on Vim without a GUI is very different
                                    from trying to match a regex.

                                    Whatever I wrote in my previous email is
                                    a suggestion (and my opinion). If most people
                                    find the suggested behaviour counter-intuitive,
                                    inappropriate, or incorrect (whatever ;) ), that is
                                    fine with me.
                                    -----BEGIN PGP SIGNATURE-----
                                    Version: GnuPG v1.2.4 (GNU/Linux)
                                    Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

                                    iD8DBQFA8Zn+crmCGtFx9Y8RAp8QAKCeyw9CRY+kiNQcJRWR9Sr5B2shdQCeLZ5i
                                    Mc2pgnBqH3xEKkVkatuhgy4=
                                    =Fnov
                                    -----END PGP SIGNATURE-----
                                  • Alejandro Lopez-Valencia
                                    ... If multibyte is not compiled in, the regex should use 8-bit character ranges (that is 256 vector: x =
                                    Message 17 of 28 , Jul 11, 2004
                                      At 01:56 p.m. 11/07/2004, Ciaran McCreesh wrote in response to Ilya Sher:


                                      >| 2) will match normal characters (unicode <= 255)
                                      >| ~ in usual way and fail other matches
                                      >
                                      >This would be easy enough to implement, just check for < 255 (or 127 as
                                      >you suggest) and use regc instead. I could add this in if people think
                                      >it's the 'right' option?

                                      If multibyte is not compiled in, the regex should use 8-bit character
                                      ranges (that is, a 256-entry vector: x <= 255, not x < 255). Vim is 8-bit
                                      clean, which allows writing any language with a character repertoire of
                                      fewer than 256 characters (223 if you discard control characters), as
                                      defined by the system codepage, whether it is Cyrillic, Hebrew, German,
                                      Welsh, Spanish... No need for Unicode support in that case.

                                      Assuming x <= 127 (broken! 7-bit ASCII == x <= 127) is, well... a bad trip
                                      to mainframe land :-)
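To make the boundary point concrete, here is a small sketch comparing the inclusive 8-bit check with the off-by-one variant quoted above. The function names are hypothetical and nothing here comes from the Vim source.

```python
# Illustrative only; names are hypothetical, not from Vim.

def fits_single_byte(code_point):
    # Inclusive upper bound: all 256 values 0..255 are accepted.
    return 0 <= code_point <= 0xFF


def off_by_one_check(code_point):
    # The "< 255" variant wrongly rejects 255, the last value in the range.
    return 0 <= code_point < 255


if __name__ == "__main__":
    for cp in (0x41, 0xE9, 0xFF, 0x100):
        print(hex(cp), fits_single_byte(cp), off_by_one_check(cp))
```

The two checks disagree only at 255, which is exactly the value an 8-bit codepage still needs to be able to match.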
                                    • Ciaran McCreesh
                                      ... ... Since no-one has screamed particularly loudly about any of the approaches suggested, I've gone for throwing an E319 for \%u, and making [\u]
                                      Message 18 of 28 , Jul 21, 2004
                                        On Sun, 11 Jul 2004 19:56:56 +0100 I wrote:
                                        | | If I'm not mistaken scripts made for +multibyte
                                        | | (using the new \u patterns) will generate an
                                        | | error on -multibyte compiled vim.
                                        <snip>
                                        | 4) If someone tries to use \%u or [\u], throw an E319.

                                        Since no-one has screamed particularly loudly about any of the
                                        approaches suggested, I've gone for throwing an E319 for \%u, and making
                                        [\u] behave just like vi. This will avoid breaking anything which
                                        already uses [\u] expecting it to match a backslash and a u.

                                        http://dev.gentoo.org/~ciaranm/patches/vim/vim-7.00a-regexp-numbered-characters-r5.patch
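The behaviour described above can be sketched roughly as follows. This is an illustration of the decision only, not code from the patch: the interpret function, the FeatureUnavailable exception (a stand-in for Vim raising E319), and the has_multibyte flag are all hypothetical names, and the result is modelled simply as the set of characters the atom would match.

```python
# Rough model of the chosen behaviour; all names here are hypothetical.

class FeatureUnavailable(Exception):
    """Stand-in for Vim reporting E319 on a build without the feature."""


def interpret(atom, has_multibyte):
    # Return the set of characters the atom would match.
    if atom.startswith("\\%u"):
        if not has_multibyte:
            # \%u has no older meaning to fall back on, so report an error.
            raise FeatureUnavailable("E319")
        return {chr(int(atom[3:], 16))}
    if atom.startswith("[\\u") and atom.endswith("]"):
        body = atom[1:-1]
        digits = body[2:]
        if has_multibyte and digits:
            return {chr(int(digits, 16))}
        # vi-compatible reading: every character inside the brackets,
        # including the backslash and the 'u', is a literal member.
        return set(body)
    raise ValueError("atom not covered by this sketch")


if __name__ == "__main__":
    print(sorted(interpret("[\\u]", False)))
    print(interpret("\\%u20AC", True))
```

The asymmetry is deliberate in this model: [\u] already means something in existing patterns, so it keeps that meaning, while \%u is new syntax and can safely fail loudly.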

                                        --
                                        Ciaran McCreesh : Gentoo Developer (Sparc, MIPS, Vim, Fluxbox)
                                        Mail : ciaranm at gentoo.org
                                        Web : http://dev.gentoo.org/~ciaranm