ANNC: v15 of engspchk

  • Charles E. Campbell
    Message 1 of 20, Aug 2, 2002
      Hello!

      I've uploaded version 15 of <engspchk.tar.gz> under

      http://www.erols.com/astronaut/vim/index.html#vimlinks_scripts

      The new version:


      * has some newly incorporated words

      * has split <engspchk.dict> off from <engspchk.vim>, which
      should make it easier to generate versions of engspchk
      that support non-English languages
      (so far: Dutch, German, Hungarian, Polish, Yiddish)

      * has improved help

      * accidentally omitted earlier: <engspchk.wb>
      which contains words not in <engspchk.dict> from
      Webster's 1913 dictionary is now included

      * bug fixes (agrep's path now included in ""s,
      has("menu") used to determine if menus are
      supported, etc)


      Please rename your personal dictionaries from

      usr_engspchk.vim -to-> engspchk.usr

      starting with this version.

      Regards,
      Chip Campbell

      --
      Charles E Campbell, Jr, PhD _ __ __
      Goddard Space Flight Center / /_/\_\_/ /
      cec@... /_/ \/_//_/
      PGP public key: http://www.erols.com/astronaut/pgp.html
    • Anthony Campbell
      Message 2 of 20, Aug 2, 2002
        On 02 Aug 2002, Charles E. Campbell wrote:
        > Hello!
        >
        > I've uploaded version 15 of <engspchk.tar.gz> under
        >
        > http://www.erols.com/astronaut/vim/index.html#vimlinks_scripts
        >
        > The new version:
        >

        I got this but there seems to be a bug. It identifies almost all the
        words in my files as spelled wrong. I therefore reverted to the
        previous version, which works perfectly.

        AC


        --
        ac@... || http://www.acampbell.org.uk
        using Linux GNU/Debian || for book reviews, electronic
        Windows-free zone || books and skeptical articles
      • Kenneth Pronovici
        Message 3 of 20, Aug 2, 2002
          > > Hello!
          > >
          > > I've uploaded version 15 of <engspchk.tar.gz> under
          > >
          > > http://www.erols.com/astronaut/vim/index.html#vimlinks_scripts
          > >
          > > The new version:
          > >
          >
          > I got this but there seems to be a bug. It identifies almost all the
          > words in my files as spelled wrong. I therefore reverted to the
          > previous version, which works perfectly.

          Doesn't seem to work quite right for me, either. If I set g:cvimsyn
          in my ~/.vimrc, I'm fine, but it doesn't set g:cvimsyn to the value
          of $CVIMSYN properly. The code right around line 142:

          " Use g:cvimsyn (vim variable) if it exists, otherwise attempt to
          " use the environment variable $CVIMSYN.
          if !exists("g:cvimsyn")
            if version >= 600
              let g:cvimsyn= substitute(&rtp,',.*$','','')
            elseif expand("$CVIMSYN") == ""
              echo "You need to set the CVIMSYN environment variable first"
              echo "It points to where ".g:spchklang."spchk.usr, the user"
              echo "dictionary, is to go."
              finish
            else
              let g:cvimsyn= expand("$CVIMSYN")
            endif
          endif

          doesn't seem to quite do what the comments describe. Replacing the
          old block with this seems to work:

          " Use g:cvimsyn (vim variable) if it exists, otherwise attempt to
          " use the environment variable $CVIMSYN.
          if !exists("g:cvimsyn")
            if expand("$CVIMSYN") == ""
              if version >= 600
                let g:cvimsyn= substitute(&rtp,',.*$','','')
              else
                echo "You need to set the CVIMSYN environment variable first"
                echo "It points to where ".g:spchklang."spchk.usr, the user"
                echo "dictionary, is to go."
                finish
              endif
            else
              let g:cvimsyn= expand("$CVIMSYN")
            endif
          endif

          Other than that, it seems to work fine for me.
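
          The workaround mentioned above, setting g:cvimsyn in ~/.vimrc, might
          look like the following minimal sketch. The directory name is a
          hypothetical example, not a path that engspchk requires:

          ```vim
          " In ~/.vimrc: tell engspchk where the user dictionary is to go,
          " so that it does not need to consult $CVIMSYN at all.
          " (~/.vim/CVIMSYN is an assumed, illustrative location.)
          let g:cvimsyn = expand("~/.vim/CVIMSYN")
          ```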

          KEN

          --
          Kenneth J. Pronovici <pronovic@...>
          Personal Homepage: http://www.skyjammer.com/~pronovic/
          "They that can give up essential liberty to obtain a little
          temporary safety deserve neither liberty nor safety."
          - Benjamin Franklin, Historical Review of Pennsylvania, 1759
        • Charles E. Campbell
          Message 4 of 20, Aug 2, 2002
            On Fri, Aug 02, 2002 at 12:47:18PM -0500, Kenneth Pronovici wrote:
            > Doesn't seem to work quite right for me, either.
            ---------------------------------------------------------------------

            Sorry 'bout that! I've uploaded v16, which fixes the case where
            g:cvimsyn is missing but the environment variable CVIMSYN is
            defined, more or less as you suggested.

            http://www.erols.com/astronaut/vim/index.html#vimlinks_scripts
            (engspchk)

            Thanks!
            Chip Campbell

            --
            Charles E Campbell, Jr, PhD _ __ __
            Goddard Space Flight Center / /_/\_\_/ /
            cec@... /_/ \/_//_/
            PGP public key: http://www.erols.com/astronaut/pgp.html
          • Denis Perelyubskiy
            Message 5 of 20, Aug 2, 2002
              hello,

              * Charles E. Campbell <cec@...> [02-Aug-02 09:52 -0700]:
              [...]
              > * accidentally omitted earlier: <engspchk.wb>
              > which contains words not in <engspchk.dict> from
              > Webster's 1913 dictionary is now included
              [...]
              >Please rename your personal dictionaries from
              >
              > usr_engspchk.vim -to-> engspchk.usr

              it seems that docs still refer to usr_engspchk.vim and
              wb1913_engspchk.vim.

              also, for " \es : save word under cursor into database
              (permanently) (requires $CVIMSYN)"... is $CVIMSYN required,
              or is setting the g:cvimsyn variable sufficient?

              also, there do not seem to be clean instructions on how
              to "just spellcheck the file" :)...

              finally, I don't have to reload with \ec if I have your
              script in plugins, do I?

              ... it'd be nice to have the above in the docs too, I
              think..

              denis

              --
              // mailto: Denis Perelyubskiy <denisp@...>
              // icq : 12359698
              // PGP : http://www.cs.ucla.edu/~denisp/files/pgp.asc
            • Charles E. Campbell
              Message 6 of 20, Aug 2, 2002
                On Fri, Aug 02, 2002 at 01:44:47PM -0700, Denis Perelyubskiy wrote:
                > it seems that docs still refer to usr_engspchk.vim and
                > wb1913_engspchk.vim.

                Thank you -- version 17 now has the document fixed, which I've again
                just uploaded!

                > also, for " \es : save word under cursor into database
                > (permanently) (requires $CVIMSYN)"... is $CVIMSYN required,
                > or is setting the g:cvimsyn variable sufficient?

                g:cvimsyn is quite sufficient! Again, doc has been fixed.

                > also, there do not seem to be clean instructions on how
                > to "just spellcheck the file" :)...

                I've included more chatter in the <engspchk.txt> file -- hopefully
                that'll help!

                > finally, I don't have to reload with \ec if I have your
                > script in plugins, do I?

                Actually, you do -- having <engspchk.vim> in plugins sets up a quickload
                -- ie. it *doesn't* load the <engspchk.dict>, etc. Doing \ec will
                actually load <engspchk.vim>, etc.

                Also, because engspchk uses syntax highlighting, anything that clears
                the syntax highlighting (as most syntax highlighting files do) will also
                eliminate engspchk highlighting. It will then need to be re-loaded
                (which \ec will do).

                Regards, and thanks for the suggestions,
                Chip Campbell

                --
                Charles E Campbell, Jr, PhD _ __ __
                Goddard Space Flight Center / /_/\_\_/ /
                cec@... /_/ \/_//_/
                PGP public key: http://www.erols.com/astronaut/pgp.html
              • Mikolaj Machowski
                Message 7 of 20, Aug 2, 2002
                  Actually this is in polspchk (I will send it separately to the author), but
                  this seems to be a copy-and-paste case:

                  look for
                  /Chase Tingle patch

                  And where there is

                  syn match GoodWord "\<\a\+\%*\>"

                  it should be

                  syn match GoodWord "\<\a\+\%#\>"

                  And for non-latin1 languages it should be:

                  syn match GoodWord "\<\k\+\%#\>"


                  Mikolaj
                • Steve Hall
                  Message 8 of 20, Aug 2, 2002
                    Thought I'd tag along on this thread because our Cream for Vim project
                    has encountered some of the same engspchk issues being discussed here.
                    With Dr. C's permission, we used it as a launch point for our own
                    spell check and have been debating some of the same things recently.
                    Ours has a little more overhead because we try to handle multiple words
                    and case sensitivity (proper nouns, German nouns) but the principles
                    are all the same.

                    It's almost inevitable that spell check is buffer-specific since the
                    word lists are merged with the syntax ones to form good words and bad
                    words. WAY down our todo list is some sort of sophisticated cross
                    referencing system that eliminates redundancies, but this couldn't be
                    very quick. We also found that each spell check load within a buffer
                    increases memory usage by about twice the dictionary size.

                    We've also added two things others might find useful. (The third is
                    still broken for us, any suggestions?):

                    * Stand alone numbers omitted as errors. (Be curious to see the
                    pattern for mixed alpha/numbers if someone cares to suggest.)

                    " numbers aren't errors
                    syntax match GoodWord "\<[0-9]\+\>"


                    * Words not capitalized at the beginning of sentences. (I'm proud of
                    this one.) Not sure if this applies to all languages, but has so far
                    in the ones we represent. Notice that non-cap words after ellipses
                    are cleared and listed first, and that ellipses plus a period (four
                    total) *may* require a capitalized case following and is considered
                    an error just so you see it.

                    " non-capitalized word after ellipses
                    syntax match GoodWord "\.\.\. \{0,2}\l\@="
                    " but NOT non-capitalized word after ellipses + period
                    syntax match BadWord "\.\.\.\. \{0,2}\l"

                    " non-lowercased end of words (in order, as required)
                    " * required: period/question mark/exclamation mark
                    " * optional: double/single quote (German, too)
                    " * required: return/return-linefeed/space/2 spaces
                    " * required: lowercase letter
                    syntax match BadWord "[\.?!][\"'«»]\=[\r\n\t ]\+\l"

                    " TEST:
                    " cow. bells
                    " cow. Bells
                    " cow. bells
                    " cow. Bells
                    " cow. bells
                    " cow. Bells

                    " cow...bells
                    " cow... bells
                    " cow... bells

                    " cow....bells
                    " cow....Bells
                    " cow.... bells
                    " cow.... Bells
                    " cow.... bells
                    " cow.... Bells


                    * Repeated word checking, which I never got to work despite many
                    alternatives suggested. Be glad to solve this one.

                    " double word checking (TEST: the the coww )
                    " Bram Moolenaar:
                    "syntax match BadWord "\(\<\k\+\>\)\s\+\1"
                    " Preben Peppe Guldberg:
                    "syntax match BadWord "\(\<\k\+\>\)\s\+\1\>"
                    " Gary Holloway:
                    "syntax match BadWord "\(\<\k\+\>\)\(\_s\+\<\1\>\)\+"
                    " Rober Montante:
                    "syntax match BadWord "\(\<\w\+\>\)\_s*\1\>"
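
                    For the mixed alpha/number case raised in the first item, one
                    possible pattern is sketched below. It uses only the standard Vim
                    atoms \w (word character) and \d (digit). How it interacts with
                    the rest of the spell-check rules is an untested assumption, and
                    \< and \> depend on the 'iskeyword' setting:

                    ```vim
                    " words containing at least one digit (e.g. 3rd, mp3, x86)
                    " aren't errors; this also covers stand-alone numbers
                    syntax match GoodWord "\<\w*\d\w*\>"
                    ```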


                    Hope somebody can use these.

                    Steve Hall [ digitect@... ]
                  • Antoine J. Mechelynck
                    Message 9 of 20, Aug 2, 2002
                      ----- Original Message -----
                      From: "Steve Hall" <digitect@...>
                      To: "Vim" <vim@...>
                      Sent: Saturday, August 03, 2002 4:38 AM
                      Subject: Re: [vim] ANNC: v15 of engspchk

                      [...]
                      >
                      > * Words not capitalized at the beginning of sentences. (I'm proud of
                      > this one.) Not sure if this applies to all languages, but has so far
                      > in the ones we represent.
                      [...]
                      Some styles of writing (Hebrew, Arabic, ...) simply don't make the
                      distinction between upper- and lower-case. And in Dutch, if the first word of
                      a sentence begins with an apostrophe, then the *second* word takes a
                      capital, be it a common word: 's Werelds... The world's...; 't Is...
                      It's... -- or a proper noun: 's-Gravenhage, 's-Hertogenbosch. In Dutch also,
                      the letter-pair ij is treated as one letter when it comes to capitalization:
                      IJsland, Iceland. IJs en zout... Ice and salt...

                      Regards,
                      Tony.
                    • Steve Hall
                      Message 10 of 20, Aug 3, 2002
                        From: "Antoine J. Mechelynck" <antoine.mechelynck@...>
                        >
                        > From: "Steve Hall" <digitect@...>
                        >
                        > > * Words not capitalized at the beginning of sentences. (I'm proud
                        > > of this one.) Not sure if this applies to all languages, but has
                        > > so far in the ones we represent.
                        >
                        > Some styles of writing (Hebrew, Arabic, ...) simply don't make the
                        > distinction between upper- and lower-case. And in Dutch, if the first
                        > word of a sentence begins with an apostrophe, then the *second* word
                        > takes a capital, be it a common word: 's Werelds... The world's...;
                        > 't Is... It's... -- or a proper noun: 's-Gravenhage,
                        > 's-Hertogenbosch. In Dutch also, the letter-pair ij is treated as
                        > one letter when it comes to capitalization: IJsland, Iceland. IJs en
                        > zout... Ice and salt...

                        This is great information, thanks for the explanation!

                        One thing we've done is to split the dictionary into four parts, by
                        case sensitivity (sensitive/insensitive) and by single-word/multi-word
                        entries. So it looks like for Dutch, we'd just need to invent a
                        highlight match similar to

                        syntax match BadWord "[\.?!][\"'«»]\=[\r\n\t ]\+'s\>[- ]\l"

                        BTW, for the record, does Dutch use "«»" for quote symbols?


                        Steve Hall [ digitect@... ]
                      • Antoine J. Mechelynck
                        Message 11 of 20, Aug 3, 2002
                          ----- Original Message -----
                          From: "Steve Hall" <digitect@...>
                          To: "Antoine J. Mechelynck" <antoine.mechelynck@...>
                          Cc: <vim@...>
                          Sent: Saturday, August 03, 2002 2:12 PM
                          Subject: Re: [vim] ANNC: v15 of engspchk


                          [...]
                          >
                          > BTW, for the record, does Dutch use "«»" for quote symbols?
                          >

                          I'm not sure; it may depend on the area (Belgium vs. The Netherlands vs
                          ex-colonies of the latter vs...). I think `` ,, (^K 6" and ^K 9:, IIRC,
                          in utf-8 gvim) are preferred. « » are certainly "possible". Maybe
                          Niederdeutsch (Low German, very similar to Dutch and inter-comprehensible
                          with it) would use German-style quotation marks, opening a quote with what
                          would elsewhere be a "closing" q.m. and vice-versa. (French is my mother
                          language, I live in a bilingual country where the "other" language is Dutch,
                          I learnt it all the way through school and even did some professional
                          translation work, but out of Dutch, not into it.)

                          Regards,
                          Tony.

                          >
                          > Steve Hall [ digitect@... ]
                          >
                          >
                          >
                          >
                        • Benji Fisher
                          Message 12 of 20, Aug 31, 2002
                            Steve Hall wrote:
                            >
                            [snip]
                            > * Repeated word checking, which I never got to work despite many
                            > alternatives suggested. Be glad to solve this one.
                            >
                            > " double word checking (TEST: the the coww )
                            > " Bram Moolenaar:
                            > "syntax match BadWord "\(\<\k\+\>\)\s\+\1"
                            > " Preben Peppe Guldberg:
                            > "syntax match BadWord "\(\<\k\+\>\)\s\+\1\>"
                            > " Gary Holloway:
                            > "syntax match BadWord "\(\<\k\+\>\)\(\_s\+\<\1\>\)\+"
                            > " Rober Montante:
                            > "syntax match BadWord "\(\<\w\+\>\)\_s*\1\>"

                            How do these fail? I prefer yet another variant:

                            /\<\(\k\+\)\_s\+\1\>

                            although perhaps \w or \a would be preferred over \k. This, like some of the
                            alternatives you listed, works on the test you gave.

                            --Benji Fisher
                          • Mikolaj Machowski
                            Message 13 of 20, Sep 1, 2002
                              On Sat, 31 Aug 2002, Benji Fisher wrote:

                              > Steve Hall wrote:
                              > >
                              > [snip]
                              > > * Repeated word checking, which I never got to work despite many
                              > > alternatives suggested. Be glad to solve this one.
                              >
                              > How do these fail? I prefer yet another variant:
                              >
                              > /\<\(\k\+\)\_s\+\1\>
                              >
                              > although perhaps \w or \a would be preferred over \k. This, like some of the
                              > alternatives you listed, works on the test you gave.

                              \k is for non-US-ASCII users. The best solution would be to use a
                              variable here, with the letters defined somewhere else (\k is
                              sometimes too generous).

                              Mikolaj
                            • Steve Hall
                              Message 14 of 20, Sep 1, 2002
                                From: "Benji Fisher" <benji@...>
                                >
                                > Steve Hall wrote:
                                > > * Repeated word checking....
                                >
                                > How do these fail? I prefer yet another variant:
                                >
                                > /\<\(\k\+\)\_s\+\1\>
                                >
                                > although perhaps \w or \a would be preferred over \k. This, like
                                > some of the alternatives you listed, works on the test you gave.

                                Do they work for you? That's encouraging.

                                I am actually using these in an engspchk variant, one I've cooked up
                                myself and they all fail. The other custom statements work as expected
                                but I've never been able to get these correct.

                                At least now I know these statements aren't the problem. Thanks!


                                Steve Hall [ digitect@... ]
                              • Charles E. Campbell
                                Message 15 of 20, Sep 2, 2002
                                  On Sun, Sep 01, 2002 at 09:18:04PM -0400, Steve Hall wrote:
                                  > From: "Benji Fisher" <benji@...>
                                  > >
                                  > > Steve Hall wrote:
                                  > > > * Repeated word checking....
                                  > > How do these fail? I prefer yet another variant:
                                  > >
                                  > > /\<\(\k\+\)\_s\+\1\>
                                  > >
                                  >
                                  > Do they work for you? That's encouraging.
                                  >
                                  > I am actually using these in an engspchk variant...
                                  ---------------------------------------------------------------------

                                  Hello!

                                  The problem with these matches isn't that they don't work in
                                  isolation, it's that matches have a lower syntax highlighting
                                  priority than do keywords. A correctly spelled word is
                                  recognized as a GoodWord (keyword) before the match looking
                                  for repeated words gets a chance.
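
                                  The priority rule can be reproduced with a minimal sketch
                                  like the one below. The two-word keyword list is a made-up
                                  stand-in for the real dictionary, and the pattern is Benji
                                  Fisher's variant from this thread:

                                  ```vim
                                  " Sketch only: a two-word stand-in for the real dictionary.
                                  syntax clear
                                  syntax keyword GoodWord the cow
                                  " Repeated-word match (Benji Fisher's variant).
                                  syntax match BadWord "\<\(\k\+\)\_s\+\1\>"
                                  highlight link BadWord Error
                                  " On a line containing:  the the cow
                                  " each 'the' is claimed by the GoodWord keyword first, so the
                                  " BadWord match starting at the same column is never used
                                  " (see :help syn-priority).
                                  ```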

                                  Regards,
                                  Chip Campbell

                                  --
                                  Charles E Campbell, Jr, PhD _ __ __
                                  Goddard Space Flight Center / /_/\_\_/ /
                                  cec@... /_/ \/_//_/
                                  PGP public key: http://www.erols.com/astronaut/pgp.html
                                • Benji Fisher
                                  Message 16 of 20, Sep 2, 2002
                                    "Charles E. Campbell" wrote:
                                    >
                                    > On Sun, Sep 01, 2002 at 09:18:04PM -0400, Steve Hall wrote:
                                    > > From: "Benji Fisher" <benji@...>
                                    > > >
                                    > > > Steve Hall wrote:
                                    > > > > * Repeated word checking....
                                    > > > How do these fail? I prefer yet another variant:
                                    > > >
                                    > > > /\<\(\k\+\)\_s\+\1\>
                                    > > >
                                    > >
                                    > > Do they work for you? That's encouraging.
                                    > >
                                    > > I am actually using these in an engspchk variant...
                                    > ---------------------------------------------------------------------
                                    >
                                    > Hello!
                                    >
                                    > The problem with these matches isn't that they don't work in
                                    > isolation, it's that matches have a lower syntax highlighting
                                    > priority than do keywords. A correctly spelled word is
                                    > recognized as a GoodWord (keyword) before the match looking
                                    > for repeated words gets a chance.

                                    How can we get around this? The obvious solution is not to use ":syn
                                    keyword" at all, just use ":syn match". This would probably lead to a big
                                    performance hit. Is there a more clever solution? Maybe use the above pattern
                                    to define a syntax region, so that keywords are not recognized within the
                                    region? (I have never used ":syn region" myself, so I hope Chip or some other
                                    syntax guru will either flesh this idea out or put it out of its misery.)

                                    HTH --Benji Fisher
                                  • Charles E. Campbell
                                    Message 17 of 20, Sep 3, 2002
                                      On Mon, Sep 02, 2002 at 09:15:49PM -0400, Benji Fisher wrote:
                                      > How can we get around this? The obvious solution is not to use ":syn
                                      > keyword" at all, just use ":syn match". This would probably lead to a big
                                      > performance hit.

                                      That is something we definitely don't want to do.

                                      > Is there a more clever solution?

                                      One thing that I do for <vim.vim> is use a match to recognize all
                                      potential words; the match contains the keyword list.

                                      syn match MatchMaker "\k\+" contains=GoodWord,BadWord

                                      Then the double-word match syntax would be able to use
                                      priority and thus take effect. This approach would also result
                                      in a performance hit, although not as bad as making all the
                                      GoodWords matches.
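
                                      A minimal sketch of this containing-match idea follows.
                                      It rests on one assumption the message does not state:
                                      that the keywords are declared "contained", so they are
                                      only recognized inside MatchMaker and not at the top
                                      level. The word list is a made-up stand-in, and BadWord
                                      is left out for brevity:

                                      ```vim
                                      " Sketch only: a tiny stand-in word list.
                                      syntax clear
                                      " ASSUMPTION: 'contained' stops the keywords from matching
                                      " at the top level; they apply only inside MatchMaker.
                                      syntax keyword GoodWord contained the cow bells
                                      " Every run of keyword characters becomes a MatchMaker item.
                                      syntax match MatchMaker "\k\+" contains=GoodWord
                                      " Defined after MatchMaker, so it has priority when both
                                      " start at the same position (see :help syn-priority).
                                      syntax match DoubleWord "\<\(\k\+\)\_s\+\1\>"
                                      highlight link DoubleWord Error
                                      ```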

                                      Alternatively one could recognize the double-word ahead of the
                                      keyword recognition:

                                      syn match DoubleWord "\s\(\k\+\)\_s\+\1\>"ms=s+1

                                      Thus the DoubleWord starts its recognition pass on a leading
                                      whitespace. Of course lines which begin with double-words
                                      like like this one would not be found.

                                      > Maybe use the above pattern
                                      > to define a syntax region, so that keywords are not recognized within the
                                      > region?

                                      Keywords have top priority over both matches *and* regions.

                                      Regards,
                                      Chip Campbell

                                      --
                                      Charles E Campbell, Jr, PhD _ __ __
                                      Goddard Space Flight Center / /_/\_\_/ /
                                      cec@... /_/ \/_//_/
                                      PGP public key: http://www.erols.com/astronaut/pgp.html
                                    • digitect@mindspring.com
                                      Message 18 of 20 , Sep 3, 2002
                                        From: "Charles E. Campbell" <cec@...>
                                        >
                                        > On Mon, Sep 02, 2002 at 09:15:49PM -0400, Benji Fisher wrote:
                                        > >
                                        > > How can we get around this?....Is there a more clever situation?
                                        >
                                        > Alternatively one could recognize the double-word ahead of the
                                        > keyword recognition:
                                        >
                                        > syn match DoubleWord "\s\(\k\+\)\_s\+\1\>"ms=s+1
                                        >
                                        > Thus the DoubleWord starts its recognition pass on a leading
                                        > whitespace. Of course lines which begin with double-words like like
                                        > this one would not be found.

                                        Not sure I understand why it needs to begin with whitespace. All of
                                        the possibilities so far use the "\<" word beginning to define the
                                        start of the pattern:

                                        " Bram Moolenaar:
                                        "syntax match BadWord "\(\<\k\+\>\)\s\+\1"
                                        " Preben Peppe Guldberg:
                                        "syntax match BadWord "\(\<\k\+\>\)\s\+\1\>"
                                        " Gary Holloway:
                                        "syntax match BadWord "\(\<\k\+\>\)\(\_s\+\<\1\>\)\+"
                                        " Rober Montante:
                                        "syntax match BadWord "\(\<\w\+\>\)\_s*\1\>"
                                        " Benji Fisher:
                                        "syntax match BadWord "\<\(\k\+\)\_s\+\1\>"
                                        " Steve Hall:
                                        "syntax match BadWord "\(\<\k\+\>\)\s\+\1"

                                        Wouldn't that suffice?


                                        Steve Hall [ digitect@... ]
                                      • Preben 'Peppe' Guldberg
                                        Message 19 of 20 , Sep 3, 2002
                                          Thus wrote digitect@... (digitect@...) on [020903]:
                                          > From: "Charles E. Campbell" <cec@...>

                                          > > syn match DoubleWord "\s\(\k\+\)\_s\+\1\>"ms=s+1

                                          I think that should read "..."hs=s+1

                                          > > Thus the DoubleWord starts its recognition pass on a leading
                                          > > whitespace. Of course lines which begin with double-words like like
                                          > > this one would not be found.

                                          > Not sure I understand why it needs to begin with whitespace. All of
                                          > the possibilities so far use the "\<" word beginning to define the
                                          > start of the pattern:

                                          > " Preben Peppe Guldberg:
                                          > "syntax match BadWord "\(\<\k\+\>\)\s\+\1\>"

                                          Hi :-)

                                          > Wouldn't that suffice?

                                          Try the following commands in turn and watch the result

                                          " Maps to execute some vim ex commands.
                                          " With "mapleader=','", use ",@@" to execute.
                                          nmap <Leader>@@ yy:@"<CR>
                                          vmap <Leader>@@ y:@"<CR>
                                          "
                                          " Syntax items
                                          syn clear
                                          syn match Todo "\<\(\k\+\)\s\+\1\>"
                                          syn keyword Special good bad ugly
                                          syn match Error "\s\(\k\+\)\s\+\1\>"hs=s+1
                                          "
                                          " Example text:
                                          "
                                          " good good bad ugly ugly

                                          Note that the order of the Todo and Special lines doesn't matter. Keywords
                                          win over matches and regions at any time.

                                          Peppe
                                          --
                                          Preben 'Peppe' Guldberg __/-\__ "Before you criticize someone, walk
                                          peppe@... (o o) a mile in his shoes. That way, if
                                          -----------------------oOOo (_) oOOo-- he gets angry, he'll be a mile away
                                          http://www.xs4all.nl/~peppe/ - and barefoot." --Sarah Jackson
                                        • Charles E. Campbell
                                          Message 20 of 20 , Sep 3, 2002
                                            On Tue, Sep 03, 2002 at 11:57:15AM -0400, digitect@... wrote:
                                            > Not sure I understand why it needs to begin with whitespace. All of
                                            > the possibilities so far use the "\<" word beginning to define the
                                            > start of the pattern:
                                            >
                                            > " Bram Moolenaar:
                                            > "syntax match BadWord "\(\<\k\+\>\)\s\+\1"
                                            > " Preben Peppe Guldberg:
                                            > "syntax match BadWord "\(\<\k\+\>\)\s\+\1\>"
                                            > " Gary Holloway:
                                            > "syntax match BadWord "\(\<\k\+\>\)\(\_s\+\<\1\>\)\+"
                                            > " Rober Montante:
                                            > "syntax match BadWord "\(\<\w\+\>\)\_s*\1\>"
                                            > " Benji Fisher:
                                            > "syntax match BadWord "\<\(\k\+\)\_s\+\1\>"
                                            > " Steve Hall:
                                            > "syntax match BadWord "\(\<\k\+\>\)\s\+\1"
                                            >
                                            > Wouldn't that suffice?
                                            ---------------------------------------------------------------------

                                            In a word, no.

                                            It would suffice to pick up two identically misspelled words in a row,
                                            but here we're considering correctly spelled words that are doubled
                                            up. The misspelled words get caught as misspellings anyway.

                                            Consider the next line:

                                            one one is not two.

                                            At the point of the first "o", the word "one" is immediately
                                            identified as a GoodWord by the list of keywords. None of the matches
                                            given above then get a chance to identify "one one".

                                            The leading whitespace in the pattern allows the syntax highlighting
                                            to try out the match on a blank, which none of the keywords will do,
                                            and so the match gets a sporting chance to do its thing. Obviously
                                            this won't catch the "one one" in my example because there's no
                                            leading blank.

                                            The alternative method would work better; i.e., have all words recognized
                                            as part of some match ("\<\w\+\>") which contains GoodWords and
                                            BadWords. The doubled-word syntax matches then have a shot at
                                            identifying doubled words. There'll be a speed penalty because the
                                            word-recognizer pattern will be used almost everywhere.
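
A minimal, untested sketch of that container approach might look like the
following. The group names GoodWord, BadWord and MatchMaker come from earlier
in this thread; the four-word keyword list, the BadWord pattern and the
highlight links are assumptions made purely for illustration and are not how
engspchk itself is written.

```vim
" Sketch only: a tiny stand-in dictionary, not the real engspchk word list.
syn clear

" 'contained' keeps these items from matching at the top level, so the
" keywords no longer claim a word before any match gets to try.
syn keyword GoodWord contained one is not two
syn match   BadWord  contained "\<\k\+\>"

" Every word is first recognized by the container match; inside it, the
" GoodWord keywords still take priority over the BadWord match.
syn match MatchMaker "\<\k\+\>" contains=GoodWord,BadWord

" Defined last: when several matches start at the same position, the item
" defined last has priority, so a doubled word is claimed here.
syn match DoubleWord "\<\(\k\+\)\_s\+\1\>"

hi link BadWord    Error
hi link DoubleWord Todo
```

With these items, the example line "one one is not two." should show
"one one" as a DoubleWord even though it starts the line, because no
top-level keyword is left to win at the first "o".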

                                            Regards,
                                            Chip Campbell

                                            --
                                            Charles E Campbell, Jr, PhD _ __ __
                                            Goddard Space Flight Center / /_/\_\_/ /
                                            cec@... /_/ \/_//_/
                                            PGP public key: http://www.erols.com/astronaut/pgp.html