Re: More spell files available

  • Bram Moolenaar
    Message 1 of 8 , Sep 3, 2005
      Mikolaj Machowski wrote:

      > On Thursday, 1 September 2005 18:09, Bram Moolenaar wrote:
      > > - Try if suggestions make sense. You may set 'verbose' to see the
      > [cut]
      > > As always, suggestions are welcome.
      >
      > One thing with suggestions.
      >
      > Word: rzubr(badly spelled {z with dot above}ubr - bison-like animal
      > from Central Europe)
      >
      > set spelllang=pl
      > set spelllang=pl,en
      >
      > Correct spelling comes at the top.

      I notice that when adding ",en" the scoring changes. The sound-a-like
      mechanism for English is also used for Polish. Perhaps we should not do
      that? However, if you had used:
      :set spelllang=en,en-math
      Then you do want to use the English sound folding for en-math too.

      Perhaps you can add SOFO items to the Polish spell file? That would
      give better sound folding and suggestions. And we can avoid using the
      English sound folding for Polish.

      > set spelllang=en,pl
      >
      > Strange things happen. I understand English words have preference but
      > there are also other Polish words before {z.}ubr:

      The sound folding appears to change the scoring. It's strange though
      that "en,pl" differs so much from "pl,en".

      > Ideally, suggestions would be split into two lists, one for each
      > language. Unfortunately, if I remember correctly, this is not possible
      > because Vim creates one big word list in memory, with preference (in
      > this case) for suggestions taken from "en".

      Making two lists should not be necessary, since the scoring mechanism
      should find the best matching words. Thus it should recognize the
      language implicitly. Perhaps it would be useful to indicate what word
      list the suggestion came from.
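
One way to see which word list a suggestion comes from, without changing Vim itself, is to query each language separately. This is only a minimal sketch and not how Vim builds its suggestion list: `SuggestPerLang` is a hypothetical helper name, and it assumes 'spell' is set in the current window and that spell files for every language in 'spelllang' are installed. Because each language is queried alone, the scores are not comparable across languages.

```vim
" Return a Dictionary mapping each language in 'spelllang' to its own
" list of suggestions for a:word (at most a:max per language).
function! SuggestPerLang(word, max) abort
  let saved = &l:spelllang
  let result = {}
  try
    for lang in split(saved, ',')
      " Restrict suggestions to one language at a time.
      let &l:spelllang = lang
      let result[lang] = spellsuggest(a:word, a:max)
    endfor
  finally
    " Always restore the user's setting, even after an error.
    let &l:spelllang = saved
  endtry
  return result
endfunction

" Example: show up to 5 suggestions per language for the misspelled word.
echo SuggestPerLang('rzubr', 5)
```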

      --
      hundred-and-one symptoms of being an internet addict:
      168. You have your own domain name.

      /// Bram Moolenaar -- Bram@... -- http://www.Moolenaar.net \\\
      /// Sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
      \\\ Project leader for A-A-P -- http://www.A-A-P.org ///
      \\\ Buy LOTR 3 and help AIDS victims -- http://ICCF.nl/lotr.html ///
    • Mikolaj Machowski
      Message 2 of 8 , Sep 3, 2005
        On Saturday, 3 September 2005 12:12, Bram Moolenaar wrote:
        > > > - Try if suggestions make sense. You may set 'verbose' to see the
        > >
        > > [cut]
        > >
        > > > As always, suggestions are welcome.
        > >
        > > One thing with suggestions.
        > >
        > > Word: rzubr(badly spelled {z with dot above}ubr - bison-like animal
        > > from Central Europe)
        > >
        > > set spelllang=pl
        > > set spelllang=pl,en
        > >
        > > Correct spelling comes at the top.
        >
        > I notice that when adding ",en" the scoring changes. The sound-a-like
        > mechanism for English is also used for Polish. Perhaps we should not do
        > that?

        That would be best.

        > However, if you had used:
        > :set spelllang=en,en-math

        Can Vim recognise the difference between en,pl (or any other pair of
        language codes)?

        If only it could treat en,pl differently but use the same technique
        for en,en-math...

        > Then you do want to use the English sound folding for en-math too.
        >
        > Perhaps you can add SOFO items to the Polish spell file? That would
        > give better sound folding and suggestions. And we can avoid using the
        > English sound folding for Polish.

        I don't think so. As I understand from ":help SOFO", this is a
        letter-for-letter mechanism, while in Polish there are many
        one-letter-for-two-letters exchanges.

        I also ran some tests, and only the use of REP made a significant
        improvement in suggestions.

        > > set spelllang=en,pl
        > >
        > > Strange things happen. I understand English words have preference but
        > > there are also other Polish words before {z.}ubr:
        >
        > The sound folding appears to change the scoring. It's strange though
        > that "en,pl" differs so much from "pl,en".

        I understood that the first language enforces its rules on the second
        (and all following) languages. That is quite logical, but as the
        example I posted shows, it has some strange effects.

        > > Ideally, suggestions would be split into two lists, one for each
        > > language. Unfortunately, if I remember correctly, this is not
        > > possible because Vim creates one big word list in memory, with
        > > preference (in this case) for suggestions taken from "en".
        >
        > Making two lists should not be necessary, since the scoring mechanism
        > should find the best matching words. Thus it should recognize the
        > language implicitly. Perhaps it would be useful to indicate what word
        > list the suggestion came from.

        Yes. And the list could be sorted by this indication (to group the
        suggestions).

        Maybe Vim could also guess which language is currently being used.
        I proposed this previously: take the current line with 1 or 2 lines of
        context on each side (3-5 lines in total) and spell check it against
        each language from 'spelllang' separately. Give priority to the
        language with the lowest number of errors.

        Pseudo-code:

        let spelllang_set = &spelllang
        let langlist = split(&spelllang, ',')
        let langbads = {}
        for i in langlist
          let &spelllang = i
          let text = getline(line(".")-2, line(".")+2)
          " Get rid of punctuation and split the text by whitespace.
          let wordlist = split(substitute(join(text), '[[:punct:]]', ' ', 'g'))
          let counter = 0
          for k in wordlist
            " spellsuggest() may return an empty list, so use get().
            if tolower(k) != tolower(get(spellsuggest(k, 1), 0, ''))
              let counter += 1
            endif
          endfor
          " Note: languages with the same error count overwrite each other.
          let langbads[counter] = i
        endfor

        " Now we have a dictionary like {"20":"en", "3":"pl"}. It is quite safe
        " to assume that in this situation we want to write in pl, so
        let &spelllang = langbads[min(keys(langbads))]
        " Hmm. I remember some problems with remapping of z?
        normal! z?
        let &spelllang = spelllang_set

        It would be faster if built into the Vim binary. Maybe as an item for
        'spellsuggest': "lang:2", where the number would be the number of
        context lines.

        One problem remains: special dictionaries. There would hardly be any
        text written entirely in en-math.

        m.
      • Bram Moolenaar
        Message 3 of 8 , Sep 4, 2005
          Mikolaj Machowski wrote:

          > > I notice that when adding ",en" the scoring changes. The
          > > sound-a-like mechanism for English is also used for Polish. Perhaps
          > > we should not do that?
          >
          > It would be the best.

          OK, I'll look into using the sound folding only for the language it is
          specified for.

          > > However, if you had used:
          > > :set spelllang=en,en-math
          >
          > Can Vim recognise the difference between en,pl (or any other pair of
          > language codes)?
          >
          > If only it could treat en,pl differently but use the same technique
          > for en,en-math...

          The main issue would actually be the additions. This is what someone
          adds to his personal dictionary with "zg". You do want sound folding
          for that.

          Otherwise, if there is a language specified with two letters, it would
          be possible to use the same sound folding for other languages with these
          letters that don't specify sound folding itself. That would work for
          "en", "en-math", "en-whatever". Hopefully this isn't too tricky.
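
The fallback rule described above can be sketched in Vim script. This is only an illustration of the selection logic, not Vim's actual implementation: `g:has_folding` and `FoldingFor` are hypothetical names, and the table of languages that define sound folding is made up for the example.

```vim
" Hypothetical table of languages whose spell files define sound folding.
let g:has_folding = {'en': 1}

" Return the language whose sound folding should be used for a:lang,
" or '' when none applies.
function! FoldingFor(lang, langlist) abort
  " A language that specifies its own sound folding always uses it.
  if has_key(g:has_folding, a:lang)
    return a:lang
  endif
  " Otherwise borrow from a listed two-letter language with the same prefix.
  let prefix = a:lang[0:1]
  for other in a:langlist
    if other ==# prefix && has_key(g:has_folding, other)
      return other
    endif
  endfor
  return ''
endfunction

echo FoldingFor('en-math', ['en', 'en-math'])  " echoes en
echo FoldingFor('pl', ['en', 'pl'])            " echoes an empty string
```

With this rule "en-math" borrows the English folding, while "pl" gets none unless the Polish spell file defines its own.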

          > > Perhaps you can add SOFO items to the Polish spell file? That would
          > > give better sound folding and suggestions. And we can avoid using the
          > > English sound folding for Polish.
          >
          > I don't think so. As I understand from ":help SOFO", this is a
          > letter-for-letter mechanism, while in Polish there are many
          > one-letter-for-two-letters exchanges.

          You would need to use SAL items then. That's a lot more complicated,
          but also provides the possibility for more accurate sounds-a-like
          matching.

          > I also ran some tests, and only the use of REP made a significant
          > improvement in suggestions.

          OK. You could suggest this to the maintainers of the Polish word list.

          > Maybe also Vim could guess which language is currently used.
          > I proposed it previously: Get current line with 1 or 2 lines of context
          > (3-5 lines total), pass it to spell checking probing each language from
          > spelllang separately. Give priority to settings of language with lower
          > number of errors.

          It's possible, but in border cases this will go wrong. Especially when
          mixing short lines of Polish and English. I also think there is not
          much use for it, since Vim already supports mixing languages.

          --
          "Hit any key to continue" it said, but nothing happened after F sharp.
        • Mikolaj Machowski
          Message 4 of 8 , Sep 4, 2005
            On Sunday, 4 September 2005 17:52, Bram Moolenaar wrote:
            > > > Perhaps you can add SOFO items to the Polish spell file? That would
            > > > give better sound folding and suggestions. And we can avoid using
            > > > the English sound folding for Polish.
            > >
            > > I don't think so. As I understand from ":help SOFO", this is a
            > > letter-for-letter mechanism, while in Polish there are many
            > > one-letter-for-two-letters exchanges.
            >
            > You would need to use SAL items then. That's a lot more complicated,
            > but also provides the possibility for more accurate sounds-a-like
            > matching.
            >
            see below
            > > I also ran some tests, and only the use of REP made a significant
            > > improvement in suggestions.
            >
            > OK. You could suggest this to the maintainers of the Polish word list.

            I tested it 2 months ago, and only REP gave a significant
            improvement in suggestions (SAL only a slight one). My REP lines
            have been in the kurnik files since the beginning of July.

            m.