More spell files available

  • Bram Moolenaar
    Message 1 of 8 , Sep 1, 2005
      I have made spell files for all the languages that Myspell supports.
      You can download them directly from:
      ftp://ftp.vim.org/pub/vim/unstable/runtime/spell/

      This requires a recent snapshot, since the spell file format was
      changed about a week ago. I don't expect the format to change again,
      since I now use sections for different parts of information. Thus when
      adding something for compound words, spell files that don't use compound
      words will remain valid.

      The Finnish one is missing: the zip file that Myspell refers to is for
      hyphenation, not spelling. I don't know where to find the actual
      Finnish .aff and .dic files.

      I'm still working on compound words. This applies to languages such as
      Hungarian and German. Unfortunately, there is no good definition for
      this yet. The Myspell and Aspell support is very limited. I'm working
      together with hunspell to improve this
      (http://hunspell.sourceforge.net/).

      Please try the spell files and check for problems:

      - If good words are flagged as wrong, or bad words are not flagged,
      first check if the word is in the Myspell word list. If the list
      appears to be OK, it is probably a bug in Vim; report it to me with an
      example. If the word list is wrong, try contacting the maintainer
      (he should be mentioned in the README file). If you fail to reach the
      maintainer, we could fix it for Vim only by making a diff to the Myspell
      word list.

      - Check whether the suggestions make sense. You may set 'verbose' to see the
      score. If the right suggestion isn't given or there are suggestions
      for bad words, report it to me. If the scoring isn't very good, check
      the items in the .aff file, such as MAP (for accented letters) and REP
      (replacing letter groups).
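A minimal sketch of the check described in the list above. It assumes the English spell file is installed and uses `z=`, the suggestion command of released Vim 7 (this thread refers to it as `z?`). The misspelled word is only an illustration.

```vim
" Enable spell checking for the current buffer with the English word list.
setlocal spell spelllang=en
" A non-zero 'verbose' makes the suggestion list show a score per entry.
set verbose=1
" With the cursor on a flagged word, z= opens the suggestion list.
" The same suggestions are also available from script:
echo spellsuggest('speling', 5)
```

If the expected word is missing from the list, the word together with the 'spelllang' value makes a useful bug report.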

      If you want to try out building the spell files yourself and perhaps
      make a few changes, get the latest snapshot
      (http://www.vim.org/develop.php) and install Aap
      (http://www.a-a-p.org/download.html). After building Vim run "aap" in
      the spell directory of your language and Aap will fetch the files, apply
      patches and build the spell file.

      As always, suggestions are welcome.

      --
      hundred-and-one symptoms of being an internet addict:
      154. You fondle your mouse.

      /// Bram Moolenaar -- Bram@... -- http://www.Moolenaar.net \\\
      /// Sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
      \\\ Project leader for A-A-P -- http://www.A-A-P.org ///
      \\\ Buy LOTR 3 and help AIDS victims -- http://ICCF.nl/lotr.html ///
    • Tony Mechelynck
      Message 2 of 8 , Sep 1, 2005
        ----- Original Message -----
        From: "Bram Moolenaar" <Bram@...>
        To: <vim-dev@...>
        Sent: Thursday, September 01, 2005 6:09 PM
        Subject: More spell files available


        >
        > I have made spell files for all the languages that Myspell supports.
        > You can download them directly from:
        > ftp://ftp.vim.org/pub/vim/unstable/runtime/spell/
        >
        > This requires a recent snapshot, since the spell file format has been
        > changed about a week ago. I don't expect the format to change again,
        > since I now use sections for different parts of information. Thus when
        > adding something for compound words, spell files that don't use compound
        > words will remain valid.
        [...]

        OK, if the next snapshot includes them they will find themselves in my next
        W32 distribution, see http://users.skynet.be/antoine.mechelynck/vim/ --
        check the "last change" at the top of the page, then click "The experimental
        Vim 7" to get to the relevant paragraph. (Read before downloading! No
        warranty, no reimbursements).

        However, since I go to town this evening, I don't know exactly _when_ I will
        compile the next snapshot. That's why it's important to check the change
        date (at top of page) and/or the snapshot number (part of the .zip filename;
        yesterday's was 0139). The "last change" timestamp is in UTC, which is 2
        hours earlier than the current "official time" where I live (Central
        European summer time, zone +0200). OTOH the "compile date/time" in the
        ":version" listing of my builds is my "time zone time" which explains why
        the "compile date" in the ":version" listing can be later (by up to approx.
        2 hours) than the "change date" on the HTML page. If you know your own time
        zone (compared to UTC) you can determine when my latest build was produced.
        <OFFTOPIC>
        Don't confuse adding and subtracting: the sun rises in the East and sets in
        the West, therefore "midday" happens progressively later, the more you go to
        the West (on a world map whose left and right edges coincide with the
        International Date Line, which runs approximately but not exactly
        North-South through the Pacific and makes a zigzag between Kamchatka and
        Alaska). IOW when it is 12:00 UTC (in 24-hour notation, i.e., midday) it is
        "morning" (on the same day) in the Americas and "afternoon" (also on the
        same day) in most of the other continents.
        </OFFTOPIC>


        Best regards,
        Tony.
      • Tony Mechelynck
        Message 3 of 8 , Sep 1, 2005
          ----- Original Message -----
          From: "Tony Mechelynck" <antoine.mechelynck@...>
          To: <vim-dev@...>; "Bram Moolenaar" <Bram@...>
          Sent: Thursday, September 01, 2005 6:55 PM
          Subject: Re: More spell files available


          > ----- Original Message -----
          > From: "Bram Moolenaar" <Bram@...>
          > To: <vim-dev@...>
          > Sent: Thursday, September 01, 2005 6:09 PM
          > Subject: More spell files available
          >
          >
          >>
          >> I have made spell files for all the languages that Myspell supports.
          >> You can download them directly from:
          >> ftp://ftp.vim.org/pub/vim/unstable/runtime/spell/
          >>
          >> This requires a recent snapshot, since the spell file format has been
          >> changed about a week ago. I don't expect the format to change again,
          >> since I now use sections for different parts of information. Thus when
          >> adding something for compound words, spell files that don't use compound
          >> words will remain valid.
          > [...]
          >
          > OK, if the next snapshot includes them they will find themselves in my
          > next W32 distribution, see
          > http://users.skynet.be/antoine.mechelynck/vim/ -- check the "last change"
          > at the top of the page, then click "The experimental Vim 7" to get to the
          > relevant paragraph. (Read before downloading! No warranty, no
          > reimbursements).
          >
          > However, since I go to town this evening, I don't know exactly _when_ I
          > will compile the next snapshot. That's why it's important to check the
          > change date (at top of page) and/or the snapshot number (part of the .zip
          > filename; yesterday's was 0139). [...]

          OK, it's done. Snapshot 140 distribution for W32 has just been uploaded and
          there are indeed a lot of new spell-related files which didn't exist
          yesterday.

          Happy Vimming!
          Tony.
        • Mikolaj Machowski
          Message 4 of 8 , Sep 2, 2005
            On Thursday, 1 September 2005 18:09, Bram Moolenaar wrote:
            > - Try if suggestions make sense. You may set 'verbose' to see the
            [cut]
            > As always, suggestions are welcome.

            One thing with suggestions.

            Word: rzubr (a misspelling of żubr, a bison-like animal from Central
            Europe)

            set spelllang=pl
            set spelllang=pl,en

            Correct spelling comes at the top.

            set spelllang=en,pl

            Strange things happen. I understand that English words have preference,
            but other Polish words also come before żubr:

            1 "Rysu br"
            2 "Sabr"
            3 "Issuer"
            4 "Rubra"
            5 "Rs suer"
            6 "Rósłby" <- R{oacute}słby
            7 "Tsuby"
            8 "Tsuba"
            9 "Tsubo"
            10 "Tsubą" <- Tsub{aogonek}
            11 "Tsubę" <- Tsub{eogonek}
            12 "Reube"
            13 "Rubs"
            14 "Ruby"
            15 "Subj"
            16 "Subs"
            17 "Tsub"
            18 "Sutr"
            19 "Reub"
            20 "Rube"
            21 "Rubi"
            22 "Ruhr"
            23 "Żubr" <- correct word {Z.}ubr

            Ideally the suggestions would be split into two lists, one for each
            language. Unfortunately, if I remember correctly, this is not possible
            because Vim creates one big word list in memory, with preference (in
            this case) for suggestions taken from "en".
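The comparison reported above can be repeated with a small helper. This is a sketch only: `CompareOrders` is a hypothetical name, and it assumes 'spell' is set and that the spell files for every listed language are installed.

```vim
" Hypothetical helper: collect the top suggestions for one word under
" several 'spelllang' orders so they can be compared side by side.
function! CompareOrders(word, orders, limit) abort
  let saved = &l:spelllang
  let result = {}
  try
    for order in a:orders
      let &l:spelllang = order
      let result[order] = spellsuggest(a:word, a:limit)
    endfor
  finally
    " Always restore the user's setting, even after an error.
    let &l:spelllang = saved
  endtry
  return result
endfunction

" Usage (assumes the pl and en spell files are installed):
"   setlocal spell
"   echo CompareOrders('rzubr', ['pl,en', 'en,pl'], 5)
```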

            m.
          • Bram Moolenaar
            Message 5 of 8 , Sep 3, 2005
              Mikolaj Machowski wrote:

              > On Thursday, 1 September 2005 18:09, Bram Moolenaar wrote:
              > > - Try if suggestions make sense. You may set 'verbose' to see the
              > [cut]
              > > As always, suggestions are welcome.
              >
              > One thing with suggestions.
              >
              > Word: rzubr(badly spelled {z with dot above}ubr - bison-like animal
              > from Central Europe)
              >
              > set spelllang=pl
              > set spelllang=pl,en
              >
              > Correct spelling comes at the top.

              I notice that when adding ",en" the scoring changes. The sound-a-like
              mechanism for English is also used for Polish. Perhaps we should not do
              that? However, if you had used:
              :set spelllang=en,en-math
              then you do want to use the English sound folding for en-math too.

              Perhaps you can add SOFO items to the Polish spell file? That would
              give better sound folding and suggestions. And we can avoid using the
              English sound folding for Polish.

              > set spelllang=en,pl
              >
              > Strange things happen. I understand English words have preference but
              > there are also other Polish words before {z.}ubr:

              The sound folding appears to change the scoring. It's strange though
              that "en,pl" differs so much from "pl,en".

              > Ideal would be to split suggestions in two lists, one for each language.
              > Unfortunately if I remember correctly this is not possible because Vim
              > creates in memory one, big word list with preferences (in this case) for
              > suggestions taken from "en".

              Making two lists should not be necessary, since the scoring mechanism
              should find the best matching words. Thus it should recognize the
              language implicitly. Perhaps it would be useful to indicate what word
              list the suggestion came from.
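One way to see which word list a suggestion belongs to is to test each suggestion against every language separately. The sketch below is an assumption about how that could be scripted, not how Vim computes suggestions internally; `SuggestionSources` is a hypothetical name, and 'spell' must be set with the relevant spell files installed.

```vim
" Hypothetical helper: for each suggestion, list the languages from
" 'spelllang' whose word list accepts it.
function! SuggestionSources(suggestions) abort
  let saved = &l:spelllang
  let sources = {}
  try
    for lang in split(saved, ',')
      let &l:spelllang = lang
      for word in a:suggestions
        " spellbadword() returns an empty word when nothing is misspelled.
        if spellbadword(word)[0] ==# ''
          let sources[word] = get(sources, word, []) + [lang]
        endif
      endfor
    endfor
  finally
    let &l:spelllang = saved
  endtry
  return sources
endfunction

" Usage (assumes the pl and en spell files are installed):
"   setlocal spell spelllang=en,pl
"   echo SuggestionSources(spellsuggest('rzubr', 10))
```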

              --
              hundred-and-one symptoms of being an internet addict:
              168. You have your own domain name.

            • Mikolaj Machowski
              Message 6 of 8 , Sep 3, 2005
                On Saturday, 3 September 2005 12:12, Bram Moolenaar wrote:
                > > > - Try if suggestions make sense. You may set 'verbose' to see the
                > >
                > > [cut]
                > >
                > > > As always, suggestions are welcome.
                > >
                > > One thing with suggestions.
                > >
                > > Word: rzubr(badly spelled {z with dot above}ubr - bison-like animal
                > > from Central Europe)
                > >
                > > set spelllang=pl
                > > set spelllang=pl,en
                > >
                > > Correct spelling comes at the top.
                >
                > I notice that when adding ",en" the scoring changes. The sound-a-like
                > mechanism for English is also used for Polish. Perhaps we should not do
                > that?

                That would be best.

                > However, if you would have used:
                > :set spelllang=en,en-math

                Can Vim tell en,pl (or any other pair of language codes) apart from
                en,en-math?

                If it could treat en,pl differently but keep using the same technique
                for en,en-math...

                > Then you do want to use the English sound folding for en-math too.
                >
                > Perhaps you can add SOFO items to the Polish spell file? That would
                > give better sound folding and suggestions. And we can avoid using the
                > English sound folding for Polish.

                I don't think so. As I understand from ":help SOFO", this is a
                letter-for-letter mechanism, while in Polish there are many
                one-letter-for-two-letters exchanges.

                I also ran some tests, and only the use of REP made a significant
                improvement in suggestions.
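The difference between the two mechanisms can be sketched in plain Vim script. This is an illustration of the idea only, not the spell engine's implementation: the function names and the letter table are hypothetical, and 'encoding' is assumed to be utf-8.

```vim
scriptencoding utf-8

" One-to-one folding in the spirit of SOFO: every character maps to
" exactly one character, so a two-letter group cannot become one letter.
function! FoldLetters(word) abort
  return tr(a:word, 'żóąę', 'zuae')
endfunction

" Group replacement in the spirit of a REP item: a letter group of any
" length is replaced by another group.
function! ReplaceGroup(word, from, to) abort
  return substitute(a:word, '\V' . escape(a:from, '\'), escape(a:to, '\&'), 'g')
endfunction

" Usage:
"   echo FoldLetters('żubr')
"   echo ReplaceGroup('rzubr', 'rz', 'ż')
```

Only the second function can turn "rz" into "ż", which is the kind of exchange the Polish word list needs.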

                > > set spelllang=en,pl
                > >
                > > Strange things happen. I understand English words have preference but
                > > there are also other Polish words before {z.}ubr:
                >
                > The sound folding appears to change the scoring. It's strange though
                > that "en,pl" differs so much from "pl,en".

                I understood that the first language enforces its rules on the second
                (and every following) language. That is quite logical, but as the
                example I posted shows, it has some strange effects.

                > > Ideal would be to split suggestions in two lists, one for each
                > > language. Unfortunately if I remember correctly this is not possible
                > > because Vim creates in memory one, big word list with preferences (in
                > > this case) for suggestions taken from "en".
                >
                > Making two lists should not be necessary, since the scoring mechanism
                > should find the best matching words. Thus it should recognize the
                > language implicitly. Perhaps it would be useful to indicate what word
                > list the suggestion came from.

                Yes. And the list could be sorted by this indication (to group the
                suggestions).

                Maybe Vim could also guess which language is currently being used.
                I proposed this previously: take the current line with 1 or 2 lines of
                context on each side (3-5 lines total) and spell check it with each
                language from spelllang separately. Give priority to the settings of
                the language with the lowest number of errors.

                Pseudo-code:

                let spelllang_set = &spelllang
                let langlist = split(&spelllang, ',')
                let langbads = {}
                " Current line plus two lines of context on each side, with
                " punctuation removed, split on whitespace.
                let text = join(getline(max([1, line(".")-2]), line(".")+2))
                let wordlist = split(substitute(text, '[[:punct:]]', ' ', 'g'))
                for i in langlist
                  let &spelllang = i
                  let counter = 0
                  for k in wordlist
                    " spellbadword() returns an empty word when k is accepted
                    " (requires 'spell' to be set).
                    if spellbadword(k)[0] != ''
                      let counter += 1
                    endif
                  endfor
                  " Note: languages with the same count overwrite each other.
                  let langbads[counter] = i
                endfor

                " Now we have a dictionary such as {"20":"en", "3":"pl"}. In this
                " situation it is quite safe to assume we want to write in pl, so
                let &spelllang = langbads[min(keys(langbads))]
                " Hmm. I remember some problems with remapping of z?
                normal! z?
                let &spelllang = spelllang_set

                It would be faster if implemented in Vim itself. Maybe as an option
                for 'spellsuggest': "lang:2", where the number is the number of
                context lines.

                One problem remains: special dictionaries. Hardly any text would be
                written entirely in en-math.

                m.
              • Bram Moolenaar
                Message 7 of 8 , Sep 4, 2005
                  Mikolaj Machowski wrote:

                  > > I notice that when adding ",en" the scoring changes. The
                  > > sound-a-like mechanism for English is also used for Polish. Perhaps
                  > > we should not do that?
                  >
                  > It would be the best.

                  OK, I'll look into using the sound folding only for the language it is
                  specified for.

                  > > However, if you would have used:
                  > > :set spelllang=en,en-math
                  >
                  > Can Vim recognise difference between en,pl (or any other lang code?)
                  >
                  > If could make difference for en,pl but use the same technic for
                  > en,en-math...

                  The main issue would actually be the additions. This is what someone
                  adds to his personal dictionary with "zg". You do want sound folding
                  for that.

                  Otherwise, if a language is specified with two letters, it would be
                  possible to use its sound folding for other languages that start with
                  the same two letters and don't specify sound folding themselves. That
                  would work for "en", "en-math", "en-whatever". Hopefully this isn't
                  too tricky.
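The proposed rule can be written down as a small lookup. This is a sketch of the rule as stated in the paragraph above, not of Vim's eventual implementation; `SoundFoldSource` and its `has_own` argument are hypothetical.

```vim
" Hypothetical helper: for each language in a 'spelllang'-style list,
" name the language whose sound folding would be used under the rule.
" a:has_own maps a language name to 1 when its spell file defines
" sound folding itself.
function! SoundFoldSource(langs, has_own) abort
  let result = {}
  for lang in a:langs
    let base = strpart(lang, 0, 2)
    if get(a:has_own, lang, 0)
      " The language brings its own sound folding.
      let result[lang] = lang
    elseif index(a:langs, base) >= 0 && get(a:has_own, base, 0)
      " Borrow from the two-letter language with the same prefix.
      let result[lang] = base
    else
      " No sound folding available for this language.
      let result[lang] = ''
    endif
  endfor
  return result
endfunction

" Usage:
"   echo SoundFoldSource(['en', 'en-math', 'pl'], {'en': 1})
```

Under this rule "en-math" borrows from "en", while "pl" gets no sound folding until its own spell file defines some.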

                  > > Perhaps you can add SOFO items to the Polish spell file? That would
                  > > give better sound folding and suggestions. And we can avoid using the
                  > > English sound folding for Polish.
                  >
                  > Don't think so. As i understand from ":help SOFO" this is
                  > letter-for-letter mechanism while in Polish there is many
                  > letter-for-2letters exchanges.

                  You would need to use SAL items then. That's a lot more complicated,
                  but also provides the possibility for more accurate sounds-a-like
                  matching.

                  > Also made some tests and only use of REP was making significant
                  > improvement in suggestions.

                  OK. You could suggest this to the maintainers of the Polish word list.

                  > Maybe also Vim could guess which language is currently used.
                  > I proposed it previously: Get current line with 1 or 2 lines of context
                  > (3-5 lines total), pass it to spell checking probing each language from
                  > spelllang separately. Give priority to settings of language with lower
                  > number of errors.

                  It's possible, but in border cases this will go wrong. Especially when
                  mixing short lines of Polish and English. I also think there is not
                  much use for it, since Vim already supports mixing languages.

                  --
                  "Hit any key to continue" it said, but nothing happened after F sharp.

                • Mikolaj Machowski
                  Message 8 of 8 , Sep 4, 2005
                    On Sunday, 4 September 2005 17:52, Bram Moolenaar wrote:
                    > > > Perhaps you can add SOFO items to the Polish spell file? That would
                    > > > give better sound folding and suggestions. And we can avoid using
                    > > > the English sound folding for Polish.
                    > >
                    > > Don't think so. As i understand from ":help SOFO" this is
                    > > letter-for-letter mechanism while in Polish there is many
                    > > letter-for-2letters exchanges.
                    >
                    > You would need to use SAL items then. That's a lot more complicated,
                    > but also provides the possibility for more accurate sounds-a-like
                    > matching.
                    >
                    see below
                    > > Also made some tests and only use of REP was making significant
                    > > improvement in suggestions.
                    >
                    > OK. You could suggest this to the maintainers of the Polish word list.

                    I tested it two months ago, and only REP gave a significant
                    improvement in suggestions (SAL only a slight one). My REP lines are
                    in the kurnik files from the beginning of July.

                    m.