Re: "flexwiki" ftplugin causing problems ('bomb')

  • Ron Aaron
    Message 1 of 17 , May 3, 2010
      On Monday 03 May 2010 23:12:42 Bram Moolenaar wrote:

      > What I did now is to disable recognizing .wiki files as flexwiki.
      > Someone still using these files can re-enable it when needed.
      >
      > I can't find another file format that uses the .wiki extension.
      > Mediawiki uses .mw.

      It's common to use the '.wiki' for any wiki text file; so making both it and
      '.mw' load MediaWiki syntax makes sense.

      --
      For privacy, my GPG key signature is: AD29415D
    • Bram Moolenaar
      Message 2 of 17 , May 4, 2010
        Ron Aaron wrote:

        > On Monday 03 May 2010 23:12:42 Bram Moolenaar wrote:
        >
        > > What I did now is to disable recognizing .wiki files as flexwiki.
        > > Someone still using these files can re-enable it when needed.
        > >
        > > I can't find another file format that uses the .wiki extension.
        > > Mediawiki uses .mw.
        >
        > It's common to use the '.wiki' for any wiki text file; so making both
        > it and '.mw' load MediaWiki syntax makes sense.

        There is no MediaWiki syntax file.

        --
        hundred-and-one symptoms of being an internet addict:
        9. All your daydreaming is preoccupied with getting a faster connection to the
        net: 28.8...ISDN...cable modem...T1...T3.

        /// Bram Moolenaar -- Bram@... -- http://www.Moolenaar.net \\\
        /// sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
        \\\ download, build and distribute -- http://www.A-A-P.org ///
        \\\ help me help AIDS victims -- http://ICCF-Holland.org ///

        --
        You received this message from the "vim_dev" maillist.
        Do not top-post! Type your reply below the text you are replying to.
        For more information, visit http://www.vim.org/maillist.php
      • Ron Aaron
        Message 3 of 17 , May 4, 2010
          On Tuesday 04 May 2010 21:52:54 Bram Moolenaar wrote:
          >

          > There is no MediaWiki syntax file.

          Sorry, it's called Wikipedia.

          --
          Sending me something private?
          Use my GPG public key: AD29415D
        • Charles Campbell
          Message 4 of 17 , May 4, 2010
            Ron Aaron wrote:
            > On Tuesday 04 May 2010 21:52:54 Bram Moolenaar wrote:
            >
            >
            >
            >> There is no MediaWiki syntax file.
            >>
            >
            > Sorry, it's called Wikipedia.
            >
            >
            Hello!

            Ron, Bram was wanting a Wikipedia syntax file. I can't vouch for it,
            but perhaps you mean the one in:

            http://www.vim.org/scripts/script.php?script_id=1787

            Regards,
            Chip Campbell

          • ron
            Message 5 of 17 , May 4, 2010
              On May 4, 11:57 pm, Charles Campbell <Charles.E.Campb...@...>
              wrote:

              > Ron, Bram was wanting a Wikipedia syntax file.  I can't vouch for it,
              > but perhaps you mean the one in:
              >
              > http://www.vim.org/scripts/script.php?script_id=1787

              I think that's the one I use, yes.

            • Tony Mechelynck
              Message 6 of 17 , Jun 27, 2010
                On 03/05/10 23:45, Lech Lorens wrote:
                [...]
                > I might be totally wrong basing my understanding of BOM and character
                > sets mainly on Wikipedia, but I thought that setting 'bomb' for utf-8
                > encoded files (which does not pose a risk of misinterpreting the
                > contents due to endianness difference) didn't make much sense. For
                > utf-16 that would be another thing.
                >
                > http://en.wikipedia.org/wiki/Byte-order_mark
                >

                Notwithstanding its name, the BOM provides more than just endianness
                detection. Actually, it is an "encoding signal" which allows detecting
                all five of the following encodings, assuming a UTF-16le file won't
                start with a NULL:

                utf-16be FE FF
                utf-16le FF FE
                utf-8 EF BB BF
                utf-32be 00 00 FE FF
                utf-32le FF FE 00 00
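The table above can be turned into a small sniffing routine. What follows is only a minimal sketch in Python, not anything Vim itself does: the names `BOM_TABLE` and `sniff_bom` are made up for illustration, and the constants come from the standard `codecs` module. The longer signatures are tested first because the UTF-32le BOM begins with the same two bytes as the UTF-16le BOM.

```python
import codecs

# Hypothetical lookup table mirroring the list above.
# Longest signatures first: the UTF-32le BOM (FF FE 00 00) starts with the
# UTF-16le BOM (FF FE), so it must be checked before UTF-16le.
BOM_TABLE = [
    (codecs.BOM_UTF32_BE, "utf-32be"),  # 00 00 FE FF
    (codecs.BOM_UTF32_LE, "utf-32le"),  # FF FE 00 00
    (codecs.BOM_UTF8, "utf-8"),         # EF BB BF
    (codecs.BOM_UTF16_BE, "utf-16be"),  # FE FF
    (codecs.BOM_UTF16_LE, "utf-16le"),  # FF FE
]


def sniff_bom(data: bytes):
    """Return (encoding_name, bom_length), or (None, 0) if no BOM is found."""
    for bom, name in BOM_TABLE:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0
```

With this ordering, a UTF-16le file whose first character really is U+0000 would be reported as UTF-32le, which is the ambiguity the "won't start with a NULL" assumption sets aside.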

                For instance, when I was still on XP, I noticed that WordPad could read
                UTF-8 files but only if they started with a BOM. When writing what it
                called "Unicode", what it produced was UTF-16le with BOM.

                Any file starting 0xEF 0xBB 0xBF can be assumed to be in UTF-8.
                Distinguishing UTF-8 from Latin1 or Windows-1252 would otherwise require
                scanning the whole file, checking for invalid UTF-8 byte sequences.
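For comparison, the whole-file scan mentioned above might look like the following Python sketch. The helper name `looks_like_utf8` is hypothetical; it relies on the strict UTF-8 decoder in the standard library, which raises `UnicodeDecodeError` on the first invalid byte sequence.

```python
def looks_like_utf8(data: bytes) -> bool:
    """Return True if every byte sequence in data is valid UTF-8."""
    try:
        data.decode("utf-8")  # strict decoding: fails on any invalid sequence
    except UnicodeDecodeError:
        return False
    return True
```

Note that a pure US-ASCII file passes this check and is equally valid Latin1 or Windows-1252, so the scan can rule UTF-8 out but cannot always confirm it; a BOM removes that doubt after three bytes.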


                Best regards,
                Tony.
                --
                Life is a gift, living is an art. (Bram Moolenaar)

              • Benjamin R. Haskell
                Message 7 of 17 , Jun 27, 2010
                  On Sun, 27 Jun 2010, Tony Mechelynck wrote:

                  > On 03/05/10 23:45, Lech Lorens wrote:
                  > [...]
                  > > I might be totally wrong basing my understanding of BOM and
                  > > character sets mainly on Wikipedia, but I thought that setting
                  > > 'bomb' for utf-8 encoded files (which does not pose a risk of
                  > > misinterpreting the contents due to endianness difference) didn't
                  > > make much sense. For utf-16 that would be another thing.
                  > >
                  > > http://en.wikipedia.org/wiki/Byte-order_mark
                  > >
                  >
                  > Notwithstanding its name, the BOM provides more than just endianness
                  > detection. Actually, it is an "encoding signal" which allows detecting
                  > all five of the following encodings, assuming a UTF-16le file won't
                  > start with a NULL:
                  >
                  > utf-16be FE FF
                  > utf-16le FF FE
                  > utf-8 EF BB BF
                  > utf-32be 00 00 FE FF
                  > utf-32le FF FE 00 00
                  >
                  > For instance, when I was still on XP, I noticed that WordPad could
                  > read UTF-8 files but only if they started with a BOM. When writing
                  > what it called "Unicode", what it produced was UTF-16le with BOM.
                  >
                  > Any file starting 0xEF 0xBB 0xBF can be assumed to be in UTF-8.
                  > Distinguishing UTF-8 from Latin1 or Windows-1252 would otherwise
                  > require scanning the whole file, checking for invalid UTF-8 byte
                  > sequences.

                  Quoting the same Wikipedia article Lech mentioned:

                  "While [the] Unicode standard allows BOM in UTF-8, it does not require
                  or recommend it."

                  and paraphrasing the rest of that paragraph:

                  Using a BOM as the first character of a UTF-8-encoded file can cause
                  problems with the shebang line[1] in Unix-like systems. And
                  UTF-8-capable software is often written to assume UTF-8 unless otherwise
                  directed, so the U+FEFF character at the start of the stream is often
                  interpreted incorrectly.
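The shebang problem is easy to reproduce on raw bytes. The Python sketch below is an illustration only, with hypothetical helper names; it assumes the usual behaviour where the script loader looks for the two bytes `#!` at the very start of the file, so three BOM bytes in front of them hide the shebang.

```python
import codecs


def has_shebang(data: bytes) -> bool:
    """True if the file content begins with the two bytes '#!'."""
    return data.startswith(b"#!")


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


script = b"#!/bin/sh\necho hi\n"
with_bom = codecs.BOM_UTF8 + script

print(has_shebang(script))                    # shebang visible
print(has_shebang(with_bom))                  # BOM hides it
print(has_shebang(strip_utf8_bom(with_bom)))  # visible again after stripping
```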

                  The Unicode UTF-{8,16,32} & BOM FAQ probably worded it better than
                  Wikipedia or I[2].

                  --
                  Best,
                  Ben

                  [1] http://en.wikipedia.org/wiki/Shebang_(Unix)
                  [2] http://unicode.org/faq/utf_bom.html#bom5

                • Tony Mechelynck
                  Message 8 of 17 , Jun 27, 2010
                    On 27/06/10 21:21, Benjamin R. Haskell wrote:
                    > On Sun, 27 Jun 2010, Tony Mechelynck wrote:
                    >
                    >> On 03/05/10 23:45, Lech Lorens wrote:
                    >> [...]
                    >>> I might be totally wrong basing my understanding of BOM and
                    >>> character sets mainly on Wikipedia, but I thought that setting
                    >>> 'bomb' for utf-8 encoded files (which does not pose a risk of
                    >>> misinterpreting the contents due to endianness difference) didn't
                    >>> make much sense. For utf-16 that would be another thing.
                    >>>
                    >>> http://en.wikipedia.org/wiki/Byte-order_mark
                    >>>
                    >>
                    >> Notwithstanding its name, the BOM provides more than just endianness
                    >> detection. Actually, it is an "encoding signal" which allows detecting
                    >> all five of the following encodings, assuming a UTF-16le file won't
                    >> start with a NULL:
                    >>
                    >> utf-16be FE FF
                    >> utf-16le FF FE
                    >> utf-8 EF BB BF
                    >> utf-32be 00 00 FE FF
                    >> utf-32le FF FE 00 00
                    >>
                    >> For instance, when I was still on XP, I noticed that WordPad could
                    >> read UTF-8 files but only if they started with a BOM. When writing
                    >> what it called "Unicode", what it produced was UTF-16le with BOM.
                    >>
                    >> Any file starting 0xEF 0xBB 0xBF can be assumed to be in UTF-8.
                    >> Distinguishing UTF-8 from Latin1 or Windows-1252 would otherwise
                    >> require scanning the whole file, checking for invalid UTF-8 byte
                    >> sequences.
                    >
                    > Quoting the same Wikipedia article Lech mentioned:
                    >
                    > "While [the] Unicode standard allows BOM in UTF-8, it does not require
                    > or recommend it."
                    >
                    > and paraphrasing the rest of that paragraph:
                    >
                    > Using a BOM as the first character of a UTF-8-encoded file can cause
                    > problems with the shebang line[1] in Unix-like systems. And
                    > UTF-8-capable software is often written to assume UTF-8 unless otherwise
                    > directed, so the U+FEFF character at the start of the stream is often
                    > interpreted incorrectly.
                    >
                    > The Unicode UTF-{8,16,32} & BOM FAQ probably worded it better than
                    > Wikipedia or I[2].
                    >

                    Yes, a UTF-8 BOM will interfere with any software that has no knowledge
                    of Unicode and expects some particular "magic bytes" at the start, or
                    simply won't accept 0xEF 0xBB 0xBF at the start of a document. The #!
                    shebang is just one example.

                    OTOH, in filetypes where UTF-8 is but one possibility among many, the
                    BOM is useful to specify the encoding or to confirm what was set
                    otherwise. Examples:

                    - HTML charset can be set by the HTTP "Content-Type" header (in an HTTP
                    or HTTPS transaction external to the file), in a <meta
                    http-equiv="Content-Type" content="text/html; charset=something"> tag
                    (replacing "something" by the charset) within the <head> section, or by
                    a BOM. There are even official priority rules that tell browsers what to
                    do when two or three of the above are present (and they are necessary,
                    because -I'm told- some braindead hosts will send "Content-Type:
                    text/html; charset=iso-8859-1" for any *.htm or *.html file regardless
                    of BOM or <meta> tags).

                    - CSS charset can be set by a BOM.

                    - XML charset can be set (IIRC) by the encoding attribute of the
                    <?xml ... ?> declaration line or by a BOM.

                    - XHTML is both HTML and XML so the methods of both apply to it.

                    Personally I use the following rules of thumb:

                    - Add a BOM to Unicode files meant for use by a browser.
                    - Don't add it to UTF-8 files mostly in US-ASCII (possibly with
                    codepoints above 0x7F in literals and comments) if they're meant for use
                    by a shell, the 'make' utility, or a compiler.
                    - Some Windows programs won't read UTF-8 correctly unless a BOM is present.
                    - On Windows, when a system file is said to be in 'Unicode' that usually
                    means UTF-16le with BOM.
                    - Vim helpfiles in a single directory must either all have a BOM, or
                    (recommended) all lack a BOM. If some have one and others not, the
                    ":helptags" command will abort with an error.

                    This does not explicitly cover all cases; when it doesn't (or in the
                    cases where some of the above rules conflict), I proceed by analogy and
                    by trial and error.
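The helpfile rule above (all files in a directory with a BOM, or all without) lends itself to a quick check before running ":helptags". This is a hedged Python sketch, not part of Vim: the function name `bom_consistency` and the idea of passing in each file's leading bytes are assumptions made for the example.

```python
import codecs


def bom_consistency(files: dict) -> str:
    """Classify a set of files by UTF-8 BOM usage.

    files maps a file name to the leading bytes of that file.
    Returns "all" if every file starts with a UTF-8 BOM, "none" if no file
    does (or the mapping is empty), and "mixed" otherwise.
    """
    flags = {data.startswith(codecs.BOM_UTF8) for data in files.values()}
    if flags == {True}:
        return "all"
    if flags <= {False}:  # also true for the empty set
        return "none"
    return "mixed"
```

A "mixed" result is the situation described above in which ":helptags" is said to abort with an error.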


                    Best regards,
                    Tony.
                    --
                    One man's brain plus one other will produce one half as many ideas as
                    one man would have produced alone. These two plus two more will
                    produce half again as many ideas. These four plus four more begin to
                    represent a creative meeting, and the ratio changes to one quarter as
                    many ...
                    -- Anthony Chevins
