Re: "flexwiki" ftplugin causing problems ('bomb')

  • Bram Moolenaar
    Message 1 of 17, May 4, 2010
      Ron Aaron wrote:

      > On Monday 03 May 2010 23:12:42 Bram Moolenaar wrote:
      >
      > > What I did now is to disable recognizing .wiki files as flexwiki.
      > > Someone still using these files can re-enable it when needed.
      > >
      > > I can't find another file format that uses the .wiki extension.
      > > Mediawiki uses .mw.
      >
      > It's common to use the '.wiki' for any wiki text file; so making both
      > it and '.mw' load MediaWiki syntax makes sense.

      There is no MediaWiki syntax file.

      --
      hundred-and-one symptoms of being an internet addict:
      9. All your daydreaming is preoccupied with getting a faster connection to the
      net: 28.8...ISDN...cable modem...T1...T3.

      /// Bram Moolenaar -- Bram@... -- http://www.Moolenaar.net \\\
      /// sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
      \\\ download, build and distribute -- http://www.A-A-P.org ///
      \\\ help me help AIDS victims -- http://ICCF-Holland.org ///

      --
      You received this message from the "vim_dev" maillist.
      Do not top-post! Type your reply below the text you are replying to.
      For more information, visit http://www.vim.org/maillist.php
    • Ron Aaron
      Message 2 of 17, May 4, 2010
        On Tuesday 04 May 2010 21:52:54 Bram Moolenaar wrote:
        >

        > There is no MediaWiki syntax file.

        Sorry, it's called Wikipedia.

        --
        Sending me something private?
        Use my GPG public key: AD29415D
      • Charles Campbell
        Message 3 of 17, May 4, 2010
          Ron Aaron wrote:
          > On Tuesday 04 May 2010 21:52:54 Bram Moolenaar wrote:
          >
          >
          >
          >> There is no MediaWiki syntax file.
          >>
          >
          > Sorry, it's called Wikipedia.
          >
          >
          Hello!

          Ron, Bram was wanting a Wikipedia syntax file. I can't vouch for it,
          but perhaps you mean the one in:

          http://www.vim.org/scripts/script.php?script_id=1787

          Regards,
          Chip Campbell

        • ron
          Message 4 of 17, May 4, 2010
            On May 4, 11:57 pm, Charles Campbell <Charles.E.Campb...@...>
            wrote:

            > Ron, Bram was wanting a Wikipedia syntax file.  I can't vouch for it,
            > but perhaps you mean the one in:
            >
            > http://www.vim.org/scripts/script.php?script_id=1787

            I think that's the one I use, yes.

          • Tony Mechelynck
            Message 5 of 17, Jun 27, 2010
              On 03/05/10 23:45, Lech Lorens wrote:
              [...]
              > I might be totally wrong basing my understanding of BOM and character
              > sets mainly on Wikipedia, but I thought that setting 'bomb' for utf-8
              > encoded files (which does not pose a risk of misinterpreting the
              > contents due to endianness difference) didn't make much sense. For
              > utf-16 that would be another thing.
              >
              > http://en.wikipedia.org/wiki/Byte-order_mark
              >

              Notwithstanding its name, the BOM provides more than just endianness
              detection. Actually, it is an "encoding signal" which allows detecting
              all five of the following encodings, assuming a UTF-16le file won't
              start with a NULL:

              utf-16be FE FF
              utf-16le FF FE
              utf-8 EF BB BF
              utf-32be 00 00 FE FF
              utf-32le FF FE 00 00

              For instance, when I was still on XP, I noticed that WordPad could read
              UTF-8 files but only if they started with a BOM. When writing what it
              called "Unicode", what it produced was UTF-16le with BOM.

              Any file starting 0xEF 0xBB 0xBF can be assumed to be in UTF-8.
              Distinguishing UTF-8 from Latin1 or Windows-1252 would otherwise require
              scanning the whole file, checking for invalid UTF-8 byte sequences.
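The signature table above can be sketched as a small lookup. This is only an illustration of the idea, not how Vim or WordPad implement it: the function names `sniff_bom` and `looks_like_utf8` are made up for this example, and the byte constants come from Python's standard `codecs` module. Longer signatures are tested first because the UTF-32le BOM begins with the same two bytes as the UTF-16le BOM, which is the ambiguity mentioned above.

```python
import codecs

# Longest signatures first: FF FE 00 00 (utf-32le) starts with FF FE (utf-16le).
BOM_TABLE = [
    (codecs.BOM_UTF32_BE, "utf-32be"),  # 00 00 FE FF
    (codecs.BOM_UTF32_LE, "utf-32le"),  # FF FE 00 00
    (codecs.BOM_UTF8, "utf-8"),         # EF BB BF
    (codecs.BOM_UTF16_BE, "utf-16be"),  # FE FF
    (codecs.BOM_UTF16_LE, "utf-16le"),  # FF FE
]


def sniff_bom(data: bytes):
    """Return the encoding signalled by a leading BOM, or None if there is none."""
    for bom, name in BOM_TABLE:
        if data.startswith(bom):
            return name
    return None


def looks_like_utf8(data: bytes) -> bool:
    """Without a BOM, the fallback is to scan everything for invalid UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

Note that `looks_like_utf8` has to read the whole input, whereas `sniff_bom` only inspects the first few bytes, which is the trade-off described above.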


              Best regards,
              Tony.
              --
              Life is a gift, living is an art. (Bram Moolenaar)

            • Benjamin R. Haskell
              Message 6 of 17, Jun 27, 2010
                On Sun, 27 Jun 2010, Tony Mechelynck wrote:

                > On 03/05/10 23:45, Lech Lorens wrote:
                > [...]
                > > I might be totally wrong basing my understanding of BOM and
                > > character sets mainly on Wikipedia, but I thought that setting
                > > 'bomb' for utf-8 encoded files (which does not pose a risk of
                > > misinterpreting the contents due to endianness difference) didn't
                > > make much sense. For utf-16 that would be another thing.
                > >
                > > http://en.wikipedia.org/wiki/Byte-order_mark
                > >
                >
                > Notwithstanding its name, the BOM provides more than just endianness
                > detection. Actually, it is an "encoding signal" which allows detecting
                > all five of the following encodings, assuming a UTF-16le file won't
                > start with a NULL:
                >
                > utf-16be FE FF
                > utf-16le FF FE
                > utf-8 EF BB BF
                > utf-32be 00 00 FE FF
                > utf-32le FF FE 00 00
                >
                > For instance, when I was still on XP, I noticed that WordPad could
                > read UTF-8 files but only if they started with a BOM. When writing
                > what it called "Unicode", what it produced was UTF-16le with BOM.
                >
                > Any file starting 0xEF 0xBB 0xBF can be assumed to be in UTF-8.
                > Distinguishing UTF-8 from Latin1 or Windows-1252 would otherwise
                > require scanning the whole file, checking for invalid UTF-8 byte
                > sequences.

                Quoting the same Wikipedia article Lech mentioned:

                "While [the] Unicode standard allows BOM in UTF-8, it does not require
                or recommend it."

                and paraphrasing the rest of that paragraph:

                Using a BOM as the first character of a UTF-8-encoded file can cause
                problems with the shebang line[1] in Unix-like systems. And
                UTF-8-capable software is often written to assume UTF-8 unless otherwise
                directed, so the U+FEFF character at the start of the stream is often
                interpreted incorrectly.
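Both problems can be reproduced in a few lines. This is a rough sketch, not a description of how any particular kernel or tool behaves: `starts_with_shebang` is a hypothetical stand-in for a loader that checks the first two bytes for `#!`, and the sample script is invented. The `utf-8` and `utf-8-sig` codec names are part of Python's standard library.

```python
import codecs


def starts_with_shebang(data: bytes) -> bool:
    """Hypothetical stand-in for a loader that looks for '#!' at byte 0."""
    return data.startswith(b"#!")


script = "#!/bin/sh\necho hi\n".encode("utf-8")
script_with_bom = codecs.BOM_UTF8 + script

# The BOM pushes '#!' away from the start of the file.
shebang_plain = starts_with_shebang(script)
shebang_bom = starts_with_shebang(script_with_bom)

# A decoder that simply assumes UTF-8 keeps U+FEFF as the first character;
# the 'utf-8-sig' codec is the variant that strips it.
first_char_naive = script_with_bom.decode("utf-8")[0]
first_char_sig = script_with_bom.decode("utf-8-sig")[0]
```

Here `shebang_plain` is true and `shebang_bom` is false, while `first_char_naive` is U+FEFF and `first_char_sig` is `#`.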

                The Unicode UTF-{8,16,32} & BOM FAQ probably worded it better than
                Wikipedia or I[2].

                --
                Best,
                Ben

                [1] http://en.wikipedia.org/wiki/Shebang_(Unix)
                [2] http://unicode.org/faq/utf_bom.html#bom5

              • Tony Mechelynck
                Message 7 of 17, Jun 27, 2010
                  On 27/06/10 21:21, Benjamin R. Haskell wrote:
                  > On Sun, 27 Jun 2010, Tony Mechelynck wrote:
                  >
                  >> On 03/05/10 23:45, Lech Lorens wrote:
                  >> [...]
                  >>> I might be totally wrong basing my understanding of BOM and
                  >>> character sets mainly on Wikipedia, but I thought that setting
                  >>> 'bomb' for utf-8 encoded files (which does not pose a risk of
                  >>> misinterpreting the contents due to endianness difference) didn't
                  >>> make much sense. For utf-16 that would be another thing.
                  >>>
                  >>> http://en.wikipedia.org/wiki/Byte-order_mark
                  >>>
                  >>
                  >> Notwithstanding its name, the BOM provides more than just endianness
                  >> detection. Actually, it is an "encoding signal" which allows detecting
                  >> all five of the following encodings, assuming a UTF-16le file won't
                  >> start with a NULL:
                  >>
                  >> utf-16be FE FF
                  >> utf-16le FF FE
                  >> utf-8 EF BB BF
                  >> utf-32be 00 00 FE FF
                  >> utf-32le FF FE 00 00
                  >>
                  >> For instance, when I was still on XP, I noticed that WordPad could
                  >> read UTF-8 files but only if they started with a BOM. When writing
                  >> what it called "Unicode", what it produced was UTF-16le with BOM.
                  >>
                  >> Any file starting 0xEF 0xBB 0xBF can be assumed to be in UTF-8.
                  >> Distinguishing UTF-8 from Latin1 or Windows-1252 would otherwise
                  >> require scanning the whole file, checking for invalid UTF-8 byte
                  >> sequences.
                  >
                  > Quoting the same Wikipedia article Lech mentioned:
                  >
                  > "While [the] Unicode standard allows BOM in UTF-8, it does not require
                  > or recommend it."
                  >
                  > and paraphrasing the rest of that paragraph:
                  >
                  > Using a BOM as the first character of a UTF-8-encoded file can cause
                  > problems with the shebang line[1] in Unix-like systems. And
                  > UTF-8-capable software is often written to assume UTF-8 unless otherwise
                  > directed, so the U+FEFF character at the start of the stream is often
                  > interpreted incorrectly.
                  >
                  > The Unicode UTF-{8,16,32} & BOM FAQ probably worded it better than
                  > Wikipedia or I[2].
                  >

                  Yes, a UTF-8 BOM will interfere with any software that has no knowledge
                  of Unicode and expects some particular "magic bytes" at the start, or
                  simply won't accept 0xEF 0xBB 0xBF at the start of a document. The #!
                  shebang is just one example.

                  OTOH, in filetypes where UTF-8 is but one possibility among many, the
                  BOM is useful to specify the encoding or to confirm what was set
                  otherwise. Examples:

                  - HTML charset can be set by the HTTP "Content-Type" header (in an HTTP
                  or HTTPS transaction external to the file), in a <meta
                  http-equiv="Content-Type" content="text/html; charset=something"> tag
                  (replacing "something" by the charset) within the <head> section, or by
                  a BOM. There are even official priority rules that tell browsers what to
                  do when two or three of the above are present (and they are necessary,
                  because -I'm told- some braindead hosts will send "Content-Type:
                  text/html; charset=iso-8859-1" for any *.htm or *.html file regardless
                  of BOM or <meta> tags).

                  - CSS charset can be set by a BOM.

                  - XML charset can be set (IIRC) by the encoding="..." attribute of the
                  <?xml ... ?> declaration line or by a BOM.

                  - XHTML is both HTML and XML so the methods of both apply to it.

                  Personally I use the following rules of thumb:

                  - Add a BOM to Unicode files meant for use by a browser.
                  - Don't add it to UTF-8 files mostly in US-ASCII (possibly with
                  codepoints above 0x7F in literals and comments) if they're meant for use
                  by a shell, the 'make' utility, or a compiler.
                  - Some Windows programs won't read UTF-8 correctly unless a BOM is present.
                  - On Windows, when a system file is said to be in 'Unicode' that usually
                  means UTF-16le with BOM.
                  - Vim helpfiles in a single directory must either all have a BOM, or
                  (recommended) all lack a BOM. If some have one and others not, the
                  ":helptags" command will abort with an error.

                  This does not explicitly cover all cases; when it doesn't (or in the
                  cases where some of the above rules conflict), I proceed by analogy and
                  by trial and error.
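A few of these rules of thumb translate into small byte-level helpers. The sketch below is an assumption-laden illustration: all three function names are invented, and `bom_usage_is_consistent` only mirrors the "all or none" rule for a directory of help files; it is not the check that ":helptags" itself performs.

```python
import codecs
from typing import Iterable


def add_utf8_bom(data: bytes) -> bytes:
    """Prepend EF BB BF unless it is already there (e.g. for browser-bound files)."""
    if data.startswith(codecs.BOM_UTF8):
        return data
    return codecs.BOM_UTF8 + data


def strip_utf8_bom(data: bytes) -> bytes:
    """Drop a leading EF BB BF (e.g. for shell scripts, makefiles, sources)."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def bom_usage_is_consistent(contents: Iterable[bytes]) -> bool:
    """True if every file has a BOM or every file lacks one."""
    flags = {blob.startswith(codecs.BOM_UTF8) for blob in contents}
    return len(flags) <= 1
```

Both conversion helpers are idempotent, so applying one twice to the same data does no harm.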


                  Best regards,
                  Tony.
                  --
                  One man's brain plus one other will produce one half as many ideas as
                  one man would have produced alone. These two plus two more will
                  produce half again as many ideas. These four plus four more begin to
                  represent a creative meeting, and the ratio changes to one quarter as
                  many ...
                  -- Anthony Chevins
