Re: Match the BOM

  • zod
    Message 1 of 5 , Nov 19, 2008
      Thanks, Tony. That was driving me crazy.
      --~--~---------~--~----~------------~-------~--~----~
      You received this message from the "vim_multibyte" maillist.
      For more information, visit http://www.vim.org/maillist.php
      -~----------~----~----~----~------~----~------~--~---
    • François Pinard
      Message 2 of 5 , Nov 20, 2008
        [Tony Mechelynck]

        >On 19/11/08 18:34, zod wrote:

        >> Does the BOM fall outside of the unicode range that vim's regex
        >> engine uses or do I just have the syntax wrong?

        >The BOM doesn't fall outside the Unicode range, it is not considered
        >part of the text [...]

        Just for the record (and to be pedantic), while the BOM is a Unicode
        character, the reversed BOM is not part of Unicode. This is more
        a philosophical detail than a technical issue. :-)
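
The "reversed BOM" is what a reader sees when it decodes the BOM's two UTF-16 bytes in the wrong byte order. A minimal Python sketch of that, where the helper name `read_with_wrong_byte_order` is made up for illustration:

```python
BOM = "\ufeff"  # U+FEFF, the byte order mark


def read_with_wrong_byte_order(ch: str) -> str:
    """Encode one BMP character as UTF-16 big-endian, then decode
    the same two bytes as if they were little-endian."""
    return ch.encode("utf-16-be").decode("utf-16-le")


reversed_bom = read_with_wrong_byte_order(BOM)
print(f"U+{ord(reversed_bom):04X}")  # prints U+FFFE
```

Because U+FFFE is never a legitimate character in text, seeing it at the start of a UTF-16 stream is a reliable sign that the byte order was guessed wrong.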

        --
        François Pinard http://pinard.progiciels-bpi.ca

      • Tony Mechelynck
        Message 3 of 5 , Nov 20, 2008
          On 20/11/08 13:04, François Pinard wrote:
          > [Tony Mechelynck]
          >
          >> On 19/11/08 18:34, zod wrote:
          >
          >>> Does the BOM fall outside of the unicode range that vim's regex
          >>> engine uses or do I just have the syntax wrong?
          >
          >> The BOM doesn't fall outside the Unicode range, it is not considered
          >> part of the text [...]
          >
          > Just for the record (and to be pedantic), while the BOM is a Unicode
          > character, the reversed BOM is not part of Unicode. This is more
          > a philosophical detail than a technical issue. :-)
          >

          To be still more pedantic, U+FFFE is part of the Unicode range, where it
          is listed as "Not a character", i.e., it is one of the "forbidden"
          codepoints which are "in range". (The "original" Unicode range, as still
          supported by Vim for UTF-8, UTF-32be and UTF-32le, used to run from
          U+0000 to U+7FFFFFFF. The Unicode Consortium later invalidated, among
          others, (a) all planes above plane 0x10, and (b) the last two codepoints,
          U+xxFFFE and U+xxFFFF, in every plane, which brings the highest "valid"
          codepoint down to U+10FFFD at most.)
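
The arithmetic above can be checked with a short Python sketch. The function names are hypothetical, and the U+FDD0..U+FDEF block is included as an assumption about what the "among others" covers (those 32 codepoints are also designated noncharacters by the Unicode Standard):

```python
MAX_CODE_POINT = 0x10FFFF  # last codepoint of plane 0x10


def is_noncharacter(cp: int) -> bool:
    """Return True if cp is a Unicode noncharacter.

    Covers the last two codepoints of every plane (U+xxFFFE, U+xxFFFF)
    and the U+FDD0..U+FDEF block.
    """
    if not 0 <= cp <= MAX_CODE_POINT:
        raise ValueError(f"codepoint out of range: {cp:#x}")
    # The low 16 bits are FFFE or FFFF exactly when masking with FFFE gives FFFE.
    return (cp & 0xFFFE) == 0xFFFE or 0xFDD0 <= cp <= 0xFDEF


def highest_valid_code_point() -> int:
    """Walk down from the top of plane 0x10 to the first non-forbidden codepoint."""
    cp = MAX_CODE_POINT
    while is_noncharacter(cp):
        cp -= 1
    return cp


print(is_noncharacter(0xFFFE))                 # True: the reversed BOM
print(is_noncharacter(0xFEFF))                 # False: the BOM itself
print(f"U+{highest_valid_code_point():06X}")   # U+10FFFD
```

Codepoints above U+10FFFF are rejected outright here, matching point (a); a tool that still accepts the older range up to U+7FFFFFFF, as described for Vim, would need a different upper bound.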


          Best regards,
          Tony.
          --
          His head smashed in, and his heart cut out,
          And his liver removed, and his bowels unplugged,
          And his nostrils raped, and his bottom burned off,
          And his penis split ... and his ...
          "Monty Python and the Holy Grail" PYTHON (MONTY)
          PICTURES LTD
