Re: Failed to drag&drop-open a file with wide-chars in its filename

  • björn
    Message 1 of 9 , Jun 24, 2009
      Hi Eljay,

      2009/6/23 John (Eljay) Love-Jensen:
      >
      >> As far as I can tell (from searching around) HFS+ always uses
      >> normalization form D (NFD) for filenames.
      >
      > HFS+ uses a variant of NFD for filenames.  (The HFS+ variant predates
      > standardization of NFD.)  This requirement is enforced by the OS.
      >
      > http://developer.apple.com/technotes/tn/tn1150.html
      > http://developer.apple.com/technotes/tn/tn1150table.html
      > http://developer.apple.com/qa/qa2001/qa1235.html
      > http://www.unicode.org/reports/tr15/

      Thanks for clarifying that (and for the links!).

      > Windows uses NFC for filenames.  I'm not sure if the Linux world settled on
      > NFC or NFK.

      I read that Windows uses NFKC. Have you got a reference for the claim
      that NFC is used?

      >> So as a workaround for the issue the OP had I now normalize filenames
      >> to compatibility form C (NFKC) before passing the filename on to Vim
      >> and this takes care of the OP's problem.
      >
      > NFC or NFKC?  Those are different normalizations.
      >
      > Windows NTFS file system uses NFC.  But it isn't enforced by the OS, yet.

      I did mean the compatibility form NFKC since I read somewhere that
      NTFS uses NFKC, but I did not research that very carefully.


      >> However, as I see it this really is a legitimate issue in Vim itself
      >> in that it does not handle NFD properly (the example above should
      >> always render as one glyph, not three as it does now if NFD is used).
      >> Either Vim should ensure that all buffers are normalized to composed
      >> form NFC/NFKC or it needs to be made "NFD aware".
      >
      > I agree with your assessment.
      >
      >> Does anybody on the vim_multibyte list (this mail goes to vim_mac as
      >> well) have any comments on this?
      >
      > The relevant Mac OS X routine APIs are:
      >
      > CFURLRef url =
      > CFURLCreateWithFileSystemPath(
      >  kCFAllocatorDefault,
      >  cfstringFullPath,
      >  kCFURLPOSIXPathStyle,
      >  false);
      >
      > char bufferUTF8[32768*4]; // Worst case scenario.
      > // As per Apple documentation, paths can be "up to 30,000 UTF-16
      > // encoding units long", with each component being up to 255 UTF-16
      > // encoding units long.  Too bad there isn't an API to specify the
      > // exact buffer size /a priori/.
      >
      > Boolean success =
      > CFURLGetFileSystemRepresentation(
      >  url,
      >  true,
      >  &bufferUTF8[0],
      >  sizeof bufferUTF8);

      Thanks. NSString has a method called fileSystemRepresentation which
      I'm guessing does the same thing(?). I used the NSString method
      precomposedStringWithCompatibilityMapping to convert to NFKC.
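
To illustrate the decomposed versus precomposed distinction discussed here, a minimal Python sketch using the standard unicodedata module (not the Cocoa method mentioned above). The filename and the helper name normalize_filename are made up for the example:

```python
import unicodedata


def normalize_filename(name: str, form: str = "NFC") -> str:
    """Return name normalized to the given Unicode form (hypothetical helper)."""
    return unicodedata.normalize(form, name)


# "a" followed by COMBINING DIAERESIS, i.e. a decomposed (NFD-style) spelling.
decomposed = "a\u0308.txt"
composed = normalize_filename(decomposed, "NFC")

print(len(decomposed), len(composed))  # 6 5
print(composed == "\u00e4.txt")        # True
```

For this particular name NFC and NFKC give the same result; they only differ when the string contains compatibility characters.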

      Björn

      --~--~---------~--~----~------------~-------~--~----~
      You received this message from the "vim_mac" maillist.
      For more information, visit http://www.vim.org/maillist.php
      -~----------~----~----~----~------~----~------~--~---
    • Tony Mechelynck
      Message 2 of 9 , Jun 24, 2009
        On 24/06/09 14:00, björn wrote:
        >
        > Hi Eljay,
        >
        > 2009/6/23 John (Eljay) Love-Jensen:
        >>
        >>> As far as I can tell (from searching around) HFS+ always uses
        >>> normalization form D (NFD) for filenames.
        >>
        >> HFS+ uses a variant of NFD for filenames. (The HFS+ variant predates
        >> standardization of NFD.) This requirement is enforced by the OS.
        >>
        >> http://developer.apple.com/technotes/tn/tn1150.html
        >> http://developer.apple.com/technotes/tn/tn1150table.html
        >> http://developer.apple.com/qa/qa2001/qa1235.html
        >> http://www.unicode.org/reports/tr15/
        >
        > Thanks for clarifying that (and for the links!).
        >
        >> Windows uses NFC for filenames. I'm not sure if the Linux world settled on
        >> NFC or NFK.
        >
        > I read that Windows uses NFKC. Have you got a reference for the claim
        > that NFC is used?
        >
        >>> So as a workaround for the issue the OP had I now normalize filenames
        >>> to compatibility form C (NFKC) before passing the filename on to Vim
        >>> and this takes care of the OP's problem.
        >>
        >> NFC or NFKC? Those are different normalizations.
        >>
        >> Windows NTFS file system uses NFC. But it isn't enforced by the OS, yet.
        >
        > I did mean the compatibility form NFKC since I read somewhere that
        > NTFS uses NFKC, but I did not research that very carefully.
        >
        >
        >>> However, as I see it this really is a legitimate issue in Vim itself
        >>> in that it does not handle NFD properly (the example above should
        >>> always render as one glyph, not three as it does now if NFD is used).
        >>> Either Vim should ensure that all buffers are normalized to composed
        >>> form NFC/NFKC or it needs to be made "NFD aware".
        >>
        >> I agree with your assessment.
        >>
        >>> Does anybody on the vim_multibyte list (this mail goes to vim_mac as
        >>> well) have any comments on this?
        >>
        >> The relevant Mac OS X routine APIs are:
        >>
        >> CFURLRef url =
        >> CFURLCreateWithFileSystemPath(
        >> kCFAllocatorDefault,
        >> cfstringFullPath,
        >> kCFURLPOSIXPathStyle,
        >> false);
        >>
        >> char bufferUTF8[32768*4]; // Worst case scenario.
        >> // As per Apple documentation, paths can be "up to 30,000 UTF-16
        >> // encoding units long", with each component being up to 255 UTF-16
        >> // encoding units long. Too bad there isn't an API to specify the
        >> // exact buffer size /a priori/.
        >>
        >> Boolean success =
        >> CFURLGetFileSystemRepresentation(
        >> url,
        >> true,
        >> &bufferUTF8[0],
        >> sizeof bufferUTF8);
        >
        > Thanks. NSString has a method called fileSystemRepresentation which
        > I'm guessing does the same thing(?). I used the NSString method
        > precomposedStringWithCompatibilityMapping to convert to NFKC.
        >
        > Björn

        Hm, NFKC and NFKD sometimes fuse slightly different glyphs into a single
        "normalized" form. For instance, NFKC(²) = 2, though both are
        (different) Latin1 characters (0xB2 and 0x32). IIRC, DOS would have kept
        them distinct.
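
The superscript example can be checked with a short Python sketch using the standard unicodedata module. The two filenames are invented for illustration only:

```python
import unicodedata

superscript_two = "\u00b2"  # SUPERSCRIPT TWO, 0xB2 in Latin-1

# NFC leaves the character alone; NFKC replaces it with the plain digit.
nfc = unicodedata.normalize("NFC", superscript_two)
nfkc = unicodedata.normalize("NFKC", superscript_two)

print(nfc == superscript_two)  # True
print(nfkc == "2")             # True

# Two distinct (made-up) filenames that become identical under NFKC.
name_a = "x\u00b2.txt"
name_b = "x2.txt"
print(unicodedata.normalize("NFKC", name_a) == name_b)  # True
```

So normalizing filenames to NFKC rather than NFC can make two originally distinct names compare equal.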

        Best regards,
        Tony.
        --
        hundred-and-one symptoms of being an internet addict:
        56. You leave the modem speaker on after connecting because you think it
        sounds like the ocean wind...the perfect soundtrack for "surfing
        the net".
