
Re: Failed to drag&drop-open a file with wide-chars in its filename

  • Tony Mechelynck
    Message 1 of 12 , Jun 24, 2009
      On 24/06/09 14:00, björn wrote:
      > Hi Eljay,
      > 2009/6/23 John (Eljay) Love-Jensen:
      >>> As far as I can tell (from searching around) HFS+ always uses
      >>> normalization form D (NFD) for filenames.
      >> HFS+ uses a variant of NFD for filenames. (The HFS+ variant predates
      >> standardization of NFD.) This requirement is enforced by the OS.
      >> http://developer.apple.com/technotes/tn/tn1150.html
      >> http://developer.apple.com/technotes/tn/tn1150table.html
      >> http://developer.apple.com/qa/qa2001/qa1235.html
      >> http://www.unicode.org/reports/tr15/
      > Thanks for clarifying that (and for the links!).
      >> Windows uses NFC for filenames. I'm not sure if the Linux world settled on
      >> NFC or NFK.
      > I read that Windows uses NFKC. Have you got a reference for the claim
      > that NFC is used?
      >>> So as a workaround for the issue the OP had I now normalize filenames
      >>> to compatibility form C (NFKC) before passing the filename on to Vim
      >>> and this takes care of the OP's problem.
      >> NFC or NFKC? Those are different normalizations.
      >> Windows NTFS file system uses NFC. But it isn't enforced by the OS, yet.
      > I did mean the compatibility form NFKC since I read somewhere that
      > NTFS uses NFKC, but I did not research that very carefully.
      >>> However, as I see it this really is a legitimate issue in Vim itself
      >>> in that it does not handle NFD properly (the example above should
      >>> always render as one glyph, not three as it does now if NFD is used).
      >>> Either Vim should ensure that all buffers are normalized to composed
      >>> form NFC/NFKC or it needs to be made "NFD aware".
      >> I agree with your assessment.
      >>> Does anybody on the vim_multibyte list (this mail goes to vim_mac as
      >>> well) have any comments on this?
      >> The relevant Mac OS X APIs are:
      >> CFURLRef url =
      >> CFURLCreateWithFileSystemPath(
      >> kCFAllocatorDefault,
      >> cfstringFullPath,
      >> kCFURLPOSIXPathStyle,
      >> false); // false: the path does not name a directory
      >> char bufferUTF8[32768*4]; // Worst case scenario.
      >> // As per Apple documentation, paths can be "up to 30,000 UTF-16
      >> // encoding units long", with each component being up to 255 UTF-16
      >> // encoding units long. Too bad there isn't an API to specify the
      >> // exact buffer size /a priori/.
      >> Boolean success =
      >> CFURLGetFileSystemRepresentation(
      >> url,
      >> true,
      >> (UInt8 *)bufferUTF8, // the API takes a UInt8 * buffer
      >> sizeof bufferUTF8);
      > Thanks. NSString has a method called fileSystemRepresentation which
      > I'm guessing does the same thing(?). I used the NSString method
      > precomposedStringWithCompatibilityMapping to convert to NFKC.
      > Björn
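
      To make the NFD/NFC point quoted above concrete, here is a minimal
      Python sketch. It assumes standard Unicode NFD (the HFS+ variant
      differs slightly, per the technote linked above), and the helper
      same_file_name is a hypothetical name, not anything from Vim or
      MacVim. It shows that the decomposed and precomposed spellings of one
      filename differ as code point sequences but compare equal once both
      are normalized to NFC.

```python
import unicodedata

# Precomposed spelling: "ö" is the single code point U+00F6.
name_nfc = "bj\u00f6rn.txt"

# Decomposed spelling, roughly what an NFD-style filesystem hands back:
# "o" followed by the combining diaeresis U+0308.
name_nfd = unicodedata.normalize("NFD", name_nfc)


def same_file_name(a: str, b: str) -> bool:
    """Hypothetical helper: compare two names after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


if __name__ == "__main__":
    print([hex(ord(c)) for c in name_nfc])
    print([hex(ord(c)) for c in name_nfd])
    print(name_nfc == name_nfd)            # raw comparison
    print(same_file_name(name_nfc, name_nfd))  # comparison after NFC
```

      A program that displays or compares the decomposed name without
      handling combining characters sees one more code point than the
      precomposed form has.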

      Hm, NFKC and NFKD sometimes fuse slightly different glyphs into a single
      "normalized" form. For instance, NFKC(²) = 2, though both are
      (different) Latin1 characters (0xB2 and 0x32). IIRC, DOS would have kept
      them distinct.
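
      A minimal Python sketch of that fusion, using the standard-library
      unicodedata module; the helper name collides_under is hypothetical.
      It checks whether two strings become identical under a given
      normalization form, which is the risk when filenames are passed
      through NFKC rather than NFC.

```python
import unicodedata

SUPERSCRIPT_TWO = "\u00b2"  # Latin-1 0xB2
DIGIT_TWO = "2"             # Latin-1 0x32


def collides_under(form: str, a: str, b: str) -> bool:
    """Hypothetical helper: True if a and b normalize to the same string."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


if __name__ == "__main__":
    for form in ("NFC", "NFD", "NFKC", "NFKD"):
        print(form, collides_under(form, SUPERSCRIPT_TWO, DIGIT_TWO))
```

      Under the canonical forms (NFC, NFD) the two characters stay distinct;
      only the compatibility forms (NFKC, NFKD) map the superscript onto
      the plain digit, so two filenames differing only in that character
      would end up with the same normalized name.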

      Best regards,
      hundred-and-one symptoms of being an internet addict:
      56. You leave the modem speaker on after connecting because you think it
      sounds like the ocean wind...the perfect soundtrack for "surfing
      the net".

      You received this message from the "vim_multibyte" maillist.
      For more information, visit http://www.vim.org/maillist.php