
2728 Re: [jslint] member names outside ASCII but still in the Unicode Basic Multilingual Plane

  • Joshua Bell
    Jan 6, 2012
      On Fri, Jan 6, 2012 at 8:32 AM, Joshua Bell <inexorabletash@...> wrote:

      > On Thu, Jan 5, 2012 at 6:54 AM, Tom Worster <fsb@...> wrote:
      >
      >> I like to program in Unicode (☭ = ☃ + π;) but I accept that there can be
      >> difficulties. One question is what collation JS should use to decide
      >> equivalence (according to Unicode, whether é is different from e depends
      >> on locale). Another is that Unicode offers different character sequences,
      >> and thus different byte strings, to represent the exact same thing (ö and
      >> ö look the same to me but the first is U+006F U+0308 the second is
      >> U+00F6).
      >>
      >
      > A Globalization API for JavaScript is under consideration on es-discuss,
      > for implementation by browser vendors as host objects and/or inclusion in
      > the next version of the ECMAScript standard as a module. I believe the
      > latest version of the proposal can be found at:
      >
      >
      > http://norbertlindenberg.com/2011/11/ecmascript-globalization-api/index.html
      >
      > The current proposal includes support for locale-specific collation and
      > all the Unicode-goodness you'd expect. This is done with new
      > objects/functions - existing JavaScript string comparison operations remain
      > unchanged (i.e. continue to operate by ordinal comparison of the 16-bit
      > elements of JS strings)
      >

      ... and to expound on Crockford's point on the other fork of this thread
      (mea culpa!), the above proposal assumes no changes to the ECMAScript
      language itself. Different JS strings (i.e. different sequences of 16-bit
      code units) would remain different identifiers, both in the source and,
      perhaps more importantly, in basic ECMAScript operations like keys for
      objects. e.g. o["ö"] and o["ö"] refer to different properties (assuming my
      clipboard didn't normalize), although other proposed changes in ECMAScript
      may enable collation-aware string maps with that convenient syntax.
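A minimal sketch of the point above, not taken from the original message. The two spellings of "ö" are written as \u escapes so that no editor or clipboard can normalize them. It assumes String.prototype.normalize is available, which was standardized in ES2015, after this thread was written.

```javascript
// Two spellings of "ö", written as escapes so nothing can normalize them.
var decomposed = "o\u0308";   // U+006F U+0308 (o + combining diaeresis)
var precomposed = "\u00F6";   // U+00F6 (single precomposed character)

var o = {};
o[decomposed] = "first";
o[precomposed] = "second";

// Different sequences of 16-bit code units are different property keys.
var keyCount = Object.keys(o).length;         // 2
var sameString = decomposed === precomposed;  // false

// Assumption: String.prototype.normalize exists (ES2015 and later).
// Normalizing to NFC maps the decomposed form onto the precomposed one.
var sameAfterNfc = decomposed.normalize("NFC") === precomposed; // true
```

Normalizing keys before using them is one way to make both spellings land on the same property, but it is an application-level choice; the language itself still compares code units.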

      Encoding is still a very real issue on the Web. You don't want to find
      out, only after your code is in production, that your server thought
      your script file was UTF-8 while some browsers thought it was
      Windows-1252. Keeping your source code ASCII is still the best practice.
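One way to keep a source file ASCII while still using non-ASCII strings is to write those characters as \uXXXX escapes. The sketch below is illustrative only: escapeNonAscii is a hypothetical helper, not part of JSLint or of any minifier, and it rewrites every non-ASCII code unit it sees, including those in comments.

```javascript
// Hypothetical helper: replace each non-ASCII UTF-16 code unit with a
// \uXXXX escape so the resulting text contains only ASCII bytes.
function escapeNonAscii(source) {
  // Without the "u" flag the regex matches one 16-bit code unit at a time,
  // so a surrogate pair becomes two consecutive \uXXXX escapes.
  return source.replace(/[^\x00-\x7F]/g, function (ch) {
    var hex = ch.charCodeAt(0).toString(16).toUpperCase();
    return "\\u" + ("0000" + hex).slice(-4);
  });
}

// The precomposed "ö" (U+00F6) inside a property access.
var escaped = escapeNonAscii("o[\"\u00F6\"]"); // o["\u00F6"] as ASCII text
```

Because the escaped text decodes identically under UTF-8 and Windows-1252, a mislabeled charset no longer changes what the script means.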

      Are the well-known minification tools able to cope with non-ASCII input?

