Re: [caplet] Code that lexes differently in ES3 vs ES3.1

  • Marcel Laverdet
    Message 1 of 13 , Feb 9, 2009

      From what I remember this started out as a bug in IE, and then Firefox followed suit for compatibility, which left the other browsers with no choice. I can't find the original bug, but `/[/]/` only started parsing in FF1.5; in FF1.0 it would throw a syntax error.

      You could throw out any malformed regexp literals (any that lex differently between ES3 and ES3.1), which is a fairly small subset, and you would be OK.


      On Feb 9, 2009, at 9:16 AM, David-Sarah Hopwood wrote:

      Consider the following JavaScript source:

      [ /[/]/ /foo]/ + bar

      According to the ES3 spec, this is interpreted as:

      [ new RegExp("[") ] / new RegExp("foo]") + bar

      According to the ES3.1 draft spec, it is interpreted as:

      [ new RegExp("[\/]") / foo ] / +bar

      Apparently, Firefox and IE7 were lexing regexp literals in the way
      ES3.1 specifies. I had considered re-allowing regexp literals in
      Jacaranda 0.4, but this has scared me off doing so -- the potential
      for lexical confusion attacks is just too great.

      -- 
      David-Sarah Hopwood ⚥
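
For illustration, the difference between the two lexing rules can be sketched as a pair of scans over the example above. This is a minimal sketch, not text from either spec: `scanRegexBody` and `lexesDifferently` are hypothetical names, and the scanner only models the one difference discussed here (whether an unescaped `/` inside `[...]` ends the literal). A filter along the lines suggested above could reject any literal for which the two scans disagree.

```javascript
// Hypothetical helper: scans a regexp literal starting at src[start] === '/'.
// Returns the index just past the closing '/', or -1 if unterminated.
// classAware = false approximates the ES3 rule (first unescaped '/' ends the body).
// classAware = true approximates the ES3.1 rule ('/' inside [...] does not end it).
function scanRegexBody(src, start, classAware) {
  let inClass = false;
  for (let i = start + 1; i < src.length; i++) {
    const ch = src[i];
    if (ch === '\n' || ch === '\r') return -1;   // literals cannot span lines
    if (ch === '\\') { i++; continue; }          // skip the escaped character
    if (classAware) {
      if (ch === '[') { inClass = true; continue; }
      if (ch === ']') { inClass = false; continue; }
    }
    if (ch === '/' && !inClass) return i + 1;
  }
  return -1;
}

// True when the two rules disagree about where the literal ends.
function lexesDifferently(src, start) {
  return scanRegexBody(src, start, false) !== scanRegexBody(src, start, true);
}

const src = "[ /[/]/ /foo]/ + bar";
const start = src.indexOf('/');
const es3Literal = src.slice(start, scanRegexBody(src, start, false));
const es31Literal = src.slice(start, scanRegexBody(src, start, true));
console.log(es3Literal, es31Literal, lexesDifferently(src, start));
```

Under these assumptions the first scan stops after `/[/` and the second after `/[/]/`, which is the disagreement the two interpretations above come from.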


    • Brendan Eich
      Message 2 of 13 , Feb 9, 2009
        On Feb 9, 2009, at 9:42 AM, Marcel Laverdet wrote:

        > From what I remember this started out as a bug in IE and then
        > Firefox followed suit for compatibility which left the other
        > browsers with no choice.

        No, other browsers followed suit first.


        > I can't find the original bug but `/[/]/` only started parsing in
        > FF1.5, in FF1.0 it would throw a syntax error.

        https://bugzilla.mozilla.org/show_bug.cgi?id=309840

        Quoting from comment 0:

        Description From Jesse Ruderman 2005-09-23 19:33:25 PST

        Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.8b5) Gecko/20050923 Firefox/1.4

        Firefox gives an "unterminated character class" error on the regular
        expression /[/]/. Safari and Opera 8.5 accept it.

        I noticed this because a regular expression containing [/] appears in
        http://msdn.microsoft.com/workshop/code/browdata.js, which was mentioned in
        bug 309695 comment 9.

        /be
      • Douglas Crockford
        Message 3 of 13 , Feb 9, 2009
          David-Sarah Hopwood wrote:
          > Consider the following JavaScript source:
          >
          > [ /[/]/ /foo]/ + bar
          >
          > According to the ES3 spec, this is interpreted as:
          >
          > [ new RegExp("[") ] / new RegExp("foo]") + bar
          >
          > According to the ES3.1 draft spec, it is interpreted as:
          >
          > [ new RegExp("[\/]") / foo ] / +bar
          >
          > Apparently, Firefox and IE7 were lexing regexp literals in the way
          > ES3.1 specifies. I had considered re-allowing regexp literals in
          > Jacaranda 0.4, but this has scared me off doing so -- the potential
          > for lexical confusion attacks is just too great.

          ADsafe rejects [ /[/]/ /foo]/ + bar. Just because ECMAScript says it's ok doesn't
          mean that ADsafe must. ADsafe insists that all internal / must have \.
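
A rule of this kind can be sketched as a single strict scan. The code below is a hypothetical illustration, not ADsafe's implementation: `scanStrictRegex` and `accepts` are made-up names. It assumes the scanner tracks character classes only so that it can reject an unescaped `/` inside one, so every literal it accepts ends at the same place under both the ES3 and the ES3.1 rule.

```javascript
// Hypothetical strict scanner: accepts a regexp literal only if every '/'
// before the closing delimiter is preceded by a backslash, including inside
// a character class. Returns the end index, or throws on a rejected literal.
function scanStrictRegex(src, start) {
  let inClass = false;
  for (let i = start + 1; i < src.length; i++) {
    const ch = src[i];
    if (ch === '\n' || ch === '\r') break;       // literals cannot span lines
    if (ch === '\\') { i++; continue; }          // skip the escaped character
    if (ch === '[') {
      inClass = true;
    } else if (ch === ']') {
      inClass = false;
    } else if (ch === '/') {
      if (inClass) {
        throw new SyntaxError("unescaped '/' in character class at index " + i);
      }
      return i + 1;                              // closing delimiter
    }
  }
  throw new SyntaxError('unterminated regexp literal');
}

// Convenience wrapper: scans the first '/' in src as a regexp literal start.
function accepts(src) {
  try {
    scanStrictRegex(src, src.indexOf('/'));
    return true;
  } catch (e) {
    return false;
  }
}

const rejectsExample = !accepts("[ /[/]/ /foo]/ + bar");
const acceptsEscaped = accepts("[ /[\\/]/ / foo ]");
console.log(rejectsExample, acceptsEscaped);
```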
        • Mike Samuel
          Message 4 of 13 , Feb 9, 2009
            2009/2/9 Douglas Crockford <douglas@...>
            >
            > David-Sarah Hopwood wrote:
            > > Consider the following JavaScript source:
            > >
            > > [ /[/]/ /foo]/ + bar
            > >
            > > According to the ES3 spec, this is interpreted as:
            > >
            > > [ new RegExp("[") ] / new RegExp("foo]") + bar
            > >
            > > According to the ES3.1 draft spec, it is interpreted as:
            > >
            > > [ new RegExp("[\/]") / foo ] / +bar
            > >
            > > Apparently, Firefox and IE7 were lexing regexp literals in the way
            > > ES3.1 specifies. I had considered re-allowing regexp literals in
            > > Jacaranda 0.4, but this has scared me off doing so -- the potential
            > > for lexical confusion attacks is just too great.
            >
            > ADsafe rejects [ /[/]/ /foo]/ + bar. Just because ECMAScript says it's ok doesn't
            > mean that ADsafe must. ADsafe insists that all internal / must have \.

            Cajita disallows regex literals, but Valija uses the ES3.1 rule for
            lexing regexes and rewrites
            [ /[/]/ /foo]/ + bar
            to
            [ (new RegExp('[\\/]', '')) / foo ] / +bar
            though I think we do something to make sure it will always use the
            builtin RegExp ctor, again using the ES3.1 semantics.

            We rely on the builtin RegExp ctor to be callable with string
            arguments without introducing a breach, / after a close bracket token
            to always be interpreted as a div operator, and our normalization for
            string escapes to produce a string literal where the browser agrees
            with us where the string ends.
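
A rewrite of this shape can be sketched as follows. This is a hypothetical illustration, not Caja's code: `normalizeBody`, `toStringLiteral` and `rewriteRegexLiteral` are made-up names, and the sketch assumes the body and flags have already been extracted by a lexer using the ES3.1 rule. It escapes every bare `/` in the body, then emits a single-quoted string literal restricted to printable ASCII so that there is little room for disagreement about where the string ends.

```javascript
// Escape any '/' in the regexp body that is not already part of an escape.
function normalizeBody(body) {
  let out = '';
  for (let i = 0; i < body.length; i++) {
    const ch = body[i];
    if (ch === '\\' && i + 1 < body.length) {
      out += ch + body[++i];                     // keep existing escapes intact
    } else if (ch === '/') {
      out += '\\/';
    } else {
      out += ch;
    }
  }
  return out;
}

// Emit a single-quoted string literal: backslashes and quotes are escaped,
// and every code unit outside printable ASCII is written as \uXXXX.
function toStringLiteral(s) {
  const escaped = s
    .replace(/[\\']/g, '\\$&')
    .replace(/[^\x20-\x7e]/g,
      (c) => '\\u' + c.charCodeAt(0).toString(16).padStart(4, '0'));
  return "'" + escaped + "'";
}

// Build the source text of an equivalent RegExp constructor call.
function rewriteRegexLiteral(body, flags) {
  return '(new RegExp(' + toStringLiteral(normalizeBody(body)) +
         ', ' + toStringLiteral(flags) + '))';
}

const rewritten = rewriteRegexLiteral('[/]', '');
console.log(rewritten);
```

For the body `[/]` with no flags, this produces the same text as the rewritten form quoted above.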
          • Marcel Laverdet
            Message 5 of 13 , Feb 9, 2009

              My apologies.

              On Feb 9, 2009, at 10:54 AM, Brendan Eich wrote:
              No, other browsers followed suit first.

            • Brendan Eich
              Message 6 of 13 , Feb 10, 2009
                No need to apologize, and I did not aim to blame Opera or Safari in citing the record. This was not a situation where anyone fielding a browser compatible enough to get market share could break the de-facto standard that IE set in the wake of Netscape going down (Netscape 4 was based on early SpiderMonkey code, which did not allow / unescaped in a character class).

                /be

                On Feb 9, 2009, at 11:31 PM, Marcel Laverdet wrote:



                My apologies.

                On Feb 9, 2009, at 10:54 AM, Brendan Eich wrote:
                No, other browsers followed suit first.



              • David-Sarah Hopwood
                Message 7 of 13 , Feb 10, 2009
                  Marcel Laverdet wrote:
                  >
                  > From what I remember this started out as a bug in IE and then Firefox
                  > followed suit for compatibility which left the other browsers with no
                  > choice. I can't find the original bug but `/[/]/` only started parsing
                  > in FF1.5, in FF1.0 it would throw a syntax error.
                  >
                  > You could throw out any malformed regexp literals (any that differ
                  > between ES3 \ ES3.1) which is a fairly small subset and you would be ok.

                  I could, if I knew that there were no more bugs like this. Note that
                  lexical confusion attacks of this kind can easily be turned into complete
                  breaks of a subset implementation:

                  [ /[/]/ /alert('toast')]/ + 1

                  Verifier sees valid, harmless code:
                  [ new RegExp("[") ] / new RegExp("alert('toast')]") + 1

                  Browser runs exploit code:
                  [ new RegExp("[\/]") / alert('toast') ] / +1

                  Since there's no way that I could reliably have known about the IE lexer
                  bug, it's just too risky.

                  Anyone know of other bugs where common JS implementations lex or parse
                  valid ES3 code with a different meaning than specified? (The only one
                  I can think of right now is \v in IE, but at least that doesn't result
                  in a parse with a different structure.)

                  --
                  David-Sarah Hopwood ⚥
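
The browser-side reading can be observed directly with a harmless stand-in for alert. The snippet below is a sketch under stated assumptions: `stubAlert`, `calls` and `run` are hypothetical names, and it assumes an engine that lexes regexp literals by the ES3.1 rule (the rule later engines adopted). An engine following the ES3 rule would treat `alert('toast')` as regexp text and never call the stub.

```javascript
// Evaluate the example with a stub in place of alert, so we can observe
// whether the engine treats alert('toast') as a call or as regexp text.
const calls = [];
const stubAlert = (msg) => { calls.push(msg); return 1; };

const source = "[ /[/]/ /alert('toast')]/ + 1";

// new Function compiles the text with the host engine's own lexer.
const run = new Function('alert', 'return ' + source + ';');
const result = run(stubAlert);

// With ES3.1-style lexing the expression is
//   [ /[/]/ / alert('toast') ] / +1
// so the stub runs once and the arithmetic yields NaN.
const stubWasCalled = calls.length === 1;
console.log(stubWasCalled, calls, result);
```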
                • David-Sarah Hopwood
                  Message 8 of 13 , Feb 10, 2009
                    Brendan Eich wrote:
                    > On Feb 9, 2009, at 9:42 AM, Marcel Laverdet wrote:
                    >
                    >> From what I remember this started out as a bug in IE and then
                    >> Firefox followed suit for compatibility which left the other
                    >> browsers with no choice.
                    >
                    > No, other browsers followed suit first.
                    >
                    >> I can't find the original bug but `/[/]/` only started parsing in
                    >> FF1.5, in FF1.0 it would throw a syntax error.
                    >
                    > https://bugzilla.mozilla.org/show_bug.cgi?id=309840

                    <https://bugzilla.mozilla.org/show_bug.cgi?id=309840#c12>

                    # This fixes a highly dup'ed IE compatibility bug. It's an extension
                    # to ECMA syntax that's allowed by Section 16. I'm approving it so
                    # that we can get it into 1.8b5 / Firefox 1.5b2.

                    As the example in my first post demonstrated, it is absolutely not
                    correct that this was an allowed Section 16 extension. That section
                    allows lexical extensions only if a program does not match the lexical
                    grammar in the spec (in this case using the ES3 definition of
                    RegularExpressionLiteral), and allows regexp syntax extensions only
                    if the resulting RegularExpressionBody does not match Pattern.

                    In fact this just makes me even more worried: it seems that Section 16
                    is being misinterpreted in a way that prevents independently developed
                    parsers, implemented strictly from the spec, from being able to match the
                    parsing behaviour of browsers on syntactically valid ES3 code. Is this
                    just a one-off mistake, or is Section 16 consistently being interpreted
                    too loosely?

                    --
                    David-Sarah Hopwood ⚥
                  • David-Sarah Hopwood
                    Message 9 of 13 , Feb 10, 2009
                      Douglas Crockford wrote:
                      > David-Sarah Hopwood wrote:
                      >> Consider the following JavaScript source:
                      >>
                      >> [ /[/]/ /foo]/ + bar
                      >>
                      >> According to the ES3 spec, this is interpreted as:
                      >>
                      >> [ new RegExp("[") ] / new RegExp("foo]") + bar
                      >>
                      >> According to the ES3.1 draft spec, it is interpreted as:
                      >>
                      >> [ new RegExp("[\/]") / foo ] / +bar
                      >>
                      >> Apparently, Firefox and IE7 were lexing regexp literals in the way
                      >> ES3.1 specifies. I had considered re-allowing regexp literals in
                      >> Jacaranda 0.4, but this has scared me off doing so -- the potential
                      >> for lexical confusion attacks is just too great.
                      >
                      > ADsafe rejects [ /[/]/ /foo]/ + bar. Just because ECMAScript says it's ok doesn't
                      > mean that ADsafe must. ADsafe insists that all internal / must have \.

                      I'm confused -- how does it know that the middle '/' in "/[/]/" is
                      "internal"? Is it lexing according to the intersection of Pattern
                      from section 15.10.1, and RegularExpressionBody?

                      --
                      David-Sarah Hopwood ⚥
                    • Brendan Eich
                      Message 10 of 13 , Feb 10, 2009
                        On Feb 10, 2009, at 6:34 AM, David-Sarah Hopwood wrote:
                        > Brendan Eich wrote:
                        > > On Feb 9, 2009, at 9:42 AM, Marcel Laverdet wrote:
                        > >
                        > >> From what I remember this started out as a bug in IE and then
                        > >> Firefox followed suit for compatibility which left the other
                        > >> browsers with no choice.
                        > >
                        > > No, other browsers followed suit first.
                        > >
                        > >> I can't find the original bug but `/[/]/` only started parsing in
                        > >> FF1.5, in FF1.0 it would throw a syntax error.
                        > >
                        > > https://bugzilla.mozilla.org/show_bug.cgi?id=309840
                        >
                        > <https://bugzilla.mozilla.org/show_bug.cgi?id=309840#c12>
                        >
                        > # This fixes a highly dup'ed IE compatibility bug. It's an extension
                        > # to ECMA syntax that's allowed by Section 16. I'm approving it so
                        > # that we can get it into 1.8b5 / Firefox 1.5b2.
                        >
                        > As the example in my first post demonstrated, it is absolutely not
                        > correct that this was an allowed Section 16 extension.
                        >
                        You're right, but so what? The IE bug and monopoly combined to create
                        a de-facto standard. Appealing to the de-jure standard does you no
                        good, and correcting my 2005-era misunderstanding (you've corrected it
                        more recently in es-discuss) does not change the de-facto standard
                        trumping the de-jure one.

                        > In fact this just makes me even more worried: it seems that Section 16
                        > is being misinterpreted in a way that prevents independently developed
                        > parsers, implemented strictly from the spec, from being able to match the
                        > parsing behaviour of browsers on syntactically valid ES3 code. Is this
                        > just a one-off mistake, or is Section 16 consistently being interpreted
                        > too loosely?
                        >

                        This has nothing to do with Section 16 or my former misunderstanding
                        of it, and everything to do with IE forcing a de-facto standard. As
                        far as I know, no one at Microsoft added the bug allowing unescaped /
                        in a character class by arguing based on a misinterpretation of
                        Section 16. I think you are barking up the wrong tree.

                        /be
                        >
                        >
                        > --
                        > David-Sarah Hopwood ⚥
                        >
                        >
                        >
                      • Mike Samuel
                        Message 11 of 13 , Feb 10, 2009
                          2009/2/10 David-Sarah Hopwood <david.hopwood@...>:
                          > Marcel Laverdet wrote:
                          >>
                          >> From what I remember this started out as a bug in IE and then Firefox
                          >> followed suit for compatibility which left the other browsers with no
                          >> choice. I can't find the original bug but `/[/]/` only started parsing
                          >> in FF1.5, in FF1.0 it would throw a syntax error.
                          >>
                          >> You could throw out any malformed regexp literals (any that differ
                          >> between ES3 \ ES3.1) which is a fairly small subset and you would be ok.
                          >
                          > I could, if I knew that there were no more bugs like this. Note that
                          > lexical confusion attacks of this kind can easily be turned into complete
                          > breaks of a subset implementation:
                          >
                          > [ /[/]/ /alert('toast')]/ + 1
                          >
                          > Verifier sees valid, harmless code:
                          > [ new RegExp("[") ] / new RegExp("alert('toast')]") + 1
                          >
                          > Browser runs exploit code:
                          > [ new RegExp("[\/]") / alert('toast') ] / +1
                          >
                          > Since there's no way that I could reliably have known about the IE lexer
                          > bug, it's just too risky.
                          >
                          > Anyone know of other bugs where common JS implementations lex or parse
                          > valid ES3 code with a different meaning than specified? (The only one
                          > I can think of right now is \v in IE, but at least that doesn't result
                          > in a parse with a different structure.)

                          Plenty. But I suspect you know of them. There's conditional
                          compilation comments /* @cc_on */,
                          and there's the newlines in block comments thing:
                              return /*
                              */ foo();
                          and there's format control characters between pairs like */ and \".
                          There's other tricks you can do with \u escapes in identifiers and NUL
                          and BOM characters in source.



                          > --
                          > David-Sarah Hopwood ⚥
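
The block-comment case can be checked in any engine with a few lines. This is a minimal sketch with hypothetical names (`foo`, `fooCalls`, `f`). It relies on the ES3 rule that a multi-line comment containing a line terminator counts as a line terminator for the syntactic grammar, so automatic semicolon insertion ends the `return` statement before `foo()` is reached. An engine that ignores the newline inside the comment would call `foo` and return its value instead.

```javascript
// Count calls so we can tell whether foo() was reached.
let fooCalls = 0;
const foo = () => { fooCalls++; return 'called'; };

// The function body contains a block comment that spans a newline
// between `return` and the call expression.
const body = 'return /*\n*/ foo();';
const f = new Function('foo', body);
const value = f(foo);

// Spec-conforming parse: `return;` followed by an unreachable `foo();`.
const conforming = value === undefined && fooCalls === 0;
console.log(conforming, value, fooCalls);
```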
                        • Brendan Eich
                          Message 12 of 13 , Feb 10, 2009
                            On Feb 10, 2009, at 6:36 PM, Mike Samuel wrote:
                            > and there's the newlines in block comments thing return /*
                            > */ foo();
                            >

                            Fixed in Firefox 3.1 beta nightlies:

                            https://bugzilla.mozilla.org/show_bug.cgi?id=475834

                            We could push the fix back into a 3.0.x maintenance release if it
                            would help. Anyone with https://bugzilla.mozilla.org editbugs
                            permission who wants this, feel free to nominate the patch for approval.

                            /be