Re: [caplet] Code that lexes differently in ES3 vs ES3.1

  • Douglas Crockford
    Message 1 of 13, Feb 9, 2009
      David-Sarah Hopwood wrote:
      > Consider the following JavaScript source:
      >
      > [ /[/]/ /foo]/ + bar
      >
      > According to the ES3 spec, this is interpreted as:
      >
      > [ new RegExp("[") ] / new RegExp("foo]") + bar
      >
      > According to the ES3.1 draft spec, it is interpreted as:
      >
      > [ new RegExp("[\/]") / foo ] / +bar
      >
      > Apparently, Firefox and IE7 were lexing regexp literals in the way
      > ES3.1 specifies. I had considered re-allowing regexp literals in
      > Jacaranda 0.4, but this has scared me off doing so -- the potential
      > for lexical confusion attacks is just too great.

      ADsafe rejects [ /[/]/ /foo]/ + bar. Just because ECMAScript says it's ok doesn't
      mean that ADsafe must. ADsafe insists that all internal / must have \.
    • Mike Samuel
      Message 2 of 13, Feb 9, 2009
        2009/2/9 Douglas Crockford <douglas@...>
        >
        > David-Sarah Hopwood wrote:
        > > Consider the following JavaScript source:
        > >
        > > [ /[/]/ /foo]/ + bar
        > >
        > > According to the ES3 spec, this is interpreted as:
        > >
        > > [ new RegExp("[") ] / new RegExp("foo]") + bar
        > >
        > > According to the ES3.1 draft spec, it is interpreted as:
        > >
        > > [ new RegExp("[\/]") / foo ] / +bar
        > >
        > > Apparently, Firefox and IE7 were lexing regexp literals in the way
        > > ES3.1 specifies. I had considered re-allowing regexp literals in
        > > Jacaranda 0.4, but this has scared me off doing so -- the potential
        > > for lexical confusion attacks is just too great.
        >
        > ADsafe rejects [ /[/]/ /foo]/ + bar. Just because ECMAScript says it's ok doesn't
        > mean that ADsafe must. ADsafe insists that all internal / must have \.

        Cajita disallows regex literals, but Valija uses the ES3.1 rule for
        lexing regexs and rewrites
        [ /[/]/ /foo]/ + bar
        to
        [ (new RegExp('[\\/]', '')) / foo ] / +bar
        though I think we do something to make sure it will always use the
        builtin RegExp ctor, again using the ES3.1 semantics.

        We rely on the builtin RegExp ctor to be callable with string
        arguments without introducing a breach, / after a close bracket token
        to always be interpreted as a div operator, and our normalization for
        string escapes to produce a string literal where the browser agrees
        with us where the string ends.
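
The rewrite described above can be sketched roughly as follows. This is only an illustration, not Caja's actual code: the names toStringLiteral and rewriteRegexLiteral are hypothetical, and the sketch assumes the regexp body and flags have already been isolated by a lexer using the ES3.1 rule. It escapes every character that could affect where a string literal ends, so that the emitted text is plain printable ASCII.

```javascript
// Hypothetical helper: emit a single-quoted string literal using only
// printable ASCII, so every lexer should agree on where it ends.
function toStringLiteral(s) {
  let out = "'";
  for (let i = 0; i < s.length; i++) {
    const ch = s[i];
    const code = s.charCodeAt(i);
    if (ch === '\\') {
      out += '\\\\';
    } else if (ch === "'") {
      out += "\\'";
    } else if (code < 0x20 || code > 0x7e) {
      // Control characters, line terminators and non-ASCII become \uXXXX.
      out += '\\u' + code.toString(16).padStart(4, '0');
    } else {
      out += ch;
    }
  }
  return out + "'";
}

// Hypothetical helper: turn an already-lexed regexp body and flags into
// the source text of a RegExp constructor call.
function rewriteRegexLiteral(body, flags) {
  let pattern = '';
  for (let i = 0; i < body.length; i++) {
    const ch = body[i];
    if (ch === '\\' && i + 1 < body.length) {
      // Keep existing escape sequences as they are.
      pattern += ch + body[i + 1];
      i++;
    } else if (ch === '/') {
      // Escape a bare '/' (only possible inside a character class).
      pattern += '\\/';
    } else {
      pattern += ch;
    }
  }
  return '(new RegExp(' + toStringLiteral(pattern) + ', ' +
      toStringLiteral(flags) + '))';
}

console.log(rewriteRegexLiteral('[/]', ''));
// prints: (new RegExp('[\\/]', ''))
```

The sketch does not address the other point in the message, making sure the call always reaches the builtin RegExp constructor.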
      • Marcel Laverdet
        Message 3 of 13, Feb 9, 2009

          My apologies.

          On Feb 9, 2009, at 10:54 AM, Brendan Eich wrote:
          No, other browsers followed suit first.

        • Brendan Eich
          Message 4 of 13, Feb 10, 2009
            No need to apologize, and I did not aim to blame Opera or Safari in citing the record. This was not a situation where anyone fielding a browser compatible enough to get market share could break the de-facto standard that IE set in the wake of Netscape going down (Netscape 4 was based on early SpiderMonkey code, which did not allow / unescaped in a character class).

            /be

            On Feb 9, 2009, at 11:31 PM, Marcel Laverdet wrote:



            My apologies.

            On Feb 9, 2009, at 10:54 AM, Brendan Eich wrote:
            No, other browsers followed suit first.



          • David-Sarah Hopwood
            Message 5 of 13, Feb 10, 2009
              Marcel Laverdet wrote:
              >
              > From what I remember this started out as a bug in IE and then Firefox
              > followed suit for compatibility which left the other browsers with no
              > choice. I can't find the original bug but `/[/]/` only started parsing
              > in FF1.5, in FF1.0 it would throw a syntax error.
              >
              > You could throw out any malformed regexp literals (any that differ
              > between ES3 and ES3.1) which is a fairly small subset and you would be ok.

              I could, if I knew that there were no more bugs like this. Note that
              lexical confusion attacks of this kind can easily be turned into complete
              breaks of a subset implementation:

              [ /[/]/ /alert('toast')]/ + 1

              Verifier sees valid, harmless code:
              [ new RegExp("[") ] / new RegExp("alert('toast')]") + 1

              Browser runs exploit code:
              [ new RegExp("[\/]") / alert('toast') ] / +1

              Since there's no way that I could reliably have known about the IE lexer
              bug, it's just too risky.

              Anyone know of other bugs where common JS implementations lex or parse
              valid ES3 code with a different meaning than specified? (The only one
              I can think of right now is \v in IE, but at least that doesn't result
              in a parse with a different structure.)

              --
              David-Sarah Hopwood ⚥
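
To make the divergence concrete, here is a minimal sketch of the two scanning rules applied to the example from this thread. It is not taken from any verifier or browser; the names scanRegexEs3, scanRegexEs31 and lexesDifferently are hypothetical, and flags and most line-terminator handling are left out. Under the ES3 rule the first unescaped / ends the literal; under the ES3.1 draft rule a / inside a character class does not.

```javascript
// ES3 rule (simplified): the literal ends at the first unescaped '/'.
// Returns the index just past the closing '/'.
function scanRegexEs3(src, start) {
  let i = start + 1;
  while (i < src.length) {
    const c = src[i];
    if (c === '\\') { i += 2; continue; }  // skip the escaped character
    if (c === '\n') break;                  // literals cannot span lines
    if (c === '/') return i + 1;
    i++;
  }
  throw new SyntaxError('unterminated regexp literal');
}

// ES3.1 draft rule (simplified): a '/' inside [...] does not end the literal.
function scanRegexEs31(src, start) {
  let i = start + 1;
  let inClass = false;
  while (i < src.length) {
    const c = src[i];
    if (c === '\\') { i += 2; continue; }
    if (c === '\n') break;
    if (c === '[') inClass = true;
    else if (c === ']') inClass = false;
    else if (c === '/' && !inClass) return i + 1;
    i++;
  }
  throw new SyntaxError('unterminated regexp literal');
}

// True when the two rules disagree about where the literal ends
// (including the case where only one of them finds an end at all).
function lexesDifferently(src, start) {
  const end = (scan) => {
    try { return scan(src, start); } catch (e) { return -1; }
  };
  return end(scanRegexEs3) !== end(scanRegexEs31);
}

const src = "[ /[/]/ /alert('toast')]/ + 1";
console.log(src.slice(2, scanRegexEs3(src, 2)));   // "/[/"
console.log(src.slice(2, scanRegexEs31(src, 2)));  // "/[/]/"
console.log(lexesDifferently(src, 2));             // true
```

A verifier could use a check like lexesDifferently to reject any literal on which the two rules disagree, as Marcel suggested. That would only cover this one known discrepancy, which is the concern raised in the message.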
            • David-Sarah Hopwood
              Message 6 of 13, Feb 10, 2009
                Brendan Eich wrote:
                > On Feb 9, 2009, at 9:42 AM, Marcel Laverdet wrote:
                >
                >> From what I remember this started out as a bug in IE and then
                >> Firefox followed suit for compatibility which left the other
                >> browsers with no choice.
                >
                > No, other browsers followed suit first.
                >
                >> I can't find the original bug but `/[/]/` only started parsing in
                >> FF1.5, in FF1.0 it would throw a syntax error.
                >
                > https://bugzilla.mozilla.org/show_bug.cgi?id=309840

                <https://bugzilla.mozilla.org/show_bug.cgi?id=309840#c12>

                # This fixes a highly dup'ed IE compatibility bug. It's an extension
                # to ECMA syntax that's allowed by Section 16. I'm approving it so
                # that we can get it into 1.8b5 / Firefox 1.5b2.

                As the example in my first post demonstrated, it is absolutely not
                correct that this was an allowed Section 16 extension. That section
                allows lexical extensions only if a program does not match the lexical
                grammar in the spec (in this case using the ES3 definition of
                RegularExpressionLiteral), and allows regexp syntax extensions only
                if the resulting RegularExpressionBody does not match Pattern.

                In fact this just makes me even more worried: it seems that Section 16
                is being misinterpreted in a way that prevents independently developed
                parsers, implemented strictly from the spec, from being able to match the
                parsing behaviour of browsers on syntactically valid ES3 code. Is this
                just a one-off mistake, or is Section 16 consistently being interpreted
                too loosely?

                --
                David-Sarah Hopwood ⚥
              • David-Sarah Hopwood
                Message 7 of 13, Feb 10, 2009
                  Douglas Crockford wrote:
                  > David-Sarah Hopwood wrote:
                  >> Consider the following JavaScript source:
                  >>
                  >> [ /[/]/ /foo]/ + bar
                  >>
                  >> According to the ES3 spec, this is interpreted as:
                  >>
                  >> [ new RegExp("[") ] / new RegExp("foo]") + bar
                  >>
                  >> According to the ES3.1 draft spec, it is interpreted as:
                  >>
                  >> [ new RegExp("[\/]") / foo ] / +bar
                  >>
                  >> Apparently, Firefox and IE7 were lexing regexp literals in the way
                  >> ES3.1 specifies. I had considered re-allowing regexp literals in
                  >> Jacaranda 0.4, but this has scared me off doing so -- the potential
                  >> for lexical confusion attacks is just too great.
                  >
                  > ADsafe rejects [ /[/]/ /foo]/ + bar. Just because ECMAScript says it's ok doesn't
                  > mean that ADsafe must. ADsafe insists that all internal / must have \.

                  I'm confused -- how does it know that the middle '/' in "/[/]/" is
                  "internal"? Is it lexing according to the intersection of Pattern
                  from section 15.10.1, and RegularExpressionBody?

                  --
                  David-Sarah Hopwood ⚥
                • Brendan Eich
                  Message 8 of 13, Feb 10, 2009
                    On Feb 10, 2009, at 6:34 AM, David-Sarah Hopwood wrote:
                    > Brendan Eich wrote:
                    > > On Feb 9, 2009, at 9:42 AM, Marcel Laverdet wrote:
                    > >
                    > >> From what I remember this started out as a bug in IE and then
                    > >> Firefox followed suit for compatibility which left the other
                    > >> browsers with no choice.
                    > >
                    > > No, other browsers followed suit first.
                    > >
                    > >> I can't find the original bug but `/[/]/` only started parsing in
                    > >> FF1.5, in FF1.0 it would throw a syntax error.
                    > >
                    > > https://bugzilla.mozilla.org/show_bug.cgi?id=309840
                    >
                    > <https://bugzilla.mozilla.org/show_bug.cgi?id=309840#c12>
                    >
                    > # This fixes a highly dup'ed IE compatibility bug. It's an extension
                    > # to ECMA syntax that's allowed by Section 16. I'm approving it so
                    > # that we can get it into 1.8b5 / Firefox 1.5b2.
                    >
                    > As the example in my first post demonstrated, it is absolutely not
                    > correct that this was an allowed Section 16 extension.
                    >
                    You're right, but so what? The IE bug and monopoly combined to create
                    a de-facto standard. Appealing to the de-jure standard does you no
                    good, and correcting my 2005-era misunderstanding (you've corrected it
                    more recently in es-discuss) does not change the de-facto standard
                    trumping the de-jure one.

                    > In fact this just makes me even more worried: it seems that Section 16
                    > is being misinterpreted in a way that prevents independently developed
                    > parsers, implemented strictly from the spec, from being able to match the
                    > parsing behaviour of browsers on syntactically valid ES3 code. Is this
                    > just a one-off mistake, or is Section 16 consistently being interpreted
                    > too loosely?
                    >

                    This has nothing to do with Section 16 or my former misunderstanding
                    of it, and everything to do with IE forcing a de-facto standard. As
                    far as I know, no one at Microsoft added the bug allowing unescaped /
                    in a character class by arguing based on a misinterpretation of
                    Section 16. I think you are barking up the wrong tree.

                    /be
                    >
                    >
                    > --
                    > David-Sarah Hopwood ⚥
                    >
                    >
                    >
                  • Mike Samuel
                    Message 9 of 13, Feb 10, 2009
                      2009/2/10 David-Sarah Hopwood <david.hopwood@...>:
                      > Marcel Laverdet wrote:
                      >>
                      >> From what I remember this started out as a bug in IE and then Firefox
                      >> followed suit for compatibility which left the other browsers with no
                      >> choice. I can't find the original bug but `/[/]/` only started parsing
                      >> in FF1.5, in FF1.0 it would throw a syntax error.
                      >>
                      >> You could throw out any malformed regexp literals (any that differ
                      >> between ES3 and ES3.1) which is a fairly small subset and you would be ok.
                      >
                      > I could, if I knew that there were no more bugs like this. Note that
                      > lexical confusion attacks of this kind can easily be turned into complete
                      > breaks of a subset implementation:
                      >
                      > [ /[/]/ /alert('toast')]/ + 1
                      >
                      > Verifier sees valid, harmless code:
                      > [ new RegExp("[") ] / new RegExp("alert('toast')]") + 1
                      >
                      > Browser runs exploit code:
                      > [ new RegExp("[\/]") / alert('toast') ] / +1
                      >
                      > Since there's no way that I could reliably have known about the IE lexer
                      > bug, it's just too risky.
                      >
                      > Anyone know of other bugs where common JS implementations lex or parse
                      > valid ES3 code with a different meaning than specified? (The only one
                      > I can think of right now is \v in IE, but at least that doesn't result
                      > in a parse with a different structure.)

                      Plenty. But I suspect you know of them. There's conditional
                      compilation comments /* @cc_on */,
                      and there's the newlines in block comments thing return /*
                      */ foo();
                      and there's format control characters between pairs like */ and \".
                      There's other tricks you can do with \u escapes in identifiers and NUL
                      and BOM characters in source.



                      > --
                      > David-Sarah Hopwood ⚥
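
A conservative pre-filter for the constructs listed above might look something like the sketch below. It is an assumption-laden illustration rather than how any of the tools in this thread work: findRiskyConstructs and the label strings are hypothetical, and because it uses plain pattern matching instead of a real lexer it will also flag harmless text inside string literals. It needs an engine that supports the \p{...} Unicode property escape.

```javascript
// Hypothetical pre-filter: returns labels for constructs that some
// engines are reported to lex differently. It deliberately over-rejects.
function findRiskyConstructs(src) {
  const findings = [];

  // IE conditional compilation: @cc_on, or a comment opening with '@'.
  if (/@cc_on|\/[*\/]@/.test(src)) {
    findings.push('conditional-compilation');
  }

  // Block comments that contain a line terminator.
  const blockComments = src.match(/\/\*[\s\S]*?\*\//g) || [];
  if (blockComments.some((c) => /[\n\r\u2028\u2029]/.test(c))) {
    findings.push('newline-in-block-comment');
  }

  // Unicode format-control characters (general category Cf), which
  // includes the byte order mark U+FEFF.
  if (/\p{Cf}/u.test(src)) {
    findings.push('format-control-character');
  }

  // NUL characters anywhere in the source.
  if (src.includes('\u0000')) {
    findings.push('nul-character');
  }

  // Any \uXXXX escape; a real lexer would be needed to tell
  // identifiers apart from string contents.
  if (/\\u[0-9a-fA-F]{4}/.test(src)) {
    findings.push('unicode-escape');
  }

  return findings;
}

console.log(findRiskyConstructs('return /*\n*/ foo();'));
// prints: [ 'newline-in-block-comment' ]
```

Rejecting anything this function reports would shrink the accepted language, which trades expressiveness for fewer ways to disagree with the browser.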
                    • Brendan Eich
                      Message 10 of 13, Feb 10, 2009
                        On Feb 10, 2009, at 6:36 PM, Mike Samuel wrote:
                        > and there's the newlines in block comments thing return /*
                        > */ foo();
                        >

                        Fixed in Firefox 3.1 beta nightlies:

                        https://bugzilla.mozilla.org/show_bug.cgi?id=475834

                        We could push the fix back into a 3.0.x maintenance release if it
                        would help. Anyone with https://bugzilla.mozilla.org editbugs
                        permission who wants this, feel free to nominate the patch for approval.

                        /be