
Re: [caplet] Re: ADsafe validation

  • Kris Zyp
    Message 1 of 16 , Mar 18, 2008
      Here is my attempt at an ADsafe validator:
      http://www.persvr.org/test/capability-validate.html
      Let me know if anyone can find any false acceptances (scripts that get successfully eval'ed that are unsafe).
      You can download the validator at:
      http://www.persvr.org/jsclient/capability-validate.js
      It is about 5K uncompressed, probably about 2-3K compressed. I would presume that it is also a lot faster, since it uses simpler regex-based checking rather than full AST parsing. The small size and probable speed improvement seem likely to improve the odds of ADsafe adoption. IMHO, 34K is a bit heavy, but a quick 3K validator that can validate ads would lower the barriers to people using this stuff.
      There are some known limitations that cause false rejections (which do not represent a security concern):
      • Nesting vars inside statement blocks will not declare the variable for the outer block.
      • Named functions are not treated as declarations, so they are generally not allowed unless the name is declared with a var.
      • Var declarations that involve multiple comma-delimited variable assignments are not accepted.
      So I suppose it validates a slight subset of ADsafe, but these limitations are minor, I think.
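A minimal sketch of the three shapes listed above, for readers who want something concrete to paste into the validator page. The function names are hypothetical, and the `rewritten` workaround is only a guess at forms the validator would accept; the actual rules are whatever capability-validate.js implements.

```javascript
// Each function below is ordinary ES3 that runs fine in any engine, yet uses
// one of the three shapes described as (harmless) false rejections.

// 1. A var nested inside a statement block, used in the enclosing function.
function nestedVar(flag) {
    if (flag) {
        var label = "inner"; // the engine hoists this to function scope
    }
    return label;
}

// 2. A named function declaration whose name is not declared with var.
function namedDeclaration() {
    function helper() { return 42; }
    return helper();
}

// 3. One var statement with several comma-delimited assignments.
function multiVar() {
    var a = 1, b = 2;
    return a + b;
}

// Assumed workaround: one var per statement, declared at the top of the
// function, with the helper bound to a var-declared name.
function rewritten(flag) {
    var label;
    var helper = function () { return 42; };
    var a = 1;
    var b = 2;
    if (flag) { label = "inner"; }
    return [label, helper(), a + b];
}
```

Both versions compute the same values; only the declaration style differs.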
      If this proves to be a workable validation technique, I will probably try to get this in Dojo. Let me know what you think.
      Thanks,
      Kris
       
      ----- Original Message -----
      Sent: Monday, March 17, 2008 11:05 AM
      Subject: [caplet] Re: ADsafe validation

      Results of a quick experiment: Pulling stuff out of JSLint.js that is
      not needed for ADsafe validation of JavaScript produced an adsafe.js
      file that is 34K. I expect that an heroic rewrite could do better.

    • Douglas Crockford
      Message 2 of 16 , Mar 20, 2008
        --- In caplet@yahoogroups.com, "Kris Zyp" <kris@...> wrote:
        >
        > Here is my attempt at an ADsafe validator:
        > http://www.persvr.org/test/capability-validate.html
        > Let me know if anyone can find any false acceptances (scripts that
        > get successfully eval'ed that are unsafe).
        > You can download the validator at:
        > http://www.persvr.org/jsclient/capability-validate.js
        > It is about 5K uncompressed, probably about 2-3K compressed. I would
        > presume that it is also a lot faster since it is using simpler
        > regex-based checking rather than full AST parsing. This small size and
        > probable improved speed seems like it would improve the odds of
        > adoption of ADsafe. IMHO, 34K is a bit heavy, but a quick 3K validator
        > that can validate ads, would lower the barriers to people using this
        > stuff.


        I have reservations about extensive use of regular expressions for
        validation. In the json.js case, I started thinking that a single
        regexp should do the job. It has since grown to four, and was still
        vulnerable to a screw-up in Firefox. RegExp doesn't have enough
        context to make me confident.

        In your case, I think you might have a problem with comment deletion.
        Lacking context, the regexps can be confused.

        /\/*\//.test("*/");
        /* // */
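To make the comment-deletion concern concrete, here is a small sketch of a context-free comment stripper being fooled by comment markers inside string literals. `naiveStripComments` is a hypothetical stand-in, not the code in capability-validate.js, and `evil()` is a placeholder for whatever an attacker would want to hide.

```javascript
// Naive comment stripper: removes anything that looks like /* ... */ or
// // ... without knowing whether it sits inside a string or regexp literal.
function naiveStripComments(src) {
    return src
        .replace(/\/\*[\s\S]*?\*\//g, "")
        .replace(/\/\/[^\n]*/g, "");
}

// To a JavaScript engine this is two string literals with a call between them.
var src = 'var a = "/*"; evil(); var b = "*/";';

// To the stripper, everything from the first /* to the first */ is a comment,
// so the call disappears from the text that would then be validated.
var seenByValidator = naiveStripComments(src);
// seenByValidator is now: var a = "";
```

The validator would approve the stripped text while `eval` runs the original, which is the kind of false acceptance the thread is looking for.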
      • Kris Zyp
        Message 3 of 16 , Mar 20, 2008
          > I have reservations about extensive use of regular expressions for
          > validation. In the json.js case, I started thinking that a single
          > regexp should do the job. It has since grown to four, and was still
          > vulnerable to a screw-up in Firefox. RegExp doesn't have enough
          > context to make me confident.
           
          Yes, regular expression based validation does seem improbable. However, it seems like you could also make an argument that it is easier to reason about and have confidence in a simple 5K chunk of code than in a 34K module. A large module has more room for human error. Anyway, I understand your skepticism, but I don't want to dismiss this approach yet based solely on feelings of uncertainty. So far the problems have been fixable.
           
          > In your case, I think you might have a problem with comment
          > deletion. Lacking context, the regexps can be confused.
          >
          > /\/*\//.test("*/");
          > /* // */
           
          Thanks, yes I did have a problem. Those should be fixed now.
           
          Kris
        • Adam Barth
          Message 4 of 16 , Mar 20, 2008
            On Thu, Mar 20, 2008 at 7:48 AM, Kris Zyp <kris@...> wrote:
            > > /\/*\//.test("*/");
            > > /* // */
            >
            > Thanks, yes I did have a problem. Those should be fixed now.

            Do we have a regression test suite of tricky examples? For instance,
            I don't see the string "cc_on" in Kris' validator, but that feature
            tripped up ADsafe a few months ago. I could rewrite a test case for
            that (or dig through the list archives), but it's probably a better
            approach to have a test suite that we can run against ADsafe
            validators, both to catch regressions as they are modified and to
            build confidence in new implementations.

            Adam
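One possible shape for such a suite, sketched under assumptions: `validate` is a hypothetical `function (source)` returning `true` when a script is accepted, and `toyValidate` is a deliberately crude stand-in so the harness can be run on its own. Neither is the interface of ADsafe or of Kris' validator, and the two cases are only illustrative.

```javascript
// Run every case through a validator and collect the names of the cases
// whose outcome differs from the expected one.
function runSuite(validate, cases) {
    var failures = [];
    for (var i = 0; i < cases.length; i += 1) {
        var c = cases[i];
        var accepted;
        try {
            accepted = validate(c.source) === true;
        } catch (e) {
            accepted = false; // a throwing validator counts as a rejection
        }
        if (accepted !== c.shouldAccept) {
            failures.push(c.name);
        }
    }
    return failures;
}

// Illustrative cases only; a real suite would collect the tricky inputs
// reported on the list over time.
var cases = [
    { name: "plain assignment", source: "var x = 1;", shouldAccept: true },
    { name: "cc_on directive", source: "/*@cc_on @*/ var x = 1;", shouldAccept: false }
];

// Toy validator for demonstration: rejects any source containing '@'.
function toyValidate(source) {
    return source.indexOf("@") === -1;
}

var failures = runSuite(toyValidate, cases); // empty when every case matches
```

Because the harness only needs a function from source text to a boolean, the same case list could be pointed at different validator implementations.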
          • Kris Zyp
            Message 5 of 16 , Mar 21, 2008
              > Do we have a regression test suite of tricky examples?
               
              That would be awesome.

              > For instance, I don't see the string "cc_on" in Kris' validator, but
              > that feature tripped up ADsafe a few months ago.

              Thanks for the heads-up; fixed it.

              Thanks,

              Kris

            • Mike Samuel
              Message 6 of 16 , Mar 21, 2008
                On 21/03/2008, Kris Zyp <kris@...> wrote:
                >
                > > Do we have a regression test suite of tricky examples?
                >
                > That would be awesome.
                >
                > > For instance, I don't see the string "cc_on" in Kris' validator, but that feature
                > > tripped up ADsafe a few months ago.
                >
                > Thanks for the heads, fixed it.

                Can you disallow @ outside of string literals entirely?

                What if ADSafe code is included in a container that has @cc_on, and
                does an @set that overrides a variable defined in the container?
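A rough sketch of what "disallow @ outside of string literals" could look like as a check. `hasAtOutsideStrings` is a hypothetical helper, not part of any validator in this thread. It tracks only single- and double-quoted strings, so a quote character inside a comment or regexp literal would confuse it; that is the same missing-context problem raised earlier about regex-based checks.

```javascript
// Return true if '@' appears anywhere outside a quoted string literal.
// Assumption: comments and regexp literals are NOT recognised here.
function hasAtOutsideStrings(src) {
    var quote = null; // quote character of the string we are inside, if any
    for (var i = 0; i < src.length; i += 1) {
        var ch = src.charAt(i);
        if (quote !== null) {
            if (ch === "\\") {
                i += 1; // skip the escaped character
            } else if (ch === quote) {
                quote = null; // end of the string literal
            }
        } else if (ch === '"' || ch === "'") {
            quote = ch; // start of a string literal
        } else if (ch === "@") {
            return true;
        }
    }
    return false;
}

var inString = hasAtOutsideStrings('var s = "a@b";');  // false
var inDirective = hasAtOutsideStrings("/*@cc_on @*/"); // true
```

Note that this sketch also flags '@' in ordinary comments, since it does not recognise comments at all.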
              • David-Sarah Hopwood
                Message 7 of 16 , Mar 21, 2008
                  Mike Samuel wrote:
                  > On 21/03/2008, Kris Zyp <kris@...> wrote:
                  >>
                  >>> Do we have a regression test suite of tricky examples?
                  >> That would be awesome.
                  >>
                  >>> For instance, I don't see the string "cc_on" in Kris' validator, but that feature
                  >>> tripped up ADsafe a few months ago.
                  >> Thanks for the heads, fixed it.
                  >
                  > Can you disallow @ outside of string literals entirely?
                  >
                  > What if ADSafe code is included in a container that has @cc_on, and
                  > does an @set that overrides a variable defined in the container?

                  '@' does not appear anywhere in the ES3 grammar outside string literals,
                  regexp literals, and comments, right? Isn't ADsafe defined to be a subset
                  of ES3?

                  (See ECMA-262 section 7.6; note that '@' is 'Punctuation' but not
                  'Connector punctuation' in Unicode 2.1 [insert grumble about using such
                  an old Unicode version], so it is not valid in an identifier.)
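A quick way to see this in a standards-following engine is to ask the parser directly. `parses` is a hypothetical helper built on the standard `Function` constructor; the results are what a conforming engine reports, and IE's conditional compilation is the exception under discussion.

```javascript
// Return true if src parses as a function body, false on a SyntaxError.
function parses(src) {
    try {
        new Function(src); // compiles the body without running it
        return true;
    } catch (e) {
        if (e instanceof SyntaxError) {
            return false;
        }
        throw e; // anything else is unexpected
    }
}

var plainIdentifier = parses("var a_b = 1;");  // true
var atInIdentifier = parses("var a@b = 1;");   // false: '@' is not an identifier character
var atInString = parses('var s = "a@b";');     // true: fine inside a string literal
```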

                  --
                  David-Sarah Hopwood
                • Mike Samuel
                  Message 8 of 16 , Mar 21, 2008
                    On 21/03/2008, David-Sarah Hopwood
                    <david.hopwood@...> wrote:
                    >
                    > Mike Samuel wrote:
                    > > On 21/03/2008, Kris Zyp <kris@...> wrote:
                    > >>
                    > >>> Do we have a regression test suite of tricky examples?
                    > >> That would be awesome.
                    > >>
                    > >>> For instance, I don't see the string "cc_on" in Kris' validator, but that feature
                    > >>> tripped up ADsafe a few months ago.
                    > >> Thanks for the heads, fixed it.
                    > >
                    > > Can you disallow @ outside of string literals entirely?
                    > >
                    > > What if ADSafe code is included in a container that has @cc_on, and
                    > > does an @set that overrides a variable defined in the container?
                    >
                    > '@' does not appear anywhere in the ES3 grammar outside string literals,
                    > regexp literals, and comments, right? Isn't ADsafe defined to be a subset
                    > of ES3?
                    >
                    > (See ECMA-262 section 7.6; note that '@' is 'Punctuation' but not
                    > 'Connector punctuation' in Unicode 2.1 [insert grumble about using such
                    > an old Unicode version], so it is not valid in an identifier.)

                    Yep. @ often appears in JSDoc style comments:
                    http://jsdoc.sourceforge.net/#tagref so banning @ in comments might
                    make some programmers grumble.

                    It also indicates a conditional compilation directive in IE and I
                    don't have a specific exploit in mind but I don't know whether
                    blacklisting @cc_on is sufficient to avoid using conditional
                    compilation to split or join tokens that otherwise appear separate.

                    Perhaps either ban @ outside string/regexp contexts, or recommend that
                    containers not allow @cc_on.




                  • Kris Zyp
                    Message 9 of 16 , Mar 21, 2008
                      
                      > Perhaps either ban @ outside string/regexp contexts, or recommend
                      > that containers not allow @cc_on.
                      Certainly seems reasonable to insist that containers don't do the eval inside a @cc_on.
                      Kris
                    • David-Sarah Hopwood
                      Message 10 of 16 , Mar 21, 2008
                        Mike Samuel wrote:
                        > On 21/03/2008, David-Sarah Hopwood
                        > <david.hopwood@...> wrote:
                        >> Mike Samuel wrote:
                        >> > On 21/03/2008, Kris Zyp <kris@...> wrote:
                        >> >>
                        >> >>> Do we have a regression test suite of tricky examples?
                        >> >> That would be awesome.
                        >> >>
                        >> >>> For instance, I don't see the string "cc_on" in Kris' validator, but that feature
                        >> >>> tripped up ADsafe a few months ago.
                        >> >> Thanks for the heads, fixed it.
                        >> >
                        >> > Can you disallow @ outside of string literals entirely?
                        >> >
                        >> > What if ADSafe code is included in a container that has @cc_on, and
                        >> > does an @set that overrides a variable defined in the container?
                        >>
                        >> '@' does not appear anywhere in the ES3 grammar outside string literals,
                        >> regexp literals, and comments, right? Isn't ADsafe defined to be a subset
                        >> of ES3?
                        >>
                        >> (See ECMA-262 section 7.6; note that '@' is 'Punctuation' but not
                        >> 'Connector punctuation' in Unicode 2.1 [insert grumble about using such
                        >> an old Unicode version], so it is not valid in an identifier.)
                        >
                        > Yep. @ often appears in JSDoc style comments:
                        > http://jsdoc.sourceforge.net/#tagref so banning @ in comments might
                        > make some programmers grumble.
                        >
                        > It also indicates a conditional compilation directive in IE and I
                        > don't have a specific exploit in mind but I don't know whether
                        > blacklisting @cc_on is sufficient to avoid using conditional
                        > compilation to split or join tokens that otherwise appear separate.
                        >
                        > Perhaps either ban @ outside string/regexp contexts, or recommend that
                        > containers not allow @cc_on.

                        I meant my point a bit more generally:

                        Assume that any extension to strict ES3 is designed by an evil genius trying
                        to break ADsafe (or Caja, or whatever), and you won't go far wrong. It's
                        impossible to review all of the browser extensions: they aren't adequately
                        documented, even if they were documented it would be too much work, and
                        many of them display total cluelessness about programming language design.

                        (Preprocessing features in client-side Javascript? What's the point of that?
                        Just preprocess it on the server.)

                        --
                        David-Sarah Hopwood
                      • Mike Samuel
                        Message 11 of 16 , Mar 21, 2008
                          On 21/03/2008, David-Sarah Hopwood
                          <david.hopwood@...> wrote:
                          >
                          > Mike Samuel wrote:
                          > > On 21/03/2008, David-Sarah Hopwood
                          > > <david.hopwood@...> wrote:
                          > >> Mike Samuel wrote:
                          > >> > On 21/03/2008, Kris Zyp <kris@...> wrote:
                          > >> >>
                          > >> >>> Do we have a regression test suite of tricky examples?
                          > >> >> That would be awesome.
                          > >> >>
                          > >> >>> For instance, I don't see the string "cc_on" in Kris' validator, but that feature
                          > >> >>> tripped up ADsafe a few months ago.
                          > >> >> Thanks for the heads, fixed it.
                          > >> >
                          > >> > Can you disallow @ outside of string literals entirely?
                          > >> >
                          > >> > What if ADSafe code is included in a container that has @cc_on, and
                          > >> > does an @set that overrides a variable defined in the container?
                          > >>
                          > >> '@' does not appear anywhere in the ES3 grammar outside string literals,
                          > >> regexp literals, and comments, right? Isn't ADsafe defined to be a subset
                          > >> of ES3?
                          > >>
                          > >> (See ECMA-262 section 7.6; note that '@' is 'Punctuation' but not
                          > >> 'Connector punctuation' in Unicode 2.1 [insert grumble about using such
                          > >> an old Unicode version], so it is not valid in an identifier.)
                          > >
                          > > Yep. @ often appears in JSDoc style comments:
                          > > http://jsdoc.sourceforge.net/#tagref so banning @ in comments might
                          > > make some programmers grumble.
                          > >
                          > > It also indicates a conditional compilation directive in IE and I
                          > > don't have a specific exploit in mind but I don't know whether
                          > > blacklisting @cc_on is sufficient to avoid using conditional
                          > > compilation to split or join tokens that otherwise appear separate.
                          > >
                          > > Perhaps either ban @ outside string/regexp contexts, or recommend that
                          > > containers not allow @cc_on.
                          >
                          > I meant my point a bit more generally:
                          >
                          > Assume that any extension to strict ES3 is designed by an evil genius trying

                          Or a committee of evil geniuses.


                          > to break ADsafe (or Caja, or whatever), and you won't go far wrong. It's
                          > impossible to review all of the browser extensions: they aren't adequately
                          > documented, even if they were documented it would be too much work, and
                          > many of them display total cluelessness about programming language design.

                          Caja deals with many of these problems by rewriting. We can deal
                          perfectly well with @ by stripping comments, and rewriting '@' to \x40
                          in strings and regexps.
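As an illustration of the string half of that rewrite (this is not Caja's code; `escapeAtInLiteral` is a hypothetical helper), the sketch below takes the source text of one string literal and replaces every '@' with the `\x40` escape. It walks escape sequences as pairs so that an escaped backslash followed by '@' is not mangled.

```javascript
// Rewrite '@' to \x40 inside the source text of a single string literal.
// Assumption: `literal` is exactly one well-formed literal, quotes included.
function escapeAtInLiteral(literal) {
    return literal.replace(/\\[\s\S]|@/g, function (m) {
        if (m === "@" || m === "\\@") {
            return "\\x40"; // a bare or needlessly escaped '@'
        }
        return m; // leave every other escape sequence untouched
    });
}

// Source text of the literal: "user@example.com \\@ \@"
var original = '"user@example.com \\\\@ \\@"';
var rewritten = escapeAtInLiteral(original);
// rewritten contains no '@' but evaluates to the same string value.
```

After such a rewrite no '@' remains in the output text, so nothing is left for a conditional-compilation preprocessor to latch onto, while the program's behaviour is unchanged.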

                          ADSafe is a really elegant design, but my concern with the validation
                          approach, besides blacklisting in general, is that it seems to accrete
                          these rules which must seem arbitrary and hard to remember to coders.

                          And if a vulnerability is discovered, a rewriter can add a new rewrite
                          rule, and apply that to existing programs, whereas a validator might
                          have to reject previously valid programs, which won't work until their
                          creator actually looks at the program, learns the new rule, and
                          applies it.


                          > (Preprocessing features in client-side Javascript? What's the point of that?
                          > Just preprocess it on the server.)

                          This presupposes the existence of a server. If you're using a
                          cut-rate hosting service that only serves static files, then you
                          need to do everything on the client.

                          I agree though that preprocessing on the client is a little weird,
                          since one of the typical goals of preprocessing is to avoid wasting
                          bandwidth on code the client doesn't need -- the logic could otherwise
                          just use the language's conditional constructs.


                          > --
                          > David-Sarah Hopwood
                        • Kris Zyp
                          Message 12 of 16 , Mar 21, 2008
                            
                            > (Preprocessing features in client-side Javascript? What's the point of that?
                            > Just preprocess it on the server.)
                            Also, with the new cross-site XHR and XDR capabilities, web sites can request scripts directly from other sites, which can potentially be significantly faster than sending them through your own proxy (and queuing against your own connection limit on the client). This could also reduce the burden on the server; it is always nice to offload work to the infinitely scalable client (a new CPU for each user).
                            IMO, performance is going to be a critical part of the acceptance of secured JavaScript. I would bet that a large percentage of potential users won't find the benefit of secure JavaScript compelling enough if their site takes twice as long to load. Of course, this belief is one of the reasons for wanting to create a small, fast, 5K-ish library.
                            Even without the new cross-site XHR and XDR capabilities, I think ADsafe client side validation could have a use, since we do have cross-site requesting mechanisms that are safe (like CrossSafe/Subspace). However, CrossSafe can't safely bring scripts into the container, only data. With ADsafe, CrossSafe could bring those scripts across using existing widespread browser technology.
                            However, I do certainly agree that many users will prefer the server-side validation. Has anyone created a server-side ADsafe validator yet, or is that another project waiting to be undertaken?
                            Kris