ADsafe validation

  • Kris Zyp
    Message 1 of 16 , Mar 16 6:40 PM
      Doug/ADsafe people,
      Have there been any efforts to produce a lightweight, minimal-sized ADsafe validator? With the coming browser capabilities in cross-site XHR (MS's XDR and the W3C/AC proposal) and the new postMessage API, it seems there will be significant potential for loading scripts by first retrieving the script in text form and then validating its safety before eval'ing it. AFAIK, JSLint is the only validator, but it does a lot more than just ADsafe validation (and consequently is fairly big, I assume). I wonder how small a validator could be that only did ADsafe validation (and it would not even need to check for valid JavaScript since eval does that). Have there been efforts towards such a minimalistic validator?
      It seems like the other useful place for an ADsafe validator would be in a proxy server that could request a script, validate it, and deliver it to the browser. Would such efforts be appreciated? (Not committing, but I am interested.)
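The validate-then-eval flow described above might look roughly like the sketch below. It assumes the script text has already been retrieved (by cross-site XHR, XDR, or postMessage) and that `validate` is some ADsafe validator returning true or false; both `evalIfValid` and `validate` are hypothetical names, not part of ADsafe or JSLint.

```javascript
// Hypothetical helper: evaluate script text only if the validator accepts it.
// `validate` is an assumed function (source) -> boolean, supplied by the caller.
function evalIfValid(source, validate) {
    if (!validate(source)) {
        throw new Error("script rejected by validator");
    }
    // Indirect eval, so the script does not see this function's local variables.
    return (0, eval)(source);
}

// Usage with a stand-in validator that accepts everything (for illustration only).
var acceptAll = function (source) { return true; };
var result = evalIfValid("1 + 2", acceptAll);
```

The point of the sketch is only the ordering: nothing reaches eval unless the validator has seen exactly the same text first.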
      Thanks,
      Kris
    • Douglas Crockford
      Message 2 of 16 , Mar 17 6:10 AM
        --- In caplet@yahoogroups.com, "Kris Zyp" <kris@...> wrote:
        >
        > Doug/ADsafe people,
        > Have there been any efforts to produce a lightweight minimal-sized
        > ADsafe validator? With the coming browser capabilities in Cross-site
        > XHR (MS's XDR and W3C/AC proposal) and the new postMessage API, it
        > seems there will be significant potential for loading scripts by first
        > retrieving the script in text form and then validating their safety
        > before eval'ing them. AFAIK, jslint is the only validator, but it does
        > a lot more than just ADsafe validation (and consequently is fairly
        > big, I assume).

        Currently JSLint is the only ADsafe validator. It is 60K. It is
        certainly feasible to reduce it in size by removing the specialized
        error messages and quality checks that are not required by ADsafe, and
        by removing the HTML and XML and JSON processing.
      • Douglas Crockford
        Message 3 of 16 , Mar 17 10:05 AM
          Results of a quick experiment: Pulling stuff out of JSLint.js that is
          not needed for ADsafe validation of JavaScript produced an adsafe.js
          file that is 34K. I expect that an heroic rewrite could do better.
        • David-Sarah Hopwood
          Message 4 of 16 , Mar 17 10:05 AM
            Kris Zyp wrote:
            > [...] I wonder how small a validator could be that only did ADsafe validation
            > (and it would not even need to check for valid JavaScript since eval does that).

            A validator for a Javascript subset like ADsafe does have to check for
            syntactic validity, because:

            - it cannot trust the browser's eval to accept only Javascript from a
            known dialect of the language; browser extensions might be insecure

            - it must parse the Javascript anyway, which implicitly checks that
            it is syntactically valid.

            --
            David-Sarah Hopwood
          • Kris Zyp
            Message 5 of 16 , Mar 18 1:18 PM
              Here is my attempt at an ADsafe validator:
              http://www.persvr.org/test/capability-validate.html
              Let me know if anyone can find any false acceptances (scripts that get successfully eval'ed that are unsafe).
              You can download the validator at:
              http://www.persvr.org/jsclient/capability-validate.js
              It is about 5K uncompressed, probably about 2-3K compressed. I would presume that it is also a lot faster, since it uses simpler regex-based checking rather than full AST parsing. The small size and probable speed improvement seem likely to improve the odds of adoption of ADsafe. IMHO, 34K is a bit heavy, but a quick 3K validator that can validate ads would lower the barriers to people using this stuff.
              There are some known limitations that cause false rejections (which do not represent a security concern):
              • Nesting vars inside statement blocks will not declare the variable for the outer block.
              • Named functions are not treated as declarations, so they are generally not allowed unless the name is declared with a var.
              • Var declarations that involve multiple comma-delimited variable assignments are not accepted.
              So I suppose it validates a slight subset of ADsafe, but these limitations are minor, I think.
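To make the three limitations concrete, here is a sketch of valid JavaScript that would presumably be rejected, each with a possible rewrite. This is one reading of the list above, not output from the validator itself; all names are made up for illustration.

```javascript
// 1. A var nested in a block. JavaScript hoists it to the function scope,
//    but per the first limitation the validator would not see it as declared.
function nestedVar(flag) {
    if (flag) {
        var nested = 1;
    }
    return nested;
}
// Possible rewrite: declare at the top of the function.
function hoistedVar(flag) {
    var nested;
    if (flag) {
        nested = 1;
    }
    return nested;
}

// 2. A named function declaration, and a rewrite that declares the name with var.
function named() { return 2; }
var declared = function () { return 2; };

// 3. Comma-delimited var assignments, and a rewrite using separate statements.
var a = 1, b = 2;
var c = 1;
var d = 2;
```

Each pair behaves the same at run time, which is why these are false rejections rather than security problems.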
              If this proves to be a workable validation technique, I will probably try to get this in Dojo. Let me know what you think.
              Thanks,
              Kris
               
              ----- Original Message -----
              Sent: Monday, March 17, 2008 11:05 AM
              Subject: [caplet] Re: ADsafe validation

              Results of a quick experiment: Pulling stuff out of JSLint.js that is
              not needed for ADsafe validation of JavaScript produced an adsafe.js
              file that is 34K. I expect that an heroic rewrite could do better.

            • Douglas Crockford
              Message 6 of 16 , Mar 20 6:54 AM
                --- In caplet@yahoogroups.com, "Kris Zyp" <kris@...> wrote:
                >
                > Here is my attempt at an ADsafe validator:
                > http://www.persvr.org/test/capability-validate.html
                > Let me know if anyone can find any false acceptances (scripts that
                > get successfully eval'ed that are unsafe).
                > You can download the validator at:
                > http://www.persvr.org/jsclient/capability-validate.js
                > It is about 5K uncompressed, probably about 2-3K compressed. I would
                > presume that it is also a lot faster since it is using simpler
                > regex-based checking rather than full AST parsing. This small size and
                > probable improved speed seems like it would improve the odds of
                > adoption of ADsafe. IMHO, 34K is a bit heavy, but a quick 3K validator
                > that can validate ads, would lower the barriers to people using this
                > stuff.


                I have reservations about extensive use of regular expressions for
                validation. In the json.js case, I started thinking that a single
                regexp should do the job. It has since grown to four, and was still
                vulnerable to a screw-up in Firefox. RegExp doesn't have enough
                context to make me confident.

                In your case, I think you might have a problem with comment deletion.
                Lacking context, the regexps can be confused.

                /\/*\//.test("*/");
                /* // */
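One way the confusion could arise, as a sketch: the two strippers below are hypothetical naive implementations (not taken from Kris's validator or from JSLint) that delete comments with regexps and no knowledge of string or regexp literals. Run against the two examples above, each one mangles the source.

```javascript
// Hypothetical naive comment strippers; not taken from any real validator.
function stripBlockThenLine(source) {
    return source
        .replace(/\/\*[\s\S]*?\*\//g, "")  // remove /* ... */ comments
        .replace(/\/\/[^\n]*/g, "");       // then remove // comments
}

function stripLineThenBlock(source) {
    return source
        .replace(/\/\/[^\n]*/g, "")        // remove // comments
        .replace(/\/\*[\s\S]*?\*\//g, ""); // then remove /* ... */ comments
}

// First example: a regexp literal and a string literal, with no comments at all.
var first = '/\\/*\\//.test("*/");';
// The "/*" inside the regexp and the "*/" inside the string are taken for a comment.
var firstStripped = stripBlockThenLine(first);   // '/\\");'

// Second example: a single block comment that happens to contain "//".
var second = "/* // */";
// The "//" swallows the closing "*/", leaving an unterminated block comment.
var secondStripped = stripLineThenBlock(second); // "/* "
```

In both cases the text the validator goes on to check is no longer the text the browser would parse.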
              • Kris Zyp
                Message 7 of 16 , Mar 20 7:48 AM
                  > I have reservations about extensive use of regular expressions for
                  > validation. In the json.js case, I started thinking that a single
                  > regexp should do the job. It has since grown to four, and was still
                  > vulnerable to a screw-up in Firefox. RegExp doesn't have enough
                  > context to make me confident.
                   
                  Yes, regular expression based validation does seem improbable. However, it seems like you could also make an argument that it is easier to reason about and have confidence in a simple 5K chunk of code than a 34K module. A large module has more room for human error. Anyway, I understand your skepticism, but I don't want to dismiss this approach yet based solely on feelings of uncertainty. So far the problems have been fixable.
                   
                  > In your case, I think you might have a problem with comment deletion.
                  > Lacking context, the regexps can be confused.
                  >
                  > /\/*\//.test("*/");
                  > /* // */
                   
                  Thanks, yes I did have a problem. Those should be fixed now.
                   
                  Kris
                • Adam Barth
                  Message 8 of 16 , Mar 20 11:34 AM
                    On Thu, Mar 20, 2008 at 7:48 AM, Kris Zyp <kris@...> wrote:
                    > > /\/*\//.test("*/");
                    > > /* // */
                    >
                    > Thanks, yes I did have a problem. Those should be fixed now.

                    Do we have a regression test suite of tricky examples? For instance,
                    I don't see the string "cc_on" in Kris' validator, but that feature
                    tripped up ADsafe a few months ago. I could rewrite a test case for
                    that (or dig through the list archives), but it's probably a better
                    approach to have a test suite that we can run against ADsafe
                    validators, both to catch regressions as they are modified and to
                    build confidence in new implementations.
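A shared suite could be as simple as a table of sources with the expected verdict, run against any validator function. The sketch below is only an illustration: `runSuite`, the case names, and the expected verdicts are assumptions, and the two stand-in validators exist just to show the harness working.

```javascript
// Hypothetical harness: run each case through `validate` and collect the
// names of the cases whose verdict differs from the expected one.
function runSuite(validate, cases) {
    var failures = [];
    for (var i = 0; i < cases.length; i += 1) {
        var accepted;
        try {
            accepted = !!validate(cases[i].source);
        } catch (e) {
            accepted = false; // a validator that throws is treated as rejecting
        }
        if (accepted !== cases[i].expectAccept) {
            failures.push(cases[i].name);
        }
    }
    return failures;
}

// Illustrative cases only; a real suite would collect the tricky examples
// from the list archives.
var cases = [
    { name: "plain arithmetic", source: "var x = 1 + 2;", expectAccept: true },
    { name: "conditional compilation", source: "/*@cc_on @*/ var x = 1;", expectAccept: false }
];

// Stand-in validators, not real ADsafe validators.
var rejectAtSign = function (source) { return source.indexOf("@") === -1; };
var acceptEverything = function (source) { return true; };

var strictFailures = runSuite(rejectAtSign, cases);     // []
var laxFailures = runSuite(acceptEverything, cases);    // ["conditional compilation"]
```

Because the harness takes the validator as a parameter, the same table could be run against JSLint's ADsafe mode and against any new implementation.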

                    Adam
                  • Kris Zyp
                    Message 9 of 16 , Mar 21 12:20 PM
                      > Do we have a regression test suite of tricky examples?
                       
                      That would be awesome.

                      > For instance, I don't see the string "cc_on" in Kris' validator, but
                      > that feature tripped up ADsafe a few months ago.

                      Thanks for the heads-up, fixed it.

                      Thanks,

                      Kris

                    • Mike Samuel
                      Message 10 of 16 , Mar 21 12:40 PM
                        On 21/03/2008, Kris Zyp <kris@...> wrote:
                        >
                        > > Do we have a regression test suite of tricky examples?
                        >
                        > That would be awesome.
                        >
                        > > For instance, I don't see the string "cc_on" in Kris' validator, but that feature
                        > > tripped up ADsafe a few months ago.
                        >
                        > Thanks for the heads, fixed it.

                        Can you disallow @ outside of string literals entirely?

                        What if ADSafe code is included in a container that has @cc_on, and
                        does an @set that overrides a variable defined in the container?
                      • David-Sarah Hopwood
                        Message 11 of 16 , Mar 21 4:09 PM
                          Mike Samuel wrote:
                          > On 21/03/2008, Kris Zyp <kris@...> wrote:
                          >>
                          >>> Do we have a regression test suite of tricky examples?
                          >> That would be awesome.
                          >>
                          >>> For instance, I don't see the string "cc_on" in Kris' validator, but that feature
                          >>> tripped up ADsafe a few months ago.
                          >> Thanks for the heads, fixed it.
                          >
                          > Can you disallow @ outside of string literals entirely?
                          >
                          > What if ADSafe code is included in a container that has @cc_on, and
                          > does an @set that overrides a variable defined in the container?

                          '@' does not appear anywhere in the ES3 grammar outside string literals,
                          regexp literals, and comments, right? Isn't ADsafe defined to be a subset
                          of ES3?

                          (See ECMA-262 section 7.6; note that '@' is 'Punctuation' but not
                          'Connector punctuation' in Unicode 2.1 [insert grumble about using such
                          an old Unicode version], so it is not valid in an identifier.)

                          --
                          David-Sarah Hopwood
                        • Mike Samuel
                          Message 12 of 16 , Mar 21 5:05 PM
                            On 21/03/2008, David-Sarah Hopwood
                            <david.hopwood@...> wrote:
                            >
                            > Mike Samuel wrote:
                            > > On 21/03/2008, Kris Zyp <kris@...> wrote:
                            > >>
                            > >>> Do we have a regression test suite of tricky examples?
                            > >> That would be awesome.
                            > >>
                            > >>> For instance, I don't see the string "cc_on" in Kris' validator, but that feature
                            > >>> tripped up ADsafe a few months ago.
                            > >> Thanks for the heads, fixed it.
                            > >
                            > > Can you disallow @ outside of string literals entirely?
                            > >
                            > > What if ADSafe code is included in a container that has @cc_on, and
                            > > does an @set that overrides a variable defined in the container?
                            >
                            > '@' does not appear anywhere in the ES3 grammar outside string literals,
                            > regexp literals, and comments, right? Isn't ADsafe defined to be a subset
                            > of ES3?
                            >
                            > (See ECMA-262 section 7.6; note that '@' is 'Punctuation' but not
                            > 'Connector punctuation' in Unicode 2.1 [insert grumble about using such
                            > an old Unicode version], so it is not valid in an identifier.)

                            Yep. @ often appears in JSDoc style comments:
                            http://jsdoc.sourceforge.net/#tagref so banning @ in comments might
                            make some programmers grumble.

                            It also indicates a conditional compilation directive in IE and I
                            don't have a specific exploit in mind but I don't know whether
                            blacklisting @cc_on is sufficient to avoid using conditional
                            compilation to split or join tokens that otherwise appear separate.

                            Perhaps either ban @ outside string/regexp contexts, or recommend that
                            containers not allow @cc_on.
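As a sketch of the first option, here is a deliberately stricter variant: reject any raw '@' anywhere in the source, including inside strings and comments. That is an assumption on my part rather than what was proposed, chosen because telling strings and regexps apart from comments reliably needs a real tokenizer. Authors who need the character in a string can write the escape \x40 instead. `containsRawAt` is a hypothetical name.

```javascript
// Hypothetical check: true if the source text contains a raw '@' anywhere.
// Stricter than "outside string/regexp contexts", but needs no tokenizer.
function containsRawAt(source) {
    return source.indexOf("@") !== -1;
}

// Script text with a conditional compilation directive: rejected.
var withDirective = "/*@cc_on @*/ var x = 1;";

// Script text that needs an '@' in a string can spell it as \x40.
var withEscape = 'var address = "user\\x40example.com";';

var directiveFlagged = containsRawAt(withDirective); // true
var escapeFlagged = containsRawAt(withEscape);       // false

// At run time the escape still produces the '@' character.
var address = "user\x40example.com";
var sameString = (address === "user@example.com");   // true
```

The cost is that JSDoc-style comments with @ tags are rejected too, which is the grumble mentioned above.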




                          • Kris Zyp
                            Message 13 of 16 , Mar 21 8:01 PM
                              > Perhaps either ban @ outside string/regexp contexts, or recommend that
                              > containers not allow @cc_on.
                              Certainly seems reasonable to insist that containers don't do the eval inside a @cc_on.
                              Kris
                            • David-Sarah Hopwood
                              Message 14 of 16 , Mar 21 8:04 PM
                                Mike Samuel wrote:
                                > On 21/03/2008, David-Sarah Hopwood
                                > <david.hopwood@...> wrote:
                                >> Mike Samuel wrote:
                                >> > On 21/03/2008, Kris Zyp <kris@...> wrote:
                                >> >>
                                >> >>> Do we have a regression test suite of tricky examples?
                                >> >> That would be awesome.
                                >> >>
                                >> >>> For instance, I don't see the string "cc_on" in Kris' validator, but that feature
                                >> >>> tripped up ADsafe a few months ago.
                                >> >> Thanks for the heads, fixed it.
                                >> >
                                >> > Can you disallow @ outside of string literals entirely?
                                >> >
                                >> > What if ADSafe code is included in a container that has @cc_on, and
                                >> > does an @set that overrides a variable defined in the container?
                                >>
                                >> '@' does not appear anywhere in the ES3 grammar outside string literals,
                                >> regexp literals, and comments, right? Isn't ADsafe defined to be a subset
                                >> of ES3?
                                >>
                                >> (See ECMA-262 section 7.6; note that '@' is 'Punctuation' but not
                                >> 'Connector punctuation' in Unicode 2.1 [insert grumble about using such
                                >> an old Unicode version], so it is not valid in an identifier.)
                                >
                                > Yep. @ often appears in JSDoc style comments:
                                > http://jsdoc.sourceforge.net/#tagref so banning @ in comments might
                                > make some programmers grumble.
                                >
                                > It also indicates a conditional compilation directive in IE and I
                                > don't have a specific exploit in mind but I don't know whether
                                > blacklisting @cc_on is sufficient to avoid using conditional
                                > compilation to split or join tokens that otherwise appear separate.
                                >
                                > Perhaps either ban @ outside string/regexp contexts, or recommend that
                                > containers not allow @cc_on.

                                I meant my point a bit more generally:

                                Assume that any extension to strict ES3 is designed by an evil genius trying
                                to break ADsafe (or Caja, or whatever), and you won't go far wrong. It's
                                impossible to review all of the browser extensions: they aren't adequately
                                documented, even if they were documented it would be too much work, and
                                many of them display total cluelessness about programming language design.

                                (Preprocessing features in client-side Javascript? What's the point of that?
                                Just preprocess it on the server.)

                                --
                                David-Sarah Hopwood
                              • Mike Samuel
                                Message 15 of 16 , Mar 21 8:20 PM
                                  On 21/03/2008, David-Sarah Hopwood
                                  <david.hopwood@...> wrote:
                                  >
                                  > Mike Samuel wrote:
                                  > > On 21/03/2008, David-Sarah Hopwood
                                  > > <david.hopwood@...> wrote:
                                  > >> Mike Samuel wrote:
                                  > >> > On 21/03/2008, Kris Zyp <kris@...> wrote:
                                  > >> >>
                                  > >> >>> Do we have a regression test suite of tricky examples?
                                  > >> >> That would be awesome.
                                  > >> >>
                                  > >> >>> For instance, I don't see the string "cc_on" in Kris' validator, but that feature
                                  > >> >>> tripped up ADsafe a few months ago.
                                  > >> >> Thanks for the heads, fixed it.
                                  > >> >
                                  > >> > Can you disallow @ outside of string literals entirely?
                                  > >> >
                                  > >> > What if ADSafe code is included in a container that has @cc_on, and
                                  > >> > does an @set that overrides a variable defined in the container?
                                  > >>
                                  > >> '@' does not appear anywhere in the ES3 grammar outside string literals,
                                  > >> regexp literals, and comments, right? Isn't ADsafe defined to be a subset
                                  > >> of ES3?
                                  > >>
                                  > >> (See ECMA-262 section 7.6; note that '@' is 'Punctuation' but not
                                  > >> 'Connector punctuation' in Unicode 2.1 [insert grumble about using such
                                  > >> an old Unicode version], so it is not valid in an identifier.)
                                  > >
                                  > > Yep. @ often appears in JSDoc style comments:
                                  > > http://jsdoc.sourceforge.net/#tagref so banning @ in comments might
                                  > > make some programmers grumble.
                                  > >
                                  > > It also indicates a conditional compilation directive in IE and I
                                  > > don't have a specific exploit in mind but I don't know whether
                                  > > blacklisting @cc_on is sufficient to avoid using conditional
                                  > > compilation to split or join tokens that otherwise appear separate.
                                  > >
                                  > > Perhaps either ban @ outside string/regexp contexts, or recommend that
                                  > > containers not allow @cc_on.
                                  >
                                  > I meant my point a bit more generally:
                                  >
                                  > Assume that any extension to strict ES3 is designed by an evil genius trying

                                  Or a committee of evil geniuses.


                                  > to break ADsafe (or Caja, or whatever), and you won't go far wrong. It's
                                  > impossible to review all of the browser extensions: they aren't adequately
                                  > documented, even if they were documented it would be too much work, and
                                  > many of them display total cluelessness about programming language design.

                                  Caja deals with many of these problems by rewriting. We can deal
                                  perfectly well with @ by stripping comments, and rewriting '@' to \x40
                                  in strings and regexps.

                                  ADSafe is a really elegant design, but my concern with the validation
                                  approach, besides blacklisting in general, is that it seems to accrete
                                  these rules which must seem arbitrary and hard to remember to coders.

                                  And if a vulnerability is discovered, a rewriter can add a new rewrite
                                  rule, and apply that to existing programs, whereas a validator might
                                  have to reject previously valid programs, which won't work until their
                                  creator actually looks at the program, learns the new rule, and
                                  applies it.


                                  > (Preprocessing features in client-side Javascript? What's the point of that?
                                  > Just preprocess it on the server.)

                                  This presupposes the existence of a server. If you're using a
                                  cut-rate hosting service that only serves static files, then you
                                  need to do everything on the client.

                                  I agree though that preprocessing on the client is a little weird,
                                  since one of the typical goals of preprocessing is to avoid wasting
                                  bandwidth on code the client doesn't need -- the logic could otherwise
                                  just use the language's conditional constructs.


                                  > --
                                  > David-Sarah Hopwood
                                • Kris Zyp
                                  Message 16 of 16 , Mar 21 8:34 PM
                                    > (Preprocessing features in client-side Javascript? What's the point of that?
                                    > Just preprocess it on the server.)
                                    Also, with the new cross-site XHR and XDR capabilities, web sites can directly request the scripts from other sites, which can potentially be significantly faster than sending them through your proxy (and incurring queuing against your own connection limit on the client). This could also reduce the burden on the server; it is always nice to offload to the infinitely scalable client (a new CPU for each user).
                                    IMO, performance is going to be a critical part of the acceptance of secured JavaScript. I would bet that a large percentage of potential users won't find the benefit of secure JavaScript compelling enough if their site takes twice as long to load. Of course, this belief is one of the reasons for wanting to create a small, fast 5K-ish library.
                                    Even without the new cross-site XHR and XDR capabilities, I think ADsafe client side validation could have a use, since we do have cross-site requesting mechanisms that are safe (like CrossSafe/Subspace). However, CrossSafe can't safely bring scripts into the container, only data. With ADsafe, CrossSafe could bring those scripts across using existing widespread browser technology.
                                    However, I do certainly agree that many users will prefer the server-side validation. Has anyone created a server-side ADsafe validator yet, or is that another project waiting to be undertaken?
                                    Kris