Re: How to handle big soap attachments ?

  • allierogers@yahoo.com
    Message 1 of 13 , Mar 21, 2001
      Paul,

      I have seen a similar problem with ALL SOAP (and XML-RPC)
      implementations: they do not handle streams at all. Every
      implementation (that I know of) loads the entire HTTP
      request/response into memory (a bad idea: what if it's an MPEG
      movie or something?), parses it into an in-memory tree, and then
      hands it off for proper dispatch.

      Why do implementations assume that all methods accept 2 simple
      arguments and return 1 simple argument like all of the examples (for
      instance, "getQuote")? The real world is not like that.

      In the real world, there may be many complicated arguments (like a
      file upload) and the response may also be arbitrarily large (e.g., a
      SQL result set of 100,000 rows).

      HTTP, SOAP, XML-RPC and XML could handle this fine if the
      implementations were a bit smarter:

      - Parse using SAX or stream parsers, only (never DOM), and always
      parse "on the fly" without loading entire stream into RAM.

      - Always stream to/from HTTP, rather than loading entire
      request/response into RAM.

      Since HTTP is the most common transport, do what good HTTP servers
      do. When you browse to a site with streaming video, do you think
      the HTTP server loads the entire video file into RAM for each
      request? Of course not. It streams it off disk (or cache) directly
      back to the client, chunk by chunk, so that only a small part of a
      large file is ever in memory at once. That way, the HTTP server can
      handle 100,000 simultaneous hits to that same streaming video.
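
A minimal sketch of the chunk-by-chunk idea, using core Perl only. The helper name spool_body, the 4096-byte chunk size and the in-memory filehandle that stands in for STDIN are assumptions made for illustration; this is not how SOAP::Lite reads requests.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Copy $length bytes from $in to $out in fixed-size chunks, so that at
# most $chunk_size bytes of the body are held in memory at any time.
sub spool_body {
    my ($in, $out, $length, $chunk_size) = @_;
    $chunk_size ||= 8192;
    my $copied = 0;
    while ($copied < $length) {
        my $want = $length - $copied;
        $want = $chunk_size if $want > $chunk_size;
        my $got = read($in, my $buf, $want);
        die "read failed: $!" unless defined $got;
        last if $got == 0;                      # premature end of input
        print {$out} $buf or die "write failed: $!";
        $copied += $got;
    }
    return $copied;
}

# Demo: an in-memory filehandle stands in for STDIN. In a CGI script,
# $in would be \*STDIN and the length would come from $ENV{CONTENT_LENGTH}.
my $body = 'x' x 20_000;
open my $in, '<', \$body or die "open failed: $!";
binmode $in;
my ($out, $path) = tempfile(UNLINK => 1);
binmode $out;
my $copied = spool_body($in, $out, length($body), 4096);
close $out or die "close failed: $!";
print "spooled $copied bytes to $path\n";
```

The spooled file could then be handed to a stream parser, so the raw request never has to sit in memory as one string.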

      Am I wrong about this?

      Regards,

      Allie Rogers

      --- In soaplite@y..., Paul Kulchenko <paulclinger@y...> wrote:
      > Hi, Sebastian!
      >
      > It might, thanks for the tip. I won't make any promises, but I
      > will definitely take a look. The next version is about to be
      > released, and I don't think it will include any changes in this
      > area (unless they are minimal, which is also possible), but I'll
      > try to do it ASAP. Thank you.
      >
      > Best wishes, Paul.
      >
      > --- sebaklu@y... wrote:
      > > Hi, Paul!
      > >
      > > Thanks for the answer.
      > >
      > > In fact, the memory usage in the CGI package
      > > (SOAP::Transport::HTTP) is a problem for handling big SOAP
      > > requests, because the complete content is held in memory more
      > > than once.
      > >
      > > Extract from package SOAP::Transport::HTTP::CGI,
      > > located in SOAP/Transport/HTTP.pm:
      > >
      > >   my $content; read(STDIN,$content,$ENV{'CONTENT_LENGTH'} || 0);
      > >   #  ^^^^^^^^--- first time
      > >
      > >   $self->request(HTTP::Request->new(
      > >     $ENV{'REQUEST_METHOD'} || '' => $ENV{'SCRIPT_NAME'},
      > >     HTTP::Headers->new(map {(/^HTTP_(.+)/i ? $1 : $_) => $ENV{$_}}
      > >       keys %ENV),
      > >     $content,
      > >   # ^^^^^^^^- second time
      > >   ));
      > >
      > > Will a future version support stream-based handling directly
      > > from STDIN?
      > >
      > > --- In soaplite@y..., Paul Kulchenko <paulclinger@y...> wrote:
      > > > Hi, Sebastian!
      > > >
      > > > Yes, you can set an option for MIME::Parser to keep the
      > > > temporary file on disk during parsing, but the message will
      > > > still have to be in memory later to be parsed, though it
      > > > shouldn't take 200MB to parse it. As long as it's one part
      > > > you should be fine. Anyway, there is not much you can change
      > > > (except the options for MIME::Parser), and I'll run my tests
      > > > to check it.
      > > >
      > > > Best wishes, Paul.
      > > >
      > >
      > > To unsubscribe from this group, send an email to:
      > > soaplite-unsubscribe@y...
      > >
      > >
      > >
      > > Your use of Yahoo! Groups is subject to
      > > http://docs.yahoo.com/info/terms/
    • Paul Kulchenko
      Message 2 of 13 , Mar 21, 2001
        Hi, Allie!

        You're definitely right, and that's exactly what I want to do.
        Some "buts": first, the message could be MIME-encoded, which is
        parsed differently, so the relationships in the
        Transport-Parser-SOAP-your_module chain become a little more
        complicated. Second, your module has to get its parameters
        somehow, and there is no way to give them to you without
        finishing parsing first. Third, I cannot dispatch the message to
        you without finishing parsing, because the XML could be wrong,
        unfinished, etc., so again I need to finish parsing. As soon as I
        have finished parsing, I need to keep the data somewhere. I'm
        using a SAX parser, but the RESULT of parsing is stored in memory
        in a structure similar to a DOM ("similar" in terms of memory
        consumed), and I don't know HOW I can avoid that. All you can
        save is the memory currently used for storing the message BEFORE
        parsing, and that will work only for streaming transports
        (CGI/STDIO/TCP?), not for the others.

        At the same time, it is worth doing, but I was thinking about a
        different approach: keep ONLY the XML message in memory, navigate
        through it with a quick XPath component, and provide parameters as
        tied variables, so that a parameter is looked up in the message
        and returned to you only when you ask for it. It's memory vs.
        speed. It would also make the implementation significantly more
        complicated.
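
To make the tied-variable idea concrete, here is a rough sketch in core Perl. The class name LazyParam is hypothetical, and a naive regex stands in for the XPath component, so it handles only simple, non-nested elements. It illustrates the approach and is not code from SOAP::Lite.

```perl
use strict;
use warnings;

# Hypothetical helper class: a tied scalar that keeps only a reference to
# the raw XML text and extracts its value the first time it is read.
package LazyParam;

sub TIESCALAR {
    my ($class, $xml_ref, $name) = @_;
    return bless { xml => $xml_ref, name => $name, lookups => 0 }, $class;
}

sub FETCH {
    my ($self) = @_;
    return $self->{value} if exists $self->{value};   # cached after first read
    $self->{lookups}++;
    # Naive stand-in for an XPath lookup: the first <name>...</name> element.
    # It ignores attributes, namespaces, nesting and entities.
    my $name = quotemeta $self->{name};
    my ($value) = ${ $self->{xml} } =~ m{<$name>(.*?)</$name>}s;
    return $self->{value} = $value;
}

sub STORE { die "parameters are read-only\n" }

package main;

my $message = '<getQuote><symbol>IBM</symbol><count>3</count></getQuote>';

# Nothing is extracted at tie time; only the raw message is kept.
tie my $symbol, 'LazyParam', \$message, 'symbol';
tie my $count,  'LazyParam', \$message, 'count';

print "symbol = $symbol\n";   # the lookup happens here, on first read
```

Only the message text is stored once; each parameter costs a search at first access, which is the memory vs. speed trade-off described above.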

        I did some benchmarking recently: if XML::Parser is taken as 1,
        then XML::Parser with the Tree style parses the same message about
        5 times slower, and my implementation is about 7-8 times slower,
        just because of memory manipulation. I don't think that can be
        significantly improved.

        Any other ideas?

        Best wishes, Paul.

      • Petr Janata
        Message 3 of 13 , Mar 21, 2001
          Hello,

          I would just like to remark that Don Box was in Prague last week
          giving a SOAP talk, and he said that SOAP is meant to work exactly
          like this, i.e. it is not meant to handle large amounts of data. He
          also said that if you need to do that, you can e.g. pass just a URL
          in the SOAP message and use a simple download (or a SAX parser) to
          handle the transfer.
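
A small sketch of the pass-by-URL idea, assuming Perl with the core HTTP::Tiny module. The method and element names (uploadMovie, location), the example URL and both helper names are invented for illustration. The receiving side downloads the data in chunks straight to disk.

```perl
use strict;
use warnings;
use HTTP::Tiny;

# Escape the characters that are unsafe inside XML element content.
sub xml_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Client side: the SOAP message carries only the URL, not the data itself.
sub build_request {
    my ($url) = @_;
    return join '',
        '<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">',
        '<SOAP-ENV:Body><uploadMovie><location>', xml_escape($url),
        '</location></uploadMovie></SOAP-ENV:Body></SOAP-ENV:Envelope>';
}

# Receiving side: download the referenced data chunk by chunk to a file,
# so the whole body is never held in memory. Not called in this demo
# because it needs network access.
sub fetch_to_file {
    my ($url, $path) = @_;
    open my $out, '>', $path or die "cannot write $path: $!";
    binmode $out;
    my $res = HTTP::Tiny->new->get($url, {
        data_callback => sub {
            my ($chunk) = @_;
            print {$out} $chunk or die "write failed: $!";
        },
    });
    close $out or die "close failed: $!";
    die "download failed: $res->{status} $res->{reason}\n" unless $res->{success};
    return $path;
}

my $request = build_request('http://example.com/movies/big.mpg?id=1&fmt=mpeg');
print "$request\n";
```

The SOAP message stays small regardless of the size of the data; the cost is a second request, which only works when the receiver is allowed to connect back to the URL.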

          Petr Janata

        • sebaklu@yahoo.com
          Message 4 of 13 , Mar 22, 2001
            Hi Petr.

            Since there is a specification for SOAP messages with
            attachments, it should be possible to send all the data that
            belongs to a request at once. Why complicate things with
            additional requests? My problem is that the client can send data
            only via HTTP and the server can receive it only via HTTP. The
            server itself cannot send a request to the client for the data it
            needs. Firewalls and security rules on both sides make other
            requests, e.g. via URLs or FTP, impossible.

            I think SOAP is a good solution for this problem.


            Sebastian

          • Paul Kulchenko
            Message 5 of 13 , Mar 22, 2001
              Hi, Sebastian!

              That's true, but at the same time it's easy to imagine a
              situation where you send something not directly but through
              several intermediaries, and each of them has to handle this
              huge request. If the large piece is encoded as an external
              reference, then a handler could be smart enough to fetch it
              only if it is required (though I don't know of any such smart
              handlers yet :)). Ideas, ideas...

              Ideally an implementation should be flexible enough to handle
              both approaches (and maybe many others), perhaps with manual
              hints.
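
One way such a "smart handler" might look, as a rough sketch in core Perl. The LazyAttachment class is hypothetical, and a fake fetcher stands in for a real download so that the number of fetches can be counted.

```perl
use strict;
use warnings;

# Hypothetical class: an attachment known only by reference. The fetcher
# is any code reference that returns the content for a URL.
package LazyAttachment;

sub new {
    my ($class, %args) = @_;
    return bless { url => $args{url}, fetcher => $args{fetcher} }, $class;
}

sub url { $_[0]{url} }

# Fetch on first use only, then reuse the cached copy.
sub content {
    my ($self) = @_;
    $self->{content} = $self->{fetcher}->($self->{url})
        unless exists $self->{content};
    return $self->{content};
}

package main;

my $fetches = 0;
# Fake fetcher standing in for a real download.
my $fake_fetcher = sub { $fetches++; return "data from $_[0]" };

my $att = LazyAttachment->new(
    url     => 'http://example.com/big-file',
    fetcher => $fake_fetcher,
);

# An intermediary only forwards the reference: nothing is fetched.
my $forwarded = $att->url;
print "after forwarding: $fetches fetches\n";

# The final handler actually needs the data: fetched once, then cached.
my $data = $att->content;
$att->content;
print "after handling: $fetches fetches\n";
```

Intermediaries that only route the message never pay for the large piece; only the handler that reads the content triggers the transfer.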

              Best wishes, Paul.

              --- sebaklu@... wrote:
              > Hi Petr.
              >
              > Since there exists a specification for soap messages with
              > attachements it should be possible to send the entire data
              > according
              > to the request at once. Why make it complicated with more requests
              > ?
              > My problem is that the client could send only data via HTTP and the
              >
              > server can receive it only via HTTP. The server themself can't send
              >
              > any request for requiered data to the client. Firewalls and
              > security
              > rules on both sides make other requests eg. via URLs or FTP
              > impossible .
              > I think SOAP is a good solution for this problem.
              >
              >
              > Sebastian
              >
              > --- In soaplite@y..., "Petr Janata" <petr.janata@i...> wrote:
              > > Hello,
              > >
              > > I would just like to remark that we had Don Box last week in
              > Prague
              > giving a
              > > SOAP talk and he said that SOAP is meant to work exactly like
              > this, i.e.
              > > not to handle large amounts of data. He also said that if you
              > need
              > to do
              > > that you can e.g. pass just the URL in a SOAP message and use
              > simple
              > > download (or SAX parser ) to handle the transfer.
              > >
              > > Petr Janata
              > >
              > > -----Original Message-----
              > > From: sentto-2738395-120-985195300-petr.janata=i.cz@r...
              > > [mailto:sentto-2738395-120-985195300-petr.janata=i.cz@r...]On
              > > Behalf Of allierogers@y...
              > > Sent: Wednesday, March 21, 2001 6:22 PM
              > > To: soaplite@y...
              > > Subject: [soaplite] Re: How to handle big soap attachments ?
              > >
              > >
              > > Paul,
              > >
              > > I have seen a similar problem with ALL SOAP (and XML-RPC)
              > > implementations. That is that they do not handle streams at all.
              > > Every implementation (that I know of) loads the entire HTTP
              > > request/response into memory (bad idea, what if it's an MPEG
              > > movie or something), parses it into an in-memory tree, and then
              > > hands it off for proper dispatch.
              > >
              > > Why do implementations assume that all methods accept 2 simple
              > > arguments and return 1 simple argument like all of the examples
              > > (for instance, "getQuote")? The real world is not like that.
              > >
              > > In the real world, there may be many complicated arguments (like
              > > a file upload) and the response may also be arbitrarily large
              > > (e.g., a SQL result set of 100,000 rows).
              > >
              > > HTTP, SOAP, XML-RPC and XML can handle this fine, if the
              > > implementations were a bit smarter.
              > >
              > > - Parse using SAX or stream parsers, only (never DOM), and always
              > > parse "on the fly" without loading entire stream into RAM.
              > >
              > > - Always stream to/from HTTP, rather than loading entire
              > > request/response into RAM.
              > >
              > > Since HTTP is the most common transport, do what good HTTP
              > > servers do. When you browse to a site with streaming video, do
              > > you think the HTTP server loads the entire video file into RAM
              > > for each request? Of course not. It streams it off disk (or
              > > cache) directly back to the client, chunk by chunk, so that only
              > > a small amount of a large file is ever in memory at once. In that
              > > way, the HTTP server can handle 100,000 simultaneous hits to that
              > > same streaming video.
              > >
              > > Am I wrong about this?
              > >
              > > Regards,
              > >
              > > Allie Rogers
              > >
              > > --- In soaplite@y..., Paul Kulchenko <paulclinger@y...> wrote:
              > > > Hi, Sebastian!
              > > >
              > > > It might, thanks for the tip. I won't make any promises, but I
              > > > will definitely take a look. The next version is about to be
              > > > released, and I don't think it will include any changes in this
              > > > area (unless they are minimal, which is also possible), but
              > > > I'll try to do it ASAP.
              > > > Thank you.
              > > >
              > > > Best wishes, Paul.
              > > >
              > > > --- sebaklu@y... wrote:
              > > > > Hi, Paul!
              > > > >
              > > > > Thanx for the answer.
              > > > >
              > > > > In fact the memory usage in the CGI package
              > > > > (SOAP::Transport::HTTP) is a problem for handling big SOAP
              > > > > requests, because the complete content is kept in memory
              > > > > more than once.
              > > > >
              > > > > Extract from package SOAP::Transport::HTTP::CGI,
              > > > > located in SOAP/Transport/HTTP.pm:
              > > > >
              > > > > my $content;
              > > > > read(STDIN, $content, $ENV{'CONTENT_LENGTH'} || 0);
              > > > > # ^ first copy: the whole request body is read into $content
              > > > >
              > > > > $self->request(HTTP::Request->new(
              > > > >   $ENV{'REQUEST_METHOD'} || '' => $ENV{'SCRIPT_NAME'},
              > > > >   HTTP::Headers->new(
              > > > >     map { (/^HTTP_(.+)/i ? $1 : $_) => $ENV{$_} } keys %ENV
              > > > >   ),
              > > > >   $content,
              > > > >   # ^ second copy: $content is handed to HTTP::Request
              > > > > ));
              > > > >
              > > > > Will a future version support stream-based handling directly
              > > > > from STDIN?
              > > > >
              > > > > --- In soaplite@y..., Paul Kulchenko <paulclinger@y...> wrote:
              > > > > > Hi, Sebastian!
              > > > > >
              > > > > > Yes, you may specify an option for MIME::Parser to keep the
              > > > > > temporary file on disk during parsing, but the message will
              > > > > > still have to be in memory later to be parsed, though it
              > > > > > shouldn't take 200MB to parse it. As long as it's one part
              > > > > > you should be fine. Anyway, there is not much you can change
              > > > > > (except options for MIME::Parser) and I'll do my tests to
              > > > > > check it.
              > > > > >
              > > > > > Best wishes, Paul.
              === message truncated ===


            • allierogers@yahoo.com
              Message 6 of 13 , Mar 22, 2001
                Paul,

                You make many good points, and I see the issues. Maybe we need to
                decompose the problem into smaller pieces. As I see it, there are 2
                basic issues related to large SOAP RPC calls:

                1. SOAP request where arguments are large

                2. SOAP response where method result is large

                I thought the proper way to handle number 1 was to use the HTTP file
                upload capability where the SOAP argument references the data in the
                HTTP upload or via URL somehow. Maybe I am wrong about this because
                my particular implementations never make use of this feature. I
                realize this is an HTTP-centric approach, but the scalability issues
                really revolve around HTTP implementations and not so much in SMTP
                and FTP where disk rather than RAM storage is the default and
                dispatch and execution threads are not as memory sensitive as HTTP
                (for instance, an SMTP server is only handling one message at a time
                while an HTTP server may be handling 100,000 requests at once).

                For me, number 2 is more important. In this case, the method call is
                simple and its data is relatively small, so how you parse and handle
                the request on the server side is fine as it is. However, the
                response should be streamed in all cases. There is no issue of XML
                parsing on the server side in this case. But, the server should
                assume that the method return could be arbitrarily large, so it
                should not attempt to receive the entire method return before it
                starts passing it back to the client. The server should start, right
                away, streaming back the SOAP envelope, unbuffered, or at least allow
                this as a setting. For SOAP::Lite, you have the problem that you are
                dynamically trying to figure out what types are in the method
                response, rather than relying on a static definition (e.g., a
                configuration file on the server to map method name, namespace, URI,
                arguments, return types, etc.). Without a static configuration,
                there is no way to stream as I would like. Maybe it could be a
                performance option?
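
As an illustration of the "start streaming the envelope right away" idea, here is a minimal sketch. It is written in Python rather than Perl purely so that it is self-contained, and it is not how SOAP::Lite or any server mentioned in this thread works. The names `stream_envelope` and `fake_result_set`, and the `row`/`col` element names, are hypothetical. The point is only that the response is produced fragment by fragment from an iterator, so the full result set never has to exist in memory at once.

```python
import sys
from typing import Iterable, Iterator, Sequence
from xml.sax.saxutils import escape

# SOAP 1.1 envelope namespace.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"


def stream_envelope(method: str, rows: Iterable[Sequence[object]]) -> Iterator[str]:
    """Yield a SOAP-style response one fragment at a time.

    `rows` can be any iterator (for example a database cursor), so only
    one row is held in memory while it is being serialized.
    """
    yield f'<SOAP-ENV:Envelope xmlns:SOAP-ENV="{SOAP_ENV}"><SOAP-ENV:Body>'
    yield f"<{method}Response>"
    for row in rows:
        cells = "".join(f"<col>{escape(str(value))}</col>" for value in row)
        yield f"<row>{cells}</row>"
    yield f"</{method}Response>"
    yield "</SOAP-ENV:Body></SOAP-ENV:Envelope>"


def fake_result_set(count: int) -> Iterator[Sequence[object]]:
    """Hypothetical stand-in for a large SQL result set."""
    for i in range(count):
        yield (i, f"name<{i}>")


if __name__ == "__main__":
    # Each fragment is written as soon as it is produced.
    for fragment in stream_envelope("getRows", fake_result_set(3)):
        sys.stdout.write(fragment)
    sys.stdout.write("\n")
```

The catch Allie points out still applies: the element names and types have to be known before the first row is written, which is why a static description of the method's return type makes this kind of streaming much easier.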

                This is how we have solved the problem, here, both for SOAP and
                XML-RPC. However, we use SOAP::Lite servers only for some
                prototyping. In other cases, we have COM-based SOAP/XML-RPC servers
                (e.g., 4s4c from pocketsoap.com) and IIS configured for unbuffered
                response.

                Regards,

                Allie
              • allierogers@yahoo.com
                Message 7 of 13 , Mar 22, 2001
                  Petr,

                  I know Don is an important figure, but I disagree with him. SOAP can
                  be used for large data, just as all HTTP servers today handle large
                  data. We use it for this purpose here in all of our products and it
                  works well. Maybe he did not intend this use of SOAP, but it does
                  work. However, SOAP::Lite currently cannot be used in this way.
                  Perhaps that will change.

                  Allie

                  --- In soaplite@y..., "Petr Janata" <petr.janata@i...> wrote:
                  > Hello,
                  >
                  > I would just like to remark that we had Don Box last week in
                  > Prague giving a SOAP talk and he said that SOAP is meant to work
                  > exactly like this, i.e. not to handle large amounts of data. He
                  > also said that if you need to do that you can e.g. pass just the
                  > URL in a SOAP message and use simple download (or SAX parser) to
                  > handle the transfer.
                  >
                  > Petr Janata
                • sebaklu@yahoo.com
                  Message 8 of 13 , Mar 22, 2001
                    Hi Paul,

                    That's right, but the current version of SOAP::Lite does not seem
                    to expect large requests. The server gives no response and fills
                    up all the memory on the machine. Since it accepts SOAP messages
                    with attachments, it should be able to handle large amounts of data.

                    However, it works fine with simple requests. But in general, the
                    handling of SOAP messages with attachments (the 7 MB attachment was
                    an example; I also had problems handling smaller attachments) should
                    use a stream mechanism. If it cannot, it should not read the request
                    into memory but reject it. Maybe I'm wrong, but this is a weak point
                    and DoS attacks may exploit it.
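
A minimal sketch of the "stream it or reject it" behaviour described above, in Python for self-containment. This is an illustration under stated assumptions, not SOAP::Lite code: `spool_body`, `RequestTooLarge`, and the chosen limits are hypothetical. The body is read in fixed-size chunks and spooled to a temporary file, and reading stops as soon as the cap is exceeded, so an oversized request can never fill the machine's memory.

```python
import io
import tempfile


class RequestTooLarge(Exception):
    """Raised when the request body exceeds the configured cap."""


def spool_body(stream, max_bytes, chunk_size=64 * 1024):
    """Copy `stream` to a temporary file in chunks of `chunk_size` bytes.

    At most one chunk is held in memory at a time. If more than
    `max_bytes` arrive, the spool file is discarded and
    RequestTooLarge is raised.
    """
    spool = tempfile.TemporaryFile()
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # end of input
            break
        total += len(chunk)
        if total > max_bytes:
            spool.close()
            raise RequestTooLarge(f"body exceeds {max_bytes} bytes")
        spool.write(chunk)
    spool.seek(0)  # rewind so the caller can read from the start
    return spool


if __name__ == "__main__":
    # io.BytesIO stands in for the request input stream (e.g. STDIN in CGI).
    body = io.BytesIO(b"x" * 1000)
    with spool_body(body, max_bytes=2000, chunk_size=100) as spooled:
        print(len(spooled.read()))
```

Counting bytes while reading, rather than trusting the declared content length alone, also covers clients that send more data than they announce.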


                    Sebastian

                    --- In soaplite@y..., Paul Kulchenko <paulclinger@y...> wrote:
                    > Hi, Sebastian!
                    >
                    > That's true, but at the same time it's easy to imagine a situation
                    > when you send something not directly, but through several different
                    > intermediaries, and each of them will need to handle this huge
                    > request. If this piece is encoded as an external reference then the
                    > handler could be smart enough to get it only if it's required (yet
                    > I don't know about such smart handlers :)). Ideas, ideas...
                    >
                    > Ideally an implementation should be flexible enough to handle both
                    > (and maybe many other) approaches, maybe with manual hints.
                    >
                    > Best wishes, Paul.
                  • Paul Kulchenko
                    Message 9 of 13 , Mar 22, 2001
                      Hi, Sebastian!

                      Absolutely agree. That's the reason why I want to introduce some
                      additional transport options, like ACCEPTABLE_CONTENT_TYPE (if you
                      want to accept ONLY text/xml or multipart/related) and
                      MAX_CONTENT_SIZE, which should take care of this by rejecting
                      such requests. As for DoS attacks, one could be mounted even with
                      a small request that has a complex XML structure. Anyway, these
                      options should make the server side more robust.
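
To make the two proposed options concrete, here is a hedged sketch in Python of a check that runs before any of the body is read. The option names ACCEPTABLE_CONTENT_TYPE and MAX_CONTENT_SIZE come from the message above, where they are only proposed; the function `check_request`, the 1,000,000-byte limit, and the chosen status codes are assumptions made for illustration and do not describe an existing SOAP::Lite API.

```python
# Content types the server is willing to parse (values from the message above).
ACCEPTABLE_CONTENT_TYPES = ("text/xml", "multipart/related")

# Hypothetical cap on the declared request size, in bytes.
MAX_CONTENT_SIZE = 1_000_000


def check_request(environ):
    """Decide from CGI-style headers whether to accept a request.

    Returns an (HTTP status, reason) pair. Nothing is read from the
    request body, so a rejected request costs almost no memory.
    """
    # Drop parameters such as "; charset=utf-8" before comparing.
    content_type = environ.get("CONTENT_TYPE", "").split(";", 1)[0].strip().lower()
    if content_type not in ACCEPTABLE_CONTENT_TYPES:
        return 415, "Unsupported Media Type"

    try:
        length = int(environ.get("CONTENT_LENGTH") or 0)
    except ValueError:
        return 400, "Bad Request"
    if length < 0:
        return 400, "Bad Request"
    if length > MAX_CONTENT_SIZE:
        return 413, "Request Entity Too Large"

    return 200, "OK"


if __name__ == "__main__":
    print(check_request({"CONTENT_TYPE": "text/xml; charset=utf-8",
                         "CONTENT_LENGTH": "120"}))
    print(check_request({"CONTENT_TYPE": "text/xml",
                         "CONTENT_LENGTH": str(7 * 1024 * 1024)}))
```

As the message notes, a size limit does not help against a small request with a pathologically complex XML structure; that would need separate limits in the parser itself.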

                      Best wishes, Paul.

                      --- sebaklu@... wrote:
                      > Hi Paul,
                      >
                      > That's right, but the current version of SOAP::Lite does not seem
                      > to expect large requests. The server gives no response and fills
                      > up all the memory on the machine. Since it accepts SOAP messages
                      > with attachments, it should be able to handle large amounts of
                      > data.
                      >
                      > However, it works fine with simple requests. But in general, the
                      > handling of SOAP messages with attachments (the 7 MB attachment
                      > was an example; I also had problems handling smaller attachments)
                      > should use a stream mechanism. If it cannot, it should not read
                      > the request into memory but reject it. Maybe I'm wrong, but this
                      > is a weak point and DoS attacks may exploit it.
                      >
                      >
                      > Sebastian
                      >
                      > --- In soaplite@y..., Paul Kulchenko <paulclinger@y...> wrote:
                      > > Hi, Sebastian!
                      > >
                      > > That's true, but at the same time it's easy to imagine a
                      > > situation when you send something not directly, but through
                      > > several different intermediaries, and each of them will need to
                      > > handle this huge request. If this piece is encoded as an
                      > > external reference then the handler could be smart enough to
                      > > get it only if it's required (yet I don't know about such smart
                      > > handlers :)). Ideas, ideas...
                      > >
                      > > Ideally an implementation should be flexible enough to handle
                      > > both (and maybe many other) approaches, maybe with manual hints.
                      > >
                      > > Best wishes, Paul.