Re: [soaplite] How to handle big soap attachments ?

  • Paul Kulchenko
    Message 1 of 13, Mar 21, 2001
      Hi, Sebastian!

      Yes, you can set an option on MIME::Parser to keep temporary files
      on disk during parsing, but the content will still have to be in
      memory later to parse the message, though it shouldn't take 200MB
      to parse it. As long as it's one part you should be fine. Anyway,
      there is not much you can change (except the options for
      MIME::Parser), and I'll run my own tests to check it.

      Best wishes, Paul.
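
A minimal sketch of the MIME::Parser options mentioned above, assuming the MIME-tools distribution is installed. The helper name parse_to_disk and the output directory argument are made up for illustration; how (or whether) SOAP::Lite exposes its internal parser for this kind of configuration is not stated in this thread.

```perl
use strict;
use warnings;
use MIME::Parser;

# Hypothetical helper: parse a MIME message from $fh, writing decoded
# bodies to files under $dir instead of keeping them in memory.
sub parse_to_disk {
    my ($fh, $dir) = @_;

    my $parser = MIME::Parser->new;
    $parser->output_dir($dir);     # decoded part bodies become files here
    $parser->output_to_core(0);    # do not hold decoded bodies in memory
    $parser->tmp_to_core(0);       # keep the parser's temp data on disk too

    return $parser->parse($fh);    # returns a MIME::Entity
}
```

For a part that was written to disk, `$part->bodyhandle->path` gives the file name, so an attachment can be processed from the file rather than from a scalar.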

      --- sebaklu@... wrote:
      > Hi.
      >
      > I'm developing a SOAP::Lite based web service that should be able
      > to handle attached file content.
      >
      > While posting big file content the server runs out of memory.
      > I use the CGI package located in the SOAP::Transport::HTTP module
      > to process the request.
      >
      > ------------- -------
      >
      >
      > # The posted request
      >
      > POST http://localhost/ws/HandleFilePost.cgi
      > Content-Length: 6906872
      > Content-Type: Multipart/Related; boundary=Mime_Boundary;
      > type="text/xml"; start="<foo.xml@...>"
      > MIME-Version: 1.0
      >
      > --Mime_Boundary
      > Content-Type: text/xml; charset=UTF-8
      > Content-Transfer-Encoding: 8bit
      > Content-ID: <foo.xml@...>
      >
      > <?xml version="1.0" encoding="UTF-8"?>
      > <SOAP-ENV:Envelope
      > xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/"
      >
      > SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
      > xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance"
      > xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
      > xmlns:xsd="http://www.w3.org/1999/XMLSchema">
      > <SOAP-ENV:Body>
      > <namesp1:processfile
      > xmlns:namesp1="http://localhost/HandleFilePost">
      > <filecontent href="cid:foo.jpg@..."/>
      > </namesp1:processfile>
      > </SOAP-ENV:Body>
      > </SOAP-ENV:Envelope>
      >
      > --Mime_Boundary
      > Content-Type: image/jpeg
      > Content-Transfer-Encoding: binary
      > Content-ID: <foo.jpg@...>
      >
      > ... the binary image content (7 MB) ...
      > --Mime_Boundary--
      >
      >
      > # The service:
      >
      > use SOAP::Transport::HTTP;
      >
      > # Dispatch incoming SOAP calls to the HandleFilePost package.
      > SOAP::Transport::HTTP::CGI
      >   -> dispatch_to('HandleFilePost')
      >   -> handle;
      >
      > package HandleFilePost;
      >
      > sub processfile {
      >   # do nothing here ...
      > }
      >
      > 1;
      > __END__
      >
      > ------------- -------
      >
      > This request allocates 64 MB of real memory and 130 MB of virtual
      > memory on the server and finally runs out of memory.
      >
      > Is there another way to handle the request that will not load the
      > complete request into memory?
      >
      >
      > Sebastian
      >
      >
      > ------------------------ Yahoo! Groups Sponsor
      >
      > To unsubscribe from this group, send an email to:
      > soaplite-unsubscribe@yahoogroups.com
      >
      >
      >
      > Your use of Yahoo! Groups is subject to
      > http://docs.yahoo.com/info/terms/
      >
      >


      __________________________________________________
      Do You Yahoo!?
      Get email at your own domain with Yahoo! Mail.
      http://personal.mail.yahoo.com/
    • sebaklu@yahoo.com
      Message 2 of 13, Mar 21, 2001
        Hi, Paul!

        Thanx for the answer.

        In fact the memory usage in the CGI package (SOAP::Transport::HTTP)
        is a problem for handling big SOAP requests because the complete
        content is kept in memory more than once.

        extract from package SOAP::Transport::HTTP::CGI
        located in SOAP/Transport/HTTP.pm:

        my $content; read(STDIN,$content,$ENV{'CONTENT_LENGTH'} || 0);
        # ^^^^^^^^--- first time

        $self->request(HTTP::Request->new(
          $ENV{'REQUEST_METHOD'} || '' => $ENV{'SCRIPT_NAME'},
          HTTP::Headers->new(map {(/^HTTP_(.+)/i ? $1 : $_) => $ENV{$_}} keys %ENV),
          $content,
        # ^^^^^^^^- second time
        ));

        Will a future version support stream-based handling directly from STDIN?
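
What stream-based handling of the request body might look like, as a minimal sketch: instead of reading CONTENT_LENGTH bytes into one scalar, the body is copied from the input handle to a temporary file in fixed-size chunks. The helper name spool_to_tempfile and the 64 KB default chunk size are assumptions for illustration; nothing like this is confirmed to exist in SOAP::Transport::HTTP.

```perl
use strict;
use warnings;
use File::Temp ();

# Hypothetical helper: copy up to $length bytes from $in to a temp file,
# holding at most $chunk_size bytes in memory at any time.
sub spool_to_tempfile {
    my ($in, $length, $chunk_size) = @_;
    $chunk_size ||= 64 * 1024;

    my $tmp = File::Temp->new(UNLINK => 1);
    binmode $in;
    binmode $tmp;

    my $remaining = $length;
    while ($remaining > 0) {
        my $want = $remaining < $chunk_size ? $remaining : $chunk_size;
        my $got  = read($in, my $buf, $want);
        die "read failed: $!" unless defined $got;
        last if $got == 0;    # input ended before $length bytes arrived
        print {$tmp} $buf or die "write failed: $!";
        $remaining -= $got;
    }

    $tmp->flush;
    seek($tmp, 0, 0) or die "seek failed: $!";
    return $tmp;    # File::Temp object, usable as a read filehandle
}
```

In a CGI script this might be called as `my $tmp = spool_to_tempfile(\*STDIN, $ENV{CONTENT_LENGTH} || 0);`, after which the spooled file can be handed to a parser that reads from a filehandle.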

      • Paul Kulchenko
        Message 3 of 13, Mar 21, 2001
          Hi, Sebastian!

          It might, thanks for the tip. I won't make any promises, but I
          will definitely take a look. The next version is about to be
          released, and I don't think it will include any changes in this
          area (unless they are minimal, which is also possible), but I'll
          try to do it ASAP. Thank you.

          Best wishes, Paul.

        • allierogers@yahoo.com
          Message 4 of 13, Mar 21, 2001
            Paul,

            I have seen a similar problem with ALL SOAP (and XML-RPC)
            implementations. That is that they do not handle streams at all.
            Every implementation (that I know of) loads the entire HTTP
            request/response into memory (bad idea, what if it's an MPEG movie or
            something), parses it into an in-memory tree, and then hands it off
            for proper dispatch.

            Why do implementations assume that all methods accept 2 simple
            arguments and return 1 simple argument like all of the examples (for
            instance, "getQuote")? The real world is not like that.

            In the real world, there may be many complicated arguments (like a
            file upload) and the response may also be arbitrarily large (e.g., a
            SQL result set of 100,000 rows).

            HTTP, SOAP, XML-RPC and XML could handle this fine, if the
            implementations were a bit smarter.

            - Parse using SAX or stream parsers, only (never DOM), and always
            parse "on the fly" without loading entire stream into RAM.

            - Always stream to/from HTTP, rather than loading entire
            request/response into RAM.

            Since HTTP is the most common transport, do what good HTTP servers
            do. When you browse to a site with streaming video, do you think
            the HTTP server loads the entire video file into RAM for each
            request? Of course not. It streams it off disk (or cache) directly
            back to the client, chunk by chunk, so that only a small part of a
            large file is ever in memory at once. In that way, the HTTP server
            can handle 100,000 simultaneous hits to that same streaming video.

            Am I wrong about this?

            Regards,

            Allie Rogers
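
A minimal sketch of the "parse on the fly" point, assuming XML::Parser is installed: the handlers are called while expat reads the input from a filehandle, and no tree is built. The function name scan_stream and the statistics it collects are made up for illustration; a server built this way would need to do its real work inside such handlers.

```perl
use strict;
use warnings;
use XML::Parser;

# Hypothetical helper: walk an XML document from a filehandle and count
# elements and text characters without building an in-memory tree.
sub scan_stream {
    my ($fh) = @_;
    my %stats = (elements => 0, text_chars => 0);

    my $parser = XML::Parser->new(
        Handlers => {
            # Called once per start tag as it is encountered.
            Start => sub { $stats{elements}++ },
            # Called for each run of character data (possibly in pieces).
            Char  => sub {
                my ($expat, $text) = @_;
                $stats{text_chars} += length $text;
            },
        },
    );

    $parser->parse($fh);    # the parser reads from the handle itself
    return \%stats;
}
```

Memory use here depends on what the handlers keep, not on the size of the document.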

          • Paul Kulchenko
            Message 5 of 13, Mar 21, 2001
              Hi, Allie!

              You're definitely right, and that's exactly what I want to do. Some
              buts: first, there could be a MIME-encoded message that is parsed
              differently, so the relationships between
              Transport-Parser-SOAP-Your_module become a little more complicated.
              Second, your module has to get its parameters somehow, and there is
              no way to give them to you without finishing parsing first. Third, I
              cannot dispatch the message to you without finishing parsing, because
              the XML could be wrong, unfinished, etc., so again, I need to finish
              parsing. As soon as I've finished parsing I need to keep the data
              somewhere. I'm using a SAX parser, but the RESULT of parsing is stored
              in memory in a structure similar to DOM ("similar" in terms of memory
              consumed), and I don't know HOW I can avoid that. All that you can
              save is the memory currently used for storing the message BEFORE
              parsing, and that will work only for streaming transports
              (CGI/STDIO/TCP?) and won't work for the others anyway.

              At the same time it is worth doing, but I was thinking about a
              different approach. Keep ONLY the XML message in memory, navigate
              through it with a quick XPath component, and provide parameters as
              tied variables, so that as soon as you ask for one it is found in
              the message and returned to you. Memory vs. speed. It will also make
              the implementation significantly more complicated.

              I did some benchmarking recently: if you take XML::Parser as 1, then
              XML::Parser with the Tree style parses the same message 5 times
              slower, and my implementation is about 7-8 times slower, just
              because of memory manipulation. I don't think that can be
              significantly improved.

              Any other ideas?

              Best wishes, Paul.
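
The tied-variable idea described above could look roughly like this. It is only a sketch under stated assumptions: the package name LazyParam is hypothetical and not part of SOAP::Lite, and a naive regular expression stands in for the "quick XPath component", so namespaces, nested elements of the same name, and entities are not handled.

```perl
use strict;
use warnings;

package LazyParam;    # hypothetical name, not part of SOAP::Lite

# Tie a scalar to a reference to the raw XML text plus an element name.
sub TIESCALAR {
    my ($class, $xml_ref, $name) = @_;
    return bless { xml => $xml_ref, name => $name, lookups => 0 }, $class;
}

# Runs on every read of the tied variable: the value is located in the
# raw message on demand instead of being extracted up front.
sub FETCH {
    my ($self) = @_;
    $self->{lookups}++;
    my $name = quotemeta $self->{name};
    # Naive regex standing in for a real XPath lookup.
    my ($value) = ${ $self->{xml} } =~ m{<$name(?:\s[^>]*)?>(.*?)</$name>}s;
    return $value;
}

sub STORE { die "parameters are read-only in this sketch\n" }

package main;

my $xml = '<Envelope><Body><getQuote>'
        . '<symbol>IBM</symbol><count>3</count>'
        . '</getQuote></Body></Envelope>';

tie my $symbol, 'LazyParam', \$xml, 'symbol';
tie my $count,  'LazyParam', \$xml, 'count';

print "symbol=$symbol count=$count\n";
```

Each read repeats the search, which is the memory-versus-speed trade-off: only the raw message is kept, and the cost moves to access time.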

            • Petr Janata
              Message 6 of 13, Mar 21, 2001
                Hello,

                I would just like to remark that we had Don Box in Prague last week
                giving a SOAP talk, and he said that SOAP is meant to work exactly
                like this, i.e. not to handle large amounts of data. He also said
                that if you need to do that you can, e.g., pass just the URL in a
                SOAP message and use a simple download (or a SAX parser) to handle
                the transfer.

                Petr Janata

              • sebaklu@yahoo.com
                Message 7 of 13, Mar 22, 2001
                  Hi Petr.

                  Since there is a specification for SOAP messages with
                  attachments, it should be possible to send all the data that
                  belongs to a request at once. Why complicate it with more requests?
                  My problem is that the client can send data only via HTTP and the
                  server can receive it only via HTTP. The server itself can't send
                  any request for the required data to the client. Firewalls and
                  security rules on both sides make other requests, e.g. via URLs or
                  FTP, impossible.
                  I think SOAP is a good solution for this problem.


                  Sebastian

                  --- In soaplite@y..., "Petr Janata" <petr.janata@i...> wrote:
                  > Hello,
                  >
                  > I would just like to remark that we had Don Box last week in Prague
                  giving a
                  > SOAP talk and he said that SOAP is meant to work exactly like
                  this, i.e.
                  > not to handle large amounts of data. He also said that if you need
                  to do
                  > that you can e.g. pass just the URL in a SOAP message and use simple
                  > download (or SAX parser ) to handle the transfer.
                  >
                  > Petr Janata
                  >
                  > -----Original Message-----
                  > From: sentto-2738395-120-985195300-petr.janata=i.cz@r...
                  > [mailto:sentto-2738395-120-985195300-petr.janata=i.cz@r...]On
                  > Behalf Of allierogers@y...
                  > Sent: Wednesday, March 21, 2001 6:22 PM
                  > To: soaplite@y...
                  > Subject: [soaplite] Re: How to handle big soap attachments ?
                  >
                  >
                  > Paul,
                  >
                  > I have seen a similar problem with ALL SOAP (and XML-RPC)
                  > implementations. That is that they do not handle streams at all.
                  > Every implementation (that I know of) loads the entire HTTP
                  > request/response into memory (bad idea, what if it's a MPEG movie or
                  > something), parses it into an in-memory tree, and then hands it off
                  > for proper dispatch.
                  >
                  > Why do implementations assume that all methods accept 2 simple
                  > arguments and return 1 simple argument like all of the examples (for
                  > instance, "getQuote")? The real world is not like that.
                  >
                  > In the real world, there may be many complicated arguments (like a
                  > file upload) and the response may also be arbitrarily large (e.g., a
                  > SQL result set of 100,000 rows).
                  >
                  > HTTP, SOAP, XML-RPC and XML can handle this fine, if the
                  > implementations were a bit smarter.
                  >
                  > - Parse using SAX or stream parsers, only (never DOM), and always
                  > parse "on the fly" without loading entire stream into RAM.
                  >
                  > - Always stream to/from HTTP, rather than loading entire
                  > request/response into RAM.
                  >
                  > Since HTTP is the most common transport, do what good HTTP servers
                  > do. When you browse to a site with streaming video, do you think
                  > the HTTP server loads the entire video file into RAM for each
                  > request? Of course not. It streams it off disk (or cache) directly
                  > back to the client, chunk by chunk, so that only a small amount of
                  > a large file is ever in memory at once. In that way, the HTTP
                  > server can handle 100,000 simultaneous hits to that same streaming
                  > video.
                  >
                  > Am I wrong about this?
                  >
                  > Regards,
                  >
                  > Allie Rogers
                  >
                  > --- In soaplite@y..., Paul Kulchenko <paulclinger@y...> wrote:
                  > > Hi, Sebastian!
                  > >
                  > > It might, thanks for the tip. I won't make any promises, but I
                  > > will definitely take a look. The next version is about to be
                  > > released, and I don't think it will include any changes in this
                  > > aspect (unless they are minimal, which is also possible), but
                  > > I'll try to do it ASAP. Thank you.
                  > >
                  > > Best wishes, Paul.
                  > >
                  > > --- sebaklu@y... wrote:
                  > > > Hi, Paul!
                  > > >
                  > > > Thanx for the answer.
                  > > >
                  > > > In fact the memory usage in the CGI package
                  > > > (SOAP::Transport::HTTP) is a problem for handling big SOAP
                  > > > requests because the complete content is kept in memory more
                  > > > than once.
                  > > >
                  > > > Extract from package SOAP::Transport::HTTP::CGI,
                  > > > located in SOAP/Transport/HTTP.pm:
                  > > >
                  > > > my $content; read(STDIN,$content,$ENV{'CONTENT_LENGTH'} || 0);
                  > > > # ^^^^^^^^--- first time
                  > > >
                  > > > $self->request(HTTP::Request->new(
                  > > > $ENV{'REQUEST_METHOD'} || '' => $ENV{'SCRIPT_NAME'},
                  > > > HTTP::Headers->new(map {(/^HTTP_(.+)/i ? $1 : $_) => $ENV{$_}}
                  > > > keys %ENV),
                  > > > $content,
                  > > > # ^^^^^^^^- second time
                  > > > ));
                  > > >
                  > > > Will a future version support stream-based handling directly
                  > > > from STDIN?
                  > > >
                  > > > --- In soaplite@y..., Paul Kulchenko <paulclinger@y...> wrote:
                  > > > > Hi, Sebastian!
                  > > > >
                  > > > > Yes, you may specify option for MIME::Parser to keep temporary
                  > > > > file on disk during parsing, but presence it in memory will be
                  > > > > required later to parse the message, though it shouldn't take
                  > > > > 200MB to parse it. As soon as it's one part you should be fine.
                  > > > > Anyway, there is not much you can change (except options for
                  > > > > MIME::Parser) and I'll do my tests to check it.
                  > > > >
                  > > > > Best wishes, Paul.
                • Paul Kulchenko
                  Message 8 of 13 , Mar 22, 2001
                    Hi, Sebastian!

                    That's true, but at the same time it's easy to imagine a situation
                    where you send something not directly but through several different
                    intermediaries, and each of them will need to handle this huge
                    request. If this piece is encoded as an external reference, then a
                    handler could be smart enough to fetch it only if it's required
                    (yet I don't know about such smart handlers :)). Ideas, ideas...

                    Ideally an implementation should be flexible enough to handle both
                    (and maybe many other) approaches, maybe with manual hints.

                    Best wishes, Paul.

                    --- sebaklu@... wrote:
                    > Hi Petr.
                    >
                    > Since there exists a specification for SOAP messages with
                    > attachments it should be possible to send the entire data
                    > according to the request at once. Why make it complicated with
                    > more requests?
                    > My problem is that the client could send only data via HTTP and
                    > the server can receive it only via HTTP. The server itself can't
                    > send any request for required data to the client. Firewalls and
                    > security rules on both sides make other requests, e.g. via URLs
                    > or FTP, impossible.
                    > I think SOAP is a good solution for this problem.
                    >
                    >
                    > Sebastian
                  • allierogers@yahoo.com
                    Message 9 of 13 , Mar 22, 2001
                      Paul,

                      You make many good points, and I see the issues. Maybe we need to
                      decompose the problem into smaller spaces. As I see it, there are 2
                      basic issues related to large SOAP RPC calls:

                      1. SOAP request where arguments are large

                      2. SOAP response where method result is large

                      I thought the proper way to handle number 1 was to use the HTTP file
                      upload capability where the SOAP argument references the data in the
                      HTTP upload or via URL somehow. Maybe I am wrong about this because
                      my particular implementations never make use of this feature. I
                      realize this is an HTTP-centric approach, but the scalability issues
                      really revolve around HTTP implementations and not so much in SMTP
                      and FTP where disk rather than RAM storage is the default and
                      dispatch and execution threads are not as memory sensitive as HTTP
                      (for instance, an SMTP server is only handling one message at a time
                      while an HTTP server may be handling 100,000 requests at once).

                      For me, number 2 is more important. In this case, the method call is
                      simple and its data is relatively small, so how you parse and handle
                      the request on the server side is fine as it is. However, the
                      response should be streamed in all cases. There is no issue of XML
                      parsing on the server side in this case. But, the server should
                      assume that the method return could be arbitrarily large, so it
                      should not attempt to receive the entire method return before it
                      starts passing it back to the client. The server should start, right
                      away, streaming back the SOAP envelope, unbuffered, or at least allow
                      this as a setting. For SOAP::Lite, you have the problem that you are
                      dynamically trying to figure out what types are in the method
                      response, rather than through static definition (e.g., a
                      configuration file on the server to map method name, namespace, URI,
                      arguments, return types, etc.). Without a static configuration,
                      there is no way to stream as I would like. Maybe it could be a
                      performance option?

                      This is how we have solved the problem here, both for SOAP and
                      XML-RPC. However, we use SOAP::Lite servers only for some
                      prototyping. In other cases, we have COM-based SOAP/XML-RPC servers
                      (e.g., 4s4c from pocketsoap.com) and IIS configured for unbuffered
                      response.

                      Regards,

                      Allie
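
As a rough illustration of the second case above (a large method result), the Perl sketch below writes the envelope to the client while the rows are still being produced, so only one row is in memory at a time. It is not SOAP::Lite code. The names `stream_rows` and `escape_xml`, the `getRowsResponse` and `row` elements, and the row iterator are all hypothetical, and the static knowledge of the result type is exactly the assumption the message says is needed.

```perl
use strict;
use warnings;
use IO::Handle;

# Hypothetical helper: escape the three characters that must not appear
# unescaped in XML character data.
sub escape_xml {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Hypothetical helper: write a SOAP envelope to $out, pulling one row at a
# time from the $next_row callback (which returns undef when exhausted).
# Returns the number of rows written.
sub stream_rows {
    my ($out, $next_row) = @_;
    $out->autoflush(1);    # push each piece out instead of buffering it

    print {$out}
        qq{<?xml version="1.0" encoding="UTF-8"?>\n},
        qq{<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">\n},
        qq{<SOAP-ENV:Body><getRowsResponse>\n};

    my $count = 0;
    while (defined(my $row = $next_row->())) {
        print {$out} '<row>', escape_xml($row), "</row>\n";
        $count++;
    }

    print {$out} qq{</getRowsResponse></SOAP-ENV:Body></SOAP-ENV:Envelope>\n};
    return $count;
}
```

A caller would pass `\*STDOUT` and a closure over a database cursor. The trade-off is that the response length is unknown in advance, and an error in the middle of the result can no longer be turned into a clean SOAP fault.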
                    • allierogers@yahoo.com
                      Message 10 of 13 , Mar 22, 2001
                        Petr,

                        I know Don is an important figure, but I disagree with him. SOAP can
                        be used for large data, just as all HTTP servers today handle large
                        data. We use it for this purpose here in all of our products and it
                        works well. Maybe he did not intend this use of SOAP, but it does
                        work. However, SOAP::Lite currently cannot be used in this way.
                        Perhaps that may change.

                        Allie

                        --- In soaplite@y..., "Petr Janata" <petr.janata@i...> wrote:
                        > Hello,
                        >
                        > I would just like to remark that we had Don Box last week in
                        > Prague giving a SOAP talk and he said that SOAP is meant to work
                        > exactly like this, i.e. not to handle large amounts of data. He
                        > also said that if you need to do that you can e.g. pass just the
                        > URL in a SOAP message and use simple download (or SAX parser) to
                        > handle the transfer.
                        >
                        > Petr Janata
                      • sebaklu@yahoo.com
                        Message 11 of 13 , Mar 22, 2001
                          Hi Paul,

                          That's right, but the current version of SOAP::Lite is not
                          prepared for large requests. The server gives no response and
                          fills up all the memory on the machine. Since it expects SOAP
                          messages with attachments, it should be able to handle large
                          amounts of data.

                          However, it works fine with simple requests. But in general,
                          handling SOAP messages with attachments (the 7 MB attachment was
                          an example; I also had problems handling smaller attachments)
                          should use a stream mechanism. If it does not, it should reject
                          the request rather than read it into memory. Maybe I'm wrong, but
                          this is a weak point and DoS attacks may exploit it.


                          Sebastian

                          --- In soaplite@y..., Paul Kulchenko <paulclinger@y...> wrote:
                          > Hi, Sebastian!
                          >
                          > That's true, but at the same time it's easy to imagine situation
                          > when you send something not directly, but thru the several
                          > different intermediaries and each of them will need to handle this
                          > huge request. If this piece is encoded as external reference then
                          > handler could be smart enough to get it only if it's required (yet
                          > I don't know about such smart handlers :)). Ideas, ideas...
                          >
                          > Ideally implementation should be flexible enough to handle both
                          > (and maybe many others) approaches, maybe with manual hints.
                          >
                          > Best wishes, Paul.
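
The stream mechanism asked for in this message can be sketched in a few lines of Perl. This is only an illustration under assumptions, not SOAP::Lite code: `spool_body` and its 64 KB default chunk size are hypothetical. It copies the request body to a temporary file chunk by chunk, so memory use stays at one chunk regardless of the attachment size.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: copy up to $length bytes from the handle $in into a
# temporary file, $chunk_size bytes at a time. Returns the file path and
# the number of bytes actually copied.
sub spool_body {
    my ($in, $length, $chunk_size) = @_;
    $chunk_size ||= 64 * 1024;

    # UNLINK => 1 removes the temporary file when the program exits.
    my ($out, $path) = tempfile(UNLINK => 1);
    binmode $in;
    binmode $out;

    my $remaining = $length;
    while ($remaining > 0) {
        my $want = $remaining < $chunk_size ? $remaining : $chunk_size;
        my $got  = read($in, my $buf, $want);
        die "read failed: $!" unless defined $got;
        last if $got == 0;    # the client sent less than it announced
        print {$out} $buf or die "write failed: $!";
        $remaining -= $got;
    }
    close $out or die "close failed: $!";

    return ($path, $length - $remaining);
}
```

In a CGI script the call would look like `my ($path, $bytes) = spool_body(\*STDIN, $ENV{CONTENT_LENGTH} || 0);`. The spooled file could then be handed to a MIME parser that works from disk, although the SOAP envelope part itself still has to be parsed afterwards.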
                        • Paul Kulchenko
                          Message 12 of 13 , Mar 22, 2001
                            Hi, Sebastian!

                            Absolutely agree. That's the reason why I want to introduce some
                            additional transport options, like ACCEPTABLE_CONTENT_TYPE (if you
                            want to accept ONLY text/xml or multipart/related) and
                            MAX_CONTENT_SIZE, which should take care of it: an oversized
                            request will be rejected. As for a DoS attack, one could be
                            mounted even with a small request that has a complex XML
                            structure. Anyway, these options should make the server side more
                            robust.

                            Best wishes, Paul.

                            --- sebaklu@... wrote:
                            > Hi Paul,
                            >
                            > That's right, but the current version of SOAP::Lite should never
                            > expect large requests. The server will give no response and fill
                            > out the complete memory on the machine. Since it expect SOAP
                            > messages with attachments it should be able to handle large
                            > amount of data.
                            >
                            > However, it works fine with simple requests. But general for
                            > handling SOAP messages with attachments (the 7 MB attachment was
                            > an example, i had also problems to handle smaller attachments)
                            > should use stream mechanism. If not you should not read it into
                            > memory but reject the request. Maybe i'm wrong but it is a weak
                            > point and DOS attacks may use it.
                            >
                            >
                            > Sebastian
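
A minimal sketch of the kind of check the two proposed options imply, written before any of the body is read. The option names come from the message above; everything else is an assumption: the `check_request` function, the 1 MB limit and the choice of status codes are hypothetical and say nothing about how SOAP::Lite implements, or will implement, these options.

```perl
use strict;
use warnings;

# Hypothetical settings, named after the options proposed in the thread.
my $MAX_CONTENT_SIZE        = 1024 * 1024;    # 1 MB, an arbitrary example
my @ACCEPTABLE_CONTENT_TYPE = ('text/xml', 'multipart/related');

# Hypothetical helper: inspect CGI-style environment variables and return
# an HTTP status code: 200 if the body may be read, otherwise a rejection.
sub check_request {
    my ($env) = @_;

    my $length = $env->{CONTENT_LENGTH};
    return 411 unless defined $length && $length =~ /^\d+$/;  # Length Required
    return 413 if $length > $MAX_CONTENT_SIZE;                # body too large

    # Compare only the media type, ignoring parameters such as boundary=...
    my ($type) = lc($env->{CONTENT_TYPE} // '') =~ m{^\s*([^;\s]+)};
    return 415    # Unsupported Media Type
        unless defined $type && grep { $_ eq $type } @ACCEPTABLE_CONTENT_TYPE;

    return 200;
}
```

A CGI handler could call `check_request(\%ENV)` first and send the returned status without touching STDIN. As the message notes, a size limit alone does not stop a small request with a deliberately complex XML structure.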