Scalability issues in building a third-party front-end MTA service?

  • Darren Pilgrim
    Message 1 of 4 , Aug 31, 2007
      I'm setting up a front-end MTA as a service for a number of clients. My
      server gets configured as the MX for the client's domain and I also
      provide an authenticated smarthost relay. The client stores their own
      mailboxes (i.e., on their Exchange server) and provides a list against
      which I can do recipient validation. My prototype setup is a relay
      domains setup with relay_transport lookup returning per-domain
      relay:[server.example.com] entries.

      As a proof of concept, this has worked well, but I have some concurrency
      questions I'd like answered:

      Let's say I know a client's mailbox server or internet access is down.
      I'd want to immediately hold all mail destined for their server--I don't
      want to even try delivery once. When they come back up, resume delivery
      and flush all held mail for that destination.

      Let's say I have one client with some very large emails and a slow
      connection, but I still have lots of bandwidth left over to relay to
      other clients' servers. I'd want postfix to deliver to separate clients
      in parallel, up to a maximum of N simultaneous connections per
      destination. From what I've read, this is the intended behavior with
      relay_destination_concurrency_limit=N. However, I'd want to set the
      value of N on a per-destination basis.

      I believe I can do this by creating a separate transport for each
      mailbox server. That would give me separate
      destination_concurrency_limit parameters for each mailbox server as well
      as allow me to "pause" delivery to individual mailbox servers by use of
      the defer_transports parameter. Is this a sound approach? Is there a
      better approach? Is there an upper limit on the number of transports I
      can create?
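
The one-transport-per-mailbox-server idea above can be sketched as a small generator. This is only an illustration under stated assumptions: the transport names (client1, client2), the second domain and host, and the concurrency values are hypothetical, and the script just prints text you would review before placing it in master.cf, main.cf, and the transport table.

```python
from dataclasses import dataclass


@dataclass
class Client:
    name: str            # hypothetical transport name, e.g. "client1"
    domain: str          # client domain we are the MX for
    nexthop: str         # client's mailbox server
    concurrency: int     # desired N for this destination
    paused: bool = False  # True while the client is known to be down


def master_cf(clients):
    # One smtp(8)-based transport per client. Columns are:
    # service type private unpriv chroot wakeup maxproc command
    return "\n".join(f"{c.name} unix - - n - - smtp" for c in clients)


def main_cf(clients):
    # Per-transport concurrency limits use the <transport>_ prefix.
    lines = [
        f"{c.name}_destination_concurrency_limit = {c.concurrency}"
        for c in clients
    ]
    # Transports listed here are not attempted until removed again.
    paused = [c.name for c in clients if c.paused]
    lines.append("defer_transports = " + ", ".join(paused))
    return "\n".join(lines)


def transport_map(clients):
    # Brackets around the nexthop suppress MX lookups.
    return "\n".join(f"{c.domain} {c.name}:[{c.nexthop}]" for c in clients)


clients = [
    Client("client1", "example.com", "server.example.com", 5),
    Client("client2", "example.net", "mail.example.net", 2, paused=True),
]

print("# master.cf\n" + master_cf(clients))
print("# main.cf\n" + main_cf(clients))
print("# transport table\n" + transport_map(clients))
```

To resume a paused client in this sketch, you would clear its paused flag, regenerate main.cf, run "postfix reload", and then flush the queue (for example with "postqueue -f").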

      --
      Darren Pilgrim
    • Victor Duchovni
      Message 2 of 4 , Aug 31, 2007
        On Fri, Aug 31, 2007 at 02:03:16PM -0700, Darren Pilgrim wrote:

        > My prototype setup is a relay
        > domains setup with relay_transport lookup returning per-domain
        > relay:[server.example.com] entries.
        >
        > Let's say I know a client's mailbox server or internet access is down.
        > I'd want to immediately hold all mail destined for their server--I don't
        > want to even try delivery once. When they come back up, resume delivery
        > and flush all held mail for that destination.

        This is sensible if you expect to queue substantially more than 20,000
        messages during the outage. Once the deferred queue grows into the
        hundreds of thousands of messages or more, the Postfix deferred
        queue becomes too large for periodic retries of every message, ...

        A fallback relay instance is a good way to siphon such mail out of the
        queue, into an instance where the mail is placed on hold, or otherwise
        separated from the primary queue.

        > Let's say I have one client with some very large emails and a slow
        > connection, but I still have lots of bandwidth left over to relay to
        > other clients' servers. I'd want postfix to deliver to separate clients
        > in parallel, up to a maximum of N simultaneous connections per
        > destination. From what I've read, this is the intended behavior with
        > relay_destination_concurrency_limit=N. However, I'd want to set the
        > value of N on a per-destination basis.

        You need a separate transport for each large client.

        > I believe I can do this by creating a separate transport for each
        > mailbox server. That would give me separate
        > destination_concurrency_limit parameters for each mailbox server as well
        > as allow me to "pause" delivery to individual mailbox servers by use of
        > the defer_transports parameter. Is this a sound approach? Is there a
        > better approach? Is there an upper limit on the number of transports I
        > can create?

        Each transport costs a few file descriptors in the master process. If you
        have hundreds of transports, you could run out of file descriptor slots.

        Also make sure that the queue manager file descriptor limit is not
        exhausted by the sum of the concurrencies of all the transports.
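
A rough way to check the second point is to add up the per-transport concurrency limits and compare the total with the per-process descriptor limit. The sketch below is an assumption-laden estimate, not a description of Postfix internals: it counts one descriptor per concurrent delivery, uses the 1024 figure mentioned below as the default limit, and the `reserve` headroom and the client names are hypothetical.

```python
def fd_budget(concurrency_by_transport, fd_limit=1024, reserve=100):
    """Compare the summed transport concurrencies with a descriptor limit.

    Assumes one descriptor per concurrent delivery (a simplification) and
    keeps `reserve` descriptors back for everything else (hypothetical).
    Returns (needed, available, fits).
    """
    needed = sum(concurrency_by_transport.values())
    available = fd_limit - reserve
    return needed, available, needed <= available


# A handful of clients with different limits fits easily.
small = {"client1": 20, "client2": 50, "client3": 5}
print(fd_budget(small))

# Hundreds of transports can exceed the budget even with a low limit each.
large = {f"client{i}": 5 for i in range(300)}
print(fd_budget(large))
```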

        Using a BSD system with working kqueue, Linux 2.6+ with epoll() or SunOS
        5.8+ with /dev/poll is a good idea, as you will be able to use more than
        1024 descriptors per process. But don't go crazy: adding more hardware
        is often a better approach.

        --
        Viktor.

        Disclaimer: off-list followups get on-list replies or get ignored.
        Please do not ignore the "Reply-To" header.

        To unsubscribe from the postfix-users list, visit
        http://www.postfix.org/lists.html or click the link below:
        <mailto:majordomo@...?body=unsubscribe%20postfix-users>

        If my response solves your problem, the best way to thank me is to not
        send an "it worked, thanks" follow-up. If you must respond, please put
        "It worked, thanks" in the "Subject" so I can delete these quickly.
      • Darren Pilgrim
        Message 3 of 4 , Aug 31, 2007
          Victor Duchovni wrote:
          > On Fri, Aug 31, 2007 at 02:03:16PM -0700, Darren Pilgrim wrote:
          >
          >> My prototype setup is a relay domains setup with relay_transport
          >> lookup returning per-domain relay:[server.example.com] entries.
          >>
          >> Let's say I know a client's mailbox server or internet access is
          >> down. I'd want to immediately hold all mail destined for their
          >> server--I don't want to even try delivery once. When they come
          >> back up, resume delivery and flush all held mail for that
          >> destination.
          >
          > This is sensible if you expect to queue substantially more than
          > 20,000 messages during the outage, once the deferred queue grows into
          > the hundreds of thousands or more of queued messages, the Postfix
          > deferred queue becomes too large for periodic retries of every
          > message, ...
          >
          > A fallback relay instance is a good way to siphon such mail out of
          > the queue, into an instance where the mail is placed on hold, or
          > otherwise separated from the primary queue.

          I'd anticipate 50-75k messages during a 2-day outage, so let's say
          150k-200k messages worst case, so yeah I'll run parallel instances on
          the box.

          >> Is there an upper limit on the number of transports I can create?
          >
          > Each transport costs a few file descriptors in the master process, if
          > you have hundreds of transports, you could run out of file
          > descriptor slots.
          >
          > Also make sure that the queue manager file descriptor limit is not
          > exhausted by the sum of the concurrencies of all the transports.
          >
          > [Use] a BSD system with working kqueue [to permit more than 1024 file
          > descriptors per process]

          Would you consider the facility in FreeBSD 6.x a working kqueue?

          > but don't go crazy, adding more hardware is often a better approach.

          At this point I'm looking at only 22 separate mailbox servers so I've
          got a bit of growing room.

          Thanks for the input.

          --
          Darren Pilgrim
        • Victor Duchovni
          Message 4 of 4 , Aug 31, 2007
            On Fri, Aug 31, 2007 at 03:25:16PM -0700, Darren Pilgrim wrote:

            > Victor Duchovni wrote:
            > >On Fri, Aug 31, 2007 at 02:03:16PM -0700, Darren Pilgrim wrote:
            > >
            > >>My prototype setup is a relay domains setup with relay_transport
            > >>lookup returning per-domain relay:[server.example.com] entries.
            > >>
            > >>Let's say I know a client's mailbox server or internet access is
            > >>down. I'd want to immediately hold all mail destined for their
            > >>server--I don't want to even try delivery once. When they come
            > >>back up, resume delivery and flush all held mail for that
            > >>destination.
            > >
            > >This is sensible if you expect to queue substantially more than
            > >20,000 messages during the outage, once the deferred queue grows into
            > >the hundreds of thousands or more of queued messages, the Postfix
            > >deferred queue becomes too large for periodic retries of every
            > >message, ...
            > >
            > >A fallback relay instance is a good way to siphon such mail out of
            > >the queue, into an instance where the mail is placed on hold, or
            > >otherwise separated from the primary queue.
            >
            > I'd anticipate 50-75k messages during a 2-day outage, so let's say
            > 150k-200k messages worst case, so yeah I'll run parallel instances on
            > the box.

            With decent memory and disk subsystems, larger (50k-100k) active queue
            size limits, and a larger maximal_backoff_time/(minimal_backoff_time +
            queue_run_delay) ratio, 200,000 deferred messages is still manageable.
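
To make the ratio above concrete, here is a small calculation. The parameter values are assumptions: 300s, 4000s, and 300s are what I understand to be the Postfix 2.4 defaults for minimal_backoff_time, maximal_backoff_time, and queue_run_delay, so check "postconf -d" on your own build. The per-scan estimate is a crude steady-state model of mine (every message retried once per maximal_backoff_time), not something stated in this thread.

```python
def backoff_ratio(minimal_backoff, maximal_backoff, queue_run_delay):
    """maximal_backoff_time / (minimal_backoff_time + queue_run_delay)."""
    return maximal_backoff / (minimal_backoff + queue_run_delay)


def retries_per_scan(deferred, maximal_backoff, queue_run_delay):
    """Crude estimate of messages due for retry at each deferred queue scan.

    Assumes a long outage where every message has reached the maximal
    backoff, so each one comes due once per maximal_backoff seconds.
    """
    return deferred * queue_run_delay / maximal_backoff


# Assumed Postfix 2.4 defaults, in seconds.
print(backoff_ratio(300, 4000, 300))
print(retries_per_scan(200_000, 4000, 300))

# Doubling maximal_backoff_time doubles the ratio and halves the estimate.
print(backoff_ratio(300, 8000, 300))
print(retries_per_scan(200_000, 8000, 300))
```

Under these assumptions, a larger ratio means fewer of the 200,000 deferred messages compete for the active queue at any one scan, which is one way to read the advice about raising it.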

            Postfix will not scale much beyond 1,000,000 deferred messages...

            > Would you consider the facility in FreeBSD 6.x a working kqueue?

            I think so. You need to build a Postfix that has kqueue support, 2.4.5
            in this case.

            > >but don't go crazy, adding more hardware is often a better approach.
            >
            > At this point I'm looking at only 22 separate mailbox servers so I've
            > got a bit of growing room.
            >

            If the 150k messages are split over 22 machines, there is nothing to
            worry about at all. If this is 150k messages each, you can ride that out
            with a bit of tuning... 1,000,000 per box requires design changes...
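
The arithmetic behind those three cases can be sketched as follows. The cut-off values are a rough reading of the figures quoted in this thread (20,000 and 1,000,000), not hard limits, and the wording of the returned labels is only illustrative.

```python
def assess(deferred_per_box):
    """Map an expected deferred-queue size to the rough advice above."""
    if deferred_per_box >= 1_000_000:
        return "design changes"
    if deferred_per_box > 20_000:
        return "tuning"
    return "nothing to worry about"


total = 150_000
machines = 22

# Case 1: the 150k total is spread evenly over the 22 machines.
per_box = total // machines
print(per_box, assess(per_box))

# Case 2: each machine sees 150k on its own.
print(total, assess(total))

# Case 3: a million deferred messages on one box.
print(1_000_000, assess(1_000_000))
```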

            --
            Viktor.
