
What is the right way to configure timeout managers when using the distributor?

  • janovesk
    Message 1 of 9 , Feb 6, 2013
      Hi!

      Could someone give me a clear answer on this?

      We have a service A. It uses sagas and timeouts. It runs on two workers, W1 and W2. A Master Node for A is running on MN, working as a distributor. (MN does not take part as a worker itself.)

      Which machines should be running a timeout manager in this scenario?

      a) ONLY the MN? W1 and W2 will send their timeout-related messages to the MN's .timeouts/.timeoutsdispatcher queues.

      or
      b) W1 and W2 should be running their own timeout managers. MN does not need one.

      or
      c) W1, W2 and MN all need to run the timeout manager.

      Regards,
      Jan Ove Skogheim Olsen
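
      For reference, a rough sketch of how this topology is typically wired up in NServiceBus 3.x: the MN host is started with the `NServiceBus.Master` profile (which runs the distributor) and the worker hosts with the `NServiceBus.Worker` profile, and each worker points at the master machine via a `MasterNodeConfig` section. A minimal worker app.config fragment, assuming the master machine is literally named "MN" (illustrative only; check the exact section and type names against the NServiceBus version in use):

```xml
<configuration>
  <configSections>
    <!-- Registers the NServiceBus master-node config section -->
    <section name="MasterNodeConfig"
             type="NServiceBus.Config.MasterNodeConfig, NServiceBus.Core" />
  </configSections>

  <!-- Tells this worker which machine hosts the master/distributor -->
  <MasterNodeConfig Node="MN" />
</configuration>
```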
    • Andreas Öhlund
      Message 2 of 9 , Feb 7, 2013
        The correct answer is "a". This question actually uncovered a few issues with the way we handle timeouts when scaled out. I'll get back to you soon with more info.

        Cheers,

        Andreas


        --
        http://andreasohlund.net
        http://twitter.com/andreasohlund
      • David Boike
        Message 3 of 9 , Feb 7, 2013

          Related Stack Overflow questions:

          Udi has already commented here...

          http://stackoverflow.com/questions/14718083/what-is-the-correct-way-to-use-the-timeout-manager-with-the-distributor-in-nserv

          Check the comments here...

          http://stackoverflow.com/questions/11017211/what-are-the-differences-in-nservicebus-distributor-and-master-node/11023900

        • Andreas Öhlund
          Message 4 of 9 , Feb 8, 2013
            Hi, here is the issue we found:

            "Calls to bus.Defer and Saga.RequestUtcTimeouts causes ready messages to build up in the distributor"

            https://github.com/NServiceBus/NServiceBus/issues/954

            Does this apply to your situation, Jan Ove?

          • janovesk
            Message 5 of 9 , Feb 8, 2013
              Yes, that matches the symptoms we're seeing. Extra storage entries on the distributor when handling saga timeouts on the workers.

              Jan Ove

            • janovesk
              Message 6 of 9 , Feb 8, 2013
                Well then I think there is a bug somewhere. :-) If you use the default Master and Worker profiles, you will get TMs on W1, W2 and the MN. If you disable the TM on the workers, timeouts no longer work, because the workers send timeout messages to their local .timeouts queues, not the one on the MN.

                If you keep the TMs on the workers running, timeouts will "work", but the worker TMs race each other against the backing store and can pick up the same timeouts concurrently. That leads to deadlocks in Raven, although processing still appears to work. Unfortunately it also leads to an increasing number of entries in the storage queue on the MN.

                Jan Ove
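
                The race described here is generic to any design where several timeout managers poll one shared store without atomically claiming entries. A small illustrative sketch (Python, purely hypothetical, not NServiceBus code) of why a query-then-dispatch pattern double-dispatches while an atomic claim does not:

```python
# Illustrative only: models two timeout managers (TMs) polling one shared
# timeout store. All names are invented; this is not NServiceBus code.

class TimeoutStore:
    def __init__(self, ids):
        self.pending = set(ids)

    def peek_due(self):
        # Non-atomic "query" step: both TMs can see the same timeout.
        return set(self.pending)

    def try_remove(self, timeout_id):
        # Atomic claim: only one caller can succeed per timeout.
        if timeout_id in self.pending:
            self.pending.discard(timeout_id)
            return True
        return False

def race_without_claim(store):
    """Both TMs query before either removes anything: duplicates result."""
    seen_by_tm1 = store.peek_due()
    seen_by_tm2 = store.peek_due()  # interleaved read before any removal
    return list(seen_by_tm1) + list(seen_by_tm2)

def race_with_claim(store):
    """Each TM must atomically claim a timeout before dispatching it."""
    dispatched = []
    for tm in ("TM1", "TM2"):
        for timeout_id in sorted(store.peek_due()):
            if store.try_remove(timeout_id):
                dispatched.append(timeout_id)
    return dispatched

print(len(race_without_claim(TimeoutStore({"t1", "t2"}))))  # 4: each dispatched twice
print(len(race_with_claim(TimeoutStore({"t1", "t2"}))))     # 2: each dispatched once
```

                In the scenario above, each duplicate dispatch would become an extra message flowing through the distributor, consistent with the growing storage-queue entries being reported.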

              • Cris
                Message 7 of 9 , Apr 1 3:19 PM
                  We are in a scenario with 2 workers and 1 Distributor, all running on separate machines. We have implemented the workarounds so nicely documented and explained here (http://stackoverflow.com/questions/14718083/what-is-the-correct-way-to-use-the-timeout-manager-with-the-distributor-in-nserv), but we are still getting extra 'ready' messages building up in the storage queue on the Distributor. As described in that post, it seems like NSB is sending extra 'ready' messages when processing timeouts. The issue on GitHub (https://github.com/NServiceBus/NServiceBus/issues/954) was recently closed, but doesn't appear to have been fixed.

                  Is there another workaround for this?

                • Andreas Öhlund
                  Message 8 of 9 , Apr 1 11:22 PM
                    We couldn't reproduce it in any setup other than running locally in "demo mode". Can you reopen the GitHub issue and add info on how to reproduce it with workers + distributor on separate machines?


                  • Cris
                    Message 9 of 9 , Apr 15 9:51 AM
                      After extensive testing, I have isolated the issue. It doesn't appear to be an issue with the timeout manager, but instead with sending a command from a message handler. I have posted this issue in a separate thread for discussion (http://tech.groups.yahoo.com/group/nservicebus/message/18623).

                      Thanks
