
Synchronisation implementation - thoughts from users please....

  • Kevin Hawkins
    Message 1 of 9 , Jul 29, 2008
      Hi

      This message is long but worth a read as I would like some ideas
      from people, particularly HV users. It relates to how best to initially
      synchronise C-Bus and a controller, usually HomeVision (HV) but possibly
      one or more xAP applications, e.g. HomeSeer, at startup or following a
      power failure. The issue is that there are potentially conflicting
      aspects at work here.

      The problem showed up for one user - you know who ;-) - with a
      significant C-Bus install of just under a hundred C-Bus groups, who
      noticed the gateway hanging at startup and sometimes restarting itself
      whilst running. The problem usually crashed the gateway. A crashed
      gateway just stops dead... it ceases to send heartbeats, the web pages
      don't load, control and status tracking cease and the firmware updater
      can't see the gateway (bad). This didn't show here, partly because I
      have fewer groups and, surprisingly, because running in the development
      environment made it slightly slower than in the real world, which
      helped. In a nutshell it happens because the serial link to HV is
      comparatively slow and there is too much data to be sent, causing
      an overflow or data loss (HV doesn't have any internal command buffering).

      I have fixed the gateway crashing and also revised the synch
      routines, but some decisions need to be made on how best to implement
      the synchronisation, and that affects all users, not just HV ones,
      so I'd like thoughts from people. It is very unlikely any other user
      would experience the crashing bug.

      Also, if anyone does currently have synchronisation issues with C-Bus
      please let me know, as I'm not aware of any and I want to resolve them
      in the next update. One user mentioned this a while back but then
      went quiet. The in-progress code is quite different here and may
      already resolve this.

      Here's a copy of a long email from me..... print it and read it in
      the bath or on holiday ;-) I would appreciate your ideas ASAP though
      as I'm revising this code as you read.....

      --------------------------------------------

      I can see what the issue is - fixing it is more problematic though.

      Background:

      C-Bus level synchs are by default 10 minutes apart. When they do run
      they instantly - well, using 16 fast successive messages - provide the
      level of all lights on the local network. C-Bus state synchs run every
      4 seconds (3 successive messages) and do the same for the
      on/off/error/absent states of every group. Every time they report, they
      are cross-checked against every other place that a state or level is
      held, i.e. in your case the gateway and HomeVision, and reconciled if
      need be. If there are differences actually within C-Bus then the
      arbitration process is highly involved - but most effective. This could
      happen, for example, if a CB network became broken into two and was then
      restored. Additionally, if you have 'virtual' groups you need to be
      careful not to update the states and levels in the gateway with real
      data (which would be reporting as absent). Smarts can ensure that
      traffic to HV is minimised by only transferring changes - but only
      once everything is in a stable (running) condition. Getting to that
      stable synchronised state is the challenge... and needs resolving to
      fix this.
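
      As a minimal sketch of the "only transfer changes" idea (not the
      gateway's actual code): the function name, the dictionaries and the
      group numbers below are all hypothetical, and levels are assumed to be
      plain integers. Virtual groups are skipped so that real data reporting
      them as absent does not overwrite the held values.

```python
from typing import Dict, List, Set, Tuple


def changes_to_send(
    reported: Dict[int, int],
    held: Dict[int, int],
    virtual_groups: Set[int],
) -> List[Tuple[int, int]]:
    """Return (group, level) pairs whose reported level differs from the held copy."""
    updates = []
    for group, level in sorted(reported.items()):
        if group in virtual_groups:
            # Virtual groups report as absent on the real network, so the
            # held value must not be overwritten with real data.
            continue
        if held.get(group) != level:
            updates.append((group, level))
    return updates


# Hypothetical example: group 3 is virtual, group 2 has changed on C-Bus.
held = {1: 255, 2: 0, 3: 128}
reported = {1: 255, 2: 255, 3: 0}
print(changes_to_send(reported, held, virtual_groups={3}))  # [(2, 255)]
```

      Only group 2 is forwarded; group 1 is unchanged and group 3 is
      protected because it is virtual.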

      Issues:

      Here's the current problem if you use HV. Due to the speed of the
      serial port to HV you can't transfer the information quickly enough
      (in real time) at startup when CB reports its states and levels, if
      they are different to the ones that HV already reported at synch. In my
      case, because I am running with debug included and because I have fewer
      C-Bus groups and HV custom lighting entries, it actually works better as
      it's slower. So the options are to buffer more things (which takes
      memory of course) or to delay things, e.g. take some level data
      initially and then await a second level report from C-Bus, or buffer
      the initial level data. This messes up the timeline of quite what is
      happening in relation to other events, as all buffers by their very
      nature are effectively queues and shift time. Achieving real-time
      processing of events is the preferred approach as it avoids timing
      issues, responds quicker and conserves memory.

      The big issue though is 'reaction'. As you change things, other
      things react - e.g. change one CB group and others change too. Both CB
      and HV can react to changes (as can xAP) and they can react differently,
      potentially causing cascading changes and even loops / irreconcilable
      differences.

      Take for example two lights A and B. At gateway startup, HV has A
      reported as OFF and B as ON, whereas on C-Bus both are ON. In HV
      you have an action set up such that either A or B is always on but not
      both. At startup I recover lights A and B from C-Bus but have to
      buffer/delay the changes to HV as it can't take the info quickly enough.
      So I transfer A first. HV has an action that says that if A turns ON
      then B is turned OFF, so it now sends the gateway a command to do this.
      I now have a situation where HV has asked for B to turn OFF but B is
      actually ON as reported by CB, and this ON state update is already
      queued to be transferred to HV as soon as HV indicates it can accept
      it. So do I action HV's request or not? For the time being let's assume
      I do and I turn B off on CB. Now however HV sees that already queued
      B ON update and so it asks for A to turn OFF. So again, assuming I
      action the request, I turn A off on CB and queue the state update, but
      HV now sees the queued B OFF from its first command and so it asks for
      A to turn ON. :-( I have a loop I can't ever reconcile, and yet if the
      states were settled and consistent (i.e. gateway running and one on and
      one off) there is no loop.
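
      The A/B example can be reproduced with a small simulation. This is
      only a sketch under stated assumptions, not the gateway or HV code:
      the function name and data layout are hypothetical, the HV action is
      modelled as "when one light changes, command the other to the opposite
      state", HV is assumed to update its own table as soon as it issues a
      command, and the gateway is assumed to action every command and queue
      the resulting update.

```python
from collections import deque

OTHER = {"A": "B", "B": "A"}


def simulate(hv, cb, max_deliveries=20):
    """Deliver queued C-Bus updates to HV one at a time.

    Returns the number of deliveries needed to drain the queue, or None if
    updates are still queued after max_deliveries (i.e. no settling seen).
    """
    hv, cb = dict(hv), dict(cb)
    # At startup every C-Bus state is queued for the slow serial link.
    queue = deque(sorted(cb.items()))
    for delivered in range(max_deliveries):
        if not queue:
            return delivered
        group, state = queue.popleft()
        if hv[group] == state:
            continue  # HV sees no change, so no action fires
        hv[group] = state
        # HV action: exactly one of A and B may be on.
        other, wanted = OTHER[group], not state
        hv[other] = wanted             # HV updates its own table directly
        cb[other] = wanted             # gateway actions the command on C-Bus
        queue.append((other, wanted))  # ...and queues the resulting update
    return None if queue else max_deliveries


# Conflicting start: HV has A off / B on, C-Bus has both on.
print(simulate({"A": False, "B": True}, {"A": True, "B": True}))   # None
# Settled start: both sides agree that A is on and B is off.
print(simulate({"A": True, "B": False}, {"A": True, "B": False}))  # 2
```

      With the conflicting start the model returns to its initial condition
      every four deliveries, so the queue never drains; with the settled
      start the two queued updates are delivered and nothing reacts.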

      Here are a variety of other similar issues; I'm sure there are more...

      1) At gateway power up (ignoring C-Bus and HV and xAP) what should
      the gateway believe in terms of group state if there are conflicts ?
      Should it enforce the group level as reported by HomeVision , or should
      it enforce the one from C-Bus , or even xAP / other HA software
      applications ?

      2) C-Bus has state recovery after a power failure and so does HV -
      which state in these circumstances is correct ?

      3) CB and HV will actually recover at different times after a power
      restore, as will any xAP devices or HA applications.

      4) I can't tell that C-Bus has initiated a power fail recovery - it
      just reappears with new states and it'll achieve this following a power
      recovery before my gateway is up and running.

      5) I can only tell HV has gone through a power fail recovery if it
      tells me - and if, as is likely, I am also in a power recovery state I
      will miss this as I take longer to recover.

      6) HV has power fail recovery states per light; some may be set and
      some not. These power fail recovery states may conflict with the
      recovery state set on C-Bus.

      7) If power had failed to only one device, e.g. say HomeVision had a
      power restore or perhaps just a schedule reload, then recovery would be
      different compared to if power had failed to all devices. This is
      because in these circumstances CB would definitely be the state to base
      everything on. However if CB failed then HV would hold the correct
      state. If the gateway was just momentarily cycled, e.g. for a firmware
      update, then HV and CB would likely be nearly 'in synch', but if the
      C-Bus PCI was disconnected from the gateway to do some work using C-Bus
      Toolkit and then re-attached, then actually nothing power cycles - but
      upon C-Bus reconnection you have potentially significant state
      differences and no recovery macros being triggered on HV.

      8) Within HV, 'remember last state' is not an option, whereas on C-Bus
      it is an option on a per-group basis (which wears the memory and hence
      is typically not used).

      9) Catch up macros can run on HV that change the state of groups - eg
      if you recover power at 1AM after a power outage of 6 hours then the
      scheduled event at 8PM (turn lights on as dark) will still run
      immediately followed by the 11PM (bedtime) event. These both run
      immediately after power recovery... ie the lights would turn on and then
      immediately turn off again.

      10) Currently I am tending to enforce the C-Bus state back on HV. So
      I startup, read HV's CL state table and then read CB and if things are
      different I change HV to match . However as per the first paragraph this
      has issues if macros are associated with the state changes (reactions).

      11) A C-Bus PAC or touchscreen may be installed and running its own
      recovery macros and logic and again will react to any changes seen on
      C-Bus.

      12) Manual changes may have been made by people, which might be
      preferred to the recovery ones, particularly if say C-Bus didn't have a
      power outage or was changed immediately after power recovery. Power
      comes on and you immediately manually switch on a C-Bus light in the
      room - should the HV recovery macro be able to turn it off?
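
      The catch-up behaviour in item 9 can be illustrated with a short
      sketch. The function, the event names and the dates are hypothetical
      and this is not how HV itself is implemented; it only shows why both
      missed events fire back to back once power returns.

```python
from datetime import datetime


def catch_up(schedule, power_lost, power_restored):
    """Return names of scheduled events missed during the outage, in time order."""
    return [name for when, name in sorted(schedule)
            if power_lost <= when < power_restored]


# Hypothetical schedule: lights on at 8PM, bedtime at 11PM.
schedule = [
    (datetime(2008, 7, 28, 23, 0), "bedtime: lights off"),
    (datetime(2008, 7, 28, 20, 0), "dusk: lights on"),
]
# Six hour outage ending at 1AM.
missed = catch_up(schedule,
                  power_lost=datetime(2008, 7, 28, 19, 0),
                  power_restored=datetime(2008, 7, 29, 1, 0))
print(missed)  # ['dusk: lights on', 'bedtime: lights off']
```

      Both events are replayed immediately at 1AM, so the lights would turn
      on and then straight off again, and each change looks to the gateway
      like a fresh command from HV.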

      How to do this ?

      So I have some decisions to make and I'll have to ignore some
      aspects, or give them a priority.

      Specifically, should I action any changes during synchronisation
      that HV or other controllers ask for? I can't tell if these come from
      a power recovery macro, a catch-up event, a power fail state recovery
      set on a per-light basis or a reflex macro action run as a result of me
      updating a state from C-Bus. C-Bus might have its own controllers, e.g.
      a PAC. Even within just a HV/CB setup this is complex, and when there is
      also xAP involved - which can be asking for things to go to a specific
      state too - it is even more awkward. xAP can itself be linked with
      multiple HA software packages like HomeSeer, Charmed Quark, Xlobby,
      xAP Floorplan etc., which are all essentially independent controllers -
      just like HV - and so can have power-on 'macros' running and also may
      react to things changing state. With PC applications their 'power
      recovery' state may run several seconds/minutes after power is restored,
      as the PC reboots.

      One possibility I considered is to maintain a flag in HV that
      shows that CB and the gateway are in synch ( I think you suggested
      this). This would allow an approach that says until this is set then I
      could ignore all controller commands - but then you would need to write
      a fairly complex script in HV to utilise this, following a power fail ,
      and you won't for example know at this time which catchup events should
      be run. There's still also an even bigger problem. Even though I
      could ignore the state change events from HV during startup - it will
      still internally run them and it updates its internal state table
      directly , regardless of whether the command was actioned .

      An example - immediately after power restoration CB Group 01 is ON
      and in HV it is ON and I talk to HV and establish this so I set a flag
      in my gateway to say 'in synch' and move onto later groups. HV however
      asks for Group 01 to turn OFF and I ignore this command because we're in
      the synching state, but it updates its internal table anyway :-( So
      when I complete my synch at Group 255 actually my 'in synch' flags are
      all invalid still because HV updated itself directly .... maybe
      several times over .
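
      A small sketch of why the per-group 'in synch' flags go stale. All
      names are hypothetical and the model is deliberately crude: the
      gateway checks groups in ascending order, and after a given group has
      been checked HV may change its own table even though the gateway
      ignored the corresponding command.

```python
def first_pass(cb, hv, hv_self_updates):
    """Flag groups as in synch one by one while HV keeps changing its own table.

    hv_self_updates maps "group just checked" -> list of (group, new_state)
    changes that HV applies internally at that point.
    Returns (flagged_groups, stale_groups).
    """
    hv = dict(hv)
    flagged = []
    for group in sorted(cb):
        if hv[group] == cb[group]:
            flagged.append(group)
        # The gateway ignores HV's command during synch, but HV has
        # already written the new state into its own table.
        for changed, state in hv_self_updates.get(group, []):
            hv[changed] = state
    stale = [g for g in flagged if hv[g] != cb[g]]
    return flagged, stale


cb = {1: True, 2: False, 3: True}
hv = {1: True, 2: False, 3: True}
# After group 2 has been checked, HV turns group 1 off internally.
print(first_pass(cb, hv, {2: [(1, False)]}))  # ([1, 2, 3], [1])
```

      Group 1 was flagged as in synch early in the pass, yet by the end HV
      disagrees with C-Bus about it, so the flag can no longer be trusted.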

      I then considered what happens if I now do a second pass of HV's
      state table and then apply those states back on CBus, but then CB may
      'react' ( for example a scene trigger that changes several groups or
      even more complex a PAC) and this in turn may force HV to run yet more
      macros.... I could just let this cascade several times and hope it
      settles....

      Another thought - have a flag in HV that says 'enforce HV' or
      'enforce CB' at startup. But as you 'enforce' one , the other reacts
      causing further changes - it's these reactions that are a problem - and
      can create loops. The catchup events are also problematic.

      All the time anything changes we are adding queued data to the
      serial buffer between the gateway and HV and (aside from the memory
      concern) this means we aren't reacting in real time, as previous/current
      state changes are in the queue but not processed yet. Plus we can easily
      create these loops. Once you get a loop no amount of buffering is
      enough, as it will eventually overflow, and so you either have to
      discard previous data (overwrite it) or discard the new data. Both
      result in states being incorrect.
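
      The two overflow policies can be compared with a minimal bounded
      buffer. This is a sketch only: the class, the policy names and the
      group labels are hypothetical, and the real gateway buffer will differ.

```python
from collections import deque


class BoundedBuffer:
    """Fixed-capacity queue that never grows, whatever the overflow policy."""

    def __init__(self, capacity, policy):
        if policy not in ("drop_oldest", "drop_newest"):
            raise ValueError(policy)
        self.capacity = capacity
        self.policy = policy
        self.items = deque()
        self.dropped = 0

    def put(self, item):
        """Queue an item; return True if the item itself was stored."""
        if len(self.items) >= self.capacity:
            self.dropped += 1
            if self.policy == "drop_newest":
                return False       # discard the new data
            self.items.popleft()   # overwrite: discard the oldest entry
        self.items.append(item)
        return True


def fill(policy, updates, capacity=2):
    """Push all updates through a buffer and report what survived."""
    buf = BoundedBuffer(capacity, policy)
    for update in updates:
        buf.put(update)
    return list(buf.items), buf.dropped


updates = [("G1", True), ("G2", True), ("G3", True)]
print(fill("drop_oldest", updates))  # ([('G2', True), ('G3', True)], 1)
print(fill("drop_newest", updates))  # ([('G1', True), ('G2', True)], 1)
```

      Either way the buffer itself is safe, but one group's update never
      reaches HV: G1 with the overwrite policy, G3 with the discard policy.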

      This is exactly what is happening in your case (that ## Buff XS
      ## message) and in the current implementation I have allowed the buffer
      to overflow which is causing the hang. This is definitely wrong (bad
      bad bad) but my alternatives of overwrite or discard are also fraught
      with issues. It's not easy to fix either due to the independent event
      driven nature of the code.

      Additionally HV is just one of maybe several controllers
      participating in this... although the speed of transfer (Ethernet) to
      most other options is much higher - plus they will typically buffer
      locally if needed, whereas HV has no inward command buffering at all.

      One 'solution' is to implement the buffers in a way that protects
      against overflow (already done) and tries to slow everything down
      enough to ensure that data doesn't need to be overwritten or discarded.
      Then we allow users to choose which state model (C-Bus / HV / etc.) to
      resynch to. Then we just make people aware of the potential
      interactions here and tell them to plan carefully to avoid 'reactions'
      messing things up, and of course to be careful not to create loops, i.e.
      push most of the onus back on the user. This is of course much easier
      for me.

      Thoughts / suggestions please .......

      K
    • Kevin Hawkins
      Message 2 of 9 , Jul 29, 2008
        More coffee and I'm now thinking that disabling the running of
        HomeVision Actions (state change triggered macros) during synch time is
        the right thing to do. This will eliminate a lot of the reflex actions
        that cause the issues as I update HV to match C-Bus. These are the one
        aspect of HV's power recovery mechanism that I can block.

        Running actions based on state changes that arise during initial
        synch is probably a bad idea anyway, as the order and timing of changes
        have been lost. Group 22 might have changed two minutes after group 12,
        but during a synch the changed groups will be presented back to HV from
        0 upwards to 255, i.e. 12 before 22 and with negligible delay between
        them, so the validity of the change is compromised.

        I will then ensure there is a flag/variable in HV that is maintained
        with the gateway status. A power recovery macro can be run when this
        becomes 'synched' to complete any required state adjustments. For other
        controllers I will provide a xAP BSC output that conveys the status.
        What I would strongly suggest is that your scripts in any HA software or
        xAP device honour this flag and avoid reflex actions (commands based
        on state changes) during the synch stage, and have a final 'startup'
        script that runs once synch is achieved. For C-Bus controllers like the
        PAC and touchscreens I will maintain a virtual/phantom C-Bus group to
        reflect the status, allowing the same interlock to be used in your code.
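
        As a rough sketch of how a controller script might honour such a
        flag: the class and method names below are hypothetical, and how
        the status actually arrives (HV variable, xAP BSC message or phantom
        C-Bus group) is left outside the sketch.

```python
class Controller:
    """Suppresses reflex actions until the gateway reports 'synched'."""

    def __init__(self, startup_script):
        self.synched = False
        self.startup_script = startup_script
        self.log = []

    def on_gateway_status(self, synched):
        """Called whenever the gateway's synch status is reported."""
        became_synched = synched and not self.synched
        self.synched = synched
        if became_synched:
            # Run the final 'startup' script once, when synch is achieved.
            self.startup_script(self)

    def on_state_change(self, group, state):
        """Called for every state change the controller sees."""
        if not self.synched:
            return  # no reflex actions during the synch stage
        self.log.append(f"reflex for group {group} -> {state}")


startup_runs = []
controller = Controller(startup_script=lambda c: startup_runs.append("startup"))
controller.on_state_change(12, "on")   # ignored, still synching
controller.on_gateway_status(True)     # startup script runs once
controller.on_state_change(22, "off")  # reflex action allowed now
print(startup_runs, controller.log)
```

        Changes seen during the synch stage trigger nothing, and any state
        adjustments needed afterwards belong in the startup script.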

        This doesn't resolve all issues, especially with dual controllers
        (e.g. PAC or xAP), but it does reduce the problem. I'm still looking
        for any users' thoughts.

        K
      • max barker
        Message 3 of 9 , Jul 29, 2008
          I've had a few crashes of the gateway like you describe. No heartbeats/xAP, no webpage. But I'm using the plain xAP/C-Bus gateway, with no serial port connection to HV. It is a very rare occurrence, perhaps 3-4 times in the last six months. No pattern, no unusual activity that I can see over C-Bus or xAP.

          A BSC flag is a good fit for me as far as synchronisation goes. I use OPN-MAX and Floorplan as my primary controllers, and each one has a limited amount of redundancy built in for the other. It would be trivial for me to include this BSC flag state in any recovery logic.

          Max

          2008/7/29 Kevin Hawkins <yahoogroupskh@...>
          Hi

            This message is long but worth a read as I would like some ideas
          from people, particularly HV users. It relates to how best to initially
          synchronise C-Bus and a controller, usually HomeVision but maybe a xAP
          application(s) eg HomeSeer  at startup or following a power failure.
          The issue is that there are conflicting aspects potentially at work here.

            The problem manifested itself as a user - you know who ;-) with a
          significant C-Bus install of just under a hundred C-Bus groups who
          noticed the gateway hanging at startup and sometimes restarting itself
          whilst running. The problem usually crashed the gateway.  A crashed
          gateway just stops dead... it ceases to send heartbeats,  the web pages
          don't load , control and status tracking ceases and the firmware updater
          can't see the gateway (bad).   This didn't show here , partly because I
          have less groups and surprisingly because by running in the development
          environment it was running slightly slower than in the real world which
          helped. In a nutshell it happens because the serial link is
          comparatively slow to HV and there is too much data to be sent causing
          an overflow or data loss (HV doesn't have any internal command buffering).

            I have fixed the gateway crashing and also revised the synch
          routines but some decisions need to be made on how best to implement the
          synchronisation and here and that effects all users , not just HV ones,
          so I'd like thoughts from people.  It is very unlikley any other user
          would experience the crashing bug.

            Also if anyone does have synchronisation issues with C-Bus currently
          please let me know as I'm not aware of them and I want to resolve those
          in with the next update.  One user mentioned this a while back but then
          went quiet.  The in progress code is quite different here and may
          already resolve this.

            Here's a copy of a long email from me.....  print it and read it in
          the bath or on holiday ;-)  I would appreciate your ideas ASAP though
          as I'm revising this code as you read.....

          --------------------------------------------

          I can see what the issue is - fixing it is more problematic though.

          Background:

           C-Bus Level synchs by default are 10 minutes apart. When they do run
          they instantly - well using 16 fast successive messages - provide the
          level of all lights on the local network.  C-Bus state synchs run every
          4 seconds ( 3 successive messages) and do the same for the
          on/off/error/absent states of every group.  Every time they report they
          are cross checked with every other place that a state or level is held,
          ie in your case the gateway and HomeVision and reconcilled if needs
          be.   If there are differences actually within C-Bus then the
          arbitration process is highly involved - but most effective.  This could
          happen for example if a CB network became broken into two and then
          restored.   Additionally if you have 'virtual' groups you need to be
          careful to not update the states and levels in the gateway with real
          data (which would be reporting as absent) .  Smarts can ensure that
          traffic to HV is minimised by only transferring any changes - but only
          once everything is in a stable (running) condition.  Getting to that
          stable synchronised state is the challenge.... and needs resolving to
          fix this.

          Issues:

           Here's the current  problem if you use HV.  Due to the speed of the
          serial port to HV you can't transfer the information quick enough
          (realtime)  at startup when CB reports it's states and levels if they
          are different to the ones that HV already reported at synch.  In my case
          because I am running with debug included and because I have less C-Bus
          groups and HV custom lighting entries  it actually works better as it's
          slower. So the options are to buffer more things (which takes memory of
          course) or to delay things eg take some level data initially and then
          await a second level report from C-Bus or implement and buffer the
          initial level data . This messes up the timeline of quite what is
          happening in relation to other events as all buffers by their very
          nature are effectively queues and shift time.  Achieving a realtime
          processing of events is the preferred approach as it avoids timing
          issues, responds quicker  and conserves memory.

           The big issue though is 'reaction'. As you change things then other
          things react - eg change one CB group and others change too. Both CB and
          HV can react to changes (as can xAP) and they can react differently
          causing potentially cascading changes and even loops / non reconcillable
          differences.

           Take for example two lights A and B. At gateway startup in HV you
          have A reported as OFF and B is ON  whereas on C-Bus both are ON.  In HV
          you have an action setup such that either A or B is always on but not
          both.    At startup I recover light A and B from C-Bus but have to
          buffer/delay the changes to HV as it can't take the info quick enough.
          So I transfer A  first .   HV has an action that says that if A turns ON
          then B is OFF .  So it now sends the gateway  a command to do this.  I
          now have a situation where HV has asked for B to turn OFF but B is
          actually ON as reported by CB, and this ON state update is already
          queued to be transfered to HV as soon as HV indicates  it can accept it
          .  So do I action HV's request or not ?  FTTB lets assume I do and I
          turn B off on CB.  Now however HV sees that already queued  B ON update
          and so it asks for A to turn OFF.  So again assuming I action the
          request I turn A off on CB and queue the state update but HV now sees
          the queued B off from its first command and so it asks for A to turn
          ON.  :-(  I have a loop I can't ever reconcile, and yet if the states
          were settled and consistent (ie gateway running and one on and one off )
          there is no loop .

           Here are a variety of other similar  issues, I'm sure there's more...

           1) At gateway power up (ignoring C-Bus and HV and xAP) what should
          the gateway believe in terms of group state if there are conflicts ?
          Should it enforce the group level as reported by HomeVision , or should
          it enforce the one from C-Bus , or even xAP / other HA software
          applications ?

           2) C-Bus has state recovery after a power failure and so does HV -
          which state in these circumstances is correct ?

           3) CB and HV will actually recover at different times after a power
          restore. as will any xAP devices or HA applications.

           4) I can't tell that C-Bus has initiated a power fail recovery - it
          just reappears with new states and it'll achieve this following a  power
          recovery before my gateway is up and running.

           5) I can only tell HV has gone through a power fail recovery if it
          tells me - and if, as is likley I am also in a power recovery state I
          will miss this as I take longer to recover.

           6) HV has power fail recovery states per light, some may be set and
          some not. These power fail recovery states may conflict with the
          recovery state set on C-Bus

           7) If power had failed to only one device eg say HomeVision had a
          power restore or perhaps just a  schedule reload then recovery would be
          different compared to if power had failed to all devices . This is
          because in tehsae circumstances CB would definitely be the state to base
          everything on. However if CB failed then HV would hold the correct
          state. If the gateway was just momentarily cycled - eg a  firmware
          update then HV and CB would likely be nearly 'in synch' but if the C-Bus
          PCI was disconnected from the gateway to do some work using C-Bus
          toolkit and then re-attached then actually nothing power cycles - but
          upon C-Bus reconnection you have potentially significant state
          differences and no recovery macros being triggered on HV.

           8) Within HV remember 'last state' is not an option and on C-Bus it
          is an option  on a per group basis  (which is memory wearing and hence
          typically not used)

           9) Catch up macros can run on HV that change the state of groups - eg
          if you recover power at 1AM after a power outage of 6 hours then the
          scheduled event at 8PM (turn lights on as dark) will still run
          immediately followed by the 11PM (bedtime) event.  These both run
          immediately after power recovery... ie the lights would turn on and then
          immediately turn off again.

           10) Currently I am tending to enforce the C-Bus state back on HV. So
          I startup, read HV's CL state table and then read CB and if things are
          different I change HV to match . However as per the first paragraph this
          has issues if macros are associated with the state changes (reactions).

           11) A C-Bus PAC or touchscreen may be installed and  running its own
          recovery macros and logic and again will react to any changes seen on
          C-Bus.

            12) Manual changes may have been made by people which might be
          preferred to the recovery ones. Particularly if say C-Bus didnt have a
          power outage or ws changed immediately after power recovery.  Power
          comes on and you immediately manually switch on a C-Bus light in the
          room - should the HV recovery macro be able to turn it off ?

          How to do this ?

               So I have some decisions to make and I'll have to ignore some
          aspects, or give them a priority.

               Specifically should I action any changes during synchronisation
          that HV or other controllers asks for ? I can't tell if these come from
          a power recovery macro, a catch up event, a power fail state recovery
          set on a per light basis or a reflex macro action run as a result of me
          updating a state from C-Bus .    C-Bus might have its own controllers eg
          a PAC.  Even within just a HV/CB setup this is complex and when there is
          also xAP involved - which can be asking for things to go to a specific
          state too - it is even more awkward.  xAP can itself be linked with
          multiple HA automation software like HomeSeer, Charmed Quark, Xlobby,
          xAP Floorplan etc which are all essentially independent controllers -
          just like HV and so can have power on 'macros' running and also may
          react to things changing state.  With PC applications their 'power
          recovery' state may run several seconds/minutes after power is restored
          as the PC reboots.

              One possibility I considered is to maintain a flag in HV that
          shows that CB and the gateway are in synch ( I think you suggested
          this). This would allow an approach that says until this is set then I
          could ignore all controller commands - but then you would need to write
          a fairly complex script in HV to utilise this, following a power fail ,
          and you won't for example know at this time which catchup events should
          be run.   There's still also an even bigger problem.  Even though I
          could ignore the state change events from HV during startup - it will
          still internally run them and it updates its internal state table
          directly , regardless of whether the command was actioned .

            An example - immediately after power restoration CB Group 01 is ON
          and in HV it is ON and I talk to HV and establish this so I set a flag
          in my gateway to say 'in synch'  and move onto later groups.  HV however
          asks for Group 01 to turn OFF and I ignore this command because we're in
          the synching state, but it updates its internal table anyway :-(      So
          when I complete my synch at Group 255 actually my 'in synch' flags are
          all invalid still because HV updated itself directly ....    maybe
          several times over  .

               I then considered what happens if I now do a second pass of HV's
          state table and then apply those states back on CBus, but then CB may
          'react'  ( for example a scene trigger that changes several groups or
          even more complex a PAC) and this in turn may force HV to run yet more
          macros....   I could just  let this cascade several times and hope it
          settles....

               Another thought - have a flag in HV that says 'enforce HV' or
          'enforce CB' at startup. But as you 'enforce' one , the other reacts
          causing further changes - it's these reactions that are a problem - and
          can create loops.   The catchup events are also problematic.

                 All the time anything changes we are adding queued data to the
          serial buffer between the gateway and HV and (aside from the memory
          concern) this means we aren't reacting in realtime as previous/current
          state changes are in the queue but not processed yet. Plus we can easily
          create these loops.  Once you get a loop no amount of buffering is
          enough as it will eventually overflow and so you either have to
          discard previous data (overwrite it) or discard the new data.  Both
          result in states being incorrect.
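The two overflow choices can be sketched as a small fixed-size ring buffer in C. This is a hypothetical illustration, not the gateway's buffer: `update_queue_t`, `QUEUE_CAPACITY` and the policy names are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define QUEUE_CAPACITY 4   /* deliberately tiny so overflow is easy to see */

typedef enum { DROP_NEWEST, OVERWRITE_OLDEST } overflow_policy_t;

typedef struct {
    uint8_t group;
    uint8_t level;
} update_t;

typedef struct {
    update_t items[QUEUE_CAPACITY];
    size_t head;               /* index of the oldest queued update */
    size_t count;              /* number of updates currently queued */
    unsigned lost;             /* updates lost to overflow, either way */
    overflow_policy_t policy;
} update_queue_t;

void queue_init(update_queue_t *q, overflow_policy_t policy) {
    assert(q != NULL);
    q->head = 0;
    q->count = 0;
    q->lost = 0;
    q->policy = policy;
}

/* Queue an update. Never writes outside items[]. Returns false only when
 * the new update itself was discarded. */
bool queue_push(update_queue_t *q, update_t u) {
    if (q->count == QUEUE_CAPACITY) {
        q->lost++;
        if (q->policy == DROP_NEWEST) {
            return false;                          /* discard the new data */
        }
        q->head = (q->head + 1) % QUEUE_CAPACITY;  /* overwrite the oldest */
        q->count--;
    }
    q->items[(q->head + q->count) % QUEUE_CAPACITY] = u;
    q->count++;
    return true;
}

/* Take the oldest update, e.g. when HV signals it can accept one. */
bool queue_pop(update_queue_t *q, update_t *out) {
    if (q->count == 0) {
        return false;
    }
    *out = q->items[q->head];
    q->head = (q->head + 1) % QUEUE_CAPACITY;
    q->count--;
    return true;
}
```

Either policy keeps the buffer from overflowing, but as noted above both lose a state change. The `lost` counter at least lets the gateway know a resynch is needed afterwards.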

               This is exactly what is happening in your case (that ## Buff XS
          ## message) and in the current implementation I have allowed the buffer
          to overflow  which is causing the hang. This is definitely wrong (bad
          bad bad) but my alternatives of overwrite or discard are also fraught
          with issues. It's not easy to fix either due to the independent event
          driven nature of the code.

               Additionally HV is just one of maybe several controllers
          participating in this... although the speed of transfer (Ethernet) to
          most other options is much higher - plus they will typically buffer
          locally if needed whereas HV has no inward command buffering at all.

               One 'solution' is to implement the buffers in a way that protects
          against overflow (already done) and tries to slow everything down
          enough to ensure that data doesn't need to be overwritten or discarded.
          Then we allow users to choose which state model (C-Bus / HV / etc) to
          resynch to.  Then just make people aware of the potential
          interactions here and tell them to plan carefully to avoid 'reactions'
          messing things up and of course to be careful not to create loops,  ie
          push most of the onus back on the user.  This is of course much easier
          for me.

             Thoughts / suggestions please .......

                 K









        • Kevin Hawkins
          Message 4 of 9 , Jul 29, 2008
            A crash is something I never see here but that's always the way,
            I've never had one reported by a user either until now. This one to do
            with the startup synch is unusual because there is a large number of
            C-Bus groups. Do you have a lot, Max?

            More often than not, and coming from a very forgiving VB6
            background, my programming oversights are to do with string handling in C
            where something has overflowed its allocated size. C offers no
            protection against this so it's just waiting to catch you out. In fact
            it really doesn't have much string handling functionality at all; instead
            you use character arrays. You achieve just the same, just in different
            ways. I really miss Mid$ and friends though.
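For anyone in the same position, a Mid$-like helper that cannot overflow its destination is short to write. The C sketch below is a hypothetical example, not code from the gateway: `str_mid` is an invented name, it uses a 1-based start position as Mid$ does, and it returns an empty string for an out-of-range start (including 0, where VB6 would raise an error instead).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy up to len characters of src, starting at 1-based position start,
 * into dst (which holds dst_size bytes). The result is always
 * NUL-terminated and never exceeds dst_size. Returns the number of
 * characters copied. */
size_t str_mid(char *dst, size_t dst_size, const char *src,
               size_t start, size_t len) {
    assert(dst != NULL && src != NULL);
    if (dst_size == 0) {
        return 0;                        /* nowhere to put even the NUL */
    }
    size_t src_len = strlen(src);
    size_t n = 0;
    if (start >= 1 && start <= src_len) {
        n = src_len - (start - 1);       /* characters available from start */
        if (n > len) {
            n = len;                     /* caller asked for fewer */
        }
        if (n > dst_size - 1) {
            n = dst_size - 1;            /* truncate to fit, leave room for NUL */
        }
        memcpy(dst, src + (start - 1), n);
    }
    dst[n] = '\0';
    return n;
}
```

The point of the design is that the destination size is always passed in, so a parameter or address that is too long gets truncated instead of running past the end of the array.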

            A hang may conceivably be caused by a xAP message that is badly
            formed - or parameters/addresses that are too long - but slowly I've been
            adding much better validation against this in the code. If you can find
            any circumstances that do cause the gateway an issue then please let me
            know. It may actually be that the next release fixes it as I have
            spotted a couple of places where a potential overflow might happen.
            I also corrected an issue in the wildcard addressing that might cause a
            problem.

            I too use OPNMax - great bit of kit as an embedded xAP controller
            with inbuilt scripting, logic engine, scheduler , database, graphing,
            webserver with PHP - all for around £60 - amazing value...

            K





            max barker wrote:
            > I've had a few crashes of the gateway like you describe. No
            > heartbeats/xAP, no webpage. But I'm using the plain xAP/C-Bus gateway,
            > no serial port connection to HV. Very rare occurrence, perhaps 3-4 times
            > in the last six months. No pattern, no unusual activity that I can see
            > over C-Bus or xAP.
            >
            > A BSC flag is a good fit for me as far as synchronisation goes. I use
            > OPN-MAX and Floorplan as my primary controllers, each one has a
            > limited amount of redundancy built in for the other. It would be
            > trivial for me to include this BSC flag state in any recovery logic.
            >
            > Max
          • Paul Gale
            Message 5 of 9 , Jul 30, 2008
              Yes, I know - I keep on causing problems! (At least I managed to successfully replace the max232 chip I blew up after accidentally plugging in the CB network to the CB 232 interface on the gateway!!!)

              My thoughts on the issue - bear in mind that your email was HUUUUGE, and I'm writing these as I read through it, so I may have some concepts wrong or just be writing total gibberish!...


              I don't think I mind if HV is "frozen" or doesn't work as expected when initially synchronising between HV and the gateway - I can live with that. Even if it takes several minutes for it to fully synch. What would be a problem is if it became out of sync when I wasn't expecting it or had no knowledge of it (especially if I'm controlling roof windows and blinds etc).

              So just to clarify - is the main problem only when the initial sync is happening? Does this ONLY happen on a power cycle of the gateway/power restore etc? What's the longest time that this sync would take? I don't think it's a problem to wait a set amount of time and then perform any actions needed after a sync, as long as there's a way for HV to know this.

              Another thought - is there a way that the gateway can do all of the synching it needs and only then, send this data to HV, where HV can then act upon this as needed?
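One way to picture that suggestion is a "latest value per group" table instead of a queue. The C sketch below is an assumption about how it could look, not how the gateway works: `pending_table_t` and the `pending_*` functions are hypothetical names. Changes seen during synch overwrite each other per group, and nothing is released towards HV until the table is marked ready.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NUM_GROUPS 256   /* assumption: one slot per possible C-Bus group */

typedef struct {
    uint8_t level[NUM_GROUPS];  /* latest level seen for each group */
    bool dirty[NUM_GROUPS];     /* true if HV still has to be told */
    bool ready;                 /* set once the gateway has finished synching */
} pending_table_t;

void pending_init(pending_table_t *t) {
    assert(t != NULL);
    memset(t, 0, sizeof *t);    /* all levels 0, nothing dirty, not ready */
}

/* Record a change. A later change to the same group replaces the earlier
 * one, so memory use is fixed however many changes arrive. */
void pending_record(pending_table_t *t, uint8_t group, uint8_t level) {
    t->level[group] = level;
    t->dirty[group] = true;
}

void pending_set_ready(pending_table_t *t) {
    t->ready = true;
}

/* Fetch the next update to send to HV, lowest group first. Returns false if
 * synching has not finished or there is nothing left to send. */
bool pending_next(pending_table_t *t, uint8_t *group, uint8_t *level) {
    if (!t->ready) {
        return false;
    }
    for (int g = 0; g < NUM_GROUPS; g++) {
        if (t->dirty[g]) {
            t->dirty[g] = false;
            *group = (uint8_t)g;
            *level = t->level[g];
            return true;
        }
    }
    return false;
}
```

The trade-off is that HV only ever sees the final level for each group, so any macro that depends on intermediate changes, or on the order in which they happened, would not run as it would in realtime.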

              Re your questions:

              1) Surely CB is king here? Maybe virtual groups are the only exception though??? I would think that CB should be king and as long as the other apps know that a restart/sync has just happened, should then request any changes as necessary. Maybe more logic in the other apps is needed, but could be a real mess otherwise?

              2) Personally I don't use state recovery in CB - all are set to off. Does it matter if the user/programmer is aware of this and the app (or HV) knows that a power recovery has happened and can then set states/levels as appropriate? Again more programming but better maybe? Does HV actually recover light states? I didn't think it did. Yes, it can set levels and states in the power failure recovery section. These should only be performed after the gateway is running and fully synched - would need a way to be able to test for this in HV or a flag/variable etc as you mentioned. Probably also with clear user guidance on the issue?

              3) also, some apps may never go down if they're connected to a UPS etc. My xAP apps running on my HA Server PC will run for a good 45 mins or so, before the PC will shut down during a power cut etc. As above, the gateway should be fully up and synched before HV should take any action.

              4) does this matter - won't the gateway fetch these states anyway?

              5) Can there not be some actions that the HV power recovery section HAS to have as part of the initial programming/setup of the gateway - setting variables/flags etc to show when the gateway is running properly? Or variables/flags that must be set up as part of the gateway installation?

              6) Ah ok - that answers Q2 a bit more - ummm, is there a way to get HV to hold off doing anything until the gateway is ready? I guess that would require a PROM change? Might be a good option to have in there though?

              7) Is this actually true? If you have HV doing a lot of state/level changes, yes this is the case but if HV only reacts to CB levels/states and does not actually change anything, then CB is still king. OK, maybe not likely as HV is very powerful when coupled with CB and most users will probably soon get HV changing states - but it's not a foregone conclusion.

              9) OK - lights would turn on/off very quickly, but is there a case where this is a problem?

              11) doesn't affect me as I don't have one ;)

              12) This should be dealt with by user training/documentation - It pays to think through the consequences when writing recovery macros etc - do you really want to perform that action? Is this scenario really a problem anyway? Would most people put up with some "odd" behaviour right after a power cut anyway?

              Other thoughts:

              "Specifically should I action any changes during synchronisation that HV or other controllers asks for ? I can't tell if these come from a power recovery macro, a catch up event, a power fail state recovery set on a per light basis or a reflex macro action run as a result of me updating a state from C-Bus"

              Could these be fixed by writing a macro in HV that checks for the readiness of the gateway - and using that macro in any power fail / catch-up etc event - only performing the actions when the gateway is ready? Maybe the gateway ships with a number of recommended macros etc and suggestions on when to use them - in fact isn't there a way to hard code macros in HV - didn't you use that for another product of yours?

              "There's still also an even bigger problem. Even though I could ignore the state change events from HV during startup - it will still internally run them and it updates its internal state table directly , regardless of whether the command was actioned"

              Maybe a request to the way HV deals with these to add the option to check that flag etc and only perform them when it's set? Again a PROM change though?

              Last thought - Is there a way with additional hardware in the gateway to overcome the buffering issue?



              Paul.



              > -----Original Message-----
              > From: ukusa_gateway@yahoogroups.com
              > [mailto:ukusa_gateway@yahoogroups.com] On Behalf Of Kevin Hawkins
              > Sent: 29 July 2008 15:54
              > To: ukusa_gateway@yahoogroups.com
              > Subject: [ukusa_gateway] Synchronisation implementation - thoughts from
              > users please....
              >
              > Hi
              >
              > This message is long but worth a read as I would like some ideas
              > from people, particularly HV users. It relates to how best to initially
              > synchronise C-Bus and a controller, usually HomeVision but maybe a xAP
              > application(s) eg HomeSeer at startup or following a power failure.
              > The issue is that there are conflicting aspects potentially at work
              > here.
              >
              > The problem manifested itself as a user - you know who ;-) with a
              > significant C-Bus install of just under a hundred C-Bus groups who
              > noticed the gateway hanging at startup and sometimes restarting itself
              > whilst running. The problem usually crashed the gateway. A crashed
              > gateway just stops dead... it ceases to send heartbeats, the web pages
              > don't load , control and status tracking ceases and the firmware
              > updater
              > can't see the gateway (bad). This didn't show here , partly because I
              > have less groups and surprisingly because by running in the development
              > environment it was running slightly slower than in the real world which
              > helped. In a nutshell it happens because the serial link is
              > comparatively slow to HV and there is too much data to be sent causing
              > an overflow or data loss (HV doesn't have any internal command
              > buffering).
              >
              > I have fixed the gateway crashing and also revised the synch
              > routines but some decisions need to be made on how best to implement
              > the
              > synchronisation and here and that effects all users , not just HV ones,
              > so I'd like thoughts from people. It is very unlikley any other user
              > would experience the crashing bug.
              >
              > Also if anyone does have synchronisation issues with C-Bus currently
              > please let me know as I'm not aware of them and I want to resolve those
              > in with the next update. One user mentioned this a while back but then
              > went quiet. The in progress code is quite different here and may
              > already resolve this.
              >
              > Here's a copy of a long email from me..... print it and read it in
              > the bath or on holiday ;-) I would appreciate your ideas ASAP though
              > as I'm revising this code as you read.....
              >
              > --------------------------------------------
              >
              > I can see what the issue is - fixing it is more problematic though.
              >
              > Background:
              >
              > C-Bus level synchs by default are 10 minutes apart. When they do run
              > they instantly - well, using 16 fast successive messages - provide the
              > level of all lights on the local network. C-Bus state synchs run every
              > 4 seconds (3 successive messages) and do the same for the
              > on/off/error/absent states of every group. Every time they report, they
              > are cross-checked with every other place that a state or level is held,
              > ie in your case the gateway and HomeVision, and reconciled if need
              > be. If there are differences actually within C-Bus then the
              > arbitration process is highly involved - but most effective. This
              > could happen, for example, if a CB network became broken into two and
              > was then restored. Additionally, if you have 'virtual' groups you need
              > to be careful not to update the states and levels in the gateway with
              > real data (which would be reporting as absent). Smarts can ensure that
              > traffic to HV is minimised by only transferring changes - but only
              > once everything is in a stable (running) condition. Getting to that
              > stable synchronised state is the challenge.... and needs resolving to
              > fix this.
              >
              > Issues:
              >
              > Here's the current problem if you use HV. Due to the speed of the
              > serial port to HV you can't transfer the information quickly enough
              > (realtime) at startup when CB reports its states and levels, if they
              > are different to the ones that HV already reported at synch. In my
              > case, because I am running with debug included and because I have fewer
              > C-Bus groups and HV custom lighting entries, it actually works better as
              > it's slower. So the options are to buffer more things (which takes
              > memory of course) or to delay things, eg take some level data initially
              > and then await a second level report from C-Bus, or implement and buffer
              > the initial level data. This messes up the timeline of quite what is
              > happening in relation to other events, as all buffers by their very
              > nature are effectively queues and shift time. Achieving realtime
              > processing of events is the preferred approach as it avoids timing
              > issues, responds quicker and conserves memory.
              >
              > The big issue though is 'reaction'. As you change things then other
              > things react - eg change one CB group and others change too. Both CB
              > and HV can react to changes (as can xAP) and they can react differently,
              > causing potentially cascading changes and even loops / non-reconcilable
              > differences.
              >
              > Take for example two lights A and B. At gateway startup, in HV you
              > have A reported as OFF and B as ON, whereas on C-Bus both are ON. In
              > HV you have an action set up such that either A or B is always on but
              > not both. At startup I recover lights A and B from C-Bus but have to
              > buffer/delay the changes to HV as it can't take the info quickly enough.
              > So I transfer A first. HV has an action that says that if A turns ON
              > then B is OFF, so it now sends the gateway a command to do this. I
              > now have a situation where HV has asked for B to turn OFF but B is
              > actually ON as reported by CB, and this ON state update is already
              > queued to be transferred to HV as soon as HV indicates it can accept
              > it. So do I action HV's request or not? For the time being let's assume
              > I do, and I turn B off on CB. Now however HV sees that already queued
              > B ON update and so it asks for A to turn OFF. So, again assuming I
              > action the request, I turn A off on CB and queue the state update, but
              > HV now sees the queued B OFF from its first command and so it asks for
              > A to turn ON. :-( I have a loop I can't ever reconcile, and yet if the
              > states were settled and consistent (ie gateway running and one on and
              > one off) there is no loop.
              >
              > Here is a variety of other similar issues; I'm sure there are more...
              >
              > 1) At gateway power up (ignoring C-Bus and HV and xAP) what should
              > the gateway believe in terms of group state if there are conflicts?
              > Should it enforce the group level as reported by HomeVision, or should
              > it enforce the one from C-Bus, or even xAP / other HA software
              > applications?
              >
              > 2) C-Bus has state recovery after a power failure and so does HV -
              > which state in these circumstances is correct?
              >
              > 3) CB and HV will actually recover at different times after a power
              > restore, as will any xAP devices or HA applications.
              >
              > 4) I can't tell that C-Bus has initiated a power fail recovery - it
              > just reappears with new states, and it'll achieve this following a
              > power recovery before my gateway is up and running.
              >
              > 5) I can only tell HV has gone through a power fail recovery if it
              > tells me - and if, as is likely, I am also in a power recovery state I
              > will miss this as I take longer to recover.
              >
              > 6) HV has power fail recovery states per light; some may be set and
              > some not. These power fail recovery states may conflict with the
              > recovery state set on C-Bus.
              >
              > 7) If power had failed to only one device, eg say HomeVision had a
              > power restore or perhaps just a schedule reload, then recovery would be
              > different compared to if power had failed to all devices. This is
              > because in these circumstances CB would definitely be the state to
              > base everything on. However if CB failed then HV would hold the correct
              > state. If the gateway was just momentarily cycled, eg a firmware
              > update, then HV and CB would likely be nearly 'in synch', but if the
              > C-Bus PCI was disconnected from the gateway to do some work using C-Bus
              > Toolkit and then re-attached then actually nothing power cycles - but
              > upon C-Bus reconnection you have potentially significant state
              > differences and no recovery macros being triggered on HV.
              >
              > 8) Within HV remember 'last state' is not an option and on C-Bus it
              > is an option on a per group basis (which is memory wearing and hence
              > typically not used).
              >
              > 9) Catch up macros can run on HV that change the state of groups - eg
              > if you recover power at 1AM after a power outage of 6 hours then the
              > scheduled event at 8PM (turn lights on as dark) will still run,
              > immediately followed by the 11PM (bedtime) event. These both run
              > immediately after power recovery... ie the lights would turn on and
              > then immediately turn off again.
              >
              > 10) Currently I am tending to enforce the C-Bus state back on HV. So
              > I start up, read HV's CL state table and then read CB, and if things
              > are different I change HV to match. However as per the first paragraph
              > this has issues if macros are associated with the state changes
              > (reactions).
              >
              > 11) A C-Bus PAC or touchscreen may be installed and running its own
              > recovery macros and logic and again will react to any changes seen on
              > C-Bus.
              >
              > 12) Manual changes may have been made by people which might be
              > preferred to the recovery ones. Particularly if say C-Bus didn't have a
              > power outage or was changed immediately after power recovery. Power
              > comes on and you immediately manually switch on a C-Bus light in the
              > room - should the HV recovery macro be able to turn it off?
              >
              > How to do this ?
              >
              > So I have some decisions to make and I'll have to ignore some
              > aspects, or give them a priority.
              >
              > Specifically, should I action any changes during synchronisation
              > that HV or other controllers ask for? I can't tell if these come from
              > a power recovery macro, a catch up event, a power fail state recovery
              > set on a per light basis or a reflex macro action run as a result of me
              > updating a state from C-Bus. C-Bus might have its own controllers,
              > eg a PAC. Even within just a HV/CB setup this is complex, and when
              > there is also xAP involved - which can be asking for things to go to a
              > specific state too - it is even more awkward. xAP can itself be linked
              > with multiple home automation applications like HomeSeer, Charmed
              > Quark, Xlobby, xAP Floorplan etc, which are all essentially independent
              > controllers - just like HV - and so can have power on 'macros' running
              > and also may react to things changing state. With PC applications their
              > 'power recovery' state may run several seconds/minutes after power is
              > restored as the PC reboots.
              >
              > One possibility I considered is to maintain a flag in HV that
              > shows that CB and the gateway are in synch (I think you suggested
              > this). This would allow an approach that says until this is set I
              > could ignore all controller commands - but then you would need to write
              > a fairly complex script in HV to utilise this following a power fail,
              > and you won't, for example, know at this time which catchup events
              > should be run. There's also an even bigger problem. Even though I
              > could ignore the state change events from HV during startup, it will
              > still internally run them and it updates its internal state table
              > directly, regardless of whether the command was actioned.
              >
              > An example - immediately after power restoration CB Group 01 is ON
              > and in HV it is ON, and I talk to HV and establish this, so I set a
              > flag in my gateway to say 'in synch' and move on to later groups. HV
              > however asks for Group 01 to turn OFF and I ignore this command because
              > we're in the synching state, but it updates its internal table anyway
              > :-( So when I complete my synch at Group 255 my 'in synch' flags are
              > actually all still invalid because HV updated itself directly ....
              > maybe several times over.
              >
              > I then considered what happens if I now do a second pass of HV's
              > state table and then apply those states back on C-Bus, but then CB may
              > 'react' (for example a scene trigger that changes several groups or,
              > even more complex, a PAC) and this in turn may force HV to run yet more
              > macros.... I could just let this cascade several times and hope it
              > settles....
              >
              > Another thought - have a flag in HV that says 'enforce HV' or
              > 'enforce CB' at startup. But as you 'enforce' one, the other reacts,
              > causing further changes - it's these reactions that are a problem - and
              > can create loops. The catchup events are also problematic.
              >
              > All the time anything changes we are adding queued data to the
              > serial buffer between the gateway and HV and (aside from the memory
              > concern) this means we aren't reacting in realtime, as previous/current
              > state changes are in the queue but not processed yet. Plus we can
              > easily create these loops. Once you get a loop no amount of buffering
              > is enough as it will eventually overflow, and so you either have to
              > discard previous data (overwrite it) or discard the new data. Both
              > result in states being incorrect.
              >
              > This is exactly what is happening in your case (that ## Buff XS
              > ## message) and in the current implementation I have allowed the buffer
              > to overflow which is causing the hang. This is definitely wrong (bad
              > bad bad) but my alternatives of overwrite or discard are also fraught
              > with issues. It's not easy to fix either due to the independent event
              > driven nature of the code.
              >
              > Additionally HV is just one of maybe several controllers
              > participating in this... although the speed of transfer (Ethernet) to
              > most other options is much higher - plus they will typically buffer
              > locally if needed whereas HV has no inward command buffering at all.
              >
              > One 'solution' is to implement the buffers in a way that protects
              > against overflow (already done) and tries to slow everything down
              > enough to ensure that data doesn't need to be overwritten or discarded.
              > Then we allow users to choose which state model (C-Bus / HV / etc) to
              > resynch to, and just make people aware of the potential
              > interactions here and tell them to plan carefully to avoid 'reactions'
              > messing things up and of course to be careful not to create loops, ie
              > push most of the onus back on the user. This is of course much easier
              > for me.
              >
              > Thoughts / suggestions please .......
              >
              > K
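
The two-light interlock example in the quoted message can be replayed with a small Python sketch. This is only a toy model built on stated assumptions, not the gateway's code: HV fires its interlock action only when an incoming update changes its own table, HV writes the commanded state straight into that table, and the gateway always actions HV's command on C-Bus and queues the resulting update. The name `simulate` and its arguments are hypothetical.

```python
from collections import deque


def simulate(hv, cbus, queue, max_steps=50):
    """Replay queued state updates into a toy HV with an A/B interlock.

    Returns ("loop", steps) if the whole system returns to a state it has
    already been in, ("settled", steps) if the queue drains, and
    ("unsettled", steps) if max_steps is reached first.
    """
    hv, cbus, queue = dict(hv), dict(cbus), deque(queue)
    seen = set()
    steps = 0
    while queue and steps < max_steps:
        snapshot = (tuple(sorted(hv.items())),
                    tuple(sorted(cbus.items())),
                    tuple(queue))
        if snapshot in seen:
            return "loop", steps
        seen.add(snapshot)

        light, state = queue.popleft()      # next update HV can accept
        steps += 1
        if hv[light] == state:
            continue                        # no change in HV, no action fires
        hv[light] = state

        # Assumed HV action: exactly one of A and B may be on.
        other = "B" if light == "A" else "A"
        wanted = not state
        hv[other] = wanted                  # HV updates its own table directly
        if cbus[other] != wanted:           # gateway actions the command...
            cbus[other] = wanted
            queue.append((other, wanted))   # ...and queues the new CB state
    return ("settled" if not queue else "unsettled"), steps


startup = simulate({"A": False, "B": True},      # HV's table at startup
                   {"A": True, "B": True},       # what C-Bus reports
                   [("A", True), ("B", True)])   # updates queued for HV
running = simulate({"A": False, "B": True},
                   {"A": True, "B": True},       # someone just switched A on
                   [("A", True)])
```

Under these assumptions `startup` is `("loop", 4)`: after four transfers the tables and the queue are back exactly where they began, so the exchange never ends. `running` is `("settled", 2)`, which matches the observation that the same interlock causes no loop once the states are consistent and only one change is in flight.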
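
The overflow trade-off described in the quoted message (overwrite old data or discard new data) can be shown with a short Python sketch. It assumes the worst case where HV accepts nothing while updates arrive. The third function, which keeps only the newest pending state per group, is an alternative that is not proposed anywhere in this thread and is included purely for comparison; all names here are hypothetical.

```python
from collections import OrderedDict, deque


def drain_fifo(updates, capacity, policy):
    """Fill a bounded FIFO and return the states the receiver would end with."""
    buf = deque()
    for update in updates:
        if len(buf) < capacity:
            buf.append(update)
        elif policy == "overwrite_oldest":
            buf.popleft()          # lose the oldest queued update
            buf.append(update)
        # policy == "discard_newest": the incoming update is simply lost
    return dict(buf)               # later entries for a group win


def drain_coalesced(updates):
    """Keep one pending entry per group, always holding its newest state."""
    pending = OrderedDict()
    for group, state in updates:
        pending.pop(group, None)   # move the group to the back of the queue
        pending[group] = state
    return dict(pending)


updates = [(1, "ON"), (2, "ON"), (3, "ON"), (1, "OFF"), (4, "ON")]
truth = {1: "OFF", 2: "ON", 3: "ON", 4: "ON"}

discarded = drain_fifo(updates, 3, "discard_newest")
overwritten = drain_fifo(updates, 3, "overwrite_oldest")
coalesced = drain_coalesced(updates)
```

With a capacity of three, `discarded` leaves group 1 ON and never reports group 4, and `overwritten` never reports group 2, so both differ from `truth`. The coalesced version matches `truth` and never needs more entries than there are groups, but it throws away the order and timing of changes and does nothing to stop a reaction loop from running.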


            • Paul Gale
              Sounds like a good idea :) Re the flag checking - I think this is also a good idea and perfectly acceptable, the only issue is that this needs to be well
              Message 6 of 9 , Jul 30, 2008
              • 0 Attachment
                Sounds like a good idea :)

                Re the flag checking - I think this is also a good idea and perfectly acceptable, the only issue is that this needs to be well documented. Were you planning on writing any further docs/wiki etc on the gateway and specific programming in HV? A nasty job though - I hate writing docs etc!

                Paul.



                > -----Original Message-----
                > From: ukusa_gateway@yahoogroups.com
                > [mailto:ukusa_gateway@yahoogroups.com] On Behalf Of Kevin Hawkins
                > Sent: 29 July 2008 17:10
                > To: ukusa_gateway@yahoogroups.com
                > Subject: Re: [ukusa_gateway] Synchronisation implementation - thoughts
                > from users please....
                >
                >
                > More coffee and I'm now thinking that disabling the running of
                > HomeVision Actions (state change triggered macros) during synch time is
                > the right thing to do. This will eliminate a lot of the reflex actions
                > that cause the issues as I update HV to match C-Bus. These are the one
                > aspect of HV's power recovery mechanism that I can block.
                >
                > Running actions based on state changes that arise during initial
                > synch is probably a bad idea anyway, as the order and timing of changes
                > has been lost. Group 22 might have changed two minutes after group 12,
                > but during a synch the changing groups will be presented back to HV
                > from 0 upwards to 255, ie 12 before 22 and with negligible delay
                > between them, so the validity of the change is compromised.
                >
                > I will then ensure there is a flag/variable in HV that is maintained
                > with the gateway status. A power recovery macro can be run when this
                > becomes 'synched' to complete any required state adjustments. For other
                > controllers I will provide a xAP BSC output that conveys the status.
                > What I would strongly suggest is that your scripts in any HA software
                > or xAP device honour this flag and avoid reflex actions (commands based
                > on state changes) during the synch stage, and have a final 'startup'
                > script that runs once synch is achieved. For C-Bus controllers like the
                > PAC and touchscreens I will maintain a virtual/phantom C-Bus group to
                > reflect the status, allowing the same interlock to be used in your
                > code.
                >
                > This doesn't resolve all issues, especially with dual controllers
                > (eg PAC or xAP) but it does reduce the problem. I'm still looking for
                > any users' thoughts..
                >
                > K
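
The flag-based interlock proposed in the quoted message could look roughly like this on the controller side. It is a sketch in Python under assumptions, not HomeVision or xAP code: the `Controller` class, its method names and the two callbacks are all hypothetical, and how the 'synched' status actually reaches a controller (HV variable, xAP BSC output or phantom C-Bus group) is left out.

```python
class Controller:
    """Toy controller that suppresses reflex actions until the gateway is synched."""

    def __init__(self, reflex, startup):
        self.synched = False
        self.states = {}
        self.commands = []        # commands this controller would send out
        self._reflex = reflex     # called as reflex(group, state) -> commands
        self._startup = startup   # called as startup(states) -> commands

    def on_state_update(self, group, state):
        changed = self.states.get(group) != state
        self.states[group] = state            # always track the state
        if changed and self.synched:          # react only once synched
            self.commands.extend(self._reflex(group, state))

    def on_gateway_status(self, synched):
        became_synched = synched and not self.synched
        self.synched = synched
        if became_synched:                    # run the startup script once
            self.commands.extend(self._startup(dict(self.states)))


def interlock(group, state):
    """Reflex action: exactly one of A and B may be on."""
    other = "B" if group == "A" else "A"
    return [(other, not state)]


def settle(states):
    """Startup script: resolve an A/B conflict once, on the settled states."""
    if states.get("A") and states.get("B"):
        return [("B", False)]
    return []


hv = Controller(interlock, settle)
hv.on_state_update("A", True)     # arrives during synch: recorded, no reaction
hv.on_state_update("B", True)
during_synch = list(hv.commands)
hv.on_gateway_status(True)        # gateway reports 'synched'
after_synch = list(hv.commands)
```

Here `during_synch` is empty and `after_synch` holds the single command `("B", False)`, so the conflict is resolved once from a settled picture instead of through a chain of reactions to each queued update.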


              • Mark Kenny
                Hi Kevin Only just back from hols, so didn't see your mail before now. I am a HomeSeer - Gateway - xAP user, so I'm not sure if my feedback is of any use.
                Message 7 of 9 , Aug 5, 2008
                • 0 Attachment
                  Hi Kevin
                   
                  Only just back from hols, so didn't see your mail before now .
                   
                  I am a HomeSeer - Gateway - xAP user, so I'm not sure if my feedback is of any use. However as a user with a gateway that doesn't sync properly, I thought I would let you know the current state of things.
                   
                  The gateway will only sync if I power down the C-Bus network. As a workaround, I have an event in HomeSeer which toggles all groups, thus updating the status. This does not force the gateway to become "synced" but status changes are correctly shown and updated.
                   
                  I believe the problem lies with my C-Bus network but as I have a couple of hundred hours' work still to do on the rest of the house, I had to put further troubleshooting wayyyy down the list of priorities.
                   
                   
                  Regards
                  Mark  



                • Kevin Hawkins
                  Hi Mark... I'm glad you've resurfaced - albeit it sounds like you've a way to go before the house nears completion. Actually we all now know that's like the
                  Message 8 of 9 , Aug 5, 2008
                  • 0 Attachment
                    Hi Mark...

                    I'm glad you've resurfaced - albeit it sounds like you've a way to
                    go before the house nears completion. Actually we all now know that's
                    like the end of the rainbow.

                    I have done quite a bit of work on the CB sync algorithm since the
                    version you have and whilst it is still possible that your problem is a
                    C-Bus issue I would like you to try my current build and see if it
                    improves things. Remind me how many groups you have and if there is
                    any other unusual aspect of the CB install eg network bridges or non
                    lighting applications ? Would you have a few moments to let me know
                    before I actually release the next beta ? ( I would need to send the
                    latest development version to you)

                    I can achieve a C-Bus state and level sync now in under 5 seconds.
                    However the HomeVision sync still takes nearly a minute as so much data
                    has to be transferred over the slow serial interface.

                    Currently it appears the gateway isn't synching at all for you.
                    The way you're getting around it is actually just change tracking, which
                    is not ideal in many ways. Not least because the gateway never enters
                    its fully running state and so some aspects might not work, and it will
                    also remain continually polling the CB network, adding unnecessary
                    traffic. The 'toggle' actions you have to take, besides being
                    unnecessary in a working setup, are also not good because in order to
                    toggle a state I need to know what state it is currently in - which I
                    don't ... so things might end up in an unexpected state.
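
A tiny Python sketch of why toggling without a sync is unreliable. It assumes, purely for illustration, that a toggle is carried out as "set the group to the opposite of the state the gateway believes it is in"; the function name is hypothetical.

```python
def toggle_via_gateway(actual, believed):
    """Toggle modelled as 'set to the opposite of the believed state'.

    Returns the new (actual, believed) pair, which agree after the command.
    """
    new_state = not believed
    return new_state, new_state


# The light is really ON, but the unsynched gateway assumes it is OFF.
first = toggle_via_gateway(True, False)
second = toggle_via_gateway(*first)
```

In this model the first toggle leaves the light ON (`first` is `(True, True)`), so nothing visibly happens, and the second turns it OFF. Two toggles should have returned the light to where it started, so the group ends up in the opposite state to the one expected.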

                    I want to get this all working perfectly for you. Now that Max's
                    webserver issue has been sussed you're the only user I'm aware of with
                    an issue ....

                    K

                    Mark Kenny wrote:
                    > Hi Kevin
                    >
                    > Only just back from hols, so didn't see your mail before now .
                    >
                    > I am a HomeSeer - Gateway - xAP user, so I'm not sure if my feedback
                    > is of any use. However as a user with a gateway that doesn't sync
                    > properly, I thought I would let you know the current state of things.
                    >
                    > The gateway will only sync if I power down the C-Bus network. As a
                    > workaround, I have an event in HomeSeer which toggles all groups, thus
                    > updating the status. This does not force the gateway to become
                    > "synced" but status changes are correctly shown and updated.
                    >
                    > I believe the problem lies with my C-Bus network but as I have a
                    > couple of hundred hours' work still to do on the rest of the house, I
                    > had to put further troubleshooting wayyyy down the list of priorities.
                    >
                    >
                    > Regards
                    > Mark
                  • Mark Kenny
                    Hi Kevin I am happy to try any development versions. I am sure I can find a few hours for this. If you could send me the dev. version or let me know where I
                    Message 9 of 9 , Aug 5, 2008
                    • 0 Attachment
                      Hi Kevin
                       
                      I am happy to try any development versions. I am sure I can find a few hours for this. 
                      If you could send me the dev. version or let me know where I can download it from, I will give it a try tonight. Besides, the continual rain over here means I can't do much work on the gutters this week!!
                       
                      Regards
                      Mark
                       
                       
                       
                       
                       