Loading ...
Sorry, an error occurred while loading the content.

race condition in navigator

Expand Messages
  • graniterose2003
    I have a wall-following class that uses the TVremote to pause/resume the robot. In the main thread a loop checks the IR sensors and uses nav.go(heading) to
    Message 1 of 4 , Oct 7, 2011
    • 0 Attachment
      I have a wall-following class that uses the TVremote to pause/resume the robot. In the main thread a loop checks the IR sensors and uses nav.go(heading) to keep the bot near the wall. The TV remote is also checked each pass through the loop. The code to check the TVremote looks like the following

      private int chkRemote() {
      int key = remote.read();
      if (key == PAUSE) {
      nav.stop();
      while (remote.read() != PLAY) {
      led4.toggle();
      sleep(1000);
      led4.toggle();
      }
      }
      return key;
      }

      Most of the time the pause/play works just fine. But every so often, after hitting PAUSE, the led will show we're in the PAUSE mode, but the motors are still running, despite having called nav.stop(); The remote won't stop the robot inside the toggle loop, but PLAY will give control back.

      The main loop is running at default priority, and the usual priorities are set for the encoders, screenmanager, and DifferentialDriveNavigator threads. So I'm not sure how the race condition occurs in the navigator thread. It seems that the navigator thread is about to call goHeading in the state switch loop when the main thread calls nav.stop(). The motors are stopped, but the navigator is resumed and calls goHeading so the motors are started again ....

      Of course, I can "fix" the race condition by putting a nav.stop() in the toggle loop of the remote checker ... but one wonders if there is a flaw revealed by the race condition ???
    • ridgesoft
      Hi, It looks like you have found a bug. The access of the mState variable is not protected in the run method so there is a potential race. Please try
      Message 2 of 4 , Oct 8, 2011
      • 0 Attachment
        Hi,

        It looks like you have found a bug. The access of the mState variable is not protected in the run method so there is a potential race.

        Please try replacing the run method with the following version that adds synchronization.

        The source code for the class is available here: http://www.ridgesoft.com/articles/ridgewarriorii/RidgeWarriorIIPartIII.zip

        Please let us know if that fixes it.

        Regards,

        RidgeSoft Support

        /**
        * Run method for the navigation thread.
        */
        public void run() {
        try {
        while (true) {
        synchronized (this) {
        switch (mState) {
        case MOVE_TO:
        goToPoint();
        break;
        case GO:
        goHeading();
        break;
        case ROTATE:
        doRotate();
        break;
        default: // stopped
        break;
        }
        }
        Thread.sleep(mPeriod);
        }
        }
        catch (Throwable t) {
        t.printStackTrace();
        }
        }



        --- In intellibrain@yahoogroups.com, "graniterose2003" <graniterose2003@...> wrote:
        >
        > I have a wall-following class that uses the TVremote to pause/resume the robot. In the main thread a loop checks the IR sensors and uses nav.go(heading) to keep the bot near the wall. The TV remote is also checked each pass through the loop. The code to check the TVremote looks like the following
        >
        > private int chkRemote() {
        > int key = remote.read();
        > if (key == PAUSE) {
        > nav.stop();
        > while (remote.read() != PLAY) {
        > led4.toggle();
        > sleep(1000);
        > led4.toggle();
        > }
        > }
        > return key;
        > }
        >
        > Most of the time the pause/play works just fine. But every so often, after hitting PAUSE, the led will show we're in the PAUSE mode, but the motors are still running, despite having called nav.stop(); The remote won't stop the robot inside the toggle loop, but PLAY will give control back.
        >
        > The main loop is running at default priority, and the usual priorities are set for the encoders, screenmanager, and DifferentialDriveNavigator threads. So I'm not sure how the race condition occurs in the navigator thread. It seems that the navigator thread is about to call goHeading in the state switch loop when the main thread calls nav.stop(). The motors are stopped, but the navigator is resumed and calls goHeading so the motors are started again ....
        >
        > Of course, I can "fix" the race condition by putting a nav.stop() in the toggle loop of the remote checker ... but one wonders if there is a flaw revealed by the race condition ???
        >
      • graniterose2003
        OK, adding your synchronized block around the switch block seems to fix the race condition. At least in the tests I ran I didn t see the race condition.
        Message 3 of 4 , Oct 9, 2011
        • 0 Attachment
          OK, adding your synchronized block around the switch block "seems" to fix the race condition. At least in the tests I ran I didn't see the race condition. (The usual caveat applies -- you can only prove/demonstrate a race condition exists. You can't prove there isn't a race condition. Or absence of proof is not proof of absence.)

          I agree the synchronized block should fix the race.

          Inquiring minds want to know ....
          It's not clear to me how the thread scheduler would allow the lower
          priority main thread to "interfere" with the higher priority DifferentialDriveNavigator thread. If the DiffNav is in its switch loop, the lower priority main thread should not get control til the DiffNav does its sleep. I could see how the race condition could occur with equal or higher priority main thread.

          thanks


          --- In intellibrain@yahoogroups.com, "ridgesoft" <rs1@...> wrote:
          >
          > Hi,
          >
          > It looks like you have found a bug. The access of the mState variable is not protected in the run method so there is a potential race.
          >
          > Please try replacing the run method with the following version that adds synchronization.
          >
          > The source code for the class is available here: http://www.ridgesoft.com/articles/ridgewarriorii/RidgeWarriorIIPartIII.zip
          >
          > Please let us know if that fixes it.
          >
          > Regards,
          >
          > RidgeSoft Support
          >
          > /**
          > * Run method for the navigation thread.
          > */
          > public void run() {
          > try {
          > while (true) {
          > synchronized (this) {
          > switch (mState) {
          > case MOVE_TO:
          > goToPoint();
          > break;
          > case GO:
          > goHeading();
          > break;
          > case ROTATE:
          > doRotate();
          > break;
          > default: // stopped
          > break;
          > }
          > }
          > Thread.sleep(mPeriod);
          > }
          > }
          > catch (Throwable t) {
          > t.printStackTrace();
          > }
          > }
          >
          >
          >
          > --- In intellibrain@yahoogroups.com, "graniterose2003" <graniterose2003@> wrote:
          > >
          > > I have a wall-following class that uses the TVremote to pause/resume the robot. In the main thread a loop checks the IR sensors and uses nav.go(heading) to keep the bot near the wall. The TV remote is also checked each pass through the loop. The code to check the TVremote looks like the following
          > >
          > > private int chkRemote() {
          > > int key = remote.read();
          > > if (key == PAUSE) {
          > > nav.stop();
          > > while (remote.read() != PLAY) {
          > > led4.toggle();
          > > sleep(1000);
          > > led4.toggle();
          > > }
          > > }
          > > return key;
          > > }
          > >
          > > Most of the time the pause/play works just fine. But every so often, after hitting PAUSE, the led will show we're in the PAUSE mode, but the motors are still running, despite having called nav.stop(); The remote won't stop the robot inside the toggle loop, but PLAY will give control back.
          > >
          > > The main loop is running at default priority, and the usual priorities are set for the encoders, screenmanager, and DifferentialDriveNavigator threads. So I'm not sure how the race condition occurs in the navigator thread. It seems that the navigator thread is about to call goHeading in the state switch loop when the main thread calls nav.stop(). The motors are stopped, but the navigator is resumed and calls goHeading so the motors are started again ....
          > >
          > > Of course, I can "fix" the race condition by putting a nav.stop() in the toggle loop of the remote checker ... but one wonders if there is a flaw revealed by the race condition ???
          > >
          >
        • ridgesoft
          Hi, It s hard to say exactly what scenario you were running into. However, if the main thread is executing the stop method when the higher priority navigator
          Message 4 of 4 , Oct 9, 2011
          • 0 Attachment
            Hi,

            It's hard to say exactly what scenario you were running into. However, if the main thread is executing the stop method when the higher priority navigator thread becomes runnable, the navigator thread would read mState, switch into the indicated case, and then block on the method invocation because the three methods referenced from the run method are synchronized, as is the stop method. Since the stop method would own the mutex, it would be allowed to complete before the higher priority thread could continue. So, if the navigator thread becomes runnable between the main thread's entry of the stop method and the point where the stop method updates mState, it is guaranteed the wrong case will be executed unless the navigator was stopped to begin with.

            Regards,

            RidgeSoft Support

            --- In intellibrain@yahoogroups.com, "graniterose2003" <graniterose2003@...> wrote:
            >
            > OK, adding your synchronized block around the switch block "seems" to fix the race condition. At least in the tests I ran I didn't see the race condition. (The usual caveat applies -- you can only prove/demonstrate a race condition exists. You can't prove there isn't a race condition. Or absence of proof is not proof of absence.)
            >
            > I agree the synchronized block should fix the race.
            >
            > Inquiring minds want to know ....
            > It's not clear to me how the thread scheduler would allow the lower
            > priority main thread to "interfere" with the higher priority DifferentialDriveNavigator thread. If the DiffNav is in its switch loop, the lower priority main thread should not get control til the DiffNav does its sleep. I could see how the race condition could occur with equal or higher priority main thread.
            >
            > thanks
            >
            >
            > --- In intellibrain@yahoogroups.com, "ridgesoft" <rs1@> wrote:
            > >
            > > Hi,
            > >
            > > It looks like you have found a bug. The access of the mState variable is not protected in the run method so there is a potential race.
            > >
            > > Please try replacing the run method with the following version that adds synchronization.
            > >
            > > The source code for the class is available here: http://www.ridgesoft.com/articles/ridgewarriorii/RidgeWarriorIIPartIII.zip
            > >
            > > Please let us know if that fixes it.
            > >
            > > Regards,
            > >
            > > RidgeSoft Support
            > >
            > > /**
            > > * Run method for the navigation thread.
            > > */
            > > public void run() {
            > > try {
            > > while (true) {
            > > synchronized (this) {
            > > switch (mState) {
            > > case MOVE_TO:
            > > goToPoint();
            > > break;
            > > case GO:
            > > goHeading();
            > > break;
            > > case ROTATE:
            > > doRotate();
            > > break;
            > > default: // stopped
            > > break;
            > > }
            > > }
            > > Thread.sleep(mPeriod);
            > > }
            > > }
            > > catch (Throwable t) {
            > > t.printStackTrace();
            > > }
            > > }
            > >
            > >
            > >
            > > --- In intellibrain@yahoogroups.com, "graniterose2003" <graniterose2003@> wrote:
            > > >
            > > > I have a wall-following class that uses the TVremote to pause/resume the robot. In the main thread a loop checks the IR sensors and uses nav.go(heading) to keep the bot near the wall. The TV remote is also checked each pass through the loop. The code to check the TVremote looks like the following
            > > >
            > > > private int chkRemote() {
            > > > int key = remote.read();
            > > > if (key == PAUSE) {
            > > > nav.stop();
            > > > while (remote.read() != PLAY) {
            > > > led4.toggle();
            > > > sleep(1000);
            > > > led4.toggle();
            > > > }
            > > > }
            > > > return key;
            > > > }
            > > >
            > > > Most of the time the pause/play works just fine. But every so often, after hitting PAUSE, the led will show we're in the PAUSE mode, but the motors are still running, despite having called nav.stop(); The remote won't stop the robot inside the toggle loop, but PLAY will give control back.
            > > >
            > > > The main loop is running at default priority, and the usual priorities are set for the encoders, screenmanager, and DifferentialDriveNavigator threads. So I'm not sure how the race condition occurs in the navigator thread. It seems that the navigator thread is about to call goHeading in the state switch loop when the main thread calls nav.stop(). The motors are stopped, but the navigator is resumed and calls goHeading so the motors are started again ....
            > > >
            > > > Of course, I can "fix" the race condition by putting a nav.stop() in the toggle loop of the remote checker ... but one wonders if there is a flaw revealed by the race condition ???
            > > >
            > >
            >
          Your message has been successfully submitted and would be delivered to recipients shortly.