Re: runaway vim processes

  • Bram Moolenaar
    Message 1 of 14 , Mar 3 12:05 PM
      Daniel Elstner wrote:

      > I found some interesting information on the issue. This Debian bug
      > report explains what's going on:
      >
      > http://lists.debian.org/debian-glibc/2002/debian-glibc-200212/msg00347.html
      >
      > As mentioned in the mail, the problem with coroutines also applies to
      > sigaltstack(). There seems to be only one way around the problem:
      > simply don't use sigaltstack().
      >
      > The attached patch disables the alternative stack if compiling on Linux
      > with pthreads, except for SIGSEGV. This is AFAIK the only signal for
      > which the alternative stack is really necessary. Thus there shouldn't
      > be a regression in functionality when switching to fixed-up pthreads
      > some day.

      This makes sense. I'm glad you found out about this Linux problem.

      The solution would not work though: Catching SIGSEGV on the alternate
      stack works (assuming that longjmp() works with threading).
      But when SIGSEGV occurs for another reason it would run into the problem
      with the stack pointer and generate another SIGSEGV, thus loop forever.

      I cannot think of a solution without losing the ability to catch
      out-of-stack errors. We _need_ the alternate stack, and it can't be
      used when threading is enabled...
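
To make the point about out-of-stack errors concrete, here is a minimal sketch of the technique under discussion: a SIGSEGV handler that runs on an alternate stack and escapes with siglongjmp(). It is not Vim's code. The names catch_stack_overflow(), on_fatal(), recurse() and ALT_STACK_BYTES are made up for illustration, and it assumes a single-threaded process with a finite stack limit.

```c
#define _XOPEN_SOURCE 700   /* for sigaltstack() and sigsetjmp() in strict modes */
#include <setjmp.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>

#define ALT_STACK_BYTES (64 * 1024)   /* illustrative size, not Vim's value */

static sigjmp_buf recover_point;

/* Runs on the alternate stack: the normal stack has no room left. */
static void on_fatal(int sig)
{
    (void)sig;
    siglongjmp(recover_point, 1);
}

/* Recursion that keeps consuming stack; the volatile buffer and the
 * addition after the call keep the compiler from turning it into a loop. */
static unsigned long recurse(unsigned long depth)
{
    volatile char pad[1024];

    pad[0] = (char)depth;
    if (depth == (unsigned long)-1)   /* never reached in practice */
        return 0;
    return recurse(depth + 1) + (unsigned long)pad[0];
}

/* Returns 1 when the overflow was caught and execution recovered, else 0. */
int catch_stack_overflow(void)
{
    stack_t ss;
    struct sigaction sa;

    /* The alternate stack is deliberately never freed in this sketch. */
    ss.ss_sp = malloc(ALT_STACK_BYTES);
    if (ss.ss_sp == NULL)
        return 0;
    ss.ss_size = ALT_STACK_BYTES;
    ss.ss_flags = 0;
    if (sigaltstack(&ss, NULL) != 0)
        return 0;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_fatal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_ONSTACK;   /* without this the handler cannot even start */
    if (sigaction(SIGSEGV, &sa, NULL) != 0 || sigaction(SIGBUS, &sa, NULL) != 0)
        return 0;

    if (sigsetjmp(recover_point, 1) == 0) {
        recurse(0);
        return 0;               /* only reached if recursion somehow ended */
    }
    return 1;                   /* arrived here via siglongjmp() */
}
```

With sa.sa_flags set to 0 instead of SA_ONSTACK, the kernel would have to build the signal frame on the already exhausted stack, so the handler could not run. That is why dropping the alternate stack gives up this kind of recovery.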

      --
      To keep milk from turning sour: Keep it in the cow.

      /// Bram Moolenaar -- Bram@... -- http://www.Moolenaar.net \\\
      /// Creator of Vim - Vi IMproved -- http://www.Vim.org \\\
      \\\ Project leader for A-A-P -- http://www.A-A-P.org ///
      \\\ Help AIDS victims, buy at Amazon -- http://ICCF.nl/click1.html ///
    • Daniel Elstner
      Message 2 of 14 , Mar 3 12:27 PM
        On Mon, 2003-03-03 at 21:05, Bram Moolenaar wrote:
        > Daniel Elstner wrote:
        >
        > > I found some interesting information on the issue. This Debian bug
        > > report explains what's going on:
        > >
        > > http://lists.debian.org/debian-glibc/2002/debian-glibc-200212/msg00347.html
        > >
        > > As mentioned in the mail, the problem with coroutines also applies to
        > > sigaltstack(). There seems to be only one way around the problem:
        > > simply don't use sigaltstack().
        > >
        > > The attached patch disables the alternative stack if compiling on Linux
        > > with pthreads, except for SIGSEGV. This is AFAIK the only signal for
        > > which the alternative stack is really necessary. Thus there shouldn't
        > > be a regression in functionality when switching to fixed-up pthreads
        > > some day.
        >
        > This makes sense. I'm glad you found out about this Linux problem.
        >
        > The solution would not work though: Catching SIGSEGV on the alternate
        > stack works (assuming that longjmp() works with threading).
        > But when SIGSEGV occurs for another reason it would run into the problem
        > with the stack pointer and generate another SIGSEGV, thus loop forever.

        Yes, it's not pretty. My idea was that reducing the problem to SIGSEGV
        is a lot better than the current situation, and switching to a fixed
        pthread (i.e. glibc compiled with minimum kernel >= 2.4) would fix the
        problem without recompiling Vim.

        SIGSEGV is fatal in most apps; Vim will just behave slightly more
        annoyingly and livelock instead of crashing immediately. And for this we
        can blame someone else.

        > I cannot think of a solution without losing the ability to catch
        > out-of-stack errors. We _need_ the alternate stack, and it can't be
        > used when threading is enabled...

        There is a solution -- rebuild glibc with minimum kernel >= 2.4 :/

        Regards,
        --Daniel
      • Daniel Elstner
        Message 3 of 14 , Mar 3 11:55 PM
          On Mon, 2003-03-03 at 21:05, Bram Moolenaar wrote:

          > The solution would not work though: Catching SIGSEGV on the alternate
          > stack works (assuming that longjmp() works with threading).

          I just confirmed that at least that part still works. Inserting a
          recursive call early in regmatch() successfully resulted in E363, and
          Vim did not crash.

          --Daniel
        • Daniel Elstner
          Message 4 of 14 , Mar 3 11:58 PM
            On Tue, 2003-03-04 at 08:55, Daniel Elstner wrote:
            > On Mon, 2003-03-03 at 21:05, Bram Moolenaar wrote:
            >
            > > The solution would not work though: Catching SIGSEGV on the alternate
            > > stack works (assuming that longjmp() works with threading).
            >
            > I just confirmed that at least that part still works. Inserting a
            > recursive call early in regmatch() successfully resulted in E363, and
            > Vim did not crash.

            Oops, sorry, I take that back. /me just realized that there is an
            explicit call to mch_stackcheck() in regmatch(). Is there any simple
            way to test the functionality of the signal handler?

            --Daniel
          • Bram Moolenaar
            Message 5 of 14 , Mar 4 1:25 AM
              Daniel Elstner wrote:

              > On Tue, 2003-03-04 at 08:55, Daniel Elstner wrote:
              > > On Mon, 2003-03-03 at 21:05, Bram Moolenaar wrote:
              > >
              > > > The solution would not work though: Catching SIGSEGV on the alternate
              > > > stack works (assuming that longjmp() works with threading).
              > >
              > > I just confirmed that at least that part still works. Inserting a
              > > recursive call early in regmatch() successfully resulted in E363, and
              > > Vim did not crash.
              >
              > Oops, sorry, I take that back. /me just realized that there is an
              > explicit call to mch_stackcheck() in regmatch(). Is there any simple
              > way to test the functionality of the signal handler?

              That's what you get when double checking for errors...

              I think the simplest way is to undefine HAVE_GETRLIMIT in auto/config.h
              and compile again.
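
For readers wondering why the explicit check gets in the way of the test: a getrlimit()-based check reports trouble before the stack is actually exhausted, so the SIGSEGV handler never gets a chance to run. The following is a rough sketch of that kind of check, not the real mch_stackcheck(). The names stack_check_set(), stack_check_init(), stack_check() and probe_depth() are hypothetical, the 1/16 safety margin is an arbitrary choice, and a downward-growing stack is assumed.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/resource.h>

/* Lowest address the stack may reach; 0 means "unknown, never complain". */
static uintptr_t stack_floor;

/* Derive the floor from an address near the top of the stack and a size
 * limit, keeping 1/16 of the limit in reserve (arbitrary margin). */
void stack_check_set(const void *top, size_t limit_bytes)
{
    uintptr_t usable = (uintptr_t)limit_bytes - (uintptr_t)limit_bytes / 16;
    uintptr_t t = (uintptr_t)top;

    stack_floor = (t > usable) ? t - usable : 0;
}

/* Call early in main() with the address of a local variable.
 * Returns 0 when a usable limit was found, -1 otherwise. */
int stack_check_init(const void *top)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_STACK, &rl) != 0 || rl.rlim_cur == RLIM_INFINITY)
        return -1;
    stack_check_set(top, (size_t)rl.rlim_cur);
    return stack_floor != 0 ? 0 : -1;
}

/* Returns 0 while address p is above the floor, -1 when stack is nearly out. */
int stack_check(const void *p)
{
    if (stack_floor == 0)
        return 0;
    return ((uintptr_t)p > stack_floor) ? 0 : -1;
}

/* Recurses until stack_check() objects and returns the depth reached,
 * the way a recursive matcher would bail out with an error. */
long probe_depth(long depth)
{
    volatile char frame[512];

    frame[0] = 0;
    if (stack_check((const void *)frame) != 0)
        return depth;
    return probe_depth(depth + 1) + frame[0];
}
```

A caller would run stack_check_init() once at startup and then call stack_check() with the address of a local variable at the top of each deeply recursive function. Since the recursion stops by itself, the signal handler stays untested as long as such a check is compiled in.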

              Since we do HAVE_GETRLIMIT on linux, and threading may cause a hang, it
              might be better to rely on mch_stackcheck() and not use the alternate
              stack. Thus use your #ifdef around setting sa.sa_flags, also for
              SIGSEGV.

                  /* Setup to use the alternate stack for the signal function. */
                  sa.sa_handler = func_deadly;
                  sigemptyset(&sa.sa_mask);
              # if defined(__linux__) && defined(_REENTRANT)
                  /* Linux with kernel 2.2 has a bug in thread handling in
                   * combination with using the alternate stack: library functions
                   * will use the ordinary stack anyway, causing a SEGV signal,
                   * which recursively calls deathtrap and hangs. */
                  sa.sa_flags = 0;
              # else
                  sa.sa_flags = SA_ONSTACK;
              # endif
                  sigaction(signal_info[i].sig, &sa, NULL);

              Does that look OK?

              --
              SUPERIMPOSE "England AD 787". After a few more seconds we hear hoofbeats in
              the distance. They come slowly closer. Then out of the mist comes KING
              ARTHUR followed by a SERVANT who is banging two half coconuts together.
              "Monty Python and the Holy Grail" PYTHON (MONTY) PICTURES LTD

            • Daniel Elstner
              Message 6 of 14 , Mar 4 7:15 AM
                On Tue, 2003-03-04 at 10:25, Bram Moolenaar wrote:

                > Since we do HAVE_GETRLIMIT on linux, and threading may cause a hang, it
                > might be better to rely on mch_stackcheck() and not use the alternate
                > stack. Thus use your #ifdef around setting sa.sa_flags, also for
                > SIGSEGV.

                Yes, that sounds good to me.

                > /* Setup to use the alternate stack for the signal function. */
                > sa.sa_handler = func_deadly;
                > sigemptyset(&sa.sa_mask);
                > # if defined(__linux__) && defined(_REENTRANT)
                > /* Linux with kernel 2.2 has a bug in thread handling in
                > * combination with using the alternate stack: library functions
                > * will use the ordinary stack anyway, causing a SEGV signal,
                > * which recursively calls deathtrap and hangs. */
                > sa.sa_flags = 0;
                > # else
                > sa.sa_flags = SA_ONSTACK;
                > # endif
                > sigaction(signal_info[i].sig, &sa, NULL);
                >
                > Does that look OK?

                Yep. But I'd change the wording of the comment to:

                /* On Linux, glibc compiled for minimum kernel 2.2 has a bug in
                 * thread handling in combination with using the alternate stack:
                 * pthread library functions try to use the stack pointer to
                 * identify the current thread, causing a SEGV signal, which
                 * recursively calls deathtrap and hangs. */

                Otherwise people might get confused since Linux 2.2 is quite rare on
                desktop systems these days.
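
As a rough illustration of what "use the stack pointer to identify the current thread" means, here is a toy model. It is not glibc's code. It assumes each thread's stack lives in a fixed-size, aligned slot with the thread descriptor at the top of that slot. The slot size, the descriptor size and the name toy_descr_from_sp() are invented for this example.

```c
#include <stdint.h>

#define TOY_STACK_SIZE ((uintptr_t)2 * 1024 * 1024)  /* assumed slot size */
#define TOY_DESCR_SIZE ((uintptr_t)1024)             /* assumed descriptor size */

/* Guess where the thread descriptor lives from a stack pointer value:
 * round up to the end of the aligned slot, then step back one descriptor. */
uintptr_t toy_descr_from_sp(uintptr_t sp)
{
    uintptr_t slot_end = (sp | (TOY_STACK_SIZE - 1)) + 1;

    return slot_end - TOY_DESCR_SIZE;
}
```

Any stack pointer inside the thread's own slot maps to the same descriptor address. A stack pointer inside an alternate signal stack, which is just some other memory region, maps to an address that has nothing to do with the thread. Dereferencing that is one plausible way to end up with the second SEGV described in the comment.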

                --Daniel