Re: runaway vim processes

  • Daniel Elstner
    Message 1 of 14 , Mar 3, 2003
      On Mon, 2003-03-03 at 21:05, Bram Moolenaar wrote:
      > Daniel Elstner wrote:
      >
      > > I found some interesting information on the issue. This Debian bug
      > > report explains what's going on:
      > >
      > > http://lists.debian.org/debian-glibc/2002/debian-glibc-200212/msg00347.html
      > >
      > > As mentioned in the mail, the problem with coroutines also applies to
      > > sigaltstack(). There seems to be only one way around the problem:
      > > simply don't use sigaltstack().
      > >
      > > The attached patch disables the alternative stack if compiling on Linux
      > > with pthreads, except for SIGSEGV. This is AFAIK the only signal for
      > > which the alternative stack is really necessary. Thus there shouldn't
      > > be a regression in functionality when switching to fixed-up pthreads
      > > some day.
      >
      > This makes sense. I'm glad you found out about this Linux problem.
      >
      > The solution would not work though: Catching SIGSEGV on the alternate
      > stack works (assuming that longjmp() works with threading).
      > But when SIGSEGV occurs for another reason it would run into the problem
      > with the stack pointer and generate another SIGSEGV, thus loop forever.

      Yes, it's not pretty. My idea was that reducing the problem to SIGSEGV
      is a lot better than the current situation, and switching to a fixed
      pthread (i.e. glibc compiled with minimum kernel >= 2.4) would fix the
      problem without recompiling Vim.

      SIGSEGV is fatal in most apps; Vim will just behave slightly more
      annoyingly and livelock instead of crashing immediately. And for this we
      can blame someone else.

      > I cannot think of a solution without losing the ability to catch
      > out-of-stack errors. We _need_ the alternate stack, and it can't be
      > used when threading is enabled...

      There is a solution -- rebuild glibc with minimum kernel >= 2.4 :/

      Regards,
      --Daniel
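
For readers who have not used sigaltstack(), here is a minimal sketch of what SA_ONSTACK changes. It is not Vim's code: the names setup_alt_stack(), on_signal() and probe() are made up for illustration, and it raises SIGUSR1 instead of provoking a real SIGSEGV so that the result is deterministic. It does not reproduce the pthread problem discussed above; it only shows that a handler installed with SA_ONSTACK runs on the alternate stack, while the same handler installed with sa_flags = 0 runs on the ordinary one.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper state: location of the alternate signal stack. */
static char *alt_base;
static size_t alt_size;
static volatile sig_atomic_t ran_on_alt_stack = -1;

/* Handler: record whether one of its own locals lives on the alternate stack. */
static void on_signal(int sig)
{
    char marker;
    uintptr_t p = (uintptr_t)&marker;

    (void)sig;
    ran_on_alt_stack = (p >= (uintptr_t)alt_base
                        && p < (uintptr_t)alt_base + alt_size);
}

/* Allocate and register an alternate stack; returns 0 on success. */
int setup_alt_stack(void)
{
    stack_t ss;

    alt_size = 64 * 1024;   /* fixed size chosen for the sketch */
    alt_base = malloc(alt_size);
    if (alt_base == NULL)
        return -1;
    ss.ss_sp = alt_base;
    ss.ss_size = alt_size;
    ss.ss_flags = 0;
    return sigaltstack(&ss, NULL);
}

/* Install on_signal for SIGUSR1 with the given sa_flags, deliver the signal,
 * and return 1 if the handler ran on the alternate stack, 0 if it ran on the
 * ordinary stack, -1 on error. */
int probe(int flags)
{
    struct sigaction sa;

    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = flags;
    ran_on_alt_stack = -1;
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;
    if (raise(SIGUSR1) != 0)   /* handler runs before raise() returns */
        return -1;
    return (int)ran_on_alt_stack;
}
```

After a successful setup_alt_stack(), probe(SA_ONSTACK) should report the alternate stack and probe(0) the ordinary one. The second case is what the proposed patch selects on Linux with pthreads.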
    • Daniel Elstner
      Message 2 of 14 , Mar 3, 2003
        On Mon, 2003-03-03 at 21:05, Bram Moolenaar wrote:

        > The solution would not work though: Catching SIGSEGV on the alternate
        > stack works (assuming that longjmp() works with threading).

        I just confirmed that at least that part still works. Inserting a
        recursive call early in regmatch() successfully resulted in E363, and
        Vim did not crash.

        --Daniel
      • Daniel Elstner
        Message 3 of 14 , Mar 3, 2003
          On Tue, 2003-03-04 at 08:55, Daniel Elstner wrote:
          > On Mon, 2003-03-03 at 21:05, Bram Moolenaar wrote:
          >
          > > The solution would not work though: Catching SIGSEGV on the alternate
          > > stack works (assuming that longjmp() works with threading).
          >
          > I just confirmed that at least that part still works. Inserting a
          > recursive call early in regmatch() successfully resulted in E363, and
          > Vim did not crash.

          Ooops, sorry, I take that back. /me just realized that there is an
          explicit call to mch_stackcheck() in regmatch(). Is there any simple
          way to test the functionality of the signal handler?

          --Daniel
        • Bram Moolenaar
          Message 4 of 14 , Mar 4, 2003
            Daniel Elstner wrote:

            > On Tue, 2003-03-04 at 08:55, Daniel Elstner wrote:
            > > On Mon, 2003-03-03 at 21:05, Bram Moolenaar wrote:
            > >
            > > > The solution would not work though: Catching SIGSEGV on the alternate
            > > > stack works (assuming that longjmp() works with threading).
            > >
            > > I just confirmed that at least that part still works. Inserting a
            > > recursive call early in regmatch() successfully resulted in E363, and
            > > Vim did not crash.
            >
            > Ooops, sorry, I take that back. /me just realized that there is an
            > explicit call to mch_stackcheck() in regmatch(). Is there any simple
            > way to test the functionality of the signal handler?

            That's what you get when double checking for errors...

            I think the simplest way is to undefine HAVE_GETRLIMIT in auto/config.h
            and compile again.

            Since we do HAVE_GETRLIMIT on linux, and threading may cause a hang, it
            might be better to rely on mch_stackcheck() and not use the alternate
            stack. Thus use your #ifdef around setting sa.sa_flags, also for
            SIGSEGV.

            /* Setup to use the alternate stack for the signal function. */
            sa.sa_handler = func_deadly;
            sigemptyset(&sa.sa_mask);
            # if defined(__linux__) && defined(_REENTRANT)
            /* Linux with kernel 2.2 has a bug in thread handling in
             * combination with using the alternate stack: library functions
             * will use the ordinary stack anyway, causing a SEGV signal,
             * which recursively calls deathtrap and hangs. */
            sa.sa_flags = 0;
            # else
            sa.sa_flags = SA_ONSTACK;
            # endif
            sigaction(signal_info[i].sig, &sa, NULL);

            Does that look OK?

            --
            SUPERIMPOSE "England AD 787". After a few more seconds we hear hoofbeats in
            the distance. They come slowly closer. Then out of the mist comes KING
            ARTHUR followed by a SERVANT who is banging two half coconuts together.
            "Monty Python and the Holy Grail" PYTHON (MONTY) PICTURES LTD

            /// Bram Moolenaar -- Bram@... -- http://www.Moolenaar.net \\\
            /// Creator of Vim - Vi IMproved -- http://www.Vim.org \\\
            \\\ Project leader for A-A-P -- http://www.A-A-P.org ///
            \\\ Help AIDS victims, buy at Amazon -- http://ICCF.nl/click1.html ///
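
To illustrate the idea of relying on an explicit stack check instead of the alternate stack, here is a rough sketch of a getrlimit()-based check. It is written in the spirit of mch_stackcheck() but is not Vim's implementation: stack_limit_init(), stack_ok() and guarded_depth() are hypothetical names, the 3/4 safety margin and the max_budget parameter are arbitrary choices for the example, and it assumes a stack that grows towards lower addresses.

```c
#define _XOPEN_SOURCE 700
#include <stddef.h>
#include <stdint.h>
#include <sys/resource.h>

/* Lowest stack address the check still accepts; 0 means "no limit known". */
static uintptr_t stack_floor;

/* Call early (e.g. at the start of main) with the address of a local
 * variable.  The usable budget is 3/4 of RLIMIT_STACK, capped at max_budget
 * so that an unlimited or very large rlimit still gives a finite bound. */
void stack_limit_init(const void *near_top, size_t max_budget)
{
    struct rlimit rl;
    size_t budget = max_budget;
    uintptr_t top = (uintptr_t)near_top;

    if (getrlimit(RLIMIT_STACK, &rl) == 0 && rl.rlim_cur != RLIM_INFINITY) {
        rlim_t usable = rl.rlim_cur / 4 * 3;
        if (usable < budget)
            budget = (size_t)usable;
    }
    stack_floor = (top > budget) ? top - budget : 0;
}

/* Returns nonzero while p is still above the computed floor. */
int stack_ok(const void *p)
{
    return (uintptr_t)p > stack_floor;
}

/* Recursion that stops cleanly when the check fails (or at cap), instead of
 * running off the end of the stack.  Returns the depth it reached. */
long guarded_depth(long depth, long cap)
{
    volatile char pad[1024];   /* gives each frame a known minimum size */
    long result;

    pad[0] = 0;
    if (depth >= cap || !stack_ok((const void *)pad))
        return depth;
    result = guarded_depth(depth + 1, cap);
    pad[0] = 1;   /* volatile store after the call prevents tail-call looping */
    return result;
}
```

A caller would initialise the limit once and then test the address of a local variable before recursing, which is the role the explicit check in regmatch() plays in this thread. Because no signal handler is involved, this approach does not depend on the alternate stack at all.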
          • Daniel Elstner
            Message 5 of 14 , Mar 4, 2003
              On Tue, 2003-03-04 at 10:25, Bram Moolenaar wrote:

              > Since we do HAVE_GETRLIMIT on linux, and threading may cause a hang, it
              > might be better to rely on mch_stackcheck() and not use the alternate
              > stack. Thus use your #ifdef around setting sa.sa_flags, also for
              > SIGSEGV.

              Yes, that sounds good to me.

              > /* Setup to use the alternate stack for the signal function. */
              > sa.sa_handler = func_deadly;
              > sigemptyset(&sa.sa_mask);
              > # if defined(__linux__) && defined(_REENTRANT)
              > /* Linux with kernel 2.2 has a bug in thread handling in
              > * combination with using the alternate stack: library functions
              > * will use the ordinary stack anyway, causing a SEGV signal,
              > * which recursively calls deathtrap and hangs. */
              > sa.sa_flags = 0;
              > # else
              > sa.sa_flags = SA_ONSTACK;
              > # endif
              > sigaction(signal_info[i].sig, &sa, NULL);
              >
              > Does that look OK?

              Yep. But I'd change the wording of the comment to:

              /* On Linux, glibc compiled for minimum kernel 2.2 has a bug in
               * thread handling in combination with using the alternate stack:
               * pthread library functions try to use the stack pointer to
               * identify the current thread, causing a SEGV signal, which
               * recursively calls deathtrap and hangs. */

              Otherwise people might get confused since Linux 2.2 is quite rare on
              desktop systems these days.

              --Daniel