Re: Daemon monitor
- On Tue, Nov 26, 2002 at 12:38:47AM -0800, Doug Barton wrote:
> Well, that's a good question. I hadn't really thought about that, but
> I'm a little skeptical about this idea... what happens if this daemon
> crashes? Personally I prefer the approach of having a process-guard
> script for each separate daemon you're interested in protecting.
> However, I don't want to seem to be a negative nancy here. :)
I could use the guard script for that. I don't like the idea of using
the guard script for all the daemons because that would mean you would
actually have double the number of processes (not necessarily bad, but
it offends my aesthetic sense :). Plus, I'm not sure all the daemons
in the base system have a "foreground" switch (I may be wrong), and who knows
about the daemons in ports. Also, it would be easier to handle
restart intervals, maximum restarts, etc. with a proper monitoring daemon.
In addition, I would like to be able to change any parameter, such
as the restart interval, on the fly without having to stop and
restart the guard script (and consequently the server as well).
> > It's kind of like daemontools, except that it's going to be integrated
> > into rcng. Initially, the rc.d script passes information about
> > the daemon to the monitor through a unix domain socket.
> Does it have to be a socket? It seems that is the cause of a lot of the
> problems you mention below, which I'm leaving in place for others to
> comment on. Meanwhile, this is the approach I use for a process-guard
> script for named that I use at work
I'm not sure what you mean, but I'll explain why I want to do it that way.
I don't want Yet Another Config File. I want to be able to twiddle a knob
in rc.conf that will tell rc, "hey, I want this daemon monitored, and if
it crashes I want it restarted after 5 seconds and I don't want it restarted
more than 10 times." I want to build this into rcng without having to specify
all this in a separate (inetd.conf-like) configuration file. So, in essence,
this is what happens:
1. The first time the monitor is invoked it detaches from the
console, opens a unix domain socket (a la syslogd(8)), and waits.
2. Subsequent invocations of the monitor will sense this socket
and instead of doing #1, pass on the information about the
daemon to be monitored through the socket to the monitor. The
information includes things like: path to the rc script or daemon,
restart interval, max restarts, and if the daemon is already
running its pid.
3. The monitoring daemon, upon receiving this information through the
socket, will start the process if it isn't already started and
use kqueue(2) to watch for the processes crashing/quitting/whatever.
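
As a rough illustration of steps 1 and 2 above (not the actual monitor code), the
"first invocation listens, later invocations hand off" logic might look like the
Python sketch below. The JSON line format, the function names, and the example
script path are assumptions made for illustration only. Detaching from the
console and the kqueue(2) watch of step 3 are left out.

```python
import json
import os
import socket
import tempfile


def encode_request(script, restart_interval, max_restarts, pid=None):
    """Serialize one daemon's details as a single JSON line (assumed format)."""
    record = {
        "script": script,
        "restart_interval": restart_interval,
        "max_restarts": max_restarts,
        "pid": pid,  # None when the daemon is not running yet
    }
    return (json.dumps(record) + "\n").encode()


def decode_request(data):
    """Inverse of encode_request."""
    return json.loads(data.decode())


def open_or_connect(path):
    """Return ("client", sock) if a monitor already listens on path,
    otherwise bind the socket and return ("monitor", listening_sock)."""
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        client.connect(path)
        return "client", client
    except (FileNotFoundError, ConnectionRefusedError):
        client.close()
    if os.path.exists(path):
        os.unlink(path)  # stale socket file left by a dead monitor
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(path)
    server.listen(5)
    return "monitor", server


def demo():
    """First call becomes the monitor, second call hands it a request."""
    path = os.path.join(tempfile.mkdtemp(), "monitor.sock")
    role1, server = open_or_connect(path)
    role2, client = open_or_connect(path)
    # Example values only: restart after 5 seconds, at most 10 times.
    client.sendall(encode_request("/etc/rc.d/named", 5, 10, pid=1234))
    client.close()
    conn, _ = server.accept()
    with conn.makefile("rb") as stream:
        request = decode_request(stream.readline())
    conn.close()
    server.close()
    return role1, role2, request
```

The check-then-bind sequence is racy if two invocations start at the same
moment; a real monitor would need to guard against that.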
Now here's the problem:
I want to be able to point the monitor at an rc.d script and tell it, "hey, if
_this_ process quits, run _that_ script to restart it." Now, the way daemontools
handles it is by making the script like your guard script. It doesn't exit
unless the daemon stops or crashes for some reason. The rc.d scripts don't work
like that, they start the daemon and then they exit. So, the rc.d scripts have to
have some way to communicate with the monitor. The best solution I've been able
to come up with is a pipe(2). When the monitor uses an rc.d script to restart
a stopped daemon it creates a pipe(2), fork(2)s, and execve(2)s the script.
It passes the file descriptor of the write-end of the pipe through an
environment variable. Now, like I said, this works fine, except that the file
descriptors for the pipe(2) are not close-on-exec. So, in addition to the monitor
and the script, any processes forked by the script also have access to the pipe(2)
and the environment variable that contains the fd for _one_ end of the pipe. So,
conceivably they could write garbage to it. Now this isn't really exploitable,
for three reasons:
1. The monitor only expects a pid. A malicious user could not
use this to modify a parameter and get it to restart a different daemon.
2. The monitor closes both ends of the pipe(2) immediately after the
first read(2) so the window of opportunity isn't very big.
3. The domain socket, the rc.d scripts, etc are only writeable by
the root user, so a malicious user would first have to break root
to take advantage of this ...err...feature.
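
The pipe handoff described above can be sketched in Python as follows. This is
only an illustration under assumptions: the variable name MONITOR_PIPE_FD is
made up, and a small Python child stands in for the rc.d script, reporting its
own pid in place of the daemon's.

```python
import os
import subprocess
import sys

PIPE_ENV = "MONITOR_PIPE_FD"  # hypothetical variable name

# Stand-in for an rc.d script: it finds the write end through the
# environment and reports a pid (here simply its own) on it.
STAND_IN_SCRIPT = (
    "import os\n"
    "fd = int(os.environ['MONITOR_PIPE_FD'])\n"
    "os.write(fd, (str(os.getpid()) + '\\n').encode())\n"
)


def restart_via_script(argv):
    """Run a start script and return the pid it reports through the pipe."""
    read_fd, write_fd = os.pipe()
    env = dict(os.environ)
    env[PIPE_ENV] = str(write_fd)
    # pass_fds leaves write_fd open across the exec, i.e. NOT close-on-exec,
    # so anything the script itself spawns could inherit it as well.
    proc = subprocess.Popen(argv, env=env, pass_fds=(write_fd,))
    os.close(write_fd)  # the monitor keeps only the read end
    with os.fdopen(read_fd, "rb") as reader:
        line = reader.readline()  # first (and only) read, then the fd is closed
    proc.wait()
    # Only a pid is expected: anything that is not an integer raises ValueError.
    return int(line.decode().strip())


def demo():
    return restart_via_script([sys.executable, "-c", STAND_IN_SCRIPT])
```

Parsing the message strictly as an integer mirrors point 1 above: garbage
written to the pipe can at worst produce a bad pid, not a different command.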
Nonetheless, benign as it may be, I would like to mitigate this problem
by adding a built-in function to sh(1) that (un)sets the close-on-exec flag
of a descriptor so that the rc.d scripts can use it to avoid propagating the
pipe to any child processes. However, I am not sure if such a patch would be
acceptable to people.
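
For reference, what such a built-in would do underneath is toggle FD_CLOEXEC
with fcntl(2). The sketch below shows the effect from Python rather than sh(1);
the helper names are made up, and the probe child only checks whether the
descriptor number is still open after the exec.

```python
import fcntl
import os
import subprocess
import sys


def set_cloexec(fd, enabled=True):
    """Set or clear the close-on-exec flag on fd."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    if enabled:
        flags |= fcntl.FD_CLOEXEC
    else:
        flags &= ~fcntl.FD_CLOEXEC
    fcntl.fcntl(fd, fcntl.F_SETFD, flags)


def child_can_see(fd):
    """Exec a child and report whether fd is still open in it."""
    probe = (
        "import os, sys\n"
        "try:\n"
        "    os.fstat(%d)\n"
        "except OSError:\n"
        "    sys.exit(1)\n" % fd
    )
    # close_fds=False so that only the close-on-exec flag decides inheritance.
    return subprocess.call([sys.executable, "-c", probe], close_fds=False) == 0


def demo():
    read_fd, write_fd = os.pipe()
    set_cloexec(write_fd, False)
    before = child_can_see(write_fd)  # flag cleared: child inherits the pipe
    set_cloexec(write_fd, True)
    after = child_can_see(write_fd)   # flag set: pipe is closed on exec
    os.close(read_fd)
    os.close(write_fd)
    return before, after
```

An rc.d script with such a built-in would set the flag right before starting
the daemon, so the daemon and its children never see the pipe.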
Another solution might be to unset the environment
variable that contains the fd of the write-end of the pipe. So, even though
the pipe would be in the children's fd table, they wouldn't know which
one it was.
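
A minimal sketch of this second approach, again in Python and again with the
made-up variable name MONITOR_PIPE_FD: the script takes the fd number out of
its environment before starting anything else. In an actual rc.d script this
would be a plain `unset` of the variable.

```python
import os
import subprocess
import sys

PIPE_ENV = "MONITOR_PIPE_FD"  # hypothetical variable name


def take_pipe_fd(environ=os.environ):
    """Remove the variable from the environment and return the fd number."""
    value = environ.pop(PIPE_ENV, None)
    return int(value) if value is not None else None


def child_sees_variable():
    """Start a child and report whether the variable is in its environment."""
    probe = "import os, sys; sys.exit(0 if %r in os.environ else 1)" % PIPE_ENV
    return subprocess.call([sys.executable, "-c", probe]) == 0


def demo():
    os.environ[PIPE_ENV] = "7"      # as if set by the monitor
    before = child_sees_variable()  # children would learn the fd number
    fd = take_pipe_fd()             # script remembers it, then hides it
    after = child_sees_variable()   # children no longer know which fd it is
    return fd, before, after
```

This hides the number only. The descriptor itself is still open in the
children, so it is weaker than setting close-on-exec.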
> > One of the bits
> > of information it passes is the rc.d script to run (itself) when the
> > daemon dies. So, when a daemon crashes and it comes time to restart it
> > the monitor creates a pipe, runs the script, and read(2)s on the listening
> > end of the pipe for the script to send it the new pid of the daemon. This
> > works fine, except that every process spawned from the script now has access
> > to the pipe. It would be nice if the Bourne Shell had a built-in function to
> > set the close-on-exec flag of a file descriptor. Do you think people would
> > be amenable to such a patch? Or do you think this bug^H^H^Hfeature is
> > something we can live with?
Cheers.
Mike Makonnen | GPG-KEY: http://www.identd.net/~mtm/mtm.asc
mtm@... | Fingerprint: D228 1A6F C64E 120A A1C9 A3AA DAE1 E2AF DBCC 68B9