lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <13755.1233458060@turing-police.cc.vt.edu>
Date:	Sat, 31 Jan 2009 22:14:20 -0500
From:	Valdis.Kletnieks@...edu
To:	Angelo Borsotti <angelo.borsotti@...il.com>
Cc:	linux-kernel@...r.kernel.org
Subject: Re: processes/threads monitor

On Sat, 31 Jan 2009 22:57:36 +0100, Angelo Borsotti said:

> The need occurs on Linux installations in which there are some process
> control applications,
> or some continuous sever applications that need some resilience
> towards process failures.
> When a process providing an essential service terminates, come
> "process monitor" would
> create it again, restoring the service automatically (and without the
> need for human intervention).

Usually, this is easily done with any sane /sbin/init, by sticking a
line in /etc/inittab that looks like:

vip:12345:respawn:/usr/bin/my-important-process

It terminates, it gets respawned.

The *tricky* case isn't dealing with a process that manages to terminate,
the real uglyness starts when you have a process that becomes wedged up
but failing to terminate (which can happen for any number of reasons - it
went into an infinite loop, or two threads managed to deadlock, or....)

And unfortunately, there's no clean general-purpose way to do *that*, mostly
because there's no good config language that allows you to code stuff like
"if Apache wedges up and fails to respond in 0.5 seconds or less, shoot it
and restart it, unless you have detected that the failure is actually due to
unrelated cause A, in which case do X, or if B happened, then do Y, or if
C happened, then do Z, or...."

If you have real requirements for all-the-time continuous operations, then
you don't *really* want a monitor anyhow.  What you *want* is redundant
hardware with some good high-availability clustering software to do hot failover.

Because if your server comes to a screeching halt because a memory card or
other hardware just bit the dust, that monitor isn't going to do *anything* for you.

Plus - if you can't do failover, how do you do a system upgrade if needed?

Content of type "application/pgp-signature" skipped

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ