Date:	Wed, 14 Dec 2011 23:11:45 +0100
From:	Linus Walleij <linus.walleij@...aro.org>
To:	Vincent Li <vincent.mc.li@...il.com>
Cc:	linux-watchdog@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: Seeking Linux watchdog design advice to trouble shoot mystory
 silent reboot issue

On Mon, Dec 5, 2011 at 8:55 PM, Vincent Li <vincent.mc.li@...il.com> wrote:

> we have a complex system with a large number of processes running
> simultaneously. If any of the processes gets into a faulty state and
> hangs or consumes more than its fair share of the system resources,
> the other processes may not get a chance to run, and the whole system
> can hang, interrupting the system functionality and user traffic.

Have you tried using RLIMITs?

Last time I used something like this from each process:

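#include <stdio.h>
#include <sys/time.h>
#include <sys/resource.h>

struct rlimit rl;
int ret;

// Allow each process at most 5 seconds of CPU time (not wall-clock time)
rl.rlim_cur = rl.rlim_max = 5;
ret = setrlimit(RLIMIT_CPU, &rl);
if (ret)
	perror("setrlimit(RLIMIT_CPU)");

// Allow a real-time process at most 1 second of CPU time (the value is
// in microseconds) without making a blocking system call
rl.rlim_cur = rl.rlim_max = 1000000;
ret = setrlimit(RLIMIT_RTTIME, &rl);
if (ret)
	perror("setrlimit(RLIMIT_RTTIME)");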

The latter is good if you have real-time processes.
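
If you would rather shut down cleanly than be killed outright, one option is to
lower only the soft limit and catch SIGXCPU. The following is a minimal sketch
under that assumption, not something taken from a real code base: the helper
name limit_cpu_seconds(), the flag cpu_limit_hit and the handler on_sigxcpu()
are all made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <sys/resource.h>

/* Set by the SIGXCPU handler; a main loop can poll it and exit cleanly. */
static volatile sig_atomic_t cpu_limit_hit;

static void on_sigxcpu(int sig)
{
	(void)sig;
	cpu_limit_hit = 1;
}

/*
 * Hypothetical helper: set the soft RLIMIT_CPU to soft_sec seconds of
 * CPU time, leave the hard limit untouched, and install a SIGXCPU
 * handler. Returns 0 on success, -1 on failure with errno set.
 */
static int limit_cpu_seconds(rlim_t soft_sec)
{
	struct rlimit rl;
	struct sigaction sa;

	if (getrlimit(RLIMIT_CPU, &rl))
		return -1;

	/* The soft limit may not exceed the current hard limit. */
	if (rl.rlim_max != RLIM_INFINITY && soft_sec > rl.rlim_max)
		soft_sec = rl.rlim_max;
	rl.rlim_cur = soft_sec;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_sigxcpu;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGXCPU, &sa, NULL))
		return -1;

	return setrlimit(RLIMIT_CPU, &rl);
}
```

A process would call limit_cpu_seconds(5) early in main() and check
cpu_limit_hit in its main loop. If it keeps running after the soft limit,
Linux repeats SIGXCPU once per second and sends SIGKILL when a finite hard
limit is reached.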

There are also RLIMITs for memory consumption.
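
For example, RLIMIT_AS caps the total address space of a process, so that
malloc() and mmap() start failing with ENOMEM instead of the process eating
all memory. A minimal sketch, assuming a hypothetical helper name
limit_address_space() and a limit you would have to size for your own
processes:

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/time.h>
#include <sys/resource.h>

/*
 * Hypothetical helper: cap the soft address-space limit (RLIMIT_AS) at
 * max_bytes, leaving the hard limit untouched. Returns 0 on success,
 * -1 on failure with errno set.
 */
static int limit_address_space(rlim_t max_bytes)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_AS, &rl))
		return -1;

	/* The soft limit may not exceed the current hard limit. */
	if (rl.rlim_max != RLIM_INFINITY && max_bytes > rl.rlim_max)
		max_bytes = rl.rlim_max;
	rl.rlim_cur = max_bytes;

	return setrlimit(RLIMIT_AS, &rl);
}
```

Calling limit_address_space((rlim_t)512 << 20) at startup would limit the
process to roughly 512 MiB of address space. The process then has to cope
with allocation failures, so this only helps if NULL returns from malloc()
are actually checked.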

Consult:
http://kernel.org/doc/man-pages/online/pages/man2/getrlimit.2.html

> CPU and memory control group features are not considered at this stage
> because it is too invasive to change in our custom kernel.

Do you mean that you are using an antique kernel with many custom
patches and you don't want to upgrade because it's a lot of work?
Mainlining your code and keeping each patch topic on its own git
branch are recommended practices.

If you mean you have been stripping it down for footprint then
it's another thing which I can fully understand...

Yours,
Linus Walleij