Message-ID: <20120104200447.GC11810@n2100.arm.linux.org.uk>
Date: Wed, 4 Jan 2012 20:04:47 +0000
From: Russell King - ARM Linux <linux@....linux.org.uk>
To: Nicolas Pitre <nico@...xnic.net>
Cc: Olof Johansson <olof@...om.net>, linux-kernel@...r.kernel.org,
Arnd Bergmann <arnd@...db.de>,
linux-arm-kernel@...ts.infradead.org
Subject: Re: ZenIV (was: Re: Status of arm-soc.git for 3.2)
On Wed, Jan 04, 2012 at 07:10:42PM +0000, Russell King - ARM Linux wrote:
> On Wed, Jan 04, 2012 at 01:55:48PM -0500, Nicolas Pitre wrote:
> > On Wed, 4 Jan 2012, Russell King - ARM Linux wrote:
> > > I've now disabled all access to my git tree there, and restarted apache.
> > > We will see whether that improves stability - I suspect it will do because
> > > I reckon that the problem is that the smart git stuff is what's killing
> > > the machine.
> >
> > I'm sure that is the case. However faulty hardware could still be the
> > root cause, but without the load from Git the machine might become
> > loaded lightly enough you might not see any ill effects before quite a
> > while.
>
> When it's serving the next Fedora release (it's one of the official
> mirrors) I'm sure any problems like that would become noticeable - but
> they haven't yet. It purely seems to be something git is doing which
> is killing the machine.

I think we've just found the issue causing httpd to die:

- Fedora systems set a limit at login time on the number of processes a
  user can run - which is set to a soft limit of 1024.
- When someone logs in, their shell inherits this soft rlimit. This
  gets inherited by all sub-processes, including through a su to their
  root shell.
- When they restart httpd, httpd also inherits this limit.
- httpd's own internal limits are set to a max clients of 1024.
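
The inheritance described in the list above can be observed directly. What follows is a minimal sketch, not part of the original report, using Python's standard `resource` module; the helper name `soft_limit_seen_by_child` is made up for illustration. It sets the soft RLIMIT_NPROC in a forked child and asks the program exec'd from that child which limit it sees.

```python
import resource
import subprocess
import sys

# One-liner run by the exec'd program: print its own soft RLIMIT_NPROC.
REPORT = "import resource; print(resource.getrlimit(resource.RLIMIT_NPROC)[0])"


def soft_limit_seen_by_child(soft):
    """Set the soft RLIMIT_NPROC to `soft` in a forked child, then return
    the soft limit reported by the program exec'd from that child."""

    def set_soft_limit():
        # Runs in the child between fork() and exec(); the hard limit is kept.
        _, hard = resource.getrlimit(resource.RLIMIT_NPROC)
        if hard != resource.RLIM_INFINITY:
            # A soft limit may never exceed the hard limit.
            target = min(soft, hard)
        else:
            target = soft
        resource.setrlimit(resource.RLIMIT_NPROC, (target, hard))

    result = subprocess.run(
        [sys.executable, "-c", REPORT],
        preexec_fn=set_soft_limit,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout)


if __name__ == "__main__":
    # 1024 is the soft limit mentioned in the message.
    print(soft_limit_seen_by_child(1024))
```

The exec'd program never touches its own limits, so whatever it prints was inherited from its parent, which is the same way a restarted daemon picks up the limit of the login shell.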

The problem comes when httpd hits 1024 processes - as it forks as root,
the fork succeeds (root is not subject to this rlimit), but the
subsequent setuid() to the unprivileged user then fails with -EAGAIN,
causing httpd to experience a fatal error.
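
A process that drops privileges can at least make this failure visible. The sketch below is not httpd's actual code; `drop_privileges`, `ProcessLimitError` and the `setuid` parameter are hypothetical names chosen for illustration. It turns an EAGAIN from `setuid()` into an explicit error that points at the process limit.

```python
import errno
import os


class ProcessLimitError(RuntimeError):
    """Raised when setuid() is refused with EAGAIN."""


def drop_privileges(uid, setuid=os.setuid):
    """Switch to `uid`, reporting an EAGAIN failure explicitly.

    `setuid` is injectable only so that the error path can be exercised
    without root; by default the real os.setuid is used.
    """
    try:
        setuid(uid)
    except OSError as exc:
        if exc.errno == errno.EAGAIN:
            raise ProcessLimitError(
                f"setuid({uid}) failed with EAGAIN; the target user may be "
                "at its RLIMIT_NPROC process limit"
            ) from exc
        # Any other failure (for example EPERM) is passed through unchanged.
        raise
```

Called as `drop_privileges(some_uid)` right after a fork, this would replace a silent fatal error with a message naming the likely cause.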

Obviously, this is not a good combination. It's also completely
unnoticeable to anyone who restarts apache.

So, I've 'fixed' it by raising the rlimit in the httpd startup scripts,
which should keep it fixed whenever anyone else restarts httpd.

The problem should be solved, and as such I've re-enabled access to
the git tree.
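
The message does not show the change made to the startup scripts. Purely as an illustration, raising the soft limit before a daemon starts could look like the sketch below; the function names and the value 4096 are assumptions, not taken from the original fix.

```python
import resource


def new_soft_limit(soft, hard, wanted):
    """Return the soft limit to use: at least `wanted` where the hard
    limit allows it, and never lower than the current soft limit."""
    inf = resource.RLIM_INFINITY
    if soft == inf or soft >= wanted:
        return soft  # already high enough
    if hard == inf:
        return wanted  # no hard cap in the way
    return min(wanted, hard)  # clamp to the hard limit


def raise_nproc_soft_limit(wanted):
    """Raise this process's soft RLIMIT_NPROC towards `wanted` and return
    the soft limit now in effect. Children inherit the result."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    target = new_soft_limit(soft, hard, wanted)
    if target != soft:
        resource.setrlimit(resource.RLIMIT_NPROC, (target, hard))
    return target


if __name__ == "__main__":
    # 4096 is an arbitrary example value.
    print(raise_nproc_soft_limit(4096))
```

Raising only the soft limit needs no extra privilege as long as the target stays within the hard limit, which is why the sketch clamps to it instead of failing.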