Message-ID: <20061217223730.GW10054@mea-ext.zmailer.org>
Date: Mon, 18 Dec 2006 00:37:30 +0200
From: Matti Aarnio <matti.aarnio@...iler.org>
To: Randy Dunlap <randy.dunlap@...cle.com>
Cc: "J.H." <warthog9@...nel.org>, Andrew Morton <akpm@...l.org>,
Pavel Machek <pavel@....cz>,
kernel list <linux-kernel@...r.kernel.org>, hpa@...or.com,
webmaster@...nel.org
Subject: Re: [KORG] Re: kernel.org lies about latest -mm kernel
On Sun, Dec 17, 2006 at 10:23:54AM -0800, Randy Dunlap wrote:
> J.H. wrote:
...
> >The root cause boils down to with git, gitweb and the normal mirroring
> >on the frontend machines our basic working set no longer stays resident
> >in memory, which is forcing more and more to actively go to disk causing
> >a much higher I/O load. You have the added problem that one of the
> >frontend machines is getting hit harder than the other due to several
> >factors: various DNS servers not round robining, people explicitly
> >hitting [git|mirrors|www|etc]1 instead of 2 for whatever reason and
> >probably several other factors we aren't aware of. This has caused the
> >average load on that machine to hover around 150-200 and if for whatever
> >reason we have to take one of the machines down the load on the
> >remaining machine will skyrocket to 2000+.
Relying on DNS and clients to do round-robin load balancing is doomed.
You really, REALLY, need external L4 load-balancer switches.
(And installation help from somebody who really knows how to run this
kind of service on a cluster.)
Basic config features include, of course:
- the number of parallel active connections for each protocol
- availability of each served protocol (e.g. one can shut down rsync
  on one server, and new rsync connections get pushed elsewhere)
- load balancing of each served protocol separately
- server load monitoring, letting it bias new connections toward nodes
  that are not so utterly loaded
- allowing direct access to each server in addition to access
  via the cluster service
- some sort of connection persistence, perhaps only for HTTP access?
  (ftp and rsync can do nicely without)
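
To make the list above a bit more concrete, here is a rough sketch of the
selection logic such a balancer applies. It is not how any particular L4
switch is implemented; the names (Server, pick_server, route, load_weight,
the "s1"/"s2" hosts) are all made up for illustration, and a real switch
does this in hardware or in the kernel rather than in Python.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    name: str                                    # hypothetical host label
    load: float = 0.0                            # load average reported by monitoring
    enabled: set = field(default_factory=set)    # protocols this node currently serves
    active: dict = field(default_factory=dict)   # protocol -> open connection count


def pick_server(servers, protocol, load_weight=0.1):
    """Pick the least-busy server that still offers `protocol`, or None."""
    candidates = [s for s in servers if protocol in s.enabled]
    if not candidates:
        return None
    # Each protocol is balanced on its own connection count; the machine's
    # overall load biases the choice away from heavily loaded nodes.
    # load_weight is an arbitrary illustrative tuning knob.
    best = min(candidates,
               key=lambda s: s.active.get(protocol, 0) + load_weight * s.load)
    best.active[protocol] = best.active.get(protocol, 0) + 1
    return best


def route(servers, sessions, client_ip, protocol):
    """Route one new connection; only HTTP is kept persistent per client."""
    if protocol == "http":
        previous = sessions.get(client_ip)
        for s in servers:
            if s.name == previous and "http" in s.enabled:
                s.active["http"] = s.active.get("http", 0) + 1
                return s
    chosen = pick_server(servers, protocol)
    if chosen is not None and protocol == "http":
        sessions[client_ip] = chosen.name
    return chosen
```

Shutting down rsync on one node is then just removing "rsync" from that
node's enabled set: existing sessions are untouched and new rsync
connections land elsewhere.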
> >Since it's apparent not everyone is aware of what we are doing, I'll
> >mention briefly some of the bigger points.
...
> >- We've cut back on the number of ftp and rsync users to the machines.
> >Basically we are cutting back where we can in an attempt to keep the
> >load from spiraling out of control, this helped a bit when we recently
> >had to take one of the machines down and instead of loads spiking into
> >the 2000+ range we peaked at about 500-600 I believe.
How about having filesystems mounted with "noatime"?
(It avoids an access-time update, i.e. a disk write, on every file read.)
Or do you already do that?
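
A quick way to check is to look at the options column of /proc/mounts.
The sketch below is only an illustration: the function name and the list
of pseudo-filesystems to skip are my own choices, not anything kernel.org
actually runs.

```python
def mounts_without_noatime(mounts_text,
                           skip_types=("proc", "sysfs", "tmpfs", "devpts")):
    """Return mount points whose option list does not contain "noatime".

    `mounts_text` is text in /proc/mounts format:
    device, mount point, filesystem type, options, dump, pass.
    """
    missing = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, fs_type, options = fields[1], fields[2], fields[3]
        if fs_type in skip_types:
            continue  # pseudo-filesystems do not hit the disk
        if "noatime" not in options.split(","):
            missing.append(mount_point)
    return missing


# Typical use on a live machine:
#   print(mounts_without_noatime(open("/proc/mounts").read()))
```

Anything it reports can be switched over by adding noatime to the options
field in /etc/fstab and remounting.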
> >So we know the problem is there, and we are working on it - we are
> >getting e-mails about it if not daily than every other day or so. If
> >there are suggestions we are willing to hear them - but the general
> >feeling with the admins is that we are probably hitting the biggest
> >problems already.
/Matti Aarnio