Message-ID: <4BE3503A.2000309@google.com>
Date: Thu, 06 May 2010 16:26:50 -0700
From: Mike Waychison <mikew@...gle.com>
To: Michel Lespinasse <walken@...gle.com>
CC: David Howells <dhowells@...hat.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Linux-MM <linux-mm@...ck.org>, Ying Han <yinghan@...gle.com>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: rwsem: down_read_unfair() proposal
Michel Lespinasse wrote:
> On Wed, May 05, 2010 at 11:03:40AM +0100, David Howells wrote:
>> If the system is as heavily loaded as you say, how do you prevent
>> writer starvation? Or do things just grind along until sufficient
>> threads are queued waiting for a write lock?
>
> Reader/Writer fairness is not disabled in the general case - it only is
> for a few specific readers such as /proc/<pid>/maps. In particular, the
> do_page_fault path, which holds a read lock on mmap_sem for potentially long
> (~disk latency) periods of time, still uses a fair down_read() call.
> In comparison, the /proc/<pid>/maps path which we made unfair does not
> normally hold the mmap_sem for very long (it does not end up hitting disk);
> so it's been working out well for us in practice.
>
FWIW, these sorts of block-ups are usually really pronounced on machines
with hard drives that take _forever_ to respond to SMART commands (which
are done via PIO, and which can serialize many drives when they are
hidden behind a port multiplier). We've seen cases where hard faults
can take unusually long on an otherwise non-busy machine (~10 seconds?).

The other case where we have problems with mmap_sem, from a cluster
monitoring perspective, occurs when we get blocked up behind a task that
is having problems dying from OOM. We have a variety of hacks used
internally to cover these cases, though I think we (David and I?) figured
that it'd make more sense to fix the dependencies on
down_read(&current->mmap_sem) in the do_exit() path. For instance, it
really makes no sense to coredump when we are being OOM killed (and thus
we should be able to skip the mmap_sem dependency there).

Mike Waychison
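
For readers following the thread, the fair vs. unfair reader policy
discussed above can be sketched roughly as below. This is a toy,
single-threaded userspace model and not the kernel's rwsem code or the
actual patch: struct toy_rwsem and the toy_* functions are hypothetical
names, and the assumption that an unfair reader may bypass queued (but not
active) writers is taken from the description in this thread.

```c
/* Toy userspace model, NOT kernel code: all names here are hypothetical. */
#include <assert.h>
#include <stdbool.h>

struct toy_rwsem {
	int  active_readers;   /* readers currently holding the lock */
	bool writer_active;    /* a writer currently holds the lock */
	int  waiting_writers;  /* writers queued behind the current holders */
};

/* Fair policy: a new reader must wait behind any queued writer. */
static bool toy_read_trylock_fair(struct toy_rwsem *sem)
{
	if (sem->writer_active || sem->waiting_writers > 0)
		return false;
	sem->active_readers++;
	return true;
}

/* Unfair policy (assumed): only an active writer blocks a new reader;
 * queued writers are bypassed. */
static bool toy_read_trylock_unfair(struct toy_rwsem *sem)
{
	if (sem->writer_active)
		return false;
	sem->active_readers++;
	return true;
}

/* Drop one read hold. */
static void toy_read_unlock(struct toy_rwsem *sem)
{
	assert(sem->active_readers > 0);
	sem->active_readers--;
}
```

With one slow reader holding the lock (say, a page fault waiting on disk)
and one writer queued behind it, the fair variant refuses a new reader
while the unfair variant admits it. That is roughly the situation a
monitoring read of /proc/<pid>/maps would be in.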