[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <200712061204.14854.phillips@phunq.net>
Date: Thu, 6 Dec 2007 12:04:14 -0800
From: Daniel Phillips <phillips@...nq.net>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: linux-kernel@...r.kernel.org, Peter Zijlstra <peterz@...radead.org>
Subject: Re: [RFC] [PATCH] A clean approach to writeout throttling
On Thursday 06 December 2007 03:55, Andrew Morton wrote:
> Consider an example.
>
> - We a-priori decide to limit a particular stack's peak memory usage
> to 1MB
Ah, this is the piece I was missing.
> - We empirically discover that the maximum amount of memory which is
> allocated by that stack on behalf of a single BIO is 16kb. (ie:
> that's the most it has ever used for a single BIO).
>
> - Now, we refuse to feed any more BIOs into the stack when its
> instantaneous memory usage exceeds (1MB - 16kb).
And printk a warning, because the programmer's static analysis was
wrong.
> Of course, the _average_ memory-per-BIO is much less than 16kb. So
> there are a lot of BIOs in flight - probably hundreds, but a minimum
> of 63.
And progress is being made, everybody is happy.
> There is a teeny so-small-it-doesn't-matter chance that the stack
> will exceed the 1MB limit. If it happens to be at its (1MB-16kb)
> limit and all the memory in the machine is AWOL and then someone
> throws a never-seen-before twirly BIO at it. Not worth worrying
> about, surely.
OK, I see where you are going. The programmer statically determines a
total reservation for the stack, which is big enough that progress is
guaranteed. We then throttle based on actual memory consumption, and
essentially use a heuristic to decide when we are near the upper limit.
Workable I think, but...
The main idea, current->pages_used_for_block_allocations++ is valid only
in direct call context. If a daemon needs to allocate memory on behalf
of the IO transfer (not unusual) it won't get accounted, which is
actually the central issue in this whole class of deadlocks. Any idea
how to extend the accounting idea to all tasks involved in a particular
block device stack?
Regards,
Daniel
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists