Message-ID: <Pine.LNX.4.64.0804072121320.28744@blonde.site>
Date: Mon, 7 Apr 2008 21:31:54 +0100 (BST)
From: Hugh Dickins <hugh@...itas.com>
To: Peter Zijlstra <a.p.zijlstra@...llo.nl>
cc: Christoph Lameter <clameter@....com>,
James Bottomley <James.Bottomley@...senPartnership.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Andrew Morton <akpm@...ux-foundation.org>,
FUJITA Tomonori <fujita.tomonori@....ntt.co.jp>,
Jens Axboe <jens.axboe@...cle.com>,
Pekka Enberg <penberg@...helsinki.fi>,
"Rafael J. Wysocki" <rjw@...k.pl>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] scsi: fix sense_slab/bio swapping livelock

On Mon, 7 Apr 2008, Peter Zijlstra wrote:
> On Mon, 2008-04-07 at 20:40 +0100, Hugh Dickins wrote:
> >
> > My supposition is that once a page has been allocated from __GFP_HIGH
> > reserves to a scsi sense_slab, swap_writepages are liable to gobble up
> > the rest of the page with bio allocations which they wouldn't have had
> > access to traditionally (i.e. under SLAB).
> >
> > So an unexpected behaviour emerges from SLUB's slab merging.
>
> Somewhere along the line of my swap over network patches I
> 'robustified' SLAB to ensure these sorts of things could not happen - it
> came at a cost though.
>
> It would basically fail[*] allocations that had a higher low watermark
> than what was used to allocate the current slab.
>
> [*] - well, it would attempt to allocate a new slab to raise the current
> watermark, but failing that it would fail the allocation.

Thanks, Peter: that sounds just right to me; but a larger change than
we'd want to jump into for this one particular issue - it might have
its own unexpected consequences.

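
To make the behaviour Peter describes concrete, here is a minimal userspace
sketch of a watermark-aware slab. It is only an illustration under stated
assumptions, not the code from his patches: the names toy_zone, toy_slab,
toy_slab_alloc and OBJECTS_PER_PAGE are hypothetical, and a "watermark" is
modelled simply as the number of pages that must stay free after the
allocation (a lower value means a more privileged caller, such as __GFP_HIGH).

```c
#include <stdbool.h>
#include <stddef.h>

#define OBJECTS_PER_PAGE 8   /* hypothetical: objects carved from one page */

/* Toy page allocator state: just a count of free pages. */
struct toy_zone {
    size_t free_pages;
};

/* Toy slab: remembers the watermark under which its current page was got. */
struct toy_slab {
    size_t watermark;
    size_t free_objects;
};

/* A page is granted only if more than 'watermark' pages are free. */
static bool toy_page_alloc(struct toy_zone *zone, size_t watermark)
{
    if (zone->free_pages <= watermark)
        return false;
    zone->free_pages--;
    return true;
}

/*
 * Get a fresh page under the caller's own watermark. For brevity this toy
 * forgets the objects left on the old page; a real allocator would keep them.
 */
static bool toy_slab_refill(struct toy_zone *zone, struct toy_slab *slab,
                            size_t watermark)
{
    if (!toy_page_alloc(zone, watermark))
        return false;
    slab->watermark = watermark;
    slab->free_objects = OBJECTS_PER_PAGE;
    return true;
}

/*
 * A caller whose watermark is higher than the one used for the current page
 * may not take objects from it. It first tries to get a new page under its
 * own watermark; if that fails, the allocation fails.
 */
bool toy_slab_alloc(struct toy_zone *zone, struct toy_slab *slab,
                    size_t watermark)
{
    if (slab->free_objects == 0 || watermark > slab->watermark) {
        if (!toy_slab_refill(zone, slab, watermark))
            return false;
    }
    slab->free_objects--;
    return true;
}
```

In this model a sense buffer allocated from reserves (watermark 0) leaves
seven objects on its page, but an ordinary bio-style request (say watermark 4)
cannot take them while memory is that tight, which is the gobbling effect the
check is meant to prevent.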
> > If we had a SLAB_NOMERGE flag, would we want to apply it to the
> > bio cache or to the scsi_sense_cache or to both? My difficulty
> > in answering that makes me wonder whether such a flag is right.
>
> If this is critical to avoid memory deadlocks, I would suggest using
> mempools (or my reserve framework).

No, the critical part of it has been dealt with (a small fix to the scsi
free_list handling, which resembles a mempool but is done its own way).
What remains is about "unsightly" behaviour: the system having a
tendency to collapse briefly into far-from-efficient operation
when out of memory.

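
For readers unfamiliar with the mempool idea mentioned above, the following is
a rough userspace sketch of the general pattern: keep a few preallocated
elements in reserve, try the normal allocator first, and fall back to the
reserve only when that fails. It is not the scsi free_list code or the kernel
mempool implementation; toy_pool, TOY_POOL_MIN and the simulate_oom hook are
hypothetical names used for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_POOL_MIN 2   /* hypothetical: elements held back as a reserve */

struct toy_pool {
    void *reserve[TOY_POOL_MIN];
    size_t nr_reserved;
    size_t elem_size;
    bool simulate_oom;   /* test hook: pretend the normal allocator fails */
};

/* Release whatever is still held in the reserve. */
void toy_pool_destroy(struct toy_pool *pool)
{
    while (pool->nr_reserved > 0)
        free(pool->reserve[--pool->nr_reserved]);
}

/* Preallocate the reserve up front, while memory is still available. */
bool toy_pool_init(struct toy_pool *pool, size_t elem_size)
{
    pool->nr_reserved = 0;
    pool->elem_size = elem_size;
    pool->simulate_oom = false;
    while (pool->nr_reserved < TOY_POOL_MIN) {
        void *elem = malloc(elem_size);
        if (!elem) {
            toy_pool_destroy(pool);
            return false;
        }
        pool->reserve[pool->nr_reserved++] = elem;
    }
    return true;
}

/* Try the normal allocator first; dip into the reserve only on failure. */
void *toy_pool_alloc(struct toy_pool *pool)
{
    void *elem = pool->simulate_oom ? NULL : malloc(pool->elem_size);
    if (elem)
        return elem;
    if (pool->nr_reserved > 0)
        return pool->reserve[--pool->nr_reserved];
    return NULL;   /* reserve exhausted: the caller must wait or fail */
}

/* Freed elements top the reserve back up before going to the allocator. */
void toy_pool_free(struct toy_pool *pool, void *elem)
{
    if (pool->nr_reserved < TOY_POOL_MIN)
        pool->reserve[pool->nr_reserved++] = elem;
    else
        free(elem);
}
```

The reserve guarantees forward progress for a bounded number of in-flight
requests, which addresses the deadlock; it does nothing about how efficiently
the system runs while it is living off that reserve.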
Hugh