Message-Id: <20070806132747.4b9cea80.akpm@linux-foundation.org>
Date: Mon, 6 Aug 2007 13:27:47 -0700
From: Andrew Morton <akpm@...ux-foundation.org>
To: Christoph Lameter <clameter@....com>
Cc: Matt Mackall <mpm@...enic.com>,
Daniel Phillips <phillips@...nq.net>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
David Miller <davem@...emloft.net>,
Daniel Phillips <phillips@...gle.com>,
Pekka Enberg <penberg@...helsinki.fi>,
Lee Schermerhorn <Lee.Schermerhorn@...com>,
Steve Dickson <SteveD@...hat.com>
Subject: Re: [PATCH 02/10] mm: system wide ALLOC_NO_WATERMARK
On Mon, 6 Aug 2007 13:19:26 -0700 (PDT) Christoph Lameter <clameter@....com> wrote:
> On Mon, 6 Aug 2007, Matt Mackall wrote:
>
> > > > Because a block device may have deadlocked here, leaving the system
> > > > unable to clean dirty memory, or unable to load executables over the
> > > > network for example.
> > >
> > > So this is a locking problem that has not been taken care of?
> >
> > No.
> >
> > It's very simple:
> >
> > 1) memory becomes full
>
> We do have limits to avoid memory getting too full.
>
> > 2) we try to free memory by paging or swapping
> > 3) I/O requires a memory allocation which fails because memory is full
> > 4) box dies because it's unable to dig itself out of OOM
> >
> > Most I/O paths can deal with this by having a mempool for their I/O
> > needs. For network I/O, this turns out to be prohibitively hard due to
> > the complexity of the stack.
>
> The common solution is to have a reserve (min_free_kbytes). The problem
> with the network stack seems to be that the amount of reserve needed
> cannot be predicted accurately.
>
> The solution may be as simple as configuring the reserves right and
> avoiding the unbounded memory allocations. That would be possible if
> one made sure that the network layer triggers reclaim once in a
> while.
Such a simple fix would be attractive. Some of the net drivers now have
remarkably large rx and tx queues. One wonders if this is playing a part
in the problem and whether reducing the queue sizes would help.
I guess we'd need to reduce the queue size on all NICs in the machine
though, which might be somewhat of a performance problem.
I don't think we've seen a lot of justification for those large queues.
I'm suspecting it's a few percent in carefully-chosen workloads (of the
microbenchmark kind?).