Message-Id: <20080128012718.65b7889a.akpm@linux-foundation.org>
Date: Mon, 28 Jan 2008 01:27:18 -0800
From: Andrew Morton <akpm@...ux-foundation.org>
To: Andi Kleen <ak@...e.de>
Cc: linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH] Only print kernel debug information for OOMs caused by kernel allocations
On Mon, 28 Jan 2008 10:11:57 +0100 Andi Kleen <ak@...e.de> wrote:
> On Monday 28 January 2008 09:56, Andrew Morton wrote:
> > On Mon, 28 Jan 2008 07:10:07 +0100 Andi Kleen <ak@...e.de> wrote:
> > > On Monday 28 January 2008 06:52, Andrew Morton wrote:
> > > > On Wed, 16 Jan 2008 23:24:21 +0100 Andi Kleen <ak@...e.de> wrote:
> > > > > I recently suffered a 20+ minute OOM episode on my desktop, with the
> > > > > disk thrashing to death and the computer completely unresponsive,
> > > > > when some user program decided to grab all memory. It eventually
> > > > > recovered, but left lots of ugly and IMHO misleading messages in the
> > > > > kernel log. Here's a minor improvement.
> > >
> > > As a follow-up: this was with swap over dm-crypt. I've recently heard
> > > about other people having trouble with this too, so this setup seems to
> > > trigger something bad in the VM.
> >
> > Where's the backtrace and show_mem() output? :)
>
> I don't have it anymore. You want me to reproduce it? I don't think
> I saw messages from the other people either; just heard complaints.

May as well - it doesn't sound like it'll fix itself...
> > > > That information is useful for working out why a userspace allocation
> > > > attempt failed. If we don't print it, and the application gets killed
> > > > and thus frees a lot of memory, we will just never know why the
> > > > allocation failed.
> > >
> > > But it's basically only either page fault (direct or indirect) or write
> > > et al. that do these page cache allocations. Do you really think it is
> > > that important to distinguish these cases individually? In 95+% of all
> > > cases it should be a standard user page fault which always has the same
> > > backtrace.
> >
> > Sure, the backtrace isn't very important. The show_mem() output is vital.
>
> I see. So would the patch be acceptable if it only disabled the backtrace?

Spose so. The show_mem() spew is probably larger than the backtrace
though.

Are you sure we aren't doing dump_stack()/show_mem() multiple times for a
single process? If we are, that would mean the TIF_MEMDIE thing broke.
It must have been one heck of an oomkilling slaughter.

> > Plus an additional function call. On the already-deep page allocation
> > path, I might add.
>
> The function call is already there if the kernel has CPUSETs enabled.

s/CPUSETS/NUMA/, which makes rather a difference.

> And that is what distribution kernels usually do. And most users
> use distribution kernels or distribution .config.
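
For reference, the behaviour being discussed above (report once per
victim, keep show_mem(), and print the backtrace only for kernel
allocations) can be modelled in a few lines. This is a toy userspace
sketch, not kernel code: Task, OomLog, oom_report() and the memdie field
are hypothetical names, with memdie merely standing in for the
TIF_MEMDIE flag mentioned in the thread.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    memdie: bool = False  # hypothetical stand-in for the TIF_MEMDIE flag


@dataclass
class OomLog:
    lines: list = field(default_factory=list)


def oom_report(task, kernel_alloc, log):
    """Toy model: emit diagnostics at most once per victim task.

    Returns True if a report was emitted, False if the task had
    already been marked and the report was suppressed.
    """
    if task.memdie:
        # Victim already chosen earlier: a second dump for the same
        # process would be the "multiple times" case asked about above.
        return False
    task.memdie = True
    # Memory state is always reported (the show_mem() part).
    log.lines.append(f"show_mem for {task.name}")
    if kernel_alloc:
        # Backtrace (the dump_stack() part) only for kernel allocations.
        log.lines.append(f"dump_stack for {task.name}")
    return True
```

In this model a user page fault that triggers the report twice for the
same task logs one show_mem line and nothing else, which is the outcome
the "only disable the backtrace" variant of the patch would aim for.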