linux-kernel - Re: [PATCH] trace: Set oom_score_adj to maximum for ring buffer allocating process

Message-ID: <1306453118.3857.20.camel@gandalf.stny.rr.com>
Date:	Thu, 26 May 2011 19:38:38 -0400
From:	Steven Rostedt <rostedt@...dmis.org>
To:	Vaibhav Nagarnaik <vnagarnaik@...gle.com>
Cc:	David Rientjes <rientjes@...gle.com>,
	Ingo Molnar <mingo@...hat.com>,
	Frederic Weisbecker <fweisbec@...il.com>,
	Michael Rubin <mrubin@...gle.com>,
	David Sharp <dhsharp@...gle.com>, linux-kernel@...r.kernel.org,
	Peter Zijlstra <peterz@...radead.org>,
	Mel Gorman <mel@....ul.ie>, Rik Van Riel <riel@...hat.com>,
	David Rientjes <rientjes@...gle.com>,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH] trace: Set oom_score_adj to maximum for ring buffer
 allocating process

[ I added people to the Cc who understand MM better than I do ]

On Thu, 2011-05-26 at 15:28 -0700, Vaibhav Nagarnaik wrote:
> On Thu, May 26, 2011 at 2:00 PM, Steven Rostedt <rostedt@...dmis.org> wrote:
> > But the issue is, if the process increasing the size of the ring buffer
> > causes the oom, it will not handle the SIGKILL until after the ring
> > buffer has finished allocating. Now, if it failed to allocate, then we
> > are fine, but if it does not fail, but now we start killing processes,
> > then we may be in trouble.
> >
> 
> If I understand correctly, if a fatal signal is pending on a process
> while allocation is called, the allocation fails. Then we handle the
> freeing up memory correctly, though the echo gets killed once we return
> from the allocation process.
> 
> > I like the NORETRY better. But then, would this mean that if we have a
> > lot of cached filesystems, we won't be able to extend the ring buffer?
> 
> It doesn't seem so. I talked with the mm team and I understand that
> even if NORETRY is set, cached pages will be flushed out and allocation
> will succeed. But it still does not address the situation when the ring
> buffer allocation is going on and another process invokes OOM. If the
> oom_score_adj is not set to maximum, then random processes will still be
> killed before ring buffer allocation fails.
> 
> >
> > I'm thinking the oom killer used here got lucky. As it killed this task,
> > we were still out of memory, and the ring buffer failed to get the
> > memory it needed and freed up everything that it previously allocated,
> > and returned. Then the process calling this function would be killed by
> > the OOM. Ideally, the process shouldn't be killed and the ring buffer
> > just returned -ENOMEM to the user.
> 
> What do you think of this?
> 
> test_set_oom_score_adj(MAXIMUM);
> allocate_ring_buffer(GFP_KERNEL | __GFP_NORETRY);
> test_set_oom_score_adj(original);
> 
> This makes sure that the allocation fails much sooner and more
> gracefully. If oom-killer is invoked in any circumstance, then the ring
> buffer allocation process gives up memory and is killed.

I don't know. But as I had never seen this function before, I went and took
a look. This test_set_oom_score_adj() is new, and coincidentally written
by another Google developer ;)

As there's not really a precedent for this, if those that I added to the
Cc give their acks, I'm happy to apply this for the next merge window.

-- Steve
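
The save/raise/allocate/restore pattern from the quoted pseudocode can be
illustrated with a rough userspace analogue. This is only a sketch under
stated assumptions, not the kernel patch: it writes to an oom_score_adj-style
file (on a real system, /proc/self/oom_score_adj) instead of calling the
in-kernel test_set_oom_score_adj(), and it uses malloc() as a stand-in for a
GFP_KERNEL | __GFP_NORETRY allocation, which has no direct userspace
equivalent. The helper names read_adj(), write_adj() and
alloc_as_oom_victim() are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Maximum oom_score_adj value accepted by the kernel. */
#define OOM_SCORE_ADJ_MAX 1000

/* Read the integer stored in an oom_score_adj-style file; 0 on success. */
int read_adj(const char *path, int *value)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = (fscanf(f, "%d", value) == 1);
	fclose(f);
	return ok ? 0 : -1;
}

/* Write an integer to an oom_score_adj-style file; 0 on success. */
int write_adj(const char *path, int value)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;
	ok = (fprintf(f, "%d\n", value) > 0);
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}

/*
 * Hypothetical helper: make the caller the preferred OOM victim for the
 * duration of one allocation, then restore the previous score.
 * Returns the buffer, or NULL if the allocation or any score update fails.
 */
void *alloc_as_oom_victim(const char *adj_path, size_t size)
{
	int original;
	void *buf;

	assert(adj_path != NULL);

	if (read_adj(adj_path, &original) != 0)
		return NULL;
	if (write_adj(adj_path, OOM_SCORE_ADJ_MAX) != 0)
		return NULL;

	/* Stand-in for the ring buffer allocation (assumption, see above). */
	buf = malloc(size);

	/* Restore the saved score whether or not the allocation succeeded. */
	if (write_adj(adj_path, original) != 0) {
		free(buf);
		return NULL;
	}
	return buf;
}
```

A caller would pass "/proc/self/oom_score_adj" as adj_path. Depending on the
kernel version, writing back a lower value may require CAP_SYS_RESOURCE, so
the restore step can fail for an unprivileged process.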
 
