Date:   Wed, 4 Apr 2018 10:31:11 -0400
From:   Steven Rostedt <rostedt@...dmis.org>
To:     Michal Hocko <mhocko@...nel.org>
Cc:     Zhaoyang Huang <huangzhaoyang@...il.com>,
        Ingo Molnar <mingo@...nel.org>, linux-kernel@...r.kernel.org,
        kernel-patch-test@...ts.linaro.org,
        Andrew Morton <akpm@...ux-foundation.org>,
        Joel Fernandes <joelaf@...gle.com>, linux-mm@...ck.org,
        Vlastimil Babka <vbabka@...e.cz>
Subject: Re: [PATCH v1] kernel/trace:check the val against the available mem

On Wed, 4 Apr 2018 16:23:29 +0200
Michal Hocko <mhocko@...nel.org> wrote:

> On Wed 04-04-18 10:11:49, Steven Rostedt wrote:
> > On Wed, 4 Apr 2018 08:23:40 +0200
> > Michal Hocko <mhocko@...nel.org> wrote:
> >   
> > > If you are afraid of that then you can have a look at {set,clear}_current_oom_origin()
> > > which will automatically select the current process as an oom victim and
> > > kill it.  
> > 
> > Would it even receive the signal? Does alloc_pages_node() even respond
> > to signals? Because the OOM happens while the allocation loop is
> > running.  
> 
> Well, you would need to do something like:
> 
> > 
> > I tried it out, I did the following:
> > 
> > 	set_current_oom_origin();
> > 	for (i = 0; i < nr_pages; i++) {
> > 		struct page *page;
> > 		/*
> > 		 * __GFP_RETRY_MAYFAIL flag makes sure that the allocation fails
> > 		 * gracefully without invoking oom-killer and the system is not
> > 		 * destabilized.
> > 		 */
> > 		bpage = kzalloc_node(ALIGN(sizeof(*bpage), cache_line_size()),
> > 				    GFP_KERNEL | __GFP_RETRY_MAYFAIL,
> > 				    cpu_to_node(cpu));
> > 		if (!bpage)
> > 			goto free_pages;
> > 
> > 		list_add(&bpage->list, pages);
> > 
> > 		page = alloc_pages_node(cpu_to_node(cpu),
> > 					GFP_KERNEL | __GFP_RETRY_MAYFAIL, 0);
> > 		if (!page)
> > 			goto free_pages;  
> 
> 		if (fatal_signal_pending(current))
> 			goto free_pages;

But wouldn't page be NULL in this case?

> 
> > 		bpage->page = page_address(page);
> > 		rb_init_page(bpage->page);
> > 	}
> > 	clear_current_oom_origin();  
> 
> If you use __GFP_RETRY_MAYFAIL it would have to be somebody else to
> trigger the OOM killer and this user context would get killed. If you
> drop __GFP_RETRY_MAYFAIL it would be this context to trigger the OOM but
> it would still be the selected victim.

Then we are guaranteed to kill the process instead of just returning
-ENOMEM, which would change the user space ABI, and that is a NO NO.

Ideally, we want to avoid an OOM. I could add the above as well, when
si_mem_available() returns something that is greater than what is
available, and at least this is the process that will get the OOM if it
fails to allocate.

Would that work for you?

-- Steve
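
For readers following the thread, the combined approach discussed above (reject
oversized requests up front with -ENOMEM, and mark the caller as the OOM origin
in case the estimate turns out to be stale) can be sketched as a small
user-space model. This is only an illustration under stated assumptions, not
kernel code and not the patch itself: every model_* name is hypothetical, the
two counters stand in for si_mem_available() and the memory that is really
free, and the boolean stands in for {set,clear}_current_oom_origin().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model state; none of this is real kernel API. */
static long model_estimated_pages = 8; /* what the estimate reports */
static long model_actual_pages = 8;    /* what can really be allocated */
static bool model_oom_origin;          /* models the oom-origin mark */

/* Models an allocation that fails gracefully (like __GFP_RETRY_MAYFAIL). */
static void *model_alloc_page(void)
{
	void *p;

	if (model_actual_pages <= 0)
		return NULL;
	p = malloc(64);
	if (p)
		model_actual_pages--;
	return p;
}

static void model_free_page(void *p)
{
	free(p);
	model_actual_pages++;
}

/* Returns 0 on success or -ENOMEM; never leaves partial allocations behind. */
int model_allocate_pages(long nr_pages, void **pages)
{
	long i;
	int ret = 0;

	assert(pages != NULL);

	/* Up-front check: fail early instead of risking an OOM. */
	if (nr_pages > model_estimated_pages)
		return -ENOMEM;

	/* The estimate can be stale, so make this caller the preferred victim. */
	model_oom_origin = true;
	for (i = 0; i < nr_pages; i++) {
		pages[i] = model_alloc_page();
		if (!pages[i]) {
			/* Unwind everything allocated so far. */
			while (i-- > 0)
				model_free_page(pages[i]);
			ret = -ENOMEM;
			break;
		}
	}
	model_oom_origin = false;
	return ret;
}
```

In this model a request larger than the estimate is refused before anything is
allocated, while a request that passes the check but still runs out of memory
is unwound and also reported as -ENOMEM. What the sketch cannot show is the
point debated above: in the kernel, the marked task may be killed by the OOM
killer rather than seeing an error return.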
