linux-kernel - Re: [PATCH 2/2] mm: page allocator: Do not drain per-cpu lists via IPI from page allocator context

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAOtvUMfmSrotCGn-51SC3eiQU=xK4C_Trh+8FEfTGOJcGUgVag@mail.gmail.com>
Date:	Thu, 12 Jan 2012 16:51:56 +0200
From:	Gilad Ben-Yossef <gilad@...yossef.com>
To:	Mel Gorman <mgorman@...e.de>
Cc:	Linux-MM <linux-mm@...ck.org>,
	Linux-FSDevel <linux-fsdevel@...r.kernel.org>,
	LKML <linux-kernel@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	"Srivatsa S. Bhat" <srivatsa.bhat@...ux.vnet.ibm.com>,
	Russell King - ARM Linux <linux@....linux.org.uk>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Miklos Szeredi <mszeredi@...ell.com>,
	"Eric W. Biederman" <ebiederm@...ssion.com>,
	Greg KH <gregkh@...e.de>, Gong Chen <gong.chen@...el.com>
Subject: Re: [PATCH 2/2] mm: page allocator: Do not drain per-cpu lists via
 IPI from page allocator context

On Wed, Jan 11, 2012 at 12:11 PM, Mel Gorman <mgorman@...e.de> wrote:
<SNIP>
> Rather than making it safe to call get_online_cpus() from the page
> allocator, this patch simply removes the page allocator call to
> drain_all_pages(). To avoid impacting high-order allocation success
> rates, it still drains the local per-cpu lists for high-order
> allocations that failed. As a side effect, this reduces the number
> of IPIs sent during low memory situations.
>
> Signed-off-by: Mel Gorman <mgorman@...e.de>
> ---
>  mm/page_alloc.c |   16 ++++++++++++----
>  1 files changed, 12 insertions(+), 4 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 2b8ba3a..b6df6fc 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1119,7 +1119,9 @@ void drain_local_pages(void *arg)
>  */
>  void drain_all_pages(void)
>  {
> +       get_online_cpus();
>        on_each_cpu(drain_local_pages, NULL, 1);
> +       put_online_cpus();
>  }
>
>  #ifdef CONFIG_HIBERNATION
> @@ -1982,11 +1984,17 @@ retry:
>                                        migratetype);
>
>        /*
> -        * If an allocation failed after direct reclaim, it could be because
> -        * pages are pinned on the per-cpu lists. Drain them and try again
> +        * If a high-order allocation failed after direct reclaim, there is a
> +        * possibility that it is because the necessary buddies have been
> +        * freed to the per-cpu list. Drain the local list and try again.
> +        * drain_all_pages is not used because it is unsafe to call
> +        * get_online_cpus from this context as it is possible that kthreadd
> +        * would block during thread creation and the cost of sending storms
> +        * of IPIs in low memory conditions is quite high.
>         */
> -       if (!page && !drained) {
> -               drain_all_pages();
> +       if (!page && order && !drained) {
> +               drain_pages(get_cpu());
> +               put_cpu();
>                drained = true;
>                goto retry;
>        }
> --
> 1.7.3.4
>

I very much like the judo like quality of relying on the fact that in
memory pressure conditions most
of the cpus will end up in the direct reclaim path to drain them all
without IPIs.

What I can't figure out is why we don't need  get/put_online_cpus()
pair around each and every call
to on_each_cpu everywhere? and if we do, perhaps making it a part of
on_each_cpu is the way to go?

Something like:

diff --git a/kernel/smp.c b/kernel/smp.c
index f66a1b2..cfa3882 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -691,11 +691,15 @@ void on_each_cpu(void (*func) (void *info), void
*info, int wait)
 {
 	unsigned long flags;

+	BUG_ON(in_atomic());
+
+	get_online_cpus();
 	preempt_disable();
 	smp_call_function(func, info, wait);
 	local_irq_save(flags);
 	func(info);
 	local_irq_restore(flags);
 	preempt_enable();
+	put_online_cpus();
 }
 EXPORT_SYMBOL(on_each_cpu);

Does that makes?

-- 
Gilad Ben-Yossef
Chief Coffee Drinker
gilad@...yossef.com
Israel Cell: +972-52-8260388
US Cell: +1-973-8260388
http://benyossef.com

"Unfortunately, cache misses are an equal opportunity pain provider."
-- Mike Galbraith, LKML
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/