lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAAmzW4MakOx_SA6eA=BYP4rq76np4yOS6RANpNObQP6dDDqYQg@mail.gmail.com>
Date:	Wed, 5 Jun 2013 07:21:14 +0900
From:	JoonSoo Kim <js1304@...il.com>
To:	Christoph Lameter <cl@...ux.com>
Cc:	Steven Rostedt <rostedt@...dmis.org>,
	Joonsoo Kim <iamjoonsoo.kim@....com>,
	LKML <linux-kernel@...r.kernel.org>,
	RT <linux-rt-users@...r.kernel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Clark Williams <clark@...hat.com>,
	Pekka Enberg <penberg@...nel.org>
Subject: Re: [RT LATENCY] 249 microsecond latency caused by slub's
 unfreeze_partials() code.

2013/6/4 Christoph Lameter <cl@...ux.com>:
> On Tue, 4 Jun 2013, JoonSoo Kim wrote:
>
>> And I re-read Steven initial problem report in RT kernel and find that
>> unfreeze_partial() do lock and unlock several times. This means that
>> each page in cpu partial list doesn't come from same node. Why do we
>> add other node's slab to this cpu partial list? This is also not good
>> for general case. How about considering node affinity in __slab_free()?
>> IMHO, if we implement this, Steven's problem can be solved.
>
> We may need the other nodes pages if we consistently allocate from there.
> __slab_alloc() ensures that only pages from the correct node are used. It
> will drop pages that do not come from the proper nodes.
>
> Filtering in __slab_free would mean that we cannot improve performance on
> remote frees.
>

Sorry for poor explanation.
I knew that we need the other nodes pages if we consistently allocate
from there.
In allocation path, it is reasonable.
But, in free path, it is not reasonable that we put other nodes' page
to cpu partial list.
In this case, we don't need other nodes pages, because there is
no allocation request for other nodes. Putting other node's page to
cpu partial list and
using this other node's page for normal allocation request(it doesn't
specify node affinity)
makes system performance degradation. Below is quick implementation of
this idea.

---------------------->8-------------------
diff --git a/mm/slub.c b/mm/slub.c
index 57707f0..7f614f4 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2495,7 +2495,8 @@ static void __slab_free(struct kmem_cache *s,
struct page *page,
                new.inuse--;
                if ((!new.inuse || !prior) && !was_frozen) {

-                       if (!kmem_cache_debug(s) && !prior)
+                       if (!kmem_cache_debug(s) && !prior &&
+                               node_match(page,
cpu_to_node(smp_processor_id())))

                                /*
                                 * Slab was on no list before and will
be partially empty
@@ -2550,8 +2551,9 @@ static void __slab_free(struct kmem_cache *s,
struct page *page,
         * Objects left in the slab. If it was not on the partial list before
         * then add it.
         */
-       if (kmem_cache_debug(s) && unlikely(!prior)) {
-               remove_full(s, page);
+       if (!prior) {
+               if (kmem_cache_debug(s))
+                       remove_full(s, page);
                add_partial(n, page, DEACTIVATE_TO_TAIL);
                stat(s, FREE_ADD_PARTIAL);
        }
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ