lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <1343420271-3825-1-git-send-email-js1304@gmail.com>
Date:	Sat, 28 Jul 2012 05:17:51 +0900
From:	Joonsoo Kim <js1304@...il.com>
To:	Pekka Enberg <penberg@...nel.org>
Cc:	linux-kernel@...r.kernel.org, linux-mm@...ck.org,
	Joonsoo Kim <js1304@...il.com>,
	Christoph Lameter <cl@...ux.com>
Subject: [PATCH] slub: remove one code path and reduce lock contention in __slab_free()

When we try to free object, there is some of case that we need
to take a node lock. This is the necessary step for preventing a race.
After taking a lock, then we try to cmpxchg_double_slab().
But, there is a possible scenario that cmpxchg_double_slab() is failed.
Following example explains this.

CPU A               CPU B
need lock
...                 need lock
...                 lock!!
lock..but spin      free success
spin...             unlock
lock!!
free fail

In this case, retry with taking a lock is occured in CPU A.

I think that in this case,
"release a lock first, and re-take a lock if necessary" is preferable way.
There are two reasons for this.
First, this makes __slab_free()'s logic somehow simple.
Second, it may reduce lock contention.
When we do retrying, status of slab is already changed,
so we don't need a lock anymore in almost every case.
"release a lock first, and re-take a lock if necessary" policy is
helpful to this.

Signed-off-by: Joonsoo Kim <js1304@...il.com>
Cc: Christoph Lameter <cl@...ux.com>
---
This is v2 of "slub: release a lock if freeing object with a lock is failed in __slab_free()"
Subject and commit log are changed from v1.
Code is same as v1.

diff --git a/mm/slub.c b/mm/slub.c
index ca778e5..efce427 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2421,7 +2421,6 @@ static void __slab_free(struct kmem_cache *s, struct page *page,
 	void *prior;
 	void **object = (void *)x;
 	int was_frozen;
-	int inuse;
 	struct page new;
 	unsigned long counters;
 	struct kmem_cache_node *n = NULL;
@@ -2433,13 +2432,17 @@ static void __slab_free(struct kmem_cache *s, struct page *page,
 		return;
 
 	do {
+		if (unlikely(n)) {
+			spin_unlock_irqrestore(&n->list_lock, flags);
+			n = NULL;
+		}
 		prior = page->freelist;
 		counters = page->counters;
 		set_freepointer(s, object, prior);
 		new.counters = counters;
 		was_frozen = new.frozen;
 		new.inuse--;
-		if ((!new.inuse || !prior) && !was_frozen && !n) {
+		if ((!new.inuse || !prior) && !was_frozen) {
 
 			if (!kmem_cache_debug(s) && !prior)
 
@@ -2464,7 +2467,6 @@ static void __slab_free(struct kmem_cache *s, struct page *page,
 
 			}
 		}
-		inuse = new.inuse;
 
 	} while (!cmpxchg_double_slab(s, page,
 		prior, counters,
@@ -2490,25 +2492,17 @@ static void __slab_free(struct kmem_cache *s, struct page *page,
                 return;
         }
 
+	if (unlikely(!new.inuse && n->nr_partial > s->min_partial))
+		goto slab_empty;
+
 	/*
-	 * was_frozen may have been set after we acquired the list_lock in
-	 * an earlier loop. So we need to check it here again.
+	 * Objects left in the slab. If it was not on the partial list before
+	 * then add it.
 	 */
-	if (was_frozen)
-		stat(s, FREE_FROZEN);
-	else {
-		if (unlikely(!inuse && n->nr_partial > s->min_partial))
-                        goto slab_empty;
-
-		/*
-		 * Objects left in the slab. If it was not on the partial list before
-		 * then add it.
-		 */
-		if (unlikely(!prior)) {
-			remove_full(s, page);
-			add_partial(n, page, DEACTIVATE_TO_TAIL);
-			stat(s, FREE_ADD_PARTIAL);
-		}
+	if (kmem_cache_debug(s) && unlikely(!prior)) {
+		remove_full(s, page);
+		add_partial(n, page, DEACTIVATE_TO_TAIL);
+		stat(s, FREE_ADD_PARTIAL);
 	}
 	spin_unlock_irqrestore(&n->list_lock, flags);
 	return;
-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ