Message-ID: <e3470ab357b48bccfbd1f5133b982178a7d2befb.camel@gmx.de>
Date:   Mon, 26 Jul 2021 19:00:46 +0200
From:   Mike Galbraith <efault@....de>
To:     Vlastimil Babka <vbabka@...e.cz>,
        Thomas Gleixner <tglx@...utronix.de>,
        LKML <linux-kernel@...r.kernel.org>
Cc:     linux-rt-users@...r.kernel.org,
        Mel Gorman <mgorman@...hsingularity.net>,
        Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Subject: Re: [rfc/patch] mm/slub: restore/expand unfreeze_partials() local
 exclusion scope

On Sun, 2021-07-25 at 21:34 +0200, Vlastimil Babka wrote:
> On 7/25/21 9:12 PM, Vlastimil Babka wrote:
> > +       /*
> > +        * On !RT we just want to disable preemption, on RT we need the
> > +        * lock for real. This happens to match local_lock() semantics.
> > +        */
> > +       local_lock(&s->cpu_slab->lock);
>
> OK I realized (and tglx confirmed) that this will blow up on !RT +
> lockdep if interrupted by ___slab_alloc() that will do
> local_lock_irqsave(). So back to #ifdefs it is. But should work as-is
> for RT testing.
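
To make that failure mode concrete for anyone following along at home:
on !RT, local_lock() is just preempt_disable() plus a lockdep
acquisition with interrupts left enabled, so the nesting lockdep trips
over looks roughly like the pair below.  The two functions are made up
for illustration; only the local_lock API calls and the cpu_slab->lock
member from your patch are real:

	/* Process context: on !RT, local_lock() disables preemption
	 * only, interrupts stay enabled. */
	static void some_percpu_list_op(struct kmem_cache *s)
	{
		local_lock(&s->cpu_slab->lock);
		/* ... fiddle with s->cpu_slab ... */
		/* <-- an interrupt can fire right here */
		local_unlock(&s->cpu_slab->lock);
	}

	/* Interrupt context: an allocation lands in ___slab_alloc(),
	 * which takes the same lock class via local_lock_irqsave().
	 * Lockdep now sees the class acquired both with IRQs enabled
	 * and from hardirq context, and reports an inconsistent lock
	 * state. */
	static void some_irq_alloc_path(struct kmem_cache *s)
	{
		unsigned long flags;

		local_lock_irqsave(&s->cpu_slab->lock, flags);
		/* ... */
		local_unlock_irqrestore(&s->cpu_slab->lock, flags);
	}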

Speaking of that local_lock_irqsave(), and some unloved ifdefs..

Why not do something like the below?  When I look at new_slab:, I see
the cpu_slab->partial assignment protected by IRQs being disabled,
which implies to me it should probably be protected that way
everywhere.  There used to be another slub_set_percpu_partial() call
in unfreeze_partials(), which was indeed made with IRQs disabled, and
which looked quite sane to an mm outsider looking in.  The odd man
out ->partial assignment was the preempt-disabled put_cpu_partial()
cmpxchg loop, which contained an IRQ-disabled region to accommodate
the aforementioned unfreeze_partials().

Is there a real-world benefit to the cmpxchg loops whacked below (a la
monkey see, monkey do) over everyone just taking the straight shot you
laid down for RT?  It's easier on the eye (mine anyway), and neither
PREEMPT nor PREEMPT_RT seems inclined to complain... tick... tock...
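
(For anyone skimming rather than reading the diff: the
unfreeze_partials() hunk below boils down to swapping this retry loop

	do {
		partial_page = this_cpu_read(s->cpu_slab->partial);
	} while (partial_page &&
		 this_cpu_cmpxchg(s->cpu_slab->partial, partial_page, NULL)
				  != partial_page);

for a plain read/clear of ->partial under the local lock -- IRQs off
on !RT, a real lock on RT:

	local_lock_irqsave(&s->cpu_slab->lock, flags);
	partial_page = this_cpu_read(s->cpu_slab->partial);
	this_cpu_write(s->cpu_slab->partial, NULL);
	local_unlock_irqrestore(&s->cpu_slab->lock, flags);

put_cpu_partial() gets the same treatment, with the drained page
unfrozen only after the lock is dropped.)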

---
 mm/slub.c |   79 ++++++++++++++++++++++++++++++++------------------------------
 1 file changed, 41 insertions(+), 38 deletions(-)

--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2437,13 +2437,12 @@ static void __unfreeze_partials(struct k
 static void unfreeze_partials(struct kmem_cache *s)
 {
 	struct page *partial_page;
+	unsigned long flags;

-	do {
-		partial_page = this_cpu_read(s->cpu_slab->partial);
-
-	} while (partial_page &&
-		 this_cpu_cmpxchg(s->cpu_slab->partial, partial_page, NULL)
-				  != partial_page);
+	local_lock_irqsave(&s->cpu_slab->lock, flags);
+	partial_page = this_cpu_read(s->cpu_slab->partial);
+	this_cpu_write(s->cpu_slab->partial, NULL);
+	local_unlock_irqrestore(&s->cpu_slab->lock, flags);

 	if (partial_page)
 		__unfreeze_partials(s, partial_page);
@@ -2480,41 +2479,45 @@ static void put_cpu_partial(struct kmem_
 {
 #ifdef CONFIG_SLUB_CPU_PARTIAL
 	struct page *oldpage;
-	int pages;
-	int pobjects;
-
-	slub_get_cpu_ptr(s->cpu_slab);
-	do {
-		pages = 0;
-		pobjects = 0;
-		oldpage = this_cpu_read(s->cpu_slab->partial);
-
-		if (oldpage) {
-			pobjects = oldpage->pobjects;
-			pages = oldpage->pages;
-			if (drain && pobjects > slub_cpu_partial(s)) {
-				/*
-				 * partial array is full. Move the existing
-				 * set to the per node partial list.
-				 */
-				unfreeze_partials(s);
-				oldpage = NULL;
-				pobjects = 0;
-				pages = 0;
-				stat(s, CPU_PARTIAL_DRAIN);
-			}
+	struct page *page_to_unfreeze = NULL;
+	unsigned long flags;
+	int pages = 0, pobjects = 0;
+
+	local_lock_irqsave(&s->cpu_slab->lock, flags);
+
+	if ((oldpage = this_cpu_read(s->cpu_slab->partial))) {
+		pobjects = oldpage->pobjects;
+		pages = oldpage->pages;
+		if (drain && pobjects > slub_cpu_partial(s)) {
+			/*
+			 * partial array is full. Move the existing
+			 * set to the per node partial list.
+			 *
+			 * Postpone unfreezing until we drop the local
+			 * lock to avoid an RT unlock/relock requirement
+			 * due to MEMCG __slab_free() recursion.
+			 */
+			page_to_unfreeze = oldpage;
+
+			oldpage = NULL;
+			pobjects = 0;
+			pages = 0;
+			stat(s, CPU_PARTIAL_DRAIN);
 		}
+	}
+
+	pages++;
+	pobjects += page->objects - page->inuse;
+
+	page->pages = pages;
+	page->pobjects = pobjects;
+	page->next = oldpage;

-		pages++;
-		pobjects += page->objects - page->inuse;
+	this_cpu_write(s->cpu_slab->partial, page);
+	local_unlock_irqrestore(&s->cpu_slab->lock, flags);

-		page->pages = pages;
-		page->pobjects = pobjects;
-		page->next = oldpage;
-
-	} while (this_cpu_cmpxchg(s->cpu_slab->partial, oldpage, page)
-								!= oldpage);
-	slub_put_cpu_ptr(s->cpu_slab);
+	if (page_to_unfreeze)
+		__unfreeze_partials(s, page_to_unfreeze);
 #endif	/* CONFIG_SLUB_CPU_PARTIAL */
 }


