Message-ID: <20200402194233.GA171919@carbon.DHCP.thefacebook.com>
Date:   Thu, 2 Apr 2020 12:42:33 -0700
From:   Roman Gushchin <guro@...com>
To:     Joonsoo Kim <js1304@...il.com>
CC:     Andrew Morton <akpm@...ux-foundation.org>,
        Vlastimil Babka <vbabka@...e.cz>,
        Rik van Riel <riel@...riel.com>,
        Linux Memory Management List <linux-mm@...ck.org>,
        LKML <linux-kernel@...r.kernel.org>, <kernel-team@...com>,
        Qian Cai <cai@....pw>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Anshuman Khandual <anshuman.khandual@....com>,
        Joonsoo Kim <iamjoonsoo.kim@....com>
Subject: Re: [PATCH] mm,page_alloc,cma: conditionally prefer cma pageblocks
 for movable allocations

On Thu, Apr 02, 2020 at 02:43:49PM +0900, Joonsoo Kim wrote:
> On Thu, Apr 2, 2020 at 11:54 AM Roman Gushchin <guro@...com> wrote:
> >
> > On Wed, Apr 01, 2020 at 07:13:22PM -0700, Andrew Morton wrote:
> > > On Thu, 12 Mar 2020 10:41:28 +0900 Joonsoo Kim <js1304@...il.com> wrote:
> > >
> > > > Hello, Roman.
> > > >
> > > > On Thu, Mar 12, 2020 at 2:35 AM Roman Gushchin <guro@...com> wrote:
> > > > >
> > > > > On Wed, Mar 11, 2020 at 09:51:07AM +0100, Vlastimil Babka wrote:
> > > > > > On 3/6/20 9:01 PM, Rik van Riel wrote:
> > > > > > > Posting this one for Roman so I can deal with any upstream feedback and
> > > > > > > create a v2 if needed, while scratching my head over the next piece of
> > > > > > > this puzzle :)
> > > > > > >
> > > > > > > ---8<---
> > > > > > >
> > > > > > > From: Roman Gushchin <guro@...com>
> > > > > > >
> > > > > > > Currently a cma area is barely used by the page allocator because
> > > > > > > it's used only as a fallback from movable; however, kswapd tries
> > > > > > > hard to make sure that the fallback path isn't used.
> > > > > >
> > > > > > A few years ago Joonsoo wanted to fix these kinds of weird MIGRATE_CMA corner
> > > > > > cases by using ZONE_MOVABLE instead [1]. Unfortunately it was reverted due to
> > > > > > unresolved bugs. Perhaps the idea could be resurrected now?
> > > > >
> > > > > Hi Vlastimil!
> > > > >
> > > > > Thank you for this reminder! I actually looked at it and also asked Joonsoo in private
> > > > > about the state of this patch(set). As I understand, Joonsoo plans to resubmit
> > > > > it later this year.
> > > > >
> > > > > What Rik and I are suggesting seems to be much simpler; however, it's perfectly
> > > > > possible that Joonsoo's solution is preferable in the long term.
> > > > >
> > > > > So if the proposed patch looks ok for now, I'd suggest going with it and returning
> > > > > to this question once we have a new version of the ZONE_MOVABLE solution.
> > > >
> > > > Hmm... utilization is not the only concern for a CMA user. The more important
> > > > one is the success guarantee of cma_alloc(), and this patch would have a bad
> > > > impact on it.
> > > >
> > > > A few years ago, I tested this kind of approach and found that increasing
> > > > utilization increases cma_alloc() failures. The reason is that pages allocated
> > > > with __GFP_MOVABLE, especially by sb_bread(), are sometimes pinned by someone.
> > > >
> > > > Until now, cma memory hasn't been used much, so this problem doesn't occur
> > > > easily. However, with this patch, it would.
> > >
> > > So I guess we keep Roman's patch on hold pending clarification of this
> > > risk?
> >
> > The problem here is that we can't really find problems if we don't use CMA as intended
> > and just leave it free. Rik and I are actively looking for page migration problems
> > in our production environment, and we've found and fixed some of them. Our setup is
> > likely different from that of embedded developers, who in my understanding are the most
> > active cma users, so even if we don't see any issues I can't guarantee it for everybody.
> >
> > So given Joonsoo's ack down in the thread (btw, I'm sorry I missed a good optimization
> > he suggested; I will send a patch for that), I'd go with this patch at least until
> 
> Looks like you mean Minchan's ack. Anyway, I don't object to this one.

Right, I'm really sorry.

> 
> In fact, I've tested this patch and your fixes for the migration problem
> and found that there is still a migration problem and that the failure rate
> is increased by this patch.

Do you mind sharing any details? What kind of pages are those?

I'm using the following patch to dump failed pages:

@@ -1455,6 +1455,9 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
 						private, page, pass > 2, mode,
 						reason);
 
+			if (rc && reason == MR_CONTIG_RANGE)
+				dump_page(page, "unmap_and_move");
+
 			switch(rc) {
 			case -ENOMEM:
 				/*
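
(For context: MR_CONTIG_RANGE is the migrate_reason used by alloc_contig_range(),
which is the path cma_alloc() goes through, so the dump is limited to
contig-range/CMA migrations and doesn't fire for ordinary compaction or reclaim
failures. The same check pulled out as a standalone sketch -- illustration only,
the helper name is made up:)

#include <linux/migrate.h>	/* enum migrate_reason, MR_CONTIG_RANGE */
#include <linux/mmdebug.h>	/* dump_page() */

/*
 * Sketch, not the exact kernel code: dump the state of a page whose
 * migration failed, but only when the migration was requested via
 * alloc_contig_range() (the cma_alloc() path).
 */
static void dump_failed_cma_migration(struct page *page, int rc,
				      enum migrate_reason reason)
{
	if (rc && reason == MR_CONTIG_RANGE)
		dump_page(page, "unmap_and_move");
}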


> However, given that there has been no progress in this area for a long time,
> I think that applying the change aggressively is required to break the
> current situation.

I totally agree!

Btw, I've found that cma_release() grabs the cma->lock mutex,
so it can't be called from atomic context (I've got a lockdep warning).
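
(Purely as an illustration of the pattern -- not the actual call site, and the
function and parameter names below are made up: the caller does its own
bookkeeping under a spinlock and wants to free the CMA pages from the same
critical section, so the mutex taken inside cma_release() triggers the splat:)

#include <linux/cma.h>		/* cma_release() */
#include <linux/spinlock.h>

/*
 * Hypothetical example: free CMA pages while holding the caller's
 * spinlock. With cma->lock being a mutex, cma_clear_bitmap() may
 * sleep here, so lockdep reports sleeping in an atomic context.
 */
static void example_put_buffer(struct cma *cma, const struct page *pages,
			       unsigned int count, spinlock_t *lock)
{
	spin_lock(lock);
	/* ... remove the buffer from the caller's bookkeeping ... */
	cma_release(cma, pages, count);	/* grabs the cma->lock mutex */
	spin_unlock(lock);
}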

Of course, I can change the calling side, but I think it's better to change
the cma code so that cma_release() is less restrictive about its calling
context. What do you think about the following patch?

Thank you!

--

From 3f3f43746391705c0b57ea3846d74c1af2684c11 Mon Sep 17 00:00:00 2001
From: Roman Gushchin <guro@...com>
Date: Thu, 2 Apr 2020 12:24:13 -0700
Subject: [PATCH] mm: cma: convert cma->lock into a spinlock

Currently cma->lock is a mutex which protects cma->bitmap.
cma_release() grabs this mutex in cma_clear_bitmap().

This means that cma_release() can't be called from atomic
context, which is not very convenient for a generic memory
release function.

There are two options to solve this problem:
1) introduce some sort of a delayed deallocation
2) convert the mutex into a spinlock

This patch implements the second approach.
Indeed, bitmap operations cannot sleep and should be relatively fast,
so there is no reason why a spinlock can't do the synchronization.

Signed-off-by: Roman Gushchin <guro@...com>
---
 mm/cma.c | 21 ++++++++++++---------
 mm/cma.h |  2 +-
 2 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/mm/cma.c b/mm/cma.c
index be55d1988c67..cb4a3e0a9eeb 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -88,9 +88,9 @@ static void cma_clear_bitmap(struct cma *cma, unsigned long pfn,
 	bitmap_no = (pfn - cma->base_pfn) >> cma->order_per_bit;
 	bitmap_count = cma_bitmap_pages_to_bits(cma, count);
 
-	mutex_lock(&cma->lock);
+	spin_lock(&cma->lock);
 	bitmap_clear(cma->bitmap, bitmap_no, bitmap_count);
-	mutex_unlock(&cma->lock);
+	spin_unlock(&cma->lock);
 }
 
 static int __init cma_activate_area(struct cma *cma)
@@ -126,7 +126,7 @@ static int __init cma_activate_area(struct cma *cma)
 		init_cma_reserved_pageblock(pfn_to_page(base_pfn));
 	} while (--i);
 
-	mutex_init(&cma->lock);
+	spin_lock_init(&cma->lock);
 
 #ifdef CONFIG_CMA_DEBUGFS
 	INIT_HLIST_HEAD(&cma->mem_head);
@@ -381,22 +381,25 @@ static void cma_debug_show_areas(struct cma *cma)
 	unsigned long nr_part, nr_total = 0;
 	unsigned long nbits = cma_bitmap_maxno(cma);
 
-	mutex_lock(&cma->lock);
 	pr_info("number of available pages: ");
 	for (;;) {
+		spin_lock(&cma->lock);
 		next_zero_bit = find_next_zero_bit(cma->bitmap, nbits, start);
-		if (next_zero_bit >= nbits)
+		if (next_zero_bit >= nbits) {
+			spin_unlock(&cma->lock);
 			break;
+		}
 		next_set_bit = find_next_bit(cma->bitmap, nbits, next_zero_bit);
 		nr_zero = next_set_bit - next_zero_bit;
 		nr_part = nr_zero << cma->order_per_bit;
+		spin_unlock(&cma->lock);
+
 		pr_cont("%s%lu@%lu", nr_total ? "+" : "", nr_part,
 			next_zero_bit);
 		nr_total += nr_part;
 		start = next_zero_bit + nr_zero;
 	}
 	pr_cont("=> %lu free of %lu total pages\n", nr_total, cma->count);
-	mutex_unlock(&cma->lock);
 }
 #else
 static inline void cma_debug_show_areas(struct cma *cma) { }
@@ -441,12 +444,12 @@ struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
 		return NULL;
 
 	for (;;) {
-		mutex_lock(&cma->lock);
+		spin_lock(&cma->lock);
 		bitmap_no = bitmap_find_next_zero_area_off(cma->bitmap,
 				bitmap_maxno, start, bitmap_count, mask,
 				offset);
 		if (bitmap_no >= bitmap_maxno) {
-			mutex_unlock(&cma->lock);
+			spin_unlock(&cma->lock);
 			break;
 		}
 		bitmap_set(cma->bitmap, bitmap_no, bitmap_count);
@@ -455,7 +458,7 @@ struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
 		 * our exclusive use. If the migration fails we will take the
 		 * lock again and unmark it.
 		 */
-		mutex_unlock(&cma->lock);
+		spin_unlock(&cma->lock);
 
 		pfn = cma->base_pfn + (bitmap_no << cma->order_per_bit);
 		mutex_lock(&cma_mutex);
diff --git a/mm/cma.h b/mm/cma.h
index 33c0b517733c..7f5985b11439 100644
--- a/mm/cma.h
+++ b/mm/cma.h
@@ -7,7 +7,7 @@ struct cma {
 	unsigned long   count;
 	unsigned long   *bitmap;
 	unsigned int order_per_bit; /* Order of pages represented by one bit */
-	struct mutex    lock;
+	spinlock_t lock;
 #ifdef CONFIG_CMA_DEBUGFS
 	struct hlist_head mem_head;
 	spinlock_t mem_head_lock;
-- 
2.25.1

