Message-ID: <20230131071052.GB19285@hu-sbhattip-lv.qualcomm.com>
Date:   Mon, 30 Jan 2023 23:10:52 -0800
From:   Sukadev Bhattiprolu <quic_sukadev@...cinc.com>
To:     Andrew Morton <akpm@...ux-foundation.org>
CC:     Rik van Riel <riel@...riel.com>, Roman Gushchin <guro@...com>,
        "Vlastimil Babka" <vbabka@...e.cz>, Joonsoo Kim <js1304@...il.com>,
        Minchan Kim <minchan@...nel.org>,
        Chris Goldsworthy <quic_cgoldswo@...cinc.com>,
        "Georgi Djakov" <quic_c_gdjako@...cinc.com>, <linux-mm@...ck.org>,
        <linux-kernel@...r.kernel.org>
Subject: [PATCH] mm,page_alloc,cma: configurable CMA utilization


Commit 16867664936e ("mm,page_alloc,cma: conditionally prefer cma pageblocks for movable allocations")
added support for using CMA pages when free CMA pages make up more than
50% of the total free pages in the zone.

However, with multi-platform kernels a single binary is used across targets
of varying memory sizes. A low-memory target running such a kernel can see
allocation failures even though sufficient memory is available in the CMA
region. On these targets we want to utilize a higher percentage of the CMA
region and reduce allocation failures, even if it means that a subsequent
cma_alloc() takes longer.

Make the CMA utilization percentage a configurable parameter to allow for
such use cases.

Signed-off-by: Sukadev Bhattiprolu <quic_sukadev@...cinc.com>
---
Note:	It was previously suggested that making this percentage configurable
	should be a last resort (https://lkml.org/lkml/2020/3/12/751). But as
	explained above, multi-platform kernels built for targets of varying
	memory sizes need it to be configurable.
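	If applied, the new vm_table entry below surfaces as
	vm.cma_utilization_ratio (default 2, as set in mm/util.c). A
	hypothetical tuning session on a low-memory target might look like:

```shell
# Prefer CMA once free CMA pages exceed a quarter (rather than the
# default half) of the zone's free pages.
sysctl -w vm.cma_utilization_ratio=4

# Read the current value back.
cat /proc/sys/vm/cma_utilization_ratio
```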
---
 include/linux/mm.h |  1 +
 kernel/sysctl.c    |  8 ++++++++
 mm/page_alloc.c    | 18 +++++++++++++++---
 mm/util.c          |  2 ++
 4 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 8f857163ac89..e4e5d508e9eb 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -203,6 +203,7 @@ extern unsigned long sysctl_admin_reserve_kbytes;
 
 extern int sysctl_overcommit_memory;
 extern int sysctl_overcommit_ratio;
+extern int sysctl_cma_utilization_ratio;
 extern unsigned long sysctl_overcommit_kbytes;
 
 int overcommit_ratio_handler(struct ctl_table *, int, void *, size_t *,
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 137d4abe3eda..2dce6a908aa6 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -2445,6 +2445,14 @@ static struct ctl_table vm_table[] = {
 		.extra2		= SYSCTL_ONE,
 	},
 #endif
+	{
+		.procname	= "cma_utilization_ratio",
+		.data		= &sysctl_cma_utilization_ratio,
+		.maxlen		= sizeof(sysctl_cma_utilization_ratio),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= SYSCTL_ONE,
+	},
 	{ }
 };
 
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 0745aedebb37..b72db3824687 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3071,6 +3071,20 @@ __rmqueue_fallback(struct zone *zone, int order, int start_migratetype,
 
 }
 
+static __always_inline bool zone_can_use_cma_pages(struct zone *zone)
+{
+	unsigned long cma_free_pages;
+	unsigned long zone_free_pages;
+
+	cma_free_pages = zone_page_state(zone, NR_FREE_CMA_PAGES);
+	zone_free_pages = zone_page_state(zone, NR_FREE_PAGES);
+
+	if (cma_free_pages > zone_free_pages / sysctl_cma_utilization_ratio)
+		return true;
+
+	return false;
+}
+
 /*
  * Do the hard work of removing an element from the buddy allocator.
  * Call me with the zone->lock already held.
@@ -3087,9 +3101,7 @@ __rmqueue(struct zone *zone, unsigned int order, int migratetype,
-		 * allocating from CMA when over half of the zone's free memory
-		 * is in the CMA area.
+		 * allocating from CMA when free CMA pages exceed the share of
+		 * the zone's free pages set by sysctl_cma_utilization_ratio.
 		 */
-		if (alloc_flags & ALLOC_CMA &&
-		    zone_page_state(zone, NR_FREE_CMA_PAGES) >
-		    zone_page_state(zone, NR_FREE_PAGES) / 2) {
+		if (alloc_flags & ALLOC_CMA && zone_can_use_cma_pages(zone)) {
 			page = __rmqueue_cma_fallback(zone, order);
 			if (page)
 				return page;
diff --git a/mm/util.c b/mm/util.c
index b56c92fb910f..4de81f04b249 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -781,6 +781,8 @@ void folio_copy(struct folio *dst, struct folio *src)
 }
 
 int sysctl_overcommit_memory __read_mostly = OVERCOMMIT_GUESS;
+
+int sysctl_cma_utilization_ratio __read_mostly = 2;
 int sysctl_overcommit_ratio __read_mostly = 50;
 unsigned long sysctl_overcommit_kbytes __read_mostly;
 int sysctl_max_map_count __read_mostly = DEFAULT_MAX_MAP_COUNT;
-- 
2.17.1
