Message-ID: <20250228064016.1325-1-yunjeong.mun@sk.com>
Date: Fri, 28 Feb 2025 15:39:55 +0900
From: Yunjeong Mun <yunjeong.mun@...com>
To: Joshua Hahn <joshua.hahnjy@...il.com>
Cc: honggyu.kim@...com,
gregkh@...uxfoundation.org,
rakie.kim@...com,
akpm@...ux-foundation.org,
rafael@...nel.org,
lenb@...nel.org,
dan.j.williams@...el.com,
Jonathan.Cameron@...wei.com,
dave.jiang@...el.com,
horen.chuang@...ux.dev,
hannes@...xchg.org,
linux-kernel@...r.kernel.org,
linux-acpi@...r.kernel.org,
linux-mm@...ck.org,
kernel-team@...a.com,
kernel_team@...ynix.com
Subject: Re: [PATCH 1/2 v6] mm/mempolicy: Weighted Interleave Auto-tuning
Hi, Joshua.

First of all, I accidentally sent the wrong email a few hours ago.
Please disregard it. Sorry for the confusion.
On Wed, 26 Feb 2025 13:35:17 -0800 Joshua Hahn <joshua.hahnjy@...il.com> wrote:
[...snip...]
>
> +/*
> + * Convert bandwidth values into weighted interleave weights.
> + * Call with iw_table_lock.
> + */
> +static void reduce_interleave_weights(unsigned int *bw, u8 *new_iw)
> +{
> +        u64 sum_bw = 0;
> +        unsigned int cast_sum_bw, sum_iw = 0;
> +        unsigned int scaling_factor = 1, iw_gcd = 1;
> +        int nid;
> +
> +        /* Recalculate the bandwidth distribution given the new info */
> +        for_each_node_state(nid, N_MEMORY)
> +                sum_bw += bw[nid];
> +
> +        for (nid = 0; nid < nr_node_ids; nid++) {
> +                /* Set memoryless nodes' weights to 1 to prevent div/0 later */
> +                if (!node_state(nid, N_MEMORY)) {
> +                        new_iw[nid] = 1;
> +                        continue;
> +                }
> +
> +                scaling_factor = 100 * bw[nid];
> +
> +                /*
> +                 * Try not to perform 64-bit division.
> +                 * If sum_bw < scaling_factor, then sum_bw < U32_MAX.
> +                 * If sum_bw > scaling_factor, then bw[nid] is less than
> +                 * 1% of the total bandwidth. Round up to 1%.
> +                 */
> +                if (bw[nid] && sum_bw < scaling_factor) {
> +                        cast_sum_bw = (unsigned int)sum_bw;
> +                        new_iw[nid] = scaling_factor / cast_sum_bw;
> +                } else {
> +                        new_iw[nid] = 1;
> +                }
> +                sum_iw += new_iw[nid];
> +        }
> +
> +        /*
> +         * Scale each node's share of the total bandwidth from percentages
> +         * to whole numbers in the range [1, weightiness]
> +         */
> +        for_each_node_state(nid, N_MEMORY) {
> +                scaling_factor = weightiness * new_iw[nid];
> +                new_iw[nid] = max(scaling_factor / sum_iw, 1);
> +                if (nid == 0)
> +                        iw_gcd = new_iw[0];
> +                iw_gcd = gcd(iw_gcd, new_iw[nid]);
> +        }
> +
> +        /* 1:2 is strictly better than 16:32. Reduce by the weights' GCD. */
> +        for_each_node_state(nid, N_MEMORY)
> +                new_iw[nid] /= iw_gcd;
> +}
In my understanding, the new_iw[nid] values are scaled twice: first to 100, and then
to the weightiness value of 32. I think this scaling can be done just once, directly
to the weightiness value, as follows:
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 50cbb7c047fa..65a7e2baf161 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -176,47 +176,22 @@ static u8 get_il_weight(int node)
 static void reduce_interleave_weights(unsigned int *bw, u8 *new_iw)
 {
         u64 sum_bw = 0;
-        unsigned int cast_sum_bw, sum_iw = 0;
-        unsigned int scaling_factor = 1, iw_gcd = 1;
+        unsigned int scaling_factor = 1, iw_gcd = 0;
         int nid;
 
         /* Recalculate the bandwidth distribution given the new info */
         for_each_node_state(nid, N_MEMORY)
                 sum_bw += bw[nid];
 
-        for (nid = 0; nid < nr_node_ids; nid++) {
[...snip...]
-                /*
-                 * Try not to perform 64-bit division.
-                 * If sum_bw < scaling_factor, then sum_bw < U32_MAX.
-                 * If sum_bw > scaling_factor, then bw[nid] is less than
-                 * 1% of the total bandwidth. Round up to 1%.
-                 */
[...snip...]
-                sum_iw += new_iw[nid];
-        }
-
         /*
          * Scale each node's share of the total bandwidth from percentages
          * to whole numbers in the range [1, weightiness]
          */
         for_each_node_state(nid, N_MEMORY) {
-                scaling_factor = weightiness * new_iw[nid];
-                new_iw[nid] = max(scaling_factor / sum_iw, 1);
-                if (nid == 0)
-                        iw_gcd = new_iw[0];
+                scaling_factor = weightiness * bw[nid];
+                new_iw[nid] = max(scaling_factor / sum_bw, 1);
+                if (!iw_gcd)
+                        iw_gcd = new_iw[nid];
                 iw_gcd = gcd(iw_gcd, new_iw[nid]);
         }
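
Just to show the arithmetic, here is a quick userspace sketch (not kernel code) of
the single-pass scaling I have in mind. It skips the memoryless-node handling and
only looks at the scaling and GCD reduction; the node count, bandwidth numbers and
helper names are made up for illustration, and weightiness is hard-coded to the
current default of 32:

/*
 * Userspace sketch only: node count, bandwidth numbers and helper names
 * below are made-up examples, not measured data.
 */
#include <stdio.h>
#include <stdint.h>

#define WEIGHTINESS     32      /* current default in the patch */
#define NR_NODES        3

static unsigned int gcd_u32(unsigned int a, unsigned int b)
{
        while (b) {
                unsigned int t = a % b;
                a = b;
                b = t;
        }
        return a;
}

int main(void)
{
        /* Hypothetical per-node bandwidth, e.g. in MB/s */
        unsigned int bw[NR_NODES] = { 200000, 100000, 100000 };
        unsigned int new_iw[NR_NODES];
        uint64_t sum_bw = 0;
        unsigned int iw_gcd = 0, scaled;
        int nid;

        for (nid = 0; nid < NR_NODES; nid++)
                sum_bw += bw[nid];

        /* Single-pass scaling: bandwidth share -> [1, WEIGHTINESS] */
        for (nid = 0; nid < NR_NODES; nid++) {
                scaled = (unsigned int)((uint64_t)WEIGHTINESS * bw[nid] / sum_bw);
                new_iw[nid] = scaled ? scaled : 1;
                if (!iw_gcd)
                        iw_gcd = new_iw[nid];
                iw_gcd = gcd_u32(iw_gcd, new_iw[nid]);
        }

        /* Reduce by the GCD, as the patch does: 16:8:8 becomes 2:1:1 */
        for (nid = 0; nid < NR_NODES; nid++)
                printf("node %d: bw %u -> weight %u\n",
                       nid, bw[nid], new_iw[nid] / iw_gcd);

        return 0;
}

At least for these example numbers, both the two-stage and the single-pass
versions end up with the same 2:1:1 weights after the GCD reduction, so the
intermediate percentage step does not seem to change the outcome here.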
Please let me know what you think about this.
Best regards,
Yunjeong