Message-ID: <87o77fkprp.fsf@yhuang6-desk2.ccr.corp.intel.com>
Date: Wed, 03 Jul 2024 13:28:42 +0800
From: "Huang, Ying" <ying.huang@...el.com>
To: Tvrtko Ursulin <tursulin@...lia.com>
Cc: linux-mm@...ck.org,  linux-kernel@...r.kernel.org,
  kernel-dev@...lia.com,  Tvrtko Ursulin <tvrtko.ursulin@...lia.com>,  Mel
 Gorman <mgorman@...e.de>,  Peter Zijlstra <peterz@...radead.org>,  Ingo
 Molnar <mingo@...hat.com>,  Rik van Riel <riel@...riel.com>,  Johannes
 Weiner <hannes@...xchg.org>,  "Matthew Wilcox (Oracle)"
 <willy@...radead.org>,  Dave Hansen <dave.hansen@...el.com>,  Andi Kleen
 <ak@...ux.intel.com>,  Michal Hocko <mhocko@...e.com>,  David Rientjes
 <rientjes@...gle.com>
Subject: Re: [PATCH v2] mm/numa_balancing: Teach mpol_to_str about the
 balancing mode

Tvrtko Ursulin <tursulin@...lia.com> writes:

> From: Tvrtko Ursulin <tvrtko.ursulin@...lia.com>
>
> Since balancing mode was added in
> bda420b98505 ("numa balancing: migrate on fault among multiple bound nodes"),
> it was possible to set this mode but it wouldn't be shown in
> /proc/<pid>/numa_maps since there was no support for it in the
> mpol_to_str() helper.
>
> Furthermore, because the balancing mode sets the MPOL_F_MORON flag, it
> would be displayed as 'default' due to a workaround introduced a few years
> earlier in
> 8790c71a18e5 ("mm/mempolicy.c: fix mempolicy printing in numa_maps").
>
> To tidy this up we implement two changes:
>
> First we introduce a new internal flag MPOL_F_KERNEL and with it mark the
> kernel's internal default and fallback policies (for tasks and/or VMAs
> with no explicit policy set). By doing this we generalise the current
> special casing and replace the incorrect 'default' with the correct
> 'bind'.
>
> Secondly, we add a string representation and corresponding handling for
> MPOL_F_NUMA_BALANCING. We do this by adding a sparse mapping array of
> flags to names. The sparseness is a downside, but the advantage is
> generalising and removing the "policy" from flags display.

Please split these two changes into two patches, because we will need to
backport the first one to the -stable kernels.

> End result:
>
> $ numactl -b -m 0-1,3 cat /proc/self/numa_maps
> 555559580000 bind=balancing:0-1,3 file=/usr/bin/cat mapped=3 active=0 N0=3 kernelpagesize_kB=16
> ...
>
> v2:
>  * Fully fix by introducing MPOL_F_KERNEL.
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@...lia.com>
> Fixes: bda420b98505 ("numa balancing: migrate on fault among multiple bound nodes")
> References: 8790c71a18e5 ("mm/mempolicy.c: fix mempolicy printing in numa_maps")
> Cc: Huang Ying <ying.huang@...el.com>
> Cc: Mel Gorman <mgorman@...e.de>
> Cc: Peter Zijlstra <peterz@...radead.org>
> Cc: Ingo Molnar <mingo@...hat.com>
> Cc: Rik van Riel <riel@...riel.com>
> Cc: Johannes Weiner <hannes@...xchg.org>
> Cc: "Matthew Wilcox (Oracle)" <willy@...radead.org>
> Cc: Dave Hansen <dave.hansen@...el.com>
> Cc: Andi Kleen <ak@...ux.intel.com>
> Cc: Michal Hocko <mhocko@...e.com>
> Cc: David Rientjes <rientjes@...gle.com>
> ---
>  include/uapi/linux/mempolicy.h |  1 +
>  mm/mempolicy.c                 | 44 ++++++++++++++++++++++++----------
>  2 files changed, 32 insertions(+), 13 deletions(-)
>
> diff --git a/include/uapi/linux/mempolicy.h b/include/uapi/linux/mempolicy.h
> index 1f9bb10d1a47..bcf56ce9603b 100644
> --- a/include/uapi/linux/mempolicy.h
> +++ b/include/uapi/linux/mempolicy.h
> @@ -64,6 +64,7 @@ enum {
>  #define MPOL_F_SHARED  (1 << 0)	/* identify shared policies */
>  #define MPOL_F_MOF	(1 << 3) /* this policy wants migrate on fault */
>  #define MPOL_F_MORON	(1 << 4) /* Migrate On protnone Reference On Node */
> +#define MPOL_F_KERNEL   (1 << 5) /* Kernel's internal policy */
>  
>  /*
>   * These bit locations are exposed in the vm.zone_reclaim_mode sysctl
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index aec756ae5637..8ecc6d9f100a 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -134,6 +134,7 @@ enum zone_type policy_zone = 0;
>  static struct mempolicy default_policy = {
>  	.refcnt = ATOMIC_INIT(1), /* never free it */
>  	.mode = MPOL_LOCAL,
> +	.flags = MPOL_F_KERNEL,
>  };
>  
>  static struct mempolicy preferred_node_policy[MAX_NUMNODES];
> @@ -3095,7 +3096,7 @@ void __init numa_policy_init(void)
>  		preferred_node_policy[nid] = (struct mempolicy) {
>  			.refcnt = ATOMIC_INIT(1),
>  			.mode = MPOL_PREFERRED,
> -			.flags = MPOL_F_MOF | MPOL_F_MORON,
> +			.flags = MPOL_F_MOF | MPOL_F_MORON | MPOL_F_KERNEL,
>  			.nodes = nodemask_of_node(nid),
>  		};
>  	}
> @@ -3150,6 +3151,12 @@ static const char * const policy_modes[] =
>  	[MPOL_PREFERRED_MANY]  = "prefer (many)",
>  };
>  
> +static const char * const policy_flags[] = {
> +	[ilog2(MPOL_F_STATIC_NODES)] = "static",
> +	[ilog2(MPOL_F_RELATIVE_NODES)] = "relative",
> +	[ilog2(MPOL_F_NUMA_BALANCING)] = "balancing",
> +};
> +
>  #ifdef CONFIG_TMPFS
>  /**
>   * mpol_parse_str - parse string to mempolicy, for tmpfs mpol mount option.
> @@ -3293,17 +3300,18 @@ int mpol_parse_str(char *str, struct mempolicy **mpol)
>   * @pol:  pointer to mempolicy to be formatted
>   *
>   * Convert @pol into a string.  If @buffer is too short, truncate the string.
> - * Recommend a @maxlen of at least 32 for the longest mode, "interleave", the
> - * longest flag, "relative", and to display at least a few node ids.
> + * Recommend a @maxlen of at least 42 for the longest mode, "weighted
> + * interleave", the longest flag, "balancing", and to display at least a few
> + * node ids.
>   */
>  void mpol_to_str(char *buffer, int maxlen, struct mempolicy *pol)
>  {
>  	char *p = buffer;
>  	nodemask_t nodes = NODE_MASK_NONE;
>  	unsigned short mode = MPOL_DEFAULT;
> -	unsigned short flags = 0;
> +	unsigned long flags = 0;
>  
> -	if (pol && pol != &default_policy && !(pol->flags & MPOL_F_MORON)) {
> +	if (!(pol->flags & MPOL_F_KERNEL)) {

Can we avoid introducing a new flag?  Would the following code work?

        if (pol && pol != &default_policy &&
            !(pol->mode == MPOL_PREFERRED && (pol->flags & MPOL_F_MORON)))

But I think that this is kind of fragile, so a flag is better.
Personally, though, I don't think MPOL_F_KERNEL is a good name; maybe
MPOL_F_DEFAULT?
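
To make the trade-off concrete, here is a minimal user-space sketch of three
ways to decide whether a policy should be printed with its real mode.  It is
not kernel code: struct sketch_policy, the SKETCH_* constants, and the three
helper functions are hypothetical stand-ins, and the "mode-aware" variant is
only one reading of the flagless check discussed above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's modes and internal flags. */
enum sketch_mode { SKETCH_DEFAULT, SKETCH_PREFERRED, SKETCH_BIND, SKETCH_LOCAL };

#define SKETCH_F_MOF      (1u << 3)
#define SKETCH_F_MORON    (1u << 4)
#define SKETCH_F_INTERNAL (1u << 5)  /* plays the role of the proposed new flag */

struct sketch_policy {
	enum sketch_mode mode;
	unsigned int flags;
};

/* Existing heuristic: anything carrying MORON is treated as "default". */
static bool sketch_show_old(const struct sketch_policy *pol,
			    const struct sketch_policy *dflt)
{
	return pol && pol != dflt && !(pol->flags & SKETCH_F_MORON);
}

/* Flagless alternative: hide only the PREFERRED + MORON combination. */
static bool sketch_show_mode_aware(const struct sketch_policy *pol,
				   const struct sketch_policy *dflt)
{
	return pol && pol != dflt &&
	       !(pol->mode == SKETCH_PREFERRED && (pol->flags & SKETCH_F_MORON));
}

/* Flag-based check: internal policies are marked explicitly. */
static bool sketch_show_flagged(const struct sketch_policy *pol)
{
	return pol && !(pol->flags & SKETCH_F_INTERNAL);
}
```

With a user policy of { SKETCH_BIND, SKETCH_F_MOF | SKETCH_F_MORON }, the old
heuristic hides the mode while the other two show it.  The mode-aware variant
still relies on no user policy ever combining PREFERRED with MORON, which is
the fragility; the explicit flag does not depend on that.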

>  		mode = pol->mode;
>  		flags = pol->flags;
>  	}
> @@ -3328,15 +3336,25 @@ void mpol_to_str(char *buffer, int maxlen, struct mempolicy *pol)
>  	p += snprintf(p, maxlen, "%s", policy_modes[mode]);
>  
>  	if (flags & MPOL_MODE_FLAGS) {
> -		p += snprintf(p, buffer + maxlen - p, "=");
> +		unsigned int bit, cnt = 0;
>  
> -		/*
> -		 * Currently, the only defined flags are mutually exclusive
> -		 */
> -		if (flags & MPOL_F_STATIC_NODES)
> -			p += snprintf(p, buffer + maxlen - p, "static");
> -		else if (flags & MPOL_F_RELATIVE_NODES)
> -			p += snprintf(p, buffer + maxlen - p, "relative");
> +		for_each_set_bit(bit, &flags, ARRAY_SIZE(policy_flags)) {
> +			if (bit <= ilog2(MPOL_F_KERNEL))
> +				continue;
> +
> +			if (cnt == 0)
> +				p += snprintf(p, buffer + maxlen - p, "=");
> +			else
> +				p += snprintf(p, buffer + maxlen - p, ",");
> +
> +			if (WARN_ON_ONCE(!policy_flags[bit]))
> +				p += snprintf(p, buffer + maxlen - p, "bit%u",
> +					      bit);
> +			else
> +				p += snprintf(p, buffer + maxlen - p,
> +					      policy_flags[bit]);
> +			cnt++;
> +		}

Please refer to commit 2291990ab36b ("mempolicy: clean-up mpol-to-str()
mempolicy formatting") for the original format.
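
For discussion, a minimal user-space sketch of the table-driven flag
formatting in the hunk above: "=" before the first flag, "," between
subsequent ones, and a "bit%u" fallback for unnamed bits.  All fmt_* and
FMT_* names are hypothetical, the bit values are illustrative only, and the
":nodes" suffix is left out.  Names are always passed through a "%s" format
rather than used as the format string.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical flag bits for the sketch; values are illustrative only. */
#define FMT_NR_FLAG_BITS 16
#define FMT_F_BALANCING  (1u << 13)
#define FMT_F_RELATIVE   (1u << 14)
#define FMT_F_STATIC     (1u << 15)

/* Sparse bit-number -> name table; unnamed slots stay NULL. */
static const char *const fmt_flag_names[FMT_NR_FLAG_BITS] = {
	[13] = "balancing",
	[14] = "relative",
	[15] = "static",
};

/* Append s at offset pos and return the new offset, clamped on truncation. */
static size_t fmt_append(char *buf, size_t len, size_t pos, const char *s)
{
	int n;

	if (pos >= len)
		return pos;
	n = snprintf(buf + pos, len - pos, "%s", s);
	if (n < 0)
		return pos;
	if ((size_t)n >= len - pos)
		return len - 1;	/* output was truncated; buf stays terminated */
	return pos + (size_t)n;
}

/* Format "mode[=flag[,flag...]]" into buf, truncating if buf is too short. */
static void fmt_policy(char *buf, size_t len, const char *mode,
		       unsigned int flags)
{
	size_t pos = 0;
	unsigned int bit, cnt = 0;
	char tmp[16];

	if (len == 0)
		return;
	buf[0] = '\0';
	pos = fmt_append(buf, len, pos, mode);

	for (bit = 0; bit < FMT_NR_FLAG_BITS; bit++) {
		if (!(flags & (1u << bit)))
			continue;
		pos = fmt_append(buf, len, pos, cnt++ ? "," : "=");
		if (fmt_flag_names[bit]) {
			pos = fmt_append(buf, len, pos, fmt_flag_names[bit]);
		} else {
			snprintf(tmp, sizeof(tmp), "bit%u", bit);
			pos = fmt_append(buf, len, pos, tmp);
		}
	}
}
```

Calling fmt_policy(buf, sizeof(buf), "bind", FMT_F_BALANCING) on a 64-byte
buffer yields "bind=balancing", matching the prefix of the numa_maps line
quoted in the changelog.  Clamping the offset in fmt_append() keeps the
remaining-space computation from going negative once the buffer is full.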

>  	}
>  
>  	if (!nodes_empty(nodes))

--
Best Regards,
Huang, Ying
