lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250512141412.3792050-1-joshua.hahnjy@gmail.com>
Date: Mon, 12 May 2025 07:14:12 -0700
From: Joshua Hahn <joshua.hahnjy@...il.com>
To: Honggyu Kim <honggyu.kim@...com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
	kernel_team@...ynix.com,
	"Huang, Ying" <ying.huang@...ux.alibaba.com>,
	gourry@...rry.net,
	yunjeong.mun@...com,
	gregkh@...uxfoundation.org,
	rafael@...nel.org,
	lenb@...nel.org,
	dan.j.williams@...el.com,
	Jonathan.Cameron@...wei.com,
	dave.jiang@...el.com,
	horen.chuang@...ux.dev,
	hannes@...xchg.org,
	osalvador@...e.de,
	linux-kernel@...r.kernel.org,
	linux-acpi@...r.kernel.org,
	linux-mm@...ck.org,
	kernel-team@...a.com
Subject: Re: [PATCH v8] mm/mempolicy: Weighted Interleave Auto-tuning

Hi Honggyu, thank you for reviewing & testing my patch (again)!

On Sun, 11 May 2025 21:56:20 +0900 Honggyu Kim <honggyu.kim@...com> wrote:

> Hi Joshua,
> 
> Thanks for the update this patch and it looks good to me.
> 
> I've applied your v8 patch with your fixup here together, then tested it in my
> server, which has 8ch of DRAM with 4ch of CXL memory in each socket.
> 
> I can confirm that it shows decent ratio with this auto weight configuration as
> follows.
> 
>    $ ls /sys/kernel/mm/mempolicy/weighted_interleave/
>    auto  node0  node1  node2  node3
> 
>    $ cat /sys/kernel/mm/mempolicy/weighted_interleave/*
>    true
>    3
>    3
>    2
>    2
> 
> Hi Andrew,
> 
> I'm not sure if Joshua is better to post v9, but if you want to fold and update,
> then could you please add my tags as follows when you fold this change?
> 
>    Reviewed-by: Honggyu Kim <honggyu.kim@...com>
>    Tested-by: Honggyu Kim <honggyu.kim@...com>
> 
> I added the same tags in v7 but not included in v8 somehow.
> https://lore.kernel.org/linux-mm/5fdd7db9-96fb-49ea-9803-977158cb0132@sk.com

I must have missed including these tags. Sorry about the confusion --
hopefully we can incorporate them into v8!

> Thanks,
> Honggyu
> 
> On 5/11/2025 11:58 AM, Joshua Hahn wrote:
> > Hello Andrew,
> > 
> > I was hoping to fold this fixlet in with the patch this belongs to. It includes
> > some wordsmithing changes, some code simplification/cleanups, and makes sure
> > that the code behavior matches that of the ABI I described. I've kept the
> > original message below as well, where Ying suggested the changes present in
> > this fixlet.
> > 
> > Please let me know if this fixlet is too big, and you would rather prefer a
> > new version instead. Thank you as always for your patience and support!
> > Joshua
> > 
> > diff --git a/Documentation/ABI/testing/sysfs-kernel-mm-mempolicy-weighted-interleave b/Documentation/ABI/testing/sysfs-kernel-mm-mempolicy-weighted-interleave
> > index ec13382c606f..649c0e9b895c 100644
> > --- a/Documentation/ABI/testing/sysfs-kernel-mm-mempolicy-weighted-interleave
> > +++ b/Documentation/ABI/testing/sysfs-kernel-mm-mempolicy-weighted-interleave
> > @@ -24,7 +24,7 @@ Description:	Weight configuration interface for nodeN
> >   		empty string, ...) will return -EINVAL.
> > 
> >   		Changing the weight to a valid value will automatically
> > -		update the system to manual mode as well.
> > +		switch the system to manual mode as well.
> > 
> >   What:		/sys/kernel/mm/mempolicy/weighted_interleave/auto
> >   Date:		May 2025
> > diff --git a/include/linux/mempolicy.h b/include/linux/mempolicy.h
> > index 3e8da8ba1146..0fe96f3ab3ef 100644
> > --- a/include/linux/mempolicy.h
> > +++ b/include/linux/mempolicy.h
> > @@ -57,15 +57,6 @@ struct mempolicy {
> >   	} w;
> >   };
> > 
> > -/*
> > - * A null weighted_interleave_state is interpted as having .mode = "auto",
> > - * and .iw_table is interpreted as an array of 1s with length nr_node_ids.
> > - */
> > -struct weighted_interleave_state {
> > -	bool mode_auto;
> > -	u8 iw_table[];
> > -};
> > -
> >   /*
> >    * Support for managing mempolicy data objects (clone, copy, destroy)
> >    * The default fast path of a NULL MPOL_DEFAULT policy is always inlined.
> > diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> > index f542691b7123..0624d735a2e7 100644
> > --- a/mm/mempolicy.c
> > +++ b/mm/mempolicy.c
> > @@ -148,6 +148,14 @@ static struct mempolicy preferred_node_policy[MAX_NUMNODES];
> >    */
> >   static const int weightiness = 32;
> > 
> > +/*
> > + * A null weighted_interleave_state is interpreted as having .mode="auto",
> > + * and .iw_table is interpreted as an array of 1s with length nr_node_ids.
> > + */
> > +struct weighted_interleave_state {
> > +	bool mode_auto;
> > +	u8 iw_table[];
> > +};
> >   static struct weighted_interleave_state __rcu *wi_state;
> >   static unsigned int *node_bw_table;
> > 
> > @@ -3561,9 +3569,8 @@ static ssize_t node_store(struct kobject *kobj, struct kobj_attribute *attr,
> >   	int i;
> > 
> >   	node_attr = container_of(attr, struct iw_node_attr, kobj_attr);
> > -	if (count == 0 || sysfs_streq(buf, ""))
> > -		weight = 0;
> > -	else if (kstrtou8(buf, 0, &weight) || weight == 0)
> > +	if (count == 0 || sysfs_streq(buf, "") ||
> > +	    kstrtou8(buf, 0, &weight) || weight == 0)
> >   		return -EINVAL;
> > 
> >   	new_wi_state = kzalloc(struct_size(new_wi_state, iw_table, nr_node_ids),
> > @@ -3630,9 +3637,15 @@ static ssize_t weighted_interleave_auto_store(struct kobject *kobj,
> >   	if (!input) {
> >   		old_wi_state = rcu_dereference_protected(wi_state,
> >   					lockdep_is_held(&wi_state_lock));
> > -		if (old_wi_state)
> > -			memcpy(new_wi_state->iw_table, old_wi_state->iw_table,
> > -					nr_node_ids * sizeof(u8));
> > +		if (!old_wi_state)
> > +			goto update_wi_state;
> > +		if (input == old_wi_state->mode_auto) {
> > +			mutex_unlock(&wi_state_lock);
> > +			return count;
> > +		}
> > +
> > +		memcpy(new_wi_state->iw_table, old_wi_state->iw_table,
> > +					       nr_node_ids * sizeof(u8));
> >   		goto update_wi_state;
> >   	}
> > 
> > @@ -3707,8 +3720,12 @@ static void wi_state_free(void)
> >   	kfree(&wi_group->wi_kobj);
> >   }
> > 
> > +static struct kobj_attribute wi_auto_attr =
> > +	__ATTR(auto, 0664, weighted_interleave_auto_show,
> > +			   weighted_interleave_auto_store);
> > +
> >   static void wi_cleanup(void) {
> > -	sysfs_remove_file(&wi_group->wi_kobj, &wi_group->auto_kobj_attr.attr);
> > +	sysfs_remove_file(&wi_group->wi_kobj, &wi_auto_attr.attr);
> >   	sysfs_wi_node_delete_all();
> >   	wi_state_free();
> >   }
> > @@ -3798,10 +3815,6 @@ static int wi_node_notifier(struct notifier_block *nb,
> >   	return NOTIFY_OK;
> >   }
> > 
> > -static struct kobj_attribute wi_auto_attr =
> > -	__ATTR(auto, 0664, weighted_interleave_auto_show,
> > -			   weighted_interleave_auto_store);
> > -
> >   static int __init add_weighted_interleave_group(struct kobject *mempolicy_kobj)
> >   {
> >   	int nid, err;
> > 
> > 
> > On Sat, 10 May 2025 11:51:50 -0700 Joshua Hahn <joshua.hahnjy@...il.com> wrote:
> > 
> >> On Sat, 10 May 2025 13:25:32 +0800 "Huang, Ying" <ying.huang@...ux.alibaba.com> wrote:
> >>
> >> Hi Ying,
> >> Thank you for reviewing my patch, as always!
> >>
> >>> Hi, Joshua,
> >>>
> >>> Thank you for updated version!  And sorry for late reply.
> >>>
> >>> Joshua Hahn <joshua.hahnjy@...il.com> writes:
> >>
> >> [...snip...]
> >>
> >>>> diff --git a/Documentation/ABI/testing/sysfs-kernel-mm-mempolicy-weighted-interleave b/Documentation/ABI/testing/sysfs-kernel-mm-mempolicy-weighted-interleave
> >>>> index 0b7972de04e9..ec13382c606f 100644
> >>>> --- a/Documentation/ABI/testing/sysfs-kernel-mm-mempolicy-weighted-interleave
> >>>> +++ b/Documentation/ABI/testing/sysfs-kernel-mm-mempolicy-weighted-interleave
> >>>> @@ -20,6 +20,35 @@ Description:	Weight configuration interface for nodeN
> >>>>   		Minimum weight: 1
> >>>>   		Maximum weight: 255
> >>>>   
> >>>> -		Writing an empty string or `0` will reset the weight to the
> >>>> -		system default. The system default may be set by the kernel
> >>>> -		or drivers at boot or during hotplug events.
> >>>> +		Writing invalid values (i.e. any values not in [1,255],
> >>>> +		empty string, ...) will return -EINVAL.
> >>>> +
> >>>> +		Changing the weight to a valid value will automatically
> >>>> +		update the system to manual mode as well.
> >>>
> >>> s/update/switch/ ?
> >>>
> >>> But my English is poor, please keep your version if you think that it's
> >>> better.
> >>
> >> I have no particular preference here, whatever will make it easiest for the
> >> users to understand what is happening. I'll take your suggestion!
> >>
> >> [...snip...]
> >>
> >>>> +/*
> >>>> + * A null weighted_interleave_state is interpted as having .mode = "auto",
> >>>> + * and .iw_table is interpreted as an array of 1s with length nr_node_ids.
> >>>> + */
> >>>
> >>> Better to move the comments to above "wi_state" definition?
> >>>
> >>>> +struct weighted_interleave_state {
> >>>> +	bool mode_auto;
> >>>> +	u8 iw_table[];
> >>>> +};
> >>>> +
> >>>
> >>> Why do you put the type definition in mempolicy.h instead of
> >>> mempolicy.c?  I don't find other users except mempolicy.c.
> >>
> >> Good point, I'll move the definition to mempolicy.c and move the comment
> >> to the wi_state definition as well.
> >>
> >> [...snip...]
> >>
> >>>> @@ -3450,31 +3555,104 @@ static ssize_t node_show(struct kobject *kobj, struct kobj_attribute *attr,
> >>>>   static ssize_t node_store(struct kobject *kobj, struct kobj_attribute *attr,
> >>>>   			  const char *buf, size_t count)
> >>>>   {
> >>>> +	struct weighted_interleave_state *new_wi_state, *old_wi_state = NULL;
> >>>>   	struct iw_node_attr *node_attr;
> >>>> -	u8 *new;
> >>>> -	u8 *old;
> >>>>   	u8 weight = 0;
> >>>> +	int i;
> >>>>   
> >>>>   	node_attr = container_of(attr, struct iw_node_attr, kobj_attr);
> >>>>   	if (count == 0 || sysfs_streq(buf, ""))
> >>>>   		weight = 0;
> >>>
> >>> According to revised ABI, we should return -EINVAL here?
> >>
> >> Great catch, I completely ignored the ABI description that I wrote...
> >> I'll go ahead and just return -EINVAL here!
> >>
> >> [...snip...]
> >>
> >>>> +static ssize_t weighted_interleave_auto_store(struct kobject *kobj,
> >>>> +		struct kobj_attribute *attr, const char *buf, size_t count)
> >>>> +{
> >>>> +	struct weighted_interleave_state *new_wi_state, *old_wi_state = NULL;
> >>>> +	unsigned int *bw;
> >>>> +	bool input;
> >>>> +	int i;
> >>>> +
> >>>> +	if (kstrtobool(buf, &input))
> >>>> +		return -EINVAL;
> >>>> +
> >>>> +	new_wi_state = kzalloc(struct_size(new_wi_state, iw_table, nr_node_ids),
> >>>> +			       GFP_KERNEL);
> >>>> +	if (!new_wi_state)
> >>>> +		return -ENOMEM;
> >>>> +	for (i = 0; i < nr_node_ids; i++)
> >>>> +		new_wi_state->iw_table[i] = 1;
> >>>> +
> >>>> +	mutex_lock(&wi_state_lock);
> >>>> +	if (!input) {
> >>>
> >>> If input == old_wi_state->mode_auto, we can return directly?
> >>
> >> Yes, that makes sense to me.
> >>
> >>>>   static void wi_cleanup(void) {
> >>>> +	sysfs_remove_file(&wi_group->wi_kobj, &wi_group->auto_kobj_attr.attr);
> >>>
> >>> Why not just
> >>>
> >>> 	sysfs_remove_file(&wi_group->wi_kobj, &wi_auto_attr.attr);
> >>>
> >>> ?
> >>
> >> Also makes sense!!
> >>
> >>> ---
> >>> Best Regards,
> >>> Huang, Ying
> >>
> >> Thank you for your great feedback Ying, I'll make changes based on
> >> your suggestions and shortly send up a v9. I hope you have a great day!
> >> Joshua
> >>
> 
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ