Message-ID: <20250805200319.1298046-1-joshua.hahnjy@gmail.com>
Date: Tue,  5 Aug 2025 13:03:18 -0700
From: Joshua Hahn <joshua.hahnjy@...il.com>
To: "Huang, Ying" <ying.huang@...ux.alibaba.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
	SeongJae Park <sj@...nel.org>,
	David Hildenbrand <david@...hat.com>,
	Zi Yan <ziy@...dia.com>,
	Johannes Weiner <hannes@...xchg.org>,
	Matthew Brost <matthew.brost@...el.com>,
	Rakie Kim <rakie.kim@...com>,
	Byungchul Park <byungchul@...com>,
	Gregory Price <gourry@...rry.net>,
	Alistair Popple <apopple@...dia.com>,
	linux-kernel@...r.kernel.org,
	linux-mm@...ck.org,
	kernel-team@...a.com,
	Dave Hansen <dave.hansen@...ux.intel.com>
Subject: Re: [PATCH v2] mempolicy: Clarify what zone reclaim means

On Tue, 05 Aug 2025 09:27:30 +0800 "Huang, Ying" <ying.huang@...ux.alibaba.com> wrote:

> Joshua Hahn <joshua.hahnjy@...il.com> writes:
> 
> > On Mon, 04 Aug 2025 09:24:31 +0800 "Huang, Ying" <ying.huang@...ux.alibaba.com> wrote:
> >
> >> Joshua Hahn <joshua.hahnjy@...il.com> writes:
> >> 
> >> > On Fri, 01 Aug 2025 08:59:20 +0800 "Huang, Ying" <ying.huang@...ux.alibaba.com> wrote:
> >> >
> >> >> Joshua Hahn <joshua.hahnjy@...il.com> writes:
> >> >> 
> >> >> > The zone_reclaim_mode API controls the reclaim behavior when a node runs out of
> >> >> > memory. Contrary to its user-facing name, it is internally referred to as
> >> >> > "node_reclaim_mode".
> >> >> >
> >> >> > This can be confusing. But because we cannot change the name of the API since
> >> >> > it has been in place since at least 2.6, let's try to be more explicit about
> >> >> > what the behavior of this API is. 
> >> >> >
> >> >> > Change the description to clarify what zone reclaim entails, and be explicit
> >> >> > about the RECLAIM_ZONE bit, whose purpose has led to some confusion in the
> >> >> > past already [1] [2].
> >> >> >
> >> >> > [1] https://lore.kernel.org/linux-mm/1579005573-58923-1-git-send-email-alex.shi@linux.alibaba.com/
> >> >> > [2] https://lore.kernel.org/linux-mm/20200626003459.D8E015CA@viggo.jf.intel.com/
> >> >> >
> >> >> > Signed-off-by: Joshua Hahn <joshua.hahnjy@...il.com>
> >> >> > ---
> >> >> >  include/uapi/linux/mempolicy.h | 8 +++++++-
> >> >> >  1 file changed, 7 insertions(+), 1 deletion(-)
> >> >> >
> >> >> > diff --git a/include/uapi/linux/mempolicy.h b/include/uapi/linux/mempolicy.h
> >> >> > index 1f9bb10d1a47..6c9c9385ff89 100644
> >> >> > --- a/include/uapi/linux/mempolicy.h
> >> >> > +++ b/include/uapi/linux/mempolicy.h
> >> >> > @@ -66,10 +66,16 @@ enum {
> >> >> >  #define MPOL_F_MORON	(1 << 4) /* Migrate On protnone Reference On Node */
> >> >> >  
> >> >> >  /*
> >> >> > + * Enabling zone reclaim means the page allocator will attempt to fulfill
> >> >> > + * the allocation request on the current node by triggering reclaim and
> >> >> > + * trying to shrink the current node.
> >> >> > + * Fallback allocations on the next candidates in the zonelist are considered
> >> >> > + * when reclaim fails to free up enough memory in the current node/zone.
> >> >> > + *
> >> >> >   * These bit locations are exposed in the vm.zone_reclaim_mode sysctl
> >> >> >   * ABI.  New bits are OK, but existing bits can never change.
> >> >> 
> >> >> As far as I know, sysctl isn't considered kernel ABI now.  So, change
> >> >> this line too?
> >> >
> >> > Hi Ying, 
> >> >
> >> > Thank you for reviewing this patch!
> >> >
> >> > I didn't know that sysctl isn't considered a kernel ABI. If I understand your
> >> > suggestion correctly, I can rephrase the comment block above to something like this?
> >> >
> >> > - * These bit locations are exposed in the vm.zone_reclaim_mode sysctl
> >> > - * ABI. New bits are OK, but existing bits can never change.
> >> > + * These bit locations are exposed in the vm.zone_reclaim_mode sysctl and
> >> > + * in /proc/sys/vm/zone_reclaim_mode. New bits are OK, but existing bits
> >> > + * can never change.
> >
> > Hi Ying,
> >
> >> Because it's not an ABI, I think that we could avoid saying "never".
> >
> > My personal opinion is that we should keep this warning, since there has
> > already been an example before where a developer tried to remove this bit [1],
> > and this broke some behavior for userspace configurations. However, if I
> > understand your comment correctly, you are suggesting that we should change
> > the wording to not include "never", since sysctls are no longer an ABI (and
> > therefore we should be OK to change what the values mean?)
> >
> > If that is the case, then I can send in another patch since I think the goals
> > are a bit different for the two patches. With that said, I think we should
> > keep the warning just to avoid any breakages in userspace, even if sysctl
> > might not be considered an ABI anymore (also I must have missed this, I didn't
> > know this at all!)
> 
> Sorry for the confusion.  I agree that we shouldn't change the sysctl
> interface in most cases.  I just thought that we could soften the
> wording a little?  For example,
> 
> New bits are OK, but existing bits shouldn't be changed.
> 
> I think that it's still clear that we don't want to change the existing
> bits.
> 
> However, my English is poor.  So, my suggestion may not make sense.

Hi Ying, thank you again for the response!

No worries at all, it was my misunderstanding :-) This suggestion makes sense,
and I think it's small enough & relevant to the code block, so I'll fold this
change into my patch as well. I'll send out the next version shortly!
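
For reference, here is a minimal userspace sketch of why the existing bit
positions matter: anything that reads or writes vm.zone_reclaim_mode
interprets the value bit by bit. The three RECLAIM_* values match
include/uapi/linux/mempolicy.h; the two helper functions are hypothetical
names made up for illustration and are not part of the kernel or of libc.

```c
#include <stdio.h>

/* Bit values as defined in include/uapi/linux/mempolicy.h. */
#define RECLAIM_ZONE	(1 << 0)	/* enable zone reclaim */
#define RECLAIM_WRITE	(1 << 1)	/* write out pages during reclaim */
#define RECLAIM_UNMAP	(1 << 2)	/* unmap pages during reclaim */

/*
 * Hypothetical helper: format which bits are set in a
 * vm.zone_reclaim_mode value. Returns the snprintf() result.
 */
int describe_zone_reclaim_mode(unsigned int mode, char *buf, size_t len)
{
	return snprintf(buf, len, "zone=%d write=%d unmap=%d",
			!!(mode & RECLAIM_ZONE),
			!!(mode & RECLAIM_WRITE),
			!!(mode & RECLAIM_UNMAP));
}

/*
 * Hypothetical helper: read the current mode from procfs.
 * Returns 0 on success, -1 if the file cannot be opened or parsed.
 */
int read_zone_reclaim_mode(unsigned int *mode)
{
	FILE *f = fopen("/proc/sys/vm/zone_reclaim_mode", "r");
	int ok;

	if (!f)
		return -1;
	ok = (fscanf(f, "%u", mode) == 1);
	fclose(f);
	return ok ? 0 : -1;
}
```

A caller would pass the value from read_zone_reclaim_mode() to
describe_zone_reclaim_mode(). If an existing bit were ever repurposed,
tools and configurations like this would silently misreport or misconfigure
the mode, which is the breakage the comment is trying to warn about.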

Have a great day!
Joshua

Sent using hkml (https://github.com/sjp38/hackermail)
