Message-ID: <20250804144200.1047918-1-joshua.hahnjy@gmail.com>
Date: Mon, 4 Aug 2025 07:41:59 -0700
From: Joshua Hahn <joshua.hahnjy@...il.com>
To: "Huang, Ying" <ying.huang@...ux.alibaba.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
SeongJae Park <sj@...nel.org>,
David Hildenbrand <david@...hat.com>,
Zi Yan <ziy@...dia.com>,
Johannes Weiner <hannes@...xchg.org>,
Matthew Brost <matthew.brost@...el.com>,
Rakie Kim <rakie.kim@...com>,
Byungchul Park <byungchul@...com>,
Gregory Price <gourry@...rry.net>,
Alistair Popple <apopple@...dia.com>,
linux-kernel@...r.kernel.org,
linux-mm@...ck.org,
kernel-team@...a.com
Subject: Re: [PATCH v2] mempolicy: Clarify what zone reclaim means
On Mon, 04 Aug 2025 09:24:31 +0800 "Huang, Ying" <ying.huang@...ux.alibaba.com> wrote:
> Joshua Hahn <joshua.hahnjy@...il.com> writes:
>
> > On Fri, 01 Aug 2025 08:59:20 +0800 "Huang, Ying" <ying.huang@...ux.alibaba.com> wrote:
> >
> >> Joshua Hahn <joshua.hahnjy@...il.com> writes:
> >>
> >> > The zone_reclaim_mode API controls the reclaim behavior when a node runs out of
> >> > memory. Contrary to its user-facing name, it is internally referred to as
> >> > "node_reclaim_mode".
> >> >
> >> > This can be confusing. But because we cannot change the name of the API since
> >> > it has been in place since at least 2.6, let's try to be more explicit about
> >> > what the behavior of this API is.
> >> >
> >> > Change the description to clarify what zone reclaim entails, and be explicit
> >> > about the RECLAIM_ZONE bit, whose purpose has led to some confusion in the
> >> > past already [1] [2].
> >> >
> >> > [1] https://lore.kernel.org/linux-mm/1579005573-58923-1-git-send-email-alex.shi@linux.alibaba.com/
> >> > [2] https://lore.kernel.org/linux-mm/20200626003459.D8E015CA@viggo.jf.intel.com/
> >> >
> >> > Signed-off-by: Joshua Hahn <joshua.hahnjy@...il.com>
> >> > ---
> >> > include/uapi/linux/mempolicy.h | 8 +++++++-
> >> > 1 file changed, 7 insertions(+), 1 deletion(-)
> >> >
> >> > diff --git a/include/uapi/linux/mempolicy.h b/include/uapi/linux/mempolicy.h
> >> > index 1f9bb10d1a47..6c9c9385ff89 100644
> >> > --- a/include/uapi/linux/mempolicy.h
> >> > +++ b/include/uapi/linux/mempolicy.h
> >> > @@ -66,10 +66,16 @@ enum {
> >> > #define MPOL_F_MORON (1 << 4) /* Migrate On protnone Reference On Node */
> >> >
> >> > /*
> >> > + * Enabling zone reclaim means the page allocator will attempt to fulfill
> >> > + * the allocation request on the current node by triggering reclaim and
> >> > + * trying to shrink the current node.
> >> > + * Fallback allocations on the next candidates in the zonelist are considered
> >> > + * when reclaim fails to free up enough memory in the current node/zone.
> >> > + *
> >> > * These bit locations are exposed in the vm.zone_reclaim_mode sysctl
> >> > * ABI. New bits are OK, but existing bits can never change.
> >>
> >> As far as I know, sysctl isn't considered kernel ABI now. So, change
> >> this line too?
> >
> > Hi Ying,
> >
> > Thank you for reviewing this patch!
> >
> > I didn't know that sysctl isn't considered a kernel ABI. If I understand your
> > suggestion correctly, I can rephrase the comment block above to something like this?
> >
> > - * These bit locations are exposed in the vm.zone_reclaim_mode sysctl
> > - * ABI. New bits are OK, but existing bits can never change.
> > + * These bit locations are exposed in the vm.zone_reclaim_mode sysctl and
> > + * in /proc/sys/vm/zone_reclaim_mode. New bits are OK, but existing bits
> > + * can never change.
Hi Ying,
> Because it's not an ABI, I think that we could avoid saying "never".
My personal opinion is that we should keep this warning, since a developer has
already tried to remove this bit once before [1], and this broke some behavior
for userspace configurations. However, if I understand your comment correctly,
you are suggesting that we change the wording to not include "never", since
sysctls are no longer considered an ABI (and therefore it should be OK to
change what the values mean?)

If that is the case, then I can send that change in a separate patch, since I
think the goals of the two patches are a bit different. With that said, I think
we should keep the warning to avoid any breakage in userspace, even if sysctl
might not be considered an ABI anymore (I must have missed this, I didn't know
about it at all!)
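
To make the userspace concern a bit more concrete, here is a rough sketch of
how a tool might read and decode the mode today. The three bit values are the
ones defined in include/uapi/linux/mempolicy.h; the helper names
(read_zone_reclaim_mode, format_zone_reclaim_mode) are made up for
illustration and are not taken from any real tool. Anything written this way
would silently misbehave if the meaning of an existing bit changed.

```c
#include <stdio.h>

/* Bit values as defined in include/uapi/linux/mempolicy.h. */
#define RECLAIM_ZONE	(1 << 0)	/* Enable zone reclaim */
#define RECLAIM_WRITE	(1 << 1)	/* Writeout pages during reclaim */
#define RECLAIM_UNMAP	(1 << 2)	/* Unmap pages during reclaim */

/*
 * Hypothetical helper: read the current mode from a sysctl file such as
 * /proc/sys/vm/zone_reclaim_mode. Returns the mode, or -1 on failure.
 */
static int read_zone_reclaim_mode(const char *path)
{
	FILE *f = fopen(path, "r");
	int mode;

	if (!f)
		return -1;
	if (fscanf(f, "%d", &mode) != 1)
		mode = -1;
	fclose(f);
	return mode;
}

/*
 * Hypothetical helper: render which behaviors a mode value enables.
 * Returns the snprintf() result (length that was or would be written).
 */
static int format_zone_reclaim_mode(int mode, char *buf, size_t len)
{
	return snprintf(buf, len, "zone=%d write=%d unmap=%d",
			!!(mode & RECLAIM_ZONE),
			!!(mode & RECLAIM_WRITE),
			!!(mode & RECLAIM_UNMAP));
}
```

A caller would pass "/proc/sys/vm/zone_reclaim_mode" to the first helper and
feed the result to the second. Since the bit positions are hardcoded on the
userspace side, this is the kind of consumer the "existing bits can never
change" wording is meant to protect.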
> > Thanks again for your review Ying, I hope you have a good day :-)
>
> Welcome! You too!
>
> With some trivial tweak, please feel free to add my
>
> Reviewed-by: Huang Ying <ying.huang@...ux.alibaba.com>
>
> in the future version.
Thank you for your review Ying! Since there is a question remaining about what
to do with the "never" statement, I will wait to send out a v3 with your
review :-)
Have a great day!
Joshua
[1] https://lore.kernel.org/linux-mm/1579005573-58923-1-git-send-email-alex.shi@linux.alibaba.com/
Sent using hkml (https://github.com/sjp38/hackermail)