Message-ID: <20210802113326.GA78980@shbuild999.sh.intel.com>
Date: Mon, 2 Aug 2021 19:33:26 +0800
From: Feng Tang <feng.tang@...el.com>
To: Michal Hocko <mhocko@...e.com>
Cc: "linux-mm@...ck.org" <linux-mm@...ck.org>,
Andrew Morton <akpm@...ux-foundation.org>,
David Rientjes <rientjes@...gle.com>,
"Hansen, Dave" <dave.hansen@...el.com>,
"Widawsky, Ben" <ben.widawsky@...el.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-api@...r.kernel.org" <linux-api@...r.kernel.org>,
Andrea Arcangeli <aarcange@...hat.com>,
Mel Gorman <mgorman@...hsingularity.net>,
Mike Kravetz <mike.kravetz@...cle.com>,
Randy Dunlap <rdunlap@...radead.org>,
Vlastimil Babka <vbabka@...e.cz>,
Andi Kleen <ak@...ux.intel.com>,
"Williams, Dan J" <dan.j.williams@...el.com>,
"Huang, Ying" <ying.huang@...el.com>,
Dave Hansen <dave.hansen@...ux.intel.com>
Subject: Re: [PATCH v6 1/6] mm/mempolicy: Add MPOL_PREFERRED_MANY for
multiple preferred nodes
On Mon, Aug 02, 2021 at 01:14:29PM +0200, Michal Hocko wrote:
> On Mon 02-08-21 16:11:30, Feng Tang wrote:
> > On Fri, Jul 30, 2021 at 03:18:40PM +0800, Tang, Feng wrote:
> > [snip]
> > > > > One thing is, it's possible that 'nd' is not set in the preferred
> > > > > nodemask.
> > > >
> > > > Yes, and there shouldn't be any problem with that. The given node is
> > > > only used to get the respective zonelist (a distance-ordered list of
> > > > zones to try). get_page_from_freelist will then use the preferred node
> > > > mask to filter this zone list. Is that more clear now?
> > >
> > > Yes, from the code, the policy_node() is always coupled with
> > > policy_nodemask(), which secures the 'nodemask' limit. Thanks for
> > > the clarification!
> >
> > Hi Michal,
> >
> > To ensure the nodemask limit, the policy_nodemask() also needs some
> > change to return the nodemask for 'prefer-many' policy, so here is an
> > updated 1/6 patch, which mainly changes the node/nodemask selection
> > for 'prefer-many' policy. Could you review it? Thanks!
>
> right, I have mixed it with get_policy_nodemask
>
> > @@ -1875,8 +1897,13 @@ static int apply_policy_zone(struct mempolicy *policy, enum zone_type zone)
> >   */
> >  nodemask_t *policy_nodemask(gfp_t gfp, struct mempolicy *policy)
> >  {
> > -	/* Lower zones don't get a nodemask applied for MPOL_BIND */
> > -	if (unlikely(policy->mode == MPOL_BIND) &&
> > +	int mode = policy->mode;
> > +
> > +	/*
> > +	 * Lower zones don't get a nodemask applied for 'bind' and
> > +	 * 'prefer-many' policies
> > +	 */
> > +	if (unlikely(mode == MPOL_BIND || mode == MPOL_PREFERRED_MANY) &&
> >  			apply_policy_zone(policy, gfp_zone(gfp)) &&
> >  			cpuset_nodemask_valid_mems_allowed(&policy->nodes))
> >  		return &policy->nodes;
>
> Isn't this just too cryptic? Why didn't you simply
> 	if (mode == MPOL_PREFERRED_MANY)
> 		return &policy->nodes;
>
> in addition to the existing code? I mean why would you even care about
> cpusets? Those are handled at the page allocator layer and will further
> filter the given nodemask.

OK, I will follow your suggestion and keep the 'bind' handling unchanged.

To be honest, I don't fully understand the current handling of the
'bind' policy: could returning NULL for a 'bind' policy open a way
around the strict 'bind' limit?
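
For reference, here is a rough user-space sketch of how I read the
suggestion. It is not the kernel code: the enum, nodemask_t and
struct mempolicy below are simplified stand-ins, and the two bool
parameters are hypothetical placeholders for the results of
apply_policy_zone() and cpuset_nodemask_valid_mems_allowed().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types; not the real definitions. */
enum mpol_mode { MPOL_DEFAULT, MPOL_PREFERRED, MPOL_BIND, MPOL_PREFERRED_MANY };

typedef struct { unsigned long bits; } nodemask_t;

struct mempolicy {
	enum mpol_mode mode;
	nodemask_t nodes;
};

/*
 * zone_ok and cpuset_ok are hypothetical placeholders for
 * apply_policy_zone() and cpuset_nodemask_valid_mems_allowed().
 */
static nodemask_t *policy_nodemask_sketch(struct mempolicy *policy,
					  bool zone_ok, bool cpuset_ok)
{
	/* Existing 'bind' handling, left unchanged. */
	if (policy->mode == MPOL_BIND && zone_ok && cpuset_ok)
		return &policy->nodes;

	/* 'prefer-many' returns its nodemask without the extra checks. */
	if (policy->mode == MPOL_PREFERRED_MANY)
		return &policy->nodes;

	/* Every other case: no nodemask restriction from the policy. */
	return NULL;
}

/* Small self-check of the three interesting cases. */
static void check_sketch(void)
{
	struct mempolicy many = { .mode = MPOL_PREFERRED_MANY, .nodes = { 0x5UL } };
	struct mempolicy bind = { .mode = MPOL_BIND, .nodes = { 0x3UL } };

	assert(policy_nodemask_sketch(&many, false, false) == &many.nodes);
	assert(policy_nodemask_sketch(&bind, true, true) == &bind.nodes);
	assert(policy_nodemask_sketch(&bind, false, true) == NULL);
}
```

The last assertion is the case I am asking about: in this sketch a
'bind' policy gets NULL back whenever the zone check fails.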
Thanks,
Feng
> --
> Michal Hocko
> SUSE Labs