Message-ID: <222c1c94-bc9a-a253-f408-937d02512151@linux.vnet.ibm.com>
Date: Mon, 14 Mar 2022 14:39:07 +0530
From: Abhishek Goel <huntbag@...ux.vnet.ibm.com>
To: Dave Hansen <dave.hansen@...el.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Oscar Salvador <osalvador@...e.de>
Cc: Dave Hansen <dave.hansen@...ux.intel.com>,
"Huang, Ying" <ying.huang@...el.com>,
Baolin Wang <baolin.wang@...ux.alibaba.com>,
linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] mm: Only re-generate demotion targets when a numa node
changes its N_CPU state
On 11/03/22 22:40, Dave Hansen wrote:
> On 3/10/22 18:39, Andrew Morton wrote:
>> On Thu, 10 Mar 2022 13:07:49 +0100 Oscar Salvador <osalvador@...e.de> wrote:
>>> We do already have two CPU callbacks (vmstat_cpu_online() and vmstat_cpu_dead())
>>> that check exactly that, so get rid of the CPU callbacks in
>>> migrate_on_reclaim_init() and only call set_migration_target_nodes() from
>>> vmstat_cpu_{dead,online}() whenever a numa node changes its N_CPU state.
>> What I'm not getting here (as so often happens) is a sense of how badly
>> this affects our users. Does anyone actually hotplug frequently enough
>> to care?
> I asked Abhishek about this a bit here:
>
>> https://lore.kernel.org/all/4e8067e1-0574-c9d2-9d6c-d676d32071bd@linux.vnet.ibm.com/
> It sounded to me like there are ppc users who convert their systems from
> SMT=1 to SMT=8. I'd guess that they want to do this as a side-channel
> mitigation because ppc has been dealing with the same basic issues as
> those of us over in x86 land. The increase in time (20s->36s) would be
> noticeable and probably slightly annoying to a human waiting on it.
>
> I'd love to hear more details on this from Abhishek, like whether end
> users do this as opposed to IBM's kernel developers. But, it does sound
> deserving of a stable@ tag to me.
Yes, end users do this as well. Users of large systems in particular
may want to switch between SMT=1, SMT=4 and SMT=8.
This also applies to dynamic LPAR operations.
As Dave pointed out, the increase in time is manageable and only just
noticeable on smaller systems, but it becomes very clearly observable
as systems get larger.