Message-ID: <ZpfFaElo1wwTOpNm@localhost.localdomain>
Date: Wed, 17 Jul 2024 15:21:44 +0200
From: Frederic Weisbecker <frederic@...nel.org>
To: Michal Hocko <mhocko@...e.com>
Cc: LKML <linux-kernel@...r.kernel.org>,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>,
Valentin Schneider <vschneid@...hat.com>,
Marcelo Tosatti <mtosatti@...hat.com>,
Vlastimil Babka <vbabka@...e.cz>,
Andrew Morton <akpm@...ux-foundation.org>,
Thomas Gleixner <tglx@...utronix.de>,
Oleg Nesterov <oleg@...hat.com>
Subject: Re: [RFC PATCH 6/6] mm: Drain LRUs upon resume to userspace on nohz_full CPUs

On Thu, Jul 04, 2024 at 03:11:24PM +0200, Michal Hocko wrote:
> On Wed 03-07-24 14:52:21, Frederic Weisbecker wrote:
> > On Tue, Jun 25, 2024 at 04:20:01PM +0200, Michal Hocko wrote:
> > > On Tue 25-06-24 15:52:44, Frederic Weisbecker wrote:
> > > > LRUs can be drained in several ways. One of them may disturb isolated
> > > > workloads by queueing a work item at any time on any target CPU,
> > > > whether it is running in nohz_full mode or not.
> > > >
> > > > Prevent that on isolated tasks by draining LRUs upon resuming to
> > > > userspace, using the isolated task work framework.
> > > >
> > > > It's worth noting that this is inherently racy against
> > > > lru_add_drain_all() remotely queueing the per CPU drain work and
> > > > therefore it prevents the undesired disturbance only
> > > > *most of the time*.
> > >
> > > Can we simply not schedule flushing on remote CPUs and leave that to the
> > > "return to the userspace" path?
> >
> > Do you mean I should add a call on return to the userspace path or can
> > I expect it to be drained at some point already?
>
> I would make that particular per-CPU cache drain on return to
> userspace.
And then we need the patchset from Valentin that defers work to kernel entry?
>
> > The other limitation with that task work thing is that if the task
> > queueing the work actually goes to sleep and another task gets on the CPU
> > and does isolated work in userspace, the drain doesn't happen. Now whether
> > that is a real problem or not, I have no idea.
>
> Theoretically there is a problem because pages sitting on pcp LRU caches
> cannot be migrated and some other operations will fail as well. But
> practically speaking those pages should be mostly of interest to the
> process allocating them most of the time. Page sharing between isolated
> workloads sounds like a terrible idea to me. Maybe reality will hit us in
> this regard, but we can deal with that when we learn about those
> workloads.
>
> So I wouldn't lose too much sleep over that. We are dealing with those
> isolated workloads being broken by simple things like fork now because
> that apparently adds pages on the pcp LRU cache and draining will happen
> sooner or later (very often when the task is already running in the
> userspace).
That sounds good!
Thanks.
>
> --
> Michal Hocko
> SUSE Labs
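
To make the scheme discussed above a bit more concrete, here is a minimal
userspace sketch in C. It only models the idea: a drain-all pass that skips
isolated CPUs, plus a per-task pending drain that runs on return to
userspace. Every name in it (sim_cpu, sim_task, sim_cache_pages,
sim_resume_user, sim_drain_all) is a hypothetical stand-in invented for this
illustration; none of them is kernel API, and the real patch may well be
structured differently.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Hypothetical model of a CPU; not a kernel structure. */
struct sim_cpu {
    bool isolated;    /* stands in for a nohz_full CPU */
    int  lru_cached;  /* pages sitting in this CPU's LRU cache */
};

/* Hypothetical model of a task with one deferred work slot. */
struct sim_task {
    bool drain_pending;  /* stands in for a queued task work */
};

static struct sim_cpu cpus[NR_CPUS];

/* Kernel-side activity of `task` on `cpu` that caches pages (e.g. fork). */
static void sim_cache_pages(struct sim_task *task, int cpu, int nr)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    cpus[cpu].lru_cached += nr;
    /* On an isolated CPU, defer the drain to the task's return to userspace. */
    if (cpus[cpu].isolated)
        task->drain_pending = true;
}

/* Return-to-userspace path: run the task's pending drain, if any. */
static void sim_resume_user(struct sim_task *task, int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    if (task->drain_pending) {
        cpus[cpu].lru_cached = 0;
        task->drain_pending = false;
    }
}

/*
 * Drain-all pass that never targets isolated CPUs (assumption: this is
 * the "do not schedule flushing on remote CPUs" variant). Returns how
 * many CPUs were actually drained, i.e. how many would have been disturbed.
 */
static int sim_drain_all(void)
{
    int drained = 0;

    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        if (cpus[cpu].isolated || cpus[cpu].lru_cached == 0)
            continue;
        cpus[cpu].lru_cached = 0;
        drained++;
    }
    return drained;
}
```

The sketch also reproduces the limitation mentioned earlier: if task A caches
pages on an isolated CPU and then sleeps, a different task B resuming to
userspace on that CPU has no pending drain, so the pages stay cached until A
itself resumes.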