linux-kernel - Re: [RFC PATCH 6/6] mm: Drain LRUs upon resume to userspace on nohz

Message-ID: <Zn0MittvTWJm_bIN@tiehlicka>
Date: Thu, 27 Jun 2024 08:54:02 +0200
From: Michal Hocko <mhocko@...e.com>
To: Vlastimil Babka <vbabka@...e.cz>
Cc: Frederic Weisbecker <frederic@...nel.org>,
	LKML <linux-kernel@...r.kernel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...hat.com>,
	Valentin Schneider <vschneid@...hat.com>,
	Marcelo Tosatti <mtosatti@...hat.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Oleg Nesterov <oleg@...hat.com>
Subject: Re: [RFC PATCH 6/6] mm: Drain LRUs upon resume to userspace on
 nohz_full CPUs

On Wed 26-06-24 15:16:04, Vlastimil Babka wrote:
> On 6/25/24 4:20 PM, Michal Hocko wrote:
> > On Tue 25-06-24 15:52:44, Frederic Weisbecker wrote:
> >> LRUs can be drained through several ways. One of them may add disturbances
> >> to isolated workloads while queuing a work at any time to any target,
> >> whether running in nohz_full mode or not.
> >> 
> >> Prevent from that on isolated tasks with draining LRUs upon resuming to
> >> userspace using the isolated task work framework.
> >> 
> >> It's worth noting that this is inherently racy against
> >> lru_add_drain_all() remotely queueing the per CPU drain work and
> >> therefore it prevents from the undesired disturbance only
> >> *most of the time*.
> > 
> > Can we simply not schedule flushing on remote CPUs and leave that to the
> > "return to the userspace" path?
> > 
> > I do not think we rely on LRU cache flushing for correctness purposes anywhere.
> 
> I guess drain via lru_cache_disable() should be honored, but also rare.

I do not think we can call it rare, because userspace can trigger it,
for example through NUMA syscalls. I think we should either make it fail
and let the caller decide what to do, or make it best effort and
eventually fail the operation if there is no other way. The latter has
the advantage that the failure is lazy as well. In an ideal world,
memory offlining would be a complete no-no in isolated workloads, and
mbind calls would not try to migrate memory that has just been added to
the LRU cache. In any case, this limitation would at least have to be
documented.
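
To make the "best effort, fail lazily" option concrete, here is a rough
userspace model, not kernel code. Every name in it (cpu_batch, batch_add,
drain_all_best_effort, try_migrate) is hypothetical and made up for
illustration. It assumes a remote drain simply skips isolated CPUs, and
that an operation only fails later if it actually hits a page still
sitting in a skipped CPU's batch.

```c
/* Userspace model only; none of these names are kernel APIs. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS    4
#define BATCH_SIZE 8

/* Hypothetical stand-in for a per-CPU LRU batch. */
struct cpu_batch {
	bool isolated;          /* models a nohz_full CPU */
	int pages[BATCH_SIZE];  /* ids of pages not yet moved to the LRU */
	size_t nr;
};

static struct cpu_batch batches[NR_CPUS];

/* Queue a page on a CPU's batch; returns false if the batch is full. */
static bool batch_add(int cpu, int page)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	struct cpu_batch *b = &batches[cpu];

	if (b->nr == BATCH_SIZE)
		return false;
	b->pages[b->nr++] = page;
	return true;
}

/* What a CPU would do for itself, e.g. on return to userspace. */
static void drain_local(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	batches[cpu].nr = 0;
}

/*
 * Best-effort drain: never disturb isolated CPUs.
 * Returns how many isolated CPUs were skipped with pages still batched.
 */
static int drain_all_best_effort(void)
{
	int skipped = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (batches[cpu].isolated) {
			if (batches[cpu].nr)
				skipped++;
			continue;
		}
		drain_local(cpu);
	}
	return skipped;
}

/* True if the page is still held in any CPU's batch. */
static bool page_is_batched(int page)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		for (size_t i = 0; i < batches[cpu].nr; i++)
			if (batches[cpu].pages[i] == page)
				return true;
	return false;
}

/* Lazy failure: only fail when the page really is still batched. */
static int try_migrate(int page)
{
	return page_is_batched(page) ? -EBUSY : 0;
}
```

In this model the drain itself never fails. Only the later per-page
operation does, and only for pages that are still batched on an isolated
CPU, which is the lazy behaviour described above.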
-- 
Michal Hocko
SUSE Labs