linux-kernel - Re: [PATCH v1 0/3] Avoid scheduling cache draining to isolated cpus

Message-ID: <Y2TQLavnLVd4qHMT@dhcp22.suse.cz>
Date:   Fri, 4 Nov 2022 09:41:17 +0100
From:   Michal Hocko <mhocko@...e.com>
To:     Leonardo Brás <leobras@...hat.com>
Cc:     Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
        Daniel Bristot de Oliveira <bristot@...hat.com>,
        Valentin Schneider <vschneid@...hat.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Roman Gushchin <roman.gushchin@...ux.dev>,
        Shakeel Butt <shakeelb@...gle.com>,
        Muchun Song <songmuchun@...edance.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Frederic Weisbecker <frederic@...nel.org>,
        Phil Auld <pauld@...hat.com>,
        Marcelo Tosatti <mtosatti@...hat.com>,
        linux-kernel@...r.kernel.org, cgroups@...r.kernel.org,
        linux-mm@...ck.org
Subject: Re: [PATCH v1 0/3] Avoid scheduling cache draining to isolated cpus

On Thu 03-11-22 13:53:41, Leonardo Brás wrote:
> On Thu, 2022-11-03 at 16:31 +0100, Michal Hocko wrote:
> > On Thu 03-11-22 11:59:20, Leonardo Brás wrote:
[...]
> > > I understand there will be a locking cost being paid in the isolated CPUs when:
> > > a) The isolated CPU is requesting the stock drain,
> > > b) When the isolated CPUs do a syscall and end up using the protected structure
> > > the first time after a remote drain.
> > 
> > And anytime the charging path (consume_stock resp. refill_stock)
> > contends with the remote draining, which is out of the control of the RT
> > task. It is true that the RT kernel will turn that spin lock into a
> > sleeping RT lock and that could help with potential priority inversions,
> > but it is still quite a costly thing, I would expect.
> > 
> > > Both (a) and (b) should happen during a syscall, and IIUC an RT workload
> > > should not expect syscalls to have a predictable time, so it should be
> > > fine.
> > 
> > Now I am not sure I understand. If you do not consider the charging path to
> > be RT sensitive then why is this needed in the first place? What else
> > would be populating the pcp cache on the isolated cpu? IRQs?
> 
> I am mostly trying to deal with drain_all_stock() calling schedule_work_on() at
> isolated_cpus. Since the scheduled drain_local_stock() will be competing for cpu
> time with the RT workload, we can have preemption of the RT workload, which is a
> problem for meeting the deadlines.

Yes, this is understood. But it is not really clear to me why any
draining would be necessary for such an isolated CPU if no workload other
than the RT one (which presumably doesn't charge any memory?) is running
on that CPU. Is it the RT task during its initialization phase that leaves
that cache behind, or something else? Sorry for being so focused on this,
but I would like to understand whether this is avoidable by a different
startup scheme or whether it really needs to be addressed in some way.

> One way I thought to solve that was introducing a remote drain, which would
> require a different strategy for locking, since not all accesses to the pcp
> caches would happen on a local CPU. 

Yeah, I am not super happy about an additional spin lock TBH. One
potential way to go would be to completely avoid the pcp cache for
isolated CPUs. That would have some performance impact of course, but on
the other hand it would give more predictable behavior for those CPUs,
which sounds like a reasonable compromise to me. What do you think?
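
To make the idea more concrete, here is a rough user-space sketch, not
kernel code. All toy_* names, cpu_isolated[] and TOY_BATCH are hypothetical
stand-ins made up for illustration, and the model ignores that the real
stock is tied to one memcg at a time. The point is only that a CPU which
never caches pre-charged pages never has anything to drain:

```c
#include <stdbool.h>

#define NR_TOY_CPUS 4
#define TOY_BATCH   32U  /* hypothetical batch size for a cache refill */

/* Toy stand-in for the per-CPU stock of pre-charged pages. */
struct toy_stock {
	unsigned int nr_pages;
};

static struct toy_stock stock[NR_TOY_CPUS];
static bool cpu_isolated[NR_TOY_CPUS];  /* hypothetical isolation mask */
static unsigned long shared_counter;    /* stands in for the shared page counter */

/* Try to satisfy a charge from the local cache; isolated CPUs never use it. */
static bool toy_consume_stock(int cpu, unsigned int nr_pages)
{
	if (cpu_isolated[cpu])
		return false;
	if (stock[cpu].nr_pages < nr_pages)
		return false;
	stock[cpu].nr_pages -= nr_pages;
	return true;
}

/*
 * Charge nr_pages on behalf of @cpu. On a cache miss a housekeeping CPU
 * charges a whole batch and keeps the surplus locally; an isolated CPU
 * charges exactly what it needs and caches nothing.
 */
static void toy_charge(int cpu, unsigned int nr_pages)
{
	unsigned int batch = nr_pages;

	if (toy_consume_stock(cpu, nr_pages))
		return;
	if (!cpu_isolated[cpu] && batch < TOY_BATCH)
		batch = TOY_BATCH;
	shared_counter += batch;
	stock[cpu].nr_pages += batch - nr_pages;
}

/*
 * Return all cached pages to the shared counter. The return value is the
 * number of CPUs that held a non-empty cache, i.e. the ones that would
 * have needed drain work scheduled on them.
 */
static int toy_drain_all(void)
{
	int cpu, scheduled = 0;

	for (cpu = 0; cpu < NR_TOY_CPUS; cpu++) {
		if (!stock[cpu].nr_pages)
			continue;
		shared_counter -= stock[cpu].nr_pages;
		stock[cpu].nr_pages = 0;
		scheduled++;
	}
	return scheduled;
}
```

The cost shows up in toy_charge(): every charge from an isolated CPU goes
straight to the shared counter, which is the performance impact mentioned
above.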
-- 
Michal Hocko
SUSE Labs