Date:   Thu, 24 Aug 2023 11:15:46 -0700
From:   Yosry Ahmed <yosryahmed@...gle.com>
To:     Michal Hocko <mhocko@...e.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Johannes Weiner <hannes@...xchg.org>,
        Roman Gushchin <roman.gushchin@...ux.dev>,
        Shakeel Butt <shakeelb@...gle.com>,
        Muchun Song <muchun.song@...ux.dev>,
        Ivan Babrou <ivan@...udflare.com>, Tejun Heo <tj@...nel.org>,
        linux-mm@...ck.org, cgroups@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH 3/3] mm: memcg: use non-unified stats flushing for
 userspace reads

On Thu, Aug 24, 2023 at 12:13 AM Michal Hocko <mhocko@...e.com> wrote:
>
> On Wed 23-08-23 07:55:40, Yosry Ahmed wrote:
> > On Wed, Aug 23, 2023 at 12:33 AM Michal Hocko <mhocko@...e.com> wrote:
> > >
> > > On Tue 22-08-23 08:30:05, Yosry Ahmed wrote:
> > > > On Tue, Aug 22, 2023 at 2:06 AM Michal Hocko <mhocko@...e.com> wrote:
> > > > >
> > > > > On Mon 21-08-23 20:54:58, Yosry Ahmed wrote:
> > > [...]
> > > > So to answer your question, I don't think a random user can really
> > > > affect the system in a significant way by constantly flushing. In
> > > > fact, in the test script (which I am now attaching, in case you're
> > > > interested), there are hundreds of threads reading the stats of
> > > > different cgroups every 1s, and I don't see any negative effects on
> > > > the in-kernel flushers (reclaimers) in this case.
> > >
> > > I suspect you have missed my point.
> >
> > I suspect you are right :)
> >
> >
> > > Maybe I am just misunderstanding
> > > the code, but it seems to me that the lock dropping inside
> > > cgroup_rstat_flush_locked effectively allows an unbounded number of
> > > contenders, which is really dangerous when it is triggerable from
> > > userspace. The number of spinners at any moment is bounded by the
> > > number of CPUs, but depending on timing many potential spinners might
> > > be scheduled out in cond_resched(), and the worst-case latency to
> > > complete can be really high. Does that make more sense?
> >
> > I think I understand better now. So basically, because we might drop
> > the lock and reschedule, there can be nr_cpus spinners plus other
> > waiters that are currently scheduled away; those will need to be
> > scheduled back in before they can start spinning on the lock. This
> > may happen multiple times for one reader during its read, which is
> > what can cause a high worst-case latency.
> >
> > I hope I understood you correctly this time. Did I?
>
> Yes. I would just add that this could also influence the worst-case
> latency for a different reader - so an adversarial user can stall others.

I can add that for v2 to the commit log, thanks.
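
For anyone following along, the yield pattern we are talking about
looks roughly like this (a paraphrased sketch of
cgroup_rstat_flush_locked as of this thread, not the verbatim kernel
source):

static void cgroup_rstat_flush_locked(struct cgroup *cgrp, bool may_sleep)
	__releases(&cgroup_rstat_lock) __acquires(&cgroup_rstat_lock)
{
	int cpu;

	lockdep_assert_held(&cgroup_rstat_lock);

	for_each_possible_cpu(cpu) {
		/* ... flush the per-cpu updated tree for @cpu ... */

		/* if @may_sleep, play nice and yield if necessary */
		if (may_sleep && (need_resched() ||
				  spin_needbreak(&cgroup_rstat_lock))) {
			spin_unlock_irq(&cgroup_rstat_lock);
			if (!cond_resched())
				cpu_relax();
			spin_lock_irq(&cgroup_rstat_lock);
		}
	}
}

Every time the lock is dropped in that loop, any number of sleeping
waiters can pile up behind the (at most nr_cpus) spinners, which is
where the unbounded worst-case latency comes from.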

> Exposing a shared global lock in an uncontrollable way through a
> generally available user interface is not really a great idea IMHO.

I think that's how it was always meant to be when it was designed. The
global rstat lock has always existed and has always been reachable by
userspace readers. The memory controller took a different path at some
point with unified flushing, but that was mainly because of high
concurrency from in-kernel flushers, not because userspace readers
caused a problem. Outside of memcg, the core cgroup code has always
exercised this global lock when reading cpu.stat since rstat's
introduction. I assume there haven't been any problems since it's still
there. I was hoping Tejun would confirm/deny this.
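
To be concrete about that last point: a userspace read of cpu.stat on
any non-root cgroup ends up flushing under the global
cgroup_rstat_lock, roughly like this (a simplified sketch of
cgroup_base_stat_cputime_show, omitting the root cgroup case and the
seq_file output, not the verbatim kernel source):

void cgroup_base_stat_cputime_show(struct seq_file *seq)
{
	struct cgroup *cgrp = seq_css(seq)->cgroup;
	u64 usage, utime, stime;

	if (!cgroup_parent(cgrp))
		return;	/* the root cgroup is handled differently */

	cgroup_rstat_flush_hold(cgrp);	/* takes cgroup_rstat_lock */
	usage = cgrp->bstat.cputime.sum_exec_runtime;
	cputime_adjust(&cgrp->bstat.cputime, &cgrp->prev_cputime,
		       &utime, &stime);
	cgroup_rstat_flush_release();	/* drops cgroup_rstat_lock */

	/* ... emit usage_usec/user_usec/system_usec to @seq ... */
}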
