lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 28 Aug 2023 10:07:48 -0700
From:   Yosry Ahmed <yosryahmed@...gle.com>
To:     Shakeel Butt <shakeelb@...gle.com>
Cc:     Michal Hocko <mhocko@...e.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Johannes Weiner <hannes@...xchg.org>,
        Roman Gushchin <roman.gushchin@...ux.dev>,
        Muchun Song <muchun.song@...ux.dev>,
        Ivan Babrou <ivan@...udflare.com>, Tejun Heo <tj@...nel.org>,
        linux-mm@...ck.org, cgroups@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH 3/3] mm: memcg: use non-unified stats flushing for
 userspace reads

On Mon, Aug 28, 2023 at 10:00 AM Shakeel Butt <shakeelb@...gle.com> wrote:
>
> On Mon, Aug 28, 2023 at 9:15 AM Yosry Ahmed <yosryahmed@...gle.com> wrote:
> >
> [...]
> > >
> > > Well, I really have to say that I do not like the notion that reading
> > > stats is unpredictable. This just makes it really hard to use. If
> > > the precision is to be sarificed then this should be preferable over
> > > potentially high global lock contention. We already have that model in
> > > place of /proc/vmstat (configurable timeout for flusher and a way to
> > > flush explicitly). I appreciate you would like to have a better
> > > precision but as you have explored the locking is really hard to get rid
> > > of here.
> >
> > Reading the stats *is* unpredictable today. In terms of
> > accuracy/staleness and cost. Avoiding the flush entirely on the read
> > path will surely make the cost very stable and cheap, but will make
> > accuracy even less predictable.
> >
>
> On average you would get the stats at most 2 second old, so I would
> not say it is less predictable.
>
> > >
> > > So from my POV I would prefer to avoid flushing from the stats reading
> > > path and implement force flushing by writing to stat file. If the 2s
> > > flushing interval is considered to coarse I would be OK to allow setting
> > > it from userspace. This way this would be more in line with /proc/vmstat
> > > which seems to be working quite well.
> > >
> > > If this is not accaptable or deemed a wrong approach long term then it
> > > would be good to reonsider the current cgroup_rstat_lock at least.
> > > Either by turning it into mutex or by dropping the yielding code which
> > > can severly affect the worst case latency AFAIU.
> >
> > Honestly I think it's better if we do it the other way around. We make
> > flushing on the stats reading path non-unified and deterministic. That
> > model also exists and is used for cpu.stat. If we find a problem with
> > the locking being held from userspace, we can then remove flushing
> > from the read path and add interface(s) to configure the periodic
> > flusher and do a force flush.
> >
>
> Here I agree with you. Let's go with the approach which is easy to
> undo for now. Though I prefer the new explicit interface for flushing,
> that step would be very hard to undo. Let's reevaluate if the proposed
> approach shows negative impact on production traffic and I think
> Cloudflare folks can give us the results soon.

Do you prefer we also switch to using a mutex (with preemption
disabled) to avoid the scenario Michal described where flushers give
up the lock and sleep resulting in an unbounded wait time in the worst
case?

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ