lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJD7tkYSk-73c1=5vmuRdykAQO=pJSkQFgRqkpdfnh7f-Zufkw@mail.gmail.com>
Date:   Wed, 16 Aug 2023 15:35:26 -0700
From:   Yosry Ahmed <yosryahmed@...gle.com>
To:     Tejun Heo <tj@...nel.org>
Cc:     Shakeel Butt <shakeelb@...gle.com>, Michal Hocko <mhocko@...e.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Roman Gushchin <roman.gushchin@...ux.dev>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Muchun Song <muchun.song@...ux.dev>, cgroups@...r.kernel.org,
        linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        Ivan Babrou <ivan@...udflare.com>
Subject: Re: [PATCH] mm: memcg: provide accurate stats for userspace reads

On Wed, Aug 16, 2023 at 12:08 PM Tejun Heo <tj@...nel.org> wrote:
>
> Hello,
>
> On Wed, Aug 16, 2023 at 10:11:20AM -0700, Shakeel Butt wrote:
> > These options are not white and black and there can be something in
> > between but let me be very clear on what I don't want and would NACK.
>
> I'm not a big fan of interfaces with hidden states. What you're proposing
> isn't strictly that but it's still a bit nasty. So, if we can get by without
> doing that, that'd be great.

Agreed. I will try to send patches soon implementing option (2) above,
basically removing unified flushing. I will try to document any
potential regressions that may happen and how we may fix them. Ideally
we see no regressions.

>
> > I don't want a global sleepable lock which can be taken by potentially
> > any application running on the system. We have seen similar global
> > locks causing isolation and priority inversion issues in production.
> > So, not another lock which needs to be taken under extreme condition
> > (reading stats under OOM) by a high priority task (node controller)
> > and might be held by a low priority task.
>
> Yeah, this is a real concern. Those priority inversions do occur and can be
> serious but causing serious problems under memory pressure usually requires
> involving memory allocations and IOs. Here, it's just all CPU. So, at least
> in OOM conditions, this shouldn't be in the way (the system wouldn't have
> anything else to do anyway).
>
> It is true that this still can lead to priority through CPU competition tho.
> However, that problem isn't necessarily solved by what you're suggesting
> either unless you want to restrict explicit flushing based on permissions
> which is another can of worms.

Right. Also in the case of a mutex, if we disable preemption while
holding the mutex, this makes sure that whoever holding the mutex does
not starve waiters. Essentially the difference would be that waiters
will sleep with the mutex instead of spinning, but the mutex holder
itself wouldn't sleep.

I will make this a separate patch, just in case it's too
controversial. Switching the spinlock to a mutex should not block
removing unified flushing.

>
> My preference is not exposing this in user interface. This is mostly arising
> from internal implementation details and isn't what users necessarily care
> about. There are many things we can do on the kernel side to make trade-offs
> among overhead, staleness and priority inversions. If we make this an
> explicit userland interface behavior, we get locked into that semantics
> which we'll likely regret in some future.
>

Yeah that's what I am trying to do here as well.  I will try to follow
up on this discussion with patches soon.

Thanks everyone!

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ