linux-kernel - Re: [PATCH] psi: reduce min window size to 50ms

Message-ID: <CAJuCfpFNdhcVN66_j9J1s4GLjQ99t5mkQsWgOwmTPpzX9Ss=_g@mail.gmail.com>
Date:   Mon, 27 Feb 2023 11:51:26 -0800
From:   Suren Baghdasaryan <surenb@...gle.com>
To:     Josh Hunt <johunt@...mai.com>
Cc:     Michal Hocko <mhocko@...e.com>,
        Sudarshan Rajagopalan <quic_sudaraja@...cinc.com>,
        David Hildenbrand <david@...hat.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Mike Rapoport <rppt@...nel.org>,
        Oscar Salvador <osalvador@...e.de>,
        Anshuman Khandual <anshuman.khandual@....com>,
        mark.rutland@....com, will@...nel.org,
        virtualization@...ts.linux-foundation.org, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
        linux-arm-msm@...r.kernel.org,
        Trilok Soni <quic_tsoni@...cinc.com>,
        Sukadev Bhattiprolu <quic_sukadev@...cinc.com>,
        Srivatsa Vaddagiri <quic_svaddagi@...cinc.com>,
        Patrick Daly <quic_pdaly@...cinc.com>
Subject: Re: [PATCH] psi: reduce min window size to 50ms

On Mon, Feb 27, 2023 at 11:19 AM Josh Hunt <johunt@...mai.com> wrote:
>
>
>
> On 2/27/23 9:49 AM, Suren Baghdasaryan wrote:
> > On Mon, Feb 27, 2023 at 5:34 AM Michal Hocko <mhocko@...e.com> wrote:
> >>
> >> On Fri 24-02-23 13:07:57, Suren Baghdasaryan wrote:
> >>> On Fri, Feb 24, 2023 at 4:47 AM Michal Hocko <mhocko@...e.com> wrote:
> >>>>
> >>>> On Tue 14-02-23 11:34:30, Suren Baghdasaryan wrote:
> >>>> [...]
> >>>>> Your suggestion to have this limit configurable sounds like an obvious
> >>>>> solution. I would like to get some opinions from other maintainers.
> >>>>> Johannes, WDYT? CC'ing Michal to chime in as well since this is mostly
> >>>>> related to memory stalls.
> >>>>
> >>>> I do not think that making this configurable helps much. Many users will
> >>>> be bound to distribution config and also it would be hard to experiment
> >>>> with a recompile cycle every time. This seems just too impractical.
> >>>>
> >>>> Is there any reason why we shouldn't allow any timeout? Shorter
> >>>> timeouts could be restricted to a privileged context to avoid an easy
> >>>> way to swamp the system by too frequent polling.
> >>>
> >>> Hmm, ok. Maybe then we just ensure that only privileged users can set
> >>> triggers and remove the min limit (use a >0 check)?
> >>
> >> This could break existing userspace which is not privileged. I would
> >> just go with CAP_SYS_NICE or similar with small (sub min) timeouts.
> >
> > Yeah, that's what I meant. /proc/pressure/* files already check for
> > CAP_SYS_RESOURCE
> > (https://elixir.bootlin.com/linux/latest/source/kernel/sched/psi.c#L1440)
> > but per-cgroup pressure files do not have this check. I think the
> > original patch which added this check
> > (https://lore.kernel.org/all/20210402025833.27599-1-johunt@akamai.com/)
> > missed the cgroup ones. This should be easy to add but I wonder if
> > that was left that way intentionally.
> >
> > CC'ing the author. Josh, Johannes, is that inconsistency between system
> > pressure files and cgroup-specific ones intentional? Can we change
> > them all to check for CAP_SYS_RESOURCE?
>
> No, this was just an oversight in the original patch at least from my
> end, and did not come up during code review. Fine with me to change them
> all to use CAP_SYS_RESOURCE.

Thanks for the confirmation! Will get this fixed.
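
For reference, the policy discussed above (reject a zero window, and
allow windows below the current minimum only from a privileged context)
could look roughly like the userspace sketch below. This is not kernel
code: sketch_check_window() is a hypothetical helper, the two limits
are assumed values rather than ones quoted in this thread, and the
"privileged" flag stands in for a capability check such as
CAP_SYS_RESOURCE.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed limits in microseconds; the real ones live in kernel/sched/psi.c. */
#define SKETCH_WINDOW_MIN_US   500000u	/* assumed floor for unprivileged users */
#define SKETCH_WINDOW_MAX_US 10000000u	/* assumed ceiling for everyone */

/*
 * Hypothetical helper, not a kernel function.
 * Returns 0 if the window is acceptable, -EINVAL if it is out of range,
 * and -EPERM if it is below the minimum and the caller is not privileged.
 */
static int sketch_check_window(uint32_t window_us, bool privileged)
{
	/* The ">0 check": a zero window is never valid. */
	if (window_us == 0 || window_us > SKETCH_WINDOW_MAX_US)
		return -EINVAL;

	/* Sub-minimum windows are reserved for privileged callers. */
	if (window_us < SKETCH_WINDOW_MIN_US && !privileged)
		return -EPERM;

	return 0;
}
```

With these assumed limits, a 50ms window (50000us) returns -EPERM for an
unprivileged caller and 0 for a privileged one, while existing
unprivileged users of windows at or above the minimum are unaffected.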

>
> Josh
>
> >
> >>
> >>>> Btw. it seems that there is only a limit on a single trigger per fd
> >>>> but no limits per user so it doesn't sound too hard to end up with too
> >>>> much polling even with larger timeouts. To me it seems like we need to
> >>>> contain the polling thread to be bound by the cpu controller.
> >>>
> >>> Hmm. We have one "psimon" thread per cgroup (+1 system-level one) and
> >>> poll_min_period for each thread is chosen as the min() of polling
> >>> periods between triggers created in that group. So, a bad trigger that
> >>> causes overly aggressive polling, and thus throttling of the polling thread,
> >>> might affect other triggers in that cgroup.
> >>
> >> Yes, and why would that be a problem?
> >
> > If unprivileged processes are allowed to add new triggers then a
> > malicious process can add a bad trigger and affect other legit
> > processes. That sounds like a problem to me.
> > Thanks,
> > Suren.
> >
> >> --
> >> Michal Hocko
> >> SUSE Labs
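
To illustrate the point quoted above about one polling thread per cgroup
using the min() of its triggers' polling periods, here is a small
userspace sketch. It is not the kernel implementation: struct
sketch_trigger, its poll_period_us field and sketch_poll_min_period()
are hypothetical names, and each trigger's polling period is simply
taken as given.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a trigger; only its polling period matters here. */
struct sketch_trigger {
	uint32_t poll_period_us;
};

/*
 * Return the shortest polling period among the n triggers of one group.
 * With no triggers there is nothing to poll, so UINT32_MAX is returned.
 */
static uint32_t sketch_poll_min_period(const struct sketch_trigger *t, size_t n)
{
	uint32_t min_period = UINT32_MAX;

	for (size_t i = 0; i < n; i++) {
		if (t[i].poll_period_us < min_period)
			min_period = t[i].poll_period_us;
	}
	return min_period;
}
```

If a group holds triggers with periods of 1000000us and 200000us, the
shared thread polls every 200000us. Adding a single trigger with a
5000us period drops that to 5000us for every trigger in the group, which
is how one bad trigger can affect the others.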