linux-kernel - Re: [memcg] 0f12156dff: will-it-scale.per_process


Message-ID: <YTedOVCV3s6Z210f@carbon.dhcp.thefacebook.com>
Date:   Tue, 7 Sep 2021 10:11:21 -0700
From:   Roman Gushchin <guro@...com>
To:     Jens Axboe <axboe@...nel.dk>
CC:     Linus Torvalds <torvalds@...ux-foundation.org>,
        kernel test robot <oliver.sang@...el.com>,
        Vasily Averin <vvs@...tuozzo.com>,
        Shakeel Butt <shakeelb@...gle.com>,
        Alexander Viro <viro@...iv.linux.org.uk>,
        Alexey Dobriyan <adobriyan@...il.com>,
        Andrei Vagin <avagin@...il.com>,
        Borislav Petkov <bp@...en8.de>, Borislav Petkov <bp@...e.de>,
        Christian Brauner <christian.brauner@...ntu.com>,
        Dmitry Safonov <0x7f454c46@...il.com>,
        "Eric W. Biederman" <ebiederm@...ssion.com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        "H. Peter Anvin" <hpa@...or.com>, Ingo Molnar <mingo@...hat.com>,
        "J. Bruce Fields" <bfields@...ldses.org>,
        Jeff Layton <jlayton@...nel.org>,
        Jiri Slaby <jirislaby@...nel.org>,
        Johannes Weiner <hannes@...xchg.org>,
        Kirill Tkhai <ktkhai@...tuozzo.com>,
        Michal Hocko <mhocko@...nel.org>,
        Oleg Nesterov <oleg@...hat.com>,
        Serge Hallyn <serge@...lyn.com>, Tejun Heo <tj@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Vladimir Davydov <vdavydov.dev@...il.com>,
        Yutian Yang <nglaive@...il.com>,
        Zefan Li <lizefan.x@...edance.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        LKML <linux-kernel@...r.kernel.org>, <lkp@...ts.01.org>,
        kernel test robot <lkp@...el.com>,
        "Huang, Ying" <ying.huang@...el.com>,
        Feng Tang <feng.tang@...el.com>,
        Zhengjun Xing <zhengjun.xing@...ux.intel.com>
Subject: Re: [memcg] 0f12156dff: will-it-scale.per_process_ops -33.6%
 regression

On Tue, Sep 07, 2021 at 10:43:46AM -0600, Jens Axboe wrote:
> On 9/7/21 10:39 AM, Linus Torvalds wrote:
> > On Tue, Sep 7, 2021 at 8:46 AM Jens Axboe <axboe@...nel.dk> wrote:
> >>
> >> Are we at all worried about these? There's been a number of them
> >> reported, basically for all the accounting enablements that have been
> >> done in this merge window.
> > 
> > We are worried about them. I'm considering reverting several of them
> > because I think the problems are
> > 
> >  (a) big
> > 
> >  (b) nontrivial
> > 
> > and the patches clearly weren't ready and people weren't aware of this issue.
> 
> I think that is prudent. When I first enabled it for io_uring it was a
> bit of a shit show in terms of performance degradations, and some work
> had to be done before it could get enabled in a sane fashion.
> 
> The accounting needs to be more efficient if we're seeing 30-50%
> slowdowns simply by enabling it on a kmem cache.

There are two polar cases:
1) a large number of relatively short-lived allocations whose lifetime is well
   bounded (e.g. by the lifetime of a task),
2) a relatively small number of long-lived allocations whose lifetime
   is potentially indefinite (e.g. struct mount).

We can't use the same approach for both cases, otherwise we'll run into either
performance or garbage collection problems (which also lead to performance
problems, but delayed).

I'm thinking of maybe building a generic cache layer for accounted allocations,
which can be used in cases like io_uring. Shakeel, what's your plan here?
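
To make the cache-layer idea a bit more concrete, here is a rough userspace
sketch of one possible reading of it. This is not kernel code and not an
existing API: every name below (shared_counter, charge_cache, cache_charge,
cache_uncharge, cache_drain, CHARGE_BATCH) is hypothetical, and the batch size
is an arbitrary assumption. The point is only that charging in batches keeps
most allocations off the shared counter, while the drain step keeps cached
credit from pinning the charge indefinitely.

```c
/* Userspace sketch only: hypothetical names, not kernel code. */
#include <assert.h>
#include <stddef.h>

#define CHARGE_BATCH 32	/* assumed batch size, in objects */

/* Stands in for the shared, contended per-cgroup charge counter. */
struct shared_counter {
	long charged;	/* objects currently charged */
	long slow_ops;	/* number of times the shared counter was touched */
};

/* Local cache of pre-charged credit, so the fast path stays local. */
struct charge_cache {
	struct shared_counter *shared;
	long credit;	/* pre-charged objects not yet handed out */
};

/* Account one allocation; touch the shared counter only when out of credit. */
static void cache_charge(struct charge_cache *c)
{
	if (c->credit == 0) {
		/* Slow path: charge a whole batch at once. */
		c->shared->charged += CHARGE_BATCH;
		c->shared->slow_ops++;
		c->credit = CHARGE_BATCH;
	}
	c->credit--;
}

/* Account one free; hand surplus credit back once it grows too large. */
static void cache_uncharge(struct charge_cache *c)
{
	c->credit++;
	if (c->credit > 2 * CHARGE_BATCH) {
		c->shared->charged -= CHARGE_BATCH;
		c->shared->slow_ops++;
		c->credit -= CHARGE_BATCH;
	}
}

/* Return all cached credit, e.g. when the owning context goes away. */
static void cache_drain(struct charge_cache *c)
{
	c->shared->charged -= c->credit;
	c->shared->slow_ops++;
	c->credit = 0;
}
```

With a tight charge/uncharge loop, which is roughly case 1) above, only the
first charge reaches the shared counter in this sketch. Long-lived objects
like case 2) would gain little from such a cache.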

For now, I agree that reverting the patches causing a significant regression is
the best way forward.