Message-ID: <CAFbkSA3jdtDrWz9-i2ZED5k8uBx6nwrikSO6x22qGeWqj8bgHg@mail.gmail.com>
Date:   Sat, 16 Apr 2022 11:32:08 -0500
From:   Justin Forbes <jforbes@...oraproject.org>
To:     Andrew Morton <akpm@...ux-foundation.org>
Cc:     Yu Zhao <yuzhao@...gle.com>, Stephen Rothwell <sfr@...hwell.id.au>,
        Linux-MM <linux-mm@...ck.org>, Andi Kleen <ak@...ux.intel.com>,
        Aneesh Kumar <aneesh.kumar@...ux.ibm.com>,
        Barry Song <21cnbao@...il.com>,
        Catalin Marinas <catalin.marinas@....com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Hillf Danton <hdanton@...a.com>, Jens Axboe <axboe@...nel.dk>,
        Jesse Barnes <jsbarnes@...gle.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Jonathan Corbet <corbet@....net>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Matthew Wilcox <willy@...radead.org>,
        Mel Gorman <mgorman@...e.de>,
        Michael Larabel <Michael@...haellarabel.com>,
        Michal Hocko <mhocko@...nel.org>,
        Mike Rapoport <rppt@...nel.org>,
        Rik van Riel <riel@...riel.com>,
        Vlastimil Babka <vbabka@...e.cz>,
        Will Deacon <will@...nel.org>,
        Ying Huang <ying.huang@...el.com>,
        Linux ARM <linux-arm-kernel@...ts.infradead.org>,
        "open list:DOCUMENTATION" <linux-doc@...r.kernel.org>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        Kernel Page Reclaim v2 <page-reclaim@...gle.com>,
        "the arch/x86 maintainers" <x86@...nel.org>,
        Brian Geffon <bgeffon@...gle.com>,
        Jan Alexander Steffens <heftig@...hlinux.org>,
        Oleksandr Natalenko <oleksandr@...alenko.name>,
        Steven Barrett <steven@...uorix.net>,
        Suleiman Souhlal <suleiman@...gle.com>,
        Daniel Byrne <djbyrne@....edu>,
        Donald Carr <d@...os-reins.com>,
        Holger Hoffstätte <holger@...lied-asynchrony.com>,
        Konstantin Kharlamov <Hi-Angel@...dex.ru>,
        Shuang Zhai <szhai2@...rochester.edu>,
        Sofia Trinh <sofia.trinh@....works>,
        Vaibhav Jain <vaibhav@...ux.ibm.com>
Subject: Re: [PATCH v10 08/14] mm: multi-gen LRU: support page table walks

On Fri, Apr 15, 2022 at 4:33 PM Andrew Morton <akpm@...ux-foundation.org> wrote:
>
> On Fri, 15 Apr 2022 14:11:32 -0600 Yu Zhao <yuzhao@...gle.com> wrote:
>
> > >
> > > I grabbed
> > > https://kojipkgs.fedoraproject.org//packages/kernel/5.18.0/0.rc2.23.fc37/src/kernel-5.18.0-0.rc2.23.fc37.src.rpm
> > > and
> >
> > Yes, Fedora/RHEL is one concrete example of the model I mentioned
> > above (experimental/stable). I added Justin, the Fedora kernel
> > maintainer, and he can further clarify.

We more or less split into 3 scenarios. In rawhide we run a standard
Fedora config for rcX releases and .0, but git snapshots are built with
debug configs only. The trade-off is that we can't turn on certain
options which kill performance, but we do get more users running these
kernels, which exposes real bugs.  The rawhide kernel follows Linus'
tree and is rebuilt most weekdays.  Stable Fedora is not a full debug
config, but in cases where we can keep a debug feature on without it
getting much in the way of performance, as is the case with
CONFIG_DEBUG_VM, I think there is value in keeping those on, until
there is not.  And of course RHEL is a much more conservative config,
and a much more conservative rebase/backport codebase.

> > If we don't want more VM_BUG_ONs, I'll remove them. But (let me
> > reiterate) it seems to me that just defeats the purpose of having
> > CONFIG_DEBUG_VM.
> >
>
> Well, I feel your pain.  It was never expected that VM_BUG_ON() would
> get subverted in this fashion.

Fedora is not trying to subvert anything.  If keeping the option on
becomes problematic, we can simply turn it off.  Fedora certainly has
a more diverse installed base than typical enterprise distributions,
and much more diverse than most QA pools, both in the array of
hardware and in the usage patterns, so things do get uncovered that
would not be seen otherwise.

> We could create a new MM-developer-only assertion.  Might even call it
> MM_BUG_ON().  With compile-time enablement but perhaps not a runtime
> switch.
>
> With nice simple semantics, please.  Like "it returns void" and "if you
> pass an expression with side-effects then you lose".  And "if you send
> a patch which produces warnings when CONFIG_MM_BUG_ON=n then you get to
> switch to windows95 for a month".
>
> Let's leave the mglru assertions in place for now and let's think about
> creating something more suitable, with a view to switching mglru over
> to that at a later time.
>
>
>
> But really, none of this addresses the core problem: *_BUG_ON() often
> kills the kernel.  So guess what we just did?  We killed the user's
> kernel at the exact time when we least wished to do so: when they have
> a bug to report to us.  So the thing is self-defeating.
>
> It's much much better to WARN and to attempt to continue.  This makes
> it much more likely that we'll get to hear about the kernel flaw.

I agree very much with this. We hear about warnings from users; they
don't go unnoticed, and several of these users are willing to spend
time to help get to the bottom of an issue. They may not know the
code, but plenty are willing to test various patches or scenarios.
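
For illustration only, here is a minimal userspace sketch of the kind of
assertion discussed above: enabled at compile time, usable only as a void
statement, and warning instead of killing anything. Nothing here is
existing kernel code. The names MM_BUG_ON and CONFIG_MM_BUG_ON come from
the suggestion quoted above; mm_assert_failures and check_refcount are
made up for the example, and fprintf() stands in for a kernel WARN.

```c
#include <stdio.h>

/* Hypothetical switch standing in for a CONFIG_MM_BUG_ON Kconfig option. */
#ifndef CONFIG_MM_BUG_ON
#define CONFIG_MM_BUG_ON 1
#endif

/* Hypothetical counter of failed assertions, so callers can keep running. */
unsigned int mm_assert_failures;

#if CONFIG_MM_BUG_ON
/* Enabled: warn and continue. do/while (0) makes this a void statement. */
#define MM_BUG_ON(cond)                                              \
	do {                                                         \
		if (cond) {                                          \
			mm_assert_failures++;                        \
			fprintf(stderr, "MM_BUG_ON(%s) at %s:%d\n",  \
				#cond, __FILE__, __LINE__);          \
		}                                                    \
	} while (0)
#else
/* Disabled: sizeof type-checks the expression but never evaluates it. */
#define MM_BUG_ON(cond) ((void)sizeof((long)(cond)))
#endif

/* Made-up caller: reports a negative count, then carries on with 0. */
int check_refcount(int refcount)
{
	MM_BUG_ON(refcount < 0);
	return refcount < 0 ? 0 : refcount;
}
```

With the option off, the condition is still compiled, so a variable used
only inside the assertion does not trigger an unused warning, but any side
effects in the expression never run. That is the "then you lose" case.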

Justin
