linux-kernel - Re: [PATCH] kfence: check kfence canary in panic and reboot

Message-ID: <CANpmjNM0qeKraYviOXFO4znVE3hUdG8-0VbFbzXzWH8twtQM9w@mail.gmail.com>
Date:   Thu, 21 Apr 2022 15:28:45 +0200
From:   Marco Elver <elver@...gle.com>
To:     Alexander Potapenko <glider@...gle.com>
Cc:     Shaobo Huang <huangshaobo6@...wei.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        chenzefeng2@...wei.com, Dmitriy Vyukov <dvyukov@...gle.com>,
        kasan-dev <kasan-dev@...glegroups.com>,
        LKML <linux-kernel@...r.kernel.org>,
        Linux Memory Management List <linux-mm@...ck.org>,
        nixiaoming@...wei.com, wangbing6@...wei.com,
        wangfangpeng1@...wei.com, young.liuyang@...wei.com,
        zengweilin@...wei.com, zhongjubin@...wei.com
Subject: Re: [PATCH] kfence: check kfence canary in panic and reboot

On Thu, 21 Apr 2022 at 15:06, Alexander Potapenko <glider@...gle.com> wrote:
[...]
> This report will denote that in a system that could have been running for days a particular skbuff was corrupted by some unknown task at some unknown point in time.
> How do we figure out what exactly caused this corruption?
>
> When we deploy KFENCE at scale, it is rarely possible for the kernel developer to get access to the host that reported the bug and try to reproduce it.
> With that in mind, the report (plus the kernel source) must contain all the necessary information to address the bug, otherwise reporting it will result in wasting the developer's time.
> Moreover, if we report such bugs too often, our tool loses the credit, which is hard to regain.

I second this. In particular, we'll want this off in fuzzers etc.,
because it'll just generate reports that nobody can use to debug an
issue. I do see the value in potentially narrowing down the cause of
a panic, but that information is likely not enough to fully diagnose
the root cause. It might, however, prompt someone to re-run with
KASAN, or to check whether the DIMMs are faulty, etc.

We can still have this feature, but I suggest making it
off by default and only enabling it via a boot param. I'd call it
'kfence.check_on_panic'. For your setup, you can then enable it
where you see fit.
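
To illustrate the off-by-default behaviour suggested above, here is a
minimal userspace sketch in C. It is not KFENCE's code and not the
patch under review: the slot layout, the canary pattern, and the names
slot_alloc(), slot_check_canary() and check_slots_on_panic() are all
hypothetical, and the plain global check_on_panic only stands in for
the proposed 'kfence.check_on_panic' boot param.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SLOT_SIZE 64u

/* Hypothetical stand-in for the proposed boot param; off by default. */
bool check_on_panic = false;

/* Hypothetical guarded slot: 'used' object bytes, then canary bytes. */
struct slot {
	uint8_t bytes[SLOT_SIZE];
	size_t used;
};

/* Arbitrary position-dependent canary pattern (assumption of this sketch). */
static uint8_t canary_byte(size_t i)
{
	return (uint8_t)(0xaa ^ (i & 0x7));
}

/* Zero the object area and fill the rest of the slot with canary bytes. */
void slot_alloc(struct slot *s, size_t size)
{
	assert(size <= SLOT_SIZE);
	s->used = size;
	memset(s->bytes, 0, size);
	for (size_t i = size; i < SLOT_SIZE; i++)
		s->bytes[i] = canary_byte(i);
}

/* Return the index of the first corrupted canary byte, or -1 if intact. */
long slot_check_canary(const struct slot *s)
{
	for (size_t i = s->used; i < SLOT_SIZE; i++)
		if (s->bytes[i] != canary_byte(i))
			return (long)i;
	return -1;
}

/*
 * What a panic/reboot hook could do: scan every slot, but only when the
 * check was explicitly enabled. Returns the number of corrupted slots,
 * and always 0 while check_on_panic is false.
 */
size_t check_slots_on_panic(const struct slot *slots, size_t n)
{
	size_t corrupted = 0;

	if (!check_on_panic)
		return 0;
	for (size_t i = 0; i < n; i++)
		if (slot_check_canary(&slots[i]) >= 0)
			corrupted++;
	return corrupted;
}
```

With the flag left at its default, a corrupted canary goes unreported
at panic time, which is the behaviour wanted for fuzzers; setting the
flag makes the same scan count the damaged slot. As noted above, such
a result only narrows down the cause and does not identify the writer.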

Thanks,
-- Marco