lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 3 Mar 2022 10:30:30 +0100
From:   Marco Elver <elver@...gle.com>
To:     Alexander Potapenko <glider@...gle.com>
Cc:     Tianchen Ding <dtcccc@...ux.alibaba.com>,
        Dmitry Vyukov <dvyukov@...gle.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        kasan-dev <kasan-dev@...glegroups.com>,
        Linux Memory Management List <linux-mm@...ck.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH 0/2] Alloc kfence_pool after system startup

On Thu, 3 Mar 2022 at 10:05, Alexander Potapenko <glider@...gle.com> wrote:

I share Alex's concerns.

> On Thu, Mar 3, 2022 at 4:15 AM Tianchen Ding <dtcccc@...ux.alibaba.com> wrote:
>>
>> KFENCE aims at production environments, but it does not allow enabling
>> after system startup because kfence_pool only alloc pages from memblock.
>> Consider the following production scene:
>> At first, for performance considerations, production machines do not
>> enable KFENCE.
>
> What are the performance considerations you have in mind? Are you running KFENCE with a very aggressive sampling rate?

Indeed, what is wrong with simply starting up KFENCE with a sample
interval of 10000? However, I very much doubt that you'll notice any
performance issues above 500ms.

Do let us know what performance issues you have seen. It may be
related to an earlier version of KFENCE but has since been fixed (see
log).

>> However, after running for a while, the kernel is suspected to have
>> memory errors. (e.g., a sibling machine crashed.)
>
> I have doubts regarding this setup. It might be faster (although one can tune KFENCE to have nearly zero performance impact), but is harder to maintain.
> It will also catch fewer errors than if you just had KFENCE on from the very beginning:
>  - sibling machines may behave differently, and a certain bug may only occur once - in that case the secondary instances won't notice it, even with KFENCE;
>  - KFENCE also catches non-lethal corruptions (e.g. OOB reads), which may stay under radar for a very long time.
>
>>
>> So other production machines need to enable KFENCE, but it's hard for
>> them to reboot.
>>
>> The 1st patch allows re-enabling KFENCE if the pool is already
>> allocated from memblock.

Patch 1/2 might be ok by itself, but I still don't see the point
because you should just leave KFENCE enabled. There should be no
reason to have to turn it off. If anything, you can increase the
sample interval to something very large if needed.

Powered by blists - more mailing lists