Message-ID: <CAAeHK+xEQ2krRDrPPFmOvp-pR+jR179VDg1iwd+mB0hVZ9rsgg@mail.gmail.com>
Date: Thu, 22 Oct 2020 19:00:43 +0200
From: Andrey Konovalov <andreyknvl@...gle.com>
To: Dmitry Vyukov <dvyukov@...gle.com>
Cc: Catalin Marinas <catalin.marinas@....com>,
Will Deacon <will.deacon@....com>,
Vincenzo Frascino <vincenzo.frascino@....com>,
Alexander Potapenko <glider@...gle.com>,
Marco Elver <elver@...gle.com>,
Evgenii Stepanov <eugenis@...gle.com>,
Kostya Serebryany <kcc@...gle.com>,
Peter Collingbourne <pcc@...gle.com>,
Serban Constantinescu <serbanc@...gle.com>,
Andrey Ryabinin <aryabinin@...tuozzo.com>,
Elena Petrova <lenaptr@...gle.com>,
Branislav Rankov <Branislav.Rankov@....com>,
Kevin Brodsky <kevin.brodsky@....com>,
Andrew Morton <akpm@...ux-foundation.org>,
kasan-dev <kasan-dev@...glegroups.com>,
Linux ARM <linux-arm-kernel@...ts.infradead.org>,
Linux-MM <linux-mm@...ck.org>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH RFC v2 00/21] kasan: hardware tag-based mode for
production use on arm64
On Thu, Oct 22, 2020 at 5:16 PM Dmitry Vyukov <dvyukov@...gle.com> wrote:
>
> On Thu, Oct 22, 2020 at 3:19 PM Andrey Konovalov <andreyknvl@...gle.com> wrote:
> >
> > This patchset is not complete (hence sending as RFC), but I would like to
> > start the discussion now and hear people's opinions regarding the
> > questions mentioned below.
> >
> > === Overview
> >
> > This patchset adopts the existing hardware tag-based KASAN mode [1] for
> > use in production as a memory corruption mitigation. Hardware tag-based
> > KASAN relies on arm64 Memory Tagging Extension (MTE) [2] to perform memory
> > and pointer tagging. Please see [3] and [4] for detailed analysis of how
> > MTE helps to fight memory safety problems.
> >
> > The current plan is to reuse CONFIG_KASAN_HW_TAGS for production, but add
> > a boot-time switch that allows choosing between a debugging mode that
> > includes all KASAN features as they are, and a production mode that only
> > includes the essentials like tag checking.
> >
> > It is essential that switching between these modes doesn't require
> > rebuilding the kernel with different configs, as this is required by the
> > Android GKI initiative [5].
> >
> > The patch titled "kasan: add and integrate kasan boot parameters" of this
> > series adds a few new boot parameters:
> >
> > kasan.mode allows choosing one of three main modes:
> >
> > - kasan.mode=off - no checks at all
> > - kasan.mode=prod - only essential production features
> > - kasan.mode=full - all features
> >
> > These modes provide default values for three more internal configs
> > listed below. However, it's also possible to override the default values
> > by providing:
> >
> > - kasan.stack=off/on - enable stack collection
> > (default: on for mode=full, otherwise off)
> > - kasan.trap=async/sync - use async or sync MTE mode
> > (default: sync for mode=full, otherwise async)
> > - kasan.fault=report/panic - only report MTE fault or also panic
> > (default: report)
> >
> > === Benchmarks
> >
> > For now I've only performed a few simple benchmarks such as measuring
> > kernel boot time and slab memory usage after boot. The benchmarks were
> > performed in QEMU and the results below exclude the slowdown caused by
> > QEMU memory tagging emulation (as it's different from the slowdown that
> > will be introduced by hardware and therefore irrelevant).
> >
> > KASAN_HW_TAGS=y + kasan.mode=off introduces no performance or memory
> > impact compared to KASAN_HW_TAGS=n.
> >
> > kasan.mode=prod (without executing the tagging instructions) introduces
> > a 7% performance and a 7% memory impact compared to kasan.mode=off.
> > Note that 4% of the performance impact and all 7% of the memory impact
> > are caused by the fact that enabling KASAN essentially results in
> > CONFIG_SLAB_MERGE_DEFAULT being disabled.
> >
> > The recommended Android config has CONFIG_SLAB_MERGE_DEFAULT disabled (I
> > assume for security reasons), but Pixel 4 has it enabled. It's arguable
> > whether "disabling" CONFIG_SLAB_MERGE_DEFAULT introduces any security
> > benefit on top of MTE. Without MTE it makes exploiting some heap
> > corruption harder.
> > With MTE it will only make it harder provided that the attacker is able to
> > predict allocation tags.
> >
> > kasan.mode=full has 40% performance and 30% memory impact over
> > kasan.mode=prod. Both come from alloc/free stack collection.
FTR, this only accounts for the slab memory overhead that comes from
redzones that store stack IDs. There's also page_alloc overhead from
the stacks themselves, which I haven't measured yet.
> >
> > === Questions
> >
> > Any concerns about the boot parameters?
>
> For boot parameters I think we are now "safe" in the sense that we
> provide maximum possible flexibility and can defer any actual
> decisions.
Perfect!

I realized that I actually forgot to think about the default values
when no boot params are specified; I'll fix this in the next version.
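
For illustration, here is a rough userspace sketch of the resolution order
described in the cover letter: kasan.mode picks the defaults, then
kasan.stack, kasan.trap and kasan.fault override them. All identifiers
(struct kasan_cfg, arg_value(), resolve()) are hypothetical and not taken
from the patches. Since the mode to use when no parameter is given is still
undecided, the sketch takes it as an explicit fallback_mode argument rather
than guessing one.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical names throughout; none of this comes from the patch series. */
enum mode  { MODE_OFF, MODE_PROD, MODE_FULL };
enum trap  { TRAP_ASYNC, TRAP_SYNC };
enum fault { FAULT_REPORT, FAULT_PANIC };

struct kasan_cfg {
    enum mode mode;
    int stack;            /* 1 = collect alloc/free stacks */
    enum trap trap;
    enum fault fault;
};

/*
 * Find the first "key=value" token in a space-separated command line and
 * copy the value into buf. Returns buf, or NULL if the key is absent.
 */
static const char *arg_value(const char *cmdline, const char *key,
                             char *buf, size_t len)
{
    size_t klen = strlen(key);
    const char *p = cmdline;

    assert(len > 0);
    while (*p) {
        while (*p == ' ')
            p++;
        size_t tok = strcspn(p, " ");
        if (tok > klen && strncmp(p, key, klen) == 0 && p[klen] == '=') {
            size_t vlen = tok - klen - 1;
            if (vlen >= len)
                vlen = len - 1;   /* truncate overly long values */
            memcpy(buf, p + klen + 1, vlen);
            buf[vlen] = '\0';
            return buf;
        }
        p += tok;
    }
    return NULL;
}

/*
 * fallback_mode is an assumption of this sketch: the mail leaves open which
 * mode applies when kasan.mode is not given. Unrecognized values are ignored.
 */
static struct kasan_cfg resolve(const char *cmdline, enum mode fallback_mode)
{
    struct kasan_cfg cfg;
    char buf[16];
    const char *v;

    cfg.mode = fallback_mode;
    if ((v = arg_value(cmdline, "kasan.mode", buf, sizeof(buf)))) {
        if (!strcmp(v, "off"))       cfg.mode = MODE_OFF;
        else if (!strcmp(v, "prod")) cfg.mode = MODE_PROD;
        else if (!strcmp(v, "full")) cfg.mode = MODE_FULL;
    }

    /* Mode-derived defaults, as listed in the cover letter. */
    cfg.stack = (cfg.mode == MODE_FULL);
    cfg.trap  = (cfg.mode == MODE_FULL) ? TRAP_SYNC : TRAP_ASYNC;
    cfg.fault = FAULT_REPORT;

    /*
     * Explicit overrides. What they mean under mode=off is not specified
     * in the mail; this sketch simply records them.
     */
    if ((v = arg_value(cmdline, "kasan.stack", buf, sizeof(buf)))) {
        if (!strcmp(v, "on"))       cfg.stack = 1;
        else if (!strcmp(v, "off")) cfg.stack = 0;
    }
    if ((v = arg_value(cmdline, "kasan.trap", buf, sizeof(buf)))) {
        if (!strcmp(v, "sync"))       cfg.trap = TRAP_SYNC;
        else if (!strcmp(v, "async")) cfg.trap = TRAP_ASYNC;
    }
    if ((v = arg_value(cmdline, "kasan.fault", buf, sizeof(buf)))) {
        if (!strcmp(v, "panic"))       cfg.fault = FAULT_PANIC;
        else if (!strcmp(v, "report")) cfg.fault = FAULT_REPORT;
    }
    return cfg;
}
```

With this sketch, resolve("kasan.mode=prod kasan.fault=panic", MODE_OFF)
yields prod mode with stack collection off, async trapping and panic on
fault, regardless of the fallback.
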
> > Should we try to deal with the CONFIG_SLAB_MERGE_DEFAULT-like behavior
> > mentioned above?
>
> How hard is it to allow KASAN with CONFIG_SLAB_MERGE_DEFAULT? Are
> there any principal conflicts?
I'll explore this.
> The numbers you provided look quite substantial (on par with what MTE
> itself may introduce). So I would assume if a vendor does not have
> CONFIG_SLAB_MERGE_DEFAULT disabled, it may not want to disable it
> because of MTE (it effectively doubles the overhead).
Sounds reasonable.
Thanks!