linux-kernel - Re: [PATCH v1 00/11] mm/kasan: support per-page shadow memory to reduce memory consumption

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CACT4Y+YREmHViSMsH84bwtEqbUsqsgzaa76eWzJXqmSgqKbgvg@mail.gmail.com>
Date:   Wed, 24 May 2017 08:57:23 +0200
From:   Dmitry Vyukov <dvyukov@...gle.com>
To:     Joonsoo Kim <js1304@...il.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Andrey Ryabinin <aryabinin@...tuozzo.com>,
        Alexander Potapenko <glider@...gle.com>,
        kasan-dev <kasan-dev@...glegroups.com>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>,
        "H . Peter Anvin" <hpa@...or.com>, kernel-team@....com
Subject: Re: [PATCH v1 00/11] mm/kasan: support per-page shadow memory to
 reduce memory consumption

On Tue, May 16, 2017 at 10:49 PM, Dmitry Vyukov <dvyukov@...gle.com> wrote:
> On Mon, May 15, 2017 at 11:23 PM, Joonsoo Kim <js1304@...il.com> wrote:
>>> >
>>> > Hello, all.
>>> >
>>> > This is an attempt to recude memory consumption of KASAN. Please see
>>> > following description to get the more information.
>>> >
>>> > 1. What is per-page shadow memory
>>>
>>> Hi Joonsoo,
>>
>> Hello, Dmitry.
>>
>>>
>>> First I need to say that this is great work. I wanted KASAN to consume
>>
>> Thanks!
>>
>>> 1/8-th of _kernel_ memory rather than total physical memory for a long
>>> time.
>>>
>>> However, this implementation does not work inline instrumentation. And
>>> the inline instrumentation is the main mode for KASAN. Outline
>>> instrumentation is merely a rudiment to support gcc 4.9, and it needs
>>> to be removed as soon as we stop caring about gcc 4.9 (do we at all?
>>> is it the current compiler in any distro? Ubuntu 12 has 4.8, Ubuntu 14
>>> already has 5.4. And if you build gcc yourself or get a fresher
>>> compiler from somewhere else, you hopefully get something better than
>>> 4.9).
>>
>> Hmm... I don't think that outline instrumentation is something to be
>> removed. In embedded world, there is a fixed partition table and
>> enlarging the kernel binary would cause the problem. Changing that
>> table is possible but is really uncomfortable thing for debugging
>> something. So, I think that outline instrumentation has it's own merit.
>
> Fair. Let's consider both as important.
>
>> Anyway, I have missed inline instrumentation completely.
>>
>> I will attach the fix in the bottom. It doesn't look beautiful
>> since it breaks layer design (some check will be done at report
>> function). However, I think that it's a good trade-off.
>
>
> I can confirm that inline works with that patch.
>
> I can also confirm that it reduces memory usage. I've booted qemu with
> 2G ram and run some fixed workload. Before:
> 31853 dvyukov   20   0 3043200 765464  21312 S 366.0  4.7   2:39.53
> qemu-system-x86
>  7528 dvyukov   20   0 3043200 732444  21676 S 333.3  4.5   2:23.19
> qemu-system-x86
> After:
> 6192 dvyukov   20   0 3043200 394244  20636 S  17.9  2.4   2:32.95
> qemu-system-x86
>  6265 dvyukov   20   0 3043200 388860  21416 S 399.3  2.4   3:02.88
> qemu-system-x86
>  9005 dvyukov   20   0 3043200 383564  21220 S 397.1  2.3   2:35.33
> qemu-system-x86
>
> However, I see some very significant slowdowns with inline
> instrumentation. I did 3 tests:
> 1. Boot speed, I measured time for a particular message to appear on
> console. Before:
> [    2.504652] random: crng init done
> [    2.435861] random: crng init done
> [    2.537135] random: crng init done
> After:
> [    7.263402] random: crng init done
> [    7.263402] random: crng init done
> [    7.174395] random: crng init done
>
> That's ~3x slowdown.
>
> 2. I've run bench_readv benchmark:
> https://raw.githubusercontent.com/google/sanitizers/master/address-sanitizer/kernel_buildbot/slave/bench_readv.c
> as:
> while true; do time ./bench_readv bench_readv 300000 1; done
>
> Before:
> sys 0m7.299s
> sys 0m7.218s
> sys 0m6.973s
> sys 0m6.892s
> sys 0m7.035s
> sys 0m6.982s
> sys 0m6.921s
> sys 0m6.940s
> sys 0m6.905s
> sys 0m7.006s
>
> After:
> sys 0m8.141s
> sys 0m8.077s
> sys 0m8.067s
> sys 0m8.116s
> sys 0m8.128s
> sys 0m8.115s
> sys 0m8.108s
> sys 0m8.326s
> sys 0m8.529s
> sys 0m8.164s
> sys 0m8.380s
>
> This is ~19% slowdown.
>
> 3. I've run bench_pipes benchmark:
> https://raw.githubusercontent.com/google/sanitizers/master/address-sanitizer/kernel_buildbot/slave/bench_pipes.c
> as:
> while true; do time ./bench_pipes 10 10000 1; done
>
> Before:
> sys 0m5.393s
> sys 0m6.178s
> sys 0m5.909s
> sys 0m6.024s
> sys 0m5.874s
> sys 0m5.737s
> sys 0m5.826s
> sys 0m5.664s
> sys 0m5.758s
> sys 0m5.421s
> sys 0m5.444s
> sys 0m5.479s
> sys 0m5.461s
> sys 0m5.417s
>
> After:
> sys 0m8.718s
> sys 0m8.281s
> sys 0m8.268s
> sys 0m8.334s
> sys 0m8.246s
> sys 0m8.267s
> sys 0m8.265s
> sys 0m8.437s
> sys 0m8.228s
> sys 0m8.312s
> sys 0m8.556s
> sys 0m8.680s
>
> This is ~52% slowdown.
>
>
> This does not look acceptable to me. I would ready to pay for this,
> say, 10% of performance. But it seems that this can have up to 2-4x
> slowdown for some workloads.
>
>
> Your use-case is embed devices where you care a lot about both code
> size and memory consumption, right?
>
> I see 2 possible ways forward:
> 1. Enable this new mode only for outline, but keep current scheme for
> inline. Then outline will be "small but slow" type of configuration.
> 2. Somehow fix slowness (at least in inline mode).
>
>
>> Mapping zero page to non-kernel memory could cause true-negative
>> problem since we cannot flush the TLB in all cpus. We will read zero
>> shadow value value in this case even if actual shadow value is not
>> zero. This is one of the reason that black page is introduced in this
>> patchset.
>
> What does make your current patch work then?
> Say we map a new shadow page, update the page shadow to say that there
> is mapped shadow. Then another CPU loads the page shadow and then
> loads from the newly mapped shadow. If we don't flush TLB, what makes
> the second CPU see the newly mapped shadow?

/\/\/\/\/\/\

Joonsoo, please answer this question above.
I am trying to understand if there is any chance to make mapping a
single page for all non-interesting shadow ranges work. That would be
much simpler change that does not require changing instrumentation,
and will not force inline instrumentation onto slow path for some
ranges (vmalloc?).