Date:   Thu, 25 May 2017 09:46:40 +0900
From:   Joonsoo Kim <js1304@...il.com>
To:     Dmitry Vyukov <dvyukov@...gle.com>
Cc:     Andrey Ryabinin <aryabinin@...tuozzo.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Alexander Potapenko <glider@...gle.com>,
        kasan-dev <kasan-dev@...glegroups.com>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>,
        "H . Peter Anvin" <hpa@...or.com>, kernel-team@....com
Subject: Re: [PATCH v1 00/11] mm/kasan: support per-page shadow memory to
 reduce memory consumption

On Wed, May 24, 2017 at 06:31:04PM +0200, Dmitry Vyukov wrote:
> On Wed, May 24, 2017 at 8:04 AM, Joonsoo Kim <js1304@...il.com> wrote:
> >> >> > From: Joonsoo Kim <iamjoonsoo.kim@....com>
> >> >> >
> >> >> > Hello, all.
> >> >> >
> >> >> > This is an attempt to reduce the memory consumption of KASAN. Please see
> >> >> > the following description for more information.
> >> >> >
> >> >> > 1. What is per-page shadow memory
> >> >> >
> >> >> > This patch introduces infrastructure to support per-page shadow memory.
> >> >> > Per-page shadow memory is the same as the original shadow memory except for
> >> >> > the granularity: one byte holds the shadow value for a whole page.
> >> >> > The purpose of introducing this new shadow memory is to save memory
> >> >> > consumption.
> >> >> >
> >> >> > 2. Problem of current approach
> >> >> >
> >> >> > Until now, KASAN has needed shadow memory for the whole memory range,
> >> >> > so the amount of statically allocated memory is large. As a result,
> >> >> > KASAN cannot run on systems with hard memory constraints. Even when
> >> >> > KASAN can run, its large memory consumption changes the behaviour of
> >> >> > the workload, so we cannot validate the situation that we want to
> >> >> > check.
> >> >> >
> >> >> > 3. How does this patch fix the problem
> >> >> >
> >> >> > This patch tries to fix the problem by reducing memory consumption for
> >> >> > the shadow memory. There are two observations.
> >> >> >
> >> >>
> >> >>
> >> >> I think that the best way to deal with your problem is to increase the shadow scale size.
> >> >>
> >> >> You'll need to add a tunable to gcc to control the shadow size. I expect that gcc has some
> >> >> places where a shadow scale size of 8 is hardcoded, but it should be fixable.
> >> >>
> >> >> The kernel also has a small amount of code written with KASAN_SHADOW_SCALE_SIZE == 8 in mind,
> >> >> which should be easy to fix.
> >> >>
> >> >> Note that a bigger shadow scale size requires bigger alignment of allocated memory and variables.
> >> >> However, according to comments in gcc/asan.c, gcc already aligns stack and global variables at a
> >> >> 32-byte boundary.
> >> >> So we could bump the shadow scale up to 32 without increasing current stack consumption.
> >> >>
> >> >> On a small machine (1GB), 1/32 of shadow is just 32MB, which is comparable to your 30MB, but I expect it to be
> >> >> much faster. More importantly, this will require only a small amount of simple changes in code, which will be
> >> >> a *lot* easier to maintain.
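
For reference, the shadow-size arithmetic above can be reproduced with a
short Python sketch. It assumes only that one shadow byte covers `scale`
bytes of memory; the helper name `shadow_bytes` is made up for illustration
and is not part of KASAN or gcc.

```python
# Illustrative arithmetic only: shadow memory needed for a given scale.
# Assumption: one shadow byte covers `scale` bytes, so shadow = memory / scale.

MIB = 1024 * 1024


def shadow_bytes(memory_bytes: int, scale: int) -> int:
    """Return the shadow size in bytes for `memory_bytes` of RAM (hypothetical helper)."""
    if scale <= 0 or scale & (scale - 1):
        raise ValueError("scale must be a positive power of two")
    return memory_bytes // scale


ram = 1024 * MIB  # the 1GB machine from the example above
for scale in (8, 16, 32):
    print(f"scale {scale:2d}: {shadow_bytes(ram, scale) // MIB} MiB of shadow")
```

With these assumptions, a scale of 8 costs 128 MiB on that machine and a
scale of 32 costs 32 MiB, which is the figure quoted above.
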
> >>
> >>
> >> Interesting option. We never considered increasing scale in user space
> >> due to performance implications. But the algorithm always supported up
> >> to 128x scale. Definitely worth considering as an option.
> >
> > Could you explain how increasing the scale reduces performance? I
> > tried to guess the reason but failed.
> 
> 
> The main reason is inline instrumentation. Inline instrumentation for
> a check of an 8-byte access (which is very common in 64-bit code) is
> just a check of the shadow byte for 0. For smaller accesses we have
> more complex instrumentation that first checks the shadow for 0 and then
> does a precise check based on the size/offset of the access + shadow value.
> That's slower and also increases register pressure and code size
> (which can further reduce performance due to icache overflow). If we
> increase the scale to 16/32, all accesses will need that slow path.
> Another thing is stack instrumentation: a larger scale will require
> larger redzones to ensure proper alignment. That will increase stack
> frames and also require more instructions to poison/unpoison stack
> shadow on function entry/exit.

Now I see. Thanks for the explanation.

Thanks.
