lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CANn89iJRXSb-xK_VxkHqm8NLGhfH1Q_HW_p=ygAH7QocyVh+DQ@mail.gmail.com>
Date:   Tue, 10 Jan 2023 09:53:48 +0100
From:   Eric Dumazet <edumazet@...gle.com>
To:     Alexander Potapenko <glider@...gle.com>
Cc:     Dan Williams <dan.j.williams@...el.com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Marco Elver <elver@...gle.com>,
        Alexander Viro <viro@...iv.linux.org.uk>,
        Alexei Starovoitov <ast@...nel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Andrey Konovalov <andreyknvl@...gle.com>,
        Andy Lutomirski <luto@...nel.org>,
        Arnd Bergmann <arnd@...db.de>, Borislav Petkov <bp@...en8.de>,
        Christoph Hellwig <hch@....de>,
        Christoph Lameter <cl@...ux.com>,
        David Rientjes <rientjes@...gle.com>,
        Dmitry Vyukov <dvyukov@...gle.com>,
        Herbert Xu <herbert@...dor.apana.org.au>,
        Ilya Leoshkevich <iii@...ux.ibm.com>,
        Ingo Molnar <mingo@...hat.com>, Jens Axboe <axboe@...nel.dk>,
        Joonsoo Kim <iamjoonsoo.kim@....com>,
        Kees Cook <keescook@...omium.org>,
        Mark Rutland <mark.rutland@....com>,
        Matthew Wilcox <willy@...radead.org>,
        "Michael S. Tsirkin" <mst@...hat.com>,
        Pekka Enberg <penberg@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Petr Mladek <pmladek@...e.com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Vasily Gorbik <gor@...ux.ibm.com>,
        Vegard Nossum <vegard.nossum@...cle.com>,
        Vlastimil Babka <vbabka@...e.cz>,
        kasan-dev <kasan-dev@...glegroups.com>,
        Linux Memory Management List <linux-mm@...ck.org>,
        Linux-Arch <linux-arch@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v4 10/45] libnvdimm/pfn_dev: increase MAX_STRUCT_PAGE_SIZE

On Tue, Jan 10, 2023 at 9:49 AM Alexander Potapenko <glider@...gle.com> wrote:
>
> On Tue, Jan 10, 2023 at 7:55 AM Dan Williams <dan.j.williams@...el.com> wrote:
> >
> > Greg Kroah-Hartman wrote:
> > > On Mon, Jan 09, 2023 at 02:06:36PM -0800, Dan Williams wrote:
> > > > Alexander Potapenko wrote:
> > > > > On Thu, Jan 5, 2023 at 11:09 PM Dan Williams <dan.j.williams@...el.com> wrote:
> > > > > >
> > > > > > Alexander Potapenko wrote:
> > > > > > > (+ Dan Williams)
> > > > > > > (resending with patch context included)
> > > > > > >
> > > > > > > On Mon, Jul 11, 2022 at 6:27 PM Marco Elver <elver@...gle.com> wrote:
> > > > > > > >
> > > > > > > > On Fri, 1 Jul 2022 at 16:23, Alexander Potapenko <glider@...gle.com> wrote:
> > > > > > > > >
> > > > > > > > > KMSAN adds extra metadata fields to struct page, so it does not fit into
> > > > > > > > > 64 bytes anymore.
> > > > > > > >
> > > > > > > > Does this somehow cause extra space being used in all kernel configs?
> > > > > > > > If not, it would be good to note this in the commit message.
> > > > > > > >
> > > > > > > I actually couldn't verify this on QEMU, because the driver never got loaded.
> > > > > > > Looks like this increases the amount of memory used by the nvdimm
> > > > > > > driver in all kernel configs that enable it (including those that
> > > > > > > don't use KMSAN), but I am not sure how much is that.
> > > > > > >
> > > > > > > Dan, do you know how bad increasing MAX_STRUCT_PAGE_SIZE can be?
> > > > > >
> > > > > > Apologies I missed this several months ago. The answer is that this
> > > > > > causes everyone creating PMEM namespaces on v6.1+ to lose double the
> > > > > > capacity of their namespace even when not using KMSAN which is too
> > > > > > wasteful to tolerate. So, I think "6e9f05dc66f9 libnvdimm/pfn_dev:
> > > > > > increase MAX_STRUCT_PAGE_SIZE" needs to be reverted and replaced with
> > > > > > something like:
> > > > > >
> > > > > > diff --git a/drivers/nvdimm/Kconfig b/drivers/nvdimm/Kconfig
> > > > > > index 79d93126453d..5693869b720b 100644
> > > > > > --- a/drivers/nvdimm/Kconfig
> > > > > > +++ b/drivers/nvdimm/Kconfig
> > > > > > @@ -63,6 +63,7 @@ config NVDIMM_PFN
> > > > > >         bool "PFN: Map persistent (device) memory"
> > > > > >         default LIBNVDIMM
> > > > > >         depends on ZONE_DEVICE
> > > > > > +       depends on !KMSAN
> > > > > >         select ND_CLAIM
> > > > > >         help
> > > > > >           Map persistent memory, i.e. advertise it to the memory
> > > > > >
> > > > > >
> > > > > > ...otherwise, what was the rationale for increasing this value? Were you
> > > > > > actually trying to use KMSAN for DAX pages?
> > > > >
> > > > > I was just building the kernel with nvdimm driver and KMSAN enabled.
> > > > > Because KMSAN adds extra data to every struct page, it immediately hit
> > > > > the following assert:
> > > > >
> > > > > drivers/nvdimm/pfn_devs.c:796:3: error: call to
> > > > > __compiletime_assert_330 declared with 'error' attribute: BUILD_BUG_ON
> > > > > fE
> > > > >                 BUILD_BUG_ON(sizeof(struct page) > MAX_STRUCT_PAGE_SIZE);
> > > > >
> > > > > The comment before MAX_STRUCT_PAGE_SIZE declaration says "max struct
> > > > > page size independent of kernel config", but maybe we can afford
> > > > > making it dependent on CONFIG_KMSAN (and possibly other config options
> > > > > that increase struct page size)?
> > > > >
> > > > > I don't mind disabling the driver under KMSAN, but having an extra
> > > > > ifdef to keep KMSAN support sounds reasonable, WDYT?
> > > >
> > > > How about a module parameter to opt-in to the increased permanent
> > > > capacity loss?
> > >
> > > Please no, this isn't the 1990's, we should never force users to keep
> > > track of new module parameters that you then have to support for
> > > forever.
> >
> > Fair enough, premature enabling. If someone really wants this they can
> > find this thread in the archives and ask for another solution like
> > compile time override.
> >
> > >
> > >
> > > >
> > > > -- >8 --
> > > > >From 693563817dea3fd8f293f9b69ec78066ab1d96d2 Mon Sep 17 00:00:00 2001
> > > > From: Dan Williams <dan.j.williams@...el.com>
> > > > Date: Thu, 5 Jan 2023 13:27:34 -0800
> > > > Subject: [PATCH] nvdimm: Support sizeof(struct page) > MAX_STRUCT_PAGE_SIZE
> > > >
> > > > Commit 6e9f05dc66f9 ("libnvdimm/pfn_dev: increase MAX_STRUCT_PAGE_SIZE")
> > > >
> > > > ...updated MAX_STRUCT_PAGE_SIZE to account for sizeof(struct page)
> > > > potentially doubling in the case of CONFIG_KMSAN=y. Unfortunately this
> > > > doubles the amount of capacity stolen from user addressable capacity for
> > > > everyone, regardless of whether they are using the debug option. Revert
> > > > that change, mandate that MAX_STRUCT_PAGE_SIZE never exceed 64, but
> > > > allow for debug scenarios to proceed with creating debug sized page maps
> > > > with a new 'libnvdimm.page_struct_override' module parameter.
> > > >
> > > > Note that this only applies to cases where the page map is permanent,
> > > > i.e. stored in a reservation of the pmem itself ("--map=dev" in "ndctl
> > > > create-namespace" terms). For the "--map=mem" case, since the allocation
> > > > is ephemeral for the lifespan of the namespace, there are no explicit
> > > > restriction. However, the implicit restriction, of having enough
> > > > available "System RAM" to store the page map for the typically large
> > > > pmem, still applies.
> > > >
> > > > Fixes: 6e9f05dc66f9 ("libnvdimm/pfn_dev: increase MAX_STRUCT_PAGE_SIZE")
> > > > Cc: <stable@...r.kernel.org>
> > > > Cc: Alexander Potapenko <glider@...gle.com>
> > > > Cc: Marco Elver <elver@...gle.com>
> > > > Reported-by: Jeff Moyer <jmoyer@...hat.com>
> > > > ---
> > > >  drivers/nvdimm/nd.h       |  2 +-
> > > >  drivers/nvdimm/pfn_devs.c | 45 ++++++++++++++++++++++++++-------------
> > > >  2 files changed, 31 insertions(+), 16 deletions(-)
> > > >
> > > > diff --git a/drivers/nvdimm/nd.h b/drivers/nvdimm/nd.h
> > > > index 85ca5b4da3cf..ec5219680092 100644
> > > > --- a/drivers/nvdimm/nd.h
> > > > +++ b/drivers/nvdimm/nd.h
> > > > @@ -652,7 +652,7 @@ void devm_namespace_disable(struct device *dev,
> > > >             struct nd_namespace_common *ndns);
> > > >  #if IS_ENABLED(CONFIG_ND_CLAIM)
> > > >  /* max struct page size independent of kernel config */
> > > > -#define MAX_STRUCT_PAGE_SIZE 128
> > > > +#define MAX_STRUCT_PAGE_SIZE 64
> > > >  int nvdimm_setup_pfn(struct nd_pfn *nd_pfn, struct dev_pagemap *pgmap);
> > > >  #else
> > > >  static inline int nvdimm_setup_pfn(struct nd_pfn *nd_pfn,
> > > > diff --git a/drivers/nvdimm/pfn_devs.c b/drivers/nvdimm/pfn_devs.c
> > > > index 61af072ac98f..978d63559c0e 100644
> > > > --- a/drivers/nvdimm/pfn_devs.c
> > > > +++ b/drivers/nvdimm/pfn_devs.c
> > > > @@ -13,6 +13,11 @@
> > > >  #include "pfn.h"
> > > >  #include "nd.h"
> > > >
> > > > +static bool page_struct_override;
> > > > +module_param(page_struct_override, bool, 0644);
> > > > +MODULE_PARM_DESC(page_struct_override,
> > > > +            "Force namespace creation in the presence of mm-debug.");
> > >
> > > I can't figure out from this description what this is for so perhaps it
> > > should be either removed and made dynamic (if you know you want to debug
> > > the mm core, why not turn it on then?) or made more obvious what is
> > > happening?
> >
> > I'll kill it and update the KMSAN Documentation that KMSAN has
> > interactions with the NVDIMM subsystem that may cause some namespaces to
> > fail to enable. That Documentation needs to be a part of this patch
> > regardless as that would be the default behavior of this module
> > parameter.
> >
> > Unfortunately, it can not be dynamically enabled because the size of
> > 'struct page' is unfortunately recorded in the metadata of the device.
> > Recall this is for supporting platform configurations where the capacity
> > of the persistent memory exceeds or consumes too much of System RAM.
> > Consider 4TB of PMEM consumes 64GB of space just for 'struct page'. So,
> > NVDIMM subsystem has a mode to store that page array in a reservation on
> > the PMEM device itself.
>
> Sorry, I might be missing something, but why cannot we have
>
> #ifdef CONFIG_KMSAN
> #define MAX_STRUCT_PAGE_SIZE 128
> #else
> #define MAX_STRUCT_PAGE_SIZE 64
> #endif
>

Possibly because this needs to be a fixed size on permanent storage
(like an inode on a disk file system)



> ?
>
> KMSAN is a debug-only tool, it already consumes more than two thirds
> of the system memory, so you don't want to enable it in any production
> environment anyway.
>
> > KMSAN mandates either that all namespaces all the time reserve the extra
> > capacity, or that those namespace cannot be mapped while KMSAN is
> > enabled.
>
> Struct page depends on a couple of config options that affect its
> size, and has already been approaching the 64 byte boundary.
> It is unfortunate that the introduction of KMSAN was the last straw,
> but it could've been any other debug config that needs to store data
> in the struct page.
> Keeping the struct within cacheline size sounds reasonable for the
> default configuration, but having a build-time assert that prevents us
> from building debug configs sounds excessive.
>
> > --
> > You received this message because you are subscribed to the Google Groups "kasan-dev" group.
> > To unsubscribe from this group and stop receiving emails from it, send an email to kasan-dev+unsubscribe@...glegroups.com.
> > To view this discussion on the web visit https://groups.google.com/d/msgid/kasan-dev/63bd0be8945a0_5178e29414%40dwillia2-xfh.jf.intel.com.notmuch.
>
>
>
> --
> Alexander Potapenko
> Software Engineer
>
> Google Germany GmbH
> Erika-Mann-Straße, 33
> 80636 München
>
> Geschäftsführer: Paul Manicle, Liana Sebastian
> Registergericht und -nummer: Hamburg, HRB 86891
> Sitz der Gesellschaft: Hamburg

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ