lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAOSf1CEZoLw5QqEMTKwiZ+d_qPLp_D9pJZUtnQWMXWpAXOQ2YA@mail.gmail.com>
Date:   Thu, 21 Mar 2019 14:08:45 +1100
From:   Oliver <oohall@...il.com>
To:     Dan Williams <dan.j.williams@...el.com>
Cc:     "Aneesh Kumar K.V" <aneesh.kumar@...ux.ibm.com>,
        Jan Kara <jack@...e.cz>,
        linux-nvdimm <linux-nvdimm@...ts.01.org>,
        Michael Ellerman <mpe@...erman.id.au>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Linux MM <linux-mm@...ck.org>,
        Ross Zwisler <zwisler@...nel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        linuxppc-dev <linuxppc-dev@...ts.ozlabs.org>,
        "Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>
Subject: Re: [PATCH 2/2] mm/dax: Don't enable huge dax mapping by default

On Thu, Mar 21, 2019 at 7:57 AM Dan Williams <dan.j.williams@...el.com> wrote:
>
> On Wed, Mar 20, 2019 at 8:34 AM Dan Williams <dan.j.williams@...el.com> wrote:
> >
> > On Wed, Mar 20, 2019 at 1:09 AM Aneesh Kumar K.V
> > <aneesh.kumar@...ux.ibm.com> wrote:
> > >
> > > Aneesh Kumar K.V <aneesh.kumar@...ux.ibm.com> writes:
> > >
> > > > Dan Williams <dan.j.williams@...el.com> writes:
> > > >
> > > >>
> > > >>> Now what will be page size used for mapping vmemmap?
> > > >>
> > > >> That's up to the architecture's vmemmap_populate() implementation.
> > > >>
> > > >>> Architectures
> > > >>> possibly will use PMD_SIZE mapping if supported for vmemmap. Now a
> > > >>> device-dax with struct page in the device will have pfn reserve area aligned
> > > >>> to PAGE_SIZE with the above example? We can't map that using
> > > >>> PMD_SIZE page size?
> > > >>
> > > >> IIUC, that's a different alignment. Currently that's handled by
> > > >> padding the reservation area up to a section (128MB on x86) boundary,
> > > >> but I'm working on patches to allow sub-section sized ranges to be
> > > >> mapped.
> > > >
> > > > I am missing something w.r.t code. The below code align that using nd_pfn->align
> > > >
> > > >       if (nd_pfn->mode == PFN_MODE_PMEM) {
> > > >               unsigned long memmap_size;
> > > >
> > > >               /*
> > > >                * vmemmap_populate_hugepages() allocates the memmap array in
> > > >                * HPAGE_SIZE chunks.
> > > >                */
> > > >               memmap_size = ALIGN(64 * npfns, HPAGE_SIZE);
> > > >               offset = ALIGN(start + SZ_8K + memmap_size + dax_label_reserve,
> > > >                               nd_pfn->align) - start;
> > > >       }
> > > >
> > > > IIUC that is finding the offset where to put vmemmap start. And that has
> > > > to be aligned to the page size with which we may end up mapping vmemmap
> > > > area right?
> >
> > Right, that's the physical offset of where the vmemmap ends, and the
> > memory to be mapped begins.
> >
> > > > Yes we find the npfns by aligning up using PAGES_PER_SECTION. But that
> > > > is to compute howmany pfns we should map for this pfn dev right?
> > > >
> > >
> > > Also i guess those 4K assumptions there is wrong?
> >
> > Yes, I think to support non-4K-PAGE_SIZE systems the 'pfn' metadata
> > needs to be revved and the PAGE_SIZE needs to be recorded in the
> > info-block.
>
> How often does a system change page-size. Is it fixed or do
> environment change it from one boot to the next? I'm thinking through
> the behavior of what do when the recorded PAGE_SIZE in the info-block
> does not match the current system page size. The simplest option is to
> just fail the device and require it to be reconfigured. Is that
> acceptable?

The kernel page size is set at build time and as far as I know every
distro configures their ppc64(le) kernel for 64K. I've used 4K kernels
a few times in the past to debug PAGE_SIZE dependent problems, but I'd
be surprised if anyone is using 4K in production.

Anyway, my view is that using 4K here isn't really a problem since
it's just the accounting unit of the pfn superblock format. The kernel
reading form it should understand that and scale it to whatever
accounting unit it wants to use internally. Currently we don't so that
should probably be fixed, but that doesn't seem to cause any real
issues. As far as I can tell the only user of npfns in
__nvdimm_setup_pfn() whih prints the "number of pfns truncated"
message.

Am I missing something?

> _______________________________________________
> Linux-nvdimm mailing list
> Linux-nvdimm@...ts.01.org
> https://lists.01.org/mailman/listinfo/linux-nvdimm

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ