lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAPcyv4iDkEo+xG-AJetOfp12RO8qDV0t=AF3rvoq5GKc5VFuzw@mail.gmail.com>
Date:   Tue, 18 Dec 2018 11:07:55 -0800
From:   Dan Williams <dan.j.williams@...el.com>
To:     Mike Rapoport <rppt@...ux.ibm.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Michal Hocko <mhocko@...e.com>,
        Kees Cook <keescook@...omium.org>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Linux MM <linux-mm@...ck.org>, X86 ML <x86@...nel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v5 3/5] mm: Shuffle initial free memory to improve
 memory-side-cache utilization

On Tue, Dec 18, 2018 at 1:11 AM Mike Rapoport <rppt@...ux.ibm.com> wrote:
>
> On Mon, Dec 17, 2018 at 11:56:36AM -0800, Dan Williams wrote:
> > On Sun, Dec 16, 2018 at 4:43 AM Mike Rapoport <rppt@...ux.ibm.com> wrote:
> > >
> > > On Fri, Dec 14, 2018 at 05:48:46PM -0800, Dan Williams wrote:
> > > > Randomization of the page allocator improves the average utilization of
> > > > a direct-mapped memory-side-cache. Memory side caching is a platform
> > > > capability that Linux has been previously exposed to in HPC
> > > > (high-performance computing) environments on specialty platforms. In
> > > > that instance it was a smaller pool of high-bandwidth-memory relative to
> > > > higher-capacity / lower-bandwidth DRAM. Now, this capability is going to
> > > > be found on general purpose server platforms where DRAM is a cache in
> > > > front of higher latency persistent memory [1].
> > [..]
> > > > diff --git a/mm/memblock.c b/mm/memblock.c
> > > > index 185bfd4e87bb..fd617928ccc1 100644
> > > > --- a/mm/memblock.c
> > > > +++ b/mm/memblock.c
> > > > @@ -834,8 +834,16 @@ int __init_memblock memblock_set_sidecache(phys_addr_t base, phys_addr_t size,
> > > >               return ret;
> > > >
> > > >       for (i = start_rgn; i < end_rgn; i++) {
> > > > -             type->regions[i].cache_size = cache_size;
> > > > -             type->regions[i].direct_mapped = direct_mapped;
> > > > +             struct memblock_region *r = &type->regions[i];
> > > > +
> > > > +             r->cache_size = cache_size;
> > > > +             r->direct_mapped = direct_mapped;
> > >
> > > I think this change can be merged into the previous patch
> >
> > Ok, will do.
> >
> > > > +             /*
> > > > +              * Enable randomization for amortizing direct-mapped
> > > > +              * memory-side-cache conflicts.
> > > > +              */
> > > > +             if (r->size > r->cache_size && r->direct_mapped)
> > > > +                     page_alloc_shuffle_enable();
> > >
> > > It seems that this is the only use for ->direct_mapped in the memblock
> > > code. Wouldn't cache_size != 0 suffice? I.e., in the code that sets the
> > > memblock region attributes, the cache_size can be set to 0 for the non
> > > direct mapped caches, isn't it?
> > >
> >
> > The HMAT specification allows for other cache-topologies, so it's not
> > sufficient to just look for non-zero size when a platform implements a
> > set-associative cache. The expectation is that a set-associative cache
> > would not need the kernel to perform memory randomization to improve
> > the cache utilization.
> >
> > The check for memory size > cache-size is a sanity check for a
> > platform BIOS or system configuration that mis-reports or mis-sizes
> > the cache.
>
> Apparently I didn't explain my point well.
>
> The acpi_numa_memory_affinity_init() already knows whether the cache is
> direct mapped or a set-associative. It can just skip calling
> memblock_set_sidecache() for the set-associative case.
>
> Another thing I've noticed only now, is that memory randomization is
> enabled if there is at least one memory region with a direct mapped side
> cache attached and once the randomization is on the cache size and the
> mapping mode do not matter. So, I think it's not necessary to store them in
> the memory region at all.

Fair enough. I was anticipating the case when non-ACPI systems gain
this capability, but you're right no need to design that now. The size
sanity check has some small value, but given there is an override and
broken platform firmware would need to be fixed I don't think we lose
much by getting rid of it. Will re-flow without memblock integration.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ