[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <217e817e-2f91-91a5-1bef-16fb0cbacb63@intel.com>
Date: Tue, 31 Jan 2017 10:04:11 -0800
From: Dave Hansen <dave.hansen@...el.com>
To: John Hubbard <jhubbard@...dia.com>,
Anshuman Khandual <khandual@...ux.vnet.ibm.com>,
linux-kernel@...r.kernel.org, linux-mm@...ck.org
Cc: mhocko@...e.com, vbabka@...e.cz, mgorman@...e.de,
minchan@...nel.org, aneesh.kumar@...ux.vnet.ibm.com,
bsingharora@...il.com, srikar@...ux.vnet.ibm.com,
haren@...ux.vnet.ibm.com, jglisse@...hat.com,
dan.j.williams@...el.com
Subject: Re: [RFC V2 03/12] mm: Change generic FALLBACK zonelist creation
process
On 01/30/2017 11:25 PM, John Hubbard wrote:
> I also don't like having these policies hard-coded, and your 100x
> example above helps clarify what can go wrong about it. It would be
> nicer if, instead, we could better express the "distance" between nodes
> (bandwidth, latency, relative to sysmem, perhaps), and let the NUMA
> system figure out the Right Thing To Do.
>
> I realize that this is not quite possible with NUMA just yet, but I
> wonder if that's a reasonable direction to go with this?
In the end, I don't think the kernel can make the "right" decision very
widely here.
Intel's Xeon Phis have some high-bandwidth memory (MCDRAM) that
evidently has a higher latency than DRAM. Given a plain malloc(), how
is the kernel to know that the memory will be used for AVX-512
instructions that need lots of bandwidth vs. some random data structure
that's latency-sensitive?
In the end, I think all we can do is keep the kernel's existing default
of "low latency to the CPU that allocated it", and let apps override
when that policy doesn't fit them.
Powered by blists - more mailing lists