Message-ID: <7E27A89B-DAFD-43E3-B90D-76E90FEE2EDD@nvidia.com>
Date: Tue, 22 Jun 2021 08:48:14 -0400
From: Zi Yan <ziy@...dia.com>
To: Dave Hansen <dave.hansen@...el.com>
Cc: "Huang, Ying" <ying.huang@...el.com>,
Dave Hansen <dave.hansen@...ux.intel.com>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, Yang Shi <shy828301@...il.com>,
Michal Hocko <mhocko@...e.com>, Wei Xu <weixugc@...gle.com>,
David Rientjes <rientjes@...gle.com>,
Dan Williams <dan.j.williams@...el.com>,
David Hildenbrand <david@...hat.com>,
osalvador <osalvador@...e.de>
Subject: Re: [PATCH -V8 02/10] mm/numa: automatically generate node migration order
On 22 Jun 2021, at 8:06, Dave Hansen wrote:
> Yan, your reply came through in HTML. It doesn't bother me too much,
> but you'll find your replies dropped by LKML and other mailing lists
> if you do this.
Apologies. I used the wrong text mode. Thanks for letting me know.
>
> On 6/21/21 7:50 AM, Zi Yan wrote:
>> Is there a plan of allowing user to change where the migration path
>> starts? Or maybe one step further providing an interface to allow
>> user to specify the demotion path. Something like
>> /sys/devices/system/node/node*/node_demotion.
>
> We actually had this in an earlier series. I pulled it out because we
> don't really *need* this ABI at the moment. But, I totally agree that
> it would be handy for many things, including any non-obvious topology
> where the built-in ordering isn't optimal.
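
Good to know. For the record, what I was picturing is simple enough to
drive from userspace, roughly like the sketch below (an illustration
only -- the per-node node_demotion attribute is just my proposal above,
not an ABI that exists in this series, and I am assuming here that it
would report a single target node id):

/* Toy userspace reader for a *hypothetical* node_demotion attribute. */
#include <stdio.h>
#include <stdlib.h>

static int read_demotion_target(int node)
{
	char path[128];
	FILE *f;
	int target = -1;

	snprintf(path, sizeof(path),
		 "/sys/devices/system/node/node%d/node_demotion", node);
	f = fopen(path, "r");
	if (!f)
		return -1;	/* attribute absent or node is terminal */
	if (fscanf(f, "%d", &target) != 1)
		target = -1;
	fclose(f);
	return target;
}

int main(int argc, char **argv)
{
	int node = argc > 1 ? atoi(argv[1]) : 0;
	int hops;

	printf("demotion chain: node%d", node);
	for (hops = 0; hops < 16; hops++) {	/* guard against cycles */
		int next = read_demotion_target(node);

		if (next < 0)
			break;
		printf(" -> node%d", next);
		node = next;
	}
	printf("\n");
	return 0;
}

On a system like ours, I would want to be able to point the GPU memory
node's entry at a CPU memory node, not the reverse.
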
>
>>> I don't think that's necessary at least for now. Do you know any
>>> real world use case for this?
>>
>> In our P9+volta system, GPU memory is exposed as a NUMA node. For
>> the GPU workloads with data size greater than GPU memory size, it
>> will be very helpful to allow pages in GPU memory to be
>> migrated/demoted to CPU memory. With your current assumption, GPU
>> memory -> CPU memory demotion seems not possible, right? This
>> should also apply to any system with a device memory exposed as a
>> NUMA node and workloads running on the device and using CPU memory
>> as a lower tier memory than the device memory.
>
> Yes, with the current ordering, CPU memory would be demoted to the
> GPU, not the other way around. The right way to fix this (on ACPI
> platforms at least) is probably to use the HMAT table and build the
> demotion based on any memory targets rather than just CPUs.
>
> That would be a great future enhancement to all of this. But, because
> not all systems have HMATs, we also need something more basic, which
> is what is in this series.
This information is very helpful. I agree that reading the HMAT table
is the right way to go; I will look into it. Thanks!
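
As a starting point, if I am reading the numaperf documentation right,
on platforms where the firmware does provide an HMAT the derived data
already shows up under the per-node "access" classes, so something like
the sketch below should list which nodes are described as memory
targets and what their best-initiator numbers look like (paths per
Documentation/admin-guide/mm/numaperf.rst; the attributes only exist
when the platform actually supplies the HMAT, and the node range below
is just for illustration):

#include <stdio.h>

static long read_attr(int node, const char *attr)
{
	char path[160];
	FILE *f;
	long val = -1;

	snprintf(path, sizeof(path),
		 "/sys/devices/system/node/node%d/access0/initiators/%s",
		 node, attr);
	f = fopen(path, "r");
	if (!f)
		return -1;	/* no HMAT-derived data for this node */
	if (fscanf(f, "%ld", &val) != 1)
		val = -1;
	fclose(f);
	return val;
}

int main(void)
{
	int node;

	for (node = 0; node < 8; node++) {	/* first few nodes only */
		long bw = read_attr(node, "read_bandwidth");
		long lat = read_attr(node, "read_latency");

		if (bw < 0 && lat < 0)
			continue;
		printf("node%d: read_bandwidth=%ld read_latency=%ld\n",
		       node, bw, lat);
	}
	return 0;
}
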
--
Best Regards,
Yan, Zi