Message-ID: <874jrqb5ms.fsf@yhuang6-desk2.ccr.corp.intel.com>
Date: Mon, 13 Feb 2023 11:34:19 +0800
From: "Huang, Ying" <ying.huang@...el.com>
To: Bharata B Rao <bharata@....com>
Cc: Peter Zijlstra <peterz@...radead.org>,
<linux-kernel@...r.kernel.org>, <linux-mm@...ck.org>,
<mgorman@...e.de>, <mingo@...hat.com>, <bp@...en8.de>,
<dave.hansen@...ux.intel.com>, <x86@...nel.org>,
<akpm@...ux-foundation.org>, <luto@...nel.org>,
<tglx@...utronix.de>, <yue.li@...verge.com>,
<Ravikumar.Bangoria@....com>
Subject: Re: [RFC PATCH 0/5] Memory access profiler(IBS) driven NUMA balancing

Bharata B Rao <bharata@....com> writes:

> On 2/13/2023 8:26 AM, Huang, Ying wrote:
>> Bharata B Rao <bharata@....com> writes:
>>
>>> On 2/8/2023 11:33 PM, Peter Zijlstra wrote:
>>>> On Wed, Feb 08, 2023 at 01:05:28PM +0530, Bharata B Rao wrote:
>>>>
>>>>
>>>>> - Hardware provided access information could be very useful for driving
>>>>> hot page promotion in tiered memory systems. Need to check if this
>>>>> requires different tuning/heuristics apart from what NUMA balancing
>>>>> already does.
>>>>
>>>> I think Huang Ying looked at that from the Intel POV and I think the
>>>> conclusion was that it doesn't really work out. What you need is
>>>> frequency information, but the PMU doesn't really give you that. You
>>>> need to process a *ton* of PMU data in-kernel.
>>>
>>> What I am doing here is to feed the access data into NUMA balancing which
>>> already has the logic to aggregate that at task and numa group level and
>>> decide if that access is actionable in terms of migrating the page. In this
>>> context, I am not sure about the frequency information that you and Dave
>>> are mentioning. AFAIU, existing NUMA balancing takes care of taking
>>> action, IBS becomes an alternative source of access information to NUMA
>>> hint faults.
>>
>> We do need frequency information to determine whether a page is hot
>> enough to be migrated to the fast memory (promotion). What the PMU
>> provides is just "recently" accessed pages, not "frequently" accessed
>> pages. For the current NUMA balancing implementation, please check
>> NUMA_BALANCING_MEMORY_TIERING in should_numa_migrate_memory(). In
>> general, it estimates the page access frequency by measuring the
>> latency between page table scanning and the page fault: the shorter the
>> latency, the higher the frequency. This isn't perfect, but it provides a
>> starting point. You need to consider how to get frequency information
>> via the PMU. For example, you may count the number of accesses to each
>> page, age the counts periodically, and derive a hot threshold from some
>> statistics.
>
> For the tiered memory hot page promotion case of NUMA balancing, we will
> have to maintain frequency information in software when such information
> isn't available from the hardware.

Yes. It's challenging to calculate frequency information. Please
consider how to do that.

Best Regards,
Huang, Ying
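
The counting-and-aging idea mentioned in the thread (count accesses per
page, age the counts periodically, pick a hot threshold from statistics)
could look roughly like the user-space sketch below. This is only an
illustration under stated assumptions, not kernel code and not what the
RFC implements: struct page_heat, heat_record(), heat_age(),
heat_threshold(), heat_is_hot() and NR_PAGES are all hypothetical names,
the tracked range is a tiny fixed array, aging is a plain halving of each
counter, and the threshold is a simple percentile of the current counts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define NR_PAGES 8	/* hypothetical: tiny tracked range for illustration */

/* Hypothetical per-page access counters, indexed by page frame number. */
struct page_heat {
	uint32_t count[NR_PAGES];
};

/* Record one sampled access (for example, one PMU sample) to a page. */
static void heat_record(struct page_heat *h, size_t pfn)
{
	assert(pfn < NR_PAGES);
	if (h->count[pfn] < UINT32_MAX)	/* saturate instead of wrapping */
		h->count[pfn]++;
}

/* Periodic aging: halve every counter so that old accesses decay. */
static void heat_age(struct page_heat *h)
{
	for (size_t i = 0; i < NR_PAGES; i++)
		h->count[i] >>= 1;
}

static int cmp_u32(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a;
	uint32_t y = *(const uint32_t *)b;

	return (x > y) - (x < y);
}

/* Hot threshold from statistics: the count at percentile pct (0..100). */
static uint32_t heat_threshold(const struct page_heat *h, unsigned int pct)
{
	uint32_t sorted[NR_PAGES];
	size_t idx;

	assert(pct <= 100);
	memcpy(sorted, h->count, sizeof(sorted));
	qsort(sorted, NR_PAGES, sizeof(sorted[0]), cmp_u32);
	idx = (size_t)pct * (NR_PAGES - 1) / 100;
	return sorted[idx];
}

/* A page is a promotion candidate only if it is strictly above threshold. */
static int heat_is_hot(const struct page_heat *h, size_t pfn, uint32_t thr)
{
	assert(pfn < NR_PAGES);
	return h->count[pfn] > thr;
}
```

A caller would invoke heat_record() for each sample, run heat_age() from
a periodic timer, and recompute heat_threshold() before deciding which
pages to promote. The strict comparison in heat_is_hot() keeps pages with
a zero count from being reported as hot when most counters are zero. A
real implementation would also have to bound memory use and sampling
overhead, which this sketch ignores.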