Message-ID: <ccba1a65-fe4f-89d5-a32b-2efba30a1350@amd.com>
Date: Tue, 7 Feb 2023 12:11:47 +0530
From: Raghavendra K T <raghavendra.kt@....com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: linux-kernel@...r.kernel.org, linux-mm@...ck.org,
Ingo Molnar <mingo@...hat.com>, Mel Gorman <mgorman@...e.de>,
Andrew Morton <akpm@...ux-foundation.org>,
David Hildenbrand <david@...hat.com>, rppt@...nel.org,
Bharata B Rao <bharata@....com>,
Disha Talreja <dishaa.talreja@....com>
Subject: Re: [PATCH V2 2/3] sched/numa: Enhance vma scanning logic
On 2/4/2023 11:44 PM, Raghavendra K T wrote:
> On 2/3/2023 4:45 PM, Peter Zijlstra wrote:
>> On Wed, Feb 01, 2023 at 01:32:21PM +0530, Raghavendra K T wrote:
[...]
>
>>> + if (!vma_is_accessed(vma))
>>> + continue;
>>> +
>>> do {
>>> start = max(start, vma->vm_start);
>>> end = ALIGN(start + (pages << PAGE_SHIFT), HPAGE_SIZE);
>>
>>
>> This feels wrong, specifically we track numa_scan_offset per mm, now, if
>> we divide the threads into two dis-joint groups each only using their
>> own set of vmas (in fact quite common for workloads with proper data
>> partitioning) it is possible to consistently sample one set of threads
>> and thus not scan the other set of vmas.
>>
>> It seems somewhat unlikely, but not impossible to create significant
>> unfairness.
>>
>
> Agreed, but that is the reason why we want to allow the first few
> unconditional scans. Or am I missing something?
>
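
To make the unconditional-scan idea above concrete, the shape I have
in mind is roughly the following (sketch only; the helper and field
names here are assumptions, not the exact code in this series):

	/*
	 * Sketch: let the first few scan passes through
	 * unconditionally, so every vma gets prot_none faults
	 * installed at least once, and only after that restrict
	 * scanning to vmas the current task has recently faulted on.
	 */
	static bool vma_is_accessed(struct vm_area_struct *vma)
	{
		if (READ_ONCE(current->mm->numa_scan_seq) < 2)
			return true;

		return test_bit(hash_32(current->pid, ilog2(BITS_PER_LONG)),
				&vma->numab->access_pids);
	}
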
Thinking further, maybe we can summarize the different aspects of the
thread / two-disjoint-set case into:

1) Unfairness because of the way in which threads get the opportunity
to scan.
2) The disjoint sets of vmas in the partitions could be of different
sizes.
3) The disjoint sets of vmas could be associated with different
numbers of threads.

Each of the above can potentially benefit some threads or make some
thread do the heavy lifting, but (2) and (3) are what I think we are
trying to be okay with by making sure tasks mostly do not scan others'
vmas.

(1) could be a real issue (though I know there are many places where
we have mitigated this kind of issue by introducing an offset in
p->numa_next_scan), but how the distribution looks in practice I take
as a TODO and will post results.
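
For (1), the staggering I am referring to is roughly of this shape
(illustrative sketch only; the jitter computation and constants are
made up, the point is just offsetting per-task scan times):

	/*
	 * Sketch: offset each task's next scan time by a per-task
	 * jitter, so threads sharing an mm do not always sample the
	 * shared scan offset in the same order.
	 */
	unsigned long delay  = msecs_to_jiffies(p->numa_scan_period);
	unsigned long jitter = hash_32(p->pid, 10);	/* 0..1023 jiffies */

	p->numa_next_scan = jiffies + delay + jitter;
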