Message-ID: <21d55e78-de3e-4a95-acef-5fdc144f3a9a@amd.com>
Date: Tue, 25 Mar 2025 22:53:16 +0530
From: Nikhil Dhama <nikdhama@....com>
To: Raghavendra K T <raghavendra.kt@....com>,
Nikhil Dhama <nikhil.dhama@....com>, akpm@...ux-foundation.org,
ying.huang@...ux.alibaba.com
Cc: Ying Huang <huang.ying.caritas@...il.com>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, Bharata B Rao <bharata@....com>,
Raghavendra <raghavendra.kodsarathimmappa@....com>
Subject: Re: [PATCH -V2] mm: pcp: scale batch to reduce number of high order
pcp flushes on deallocation

On 3/25/2025 1:30 PM, Raghavendra K T wrote:
> On 3/19/2025 1:44 PM, Nikhil Dhama wrote:
> [...]
>>> And, do you run network related workloads on one machine? If so,
>>> please try to run them on two machines instead, with clients and
>>> servers run on different machines. At least, please use different
>>> sockets for clients and servers. Because larger pcp->free_count will
>>> make it easier to trigger free_high heuristics. If that is the case,
>>> please try to optimize free_high heuristics directly too.
>>
>> I agree with Ying Huang, the above change is not the best possible fix
>> for the issue. On further analysis I figured that the root cause of the
>> issue is the frequent pcp high order flushes. During a 20sec iperf3 run
>> I observed on avg 5 pcp high order flushes in kernel v6.6, whereas in
>> v6.7 I observed about 170 pcp high order flushes.
>> Tracing pcp->free_count, I figured that with patch v1 (the patch I
>> suggested earlier) free_count goes negative, which reduces the number of
>> times the free_high heuristic is triggered, hence reducing the high
>> order flushes.
>>
>> As Ying Huang suggested, increasing the batch size for free_high
>> heuristics helps the performance. I tried different scaling factors to
>> find the most suitable batch value for free_high heuristics:
>>
>>
>>                      score   # free_high
>> ------------------   -----   -----------
>> v6.6 (base)          100     4
>> v6.12 (batch*1)      69      170
>> batch*2              69      150
>> batch*4              74      101
>> batch*5              100     53
>> batch*6              100     36
>> batch*8              100     3
>>
>> Scaling batch for free_high heuristics with a factor of 5 restores the
>> performance.
>
> Hello Nikhil,
>
> Thanks for looking further into this. But from a design standpoint,
> it is not clear how a batch-size of 5 helps here (Andrew's original
> question).
>
> In any case, can you post the patch-set in a new email so that the below
> patch is not lost in the discussion thread?

Hi Raghavendra,

Thanks, I have posted the patch-set in a new email:
https://lore.kernel.org/linux-mm/20250325171915.14384-1-nikhil.dhama@amd.com/
with a better explanation of how scaling batch helps here.

Thanks,
Nikhil