Message-ID: <2cfb4e1a-d9be-47ab-b92d-94cd65bfec43@linux.alibaba.com>
Date: Mon, 2 Sep 2024 17:36:36 +0800
From: Baolin Wang <baolin.wang@...ux.alibaba.com>
To: Nanyong Sun <sunnanyong@...wei.com>, Matthew Wilcox <willy@...radead.org>
Cc: hughd@...gle.com, akpm@...ux-foundation.org, david@...hat.com,
ryan.roberts@....com, baohua@...nel.org, ioworker0@...il.com,
peterx@...hat.com, ziy@...dia.com, wangkefeng.wang@...wei.com,
linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH] mm: control mthp per process/cgroup
On 2024/8/19 13:58, Nanyong Sun wrote:
> On 2024/8/17 2:15, Matthew Wilcox wrote:
>
>> On Fri, Aug 16, 2024 at 05:13:27PM +0800, Nanyong Sun wrote:
>>> Currently the large folio control interfaces are system-wide and
>>> tend to default to on: file systems use large folios by default if
>>> supported, and mTHP tends to be enabled by default at boot [1].
>>> With large folios enabled, some workloads see a performance benefit,
>>> but others may not, and side effects can occur: memory usage may
>>> increase, and direct reclaim may run more frequently because of more
>>> large-order allocations, which in turn raises CPU usage. We observed
>>> this in a production environment running nginx: the pgscan_direct
>>> count increased significantly, reaching up to 3000 events per
>>> second, and disabling file large folios fixed it.
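
(The measurement above can be reproduced with something like the sketch
below. It assumes the upstream per-size mTHP sysfs files under
/sys/kernel/mm/transparent_hugepage/ and the pgscan_direct counter in
/proc/vmstat, needs root for the writes, and only toggles the mTHP
knobs; it does not model a separate "file large folio" switch.)

# Minimal sketch: toggle the system-wide per-size mTHP knobs and sample
# the pgscan_direct rate from /proc/vmstat. Needs root for the writes.
import glob
import time

def read_pgscan_direct():
    with open("/proc/vmstat") as f:
        for line in f:
            key, val = line.split()
            if key == "pgscan_direct":
                return int(val)
    return 0

def set_mthp(mode):
    # mode is one of: always, inherit, madvise, never
    for knob in glob.glob(
            "/sys/kernel/mm/transparent_hugepage/hugepages-*kB/enabled"):
        with open(knob, "w") as f:
            f.write(mode)

def pgscan_direct_rate(seconds=10):
    before = read_pgscan_direct()
    time.sleep(seconds)
    return (read_pgscan_direct() - before) / seconds

set_mthp("never")  # mirror "disable large folios"
print("pgscan_direct/s with mTHP off:", pgscan_direct_rate())
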
>> Can you share any details of your nginx workload that shows a regression?
>> The heuristics for allocating large folios are completely untuned, so
>> having data for a workload which performs better with small folios is
>> very valuable.
>>
> The RPS (requests per second), which is the performance metric of the
> nginx workload, shows no regression (and also no improvement); we just
> observed that the pgscan_direct rate is much higher with large folios.
> So far we have benchmarked several workloads; some showed no
> performance improvement, but none showed a regression.
> In a production environment, different workloads may be deployed on
> the same machine. Do we therefore need a process/cgroup-level control
> to prevent workloads that will not benefit from using mTHP? That way,
> the memory overhead and direct reclaim caused by mTHP could be avoided
> for those processes/cgroups.
OK. So there is no regression with mTHP; this seems to be just
theoretical analysis so far.
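
To make the idea concrete: presumably what you have in mind is a new
per-memcg file that overrides the global knob. Purely as an
illustration (memory.mthp.enabled is an invented name here, not an
existing upstream interface), it might be driven like this:

# Purely illustrative: drive a *hypothetical* per-cgroup mTHP control.
# "memory.mthp.enabled" is an invented file name, not an upstream API.
import os

CGROUP_ROOT = "/sys/fs/cgroup"  # assumes cgroup v2

def set_cgroup_mthp(cgroup, mode):
    # mode would mirror the global knob: always / madvise / never
    path = os.path.join(CGROUP_ROOT, cgroup, "memory.mthp.enabled")
    with open(path, "w") as f:
        f.write(mode)

# e.g. let a database slice use mTHP while keeping nginx on small
# folios (both slice names are placeholders):
# set_cgroup_mthp("db.slice", "always")
# set_cgroup_mthp("nginx.slice", "never")

One design question such an interface would raise is how the per-cgroup
setting should interact with the global and per-size knobs.
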
IMHO, it would be better to evaluate your 'per-cgroup mTHP control'
idea on some real workloads and gather some data for evaluation, which
would be more convincing.
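
For example, sampling each workload cgroup's reclaim counters with mTHP
enabled and disabled would give concrete numbers. A rough sketch,
assuming cgroup v2 and a kernel whose memory.stat exposes pgscan and
pgsteal (field names vary by kernel version; "nginx.slice" is just a
placeholder):

# Sketch for gathering per-cgroup evidence: sample reclaim counters
# from cgroup v2 memory.stat over an interval. Assumes the pgscan and
# pgsteal fields exist; adjust names for your kernel.
import time

def read_stat(cgroup, fields=("pgscan", "pgsteal")):
    stats = {}
    with open(f"/sys/fs/cgroup/{cgroup}/memory.stat") as f:
        for line in f:
            key, val = line.split()
            if key in fields:
                stats[key] = int(val)
    return stats

def sample_rates(cgroup, seconds=60):
    before = read_stat(cgroup)
    time.sleep(seconds)
    after = read_stat(cgroup)
    return {k: (after[k] - before[k]) / seconds for k in before}

# e.g.: print(sample_rates("nginx.slice"))

Comparing these rates (together with RPS) across the two configurations
would make the cost/benefit much easier to judge.
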
Just my 2 cents:)