Message-ID: <af4f3e15-a306-4728-b5bf-1deaa700c99b@huawei-partners.com>
Date: Tue, 3 Feb 2026 17:25:11 +0300
From: Gutierrez Asier <gutierrez.asier@...wei-partners.com>
To: SeongJae Park <sj@...nel.org>
CC: <artem.kuzin@...wei.com>, <stepanov.anatoly@...wei.com>,
<wangkefeng.wang@...wei.com>, <yanquanmin1@...wei.com>, <zuoze1@...wei.com>,
<damon@...ts.linux.dev>, <akpm@...ux-foundation.org>, <linux-mm@...ck.org>,
<linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH v1 0/4] mm/damon: Support hot application detections
SeongJae,
Thanks a lot for all the useful feedback.
One thing I was not sure about while working on this patch set
is whether to implement it as a new external module or to add the
logic to the DAMON core. I mean, hot application detection could be
useful for all the other modules and could improve DAMON's performance.
What do you think? My implementation is module based because I tried
to avoid changes to the DAMON core for the RFC.
On 2/3/2026 4:10 AM, SeongJae Park wrote:
> Hello Asier,
>
>
> Thank you for sharing this nice RFC patch series!
>
> On Mon, 2 Feb 2026 14:56:45 +0000 <gutierrez.asier@...wei-partners.com> wrote:
>
>> From: Asier Gutierrez <gutierrez.asier@...wei-partners.com>
>>
>> Overview
>> ----------
>>
>> This patch set introduces a new dynamic mechanism for detecting hot applications
>> and hot regions in those applications.
>>
>> Motivation
>> -----------
>>
>> Currently, DAMON requires the system administrator to specify which
>> applications should be monitored, along with all the monitoring parameters.
>> Ideally this should happen automatically, with minimal intervention from
>> the system administrator.
>>
>>
>> Since the TLB is a bottleneck on many systems, one way to reduce TLB misses
>> (or increase hits) is to use huge pages. Unfortunately, setting THP to
>> "always" leads to memory fragmentation and memory waste. For this reason,
>> most application guides and system administrators suggest disabling THP.
>>
>> We would like to detect: 1. which applications are hot in the system, and
>> 2. which memory regions within them are hot, in order to collapse those regions.
>>
>>
>> Solution
>> -----------
>>
>> ┌────────────┐ ┌────────────┐
>> │Damon_module│ │Task_monitor│
>> └──────┬─────┘ └──────┬─────┘
>> │ start │
>> │───────────────────────>│
>> │ │
>> │ │────┐
>> │ │ │ calculate task load
>> │ │<───┘
>> │ │
>> │ │────┐
>> │ │ │ sort tasks
>> │ │<───┘
>> │ │
>> │ │────┐
>> │ │ │ start kdamond for top 3 tasks
>> │ │<───┘
>> ┌──────┴─────┐ ┌──────┴─────┐
>> │Damon_module│ │Task_monitor│
>> └────────────┘ └────────────┘
>>
>>
>> We calculate the task load based on the sum of the utime of all the threads
>> in a given task. Once we have the total utime, we feed it to the exponential
>> load average provided by calc_load(). For tasks that become cold, their
>> kdamond is stopped.
>
> Sounds interesting, and this high level idea makes sense to me. :)
>
> I'd like to further learn a few things. Is there a reason to think the top 3
> tasks are enough number of tasks? Also, what if a region was hot and
> successfully promoted to use huge pages, but later be cold? Should we also
> have a DAMOS scheme for splitting such no-more-hot huge pages?
>
>>
>> In each kdamond, we start with a high min_access value. Our goal is to find
>> the maximum min_access value at which the DAMON action is still applied. In
>> each cycle, if no action is applied, we lower min_access.
>
> Sounds like a nice auto-tuning. And we have DAMOS quota goal for that kind of
> auto-tuning. Have you considered using that?
>
>>
>> Regarding the action, we introduce a new one: DAMOS_COLLAPSE. It lets us
>> collapse regions synchronously and avoids polluting khugepaged and other
>> parts of the MM subsystem with DAMON-specific code. DAMOS_HUGEPAGE
>> eventually calls hugepage_madvise(), which needs the correct vm_flags_t set.
>>
>> Benchmark
>> -----------
>
> Seems you forgot writing this section up. Or, you don't have benchmark results
> yet, but only mistakenly wrote the above section header? Either is fine, as
> this is just an RFC. Nevertheless, test results and your expected use case of
> this patch series will be very helpful.
>
>
> Thanks,
> SJ
>
> [...]
>
--
Asier Gutierrez
Huawei