lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ab6b06d6-d2e8-4695-9eda-5bfaf507c2f1@huawei.com>
Date: Tue, 29 Jul 2025 21:52:18 +0800
From: Quanmin Yan <yanquanmin1@...wei.com>
To: SeongJae Park <sj@...nel.org>
CC: zuoze <zuoze1@...wei.com>, <akpm@...ux-foundation.org>,
	<damon@...ts.linux.dev>, <linux-kernel@...r.kernel.org>,
	<linux-mm@...ck.org>, <wangkefeng.wang@...wei.com>
Subject: Re: [RFC PATCH] mm/damon: add full LPAE support for memory monitoring
 above 4GB


在 2025/7/27 1:16, SeongJae Park 写道:
> Hi Quanmin,
>
> On Sat, 26 Jul 2025 11:14:19 +0800 Quanmin Yan <yanquanmin1@...wei.com> wrote:
>
>> 在 2025/7/26 4:22, SeongJae Park 写道:
>>> On Fri, 25 Jul 2025 11:15:22 +0800 zuoze <zuoze1@...wei.com> wrote:
>>>
>>>> 在 2025/4/23 1:43, SeongJae Park 写道:
>>>>> On Tue, 22 Apr 2025 19:50:11 +0800 zuoze <zuoze1@...wei.com> wrote:
>>>>>
>>>>> [...]
>>>>>> Thanks for the patches - I’ve noted the RFC series and user-space
>>>>>> updates. Apologies for the delay; I’ll prioritize reviewing these soon
>>>>>> to verify they meet the intended tracking goals. Appreciate your
>>>>>> patience.
>>>>> No worry.  Please take your time and let me know if there is anything I can
>>>>> help.
>>>>>
>>>>> I think we can improve the user-space tool support better for usability.  For
>>>>> example, it could find LPAE case, set addr_unit parameter, and convert
>>>>> user-input and output address ranges on its own.  But hopefully the current
>>>>> support allows simple tests of the kernel side change, and we could do such
>>>>> improvement after the kernel side change is made.
>>>>>
>>>>>
>>>> Hi SJ,
>>>>
>>>> Apologies for the delayed response. We've verified your patch in our
>>>> environment and confirmed it supports LPAE address monitoring.
>>> No worry, thank you for testing that :)
>>>
>>>> However,
>>>> we observed some anomalies in the reclaim functionality. During code
>>>> review, we identified a few issues:
>>>>
>>>> The semantic meaning of damon_region changed after addr_unit was
>>>> introduced. The units in damon_addr_range may no longer represent bytes
>>>> directly.
>>> You're right, and this is an intended change.
>>>
>>>> The size returned by damon_sz_region() now requires multiplication by
>>>> addr_unit to get the actual byte count.
>>> Again, this is an intended change.  damon_sz_region() callers should aware this
>>> semantic and updated accordingly, if it could make a real problem otherwise.
>>> If you found such changes required cases that this patch series is missing,
>>> could you please list up?
>>>
>>>> Heavy usage of damon_sz_region() and DAMON_MIN_REGION likely requires
>>>> addr_unit-aware adjustments throughout the codebase. While this approach
>>>> works, it would involve considerable changes.
>>> It has been a while since I wrote this patch series, but at least while writing
>>> it, I didn't find such required changes.  Of course I should missed something,
>>> though.  As I mentioned above, could you please list such changes required
>>> parts that makes problem?  That would be helpful at finding the path forward.
>>>
>>>> What's your perspective on
>>>> how we should proceed?
>>> Let's see the list of required additional changes with why those are required
>>> (what problems can happen if such chadnges are not made), and discuss.
>> Hi SJ,
>>
>> Thank you for your email reply. Let's discuss the impacts introduced after
>> incorporating addr_unit. First of all, it's essential to clarify that the
>> definition of damon_addr_range (in damon_region) has changed, we will now use
>> damon_addr_range * addr_unit to calculate physical addresses.
>>
>> I've noticed some issues, in mm/damon/core.c:
>>
>>    damos_apply_scheme()
>>        ...
>>        unsigned long sz = damon_sz_region(r);  // the unit of 'sz' is no longer bytes.
>>        ...
>>        if (c->ops.apply_scheme)
>>            if (quota->esz && quota->charged_sz + sz > quota->esz)
>>                sz = ALIGN_DOWN(quota->esz - quota->charged_sz,
>>                        DAMON_MIN_REGION);  // the core issue lies here.
>>            ...
>>            quota->charged_sz += sz;    // note the units.
>>        ...
>>        update_stat:
>>            // 'sz' should be multiplied by addr_unit:
>>            damos_update_stat(s, sz, sz_applied, sz_ops_filter_passed);
>>
>> Currently, DAMON_MIN_REGION is defined as PAGE_SIZE, therefore aligning
>> sz downward to DAMON_MIN_REGION is likely unreasonable. Meanwhile, the unit
>> of sz in damos_quota is also not bytes, which necessitates updates to comments
>> and user documentation. Additionally, the calculation involving DAMON_MIN_REGION
>> requires reconsideration. Here are a few examples:
>>
>>    damos_skip_charged_region()
>>        ...
>>        sz_to_skip = ALIGN_DOWN(quota->charge_addr_from -
>>                        r->ar.start, DAMON_MIN_REGION);
>>        ...
>>        if (damon_sz_region(r) <= DAMON_MIN_REGION)
>>                        return true;
>>        sz_to_skip = DAMON_MIN_REGION;
>>
>>    damon_region_sz_limit()
>>          ...
>>        if (sz < DAMON_MIN_REGION)
>>            sz = DAMON_MIN_REGION;
> Thank you for this kind and detailed explanation of the issue!  I understand
> adopting addr_unit would make DAMON_MINREGION 'addr_unit * 4096' bytes, and it
> is not a desired result when 'addr_unit' is large.  For example, if 'addr_unit'
> is set as 4096, the access monitoring and operation schemes will work in only
>> 16 MiB granularity at the best.
>> Now I can think of two approaches, one is to keep sz in bytes, this requires
>> modifications to many other call sites that use these two functions (at least
>> passing the corresponding ctx->addr_unit. By the way, should we restrict the
>> input of addr_unit?):
>>
>>    damos_apply_scheme()
>>        ...
>>    -    unsigned long sz = damon_sz_region(r);
>>    +    unsigned long sz = damon_sz_region(r) * c->addr_unit;
>>        ...
>>    -    damon_split_region_at(t, r, sz);
>>    +    damon_split_region_at(t, r, sz / c->addr_unit);
>>
>> The second approach is to divide by addr_unit when applying DAMON_MIN_REGION,
>> and revert to byte units for statistics, this approach seems to involve
>> significant changes as well:
>>
>>    damos_apply_scheme()
>>        ...
>>        sz = ALIGN_DOWN(quota->esz - quota->charged_sz,
>>    -                    DAMON_MIN_REGION);
>>    +                    DAMON_MIN_REGION / c->addr_unit);
>>        ...
>>    update_stat:
>>    -    damos_update_stat(s, sz, sz_applied, sz_ops_filter_passed);
>>    +    damos_update_stat(s, sz, sz_applied * c->addr_unit, sz_ops_filter_passed);
>>
>> These are my observations. What's your perspective on how we should proceed? Looking
>> forward to your reply.
> I think the second approach is better.  But I want to avoid changing every
> DAMON_MIN_REGION usage.  What about changing DAMON_MIN_REGION as 'max(4096 /
> addr_unit, 1)' instead?  Specifically, we can change DAMON_MIN_REGION from a
> global macro value to per-context variable (a field of damon_ctx), and set it
> accordingly when the parameters are set.
>
> For stats, I think the users should aware of the fact DAMON is working with the
> addr_unit, so they should multiply addr_unit to the stats to get bytes
> information.  So, I think the stats update in kernel is not really required.
> DAMON user-space tool may need to be updated accordingly, though.
>
> I didn't take time to think about all corner cases, so I may missing something.
> Please let me knwo if you find such missing things.

Hi SJ,

Apologies for the delayed response to this email. Following your suggested method,
I added implementation damon_ctx->min_region, and also uncovered additional
issues specific to 32-bit platforms. I've prepared a patch series, which is
currently under testing. I'll get back to you as soon as the verification is complete.

Thanks,
Quanmin Yan

>
> Thanks,
> SJ
>
> [...]
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ