[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-Id: <20250726171616.53704-1-sj@kernel.org>
Date: Sat, 26 Jul 2025 10:16:16 -0700
From: SeongJae Park <sj@...nel.org>
To: Quanmin Yan <yanquanmin1@...wei.com>
Cc: SeongJae Park <sj@...nel.org>,
zuoze <zuoze1@...wei.com>,
akpm@...ux-foundation.org,
damon@...ts.linux.dev,
linux-kernel@...r.kernel.org,
linux-mm@...ck.org,
wangkefeng.wang@...wei.com
Subject: Re: [RFC PATCH] mm/damon: add full LPAE support for memory monitoring above 4GB
Hi Quanmin,
On Sat, 26 Jul 2025 11:14:19 +0800 Quanmin Yan <yanquanmin1@...wei.com> wrote:
>
> 在 2025/7/26 4:22, SeongJae Park 写道:
> > On Fri, 25 Jul 2025 11:15:22 +0800 zuoze <zuoze1@...wei.com> wrote:
> >
> >>
> >> 在 2025/4/23 1:43, SeongJae Park 写道:
> >>> On Tue, 22 Apr 2025 19:50:11 +0800 zuoze <zuoze1@...wei.com> wrote:
> >>>
> >>> [...]
> >>>> Thanks for the patches - I’ve noted the RFC series and user-space
> >>>> updates. Apologies for the delay; I’ll prioritize reviewing these soon
> >>>> to verify they meet the intended tracking goals. Appreciate your
> >>>> patience.
> >>> No worry. Please take your time and let me know if there is anything I can
> >>> help.
> >>>
> >>> I think we can improve the user-space tool support better for usability. For
> >>> example, it could find LPAE case, set addr_unit parameter, and convert
> >>> user-input and output address ranges on its own. But hopefully the current
> >>> support allows simple tests of the kernel side change, and we could do such
> >>> improvement after the kernel side change is made.
> >>>
> >>>
> >> Hi SJ,
> >>
> >> Apologies for the delayed response. We've verified your patch in our
> >> environment and confirmed it supports LPAE address monitoring.
> > No worry, thank you for testing that :)
> >
> >> However,
> >> we observed some anomalies in the reclaim functionality. During code
> >> review, we identified a few issues:
> >>
> >> The semantic meaning of damon_region changed after addr_unit was
> >> introduced. The units in damon_addr_range may no longer represent bytes
> >> directly.
> > You're right, and this is an intended change.
> >
> >> The size returned by damon_sz_region() now requires multiplication by
> >> addr_unit to get the actual byte count.
> > Again, this is an intended change. damon_sz_region() callers should aware this
> > semantic and updated accordingly, if it could make a real problem otherwise.
> > If you found such changes required cases that this patch series is missing,
> > could you please list up?
> >
> >> Heavy usage of damon_sz_region() and DAMON_MIN_REGION likely requires
> >> addr_unit-aware adjustments throughout the codebase. While this approach
> >> works, it would involve considerable changes.
> > It has been a while since I wrote this patch series, but at least while writing
> > it, I didn't find such required changes. Of course I should missed something,
> > though. As I mentioned above, could you please list such changes required
> > parts that makes problem? That would be helpful at finding the path forward.
> >
> >> What's your perspective on
> >> how we should proceed?
> > Let's see the list of required additional changes with why those are required
> > (what problems can happen if such chadnges are not made), and discuss.
>
> Hi SJ,
>
> Thank you for your email reply. Let's discuss the impacts introduced after
> incorporating addr_unit. First of all, it's essential to clarify that the
> definition of damon_addr_range (in damon_region) has changed, we will now use
> damon_addr_range * addr_unit to calculate physical addresses.
>
> I've noticed some issues, in mm/damon/core.c:
>
> damos_apply_scheme()
> ...
> unsigned long sz = damon_sz_region(r); // the unit of 'sz' is no longer bytes.
> ...
> if (c->ops.apply_scheme)
> if (quota->esz && quota->charged_sz + sz > quota->esz)
> sz = ALIGN_DOWN(quota->esz - quota->charged_sz,
> DAMON_MIN_REGION); // the core issue lies here.
> ...
> quota->charged_sz += sz; // note the units.
> ...
> update_stat:
> // 'sz' should be multiplied by addr_unit:
> damos_update_stat(s, sz, sz_applied, sz_ops_filter_passed);
>
> Currently, DAMON_MIN_REGION is defined as PAGE_SIZE, therefore aligning
> sz downward to DAMON_MIN_REGION is likely unreasonable. Meanwhile, the unit
> of sz in damos_quota is also not bytes, which necessitates updates to comments
> and user documentation. Additionally, the calculation involving DAMON_MIN_REGION
> requires reconsideration. Here are a few examples:
>
> damos_skip_charged_region()
> ...
> sz_to_skip = ALIGN_DOWN(quota->charge_addr_from -
> r->ar.start, DAMON_MIN_REGION);
> ...
> if (damon_sz_region(r) <= DAMON_MIN_REGION)
> return true;
> sz_to_skip = DAMON_MIN_REGION;
>
> damon_region_sz_limit()
> ...
> if (sz < DAMON_MIN_REGION)
> sz = DAMON_MIN_REGION;
Thank you for this kind and detailed explanation of the issue! I understand
adopting addr_unit would make DAMON_MINREGION 'addr_unit * 4096' bytes, and it
is not a desired result when 'addr_unit' is large. For example, if 'addr_unit'
is set as 4096, the access monitoring and operation schemes will work in only
>16 MiB granularity at the best.
>
> Now I can think of two approaches, one is to keep sz in bytes, this requires
> modifications to many other call sites that use these two functions (at least
> passing the corresponding ctx->addr_unit. By the way, should we restrict the
> input of addr_unit?):
>
> damos_apply_scheme()
> ...
> - unsigned long sz = damon_sz_region(r);
> + unsigned long sz = damon_sz_region(r) * c->addr_unit;
> ...
> - damon_split_region_at(t, r, sz);
> + damon_split_region_at(t, r, sz / c->addr_unit);
>
> The second approach is to divide by addr_unit when applying DAMON_MIN_REGION,
> and revert to byte units for statistics, this approach seems to involve
> significant changes as well:
>
> damos_apply_scheme()
> ...
> sz = ALIGN_DOWN(quota->esz - quota->charged_sz,
> - DAMON_MIN_REGION);
> + DAMON_MIN_REGION / c->addr_unit);
> ...
> update_stat:
> - damos_update_stat(s, sz, sz_applied, sz_ops_filter_passed);
> + damos_update_stat(s, sz, sz_applied * c->addr_unit, sz_ops_filter_passed);
>
> These are my observations. What's your perspective on how we should proceed? Looking
> forward to your reply.
I think the second approach is better. But I want to avoid changing every
DAMON_MIN_REGION usage. What about changing DAMON_MIN_REGION as 'max(4096 /
addr_unit, 1)' instead? Specifically, we can change DAMON_MIN_REGION from a
global macro value to per-context variable (a field of damon_ctx), and set it
accordingly when the parameters are set.
For stats, I think the users should aware of the fact DAMON is working with the
addr_unit, so they should multiply addr_unit to the stats to get bytes
information. So, I think the stats update in kernel is not really required.
DAMON user-space tool may need to be updated accordingly, though.
I didn't take time to think about all corner cases, so I may missing something.
Please let me knwo if you find such missing things.
Thanks,
SJ
[...]
Powered by blists - more mailing lists