[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHOvCC5izV51xSXFBBijVOCaffwimbmEe6YPk2AVJWXAK59t+w@mail.gmail.com>
Date: Sat, 13 Dec 2025 13:09:37 +0900
From: JaeJoon Jung <rgbi3307@...il.com>
To: SeongJae Park <sj@...nel.org>
Cc: Andrew Morton <akpm@...ux-foundation.org>, damon@...ts.linux.dev,
linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [RFC PATCH v3 07/37] mm/damon/core: apply access reports to high
level snapshot
On Sat, 13 Dec 2025 at 12:21, SeongJae Park <sj@...nel.org> wrote:
>
> On Sat, 13 Dec 2025 10:10:38 +0900 JaeJoon Jung <rgbi3307@...il.com> wrote:
>
> > On Sat, 13 Dec 2025 at 08:11, SeongJae Park <sj@...nel.org> wrote:
> > >
> > > On Fri, 12 Dec 2025 22:20:04 +0900 JaeJoon Jung <rgbi3307@...il.com> wrote:
> > >
> > > > On Mon, 8 Dec 2025 at 15:35, SeongJae Park <sj@...nel.org> wrote:
> > > > >
> > > > > Now any DAMON API callers can report their observed access information.
> > > > > The DAMON core layer is just ignoring those, though. Update the core to
> > > > > use the reported information at building the high level access pattern
> > > > > snapshot.
> > > >
> > > > It seems inefficient to repeatedly access the damon_access_reports[1000] array
> > > > using a for loop in the kdamond_check_reported_accesses() function.
> > > > It is inefficient to for loop through the entire
> > > > damon_access_reports[1000] array.
> > > > When CONFIG_HZ and jiffies are increased as follows and
> > > > damond sample_interval is 5000us (5ms), the time flow diagram is as follows.
> > > >
> > > > CONFIG_HZ 1000, jiffies == 1ms
> > > > damond sample_interval == 5000us (5ms)
> > > >
> > > > reports_len(==): [0 ... 5]
> > > > [*]
> > > > 0 1 2 3 4 5 6 7 8 9 997 998 999
> > > > [====|====|====|====|====]-----|----|----|----| .... |------|-------|
> > > > jiffies++ 1 2 3 4 5 0 0 0 0 0 0 0
> > > > damond_fn(sample interval) -5[0<]
> > > >
> > > > reports_len(==): [997 ... 2]
> > > > [*]
> > > > 0 1 2 3 4 5 6 7 8 9 997 998 999
> > > > [======|======]----|----|----|-----|----|----|----| .... [=====|=====]
> > > > jiffies++ 1001 1002 3 4 5 6 7 8 9 997 998 999
> > > > damond_fn(sample interval)
> > > > -5[997<]
> > >
> > > Please use fixed-length fonts for something like above, from next time. I
> > > fixed the diagram with my best effort, as above. But I still fail at
> > > understanding your point. More clarification about what the diagram means
> > > would be nice.
> >
> > Thank you for readjusting the font to fit. The first diagram above is when
> > reports_len is processed normally starting from 0 to reports_len.
> > The second diagram shows the process where reports_len increases to its
> > maximum values of 997, 998, 999, and then returns to 0.
>
> Thank you for adding this clarification.
>
> > The biggest problem here is that the latter part of the array is not processed.
>
> I don't get what "processed" is meaning, and what is the latter part of the
> array that not processed, and why it is a problem. Could you please clarify?
I'll just organize the code related to this issue as below.
This applies when kdamond_check_reported_accesses() is executed
when damon_access_reports_len becomes DAMON_ACCESS_REPORTS_CAP.
void damon_report_access(struct damon_access_report *report)
{
...
if (damon_access_reports_len == DAMON_ACCESS_REPORTS_CAP)
damon_access_reports_len = 0;
...
}
static void kdamond_check_reported_accesses(struct damon_ctx *ctx)
{
for (i = 0; i < damon_access_reports_len; i++) {
...
}
}
>
> >
> > >
> > > >
> > > > It seems that only the section corresponding to the sample interval ([==|==])
> > > > can be cycled as follows. And, how about enjoying damon_access_reports[1000]
> > > > as damon_access_reports[500]? Even if it reduce the 1000ms to 500ms
> > > > array space, it seems that it can sufficiently report and process within
> > > > the sample interval of 5ms.
> > >
> > > Are you assuming the the reports can be made only once per 1 millisecond? That
> > > is not true. The design assumes any kernel API caller could make the report,
> > > so more than one report can be made within one millisecond. Am I
> > > missingsomething?
> >
> > jiffies 1ms is just to simply unfold the passage of time when
> > CONFIG_HZ is set to 1000.
> > This is a simplification to help it understand the flow of time.
>
> So I understand you are saying that only one report can be made per jiffy. But
> that doesn't answer my question because I'm saying that design allows any
> report at any time. Any number of reports can be made within one jiffy time
> interval.
The input events are what you pointed out, but when reporting,
it is processed in jiffies time with time_before/after().
So we have to take everyone into consideration.
>
> >
> > >
> > > >
> > > > static unsigned int kdamond_check_reported_accesses(struct damon_ctx *ctx)
> > > > {
> > > > - int i;
> > > > + int i = damon_access_reports_len;
> > > > + unsigned int nr = 0;
> > > > struct damon_access_report *report;
> > > > struct damon_target *t;
> > > >
> > > > @@ -2904,16 +2905,18 @@ static unsigned int
> > > > kdamond_check_reported_accesses(struct damon_ctx *ctx)
> > > > return 0;
> > > >
> > > > mutex_lock(&damon_access_reports_lock);
> > > > - for (i = 0; i < damon_access_reports_len; i++) {
> > > > - report = &damon_access_reports[i];
> > > > - if (time_before(report->report_jiffies,
> > > > - jiffies -
> > > > - usecs_to_jiffies(
> > > > - ctx->attrs.sample_interval)))
> > > > - continue;
> > > > + report = &damon_access_reports[i];
> > > > + while (time_after(report->report_jiffies,
> > > > + jiffies - usecs_to_jiffies(ctx->attrs.sample_interval))) {
> > > > damon_for_each_target(t, ctx)
> > > > kdamond_apply_access_report(report, t, ctx);
> > > > + if (++nr >= DAMON_ACCESS_REPORTS_CAP)
> > > > + break;
> > > > +
> > > > + i = (i == 0) ? (DAMON_ACCESS_REPORTS_CAP - 1) : (i - 1);
> > > > + report = &damon_access_reports[i];
> > > > }
> > > > +
> > > > mutex_unlock(&damon_access_reports_lock);
> > > > /* For nr_accesses_bp, absence of access should also be reported. */
> > > > return kdamond_apply_zero_access_report(ctx);
> > > > }
> > >
> > > So I still don't get your points before the above code diff, but I understand
> > > this code diff.
> > >
> > > I agree this is more efficient. I will consider doing something like this in
> > > the next spin.
> >
> > What I tried above is to process the current array [1000] as
> > efficiently as possible.
> > But, if I think again, It would be better to store it in a linked-list
> > and process it
> > in FIFO mode whenever requested in damon_report_page_fault(),
> > damon_report_access(report)
> > instead of storing it in an array. I'm also analyzing the source code
> > starting this week,
> > so I'll organize it a bit more and get back to you with my opinion.
>
> I personally don't feel linked list is specially better than the current
> ring-buffer like implementation at the moment. But I would be happy to learn
> new ideas. Please feel free to revisit when you get a chance.
I agree that the ring-buffer you mentioned is good.
However, if this is not well controlled, it is less efficient than FIFO,
so I am analyzing your source code a bit more.
Thanks,
JaeJoon
>
>
> Thanks,
> SJ
>
> [...]
Powered by blists - more mailing lists