linux-kernel - Re: [RFC PATCH v3 07/37] mm/damon/core: apply access reports to high level snapshot

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20251213091704.53985-1-sj@kernel.org>
Date: Sat, 13 Dec 2025 01:17:03 -0800
From: SeongJae Park <sj@...nel.org>
To: SeongJae Park <sj@...nel.org>
Cc: JaeJoon Jung <rgbi3307@...il.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	damon@...ts.linux.dev,
	linux-kernel@...r.kernel.org,
	linux-mm@...ck.org
Subject: Re: [RFC PATCH v3 07/37] mm/damon/core: apply access reports to high level snapshot

On Fri, 12 Dec 2025 21:53:34 -0800 SeongJae Park <sj@...nel.org> wrote:

> On Sat, 13 Dec 2025 13:09:37 +0900 JaeJoon Jung <rgbi3307@...il.com> wrote:
> 
> > On Sat, 13 Dec 2025 at 12:21, SeongJae Park <sj@...nel.org> wrote:
> > >
> > > On Sat, 13 Dec 2025 10:10:38 +0900 JaeJoon Jung <rgbi3307@...il.com> wrote:
> > >
> > > > On Sat, 13 Dec 2025 at 08:11, SeongJae Park <sj@...nel.org> wrote:
> > > > >
> > > > > On Fri, 12 Dec 2025 22:20:04 +0900 JaeJoon Jung <rgbi3307@...il.com> wrote:
> > > > >
> > > > > > On Mon, 8 Dec 2025 at 15:35, SeongJae Park <sj@...nel.org> wrote:
> > > > > > >
> > > > > > > Now any DAMON API callers can report their observed access information.
> > > > > > > The DAMON core layer is just ignoring those, though.  Update the core to
> > > > > > > use the reported information at building the high level access pattern
> > > > > > > snapshot.
> > > > > >
> > > > > > It seems inefficient to repeatedly access the damon_access_reports[1000] array
> > > > > > using a for loop in the kdamond_check_reported_accesses() function.
> > > > > > It is inefficient to for loop through the entire
> > > > > > damon_access_reports[1000] array.
> > > > > > When CONFIG_HZ and jiffies are increased as follows and
> > > > > > damond sample_interval is 5000us (5ms), the time flow diagram is as follows.
> > > > > >
> > > > > > CONFIG_HZ 1000, jiffies == 1ms
> > > > > > damond sample_interval == 5000us (5ms)
> > > > > >
> > > > > > reports_len(==): [0 ... 5]
> > > > > >                                                      [*]
> > > > > >         0    1    2    3    4    5     6    7    8    9        997    998     999
> > > > > >         [====|====|====|====|====]-----|----|----|----|  ....  |------|-------|
> > > > > > jiffies++    1    2    3    4    5     0    0    0    0        0      0       0
> > > > > > damond_fn(sample interval)      -5[0<]
> > > > > >
> > > > > > reports_len(==): [997 ... 2]
> > > > > >                                   [*]
> > > > > >         0      1      2    3    4    5     6    7    8    9        997   998   999
> > > > > >         [======|======]----|----|----|-----|----|----|----|  ....  [=====|=====]
> > > > > > jiffies++   1001      1002 3    4    5     6    7    8    9        997   998   999
> > > > > > damond_fn(sample interval)
> > > > > >                             -5[997<]
> > > > >
> > > > > Please use fixed-length fonts for something like above, from next time.  I
> > > > > fixed the diagram with my best effort, as above.  But I still fail at
> > > > > understanding your point.  More clarification about what the diagram means
> > > > > would be nice.
> > > >
> > > > Thank you for readjusting the font to fit.  The first diagram above is when
> > > > reports_len is processed normally starting from 0 to reports_len.
> > > > The second diagram shows the process where reports_len increases to its
> > > > maximum values of 997, 998, 999, and then returns to 0.
> > >
> > > Thank you for adding this clarification.
> > >
> > > > The biggest problem here is that the latter part of the array is not processed.
> > >
> > > I don't get what "processed" is meaning, and what is the latter part of the
> > > array that not processed, and why it is a problem.  Could you please clarify?
> > 
> > I'll just organize the code related to this issue as below.
> > This applies when kdamond_check_reported_accesses() is executed
> > when damon_access_reports_len becomes DAMON_ACCESS_REPORTS_CAP.
> > 
> > void damon_report_access(struct damon_access_report *report)
> > {
> >         ...
> >         if (damon_access_reports_len == DAMON_ACCESS_REPORTS_CAP)
> >                 damon_access_reports_len = 0;
> >         ...
> > }
> > 
> > static void kdamond_check_reported_accesses(struct damon_ctx *ctx)
> > {
> >         for (i = 0; i < damon_access_reports_len; i++) {
> >                 ...
> >         }
> > }
> 
> Ok, so I understand that when damon_access_reports_len is reset, the reports
> that stored at the end part of the array is simply ignored.  And your suggested
> change can fix it.

And I now recall that this was an intended behavior for making this RFC very
simple and therefore can be implemented very quick.  The intention should
definitely be better documented.  I will do so by the next spin.

The purpose of this patchset is only for high level concept review and early
testing, not for being merged as-is.  Until I start believing the high level
concept is not making people very concerned, I want to keep this code as simple
as possible.  Only if it becomes clear the simplicity is making this too
inefficient to do even the early testing, I'd consider optimizing the
efficiency.

So, I will make the intention better documented, but keep the code as is for
now.

> 
> > 
> > >
> > > >
> > > > >
> > > > > >
> > > > > > It seems that only the section corresponding to the sample interval ([==|==])
> > > > > > can be cycled as follows.  And, how about enjoying damon_access_reports[1000]
> > > > > > as damon_access_reports[500]?  Even if it reduce the 1000ms to 500ms
> > > > > > array space, it seems that it can sufficiently report and process within
> > > > > > the sample interval of 5ms.
> > > > >
> > > > > Are you assuming the the reports can be made only once per 1 millisecond?  That
> > > > > is not true.  The design assumes any kernel API caller could make the report,
> > > > > so more than one report can be made within one millisecond.  Am I
> > > > > missingsomething?
> > > >
> > > > jiffies 1ms is just to simply unfold the passage of time when
> > > > CONFIG_HZ is set to 1000.
> > > > This is a simplification to help it understand the flow of time.
> > >
> > > So I understand you are saying that only one report can be made per jiffy.  But
> > > that doesn't answer my question because I'm saying that design allows any
> > > report at any time.  Any number of reports can be made within one jiffy time
> > > interval.
> > 
> > The input events are what you pointed out, but when reporting,
> > it is processed in jiffies time with time_before/after().
> > So we have to take everyone into consideration.
> 
> I don't get your point yet.  Can you please elaborate?

Any elaboration will still be helpful.

> 
> > 
> > >
> > > >
> > > > >
> > > > > >
> > > > > >  static unsigned int kdamond_check_reported_accesses(struct damon_ctx *ctx)
> > > > > >  {
> > > > > > -       int i;
> > > > > > +       int i = damon_access_reports_len;
> > > > > > +       unsigned int nr = 0;
> > > > > >         struct damon_access_report *report;
> > > > > >         struct damon_target *t;
> > > > > >
> > > > > > @@ -2904,16 +2905,18 @@ static unsigned int
> > > > > > kdamond_check_reported_accesses(struct damon_ctx *ctx)
> > > > > >                 return 0;
> > > > > >
> > > > > >         mutex_lock(&damon_access_reports_lock);
> > > > > > -       for (i = 0; i < damon_access_reports_len; i++) {
> > > > > > -               report = &damon_access_reports[i];
> > > > > > -               if (time_before(report->report_jiffies,
> > > > > > -                                       jiffies -
> > > > > > -                                       usecs_to_jiffies(
> > > > > > -                                               ctx->attrs.sample_interval)))
> > > > > > -                       continue;
> > > > > > +       report = &damon_access_reports[i];
> > > > > > +       while (time_after(report->report_jiffies,
> > > > > > +               jiffies - usecs_to_jiffies(ctx->attrs.sample_interval))) {
> > > > > >                 damon_for_each_target(t, ctx)
> > > > > >                         kdamond_apply_access_report(report, t, ctx);
> > > > > > +               if (++nr >= DAMON_ACCESS_REPORTS_CAP)
> > > > > > +                       break;
> > > > > > +
> > > > > > +               i = (i == 0) ? (DAMON_ACCESS_REPORTS_CAP - 1) : (i - 1);
> > > > > > +               report = &damon_access_reports[i];
> > > > > >         }
> > > > > > +
> > > > > >         mutex_unlock(&damon_access_reports_lock);
> > > > > >         /* For nr_accesses_bp, absence of access should also be reported. */
> > > > > >         return kdamond_apply_zero_access_report(ctx);
> > > > > > }
> > > > >
> > > > > So I still don't get your points before the above code diff, but I understand
> > > > > this code diff.
> > > > >
> > > > > I agree this is more efficient.  I will consider doing something like this in
> > > > > the next spin.

As I mentioned above, I may not do that but make the documentation better.


Thanks,
SJ

[...]