Message-ID: <20250508092833.800-1-yunjeong.mun@sk.com>
Date: Thu, 8 May 2025 18:28:27 +0900
From: Yunjeong Mun <yunjeong.mun@...com>
To: SeongJae Park <sj@...nel.org>
Cc: honggyu.kim@...com,
Jonathan Corbet <corbet@....net>,
damon@...ts.linux.dev,
kernel-team@...a.com,
linux-doc@...r.kernel.org,
linux-kernel@...r.kernel.org,
linux-mm@...ck.org,
Andrew Morton <akpm@...ux-foundation.org>,
kernel_team@...ynix.com
Subject: Re: [PATCH 0/7] mm/damon: auto-tune DAMOS for NUMA setups including tiered memory
Hi SeongJae, I'm sorry for the delayed response due to the holidays.
On Fri, 2 May 2025 08:49:49 -0700 SeongJae Park <sj@...nel.org> wrote:
> Hi Yunjeong,
>
> On Fri, 2 May 2025 16:38:48 +0900 Yunjeong Mun <yunjeong.mun@...com> wrote:
>
> > Hi SeongJae, thanks for your helpful auto-tuning patchset, which improves
> > the ease of use of DAMON on tiered memory systems. I have tested the
> > demotion mechanism with a microbenchmark and would like to share the result.
>
> Thank you for sharing your test result!
>
> [...]
> > Hardware
> > - Node 0: 512GB DRAM
> > - Node 1: 0GB (memoryless)
> > - Node 2: 96GB CXL memory
> >
> > Kernel
> > - RFC patchset on top of v6.14-rc7
> > https://lore.kernel.org/damon/20250320053937.57734-1-sj@kernel.org/
> >
> > Workload
> > - Microbenchmark creates hot and cold regions based on the specified parameters.
> > $ ./hot_cold 1g 100g
> > It repetitively performs memset on a 1GB hot region, but only performs memset
> > once on a 100GB cold region.
> >
> > DAMON setup
> > - My intention is to demote most of the cold memory regions from node 0 to
> > node 2, so I ran `damo start` with the below yaml configuration:
> > ...
> > # damo v2.7.2 from https://git.kernel.org/pub/scm/linux/kernel/git/sj/damo.git/
> > schemes:
> > - action: migrate_cold
> > target_nid: 2
> > ...
> > apply_interval_us: 0
> > quotas:
> > time_ms: 0 s
> > sz_bytes: 0 GiB
> > reset_interval_ms: 6 s
> > goals:
> > - metric: node_mem_free_bp
> > target_value: 99%
> > nid: 0
> > current_value: 1
> > effective_sz_bytes: 0 B
> > ...
>
> Sharing the DAMON parameters you used is helpful, thank you! Can you further
> share the full parameters? I'm especially interested in the parameters for
> the monitoring targets and the migrate_cold scheme's target access pattern,
> and whether there are other DAMON contexts or DAMOS schemes running together.
>
Actually, I realized that the 'regions' field in my YAML configuration was
incorrect. I had been using a configuration file that was created on another
server, not the testing server. As a result, the scheme was applied to the
wrong regions, causing the results to appear confusing. I have fixed the issue
and confirmed that the demotion occurred successfully. I'm sorry for any
confusion this may have caused.
After fixing it up, Honggyu and I tested this patchset again. I would like to
share two issues: 1) the action starts slowly, and 2) the action does not stop
even when the target is achieved. Below are the test configurations:
Hardware
- node 0: 64GB DRAM
- node 1: 0GB (memoryless)
- node 2: 96GB CXL memory
Kernel
- This patchset on top of v6.15-rc4
Workload: a microbenchmark that `mmap`s a region of the given size in GB and
`memset`s it once
$ ./mmap 50
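The benchmark is roughly equivalent to the following simplified sketch
(hypothetical code, not the exact source):

  #include <stdlib.h>
  #include <string.h>
  #include <unistd.h>
  #include <sys/mman.h>

  int main(int argc, char *argv[])
  {
          size_t len;
          char *buf;

          if (argc < 2)
                  return 1;
          len = strtoul(argv[1], NULL, 10) << 30;  /* argument is in GB */
          buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
          if (buf == MAP_FAILED)
                  return 1;
          memset(buf, 1, len);  /* touch every page once */
          pause();              /* then keep the now-cold mapping alive */
          return 0;
  }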
DAMON setup: just one context and one scheme.
...
schemes:
- action: migrate_cold
target_nid: 2
access_pattern:
sz_bytes:
min: 4.000 KiB
max: max
nr_accesses:
min: 0 %
max: 0 %
age:
min: 10 s
max: max
apply_interval_us: 0
quotas:
time_ms: 0 s
sz_bytes: 0 GiB
reset_interval_ms: 20 s
goals:
- metric: node_mem_free_bp
target_value: 50%
nid: 0
current_value: 1
...
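For reference, `node_mem_free_bp` tracks a node's free memory in basis points
(10000 bp = 100%, as we understand the metric), so the goal above works out to:
  target free space on node 0 = 64 GiB * 50% = 32 GiB (i.e. 5000 bp)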
The two issues mentioned above are both caused by the calculation logic of
`quota->esz`, which both grows and shrinks too gradually.
Slow start: 50GB of data is allocated on node 0, but the first demotion occurs
only after about 15 minutes. This is because `quota->esz` grows slowly even
while `current` is lower than `target`.
Not stop: the `target` is to maintain 50% free space on node 0, which we
expect to be about 32GB. However, it demoted more than intended, maintaining
about 90% free space as follows:
Per-node process memory usage (in MBs)
PID Node 0 Node 1 Node 2 Total
------------ ------ ------ ------ -----
1182 (watch) 2 0 0 2
1198 (mmap) 7015 0 44187 51201
------------ ------ ------ ------ -----
Total 7017 0 44187 51204
This is because the `esz` decreases only slowly after achieving the `target`:
the table shows only about 7GB of the 64GB on node 0 still in use, i.e.
roughly 89% free. In the end, the demotion occurred more excessively than
intended.
We believe that as the difference between `target` and `current` increases,
the `esz` should be raised more rapidly to increase the aggressiveness of the
action. In the current implementation, the `esz` remains low even when the
`current` is far below the `target`, leading to the slow-start issue.
Likewise, a high `esz` persists (decreasing only slowly) even in an
over-achieved state, leading to the not-stop issue.
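To illustrate the idea, here is a rough sketch of such an error-proportional
adjustment (a hypothetical helper, not the in-tree `quota->esz` code):

  /*
   * Hypothetical sketch: change the effective size quota in proportion
   * to the relative gap between `current` and `target`, so a large
   * shortfall ramps esz up quickly and a large overshoot backs it off
   * quickly. Overflow handling is omitted for brevity.
   */
  static unsigned long esz_next(unsigned long esz, unsigned long current,
                                unsigned long target)
  {
          const unsigned long min_esz = 4096;  /* never stall at zero */
          unsigned long step;

          if (esz < min_esz)
                  esz = min_esz;
          if (current < target) {
                  /* under-achieved: grow by the relative shortfall */
                  step = esz * (target - current) / target;
                  return esz + step;
          }
          /* over-achieved: shrink by the relative overshoot */
          step = esz * (current - target) / target;
          return step >= esz - min_esz ? min_esz : esz - step;
  }

With `node_mem_free_bp`, `current` below `target` means node 0 has less free
memory than wanted, so esz (and thus demotion) would ramp up quickly; once
free memory overshoots the target, esz would collapse back instead of
lingering high.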
>
> Yes, as you interpret it, the auto-tuning seems to be working as designed,
> but the migration did not happen successfully. I'm curious whether migration
> was tried but failed. DAMOS stats[1] may let us know that. Can you check and
> share those?
>
Thank you for pointing me to the DAMOS stats; I will use them when analyzing
with DAMON. I would appreciate any feedback you might have on the new results.
Best Regards,
Yunjeong
[..snip..]