Date:   Mon, 9 Oct 2023 11:48:25 +0200
From:   David Hildenbrand <david@...hat.com>
To:     Stefan Roesch <shr@...kernel.io>
Cc:     kernel-team@...com, akpm@...ux-foundation.org, hannes@...xchg.org,
        riel@...riel.com, linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH v1 0/4] mm/ksm: Add ksm advisor

On 06.10.23 18:17, Stefan Roesch wrote:
> 
> David Hildenbrand <david@...hat.com> writes:
> 
>> On 04.10.23 21:02, Stefan Roesch wrote:
>>> What is the KSM advisor?
>>> =========================
>>> The ksm advisor automatically manages the pages_to_scan setting to
>>> achieve a target scan time. The target scan time defines how many seconds
>>> it should take to scan all the candidate KSM pages. In other words the
>>> pages_to_scan rate is changed by the advisor to achieve the target scan
>>> time.
>>> Why do we need a KSM advisor?
>>> ==============================
>>> The number of candidate pages for KSM is dynamic. It can often be observed
>>> that during the startup of an application more candidate pages need to be
>>> processed. Without an advisor the pages_to_scan parameter needs to be
>>> sized for the maximum number of candidate pages. With the scan time
>>> advisor the pages_to_scan parameter can be changed based on demand.
>>> Algorithm
>>> ==========
>>> The algorithm calculates the change value based on the target scan time
>>> and the previous scan time. To avoid perturbations an exponentially
>>> weighted moving average is applied.
>>> The algorithm has a max and min value to:
>>> - guarantee responsiveness to changes
>>> - avoid spending too much CPU
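
A rough user-space model of the algorithm described above may help the discussion. This is only a sketch under assumptions, not the code from the patch: the class name, the `alpha` smoothing weight and the exact update rule are made up for illustration. It scales pages_to_scan by how far the previous scan time was from the target, smooths that with an exponentially weighted moving average, and clamps the result to the min/max bounds.

```python
from dataclasses import dataclass


@dataclass
class ScanTimeAdvisor:
    """Toy model of a scan time advisor (hypothetical, not the kernel code)."""

    target_scan_time: float   # seconds a full scan of all candidates should take
    min_pages: int            # lower bound for pages_to_scan
    max_pages: int            # upper bound for pages_to_scan
    pages_to_scan: int = 100  # current batch size; 100 is the default cited in this thread
    alpha: float = 0.3        # EWMA weight of the newest sample (assumed value)

    def update(self, last_scan_time: float) -> int:
        """Adjust pages_to_scan after a full scan that took last_scan_time seconds."""
        if last_scan_time <= 0:
            return self.pages_to_scan
        # A scan that took twice the target needs roughly twice the batch size.
        wanted = self.pages_to_scan * last_scan_time / self.target_scan_time
        # EWMA: move only part of the way from the current value to the wanted one.
        smoothed = self.alpha * wanted + (1 - self.alpha) * self.pages_to_scan
        # Clamp to the configured bounds.
        clamped = min(self.max_pages, max(self.min_pages, round(smoothed)))
        self.pages_to_scan = int(clamped)
        return self.pages_to_scan
```

When the previous scan time equals the target, the value stays where it is. A single slow scan only moves pages_to_scan part of the way, which is the smoothing effect the cover letter refers to.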
>>> Parameters to influence the KSM scan advisor
>>> =============================================
>>> The respective parameters are:
>>> - ksm_advisor_mode
>>>     0: None (default), 1: scan time advisor
>>> - ksm_advisor_target_scan_time
>>>     how many seconds a scan of all candidate pages should take
>>> - ksm_advisor_min_pages
>>>     minimum value for pages_to_scan per batch
>>> - ksm_advisor_max_pages
>>>     maximum value for pages_to_scan per batch
>>> The parameters are exposed as knobs in /sys/kernel/mm/ksm.
>>> By default the scan time advisor is disabled.
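
For illustration, setting these knobs from user space could look roughly like the helper below. It is a sketch under assumptions: the file names are taken one-to-one from the parameter names in the cover letter, which may not match the final sysfs names, and the `configure_advisor` helper itself is hypothetical. The `base` argument exists so the helper can be tried against a scratch directory.

```python
from pathlib import Path

# Knob names as listed in the cover letter. Whether the sysfs files carry
# exactly these names is an assumption.
KNOBS = (
    "ksm_advisor_mode",
    "ksm_advisor_target_scan_time",
    "ksm_advisor_min_pages",
    "ksm_advisor_max_pages",
)


def configure_advisor(settings, base=Path("/sys/kernel/mm/ksm")):
    """Validate advisor settings, then write each value to its knob file."""
    unknown = set(settings) - set(KNOBS)
    if unknown:
        raise ValueError(f"unknown knobs: {sorted(unknown)}")
    if settings.get("ksm_advisor_mode", 0) not in (0, 1):
        raise ValueError("ksm_advisor_mode must be 0 (none) or 1 (scan time advisor)")
    low = settings.get("ksm_advisor_min_pages")
    high = settings.get("ksm_advisor_max_pages")
    if low is not None and high is not None and low > high:
        raise ValueError("ksm_advisor_min_pages must not exceed ksm_advisor_max_pages")
    # All checks passed; write one value per file, as sysfs expects.
    for name, value in settings.items():
        (Path(base) / name).write_text(f"{value}\n")
```

On a real system the default `base` would require root, and the writes only succeed if the running kernel exposes files with these names.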
>>
>> What would be the main reason to not have this enabled as default?
>>
> There might already be existing users which directly set pages_to_scan
> and tuned the KSM settings accordingly, as the default setting of 100 for
> pages_to_scan is too low for typical workloads.

Good point.

> 
>> IIUC, it is kind-of an auto-tuning of pages_to_scan. Would "auto-tuning"
>> describe it better than "advisor" ?
>>
>> [...]
>>
> 
> I'm fine with auto-tune. I was also thinking about that name, but I
> chose advisor; it's a bit less strong and it needs input from the user.
> 

I'm not a native speaker, but "adviser" to me implies that no action is 
taken, only advice is given :) But again, no native speaker.

>>> How is defining a target scan time better?
>>> ===========================================
>>> For an administrator it is more logical to set a target scan time. The
>>> administrator can determine how many pages are scanned on each scan.
>>> Therefore setting a target scan time makes more sense.
>>> In addition the administrator might have a good idea about the
>>> memory sizing of their respective workloads.
>>
>> Is there any way you could imagine where we could have this just do something
>> reasonable without any user input? IOW, true auto-tuning?
>>
> 
> True auto-tuning might be difficult as users might want to be able to
> choose how aggressive KSM is. Some might want it to be as aggressive as
> possible to get the maximum de-duplication rate. Others might want a
> more balanced approach that takes CPU-consumption into consideration.
> 
> I guess it depends if you are memory-bound, cpu-bound or both.

Agreed, more below.

> 
>> I read above:
>>> - guarantee responsiveness to changes
>>> - avoid spending too much CPU
>>
>> Both of these are accountable/measurable. Could they be used as the input for
>> auto-tuning?
>>
> I'm not sure a true auto-tuning can be achieved. I think we need
> some input from the user
> - How many resources to consume
> - How fast memory changes or how stable memory is
>    (this we might be able to detect)

Setting the pages_to_scan is a bit mystical. Setting upper/lower 
pages_to_scan bounds is similarly mystical, and highly workload dependent.

So I agree that a better abstraction to automatically tune the scanning 
is reasonable. I wonder if we can let the user give better inputs that 
are less workload dependent.

For example, do we need min/max values for pages_to_scan, or can we 
replace them with something better suited to the auto-tuning algorithm?

IMHO "target scan time" goes in the right direction, but it can still 
be fairly workload dependent. Maybe a "max CPU consumption" or something 
like that would similarly help to limit CPU waste, and it could be fairly 
workload dependent.
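
To make the "max CPU consumption" idea a bit more concrete, here is a back-of-the-envelope sketch. Everything in it is an assumption for illustration: the function name, the per-page scan cost as an input, and the simplified model where the scanner processes one batch and then sleeps for a fixed interval (as with the existing sleep_millisecs knob). Under that model the CPU share is busy / (busy + sleep), which can be solved for the batch size.

```python
def pages_for_cpu_budget(max_cpu_fraction, cost_per_page_s, sleep_s):
    """Largest batch size that keeps the scanner within a CPU budget.

    Model (assumed): scan a batch, then sleep for sleep_s seconds.
    CPU share = busy / (busy + sleep), with busy = pages * cost_per_page_s.
    """
    if not 0 < max_cpu_fraction < 1:
        raise ValueError("max_cpu_fraction must be between 0 and 1 (exclusive)")
    if cost_per_page_s <= 0 or sleep_s <= 0:
        raise ValueError("cost_per_page_s and sleep_s must be positive")
    # Solve busy / (busy + sleep) = fraction for busy.
    busy_s = max_cpu_fraction * sleep_s / (1 - max_cpu_fraction)
    # Round down so the budget is not exceeded, but always scan at least one page.
    return max(1, int(busy_s / cost_per_page_s))
```

The per-page cost would have to be measured at runtime, and it varies with the workload, so this only moves the workload dependency from the user into the measurement.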


-- 
Cheers,

David / dhildenb
