linux-kernel - Re: [PATCH v2 2/5] mm: zswap: calculate limits only when updated

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <7aec7b98-db81-4238-bdd6-afb69679f852@redhat.com>
Date: Mon, 8 Apr 2024 09:54:47 +0200
From: David Hildenbrand <david@...hat.com>
To: Yosry Ahmed <yosryahmed@...gle.com>, Johannes Weiner <hannes@...xchg.org>
Cc: Andrew Morton <akpm@...ux-foundation.org>, Nhat Pham <nphamcs@...il.com>,
 Chengming Zhou <chengming.zhou@...ux.dev>, linux-mm@...ck.org,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 2/5] mm: zswap: calculate limits only when updated

On 05.04.24 20:43, Yosry Ahmed wrote:
> On Fri, Apr 5, 2024 at 8:26 AM Johannes Weiner <hannes@...xchg.org> wrote:
>>
>> On Fri, Apr 05, 2024 at 05:35:07AM +0000, Yosry Ahmed wrote:
>>> Currently, we calculate the zswap global limit, and potentially the
>>> acceptance threshold in the zswap, in pages in the zswap store path.
>>> This is unnecessary because the values rarely change.
>>>
>>> Instead, precalculate the them when the module parameters are updated,
>>> which should be rare. Since we are adding custom handlers for setting
>>> the percentages now, add proper validation that they are <= 100.
>>>
>>> Suggested-by: Johannes Weiner <hannes@...xchg.org>
>>> Signed-off-by: Yosry Ahmed <yosryahmed@...gle.com>
>>
>> Nice! Getting that stuff out of the hotpath!
>>
>> Two comments below:
>>
>>> @@ -684,6 +703,43 @@ static int zswap_enabled_param_set(const char *val,
>>>        return ret;
>>>   }
>>>
>>> +static int __zswap_percent_param_set(const char *val,
>>> +                                  const struct kernel_param *kp)
>>> +{
>>> +     unsigned int n;
>>> +     int ret;
>>> +
>>> +     ret = kstrtouint(val, 10, &n);
>>> +     if (ret || n > 100)
>>> +             return -EINVAL;
>>> +
>>> +     return param_set_uint(val, kp);
>>> +}
>>> +
>>> +static int zswap_max_pool_param_set(const char *val,
>>> +                                 const struct kernel_param *kp)
>>> +{
>>> +     int err = __zswap_percent_param_set(val, kp);
>>> +
>>> +     if (!err) {
>>> +             zswap_update_max_pages();
>>> +             zswap_update_accept_thr_pages();
>>> +     }
>>> +
>>> +     return err;
>>> +}
>>> +
>>> +static int zswap_accept_thr_param_set(const char *val,
>>> +                                   const struct kernel_param *kp)
>>> +{
>>> +     int err = __zswap_percent_param_set(val, kp);
>>> +
>>> +     if (!err)
>>> +             zswap_update_accept_thr_pages();
>>> +
>>> +     return err;
>>> +}
>>
>> I think you can keep this simple and just always update both if
>> anything changes for whatever reason. It's an extremely rare event
>> after all. That should cut it from 3 functions to 1.
> 
> Yeah we can also combine both update functions in this case.
> 
>>
>> Note that totalram_pages can also change during memory onlining and
>> offlining. For that you need a memory notifier that also calls that
>> refresh function. It's simple enough, though, check out the code
>> around register_memory_notifier() in drivers/xen/balloon.c.
> 
> Good point, I completely missed this. It seems like totalram_pages can
> actually change from contexts other than memory hotplug as well,
> specifically through adjust_managed_page_count(), and mostly through
> ballooning drivers. Do we trigger the notifiers in this case? I can't
> find such logic.

Things like virtio-balloon never online/offline memory and would never 
call it.

Things like virtio-mem sometimes will online/offline memory and would 
sometimes call it (but not always). Things like the Hyper-V balloon and 
XEN balloon never offline memory, and would only call it when onlining 
memory.

> 
> It seems like in this case the actual amount of memory does not
> change, but the drivers take it away from the system. It makes some
> sense to me that the zswap limits do not change in this case,
> especially that userspace just sets those limits as a percentage of
> total memory. I wouldn't expect userspace to take ballooning into
> account here.
> 

For virtio-mem, it does change ("actual amount of memory"). For 
virtio-balloon, it's tricky. When using virtio-balloon for VM resizing, 
it would similarly change. When using it for pure memory overcommit, it 
depends on whatever the policy in the hypervisor is ... might be that 
under memory pressure that memory is simply given back to the VM.

> However, it would be a behavioral change from today where we always
> rely on totalram_pages(). Also, if userspace happens to change the
> limit when a driver is holding a big chunk of memory away from
> totalram_pages, then the limit would be constantly underestimated.
> 
> I do not have enough familiarity with memory ballooning to know for
> sure if this is okay. How much memory can memory ballooning add/remove
> from totalram_pages(), and is it usually transient or does it stick
> around for a while?
> 
> Also CCing David here.

It can be a lot. Especially with the Hyper-V balloon (but also on 
environments where other forms of memory hotunplug are not supported), 
memory ballooning can be used to unplug memory. So that memory can be 
gone for good and it can end up being quite a lot of memory.

The clean thing to do would be to have a way for other subsystems to get 
notified on any totalram_pages() changes, so they can adjust accordingly.

-- 
Cheers,

David / dhildenb