Message-ID: <273a3376-c45a-4d41-85b4-9c4f3428f268@suse.de>
Date: Mon, 28 Jul 2025 11:28:19 +0200
From: Hannes Reinecke <hare@...e.de>
To: David Hildenbrand <david@...hat.com>, Oscar Salvador <osalvador@...e.de>
Cc: linux-mm@...ck.org, linux-kernel@...r.kernel.org,
Michal Hocko <mhocko@...e.com>, Hannes Reinecke <hare@...nel.org>
Subject: Re: [RFC] Disable auto_movable_ratio for selfhosted memmap

On 7/28/25 10:44, David Hildenbrand wrote:
> On 28.07.25 10:15, Oscar Salvador wrote:
>> Hi,
[ .. ]
>>
>> One way to tackle this would be to update the ratio every time a new
>> CXL card gets inserted, but this seems suboptimal.
>> Another way is that, since CXL memory works with selfhosted memmap, we
>> could relax the check when 'auto-movable' is set and only look at the
>> ratio if we aren't working with selfhosted memmap.
>
> The memmap is only a small piece of unmovable data we require late at
> runtime (a bigger factor is user space page tables actually mapping that
> memory). The zone ratio we have configured in the kernel dates back to
> the highmem times, where such ratios were considered safe. Maybe there
> are better defaults for the ratios today, but it really depends on the
> workload.
>
The point is that the ratio is calculated against the _entire_ memory,
which means that you have to _know_ how much memory you are going to
plug in before you plug it in.
So to get that right one would need to update the ratio before
plugging in one module, check that it succeeded, update the ratio
again, plug in the next module, check that, etc.
Really?
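
A rough sketch of that check in Python, to illustrate the point. This is
a simplified model, not the kernel code: it assumes the default
memory_hotplug.auto_movable_ratio of 301 (percent) and that only the
kernel memory present at boot counts towards the budget. The function
names and the 64 GiB / 128 GiB sizes are hypothetical.

```python
# Simplified model of the auto-movable ratio check; NOT the kernel code.
# Assumption: default memory_hotplug.auto_movable_ratio is 301 (percent).
AUTO_MOVABLE_RATIO = 301


def can_online_movable(kernel_gib, movable_gib, new_gib, ratio=AUTO_MOVABLE_RATIO):
    """True if onlining new_gib as MOVABLE keeps MOVABLE:KERNEL within ratio."""
    return (movable_gib + new_gib) * 100 <= kernel_gib * ratio


def plug_modules(kernel_gib, modules_gib, ratio=AUTO_MOVABLE_RATIO):
    """Plug modules one by one and return the zone each would be onlined to.

    Assumption: kernel_gib is the boot-time kernel memory and stays fixed;
    a module that does not fit the ratio falls back to a kernel zone.
    """
    movable_gib = 0
    zones = []
    for size in modules_gib:
        if can_online_movable(kernel_gib, movable_gib, size, ratio):
            movable_gib += size
            zones.append("Movable")
        else:
            zones.append("Normal")
    return zones


# 64 GiB at boot, two hypothetical 128 GiB modules.
print(plug_modules(64, [128, 128]))             # default ratio
print(plug_modules(64, [128, 128], ratio=401))  # ratio raised beforehand
```

With the default ratio only the first module fits into ZONE_MOVABLE in
this model; the second one only fits if the ratio was raised before it
was plugged in.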
> One could find ways of subtracting the selfhosted part, to account it
> differently in the kernel, but the memmap is not the only consumer that
> affects the ratio.
>
> I mean, the memmap is roughly 1.6%, I don't think that really makes a
> difference for you, does it? Can you share some real-life examples?
>
>
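
For reference, the "roughly 1.6%" can be reproduced with a short Python
calculation. It assumes a 64-byte struct page and 4 KiB base pages,
which is typical for 64-bit configurations but not guaranteed; the
128 GiB module size is a hypothetical example.

```python
# Assumptions: 64-byte struct page, 4 KiB base pages (typical, not guaranteed).
STRUCT_PAGE_BYTES = 64
PAGE_BYTES = 4096
GIB = 2**30


def memmap_bytes(memory_bytes):
    """Size of the memmap (one struct page per base page) for a memory range."""
    return memory_bytes // PAGE_BYTES * STRUCT_PAGE_BYTES


module = 128 * GIB  # hypothetical module size
overhead = memmap_bytes(module)
print(overhead / GIB, "GiB of memmap,", 100 * overhead / module, "% of the module")
```

Under these assumptions the memmap is 1/64 of the memory it describes.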
> I have a colleague working on one of my old prototypes (memoryhotplugd)
> for replacing udev rules.
>
> The idea there is to detect that CXL memory is getting hotplugged and
> keep it offline, because user space hotplugging that memory (daxctl)
> will explicitly online it to the proper zone.
>
> Things like virtio-mem, DIMMs etc can happily use the auto-movable
> behavior. But the auto-movable behavior doesn't quite make sense if (a)
> you want everything movable and (b) daxctl already expects to online the
> memory itself, usually to ZONE_MOVABLE.
>
> So I think this is mostly a user-space problem to solve.
>
Hmm.
Yes, and no.
While CXL memory is hotpluggable (it's a PCI device, after all),
it won't be hotplugged on a regular basis.
So the current use-case I'm aware of is that the system will be
configured once, and then it will be expected to come up in the
very same state after reboot.
As such, a daemon is a bit of overkill, as the number of events
it would need to listen to is in the very low single-digit range.
Udev rules would work fine, though (in fact, we already have one ...)
so I'd be happy to keep CXL memory offline after boot / hotplug
and let udev / daxctl handle things.
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@...e.de +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich