[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <e0f7fc1e-2227-4c6b-985a-34a697a52679@redhat.com>
Date: Fri, 30 May 2025 10:39:39 +0200
From: David Hildenbrand <david@...hat.com>
To: Michal Hocko <mhocko@...e.com>
Cc: Baoquan He <bhe@...hat.com>, Donald Dutile <ddutile@...hat.com>,
Jiri Bohac <jbohac@...e.cz>, Vivek Goyal <vgoyal@...hat.com>,
Dave Young <dyoung@...hat.com>, kexec@...ts.infradead.org,
Philipp Rudo <prudo@...hat.com>, Pingfan Liu <piliu@...hat.com>,
Tao Liu <ltao@...hat.com>, linux-kernel@...r.kernel.org,
David Hildenbrand <dhildenb@...hat.com>
Subject: Re: [PATCH v2 0/5] kdump: crashkernel reservation from CMA
On 30.05.25 10:28, Michal Hocko wrote:
> On Fri 30-05-25 10:06:52, David Hildenbrand wrote:
>> On 29.05.25 09:46, Michal Hocko wrote:
>>> On Wed 28-05-25 23:01:04, David Hildenbrand wrote:
>>> [...]
>>>> I think we just have to be careful to document it properly -- especially the
>>>> shortcomings and that this feature might become a problem in the future.
>>>> Movable user-space page tables getting placed on CMA memory would probably
>>>> not be a problem if we don't care about ... user-space data either way.
>>>
>>> I think makedumpfile could refuse to capture a dump if userspace memory
>>> is requested to enforce this.
>>
>> Yeah, it will be tricky once we support placing other memory on CMA regions.
>> E.g., there was the discussion of making some slab allocations movable.
>>
>> But probably, in such a configuration, we would later simply refuse to
>> active CMA kdump.
>
> Or we can make the kdump CMA region more special and only allow
> GFP_HIGHUSER_MOVABLE allocations from that. Anyaway I think we should
> deal with this once we get there.
Might be doable. When migrating (e.g., compacting) pages we'd have to
make sure to also not migrate these pages into the CMA regions. Might be
a bit more tricky, but likely solvable.
>
>>>> The whole "Direct I/O takes max 1s" part is a bit shaky. Maybe it could be
>>>> configurable how long to wait? 10s is certainly "safer".
>>>
>>> Quite honestly we will never know and rather than making this
>>> configurable I would go with reasonably large. Couple of seconds
>>> certainly do not matter for the kdump situations but I would go as far
>>> as minutes.
>>
>> I recall that somebody raised that kdump downtime might be problematic
>> (might affect service downtime?).
>>
>> So I would just add a kconfig option with a default of 10s.
>
> kconfig option usually doesn't really work for distro kernels. I am
> personally not really keen on having a tuning knob because there is a
> risk of cargo cult based tuning we have seen in other areas. That might
> make it hard to remove the knob later on. Fundamentally we should have 2
> situations though. Either we know that the HW is sane and then we
> shouldn't really need any sleep or the HW might misbehave and then we
> need to wait _some_ time. If our initial guess is incorrect then we can
> always increase it and we would learn about that through bug reports.
kconfigs are usually much easier to alter/remove than other tunables in
my experience.
But yeah, it would have to go for the setting that works for all
supported hw (iow, conservative timeout).
>
> All that being said I would go with an additional parameter to the
> kdump cma setup - e.g. cma_sane_dma that would skip waiting and use 10s
> otherwise. That would make the optimized behavior opt in, we do not need
> to support all sorts of timeouts and also learn if this is not
> sufficient.
>
> Makes sense?
Just so I understand correctly, you mean extending the "crashkernel="
option with a boolean parameter? If set, e.g., wait 1s, otherwise magic
number 10?
--
Cheers,
David / dhildenb
Powered by blists - more mailing lists