linux-kernel - Re: [PATCH v2 0/5] kdump: crashkernel reservation from CMA

Message-ID: <e0f7fc1e-2227-4c6b-985a-34a697a52679@redhat.com>
Date: Fri, 30 May 2025 10:39:39 +0200
From: David Hildenbrand <david@...hat.com>
To: Michal Hocko <mhocko@...e.com>
Cc: Baoquan He <bhe@...hat.com>, Donald Dutile <ddutile@...hat.com>,
 Jiri Bohac <jbohac@...e.cz>, Vivek Goyal <vgoyal@...hat.com>,
 Dave Young <dyoung@...hat.com>, kexec@...ts.infradead.org,
 Philipp Rudo <prudo@...hat.com>, Pingfan Liu <piliu@...hat.com>,
 Tao Liu <ltao@...hat.com>, linux-kernel@...r.kernel.org,
 David Hildenbrand <dhildenb@...hat.com>
Subject: Re: [PATCH v2 0/5] kdump: crashkernel reservation from CMA

On 30.05.25 10:28, Michal Hocko wrote:
> On Fri 30-05-25 10:06:52, David Hildenbrand wrote:
>> On 29.05.25 09:46, Michal Hocko wrote:
>>> On Wed 28-05-25 23:01:04, David Hildenbrand wrote:
>>> [...]
>>>> I think we just have to be careful to document it properly -- especially the
>>>> shortcomings and that this feature might become a problem in the future.
>>>> Movable user-space page tables getting placed on CMA memory would probably
>>>> not be a problem if we don't care about ... user-space data either way.
>>>
>>> I think makedumpfile could refuse to capture a dump if userspace memory
>>> is requested to enforce this.
>>
>> Yeah, it will be tricky once we support placing other memory on CMA regions.
>> E.g., there was the discussion of making some slab allocations movable.
>>
>> But probably, in such a configuration, we would later simply refuse to
>> activate CMA kdump.
> 
> Or we can make the kdump CMA region more special and only allow
> GFP_HIGHUSER_MOVABLE allocations from that. Anyway, I think we should
> deal with this once we get there.

Might be doable. When migrating (e.g., compacting) pages we'd have to 
make sure to also not migrate these pages into the CMA regions. Might be 
a bit more tricky, but likely solvable.

>
>>>> The whole "Direct I/O takes max 1s" part is a bit shaky. Maybe it could be
>>>> configurable how long to wait? 10s is certainly "safer".
>>>
>>> Quite honestly we will never know, and rather than making this
>>> configurable I would go with something reasonably large. A couple of
>>> seconds certainly does not matter for the kdump situations, but I would
>>> go as far as minutes.
>>
>> I recall that somebody raised that kdump downtime might be problematic
>> (might affect service downtime?).
>>
>> So I would just add a kconfig option with a default of 10s.
> 
> kconfig option usually doesn't really work for distro kernels. I am
> personally not really keen on having a tuning knob because there is a
> risk of cargo cult based tuning we have seen in other areas. That might
> make it hard to remove the knob later on. Fundamentally we should have 2
> situations though. Either we know that the HW is sane and then we
> shouldn't really need any sleep or the HW might misbehave and then we
> need to wait _some_ time. If our initial guess is incorrect then we can
> always increase it and we would learn about that through bug reports.

kconfigs are usually much easier to alter/remove than other tunables in 
my experience.

But yeah, the default would have to be a setting that works for all 
supported hw (iow, a conservative timeout).

> 
> All that being said I would go with an additional parameter to the
> kdump cma setup - e.g. cma_sane_dma that would skip waiting and use 10s
> otherwise. That would make the optimized behavior opt in, we do not need
> to support all sorts of timeouts and also learn if this is not
> sufficient.
> 
> Makes sense?

Just so I understand correctly, you mean extending the "crashkernel=" 
option with a boolean parameter? If set, wait e.g. 1s, otherwise use the 
magic number of 10s?
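
To make the question concrete, here is a minimal userspace sketch of the
semantics as I read Michal's variant: flag present means skip the wait,
flag absent means wait 10s. The token name cma_sane_dma is the one floated
above; the helper names, the comma-separated option format and the constant
are hypothetical and not taken from the patch set.

```c
#include <stdbool.h>
#include <string.h>

/* Conservative default discussed in the thread; hypothetical constant name. */
#define CMA_DMA_WAIT_DEFAULT_SECS 10U

/*
 * Return true if the comma-separated option string contains `token` as a
 * complete element (no prefix or substring matches).
 */
static bool opts_have_token(const char *opts, const char *token)
{
	size_t tlen = strlen(token);
	const char *p = opts;

	while (*p) {
		const char *end = strchr(p, ',');
		size_t len = end ? (size_t)(end - p) : strlen(p);

		if (len == tlen && strncmp(p, token, tlen) == 0)
			return true;
		if (!end)
			break;
		p = end + 1;
	}
	return false;
}

/*
 * Seconds to wait for in-flight DMA before touching the CMA region.
 * Opt-in "cma_sane_dma" (hypothetical flag) skips the wait entirely;
 * everything else gets the conservative default.
 */
static unsigned int cma_dma_wait_secs(const char *crashkernel_opts)
{
	if (opts_have_token(crashkernel_opts, "cma_sane_dma"))
		return 0;
	return CMA_DMA_WAIT_DEFAULT_SECS;
}
```

With this shape there is a single opt-in boolean and a single default, so
no arbitrary timeout values need to be supported. If the "set" case should
wait 1s instead of skipping, only the early return value changes.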

-- 
Cheers,

David / dhildenb