linux-kernel - Re: [PATCH] kexec: change locking mechanism to a mutex


Message-ID: <0ec5f56e-6b55-627a-39c0-ff0a1680794d@oracle.com>
Date:   Fri, 22 Sep 2023 12:35:13 -0500
From:   Eric DeVolder <eric.devolder@...cle.com>
To:     Valentin Schneider <vschneid@...hat.com>,
        linux-kernel@...r.kernel.org, bhe@...hat.com, vgoyal@...hat.com,
        dyoung@...hat.com, ebiederm@...ssion.com, kexec@...ts.infradead.org
Cc:     sourabhjain@...ux.ibm.com, konrad.wilk@...cle.com,
        boris.ostrovsky@...cle.com,
        Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH] kexec: change locking mechanism to a mutex



On 9/22/23 11:28, Valentin Schneider wrote:
> On 21/09/23 17:59, Eric DeVolder wrote:
>> The design decision to use the atomic lock is described in the comment
>> from kexec_internal.h, cited above. However, examining the code of
>> __crash_kexec():
>>
>>          if (kexec_trylock()) {
>>                  if (kexec_crash_image) {
>>                          ...
>>                  }
>>                  kexec_unlock();
>>          }
>>
>> reveals that the use of kexec_trylock() here is only a "best effort"
>> attempt, because the atomic lock cannot be waited on.  Prior to crash
>> hotplug, acquiring this lock would almost always succeed (another kexec
>> syscall could hold the lock and prevent this, but that is about it).
>>
>> So at the point where the capture kernel would be invoked, if the lock
>> is not obtained, then kdump doesn't occur.
>>
>> It is possible to instead use a mutex with proper waiting, and utilize
>> mutex_trylock() as the "best effort" in __crash_kexec(). The use of a
>> mutex then avoids all the lock acquisition problems that were revealed
>> by the crash hotplug activity.
>>
> 
> @Dave thanks for the Cc, I'd have missed this otherwise.
> 
> 
> Prior to the atomic thingie, we actually had a mutex and did
> mutex_trylock() in __crash_kexec(). I'm a bit confused as this looks like a
> revert of
>    05c6257433b7 ("panic, kexec: make __crash_kexec() NMI safe")
> with just the helpers kept in - this doesn't seem to address any of the
> original issues regarding NMIs?
> 
> Sebastian raised some good points in [1] regarding these issues.
> The main hurdle pointed out there is, if we end up in the slowpath during
> the unlock, then we can end up acquiring the ->wait_lock which isn't NMI
> safe.
> 
> This is even worse on PREEMPT_RT, as both trylock and the unlock can end up
> acquiring the ->wait_lock.
> 
> [1]: https://lore.kernel.org/all/YqyZ%2FUf14qkYtMDX@linutronix.de/
> 
Having reviewed the references, it would seem that Baoquan's approach of a new
lock to handle the hotplug activity is the way to go?
Eric
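
For readers following the thread, the "best effort" try-lock pattern discussed
above can be sketched in userspace C11. This is only an illustration under
stated assumptions, not the kernel code: the demo_* names are hypothetical, and
C11 <stdatomic.h> stands in for the kernel's atomic helpers. It shows why the
atomic variant suits NMI context: both acquire and release are single atomic
operations with no waiter list, so there is no slowpath that could take a
->wait_lock.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the atomic kexec lock: 0 = free, 1 = held. */
static atomic_int demo_kexec_lock;

/* Try exactly once; never blocks, spins, or sleeps. */
static bool demo_kexec_trylock(void)
{
	int expected = 0;

	return atomic_compare_exchange_strong_explicit(&demo_kexec_lock,
						       &expected, 1,
						       memory_order_acquire,
						       memory_order_relaxed);
}

/* Release is a plain atomic store: no waiters to wake, hence no slowpath. */
static void demo_kexec_unlock(void)
{
	atomic_store_explicit(&demo_kexec_lock, 0, memory_order_release);
}

/*
 * Best-effort crash path, mirroring the shape of the snippet quoted above.
 * Returns true if the (simulated) capture-kernel step ran.
 */
static bool demo_crash_kexec(bool image_loaded)
{
	bool ran = false;

	if (demo_kexec_trylock()) {
		if (image_loaded)
			ran = true; /* real code would boot the capture kernel here */
		demo_kexec_unlock();
	}
	return ran;
}
```

If another path already holds the lock when demo_crash_kexec() is called, the
try-lock fails and the crash step is skipped, which is the "kdump doesn't
occur" case described in the patch text. A sleeping mutex would let ordinary
callers wait for the lock, but its unlock slowpath is what raises the NMI
concern above.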