linux-kernel - Re: CONFIG_PROVE_RAW_LOCK

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <6E96F296-665D-4522-9531-914B7B340A89@infradead.org>
Date:   Thu, 23 Nov 2023 15:15:20 +0000
From:   David Woodhouse <dwmw2@...radead.org>
To:     Peter Zijlstra <peterz@...radead.org>
CC:     Thomas Gleixner <tglx@...utronix.de>,
        Steven Rostedt <rostedt@...dmis.org>,
        linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: CONFIG_PROVE_RAW_LOCK_NESTING false positive?

On 23 November 2023 15:13:45 GMT, Peter Zijlstra <peterz@...radead.org> wrote:
>On Thu, Nov 23, 2023 at 03:05:15PM +0000, David Woodhouse wrote:
>> On 23 November 2023 15:01:19 GMT, Peter Zijlstra <peterz@...radead.org> wrote:
>> >On Thu, Nov 23, 2023 at 09:00:41AM +0000, David Woodhouse wrote:
>> >> Is this telling me that I'm no longer allowed to take a read_lock() in
>> >> a callback for an HRTIMER_MODE_ABS_HARD timer? Is that intentional?
>> >> 
>> >> If I must, I can probably cope with this by using read_trylock()
>> >> instead. The object being locked is a cache, and we opportunistically
>> >> try to use it from the fast path but fall back to a slow path in
>> >> process context which will revalidate and try again. So if someone
>> >> *has* taken the write lock, it's a fairly safe bet that the cache is
>> >> going to be invalidated and we were going to take the slow path anyway.
>> >> 
>> >> [   62.336965] =============================
>> >> [   62.336973] [ BUG: Invalid wait context ]
>> >> [   62.336992] 6.7.0-rc1+ #1437 Tainted: G          I       
>> >> [   62.337001] -----------------------------
>> >> [   62.337008] qemu-system-x86/1935 is trying to lock:
>> >> [   62.337017] ffffc900018fecc0 (&gpc->lock){....}-{3:3}, at: kvm_xen_set_evtchn_fast+0xe7/0x460 [kvm]
>> >> [   62.337133] other info that might help us debug this:
>> >> [   62.337142] context-{2:2}
>> >> [   62.337148] 2 locks held by qemu-system-x86/1935:
>> >> [   62.337156]  #0: ffff888108f780b0 (&vcpu->mutex){+.+.}-{4:4}, at: kvm_vcpu_ioctl+0x7f/0x730 [kvm]
>> >> [   62.337239]  #1: ffffc900018ff2d8 (&kvm->srcu){.?.+}-{0:0}, at: kvm_xen_set_evtchn_fast+0xcd/0x460 [kvm]
>> >> [   62.337339] stack backtrace:
>> >> [   62.337346] CPU: 7 PID: 1935 Comm: qemu-system-x86 Tainted: G          I        6.7.0-rc1+ #1437
>> >> [   62.337370] Hardware name: Intel Corporation S2600CW/S2600CW, BIOS SE5C610.86B.01.01.0008.021120151325 02/11/2015
>> >> [   62.337384] Call Trace:
>> >> [   62.337390]  <IRQ>
>> >> [   62.337395]  dump_stack_lvl+0x57/0x90
>> >> [   62.337407]  __lock_acquire+0x7bb/0xbb0
>> >> [   62.337416]  ? __lock_acquire+0x4f0/0xbb0
>> >> [   62.337425]  lock_acquire.part.0+0xad/0x240
>> >> [   62.337433]  ? kvm_xen_set_evtchn_fast+0xe7/0x460 [kvm]
>> >> [   62.337512]  ? rcu_is_watching+0xd/0x40
>> >> [   62.337520]  ? lock_acquire+0xf2/0x110
>> >> [   62.337529]  __raw_read_lock_irqsave+0x4e/0xa0
>> >> [   62.337538]  ? kvm_xen_set_evtchn_fast+0xe7/0x460 [kvm]
>> >> [   62.337604]  kvm_xen_set_evtchn_fast+0xe7/0x460 [kvm]
>> >> [   62.337669]  ? kvm_xen_set_evtchn_fast+0xcd/0x460 [kvm]
>> >> [   62.337734]  xen_timer_callback+0x86/0xc0 [kvm]
>> >
>> >xen_timer_callback is HRTIMER_MODE_ABS_HARD, which means it will still
>> >run in IRQ context for PREEMPT_RT.
>> >
>> >OTOH read_lock_irqsave() is not a raw spinlock and will be turned into a
>> >blocking lock.
>> >
>> >This then gives scheduling from IRQ context, which is somewhat frowned
>> >upon.
>> >
>> >Warning is real and valid.
>> 
>> 
>> ... or at least will be when PREEMPT_RT turns the read_lock into a mutex? 
>
>Right, this check specifically validates the RT lock nesting rules.
>
>> But there is no raw version of read_lock(). Can we have one please?
>
>Should be possible, but is somewhat non-trivial, it is very easy to
>create significant latencies with RW locks. Definitely not something I'm
>going to be able to do in a hurry.
>
>Also, I suspect Thomas is going to strongly suggest not going down that
>road and looking if this can be solved differently.

That's the "If I must…" paragraph above. I'll hack it up. Thanks.