Message-ID: <20231123151345.GB38813@noisy.programming.kicks-ass.net>
Date:   Thu, 23 Nov 2023 16:13:45 +0100
From:   Peter Zijlstra <peterz@...radead.org>
To:     David Woodhouse <dwmw2@...radead.org>
Cc:     Thomas Gleixner <tglx@...utronix.de>,
        Steven Rostedt <rostedt@...dmis.org>,
        linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: CONFIG_PROVE_RAW_LOCK_NESTING false positive?

On Thu, Nov 23, 2023 at 03:05:15PM +0000, David Woodhouse wrote:
> On 23 November 2023 15:01:19 GMT, Peter Zijlstra <peterz@...radead.org> wrote:
> >On Thu, Nov 23, 2023 at 09:00:41AM +0000, David Woodhouse wrote:
> >> Is this telling me that I'm no longer allowed to take a read_lock() in
> >> a callback for an HRTIMER_MODE_ABS_HARD timer? Is that intentional?
> >> 
> >> If I must, I can probably cope with this by using read_trylock()
> >> instead. The object being locked is a cache, and we opportunistically
> >> try to use it from the fast path but fall back to a slow path in
> >> process context which will revalidate and try again. So if someone
> >> *has* taken the write lock, it's a fairly safe bet that the cache is
> >> going to be invalidated and we were going to take the slow path anyway.
> >> 
> >> [   62.336965] =============================
> >> [   62.336973] [ BUG: Invalid wait context ]
> >> [   62.336992] 6.7.0-rc1+ #1437 Tainted: G          I       
> >> [   62.337001] -----------------------------
> >> [   62.337008] qemu-system-x86/1935 is trying to lock:
> >> [   62.337017] ffffc900018fecc0 (&gpc->lock){....}-{3:3}, at: kvm_xen_set_evtchn_fast+0xe7/0x460 [kvm]
> >> [   62.337133] other info that might help us debug this:
> >> [   62.337142] context-{2:2}
> >> [   62.337148] 2 locks held by qemu-system-x86/1935:
> >> [   62.337156]  #0: ffff888108f780b0 (&vcpu->mutex){+.+.}-{4:4}, at: kvm_vcpu_ioctl+0x7f/0x730 [kvm]
> >> [   62.337239]  #1: ffffc900018ff2d8 (&kvm->srcu){.?.+}-{0:0}, at: kvm_xen_set_evtchn_fast+0xcd/0x460 [kvm]
> >> [   62.337339] stack backtrace:
> >> [   62.337346] CPU: 7 PID: 1935 Comm: qemu-system-x86 Tainted: G          I        6.7.0-rc1+ #1437
> >> [   62.337370] Hardware name: Intel Corporation S2600CW/S2600CW, BIOS SE5C610.86B.01.01.0008.021120151325 02/11/2015
> >> [   62.337384] Call Trace:
> >> [   62.337390]  <IRQ>
> >> [   62.337395]  dump_stack_lvl+0x57/0x90
> >> [   62.337407]  __lock_acquire+0x7bb/0xbb0
> >> [   62.337416]  ? __lock_acquire+0x4f0/0xbb0
> >> [   62.337425]  lock_acquire.part.0+0xad/0x240
> >> [   62.337433]  ? kvm_xen_set_evtchn_fast+0xe7/0x460 [kvm]
> >> [   62.337512]  ? rcu_is_watching+0xd/0x40
> >> [   62.337520]  ? lock_acquire+0xf2/0x110
> >> [   62.337529]  __raw_read_lock_irqsave+0x4e/0xa0
> >> [   62.337538]  ? kvm_xen_set_evtchn_fast+0xe7/0x460 [kvm]
> >> [   62.337604]  kvm_xen_set_evtchn_fast+0xe7/0x460 [kvm]
> >> [   62.337669]  ? kvm_xen_set_evtchn_fast+0xcd/0x460 [kvm]
> >> [   62.337734]  xen_timer_callback+0x86/0xc0 [kvm]
> >
> >xen_timer_callback is HRTIMER_MODE_ABS_HARD, which means it will still
> >run in IRQ context for PREEMPT_RT.
> >
> >OTOH read_lock_irqsave() is not a raw spinlock and will be turned into a
> >blocking lock.
> >
> >This then gives scheduling from IRQ context, which is somewhat frowned
> >upon.
> >
> >Warning is real and valid.
> 
> 
> ... or at least will be when PREEMPT_RT turns the read_lock into a mutex? 

Right, this check specifically validates the RT lock nesting rules.
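As a rough illustration of that rule, here is a simplified userspace
model of the comparison behind the splat. It is a sketch, not lockdep's
actual code: the enum and helper names are hypothetical, and the numeric
levels are assumed to follow the wait-type ordering that the {3:3},
{4:4} and context-{2:2} annotations in the report suggest (2 = raw
spinning lock, 3 = lock that becomes sleeping on PREEMPT_RT, 4 = always
sleeping).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical wait-type levels, ordered like the numbers in the splat. */
enum wait_type {
    WAIT_INV = 0,   /* not checked, e.g. the {0:0} shown for &kvm->srcu */
    WAIT_FREE,      /* 1: wait-free */
    WAIT_SPIN,      /* 2: raw spinning lock; never sleeps, even on RT */
    WAIT_CONFIG,    /* 3: spinlock_t/rwlock_t; becomes sleeping on PREEMPT_RT */
    WAIT_SLEEP,     /* 4: mutex and similar; may always sleep */
};

/*
 * Hypothetical helper: may a lock with wait type lock_type be acquired
 * in a context that tolerates at most ctx_max?
 */
static bool wait_context_ok(enum wait_type ctx_max, enum wait_type lock_type)
{
    if (lock_type == WAIT_INV)
        return true;            /* unchecked lock classes always pass */
    return lock_type <= ctx_max;
}

/* The combinations that appear in the report above. */
static void check_splat_values(void)
{
    /* &gpc->lock is {3:3}, taken in context-{2:2}: rejected. */
    assert(!wait_context_ok(WAIT_SPIN, WAIT_CONFIG));
    /* A raw lock (level 2) would have been accepted in the same context. */
    assert(wait_context_ok(WAIT_SPIN, WAIT_SPIN));
    /* &vcpu->mutex is {4:4}, taken from process context: accepted. */
    assert(wait_context_ok(WAIT_SLEEP, WAIT_SLEEP));
}
```

In this model the hard-IRQ timer callback is a level-2 context, so any
lock above level 2 fails the check, whether or not the running kernel
is actually built with PREEMPT_RT.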

> But there is no raw version of read_lock(). Can we have one please?

Should be possible, but it is somewhat non-trivial: it is very easy to
create significant latencies with RW locks. Definitely not something I'm
going to be able to do in a hurry.

Also, I suspect Thomas is going to strongly suggest not going down that
road and looking at whether this can be solved differently.