linux-kernel - Re: [PATCH] sched: fix rq lock recursion issue

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <24c63730-2d6a-de14-57ca-919870b64323@arm.com>
Date:   Mon, 4 Jul 2022 10:32:59 +0200
From:   Dietmar Eggemann <dietmar.eggemann@....com>
To:     Qais Yousef <qais.yousef@....com>,
        Peter Zijlstra <peterz@...radead.org>
Cc:     Satya Durga Srinivasu Prabhala <quic_satyap@...cinc.com>,
        mingo@...hat.com, juri.lelli@...hat.com,
        vincent.guittot@...aro.org, rostedt@...dmis.org,
        bsegall@...gle.com, mgorman@...e.de, bristot@...hat.com,
        vschneid@...hat.com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] sched: fix rq lock recursion issue

On 01/07/2022 13:48, Qais Yousef wrote:
> On 07/01/22 10:33, Peter Zijlstra wrote:
>> On Thu, Jun 30, 2022 at 10:53:10PM +0100, Qais Yousef wrote:
>>> Hi Satya
>>>
>>> On 06/24/22 00:42, Satya Durga Srinivasu Prabhala wrote:
>>>> Below recursion is observed in a rare scenario where __schedule()
>>>> takes rq lock, at around same time task's affinity is being changed,
>>>> bpf function for tracing sched_switch calls migrate_enabled(),
>>>> checks for affinity change (cpus_ptr != cpus_mask) lands into
>>>> __set_cpus_allowed_ptr which tries acquire rq lock and causing the
>>>> recursion bug.
>>>>
>>>> Fix the issue by switching to preempt_enable/disable() for non-RT
>>>> Kernels.
>>>
>>> Interesting bug. Thanks for the report. Unfortunately I can't see this being
>>> a fix as it just limits the bug visibility to PREEMPT_RT kernels, but won't fix
>>> anything, no? ie: Kernels compiled with PREEMPT_RT will still hit this failure.
>>
>> Worse, there's !RT stuff that grew to rely on the preemptible migrate
>> disable stuff, so this actively breaks things.
>>
>>> I'm curious how the race with set affinity is happening. I would have thought
>>> user space would get blocked as __schedule() will hold the rq lock.
>>>
>>> Do you have more details on that?
>>
>> Yeah, I'm not seeing how this works either, in order for
>> migrate_enable() to actually call __set_cpus_allowed_ptr(), it needs to
>> have done migrate_disable() *before* schedule, schedule() will then have
>> to call migrate_disable_swich(), and *then* migrate_enable() does this.
>>
>> However, if things are nicely balanced (as they should be), then
>> trace_call_bpf() using migrate_disable()/migrate_enable() should never
>> hit this path.
>>
>> If, OTOH, migrate_disable() was called prior to schedule() and we did do
>> migrate_disable_switch(), then it should be impossible for the
>> tracepoint/bpf stuff to reach p->migration_disabled == 0.
> 
> I think it's worth to confirm which kernel Satya is on too. If it's GKI, then
> worth checking first this is actually reproducible on/applicable to mainline.

Satya, do you still have these lines from your spin_dump() output showing
current, the kernel version and the hardware? Or a way to recreate this?
I couldn't provoke it so far.

...
[  212.196452] BUG: spinlock recursion on CPU#4, bpftrace/1662
                                                 ^^^^^^^^^^^^^       
[  212.196473]  lock: 0xffff00097ef7f500, .magic: dead4ead, .owner: bpftrace/1662, .owner_cpu: 4
[  212.196500] CPU: 4 PID: 1662 Comm: bpftrace Not tainted 5.19.0-rc2-00018-gb7ce5b6b4622-dirty #96
                                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^                                             
[  212.196513] Hardware name: ARM Juno development board (r0) (DT)
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^
[  212.196520] Call trace:
...