[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20220701114846.42o2tkm5fqt325df@wubuntu>
Date: Fri, 1 Jul 2022 12:48:46 +0100
From: Qais Yousef <qais.yousef@....com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Satya Durga Srinivasu Prabhala <quic_satyap@...cinc.com>,
mingo@...hat.com, juri.lelli@...hat.com,
vincent.guittot@...aro.org, dietmar.eggemann@....com,
rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
bristot@...hat.com, vschneid@...hat.com,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] sched: fix rq lock recursion issue
On 07/01/22 10:33, Peter Zijlstra wrote:
> On Thu, Jun 30, 2022 at 10:53:10PM +0100, Qais Yousef wrote:
> > Hi Satya
> >
> > On 06/24/22 00:42, Satya Durga Srinivasu Prabhala wrote:
> > > Below recursion is observed in a rare scenario where __schedule()
> > > takes rq lock, at around same time task's affinity is being changed,
> > > bpf function for tracing sched_switch calls migrate_enabled(),
> > > checks for affinity change (cpus_ptr != cpus_mask) lands into
> > > __set_cpus_allowed_ptr which tries acquire rq lock and causing the
> > > recursion bug.
> > >
> > > Fix the issue by switching to preempt_enable/disable() for non-RT
> > > Kernels.
> >
> > Interesting bug. Thanks for the report. Unfortunately I can't see this being
> > a fix as it just limits the bug visibility to PREEMPT_RT kernels, but won't fix
> > anything, no? ie: Kernels compiled with PREEMPT_RT will still hit this failure.
>
> Worse, there's !RT stuff that grew to rely on the preemptible migrate
> disable stuff, so this actively breaks things.
>
> > I'm curious how the race with set affinity is happening. I would have thought
> > user space would get blocked as __schedule() will hold the rq lock.
> >
> > Do you have more details on that?
>
> Yeah, I'm not seeing how this works either, in order for
> migrate_enable() to actually call __set_cpus_allowed_ptr(), it needs to
> have done migrate_disable() *before* schedule, schedule() will then have
> to call migrate_disable_swich(), and *then* migrate_enable() does this.
>
> However, if things are nicely balanced (as they should be), then
> trace_call_bpf() using migrate_disable()/migrate_enable() should never
> hit this path.
>
> If, OTOH, migrate_disable() was called prior to schedule() and we did do
> migrate_disable_switch(), then it should be impossible for the
> tracepoint/bpf stuff to reach p->migration_disabled == 0.
I think it's worth to confirm which kernel Satya is on too. If it's GKI, then
worth checking first this is actually reproducible on/applicable to mainline.
Cheers
--
Qais Yousef
Powered by blists - more mailing lists