lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Yr6xPWOReXNuDQqh@worktop.programming.kicks-ass.net>
Date:   Fri, 1 Jul 2022 10:33:01 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     Qais Yousef <qais.yousef@....com>
Cc:     Satya Durga Srinivasu Prabhala <quic_satyap@...cinc.com>,
        mingo@...hat.com, juri.lelli@...hat.com,
        vincent.guittot@...aro.org, dietmar.eggemann@....com,
        rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
        bristot@...hat.com, vschneid@...hat.com,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] sched: fix rq lock recursion issue

On Thu, Jun 30, 2022 at 10:53:10PM +0100, Qais Yousef wrote:
> Hi Satya
> 
> On 06/24/22 00:42, Satya Durga Srinivasu Prabhala wrote:
> > Below recursion is observed in a rare scenario where __schedule()
> > takes rq lock, at around same time task's affinity is being changed,
> > bpf function for tracing sched_switch calls migrate_enabled(),
> > checks for affinity change (cpus_ptr != cpus_mask) lands into
> > __set_cpus_allowed_ptr which tries acquire rq lock and causing the
> > recursion bug.
> > 
> > Fix the issue by switching to preempt_enable/disable() for non-RT
> > Kernels.
> 
> Interesting bug. Thanks for the report. Unfortunately I can't see this being
> a fix as it just limits the bug visibility to PREEMPT_RT kernels, but won't fix
> anything, no? ie: Kernels compiled with PREEMPT_RT will still hit this failure.

Worse, there's !RT stuff that grew to rely on the preemptible migrate
disable stuff, so this actively breaks things.

> I'm curious how the race with set affinity is happening. I would have thought
> user space would get blocked as __schedule() will hold the rq lock.
> 
> Do you have more details on that?

Yeah, I'm not seeing how this works either, in order for
migrate_enable() to actually call __set_cpus_allowed_ptr(), it needs to
have done migrate_disable() *before* schedule, schedule() will then have
to call migrate_disable_swich(), and *then* migrate_enable() does this.

However, if things are nicely balanced (as they should be), then
trace_call_bpf() using migrate_disable()/migrate_enable() should never
hit this path.

If, OTOH, migrate_disable() was called prior to schedule() and we did do
migrate_disable_switch(), then it should be impossible for the
tracepoint/bpf stuff to reach p->migration_disabled == 0.

> > -010 |spin_bug(lock = ???, msg = ???)
> > -011 |debug_spin_lock_before(inline)
> > -011 |do_raw_spin_lock(lock = 0xFFFFFF89323BB600)
> > -012 |_raw_spin_lock(inline)
> > -012 |raw_spin_rq_lock_nested(inline)
> > -012 |raw_spin_rq_lock(inline)
> > -012 |task_rq_lock(p = 0xFFFFFF88CFF1DA00, rf = 0xFFFFFFC03707BBE8)
> > -013 |__set_cpus_allowed_ptr(inline)
> > -013 |migrate_enable()
> > -014 |trace_call_bpf(call = ?, ctx = 0xFFFFFFFDEF954600)
> > -015 |perf_trace_run_bpf_submit(inline)
> > -015 |perf_trace_sched_switch(__data = 0xFFFFFFE82CF0BCB8, preempt = FALSE, prev = ?, next = ?)
> > -016 |__traceiter_sched_switch(inline)
> > -016 |trace_sched_switch(inline)
> > -016 |__schedule(sched_mode = ?)
> > -017 |schedule()
> > -018 |arch_local_save_flags(inline)
> > -018 |arch_irqs_disabled(inline)
> > -018 |__raw_spin_lock_irq(inline)
> > -018 |_raw_spin_lock_irq(inline)
> > -018 |worker_thread(__worker = 0xFFFFFF88CE251300)
> > -019 |kthread(_create = 0xFFFFFF88730A5A80)
> > -020 |ret_from_fork(asm)

This doesn't clarify much. Please explain how things got to be
unbalanced, don't ever just make a problem dissapear like this without
understanding what the root cause is, that'll just get your reputation
sullied.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ