Message-ID: <20220308161455.036e9933@gandalf.local.home>
Date:   Tue, 8 Mar 2022 16:14:55 -0500
From:   Steven Rostedt <rostedt@...dmis.org>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     LKML <linux-kernel@...r.kernel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Subject: sched_core_balance() releasing interrupts with pi_lock held

Hi Peter,

A ChromeOS bug report showed a lockdep splat that I first thought was a bad
backport. But looking at upstream, I don't see how it would work there
either. The lockdep splat had:

[56064.673346] Call Trace:
[56064.676066]  dump_stack+0xb9/0x117
[56064.679861]  ? print_usage_bug+0x2af/0x2c2
[56064.684434]  mark_lock_irq+0x25e/0x27d
[56064.688618]  mark_lock+0x11a/0x16c
[56064.692412]  mark_held_locks+0x57/0x87
[56064.696595]  ? _raw_spin_unlock_irq+0x2c/0x40
[56064.701460]  lockdep_hardirqs_on+0xb1/0x19d
[56064.706130]  _raw_spin_unlock_irq+0x2c/0x40
[56064.710799]  sched_core_balance+0x8a/0x4af
[56064.715369]  ? __balance_callback+0x1f/0x9a
[56064.720030]  __balance_callback+0x4f/0x9a
[56064.724506]  rt_mutex_setprio+0x43a/0x48b
[56064.728982]  task_blocks_on_rt_mutex+0x14d/0x1d5

Where I see:

task_blocks_on_rt_mutex() {
  spin_lock(pi_lock);              <- interrupts are already disabled here
  rt_mutex_setprio() {
    balance_callback() {
      sched_core_balance() {
        spin_unlock_irq(rq);

Where spin_unlock_irq() enables interrupts while holding the pi_lock, and
BOOM, lockdep (rightfully) complains.

The call chain above is from my reading of mainline, not the kernel that
blew up. So I'm guessing this is a bug in mainline as well.
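
To make the lockdep rule concrete, here is a minimal sketch of the pattern
it is flagging: re-enabling hardirqs while holding a lock that was taken
irq-safe. This is a made-up demo module, not the scheduler code, and
demo_pi_lock and the function names are hypothetical stand-ins:

#include <linux/module.h>
#include <linux/spinlock.h>
#include <linux/irqflags.h>

static DEFINE_SPINLOCK(demo_pi_lock);	/* stands in for task->pi_lock */

static int __init demo_init(void)
{
	unsigned long flags;

	/* Take an irq-safe lock, as task_blocks_on_rt_mutex() does pi_lock. */
	spin_lock_irqsave(&demo_pi_lock, flags);

	/*
	 * Re-enable interrupts with the lock still held, mirroring
	 * spin_unlock_irq(rq) in sched_core_balance(). With
	 * CONFIG_PROVE_LOCKING, lockdep_hardirqs_on() walks the held
	 * locks via mark_held_locks() and marks them ENABLED_HARDIRQ;
	 * for a hardirq-safe lock like pi_lock, that conflicting usage
	 * is what produces the print_usage_bug() in the splat above.
	 */
	local_irq_enable();

	local_irq_disable();
	spin_unlock_irqrestore(&demo_pi_lock, flags);
	return 0;
}
module_init(demo_init);

MODULE_LICENSE("GPL");

(Whether this toy itself splats depends on the demo lock also having seen
hardirq usage; pi_lock has, which is why the real chain trips it.)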

-- Steve
