Message-ID: <875yeuqgl5.ffs@tglx>
Date:   Fri, 02 Dec 2022 10:51:02 +0100
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Steven Rostedt <rostedt@...dmis.org>,
        LKML <linux-kernel@...r.kernel.org>
Cc:     Pekka Paalanen <ppaalanen@...il.com>,
        Masami Hiramatsu <mhiramat@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>, x86@...nel.org
Subject: Re: [BUG] lockdep splat using mmio tracer

On Thu, Dec 01 2022 at 21:31, Steven Rostedt wrote:
> I hit this while testing ftrace on an x86 32 bit VM (I've just started
> converting my tests to run on a VM, which is find new found bugs).

Which is find new found grammar twists for the English language :)

> [ 1111.130669] ================================   
> [ 1111.130670] WARNING: inconsistent lock state   
> [ 1111.130672] 6.1.0-rc6-test-00020-gbc591e45c100-dirty #245 Not tainted
> [ 1111.130674] --------------------------------   
> [ 1111.130675] inconsistent {INITIAL USE} -> {IN-NMI} usage.
> [ 1111.130676] kworker/0:0/3433 [HC1[1]:SC0[0]:HE0:SE1] takes:
> [ 1111.130679] d3dc2b90 (kmmio_lock){....}-{2:2}, at: kmmio_die_notifier+0x70/0x140
> [ 1111.130690] {INITIAL USE} state was registered at:
> [ 1111.130691]   lock_acquire+0xa2/0x2b0
> [ 1111.130696]   _raw_spin_lock_irqsave+0x36/0x60 
> [ 1111.130701]   register_kmmio_probe+0x43/0x210  
> [ 1111.130704]   mmiotrace_ioremap+0x188/0x1b0
> [ 1111.130706]   __ioremap_caller.constprop.0+0x257/0x340
> [ 1111.130711]   ioremap_wc+0x12/0x20

That's regular task context, while the int3, which is raised by the
actual MMIO access, is considered to be NMI context. int3 has to be
considered an NMI type exception because int3 can be hit anywhere, even
in actual NMI context.

> [ 1111.130924]  lock_acquire.cold+0x31/0x37
> [ 1111.130927]  ? kmmio_die_notifier+0x70/0x140   
> [ 1111.130935]  ? get_ins_imm_val+0xf0/0xf0
> [ 1111.130938]  _raw_spin_lock+0x2a/0x40
> [ 1111.130942]  ? kmmio_die_notifier+0x70/0x140   
> [ 1111.130945]  kmmio_die_notifier+0x70/0x140
> [ 1111.130948]  ? arm_kmmio_fault_page+0xa0/0xa0  
> [ 1111.130951]  atomic_notifier_call_chain+0x75/0x120
> [ 1111.130955]  notify_die+0x44/0x90
> [ 1111.130959]  exc_debug+0xd0/0x2a0
> [ 1111.130965]  ? exc_int3+0x100/0x100
> [ 1111.130968]  handle_exception+0x133/0x133
> [ 1111.130970] EIP: qxl_draw_dirty_fb+0x2ae/0x440 [qxl]

So for the mmio tracer there is no way that this happens:

> [ 1111.130788]   lock(kmmio_lock);
> [ 1111.130789]   <Interrupt>
> [ 1111.130790]     lock(kmmio_lock);

but obviously lockdep cannot know that :)

The quick and dirty, but IMO safe, way out of this is to convert that
lock to an arch_spinlock and evade lockdep.
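
To illustrate the idea only, here is a userspace model of it, not the
actual kernel change. In the kernel the real primitives are
arch_spinlock_t, arch_spin_lock() and arch_spin_unlock(), which carry no
lockdep annotations and do not disable interrupts or preemption, so the
callers have to take care of that themselves. Everything prefixed with
model_ below is a made-up name for this sketch, and the C11 atomic_flag
is only a stand-in for the architecture lock implementation.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the kernel's arch_spinlock_t. */
typedef struct {
	atomic_flag locked;
} model_arch_spinlock_t;

#define MODEL_ARCH_SPIN_LOCK_UNLOCKED { ATOMIC_FLAG_INIT }

/* Plain test-and-set lock: no debug state, no lock class tracking. */
static inline void model_arch_spin_lock(model_arch_spinlock_t *l)
{
	while (atomic_flag_test_and_set_explicit(&l->locked,
						 memory_order_acquire))
		; /* spin until the holder clears the flag */
}

/* Returns true if the lock was free and is now held by the caller. */
static inline bool model_arch_spin_trylock(model_arch_spinlock_t *l)
{
	return !atomic_flag_test_and_set_explicit(&l->locked,
						  memory_order_acquire);
}

static inline void model_arch_spin_unlock(model_arch_spinlock_t *l)
{
	atomic_flag_clear_explicit(&l->locked, memory_order_release);
}

/* Stand-ins for kmmio_lock and the data it protects. */
static model_arch_spinlock_t model_kmmio_lock = MODEL_ARCH_SPIN_LOCK_UNLOCKED;
static int model_probe_count;

/* Models the task context path (register_kmmio_probe() in the splat). */
static int model_register_probe(void)
{
	int n;

	/* Assumption: the kernel caller would disable interrupts here. */
	model_arch_spin_lock(&model_kmmio_lock);
	n = ++model_probe_count;
	model_arch_spin_unlock(&model_kmmio_lock);
	return n;
}

/* Models the exception path (kmmio_die_notifier() in the splat). */
static int model_die_notifier(void)
{
	int n;

	model_arch_spin_lock(&model_kmmio_lock);
	n = model_probe_count;
	model_arch_spin_unlock(&model_kmmio_lock);
	return n;
}
```

Both paths take the same lock, but since the lock type has no debug
hooks there is nothing for a lock validator to track. That also means
the correctness argument has to be made by hand, as it is above for the
mmio tracer.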

> I never hit this before, but then again, mmio tracer is showing output on
> the VM which it did not do on the baremetal machine.

It's exactly the same problem on bare metal.

Thanks,

        tglx
