linux-kernel - Re: [PATCH V4 0/9] rework on the IRQ hardening of virtio

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CACGkMEvfkUpsY4LRTuH7w18DZdq+w3=Ef6b-0sG0XvzVUVKdzg@mail.gmail.com>
Date:   Wed, 11 May 2022 10:22:59 +0800
From:   Jason Wang <jasowang@...hat.com>
To:     Cornelia Huck <cohuck@...hat.com>
Cc:     mst <mst@...hat.com>,
        virtualization <virtualization@...ts.linux-foundation.org>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Peter Zijlstra <peterz@...radead.org>,
        "Paul E. McKenney" <paulmck@...nel.org>,
        Marc Zyngier <maz@...nel.org>,
        Halil Pasic <pasic@...ux.ibm.com>,
        eperezma <eperezma@...hat.com>, Cindy Lu <lulu@...hat.com>,
        Stefano Garzarella <sgarzare@...hat.com>,
        Xuan Zhuo <xuanzhuo@...ux.alibaba.com>
Subject: Re: [PATCH V4 0/9] rework on the IRQ hardening of virtio

On Tue, May 10, 2022 at 5:29 PM Cornelia Huck <cohuck@...hat.com> wrote:
>
> On Sat, May 07 2022, Jason Wang <jasowang@...hat.com> wrote:
>
> > Hi All:
> >
> > This is a rework on the IRQ hardening for virtio which is done
> > previously by the following commits are reverted:
> >
> > 9e35276a5344 ("virtio_pci: harden MSI-X interrupts")
> > 080cd7c3ac87 ("virtio-pci: harden INTX interrupts")
> >
> > The reason is that it depends on the IRQF_NO_AUTOEN which may conflict
> > with the assumption of the affinity managed IRQ that is used by some
> > virtio drivers. And what's more, it is only done for virtio-pci but
> > not other transports.
> >
> > In this rework, I try to implement a general virtio solution which
> > borrows the idea of the INTX hardening by re-using per virtqueue
> > boolean vq->broken and toggle it in virtio_device_ready() and
> > virtio_reset_device(). Then we can simply reuse the existing checks in
> > the vring_interrupt() and return early if the driver is not ready.
> >
> > Note that, I only did compile test on ccw and MMIO transport.
>
> Lockdep is unhappy with the ccw parts:
>
> ================================
> WARNING: inconsistent lock state
> 5.18.0-rc6+ #191 Not tainted
> --------------------------------
> inconsistent {IN-HARDIRQ-R} -> {HARDIRQ-ON-W} usage.
> kworker/u4:0/9 [HC0[0]:SC0[0]:HE1:SE1] takes:
> 00000000058e9618 (&vcdev->irq_lock){+-..}-{2:2}, at: virtio_ccw_synchronize_cbs+0x4e/0x60
> {IN-HARDIRQ-R} state was registered at:
>   __lock_acquire+0x442/0xc20
>   lock_acquire.part.0+0xdc/0x228
>   lock_acquire+0xa6/0x1b0
>   _raw_read_lock_irqsave+0x72/0x100
>   virtio_ccw_int_handler+0x84/0x238
>   ccw_device_call_handler+0x72/0xd0
>   ccw_device_irq+0x7a/0x198
>   do_cio_interrupt+0x11c/0x1d0
>   __handle_irq_event_percpu+0xc2/0x318
>   handle_irq_event_percpu+0x26/0x68
>   handle_percpu_irq+0x64/0x88
>   generic_handle_irq+0x40/0x58
>   do_irq_async+0x56/0xb0
>   do_io_irq+0x82/0x160
>   io_int_handler+0xe6/0x120
>   rcu_read_lock_sched_held+0x3e/0xb0
>   lock_acquired+0x12e/0x208
>   new_inode+0x3e/0xd0
>   debugfs_get_inode+0x22/0x68
>   __debugfs_create_file+0x78/0x1c0
>   debugfs_create_file_unsafe+0x36/0x58
>   debugfs_create_u32+0x38/0x68
>   sched_init_debug+0xb0/0x1c0
>   do_one_initcall+0x108/0x280
>   do_initcalls+0x124/0x148
>   kernel_init_freeable+0x242/0x280
>   kernel_init+0x2e/0x158
>   __ret_from_fork+0x3c/0x50
>   ret_from_fork+0xa/0x40
> irq event stamp: 539789
> hardirqs last  enabled at (539789): [<0000000000d9c632>] _raw_spin_unlock_irqrestore+0x72/0x88
> hardirqs last disabled at (539788): [<0000000000d9c2b6>] _raw_spin_lock_irqsave+0x96/0xd0
> softirqs last  enabled at (539568): [<0000000000d9e0d4>] __do_softirq+0x434/0x588
> softirqs last disabled at (539503): [<000000000018cd66>] __irq_exit_rcu+0x146/0x170
>
> other info that might help us debug this:
>  Possible unsafe locking scenario:
>
>        CPU0
>        ----
>   lock(&vcdev->irq_lock);
>   <Interrupt>
>     lock(&vcdev->irq_lock);
>
>  *** DEADLOCK ***

It looks to me we need to use write_lock_irq()/write_unlock_irq() to
do the synchronization.

And we probably need to keep the
read_lock_irqsave()/read_lock_irqrestore() logic since I can see the
virtio_ccw_int_handler() to be called from process context (e.g from
the io_subchannel_quiesce()).

Thanks

>
> 2 locks held by kworker/u4:0/9:
>  #0: 000000000288d948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1ea/0x658
>  #1: 000003800004bdc8 ((work_completion)(&entry->work)){+.+.}-{0:0}, at: process_one_work+0x1ea/0x658
>
> stack backtrace:
> CPU: 1 PID: 9 Comm: kworker/u4:0 Not tainted 5.18.0-rc6+ #191
> Hardware name: QEMU 8561 QEMU (KVM/Linux)
> Workqueue: events_unbound async_run_entry_fn
> Call Trace:
>  [<0000000000d8af22>] dump_stack_lvl+0x92/0xd0
>  [<00000000002032ac>] mark_lock_irq+0x864/0x968
>  [<0000000000203670>] mark_lock.part.0+0x2c0/0x790
>  [<0000000000203cea>] mark_usage+0x10a/0x178
>  [<000000000020692a>] __lock_acquire+0x442/0xc20
>  [<0000000000207cc4>] lock_acquire.part.0+0xdc/0x228
>  [<0000000000207eb6>] lock_acquire+0xa6/0x1b0
>  [<0000000000d9c774>] _raw_write_lock+0x54/0xa8
>  [<0000000000d5a1f6>] virtio_ccw_synchronize_cbs+0x4e/0x60
>  [<00000000008eec04>] register_virtio_device+0xdc/0x1b0
>  [<0000000000d5aabe>] virtio_ccw_online+0x246/0x2e8
>  [<0000000000c9fecc>] ccw_device_set_online+0x1c4/0x540
>  [<0000000000d5a05e>] virtio_ccw_auto_online+0x26/0x50
>  [<00000000001ba2b0>] async_run_entry_fn+0x40/0x108
>  [<00000000001ab9b4>] process_one_work+0x2a4/0x658
>  [<00000000001abdd0>] worker_thread+0x68/0x440
>  [<00000000001b4668>] kthread+0x128/0x130
>  [<0000000000102fac>] __ret_from_fork+0x3c/0x50
>  [<0000000000d9d3aa>] ret_from_fork+0xa/0x40
> INFO: lockdep is turned off.
>