[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <f7046303-7d7d-e39f-3c71-3688126cc812@redhat.com>
Date: Fri, 25 Mar 2022 11:04:08 +0800
From: Jason Wang <jasowang@...hat.com>
To: "Michael S. Tsirkin" <mst@...hat.com>
Cc: virtualization@...ts.linux-foundation.org,
linux-kernel@...r.kernel.org, maz@...nel.org, tglx@...utronix.de,
peterz@...radead.org, sgarzare@...hat.com, keirf@...gle.com
Subject: Re: [PATCH 3/3] virtio: harden vring IRQ
在 2022/3/24 下午7:03, Michael S. Tsirkin 写道:
> On Thu, Mar 24, 2022 at 04:40:04PM +0800, Jason Wang wrote:
>> This is a rework on the previous IRQ hardening that is done for
>> virtio-pci where several drawbacks were found and were reverted:
>>
>> 1) try to use IRQF_NO_AUTOEN which is not friendly to affinity managed IRQ
>> that is used by some device such as virtio-blk
>> 2) done only for PCI transport
>>
>> In this patch, we tries to borrow the idea from the INTX IRQ hardening
>> in the reverted commit 080cd7c3ac87 ("virtio-pci: harden INTX interrupts")
>> by introducing a global irq_soft_enabled variable for each
>> virtio_device. Then we can to toggle it during
>> virtio_reset_device()/virtio_device_ready(). A synchornize_rcu() is
>> used in virtio_reset_device() to synchronize with the IRQ handlers. In
>> the future, we may provide config_ops for the transport that doesn't
>> use IRQ. With this, vring_interrupt() can return check and early if
>> irq_soft_enabled is false. This lead to smp_load_acquire() to be used
>> but the cost should be acceptable.
> Maybe it should be but is it? Can't we use synchronize_irq instead?
Even if we allow the transport driver to synchornize through
synchronize_irq() we still need a check in the vring_interrupt().
We do something like the following previously:
if (!READ_ONCE(vp_dev->intx_soft_enabled))
return IRQ_NONE;
But it looks like a bug since speculative read can be done before the
check where the interrupt handler can't see the uncommitted setup which
is done by the driver.
>
>> To avoid breaking legacy device which can send IRQ before DRIVER_OK, a
>> module parameter is introduced to enable the hardening so function
>> hardening is disabled by default.
> Which devices are these? How come they send an interrupt before there
> are any buffers in any queues?
I copied this from the commit log for 22b7050a024d7
"
This change will also benefit old hypervisors (before 2009)
that send interrupts without checking DRIVER_OK: previously,
the callback could race with driver-specific initialization.
"
If this is only for config interrupt, I can remove the above log.
>
>> Note that the hardening is only done for vring interrupt since the
>> config interrupt hardening is already done in commit 22b7050a024d7
>> ("virtio: defer config changed notifications"). But the method that is
>> used by config interrupt can't be reused by the vring interrupt
>> handler because it uses spinlock to do the synchronization which is
>> expensive.
>>
>> Signed-off-by: Jason Wang <jasowang@...hat.com>
>
>> ---
>> drivers/virtio/virtio.c | 19 +++++++++++++++++++
>> drivers/virtio/virtio_ring.c | 9 ++++++++-
>> include/linux/virtio.h | 4 ++++
>> include/linux/virtio_config.h | 25 +++++++++++++++++++++++++
>> 4 files changed, 56 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
>> index 8dde44ea044a..85e331efa9cc 100644
>> --- a/drivers/virtio/virtio.c
>> +++ b/drivers/virtio/virtio.c
>> @@ -7,6 +7,12 @@
>> #include <linux/of.h>
>> #include <uapi/linux/virtio_ids.h>
>>
>> +static bool irq_hardening = false;
>> +
>> +module_param(irq_hardening, bool, 0444);
>> +MODULE_PARM_DESC(irq_hardening,
>> + "Disalbe IRQ software processing when it is not expected");
>> +
>> /* Unique numbering for virtio devices. */
>> static DEFINE_IDA(virtio_index_ida);
>>
>> @@ -220,6 +226,15 @@ static int virtio_features_ok(struct virtio_device *dev)
>> * */
>> void virtio_reset_device(struct virtio_device *dev)
>> {
>> + /*
>> + * The below synchronize_rcu() guarantees that any
>> + * interrupt for this line arriving after
>> + * synchronize_rcu() has completed is guaranteed to see
>> + * irq_soft_enabled == false.
> News to me I did not know synchronize_rcu has anything to do
> with interrupts. Did not you intend to use synchronize_irq?
> I am not even 100% sure synchronize_rcu is by design a memory barrier
> though it's most likely is ...
According to the comment above tree RCU version of synchronize_rcu():
"""
* RCU read-side critical sections are delimited by rcu_read_lock()
* and rcu_read_unlock(), and may be nested. In addition, but only in
* v5.0 and later, regions of code across which interrupts, preemption,
* or softirqs have been disabled also serve as RCU read-side critical
* sections. This includes hardware interrupt handlers, softirq handlers,
* and NMI handlers.
"""
So interrupt handlers are treated as read-side critical sections.
And it has the comment for explain the barrier:
"""
* Note that this guarantee implies further memory-ordering guarantees.
* On systems with more than one CPU, when synchronize_rcu() returns,
* each CPU is guaranteed to have executed a full memory barrier since
* the end of its last RCU read-side critical section whose beginning
* preceded the call to synchronize_rcu(). In addition, each CPU having
"""
So on SMP it provides a full barrier. And for UP/tiny RCU we don't need
the barrier, if the interrupt come after WRITE_ONCE() it will see the
irq_soft_enabled as false.
>
>> + */
>> + WRITE_ONCE(dev->irq_soft_enabled, false);
>> + synchronize_rcu();
>> +
>> dev->config->reset(dev);
>> }
>> EXPORT_SYMBOL_GPL(virtio_reset_device);
> Please add comment explaining where it will be enabled.
> Also, we *really* don't need to synch if it was already disabled,
> let's not add useless overhead to the boot sequence.
Ok.
>
>
>> @@ -427,6 +442,10 @@ int register_virtio_device(struct virtio_device *dev)
>> spin_lock_init(&dev->config_lock);
>> dev->config_enabled = false;
>> dev->config_change_pending = false;
>> + dev->irq_soft_check = irq_hardening;
>> +
>> + if (dev->irq_soft_check)
>> + dev_info(&dev->dev, "IRQ hardening is enabled\n");
>>
>> /* We always start by resetting the device, in case a previous
>> * driver messed it up. This also tests that code path a little. */
> one of the points of hardening is it's also helpful for buggy
> devices. this flag defeats the purpose.
Do you mean:
1) we need something like config_enable? This seems not easy to be
implemented without obvious overhead, mainly the synchronize with the
interrupt handlers
2) enable this by default, so I don't object, but this may have some
risk for old hypervisors
>
>> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
>> index 962f1477b1fa..0170f8c784d8 100644
>> --- a/drivers/virtio/virtio_ring.c
>> +++ b/drivers/virtio/virtio_ring.c
>> @@ -2144,10 +2144,17 @@ static inline bool more_used(const struct vring_virtqueue *vq)
>> return vq->packed_ring ? more_used_packed(vq) : more_used_split(vq);
>> }
>>
>> -irqreturn_t vring_interrupt(int irq, void *_vq)
>> +irqreturn_t vring_interrupt(int irq, void *v)
>> {
>> + struct virtqueue *_vq = v;
>> + struct virtio_device *vdev = _vq->vdev;
>> struct vring_virtqueue *vq = to_vvq(_vq);
>>
>> + if (!virtio_irq_soft_enabled(vdev)) {
>> + dev_warn_once(&vdev->dev, "virtio vring IRQ raised before DRIVER_OK");
>> + return IRQ_NONE;
>> + }
>> +
>> if (!more_used(vq)) {
>> pr_debug("virtqueue interrupt with no work for %p\n", vq);
>> return IRQ_NONE;
>> diff --git a/include/linux/virtio.h b/include/linux/virtio.h
>> index 5464f398912a..957d6ad604ac 100644
>> --- a/include/linux/virtio.h
>> +++ b/include/linux/virtio.h
>> @@ -95,6 +95,8 @@ dma_addr_t virtqueue_get_used_addr(struct virtqueue *vq);
>> * @failed: saved value for VIRTIO_CONFIG_S_FAILED bit (for restore)
>> * @config_enabled: configuration change reporting enabled
>> * @config_change_pending: configuration change reported while disabled
>> + * @irq_soft_check: whether or not to check @irq_soft_enabled
>> + * @irq_soft_enabled: callbacks enabled
>> * @config_lock: protects configuration change reporting
>> * @dev: underlying device.
>> * @id: the device type identification (used to match it with a driver).
>> @@ -109,6 +111,8 @@ struct virtio_device {
>> bool failed;
>> bool config_enabled;
>> bool config_change_pending;
>> + bool irq_soft_check;
>> + bool irq_soft_enabled;
>> spinlock_t config_lock;
>> spinlock_t vqs_list_lock; /* Protects VQs list access */
>> struct device dev;
>> diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
>> index dafdc7f48c01..9c1b61f2e525 100644
>> --- a/include/linux/virtio_config.h
>> +++ b/include/linux/virtio_config.h
>> @@ -174,6 +174,24 @@ static inline bool virtio_has_feature(const struct virtio_device *vdev,
>> return __virtio_test_bit(vdev, fbit);
>> }
>>
>> +/*
>> + * virtio_irq_soft_enabled: whether we can execute callbacks
>> + * @vdev: the device
>> + */
>> +static inline bool virtio_irq_soft_enabled(const struct virtio_device *vdev)
>> +{
>> + if (!vdev->irq_soft_check)
>> + return true;
>> +
>> + /*
>> + * Read irq_soft_enabled before reading other device specific
>> + * data. Paried with smp_store_relase() in
> paired
Will fix.
Thanks
>
>> + * virtio_device_ready() and WRITE_ONCE()/synchronize_rcu() in
>> + * virtio_reset_device().
>> + */
>> + return smp_load_acquire(&vdev->irq_soft_enabled);
>> +}
>> +
>> /**
>> * virtio_has_dma_quirk - determine whether this device has the DMA quirk
>> * @vdev: the device
>> @@ -236,6 +254,13 @@ void virtio_device_ready(struct virtio_device *dev)
>> if (dev->config->enable_cbs)
>> dev->config->enable_cbs(dev);
>>
>> + /*
>> + * Commit the driver setup before enabling the virtqueue
>> + * callbacks. Paried with smp_load_acuqire() in
>> + * virtio_irq_soft_enabled()
>> + */
>> + smp_store_release(&dev->irq_soft_enabled, true);
>> +
>> BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
>> dev->config->set_status(dev, status | VIRTIO_CONFIG_S_DRIVER_OK);
>> }
>> --
>> 2.25.1
Powered by blists - more mailing lists