Message-ID: <20220118030711-mutt-send-email-mst@kernel.org>
Date: Tue, 18 Jan 2022 03:07:40 -0500
From: "Michael S. Tsirkin" <mst@...hat.com>
To: "Zhu, Lingshan" <lingshan.zhu@...el.com>
Cc: jasowang@...hat.com, netdev@...r.kernel.org
Subject: Re: [PATCH 7/7] vDPA/ifcvf: improve irq requester, to handle
per_vq/shared/config irq
On Tue, Jan 18, 2022 at 11:07:52AM +0800, Zhu, Lingshan wrote:
>
>
> On 1/14/2022 9:36 PM, Michael S. Tsirkin wrote:
> > On Fri, Jan 14, 2022 at 08:32:24PM +0800, Zhu, Lingshan wrote:
> > >
> > > On 1/13/2022 6:29 PM, Michael S. Tsirkin wrote:
> > >
> > > On Thu, Jan 13, 2022 at 06:10:15PM +0800, Zhu, Lingshan wrote:
> > >
> > >
> > > On 1/13/2022 5:52 PM, Michael S. Tsirkin wrote:
> > >
> > > On Thu, Jan 13, 2022 at 04:17:29PM +0800, Zhu, Lingshan wrote:
> > >
> > > On 1/11/2022 3:11 PM, Zhu, Lingshan wrote:
> > >
> > >
> > >
> > > On 1/10/2022 2:04 PM, Michael S. Tsirkin wrote:
> > >
> > > On Mon, Jan 10, 2022 at 01:18:51PM +0800, Zhu Lingshan wrote:
> > >
> > > This commit extends the irq requester's abilities to handle per-vq irq,
> > > shared irq and config irq.
> > >
> > > On some platforms the device cannot get enough vectors for every
> > > virtqueue and the config interrupt; the device needs to keep working
> > > under such circumstances.
> > >
> > > Normally a device can get enough vectors, so every virtqueue and the
> > > config interrupt can have its own vector/irq. If the total vector
> > > number is less than all virtqueues + 1 (config interrupt), all
> > > virtqueues share a vector/irq and the config interrupt stays enabled.
> > > If the total vector number is less than 2, all virtqueues share a
> > > vector/irq and the config interrupt is disabled. If allocating even a
> > > single vector fails, the whole request fails.
> > >
> > > This commit also makes the necessary changes to the irq cleaner to
> > > free the per-vq irqs/shared irq and the config irq.
> > >
> > > Signed-off-by: Zhu Lingshan <lingshan.zhu@...el.com>
> > >
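[ A minimal sketch, not the posted patch: the fallback policy the commit
  message above describes, written as plain C for illustration. The function
  name and parameters (sketch_plan_vectors, nvectors) are hypothetical. ]

    #include <errno.h>
    #include <stdbool.h>

    /*
     * Decide how to spread the granted vectors, following the policy in the
     * commit message: per-vq vectors when possible, otherwise one shared
     * vector for all vqs, dropping the config interrupt only as a last resort.
     */
    static int sketch_plan_vectors(int nvectors, int nr_vring,
                                   bool *per_vq, int *config_vector)
    {
            if (nvectors < 1)
                    return -ENOSPC;                 /* no vectors at all: give up */

            if (nvectors >= nr_vring + 1) {
                    *per_vq = true;                 /* one vector per vq ... */
                    *config_vector = nr_vring;      /* ... plus one for config */
            } else if (nvectors >= 2) {
                    *per_vq = false;                /* all vqs share vector 0 */
                    *config_vector = 1;             /* config stays enabled */
            } else {
                    *per_vq = false;                /* single shared vector */
                    *config_vector = -1;            /* config irq disabled */
            }

            return 0;
    }

[ For example, with nr_vring = 4: 5 granted vectors give per-vq interrupts,
  2 keep the config interrupt alongside a shared vq vector, and 1 disables
  the config interrupt entirely. ]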
> > > In this case, shouldn't you also check VIRTIO_PCI_ISR_CONFIG?
> > > doing that will skip the need
> > >
> > > Hello Michael,
> > >
> > > When insufficient MSI-X vectors are granted:
> > > If num_vectors >= 2, there will be a vector for the config interrupt, and all vqs share one vector.
> > > If num_vectors == 1, all vqs share the single vector, and the config interrupt is disabled.
> > >
> > > ATM Linux falls back to INTx in that case, shared by config and vqs.
> > >
> > > Yes, the same result. However this driver needs to drive VFs too, and
> > > VFs do not support INTx, so we need the device to signal interrupts via
> > > MSI-X (DMA writes).
> > >
> > > currently vqs and the config interrupt don't share vectors, so IMHO there is no need to check it.
> > >
> > > IMHO it does not matter much that current Linux drivers do not use it,
> > > the spec explicitly allows this option. If such hardware
> > > becomes more common (and you seem to want to improve support
> > > for managing interrupts so maybe yes) we'll add it in Linux.
> > >
> > > (I just saw your other email come in), so I think I should implement an
> > > irq handler for the num_vectors = 1 case: a handler that checks
> > > VIRTIO_PCI_ISR_CONFIG to tell whether it is a vq interrupt or a config
> > > interrupt, then handles it (without disabling the config interrupt).
> > >
> > > Thanks,
> > > Zhu Lingshan
> > >
> > > Right. To be more exact, if status bit 2 is not set you call the vq
> > > interrupt handler; if it is set you call both the vq and config
> > > interrupt handlers, since with MSI only config interrupts have a status bit.
> > >
> > > Thanks! I guess I should call both the vq and config interrupt handlers
> > > when they share only one vector, because the spec says
> > > VIRTIO_PCI_CAP_ISR_CFG is for INTx, but VFs don't support INTx as the
> > > SR-IOV spec requires, so the isr may always be zero.
> > >
> > > Thanks,
> > > Zhu Lingshan
> > >
> > Yes. But on the other hand, the spec says:
> >
> > The device MUST present at least one VIRTIO_PCI_CAP_ISR_CFG capability.
> > The device MUST set the Device Configuration Interrupt bit in ISR status before sending a device
> > configuration change notification to the driver.
> > If MSIX capability is disabled, the device MUST set the Queue Interrupt bit in ISR status before sending a
> > virtqueue notification to the driver.
> >
> > which to me implies that the Device Configuration Interrupt bit
> > is set unconditionally.
> >
> > And yes it says:
> > ...to be used for INT#x interrupt handling
> > but it does not say "exclusively".
> >
> > It is unfortunate that it does not copy this requirement in more places, but
> > I think that device does have to set Device Configuration Interrupt bit
> > unconditionally.
> Sorry for the late reply. I totally agree on expanding the ISR cap to MSI
> (and MSI-X) usage.
> >
> >
> > What exactly does ifcvf do? Does it ever trigger config change
> > interrupts? If it does, does it set the Device Configuration Interrupt
> > when MSI is used?
> It triggers a config interrupt upon config changes. However, the spec says:
> "If MSI-X capability is enabled, the driver SHOULD NOT access ISR status
> upon detecting a Queue Interrupt.",
> so for a VF (only MSI-X, no INTx), when a vq interrupt is triggered we see
> isr == 0;
Exactly. So the vq callback is invoked unconditionally, the config callback
only if ISR is set.
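[ A minimal sketch of that rule, not the driver's actual handler: it covers
  the case where all vqs and the config change share one MSI-X vector.
  read_isr_status() and sketch_handle_all_vqs() are hypothetical stand-ins
  for the driver's ISR accessor and per-vq callbacks. ]

    #include <linux/interrupt.h>
    #include <linux/virtio_pci.h>       /* VIRTIO_PCI_ISR_CONFIG */
    #include "ifcvf_base.h"             /* struct ifcvf_hw */

    static void sketch_handle_all_vqs(struct ifcvf_hw *vf);    /* hypothetical */
    static u8 read_isr_status(struct ifcvf_hw *vf);            /* hypothetical */

    /*
     * Shared-vector handler sketch: the vq path runs unconditionally (under
     * MSI-X a queue interrupt does not set an ISR bit), the config path only
     * when the device reports a configuration change through ISR status.
     */
    static irqreturn_t sketch_shared_intr_handler(int irq, void *arg)
    {
            struct ifcvf_hw *vf = arg;

            sketch_handle_all_vqs(vf);      /* kick every vq callback */

            if (read_isr_status(vf) & VIRTIO_PCI_ISR_CONFIG)
                    ifcvf_config_changed(irq, vf);  /* config change via ISR */

            return IRQ_HANDLED;
    }

[ The simpler fallback discussed further down — ignoring ISR and kicking both
  the vq and config callbacks on every interrupt — would drop the
  read_isr_status() check and call ifcvf_config_changed() unconditionally. ]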
> So I think it would be nice to make slight changes to the spec: consistently
> describe ISR cap usage, and remove this (ambiguous) limitation on ISR usage
> (which implies it is only for INTx).
>
> For the driver which drives both VF and PF, I think currently we should
> ignore the ISR cap, meaning that if all the device's vqs and the config
> interrupt share the only vector/IRQ, we just kick them all. I agree we
> should work out a way to tell the device type in the future.
> Does this sound reasonable?
>
> Thanks,
> Zhu Lingshan
> >
> > I will report a spec defect for the apparent inconsistency.
> >
> > > I will send a V2 patch address your comments.
> > >
> > > Thanks,
> > > Zhu Lingshan
> > >
> > > ---
> > > drivers/vdpa/ifcvf/ifcvf_base.h | 6 +--
> > > drivers/vdpa/ifcvf/ifcvf_main.c | 78 +++++++++++++++------------------
> > > 2 files changed, 38 insertions(+), 46 deletions(-)
> > >
> > > diff --git a/drivers/vdpa/ifcvf/ifcvf_base.h b/drivers/vdpa/ifcvf/ifcvf_base.h
> > > index 1d5431040d7d..1d0afb63f06c 100644
> > > --- a/drivers/vdpa/ifcvf/ifcvf_base.h
> > > +++ b/drivers/vdpa/ifcvf/ifcvf_base.h
> > > @@ -27,8 +27,6 @@
> > >
> > > #define IFCVF_QUEUE_ALIGNMENT PAGE_SIZE
> > > #define IFCVF_QUEUE_MAX 32768
> > > -#define IFCVF_MSI_CONFIG_OFF 0
> > > -#define IFCVF_MSI_QUEUE_OFF 1
> > > #define IFCVF_PCI_MAX_RESOURCE 6
> > >
> > > #define IFCVF_LM_CFG_SIZE 0x40
> > > @@ -102,11 +100,13 @@ struct ifcvf_hw {
> > > u8 notify_bar;
> > > /* Notificaiton bar address */
> > > void __iomem *notify_base;
> > > + u8 vector_per_vq;
> > > + u16 padding;
> > >
> > > What is this padding doing?
> > >
> > > for cacheline alignment
> > >
> > >
> > >
> > > phys_addr_t notify_base_pa;
> > > u32 notify_off_multiplier;
> > > + u32 dev_type;
> > > u64 req_features;
> > > u64 hw_features;
> > > - u32 dev_type;
> > >
> > > moving things around ... optimization? split out.
> > >
> > > sure
> > >
> > >
> > >
> > > struct virtio_pci_common_cfg __iomem *common_cfg;
> > > void __iomem *net_cfg;
> > > struct vring_info vring[IFCVF_MAX_QUEUES];
> > > diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c
> > > index 414b5dfd04ca..ec76e342bd7e 100644
> > > --- a/drivers/vdpa/ifcvf/ifcvf_main.c
> > > +++ b/drivers/vdpa/ifcvf/ifcvf_main.c
> > > @@ -17,6 +17,8 @@
> > > #define DRIVER_AUTHOR "Intel Corporation"
> > > #define IFCVF_DRIVER_NAME "ifcvf"
> > >
> > > +static struct vdpa_config_ops ifc_vdpa_ops;
> > > +
> > >
> > > there can conceivably be multiple devices.
> > > reusing a global ops does not sound reasonable.
> > >
> > > OK, I will set the vq irq number to -EINVAL when vqs share an irq,
> > > then we can disable irq_bypass when we see irq == -EINVAL,
> > > no need to set get_vq_irq = NULL.
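[ A minimal sketch of that idea, not the posted patch: report -EINVAL from
  the get_vq_irq op whenever vqs share a vector, so the vDPA core can skip
  irq bypass without a per-device copy of the ops table. vdpa_to_vf() is
  assumed here as the driver's vdpa_device-to-ifcvf_hw accessor. ]

    #include <linux/vdpa.h>
    #include "ifcvf_base.h"

    static int sketch_get_vq_irq(struct vdpa_device *vdpa_dev, u16 qid)
    {
            struct ifcvf_hw *vf = vdpa_to_vf(vdpa_dev);     /* assumed accessor */

            /* vqs share one irq: report -EINVAL so irq bypass is disabled */
            if (!vf->vector_per_vq)
                    return -EINVAL;

            return vf->vring[qid].irq;
    }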
> > >
> > >
> > >
> > >
> > > static irqreturn_t ifcvf_config_changed(int irq, void *arg)
> > > {
> > > struct ifcvf_hw *vf = arg;
> > > @@ -63,13 +65,20 @@ static void ifcvf_free_irq(struct ifcvf_adapter *adapter, int queues)
> > > struct ifcvf_hw *vf = &adapter->vf;
> > > int i;
> > >
> > > + if (vf->vector_per_vq)
> > > + for (i = 0; i < queues; i++) {
> > > + devm_free_irq(&pdev->dev, vf->vring[i].irq, &vf->vring[i]);
> > > + vf->vring[i].irq = -EINVAL;
> > > + }
> > > + else
> > > + devm_free_irq(&pdev->dev, vf->vring[0].irq, vf);
> > >
> > > - for (i = 0; i < queues; i++) {
> > > - devm_free_irq(&pdev->dev, vf->vring[i].irq, &vf->vring[i]);
> > > - vf->vring[i].irq = -EINVAL;
> > > +
> > > + if (vf->config_irq != -EINVAL) {
> > > + devm_free_irq(&pdev->dev, vf->config_irq, vf);
> > > + vf->config_irq = -EINVAL;
> > > }
> > >
> > > what about other error types?
> > >
> > > vf->config_irq is set to -EINVAL in ifcvf_request_config_irq()
> > > if no config irq (vector) is granted, otherwise it holds a valid irq
> > > number, so there can be no other error numbers. But I can change it
> > > to if (vf->config_irq < 0) for sure
> > >
> > >
> > >
> > >
> > > - devm_free_irq(&pdev->dev, vf->config_irq, vf);
> > > ifcvf_free_irq_vectors(pdev);
> > > }
> > >
> > > @@ -191,52 +200,35 @@ static int ifcvf_request_config_irq(struct ifcvf_adapter *adapter, int config_ve
> > >
> > > static int ifcvf_request_irq(struct ifcvf_adapter *adapter)
> > > {
> > > - struct pci_dev *pdev = adapter->pdev;
> > > struct ifcvf_hw *vf = &adapter->vf;
> > > - int vector, i, ret, irq;
> > > - u16 max_intr;
> > > + u16 nvectors, max_vectors;
> > > + int config_vector, ret;
> > >
> > > - /* all queues and config interrupt */
> > > - max_intr = vf->nr_vring + 1;
> > > + nvectors = ifcvf_alloc_vectors(adapter);
> > > + if (nvectors < 0)
> > > + return nvectors;
> > >
> > > - ret = pci_alloc_irq_vectors(pdev, max_intr,
> > > - max_intr, PCI_IRQ_MSIX);
> > > - if (ret < 0) {
> > > - IFCVF_ERR(pdev, "Failed to alloc IRQ vectors\n");
> > > - return ret;
> > > - }
> > > + vf->vector_per_vq = true;
> > > + max_vectors = vf->nr_vring + 1;
> > > + config_vector = vf->nr_vring;
> > >
> > > - snprintf(vf->config_msix_name, 256, "ifcvf[%s]-config\n",
> > > - pci_name(pdev));
> > > - vector = 0;
> > > - vf->config_irq = pci_irq_vector(pdev, vector);
> > > - ret = devm_request_irq(&pdev->dev, vf->config_irq,
> > > - ifcvf_config_changed, 0,
> > > - vf->config_msix_name, vf);
> > > - if (ret) {
> > > - IFCVF_ERR(pdev, "Failed to request config irq\n");
> > > - return ret;
> > > + if (nvectors < max_vectors) {
> > > + vf->vector_per_vq = false;
> > > + config_vector = 1;
> > > + ifc_vdpa_ops.get_vq_irq = NULL;
> > > }
> > >
> > > - for (i = 0; i < vf->nr_vring; i++) {
> > > - snprintf(vf->vring[i].msix_name, 256, "ifcvf[%s]-%d\n",
> > > - pci_name(pdev), i);
> > > - vector = i + IFCVF_MSI_QUEUE_OFF;
> > > - irq = pci_irq_vector(pdev, vector);
> > > - ret = devm_request_irq(&pdev->dev, irq,
> > > - ifcvf_intr_handler, 0,
> > > - vf->vring[i].msix_name,
> > > - &vf->vring[i]);
> > > - if (ret) {
> > > - IFCVF_ERR(pdev,
> > > - "Failed to request irq for vq %d\n", i);
> > > - ifcvf_free_irq(adapter, i);
> > > + if (nvectors < 2)
> > > + config_vector = 0;
> > >
> > > - return ret;
> > > - }
> > > + ret = ifcvf_request_vq_irq(adapter, vf->vector_per_vq);
> > > + if (ret)
> > > + return ret;
> > >
> > > - vf->vring[i].irq = irq;
> > > - }
> > > + ret = ifcvf_request_config_irq(adapter, config_vector);
> > > +
> > > + if (ret)
> > > + return ret;
> > >
> > > here on error we need to cleanup vq irq we requested, need we not?
> > >
> > > I think it may not be needed; the device can work without the config interrupt, though that is lame
> > >
> > > Thanks for your comments!
> > > Zhu Lingshan
> > >
> > >
> > >
> > >
> > > return 0;
> > > }
> > > @@ -573,7 +565,7 @@ static struct vdpa_notification_area ifcvf_get_vq_notification(struct vdpa_devic
> > > * IFCVF currently does't have on-chip IOMMU, so not
> > > * implemented set_map()/dma_map()/dma_unmap()
> > > */
> > > -static const struct vdpa_config_ops ifc_vdpa_ops = {
> > > +static struct vdpa_config_ops ifc_vdpa_ops = {
> > > .get_features = ifcvf_vdpa_get_features,
> > > .set_features = ifcvf_vdpa_set_features,
> > > .get_status = ifcvf_vdpa_get_status,
> > > --
> > > 2.27.0
> > >
> > >
> > >
> > >
> > >
> > >