netdev - Re: [PATCH 7/7] vDPA/ifcvf: improve irq requester, to handle per

Message-ID: <05435798-f496-5d12-ccce-bc53efa30e65@intel.com>
Date:   Tue, 18 Jan 2022 11:07:52 +0800
From:   "Zhu, Lingshan" <lingshan.zhu@...el.com>
To:     "Michael S. Tsirkin" <mst@...hat.com>
Cc:     jasowang@...hat.com, netdev@...r.kernel.org
Subject: Re: [PATCH 7/7] vDPA/ifcvf: improve irq requester, to handle
 per_vq/shared/config irq



On 1/14/2022 9:36 PM, Michael S. Tsirkin wrote:
> On Fri, Jan 14, 2022 at 08:32:24PM +0800, Zhu, Lingshan wrote:
>>
>> On 1/13/2022 6:29 PM, Michael S. Tsirkin wrote:
>>
>>      On Thu, Jan 13, 2022 at 06:10:15PM +0800, Zhu, Lingshan wrote:
>>
>>
>>          On 1/13/2022 5:52 PM, Michael S. Tsirkin wrote:
>>
>>              On Thu, Jan 13, 2022 at 04:17:29PM +0800, Zhu, Lingshan wrote:
>>
>>                  On 1/11/2022 3:11 PM, Zhu, Lingshan wrote:
>>
>>
>>
>>                       On 1/10/2022 2:04 PM, Michael S. Tsirkin wrote:
>>
>>                           On Mon, Jan 10, 2022 at 01:18:51PM +0800, Zhu Lingshan wrote:
>>
>>                               This commit expands irq requester abilities to handle per vq irq,
>>                               shared irq and config irq.
>>
>>                               On some platforms, the device can not get enough vectors for every
>>                               virtqueue and config interrupt, the device needs to work under such
>>                               circumstances.
>>
>>                               Normally a device can get enough vectors, so every virtqueue and
>>                               config interrupt can have its own vector/irq. If the total vector
>>                               number is less than all virtqueues + 1(config interrupt), all
>>                               virtqueues need to share a vector/irq and config interrupt is
>>                               enabled. If the total vector number < 2, all virtqueues share
>>                               a vector/irq, and config interrupt is disabled. Otherwise it will
>>                               fail if allocation for vectors fails.
>>
>>                               This commit also made necessary changes to the irq cleaner to
>>                               free per vq irq/shared irq and config irq.
>>
>>                               Signed-off-by: Zhu Lingshan <lingshan.zhu@...el.com>
>>
>>                           In this case, shouldn't you also check VIRTIO_PCI_ISR_CONFIG?
>>                           doing that will skip the need
>>
>>                       Hello Michael,
>>
>>                       When insufficient MSIX vectors granted:
>>                       If num_vectors >=2, there will be a vector for the config interrupt, and all vqs share one vector.
>>                       If num_vectors =1, all vqs share the only one vector, and config interrupt is disabled.
>>
>>              ATM linux falls back to INTX in that case, shared by config and vqs.
>>
>>          Yes, the same result. However this driver needs to drive VFs too, and VFs do
>>          not support INTX,
>>          so we need it to send msix dma.
>>
>>                       currently vqs and config interrupt don't share vectors, so IMHO, no need to check .
>>
>>              IMHO it does not matter much that current Linux drivers do not use it,
>>              the spec explicitly allows this option. If such hardware
>>              becomes more common (and you seem to want to improve support
>>              for managing interrupts so maybe yes) we'll add it in Linux.
>>
>>          (just saw your other email coming in), so I think I should implement an irq
>>          handler for the num_vectors=1 case: a handler that checks VIRTIO_PCI_ISR_CONFIG
>>          to tell whether it is a vq interrupt or a config interrupt, then handles it
>>          (without disabling the config interrupt).
>>
>>          Thanks,
>>          Zhu Lingshan
>>
>>      Right. To be more exact, if status bit 2 is not set you call
>>      vq interrupt, if set you call both vq and config interrupt,
>>      since with MSI only config interrupts have a status bit.
>>
>> Thanks! I guess I should call both vq and config interrupt handlers when they
>> share only one vector, because the spec says VIRTIO_PCI_CAP_ISR_CFG
>> is for INTx, but VFs don't support INTx as SRIOV spec required, so
>> isr may always be zero.
>>
>> Thanks,
>> Zhu Lingshan
>>
> Yes. But on the other hand, the spec says:
>
> The device MUST present at least one VIRTIO_PCI_CAP_ISR_CFG capability.
> The device MUST set the Device Configuration Interrupt bit in ISR status before sending a device
> configuration change notification to the driver.
> If MSIX capability is disabled, the device MUST set the Queue Interrupt bit in ISR status before sending a
> virtqueue notification to the driver.
>
> which to me implies that the Device Configuration Interrupt bit
> is set unconditionally.
>
> And yes it says:
> ...to be used for INT#x interrupt handling
> but it does not say "exclusively".
>
> It is unfortunate that it does not copy this requirement in more places, but
> I think that device does have to set Device Configuration Interrupt bit
> unconditionally.
Sorry for the late reply. I totally agree with extending the ISR cap to
MSI (and MSI-X) usage.
>
>
> What exactly does ifcvf do? Does it ever trigger config change
> interrupts? If it does, does it set the Device Configuration Interrupt
> when MSI is used?
It triggers a config interrupt upon config changes. However, the spec says:
"If MSI-X capability is enabled, the driver SHOULD NOT access ISR status
upon detecting a Queue Interrupt.",
so for a VF (MSI-X only, no INTx), when a vq interrupt is triggered, we
see isr == 0.
So I think it would be nice to make slight changes to the spec: describe
ISR cap usage consistently, and remove this (ambiguous) limitation on ISR
usage (which implies it is only for INTx).
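
For reference, the dispatch rule suggested earlier in this thread (always
run the vq handlers, and run the config handler too when the config bit
is set in ISR status) could be sketched roughly as below. This is only an
illustration and not driver code: isr_dispatch() and the RAN_* flags are
made-up names, and the bit values are assumed to match the ISR status
layout (0x1 queue, 0x2 config, the latter being VIRTIO_PCI_ISR_CONFIG in
Linux).

```c
#include <assert.h>
#include <stdint.h>

/* ISR status bits; 0x2 is assumed to match VIRTIO_PCI_ISR_CONFIG. */
#define ISR_QUEUE	0x1
#define ISR_CONFIG	0x2

/* Hypothetical result flags, used only for this illustration. */
#define RAN_VQ		0x1
#define RAN_CONFIG	0x2

/*
 * Decide which handlers to run for one interrupt on a shared vector,
 * given the value read from ISR status.
 *
 * The vq handlers always run, because with MSI-X the queue bit is not
 * guaranteed to be set. The config handler runs only if the device set
 * the config bit.
 */
static unsigned int isr_dispatch(uint8_t isr)
{
	unsigned int ran = RAN_VQ;

	if (isr & ISR_CONFIG)
		ran |= RAN_CONFIG;

	return ran;
}
```

With this rule, a device that never sets the config bit under MSI-X
(isr == 0) would never get its config handler called, which is the case
I am worried about for VFs.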

For a driver which drives both VF and PF, I think currently we should
ignore the ISR cap, meaning that if all the device's vqs and the config
interrupt share the only vector/IRQ, we just kick them all. I agree we should
work out a way to tell the device type in the future.
Does this sound reasonable?
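
To make the "kick them all" idea concrete, here is a rough standalone
sketch of what the shared-vector handler could do. It is not the V2
patch: struct shared_hw, struct callback, MAX_QUEUES,
shared_vector_handler() and count_cb() are hypothetical names for
illustration only, and the real handler would take (int irq, void *arg)
and return irqreturn_t.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_QUEUES 8	/* hypothetical limit for this sketch */

/* A callback plus its private data, similar in spirit to a vdpa callback. */
struct callback {
	void (*fn)(void *priv);
	void *priv;
};

/* Hypothetical per-device state: one callback per vq plus one for config. */
struct shared_hw {
	int nr_vring;
	struct callback vq_cb[MAX_QUEUES];
	struct callback config_cb;
};

/*
 * Handler for the case where all vqs and the config interrupt share a
 * single vector. ISR status is deliberately not read: every registered
 * vq callback is invoked, then the config callback.
 * Returns the number of callbacks invoked.
 */
static int shared_vector_handler(struct shared_hw *hw)
{
	int i, called = 0;

	assert(hw != NULL && hw->nr_vring <= MAX_QUEUES);

	for (i = 0; i < hw->nr_vring; i++) {
		if (hw->vq_cb[i].fn) {
			hw->vq_cb[i].fn(hw->vq_cb[i].priv);
			called++;
		}
	}

	if (hw->config_cb.fn) {
		hw->config_cb.fn(hw->config_cb.priv);
		called++;
	}

	return called;
}

/* Example callback: counts invocations through its private pointer. */
static void count_cb(void *priv)
{
	(*(int *)priv)++;
}
```

The cost is some spurious callbacks on every interrupt, but it does not
depend on whether the device sets any ISR bit under MSI-X.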

Thanks,
Zhu Lingshan
>
> I will report a spec defect for the apparent inconsistency.
>
>>                       I will send a V2 patch address your comments.
>>
>>                       Thanks,
>>                       Zhu Lingshan
>>
>>                               ---
>>                                drivers/vdpa/ifcvf/ifcvf_base.h |  6 +--
>>                                drivers/vdpa/ifcvf/ifcvf_main.c | 78 +++++++++++++++------------------
>>                                2 files changed, 38 insertions(+), 46 deletions(-)
>>
>>                               diff --git a/drivers/vdpa/ifcvf/ifcvf_base.h b/drivers/vdpa/ifcvf/ifcvf_base.h
>>                               index 1d5431040d7d..1d0afb63f06c 100644
>>                               --- a/drivers/vdpa/ifcvf/ifcvf_base.h
>>                               +++ b/drivers/vdpa/ifcvf/ifcvf_base.h
>>                               @@ -27,8 +27,6 @@
>>
>>                                #define IFCVF_QUEUE_ALIGNMENT  PAGE_SIZE
>>                                #define IFCVF_QUEUE_MAX                32768
>>                               -#define IFCVF_MSI_CONFIG_OFF   0
>>                               -#define IFCVF_MSI_QUEUE_OFF    1
>>                                #define IFCVF_PCI_MAX_RESOURCE 6
>>
>>                                #define IFCVF_LM_CFG_SIZE              0x40
>>                               @@ -102,11 +100,13 @@ struct ifcvf_hw {
>>                                       u8 notify_bar;
>>                                       /* Notificaiton bar address */
>>                                       void __iomem *notify_base;
>>                               +       u8 vector_per_vq;
>>                               +       u16 padding;
>>
>>                           What is this padding doing?
>>
>>                  for cacheline alignment
>>
>>
>>
>>                                       phys_addr_t notify_base_pa;
>>                                       u32 notify_off_multiplier;
>>                               +       u32 dev_type;
>>                                       u64 req_features;
>>                                       u64 hw_features;
>>                               -       u32 dev_type;
>>
>>                           moving things around ... optimization? split out.
>>
>>                  sure
>>
>>
>>
>>                                       struct virtio_pci_common_cfg __iomem *common_cfg;
>>                                       void __iomem *net_cfg;
>>                                       struct vring_info vring[IFCVF_MAX_QUEUES];
>>                               diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c
>>                               index 414b5dfd04ca..ec76e342bd7e 100644
>>                               --- a/drivers/vdpa/ifcvf/ifcvf_main.c
>>                               +++ b/drivers/vdpa/ifcvf/ifcvf_main.c
>>                               @@ -17,6 +17,8 @@
>>                                #define DRIVER_AUTHOR   "Intel Corporation"
>>                                #define IFCVF_DRIVER_NAME       "ifcvf"
>>
>>                               +static struct vdpa_config_ops ifc_vdpa_ops;
>>                               +
>>
>>                           there can be multiple devices thinkably.
>>                           reusing a global ops does not sound reasonable.
>>
>>                  OK, I will set vq irq number to -EINVAL when vqs share irq,
>>                  then we can disable irq_bypass when see irq = -EINVAL,
>>                  no need to set get_vq_irq = NULL.
>>
>>
>>
>>
>>                                static irqreturn_t ifcvf_config_changed(int irq, void *arg)
>>                                {
>>                                       struct ifcvf_hw *vf = arg;
>>                               @@ -63,13 +65,20 @@ static void ifcvf_free_irq(struct ifcvf_adapter *adapter, int queues)
>>                                       struct ifcvf_hw *vf = &adapter->vf;
>>                                       int i;
>>
>>                               +       if (vf->vector_per_vq)
>>                               +               for (i = 0; i < queues; i++) {
>>                               +                       devm_free_irq(&pdev->dev, vf->vring[i].irq, &vf->vring[i]);
>>                               +                       vf->vring[i].irq = -EINVAL;
>>                               +               }
>>                               +       else
>>                               +               devm_free_irq(&pdev->dev, vf->vring[0].irq, vf);
>>
>>                               -       for (i = 0; i < queues; i++) {
>>                               -               devm_free_irq(&pdev->dev, vf->vring[i].irq, &vf->vring[i]);
>>                               -               vf->vring[i].irq = -EINVAL;
>>                               +
>>                               +       if (vf->config_irq != -EINVAL) {
>>                               +               devm_free_irq(&pdev->dev, vf->config_irq, vf);
>>                               +               vf->config_irq = -EINVAL;
>>                                       }
>>
>>                           what about other error types?
>>
>>                  vf->config_irq is set to -EINVAL in ifcvf_request_config_irq(),
>>                  if no config irq(vector) is granted, or it should be a valid irq number,
>>                  so there can be no other error numbers. But I can change it
>>                  to  if (vf->config_irq < 0) for sure
>>
>>
>>
>>
>>                               -       devm_free_irq(&pdev->dev, vf->config_irq, vf);
>>                                       ifcvf_free_irq_vectors(pdev);
>>                                }
>>
>>                               @@ -191,52 +200,35 @@ static int ifcvf_request_config_irq(struct ifcvf_adapter *adapter, int config_ve
>>
>>                                static int ifcvf_request_irq(struct ifcvf_adapter *adapter)
>>                                {
>>                               -       struct pci_dev *pdev = adapter->pdev;
>>                                       struct ifcvf_hw *vf = &adapter->vf;
>>                               -       int vector, i, ret, irq;
>>                               -       u16 max_intr;
>>                               +       u16 nvectors, max_vectors;
>>                               +       int config_vector, ret;
>>
>>                               -       /* all queues and config interrupt  */
>>                               -       max_intr = vf->nr_vring + 1;
>>                               +       nvectors = ifcvf_alloc_vectors(adapter);
>>                               +       if (nvectors < 0)
>>                               +               return nvectors;
>>
>>                               -       ret = pci_alloc_irq_vectors(pdev, max_intr,
>>                               -                                   max_intr, PCI_IRQ_MSIX);
>>                               -       if (ret < 0) {
>>                               -               IFCVF_ERR(pdev, "Failed to alloc IRQ vectors\n");
>>                               -               return ret;
>>                               -       }
>>                               +       vf->vector_per_vq = true;
>>                               +       max_vectors = vf->nr_vring + 1;
>>                               +       config_vector = vf->nr_vring;
>>
>>                               -       snprintf(vf->config_msix_name, 256, "ifcvf[%s]-config\n",
>>                               -                pci_name(pdev));
>>                               -       vector = 0;
>>                               -       vf->config_irq = pci_irq_vector(pdev, vector);
>>                               -       ret = devm_request_irq(&pdev->dev, vf->config_irq,
>>                               -                              ifcvf_config_changed, 0,
>>                               -                              vf->config_msix_name, vf);
>>                               -       if (ret) {
>>                               -               IFCVF_ERR(pdev, "Failed to request config irq\n");
>>                               -               return ret;
>>                               +       if (nvectors < max_vectors) {
>>                               +               vf->vector_per_vq = false;
>>                               +               config_vector = 1;
>>                               +               ifc_vdpa_ops.get_vq_irq = NULL;
>>                                       }
>>
>>                               -       for (i = 0; i < vf->nr_vring; i++) {
>>                               -               snprintf(vf->vring[i].msix_name, 256, "ifcvf[%s]-%d\n",
>>                               -                        pci_name(pdev), i);
>>                               -               vector = i + IFCVF_MSI_QUEUE_OFF;
>>                               -               irq = pci_irq_vector(pdev, vector);
>>                               -               ret = devm_request_irq(&pdev->dev, irq,
>>                               -                                      ifcvf_intr_handler, 0,
>>                               -                                      vf->vring[i].msix_name,
>>                               -                                      &vf->vring[i]);
>>                               -               if (ret) {
>>                               -                       IFCVF_ERR(pdev,
>>                               -                                 "Failed to request irq for vq %d\n", i);
>>                               -                       ifcvf_free_irq(adapter, i);
>>                               +       if (nvectors < 2)
>>                               +               config_vector = 0;
>>
>>                               -                       return ret;
>>                               -               }
>>                               +       ret = ifcvf_request_vq_irq(adapter, vf->vector_per_vq);
>>                               +       if (ret)
>>                               +               return ret;
>>
>>                               -               vf->vring[i].irq = irq;
>>                               -       }
>>                               +       ret = ifcvf_request_config_irq(adapter, config_vector);
>>                               +
>>                               +       if (ret)
>>                               +               return ret;
>>
>>                           here on error we need to cleanup vq irq we requested, need we not?
>>
>>                  I think it may not be needed, it can work without config interrupt, though lame
>>
>>                  Thanks for your comments!
>>                  Zhu Lingshan
>>
>>
>>
>>
>>                                       return 0;
>>                                }
>>                               @@ -573,7 +565,7 @@ static struct vdpa_notification_area ifcvf_get_vq_notification(struct vdpa_devic
>>                                 * IFCVF currently does't have on-chip IOMMU, so not
>>                                 * implemented set_map()/dma_map()/dma_unmap()
>>                                 */
>>                               -static const struct vdpa_config_ops ifc_vdpa_ops = {
>>                               +static struct vdpa_config_ops ifc_vdpa_ops = {
>>                                       .get_features   = ifcvf_vdpa_get_features,
>>                                       .set_features   = ifcvf_vdpa_set_features,
>>                                       .get_status     = ifcvf_vdpa_get_status,
>>                               --
>>                               2.27.0
>>
>>
>>
>>
>>
>>