lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240903082950.00006d1e@linux.intel.com>
Date: Tue, 3 Sep 2024 08:29:50 -0700
From: Nirmal Patel <nirmal.patel@...ux.intel.com>
To: Kai-Heng Feng <kai.heng.feng@...onical.com>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@...aro.org>,
 jonathan.derrick@...ux.dev, acelan.kao@...onical.com,
 lpieralisi@...nel.org, kw@...ux.com, robh@...nel.org, bhelgaas@...gle.com,
 linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] PCI: vmd: Delay interrupt handling on MTL VMD
 controller

On Tue, 3 Sep 2024 15:07:45 +0800
Kai-Heng Feng <kai.heng.feng@...onical.com> wrote:

> On Tue, Sep 3, 2024 at 12:29 PM Manivannan Sadhasivam
> <manivannan.sadhasivam@...aro.org> wrote:
> >
> > On Tue, Sep 03, 2024 at 10:55:44AM +0800, Kai-Heng Feng wrote:  
> > > Meteor Lake VMD has a bug that the IRQ raises before the DMA
> > > region is ready, so the requested IO is considered never
> > > completed: [   97.343423] nvme nvme0: I/O 259 QID 2 timeout,
> > > completion polled [   97.343446] nvme nvme0: I/O 384 QID 3
> > > timeout, completion polled [   97.343459] nvme nvme0: I/O 320 QID
> > > 4 timeout, completion polled [   97.343470] nvme nvme0: I/O 707
> > > QID 5 timeout, completion polled
> > >
> > > The is documented as erratum MTL016 [0]. The suggested workaround
> > > is to "The VMD MSI interrupt-handler should initially perform a
> > > dummy register read to the MSI initiator device prior to any
> > > writes to ensure proper PCIe ordering." which essentially is
> > > adding a delay before the interrupt handling.
> > >  
> >
> > Why can't you add a dummy register read instead? Adding a delay for
> > PCIe ordering is not going to work always.  
> 
> This can be done too. But it can take longer than 4us delay, so I'd
> like to keep it a bit faster here.

Since the issue is due to the bug in silicon and we have limited
options. If community is okay to take some performance hit, then please
add more details about the patch above VMD_FEAT_INTERRUPT_QUIRK.

-nirmal
> 
> >  
> > > Hence add a delay before handle interrupt to workaround the
> > > erratum.
> > >
> > > [0]
> > > https://edc.intel.com/content/www/us/en/design/products/platforms/details/meteor-lake-u-p/core-ultra-processor-specification-update/errata-details/#MTL016
> > >
> > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=217871
> > > Signed-off-by: Kai-Heng Feng <kai.heng.feng@...onical.com>
> > > ---
> > >  drivers/pci/controller/vmd.c | 18 ++++++++++++++++--
> > >  1 file changed, 16 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/pci/controller/vmd.c
> > > b/drivers/pci/controller/vmd.c index a726de0af011..3433b3730f9c
> > > 100644 --- a/drivers/pci/controller/vmd.c
> > > +++ b/drivers/pci/controller/vmd.c
> > > @@ -16,6 +16,7 @@
> > >  #include <linux/srcu.h>
> > >  #include <linux/rculist.h>
> > >  #include <linux/rcupdate.h>
> > > +#include <linux/delay.h>
> > >
> > >  #include <asm/irqdomain.h>
> > >
> > > @@ -74,6 +75,9 @@ enum vmd_features {
> > >        * proper power management of the SoC.
> > >        */
> > >       VMD_FEAT_BIOS_PM_QUIRK          = (1 << 5),
> > > +
> > > +     /* Erratum MTL016 */
> > > +     VMD_FEAT_INTERRUPT_QUIRK        = (1 << 6),
> > >  };
> > >
> > >  #define VMD_BIOS_PM_QUIRK_LTR        0x1003  /* 3145728 ns */
> > > @@ -90,6 +94,8 @@ static DEFINE_IDA(vmd_instance_ida);
> > >   */
> > >  static DEFINE_RAW_SPINLOCK(list_lock);
> > >
> > > +static bool interrupt_delay;
> > > +
> > >  /**
> > >   * struct vmd_irq - private data to map driver IRQ to the VMD
> > > shared vector
> > >   * @node:    list item for parent traversal.
> > > @@ -105,6 +111,7 @@ struct vmd_irq {
> > >       struct vmd_irq_list     *irq;
> > >       bool                    enabled;
> > >       unsigned int            virq;
> > > +     bool                    delay_irq;  
> >
> > This is unused. Perhaps you wanted to use this instead of
> > interrupt_delay?  
> 
> This is leftover, will scratch this.
> 
> Kai-Heng
> 
> >
> > - Mani
> >  
> > >  };
> > >
> > >  /**
> > > @@ -680,8 +687,11 @@ static irqreturn_t vmd_irq(int irq, void
> > > *data) int idx;
> > >
> > >       idx = srcu_read_lock(&irqs->srcu);
> > > -     list_for_each_entry_rcu(vmdirq, &irqs->irq_list, node)
> > > +     list_for_each_entry_rcu(vmdirq, &irqs->irq_list, node) {
> > > +             if (interrupt_delay)
> > > +                     udelay(4);
> > >               generic_handle_irq(vmdirq->virq);
> > > +     }
> > >       srcu_read_unlock(&irqs->srcu, idx);
> > >
> > >       return IRQ_HANDLED;
> > > @@ -1015,6 +1025,9 @@ static int vmd_probe(struct pci_dev *dev,
> > > const struct pci_device_id *id) if (features &
> > > VMD_FEAT_OFFSET_FIRST_VECTOR) vmd->first_vec = 1;
> > >
> > > +     if (features & VMD_FEAT_INTERRUPT_QUIRK)
> > > +             interrupt_delay = true;
> > > +
> > >       spin_lock_init(&vmd->cfg_lock);
> > >       pci_set_drvdata(dev, vmd);
> > >       err = vmd_enable_domain(vmd, features);
> > > @@ -1106,7 +1119,8 @@ static const struct pci_device_id vmd_ids[]
> > > = { {PCI_VDEVICE(INTEL, 0xa77f),
> > >               .driver_data = VMD_FEATS_CLIENT,},
> > >       {PCI_VDEVICE(INTEL, 0x7d0b),
> > > -             .driver_data = VMD_FEATS_CLIENT,},
> > > +             .driver_data = VMD_FEATS_CLIENT |
> > > +                            VMD_FEAT_INTERRUPT_QUIRK,},
> > >       {PCI_VDEVICE(INTEL, 0xad0b),
> > >               .driver_data = VMD_FEATS_CLIENT,},
> > >       {PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_VMD_9A0B),
> > > --
> > > 2.43.0
> > >  
> >
> > --
> > மணிவண்ணன் சதாசிவம்  


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ