Date: Wed, 10 Jan 2024 09:13:59 -0800
From: Ira Weiny <ira.weiny@...el.com>
To: Dan Williams <dan.j.williams@...el.com>, Ira Weiny <ira.weiny@...el.com>,
	Davidlohr Bueso <dave@...olabs.net>, Jonathan Cameron
	<jonathan.cameron@...wei.com>, Dave Jiang <dave.jiang@...el.com>, "Alison
 Schofield" <alison.schofield@...el.com>, Vishal Verma
	<vishal.l.verma@...el.com>
CC: <linux-cxl@...r.kernel.org>, <linux-kernel@...r.kernel.org>, Ira Weiny
	<ira.weiny@...el.com>
Subject: RE: [PATCH RFC] cxl/pci: Skip irq features if irq's are not supported

Dan Williams wrote:
> Ira Weiny wrote:

[snip]

> > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> > index a2fcbca253f3..422bc9657e5c 100644
> > --- a/drivers/cxl/cxlmem.h
> > +++ b/drivers/cxl/cxlmem.h
> > @@ -410,6 +410,7 @@ enum cxl_devtype {
> >   * @ram_res: Active Volatile memory capacity configuration
> >   * @serial: PCIe Device Serial Number
> >   * @type: Generic Memory Class device or Vendor Specific Memory device
> > + * @irq_supported: Flag if irqs are supported by the device
> >   */
> >  struct cxl_dev_state {
> >  	struct device *dev;
> > @@ -424,6 +425,7 @@ struct cxl_dev_state {
> >  	struct resource ram_res;
> >  	u64 serial;
> >  	enum cxl_devtype type;
> > +	bool irq_supported;
> 
> I would rather not carry this init-time-only relevant flag in perpetuity
> in the state structure.

Fair enough.

> Let cxl_pci_probe() see the result from
> cxl_alloc_irq_vectors() and then optionally skip calling setup for
> features that demand interrupt support.

Yes, it is better to make the bool a local variable in cxl_pci_probe().
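
A rough userspace sketch of that idea follows. It is not the actual patch:
struct mock_dev and every mock_* function are hypothetical stand-ins, and
only cxl_alloc_irq_vectors() and cxl_pci_probe() are names from this thread.
It assumes the mailbox can always fall back to polling while event
processing cannot.

```c
#include <stdbool.h>

/* Hypothetical userspace stand-in for a device; not struct cxl_dev_state. */
struct mock_dev {
	int nr_vectors;		/* vectors the device offers, 0 if none */
	bool mbox_ready;	/* mailbox usable (polled or irq driven) */
	bool mbox_irq;		/* mailbox completion irq was configured */
	bool events_enabled;	/* event processing was configured */
};

/* Models cxl_alloc_irq_vectors(): 0 on success, negative on failure. */
static int mock_alloc_irq_vectors(const struct mock_dev *dev)
{
	return dev->nr_vectors > 0 ? 0 : -1;
}

/* Mailbox setup works either way; irqs are only an optimization. */
static void mock_setup_mailbox(struct mock_dev *dev, bool irq_avail)
{
	dev->mbox_ready = true;
	dev->mbox_irq = irq_avail;
}

/* Event setup has no polling fallback, so it is only called with irqs. */
static void mock_event_config(struct mock_dev *dev)
{
	dev->events_enabled = true;
}

/* Models the probe path: the flag lives on the stack, not in device state. */
static int mock_probe(struct mock_dev *dev)
{
	bool irq_avail = mock_alloc_irq_vectors(dev) == 0;

	mock_setup_mailbox(dev, irq_avail);
	if (irq_avail)
		mock_event_config(dev);

	return 0;	/* missing irqs never fail the probe */
}
```

The point of the sketch is only that irq_avail goes out of scope when probe
returns, so nothing init-time-only is carried in the state structure.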

> 
> >  };
> >  
> >  /**
> > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > index 0155fb66b580..bb90ac011290 100644
> > --- a/drivers/cxl/pci.c
> > +++ b/drivers/cxl/pci.c
> > @@ -443,6 +443,12 @@ static int cxl_pci_setup_mailbox(struct cxl_memdev_state *mds)
> >  	if (!(cap & CXLDEV_MBOX_CAP_BG_CMD_IRQ))
> >  		return 0;
> >  
> > +	if (!cxlds->irq_supported) {
> > +		dev_err(cxlds->dev, "Mailbox interrupts enabled but device indicates no interrupt vectors supported.\n");
> > +		dev_err(cxlds->dev, "Skip mailbox interrupt configuration.\n");
> > +		return 0;
> > +	}
> 
> I see no need to emit a log message here as the code is happy to
> support a mailbox in polled mode.

True.  However, this indicates an error with the device IMO.  The device
did not support MSI/MSI-X yet indicates irq support for mailboxes.
That is not a well-behaved device even if it will work.  We are not
failing the probe here but I think the error gives users good insight.

We could just make it dev_dbg() though.

> I.e. this is not an error that the
> user should call their device-vendor about because end user will see no
> loss of functionality.

But it is not exactly a nice device IMO.

> 
> The code right after this is already fully tolerant of IRQ setup errors:

Agreed, which is why only the error was printed and the irq setup calls
were skipped for good measure.

If you feel strongly about it, I can just drop the hunk, but I still think
it is worth some message for devices that behave this way.

> 
>         irq = pci_irq_vector(to_pci_dev(cxlds->dev), msgnum);
>         if (irq < 0)
>                 return 0;
> 
>         if (cxl_request_irq(cxlds, irq, cxl_pci_mbox_irq))
>                 return 0;
> 
> 

[snip]

> >  
> >  static irqreturn_t cxl_event_thread(int irq, void *id)
> > @@ -754,6 +762,13 @@ static int cxl_event_config(struct pci_host_bridge *host_bridge,
> >  	if (!host_bridge->native_cxl_error)
> >  		return 0;
> >  
> > +	/* Polling not supported */
> > +	if (!mds->cxlds.irq_supported) {
> > +		dev_err(mds->cxlds.dev, "Host events enabled but device indicates no interrupt vectors supported.\n");
> > +		dev_err(mds->cxlds.dev, "Event polling is not supported, skip event processing.\n");
> > +		return 0;
> > +	}
> 
> This one can be a dev_info(), since there is no polling fallback and it
> is unlikely that a device supports events without supporting interrupts.

Sounds good.

> 
> ...or maybe unify all these notifications in the result from
> cxl_alloc_irq_vectors():
> 
>     rc = cxl_alloc_irq_vectors();
>     if (rc) {
>         dev_dbg(dev, "No interrupt support, interrupt-dependent features disabled.\n");
>         interrupts_supported = false;
>     }
> 
> Where dev_dbg() instead of dev_info() because the people that are
> missing features will report this debug log and upstream can say...
> "yup, there's your problem". Where users with cards that are known to
> not support interrupts do not otherwise spam the logs with info they
> know already.
> 
> I also note that cxl_request_irq() will do the right thing, so likely
> don't even need that interrupts_supported flag.

Perhaps, but devices which don't support interrupts by design (and don't
attempt to have any irq features) should be silent IMO.  Why spam the log
with that information, even if only during a debug session?

For example if a user has 2 devices, 1 broken from vendor X and 1 which
just does not do irqs from vendor Y, the above would be printed for both
devices when they are trying to debug the broken device.  Then they have
to rely on both vendors to report back.

In the case of reporting an actual error they can call vendor X and leave
vendor Y alone.

I know it is more code and you wanted the smallest possible change but I
think this is worth some code.
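
To make the distinction concrete, here is a minimal userspace sketch of the
reporting rule I have in mind. It is an illustration under assumptions, not
the reworked patch: struct irq_facts, irq_messages() and the MSG_* flags are
hypothetical names, and the two boolean inputs only model the
CXLDEV_MBOX_CAP_BG_CMD_IRQ and native_cxl_error checks from the hunks above.

```c
#include <stdbool.h>

/* Hypothetical flags naming the two messages discussed in this thread. */
#define MSG_MBOX_IRQ_NO_VECTORS	(1u << 0)
#define MSG_EVENTS_NO_VECTORS	(1u << 1)

/* Hypothetical summary of what probe learned about one device. */
struct irq_facts {
	bool vectors_allocated;	/* cxl_alloc_irq_vectors() succeeded */
	bool mbox_irq_cap;	/* models CXLDEV_MBOX_CAP_BG_CMD_IRQ being set */
	bool events_wanted;	/* models the native_cxl_error event path */
};

/*
 * Return the set of messages to print for one device.  A device with
 * no vectors and no irq-dependent features yields 0, i.e. stays silent.
 */
static unsigned int irq_messages(const struct irq_facts *f)
{
	unsigned int msgs = 0;

	if (f->vectors_allocated)
		return 0;	/* irqs work, nothing to report */

	if (f->mbox_irq_cap)
		msgs |= MSG_MBOX_IRQ_NO_VECTORS;
	if (f->events_wanted)
		msgs |= MSG_EVENTS_NO_VECTORS;

	return msgs;
}
```

With this rule the vendor X device (irq feature advertised, no vectors) gets
a message and the vendor Y device (no irq features at all) gets none.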

I'll rework this a bit and send a V1 for real review.

Ira
