linux-kernel - Re: [PATCH 1/2] cxl/pci: Add generic MSI-X/MSI irq support

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Y1IcpaodrrVrkEcL@iweiny-desk3>
Date:   Thu, 20 Oct 2022 21:14:29 -0700
From:   Ira Weiny <ira.weiny@...el.com>
To:     Jonathan Cameron <Jonathan.Cameron@...wei.com>
CC:     Davidlohr Bueso <dave@...olabs.net>, <dan.j.williams@...el.com>,
        <dave.jiang@...el.com>, <alison.schofield@...el.com>,
        <bwidawsk@...nel.org>, <vishal.l.verma@...el.com>,
        <a.manzanares@...sung.com>, <linux-cxl@...r.kernel.org>,
        <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 1/2] cxl/pci: Add generic MSI-X/MSI irq support

On Tue, Oct 18, 2022 at 11:52:27AM +0100, Jonathan Cameron wrote:
> On Tue, 18 Oct 2022 10:36:19 +0100
> Jonathan Cameron <Jonathan.Cameron@...wei.com> wrote:
> 
> > On Mon, 17 Oct 2022 20:00:09 -0700
> > Davidlohr Bueso <dave@...olabs.net> wrote:
> > 
> > > Introduce a generic irq table for CXL components/features that can have
> > > standard irq support - DOE requires dynamic vector sizing and is not
> > > considered here. For now the table is empty.
> > > 
> > > Create an infrastructure to query the max vectors required for the CXL
> > > device. Upon successful allocation, users can plug in their respective isr
> > > at any point thereafter, which is supported by a new cxlds->has_irq flag,
> > > for example, if the irq setup is not done in the PCI driver, such as
> > > the case of the CXL-PMU.
> > > 
> > > Reviewed-by: Dave Jiang <dave.jiang@...el.com>
> > > Signed-off-by: Davidlohr Bueso <dave@...olabs.net>  
> > 
> > A few nitpicks inline.
> > 
> > With the comment one tidied up (other one optional)
> > Reviewed-by: Jonathan Cameron <Jonathan.Cameron@...wei.com>
> > 
> > I'll rebase my cpmu code on top of this shortly.
> Hi Davidlohr,
> 
> Doing the CPMU rebase has shown up that using this generic infrastructure
> ends up rather ugly.
> 
> Previously I had a local array to manage the required register maps
> that was then freed.  Now I have to move that into the cxl device state
> just so I can get at it from the irq finding callback.
> 
> So I have an extra step to be able to use this generic framework.
> 
> 1. Query how many CPMU devices there are.  Stash that and register map
>    info in cxlds.  I could do this in the callback but that's really really
>    horrible layering issue as most of what is done has nothing to do
>    with finding the vector numbers.

FWIW I did this for the event stuff and did not find it so distasteful...  :-/

However the information I am stashing in the cxlds is all interrupt
information.  So I think it is different from what I see in the CPMU stuff.

> 2. The callback below to find those numbers 
> 3. Registration of the cpmu devices.
> 
> Reality is that it is cleaner to more or less ignore the infrastructure
> proposed in this patch.
> 
> 1. Query how many CPMU devices there are. Whilst there stash the maximim
>    cpmu vector number in the cxlds.
> 2. Run a stub in this infrastructure that does max(irq, cxlds->irq_num);
> 3. Carry on as before.
> 
> Thus destroying the point of this infrastructure for that usecase at least
> and leaving an extra bit of state in the cxl_dev_state that is just
> to squirt a value into the callback...

I'm not sure I follow?  Do you mean this?

static int cxl_cpmu_get_max_msgnum(struct cxl_dev_state *cxlds)
{
	return cxlds->cpmu_max_vector;
}  

> 
> So with that in mind I'm withdrawing the RB above.  This looks to be
> an idea that with hindsight doesn't necessarily pan out.
> Long hand equivalent with the specific handling needed for each case
> is probably going to be neater than walking a table of much more
> restricted callbacks.

I'm not married to the idea of the array of callbacks but I'm not sure how this
solves having to iterate on the CPMU devices twice?

Ira

> Maybe there is a nice way to fit the CPMU
> registration into this infrastructure, but I'm not immediately seeing it.
> 
> One other note inline via a compiler warning.
> 
> Jonathan
> 
> > 
> > Jonathan
> > 
> > 
> > > ---
> > >  drivers/cxl/cxlmem.h |  3 ++
> > >  drivers/cxl/pci.c    | 72 ++++++++++++++++++++++++++++++++++++++++++++
> > >  2 files changed, 75 insertions(+)
> > > 
> > > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> > > index 88e3a8e54b6a..72b69b003302 100644
> > > --- a/drivers/cxl/cxlmem.h
> > > +++ b/drivers/cxl/cxlmem.h
> > > @@ -211,6 +211,7 @@ struct cxl_endpoint_dvsec_info {
> > >   * @info: Cached DVSEC information about the device.
> > >   * @serial: PCIe Device Serial Number
> > >   * @doe_mbs: PCI DOE mailbox array
> > > + * @has_irq: PCIe MSI-X/MSI support
> > >   * @mbox_send: @dev specific transport for transmitting mailbox commands
> > >   *
> > >   * See section 8.2.9.5.2 Capacity Configuration and Label Storage for
> > > @@ -247,6 +248,8 @@ struct cxl_dev_state {
> > >  
> > >  	struct xarray doe_mbs;
> > >  
> > > +	bool has_irq;
> > > +
> > >  	int (*mbox_send)(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
> > >  };
> > >  
> > > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > > index faeb5d9d7a7a..9c3e95ebaa26 100644
> > > --- a/drivers/cxl/pci.c
> > > +++ b/drivers/cxl/pci.c
> > > @@ -428,6 +428,73 @@ static void devm_cxl_pci_create_doe(struct cxl_dev_state *cxlds)
> > >  	}
> > >  }
> > >  
> > > +/**
> > > + * struct cxl_irq_cap - CXL feature that is capable of receiving MSI-X/MSI irqs.
> > > + *
> > > + * @name: Name of the device/component generating this interrupt.
> > > + * @get_max_msgnum: Get the feature's largest interrupt message number.  If the
> > > + *		    feature does not have the Interrupt Supported bit set, then
> > > + *		    return -1.
> > > + */
> > > +struct cxl_irq_cap {
> > > +	const char *name;
> > > +	int (*get_max_msgnum)(struct cxl_dev_state *cxlds);
> > > +};
> > > +
> > > +static const struct cxl_irq_cap cxl_irq_cap_table[] = {
> > > +	NULL
> 
> That's not valid, just make it empty instead.
> 
> 
> > > +};
> > > +
> > > +static void cxl_pci_free_irq_vectors(void *data)
> > > +{
> > > +	pci_free_irq_vectors(data);
> > > +}
> > > +
> > > +/*
> > > + * Attempt to allocate the largest amount of necessary vectors.
> > > + *
> > > + * Returns 0 upon a successful allocation of *all* vectors, or a  
> > 
> > Technically not all vectors.  If we wanted to do that we could
> > just directly query that via pci_msix_vec_count() etc that gets
> > it from the MSIX capability. That's frowned upon because it's common
> > to stick lots of extra vectors on the end for stuff that linux never
> > cares about (debug etc, or optional features).
> > 
> > All vectors up to the maximum one the code uses would be more accurate.
> > 
> > > + * negative value otherwise.
> > > + */
> > > +static int cxl_pci_alloc_irq_vectors(struct cxl_dev_state *cxlds)
> > > +{
> > > +	struct device *dev = cxlds->dev;
> > > +	struct pci_dev *pdev = to_pci_dev(dev);
> > > +	int rc, i, vectors = -1;
> > > +
> > > +	for (i = 0; i < ARRAY_SIZE(cxl_irq_cap_table); i++) {
> > > +		int irq;
> > > +
> > > +		if (!cxl_irq_cap_table[i].get_max_msgnum)
> > > +			continue;
> > > +
> > > +		irq = cxl_irq_cap_table[i].get_max_msgnum(cxlds);
> > > +		vectors = max_t(int, irq, vectors);
> > > +	}
> > > +
> > > +	/*
> > > +	 * Semantically lack of irq support is not an error, but we
> > > +	 * still fail to allocate, so return negative.
> > > +	 */
> > > +	if (vectors == -1)
> > > +		return -1;
> > > +
> > > +	vectors++;
> > > +	rc = pci_alloc_irq_vectors(pdev, vectors, vectors,
> > > +				   PCI_IRQ_MSIX | PCI_IRQ_MSI);
> > > +	if (rc < 0)
> > > +		return rc;
> > > +
> > > +	if (rc != vectors) {
> > > +		dev_dbg(dev, "Not enough interrupts; use polling instead.\n");
> > > +		/* some got allocated, clean them up */
> > > +		cxl_pci_free_irq_vectors(pdev);
> > > +		return -ENOSPC;
> > > +	}
> > > +
> > > +	return devm_add_action_or_reset(dev, cxl_pci_free_irq_vectors, pdev);
> > > +}
> > > +
> > >  static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> > >  {
> > >  	struct cxl_register_map map;
> > > @@ -494,6 +561,11 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> > >  	if (rc)
> > >  		return rc;
> > >  
> > > +	if (!cxl_pci_alloc_irq_vectors(cxlds)) {
> > > +		cxlds->has_irq = true;
> > > +	} else
> > > +		cxlds->has_irq = false;
> > > +  
> > 	cxlds->has_irq = !(cxl_pci_aloc_irq_vectors(cxlds) < 0);
> > 
> > maybe...
> > 
> > >  	cxlmd = devm_cxl_add_memdev(cxlds);
> > >  	if (IS_ERR(cxlmd))
> > >  		return PTR_ERR(cxlmd);  
> > 
>