linux-kernel - Re: [PATCH 08/11] cxl/mem: Wire up event interrupts

Message-ID: <Y4ceXGYg8MXzZCwP@iweiny-desk3>
Date:   Wed, 30 Nov 2022 01:11:56 -0800
From:   Ira Weiny <ira.weiny@...el.com>
To:     Jonathan Cameron <Jonathan.Cameron@...wei.com>
CC:     Dan Williams <dan.j.williams@...el.com>,
        Alison Schofield <alison.schofield@...el.com>,
        Vishal Verma <vishal.l.verma@...el.com>,
        "Ben Widawsky" <bwidawsk@...nel.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        Davidlohr Bueso <dave@...olabs.net>,
        <linux-kernel@...r.kernel.org>, <linux-cxl@...r.kernel.org>
Subject: Re: [PATCH 08/11] cxl/mem: Wire up event interrupts

On Wed, Nov 16, 2022 at 02:40:21PM +0000, Jonathan Cameron wrote:
> On Thu, 10 Nov 2022 10:57:55 -0800
> ira.weiny@...el.com wrote:
> 
> > From: Ira Weiny <ira.weiny@...el.com>
> > 
> > CXL device events are signaled via interrupts.  Each event log may have
> > a different interrupt message number.  These message numbers are
> > reported in the Get Event Interrupt Policy mailbox command.
> > 
> > Add interrupt support for event logs.  Interrupts are allocated as
> > shared interrupts.  Therefore, all or some event logs can share the same
> > message number.
> > 
> > The driver must deal with the possibility that dynamic capacity is not
> > yet supported by a device it sees.  Fallback and retry without dynamic
> > capacity if the first attempt fails.
> > 
> > Device capacity event logs interrupt as part of the informational event
> > log.  Check the event status to see which log has data.
> > 
> > Signed-off-by: Ira Weiny <ira.weiny@...el.com>
> > 
> Hi Ira,
> 
> A few comments inline.

Thanks for the review!

> 
> Thanks,
> 
> Jonathan
> 
> > diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> > index 879b228a98a0..1e6762af2a00 100644
> > --- a/drivers/cxl/core/mbox.c
> > +++ b/drivers/cxl/core/mbox.c
> 
> >  /**
> >   * cxl_mem_get_event_records - Get Event Records from the device
> > @@ -867,6 +870,52 @@ void cxl_mem_get_event_records(struct cxl_dev_state *cxlds)
> >  }
> >  EXPORT_SYMBOL_NS_GPL(cxl_mem_get_event_records, CXL);
> >  
> > +int cxl_event_config_msgnums(struct cxl_dev_state *cxlds)
> > +{
> > +	struct cxl_event_interrupt_policy *policy = &cxlds->evt_int_policy;
> > +	size_t policy_size = sizeof(*policy);
> > +	bool retry = true;
> > +	int rc;
> > +
> > +	policy->info_settings = CXL_INT_MSI_MSIX;
> > +	policy->warn_settings = CXL_INT_MSI_MSIX;
> > +	policy->failure_settings = CXL_INT_MSI_MSIX;
> > +	policy->fatal_settings = CXL_INT_MSI_MSIX;
> > +	policy->dyn_cap_settings = CXL_INT_MSI_MSIX;
> > +
> > +again:
> > +	rc = cxl_mbox_send_cmd(cxlds, CXL_MBOX_OP_SET_EVT_INT_POLICY,
> > +			       policy, policy_size, NULL, 0);
> > +	if (rc < 0) {
> > +		/*
> > +		 * If the device does not support dynamic capacity it may fail
> > +		 * the command due to an invalid payload.  Retry without
> > +		 * dynamic capacity.
> > +		 */
> 
> There are a number of ways to discover if DCD is supported that aren't based
> on try and retry like this. 9.13.3 has "basic sequence to utilize Dynamic Capacity"
> That calls out:
> Verify the necessary Dynamic Capacity commands are returned in the CEL.
> 
> First I'm not sure we should set the interrupt on for DCD until we have a lot
> more of the flow handled, secondly even then we should figure out if it is supported
> at a higher level than this command and pass that info down here.

I'm not sure I really agree.  The events are just traced.  I think this
functionality is really orthogonal to whether any other support for DCD is
there.

Regardless, as I said in the call, I think deferring this is the right way to
go for now.

> 
> 
> > +		if (retry) {
> > +			retry = false;
> > +			policy->dyn_cap_settings = 0;
> > +			policy_size = sizeof(*policy) - sizeof(policy->dyn_cap_settings);
> > +			goto again;
> > +		}
> > +		dev_err(cxlds->dev, "Failed to set event interrupt policy : %d",
> > +			rc);
> > +		memset(policy, CXL_INT_NONE, sizeof(*policy));
> 
> Relying on all the fields being 1 byte is a bit error prone. I'd just set them all
> individually in the interests of more readable code.

Done.
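
For illustration only, a minimal userspace sketch of what setting the fields
individually could look like.  The struct layout and the enum values are
assumptions that mirror the names in the patch, and cxl_event_policy_clear()
is a hypothetical helper, not something taken from the series.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values; only the names appear in the patch. */
enum cxl_event_int_mode {
	CXL_INT_NONE     = 0x00,
	CXL_INT_MSI_MSIX = 0x01,
};

/* Hypothetical stand-in for the policy payload: one settings byte per log. */
struct cxl_event_interrupt_policy {
	uint8_t info_settings;
	uint8_t warn_settings;
	uint8_t failure_settings;
	uint8_t fatal_settings;
	uint8_t dyn_cap_settings;
};

/*
 * Clear every log's setting by name.  Unlike memset(), this keeps working
 * if a field is ever widened or CXL_INT_NONE stops being a single byte.
 */
static void cxl_event_policy_clear(struct cxl_event_interrupt_policy *policy)
{
	assert(policy);

	policy->info_settings = CXL_INT_NONE;
	policy->warn_settings = CXL_INT_NONE;
	policy->failure_settings = CXL_INT_NONE;
	policy->fatal_settings = CXL_INT_NONE;
	policy->dyn_cap_settings = CXL_INT_NONE;
}
```

The explicit assignments cost a few more lines but make the intent readable
without knowing the size of each field.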

> 
> > +		return rc;
> > +	}
> > +
> > +	rc = cxl_mbox_send_cmd(cxlds, CXL_MBOX_OP_GET_EVT_INT_POLICY, NULL, 0,
> > +			       policy, policy_size);
> 
> Add a comment on why you are reading this back (to get the msgnums in the upper
> bits) as it's not obvious to a casual reader.

Done.
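
For readers unfamiliar with the layout, a rough userspace sketch of how a
settings byte read back from the device might be decoded.  The mask values and
the 4-bit message number field are assumptions on my part (the review above
only says the msgnums are in the upper bits), and cxl_event_msi_vector() is a
hypothetical helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed encoding: mode in the low bits, message number in the upper bits. */
#define CXL_INT_MSI_MSIX		0x01
#define CXL_EVENT_INT_MODE_MASK		0x3
#define CXL_EVENT_INT_MSGNUM(setting)	(((setting) & 0xf0) >> 4)

/* True when the device reports MSI/MSI-X as the interrupt mode for a log. */
static bool cxl_evt_int_is_msi(uint8_t setting)
{
	return (setting & CXL_EVENT_INT_MODE_MASK) == CXL_INT_MSI_MSIX;
}

/*
 * Hypothetical helper: the vector index to request for a log.  Only
 * meaningful once the policy has been read back from the device, because
 * the device, not the driver, fills in the message number.
 */
static unsigned int cxl_event_msi_vector(uint8_t setting)
{
	assert(cxl_evt_int_is_msi(setting));
	return CXL_EVENT_INT_MSGNUM(setting);
}
```

Under these assumptions a setting of 0x31 decodes as MSI/MSI-X mode with
message number 3.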

> 
> > +	if (rc < 0) {
> > +		dev_err(cxlds->dev, "Failed to get event interrupt policy : %d",
> > +			rc);
> > +		return rc;
> > +	}
> > +
> > +	return 0;
> > +}
> > +EXPORT_SYMBOL_NS_GPL(cxl_event_config_msgnums, CXL);
> > +
> 
> ...
> 
> > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > index e0d511575b45..64b2e2671043 100644
> > --- a/drivers/cxl/pci.c
> > +++ b/drivers/cxl/pci.c
> > @@ -458,6 +458,138 @@ static void cxl_pci_alloc_irq_vectors(struct cxl_dev_state *cxlds)
> >  	cxlds->nr_irq_vecs = nvecs;
> >  }
> >  
> > +struct cxl_event_irq_id {
> > +	struct cxl_dev_state *cxlds;
> > +	u32 status;
> > +	unsigned int msgnum;
> msgnum is only here for freeing the interrupt - I'd rather we fixed
> that by using standard infrastructure (or adding some - see below).
> 
> status is an indirect way of allowing us to share an interrupt handler.
> You could do that by registering a trivial wrapper for each instead.
> Then all you have left is the cxl_dev_state which could be passed
> in directly as the callback parameter removing need to have this
> structure at all.  I think that might be neater.

It does avoid the allocation of this structure, which I like.

I've made the change.
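
A rough userspace sketch of the wrapper-per-log idea, not the actual updated
patch.  irqreturn_t, struct cxl_dev_state and cxl_mem_get_records_log() are
simplified stand-ins, and the four wrapper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel types used by the patch. */
typedef enum { IRQ_NONE = 0, IRQ_HANDLED = 1 } irqreturn_t;

enum cxl_event_log_type {
	CXL_EVENT_TYPE_INFO,
	CXL_EVENT_TYPE_WARN,
	CXL_EVENT_TYPE_FAIL,
	CXL_EVENT_TYPE_FATAL,
};

struct cxl_dev_state {
	unsigned int reads[4];	/* per-log read counter, for illustration */
};

/* Stand-in for the real log reader: just records that the log was read. */
static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
				    enum cxl_event_log_type type)
{
	cxlds->reads[type]++;
}

/*
 * One trivial wrapper per log.  The callback cookie is the cxl_dev_state
 * itself, so no per-interrupt id structure needs to be allocated.
 */
static irqreturn_t cxl_event_info_thread(int irq, void *id)
{
	(void)irq;
	assert(id);
	cxl_mem_get_records_log(id, CXL_EVENT_TYPE_INFO);
	return IRQ_HANDLED;
}

static irqreturn_t cxl_event_warn_thread(int irq, void *id)
{
	(void)irq;
	assert(id);
	cxl_mem_get_records_log(id, CXL_EVENT_TYPE_WARN);
	return IRQ_HANDLED;
}

static irqreturn_t cxl_event_fail_thread(int irq, void *id)
{
	(void)irq;
	assert(id);
	cxl_mem_get_records_log(id, CXL_EVENT_TYPE_FAIL);
	return IRQ_HANDLED;
}

static irqreturn_t cxl_event_fatal_thread(int irq, void *id)
{
	(void)irq;
	assert(id);
	cxl_mem_get_records_log(id, CXL_EVENT_TYPE_FATAL);
	return IRQ_HANDLED;
}
```

The log type is encoded in which function gets registered, so the status field
from the original structure is no longer needed.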

> 
> > +};
> > +
> > +static irqreturn_t cxl_event_int_thread(int irq, void *id)
> > +{
> > +	struct cxl_event_irq_id *cxlid = id;
> > +	struct cxl_dev_state *cxlds = cxlid->cxlds;
> > +
> > +	if (cxlid->status & CXLDEV_EVENT_STATUS_INFO)
> > +		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_INFO);
> > +	if (cxlid->status & CXLDEV_EVENT_STATUS_WARN)
> > +		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_WARN);
> > +	if (cxlid->status & CXLDEV_EVENT_STATUS_FAIL)
> > +		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_FAIL);
> > +	if (cxlid->status & CXLDEV_EVENT_STATUS_FATAL)
> > +		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_FATAL);
> > +	if (cxlid->status & CXLDEV_EVENT_STATUS_DYNAMIC_CAP)
> > +		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_DYNAMIC_CAP);
> > +
> > +	return IRQ_HANDLED;
> > +}
> > +
> > +static irqreturn_t cxl_event_int_handler(int irq, void *id)
> > +{
> > +	struct cxl_event_irq_id *cxlid = id;
> > +	struct cxl_dev_state *cxlds = cxlid->cxlds;
> > +	u32 status = readl(cxlds->regs.status + CXLDEV_DEV_EVENT_STATUS_OFFSET);
> > +
> > +	if (cxlid->status & status)
> > +		return IRQ_WAKE_THREAD;
> > +	return IRQ_HANDLED;
> 
> If status not set IRQ_NONE.
> Ah. I see Dave raised this as well.

Yep done.
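
The return value matters because the interrupts are shared: IRQ_NONE tells the
core that this handler did not own the interrupt.  A minimal userspace sketch
of the decision, with the register read replaced by a plain argument; the bit
positions are assumed and cxl_event_check_status() is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Same numeric values as the kernel's irqreturn_t. */
typedef enum {
	IRQ_NONE = 0,
	IRQ_HANDLED = 1,
	IRQ_WAKE_THREAD = 2,
} irqreturn_t;

/* Assumed bit positions; the patch only shows the names. */
#define CXLDEV_EVENT_STATUS_INFO	(1u << 0)
#define CXLDEV_EVENT_STATUS_WARN	(1u << 1)

/*
 * Hard-irq side of a shared, threaded interrupt: wake the thread only when
 * the event status shows data for the log this handler was registered for,
 * otherwise report that the interrupt was not ours.
 */
static irqreturn_t cxl_event_check_status(uint32_t log_mask, uint32_t status)
{
	assert(log_mask != 0);

	if (status & log_mask)
		return IRQ_WAKE_THREAD;
	return IRQ_NONE;
}
```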

> 
> > +}
> 
> ...
> 
> > +static int cxl_request_event_irq(struct cxl_dev_state *cxlds,
> > +				 enum cxl_event_log_type log_type,
> > +				 u8 setting)
> > +{
> > +	struct device *dev = cxlds->dev;
> > +	struct pci_dev *pdev = to_pci_dev(dev);
> > +	struct cxl_event_irq_id *id;
> > +	unsigned int msgnum = CXL_EVENT_INT_MSGNUM(setting);
> > +	int irq;
> > +
> > +	/* Disabled irq is not an error */
> > +	if (!cxl_evt_int_is_msi(setting) || msgnum > cxlds->nr_irq_vecs) {
> 
> I don't think that second condition can occur.  The language under table 8-52
> (I think) means that it will move around if there aren't enough vectors
> (for MSI - MSI-X is more complex, but result the same).

Based on the other review, this is just a bool msi_enabled, which is used to
determine whether this should be set up at all.

> 
> > +		dev_dbg(dev, "Event interrupt not enabled; %s %u %d\n",
> > +			cxl_event_log_type_str(CXL_EVENT_TYPE_INFO),
> > +			msgnum, cxlds->nr_irq_vecs);
> > +		return 0;
> > +	}
> > +
> > +	id = devm_kzalloc(dev, sizeof(*id), GFP_KERNEL);
> > +	if (!id)
> > +		return -ENOMEM;
> > +
> > +	id->cxlds = cxlds;
> > +	id->msgnum = msgnum;
> > +	id->status = log_type_to_status(log_type);
> > +
> > +	irq = pci_request_irq(pdev, id->msgnum, cxl_event_int_handler,
> > +			      cxl_event_int_thread, id,
> > +			      "%s:event-log-%s", dev_name(dev),
> > +			      cxl_event_log_type_str(log_type));
> > +	if (irq)
> > +		return irq;
> > +
> > +	devm_add_action_or_reset(dev, cxl_free_event_irq, id);
> 
> Hmm. no pcim_request_irq()  maybe this is the time to propose one
> (separate from this patch so we don't get delayed by that!)

Perhaps.  But not tonight...  ;-)

> 
> We discussed this way back in DOE series (I'd forgotten but lore found
> it for me).  There I suggested just calling
> devm_request_threaded_irq() directly as a work around.

Yeah, that works fine.  One issue is that we lose the formatted printing of the
irq name:

...
 29:  ...  PCI-MSI 100663300-edge      0000:c0:00.0:event-log-Fatal
 30:  ...  PCI-MSI 100663301-edge      0000:c0:00.0:event-log-Failure
 31:  ...  PCI-MSI 100663302-edge      0000:c0:00.0:event-log-Warning
 32:  ...  PCI-MSI 100663303-edge      0000:c0:00.0:event-log-Informational
...
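
One possible way to keep names like the ones above would be to format the
string first and hand the result to devm_request_threaded_irq(); in the kernel
devm_kasprintf() could do the formatting.  The block below is only a userspace
analogue of that formatting step, with malloc() standing in for the devm
allocation and event_irq_name() being a hypothetical helper.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Build "<device>:event-log-<log>" in a freshly allocated buffer.
 * Returns NULL on allocation failure; the caller frees the result.
 */
static char *event_irq_name(const char *dev_name, const char *log_name)
{
	static const char sep[] = ":event-log-";
	size_t len;
	char *name;

	assert(dev_name && log_name);

	len = strlen(dev_name) + strlen(sep) + strlen(log_name) + 1;
	name = malloc(len);
	if (!name)
		return NULL;

	snprintf(name, len, "%s%s%s", dev_name, sep, log_name);
	return name;
}
```

With a device-managed allocation the name would live as long as the device
binding, so the irq core could keep pointing at it.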

Thanks,
Ira

> 
> > +	return 0;
> > +}
> > +
> > +static void cxl_event_irqsetup(struct cxl_dev_state *cxlds)
> > +{
> > +	struct device *dev = cxlds->dev;
> > +	u8 setting;
> > +
> > +	if (cxl_event_config_msgnums(cxlds))
> > +		return;
> > +
> > +	/*
> > +	 * Dynamic Capacity shares the info message number
> > +	 * Nothing to be done except check the status bit in the
> > +	 * irq thread.
> > +	 */
> > +	setting = cxlds->evt_int_policy.info_settings;
> > +	if (cxl_request_event_irq(cxlds, CXL_EVENT_TYPE_INFO, setting))
> > +		dev_err(dev, "Failed to get interrupt for %s event log\n",
> > +			cxl_event_log_type_str(CXL_EVENT_TYPE_INFO));
> > +
> > +	setting = cxlds->evt_int_policy.warn_settings;
> > +	if (cxl_request_event_irq(cxlds, CXL_EVENT_TYPE_WARN, setting))
> > +		dev_err(dev, "Failed to get interrupt for %s event log\n",
> > +			cxl_event_log_type_str(CXL_EVENT_TYPE_WARN));
> > +
> > +	setting = cxlds->evt_int_policy.failure_settings;
> > +	if (cxl_request_event_irq(cxlds, CXL_EVENT_TYPE_FAIL, setting))
> > +		dev_err(dev, "Failed to get interrupt for %s event log\n",
> > +			cxl_event_log_type_str(CXL_EVENT_TYPE_FAIL));
> > +
> > +	setting = cxlds->evt_int_policy.fatal_settings;
> > +	if (cxl_request_event_irq(cxlds, CXL_EVENT_TYPE_FATAL, setting))
> > +		dev_err(dev, "Failed to get interrupt for %s event log\n",
> > +			cxl_event_log_type_str(CXL_EVENT_TYPE_FATAL));
> > +}
>