Date:   Fri, 21 Oct 2022 09:49:31 +0100
From:   Jonathan Cameron <Jonathan.Cameron@...wei.com>
To:     Ira Weiny <ira.weiny@...el.com>
CC:     Davidlohr Bueso <dave@...olabs.net>, <dan.j.williams@...el.com>,
        <dave.jiang@...el.com>, <alison.schofield@...el.com>,
        <bwidawsk@...nel.org>, <vishal.l.verma@...el.com>,
        <a.manzanares@...sung.com>, <linux-cxl@...r.kernel.org>,
        <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 1/2] cxl/pci: Add generic MSI-X/MSI irq support

On Thu, 20 Oct 2022 21:18:58 -0700
Ira Weiny <ira.weiny@...el.com> wrote:

> On Thu, Oct 20, 2022 at 03:31:25PM -0700, Davidlohr Bueso wrote:
> > On Tue, 18 Oct 2022, Jonathan Cameron wrote:
> >   
> > > Reality is that it is cleaner to more or less ignore the infrastructure
> > > proposed in this patch.
> > > 
> > > 1. Query how many CPMU devices there are. Whilst there, stash the maximum
> > >   cpmu vector number in the cxlds.
> > > 2. Run a stub in this infrastructure that does max(irq, cxlds->irq_num);
> > > 3. Carry on as before.
> > > 
> > > Thus destroying the point of this infrastructure for that usecase at least
> > > and leaving an extra bit of state in the cxl_dev_state that is just
> > > to squirt a value into the callback...  
> > 
> > If it doesn't fit, then it doesn't fit.
> > 
> > However, while I was expecting pass one to be in the callback, I wasn't
> > expecting that both pass 1 and 2 shared the cpmu_regs_array. If the array
> > could be reconstructed during pass 2, then it would fit a bit better;
> > albeit the extra allocation, cycles etc., but this is probing phase, so
> > overhead isn't that important (and cpmu_count isn't big enough to matter).

I thought about that approach, but it's really ugly to have to do:

1) For the IRQ number gathering.
  a) Parse 1 to count CPMUs
  b) Parse 2 to get the register maps - grab the irq numbers and unmap them again
2) For the CPMU registration
  a) Parse 3 to count CPMUs. (We could stash the number of CPMUs from 1a, but
     that's no advantage over stashing the max irq in the current proposal.
     Both put state where it's not relevant or wanted just to make it
     available in a callback.  This way is even worse because it gets
     stashed as a side effect of a parse in a function doing something different.)
  b) Parse 4 to get the register maps and actually create the devices. Could have
     stashed this earlier as well, but same 'side effects' argument applies.

Sure, we can move to this, but only with appropriate comments on why we are
playing these games, because otherwise I suspect a future 'cleanup' would
remove the double, double pass.

All of that just to allow for an irq registration wrapper that turns a series
of straight line calls into callbacks in an array.  The straight line calls
aren't exactly complex in the first place:
// Find the CPMUs, filling in cxl_cpmu_reg_maps.

max_irq = -1;

// Each helper returns the highest irq number it needs, or a negative error.
rc = cxl_mailbox_get_irq();
if (rc < 0)
	return rc;
max_irq = max(max_irq, rc);

rc = cxl_events_get_irq();
if (rc < 0)
	return rc;
max_irq = max(max_irq, rc);

rc = cxl_cpmus_get_irq(cxl_cpmu_reg_maps);
if (rc < 0)
	return rc;
max_irq = max(max_irq, rc);

...

if (max_irq > 0) {

	pci_get...
}

// Create all the devices...
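
For anyone who wants to poke at the straight line flow outside the kernel,
here is a minimal userspace sketch of it. Everything in it is a hypothetical
stand-in: mailbox_get_irq(), events_get_irq(), cpmus_get_irq(), struct
cpmu_reg_map and count_irq_vectors() are invented for illustration and the
returned numbers are made up. It also assumes that the number of vectors to
request is the highest message number plus one, and that "no CPMUs" is
reported as 0 rather than as an error.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: what one discovery pass would record per CPMU. */
struct cpmu_reg_map {
	int irq;	/* message number read from this CPMU's registers */
};

/* Hypothetical stand-ins: highest message number used, or negative error. */
static int mailbox_get_irq(void) { return 0; }
static int events_get_irq(void) { return 3; }

/* Highest message number over the CPMUs found earlier (0 if there are none). */
static int cpmus_get_irq(const struct cpmu_reg_map *maps, size_t count)
{
	int max_irq = 0;
	size_t i;

	assert(maps != NULL || count == 0);
	for (i = 0; i < count; i++)
		if (maps[i].irq > max_irq)
			max_irq = maps[i].irq;
	return max_irq;
}

static int max_int(int a, int b)
{
	return a > b ? a : b;
}

/*
 * Straight line gathering: returns the number of vectors to request
 * (assumed to be highest message number + 1) or a negative error.
 */
static int count_irq_vectors(const struct cpmu_reg_map *maps, size_t count)
{
	int max_irq = -1;
	int rc;

	rc = mailbox_get_irq();
	if (rc < 0)
		return rc;
	max_irq = max_int(max_irq, rc);

	rc = events_get_irq();
	if (rc < 0)
		return rc;
	max_irq = max_int(max_irq, rc);

	/* The register maps come from the one and only discovery pass. */
	rc = cpmus_get_irq(maps, count);
	if (rc < 0)
		return rc;
	max_irq = max_int(max_irq, rc);

	return max_irq + 1;
}
```

The point of the sketch is that the CPMU register maps are found once and
handed to both the irq gathering and, later, the device creation, so nothing
has to be stashed or re-parsed.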


> > 
> > But if we're going to go with a free-for-all approach, can we establish
> > who goes for the initial pci_alloc_irq_vectors()? I think perhaps mbox
> > since it's the most straightforward and with least requirements, I'm
> > also unsure of the status yet to merge events and pmu, but regardless
> > they are still larger patchsets. If folks agree I can send a new mbox-only
> > patch.  
> 
> I think there needs to be some mechanism for all of the sub-device-functions to
> report their max required vectors.
> 
> I don't think that the mbox code is necessarily the code which should need to
> know about all those other sub-device-thingys.  But it could certainly take
> some 'max vectors' value that probe passed to it.
> 
> I'm still not sure how dropping this infrastructure makes Jonathan's code
> cleaner.  I still think there will need to be 2 passes over the number of
> CPMU's.
> 

Primarily because, without it, there is no need to stash anything about the
CPMUs in the cxl_device_state (option 1) or to repeat all the counting and
discovery logic twice (option 2).
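
To make option 1 concrete, here is a rough userspace sketch of what a callback
array version could look like. This is not the code from the patch: struct
dev_state, its cpmu_max_irq field, the three callbacks and
max_irq_from_callbacks() are all hypothetical names, and the returned numbers
are made up. It only illustrates that the CPMU callback has nothing to compute
and can merely hand back a value that an earlier parse left in the device
state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device state; cpmu_max_irq exists only to feed the callback. */
struct dev_state {
	int cpmu_max_irq;
};

typedef int (*get_irq_fn)(struct dev_state *ds);

/* Hypothetical callbacks: highest message number used, or negative error. */
static int mbox_irq(struct dev_state *ds) { (void)ds; return 0; }
static int event_irq(struct dev_state *ds) { (void)ds; return 3; }

/* The stub: all it can do is return what an earlier pass stashed. */
static int cpmu_irq(struct dev_state *ds) { return ds->cpmu_max_irq; }

static const get_irq_fn irq_callbacks[] = { mbox_irq, event_irq, cpmu_irq };

/* Walk the callback array and return the highest message number seen. */
static int max_irq_from_callbacks(struct dev_state *ds)
{
	int max_irq = -1;
	size_t i;

	for (i = 0; i < sizeof(irq_callbacks) / sizeof(irq_callbacks[0]); i++) {
		int rc = irq_callbacks[i](ds);

		if (rc < 0)
			return rc;
		if (rc > max_irq)
			max_irq = rc;
	}
	return max_irq;
}
```

The loop itself is fine. The cost is the extra field in the state structure
and the hidden ordering requirement that CPMU discovery must already have
filled it in before the callbacks run.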

I can live with it (it's what we have to do in pcie port for the equivalent)
but the wrapped up version feels like a false optimization.

It saves a few lines of code but adds a bunch of complexity elsewhere that, to
me, outweighs that saving.

If people are convinced this is the way to go then fair enough, but be prepared
for the ugly corners!

Jonathan

> Ira
> 
> > 
> > Thanks,
> > Davidlohr  
> 
