Message-ID: <20221021162055.nuxvfdrfhv42nlim@offworld>
Date:   Fri, 21 Oct 2022 09:20:55 -0700
From:   Davidlohr Bueso <dave@...olabs.net>
To:     Jonathan Cameron <Jonathan.Cameron@...wei.com>
Cc:     Ira Weiny <ira.weiny@...el.com>, dan.j.williams@...el.com,
        dave.jiang@...el.com, alison.schofield@...el.com,
        bwidawsk@...nel.org, vishal.l.verma@...el.com,
        a.manzanares@...sung.com, linux-cxl@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/2] cxl/pci: Add generic MSI-X/MSI irq support

On Fri, 21 Oct 2022, Jonathan Cameron wrote:

>On Thu, 20 Oct 2022 21:18:58 -0700
>Ira Weiny <ira.weiny@...el.com> wrote:
>
>> On Thu, Oct 20, 2022 at 03:31:25PM -0700, Davidlohr Bueso wrote:
>> > On Tue, 18 Oct 2022, Jonathan Cameron wrote:
>> >
>> > > Reality is that it is cleaner to more or less ignore the infrastructure
>> > > proposed in this patch.
>> > >
>> > > 1. Query how many CPMU devices there are. Whilst there stash the maximum
>> > >   cpmu vector number in the cxlds.
>> > > 2. Run a stub in this infrastructure that does max(irq, cxlds->irq_num);
>> > > 3. Carry on as before.
>> > >
>> > > Thus destroying the point of this infrastructure for that usecase at least
>> > > and leaving an extra bit of state in the cxl_dev_state that is just
>> > > to squirt a value into the callback...
>> >
>> > If it doesn't fit, then it doesn't fit.
>> >
>> > However, while I was expecting pass one to be in the callback, I wasn't
>> > expecting that both pass 1 and 2 shared the cpmu_regs_array. If the array
>> > could be reconstructed during pass 2, then it would fit a bit better;
>> > albeit the extra allocation, cycles etc., but this is probing phase, so
>> > overhead isn't that important (and cpmu_count isn't big enough to matter).
>
>I thought about that approach, but it's really ugly to have to do
>
>1) For the IRQ number gathering.
>  a) Parse 1 to count CPMUs
>  b) Parse 2 to get the register maps - grab the irq numbers and unmap them again
>2) For the CPMU registration
>  a) Parse 3 to count CPMUs (we could stash the number of CPMUs from 1a) but
>     that's no advantage over stashing the max irq in current proposal.
>     Both are putting state where it's not relevant or wanted just to make it
>     available in a callback.  This way is even worse because it's getting
>     stashed as a side effect of a parse in a function doing something different.
>  b) Parse 4 to get the register maps and actually create the devices. Could have
>     stashed this earlier as well, but same 'side effects' argument applies.
>
>Sure, we can move to this, but with appropriate comments on why we are playing
>these games, because otherwise I suspect a future 'cleanup' would remove the
>double, double pass.
>
>To allow for an irq registration wrapper that turns a series of straight
>line calls into callbacks in an array.  The straight line calls aren't exactly
>complex in the first place.
>//find cpmu filling in cxl_cpmu_reg_maps.
>
>max_irq = -1
>rc = cxl_mailbox_get_irq()
>if (rc < 0)
>	return rc;
>max_irq = max(max_irq, rc);
>
>rc = cxl_events_get_irq()
>if (rc < 0)
>	return rc;
>max_irq = max(max_irq, rc);
>
>rc = cxl_cpmus_get_irq(cxl_cpmu_reg_maps);
>if (rc < 0)
>	return rc;
>max_irq = max(max_irq, rc);
>
>...
>
>if (max_irq >= 0) {
>
>	pci_get...
>}
>
>//create all the devices...

Yes, this was sort of what I pictured if we go this way. It doesn't make
my eyes sore.
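
For reference, the straight-line flow quoted above can be sketched as
plain userspace C. This is only an illustration under stated assumptions:
struct fake_dev and every fake_* helper are hypothetical stand-ins, not
the cxl driver's API, and the hardware queries are replaced by values
read from the struct.

```c
#include <stddef.h>

/* Hypothetical stand-in for the device; not a kernel structure. */
struct fake_dev {
	int mbox_irq;          /* vector index the mailbox reports */
	int events_irq;        /* vector index the event logs report */
	const int *cpmu_irqs;  /* one vector index per discovered CPMU */
	size_t cpmu_count;
};

static int max_int(int a, int b)
{
	return a > b ? a : b;
}

/* Each helper returns a vector index >= 0, or a negative error code. */
static int fake_mailbox_get_irq(const struct fake_dev *dev)
{
	return dev->mbox_irq;
}

static int fake_events_get_irq(const struct fake_dev *dev)
{
	return dev->events_irq;
}

static int fake_cpmus_get_irq(const struct fake_dev *dev)
{
	int max_irq = -1;

	for (size_t i = 0; i < dev->cpmu_count; i++) {
		if (dev->cpmu_irqs[i] < 0)
			return dev->cpmu_irqs[i];	/* propagate the error */
		max_irq = max_int(max_irq, dev->cpmu_irqs[i]);
	}
	return max_irq;
}

/* Returns the number of vectors to request, or a negative error code. */
static int fake_vectors_needed(const struct fake_dev *dev)
{
	int max_irq = -1;
	int rc;

	rc = fake_mailbox_get_irq(dev);
	if (rc < 0)
		return rc;
	max_irq = max_int(max_irq, rc);

	rc = fake_events_get_irq(dev);
	if (rc < 0)
		return rc;
	max_irq = max_int(max_irq, rc);

	/* Skip the CPMU query when none were found, so that "no CPMUs"
	 * is not confused with an error. */
	if (dev->cpmu_count > 0) {
		rc = fake_cpmus_get_irq(dev);
		if (rc < 0)
			return rc;
		max_irq = max_int(max_irq, rc);
	}

	/* Vector indices are zero-based, so the count is max + 1. */
	return max_irq + 1;
}
```

In the driver, the result would presumably feed the single
pci_alloc_irq_vectors() call before any of the devices are created.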

>
>> >
>> > But if we're going to go with a free-for-all approach, can we establish
>> > who goes for the initial pci_alloc_irq_vectors()? I think perhaps mbox
>> > since it's the most straightforward and has the fewest requirements. I'm
>> > also unsure of the merge status of events and pmu, but regardless
>> > they are still larger patchsets. If folks agree I can send a new mbox-only
>> > patch.
>>
>> I think there needs to be some mechanism for all of the sub-device-functions to
>> report their max required vectors.
>>
>> I don't think that the mbox code is necessarily the code which should need to
>> know about all those other sub-device-thingys.  But it could certainly take
>> some 'max vectors' value that probe passed to it.
>>
>> I'm still not sure how dropping this infrastructure makes Jonathan's code
>> cleaner.  I still think there will need to be 2 passes over the number of
>> CPMUs.
>>
>
>Primarily that there is no need to stash anything about the CPMUs in the
>cxl_device_state (option 1) or repeat all the counting and discovery logic twice
>(option 2).
>
>I can live with it (it's what we have to do in pcie port for the equivalent)
>but the wrapped up version feels like a false optimization.
>
>Saves a few lines of code and adds a bunch of complexity elsewhere that looks to
>me to outweigh that saving.

Yeah it's hard to justify the extra complexity here when the alternative isn't
even that bad.
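
For comparison, here is a rough sketch of the callback-array shape being
argued against. Again this is only an assumption-laden illustration:
struct fake_cxlds, the fake_* callbacks and the cpmu_max_irq field are
hypothetical names, not what the patch actually defines. Note the extra
field in the device state that exists only so the CPMU callback has
something to return.

```c
#include <stddef.h>

/* Hypothetical device state; not the real cxl_dev_state. */
struct fake_cxlds {
	int mbox_irq;
	int events_irq;
	int cpmu_max_irq;	/* stashed during CPMU discovery, 0 if none */
};

/* Each callback returns a vector index >= 0, or a negative error code. */
typedef int (*fake_irq_query_fn)(const struct fake_cxlds *cxlds);

static int fake_mbox_query(const struct fake_cxlds *cxlds)
{
	return cxlds->mbox_irq;
}

static int fake_events_query(const struct fake_cxlds *cxlds)
{
	return cxlds->events_irq;
}

/* Stub that only hands back the value stashed earlier. */
static int fake_cpmu_query(const struct fake_cxlds *cxlds)
{
	return cxlds->cpmu_max_irq;
}

static const fake_irq_query_fn fake_irq_queries[] = {
	fake_mbox_query,
	fake_events_query,
	fake_cpmu_query,
};

/* Returns the number of vectors to request, or a negative error code. */
static int fake_vectors_needed_cb(const struct fake_cxlds *cxlds)
{
	size_t n = sizeof(fake_irq_queries) / sizeof(fake_irq_queries[0]);
	int max_irq = -1;

	for (size_t i = 0; i < n; i++) {
		int rc = fake_irq_queries[i](cxlds);

		if (rc < 0)
			return rc;
		if (rc > max_irq)
			max_irq = rc;
	}
	return max_irq + 1;
}
```

The loop saves a few lines over the straight-line calls, but the CPMU
state has to live in the device structure just to reach the callback.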

Thanks,
Davidlohr
