Message-ID: <20190218024922.GA27779@ming.t460p>
Date: Mon, 18 Feb 2019 10:49:23 +0800
From: Ming Lei <ming.lei@...hat.com>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: LKML <linux-kernel@...r.kernel.org>,
Christoph Hellwig <hch@....de>,
Bjorn Helgaas <helgaas@...nel.org>,
Jens Axboe <axboe@...nel.dk>, linux-block@...r.kernel.org,
Sagi Grimberg <sagi@...mberg.me>,
linux-nvme@...ts.infradead.org, linux-pci@...r.kernel.org,
Keith Busch <keith.busch@...el.com>,
Marc Zyngier <marc.zyngier@....com>,
Sumit Saxena <sumit.saxena@...adcom.com>,
Kashyap Desai <kashyap.desai@...adcom.com>,
Shivasharan Srikanteshwara
<shivasharan.srikanteshwara@...adcom.com>
Subject: Re: [patch v6 7/7] genirq/affinity: Add support for non-managed
affinity sets
Hi Thomas,
On Sun, Feb 17, 2019 at 08:17:05PM +0100, Thomas Gleixner wrote:
> On Sun, 17 Feb 2019, Ming Lei wrote:
> > On Sat, Feb 16, 2019 at 06:13:13PM +0100, Thomas Gleixner wrote:
> > > Some drivers need an extra set of interrupts which should not be marked
> > > managed, but should get initial interrupt spreading.
> >
> > Could you share the drivers and their use case?
>
> You were Cc'ed on that old discussion:
>
> https://lkml.kernel.org/r/300d6fef733ca76ced581f8c6304bac6@mail.gmail.com
Thanks for providing the link.
>
> > > For both interrupt sets the interrupts are properly spread out, but the
> > > second set is not marked managed.
> >
> > Given drivers only care the managed vs non-managed interrupt numbers,
> > just wondering why this case can't be covered by .pre_vectors &
> > .post_vectors?
>
> Well, yes, but post/pre are not subject to spreading and I really don't
> want to go there.
>
> > Also this kind of usage may break blk-mq easily, in which the following
> > rule needs to be respected:
> >
> > 1) all CPUs are required to spread among each interrupt set
> >
> > 2) no any CPU is shared between two IRQs in same set.
>
> I don't see how that would break blk-mq. The unmanaged set is not used by
> the blk-mq stuff, that's some driver internal voodoo. So blk-mq still gets
> a perfectly spread and managed interrupt set for the queues.
From the discussion above, the use case is megaraid_sas, and one of the
two interrupt sets (managed and non-managed) will be chosen at runtime
according to the workload.
Each interrupt set actually defines one blk-mq queue mapping, and the
queue mapping needs to respect the rules I mentioned above. However,
non-managed affinity can be changed in any way at any time by user space.
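To make the two rules concrete, here is a minimal user-space sketch, not
kernel code. It assumes at most 64 CPUs modeled as one bit each in a
uint64_t; the type cpu_mask_t and all function names are hypothetical and
exist only for illustration (the kernel uses struct cpumask).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative model only: one bit per CPU, at most 64 CPUs.
 * None of these names exist in the kernel.
 */
typedef uint64_t cpu_mask_t;

/* Rule 1: every CPU is covered by at least one IRQ of the set. */
bool set_covers_all_cpus(const cpu_mask_t *irq_masks, size_t nr_irqs,
			 unsigned int nr_cpus)
{
	cpu_mask_t all, seen = 0;
	size_t i;

	assert(nr_cpus > 0 && nr_cpus <= 64);
	all = (nr_cpus == 64) ? ~(cpu_mask_t)0
			      : (((cpu_mask_t)1 << nr_cpus) - 1);

	for (i = 0; i < nr_irqs; i++)
		seen |= irq_masks[i];

	return (seen & all) == all;
}

/* Rule 2: no CPU appears in the masks of two IRQs of the same set. */
bool set_has_no_shared_cpu(const cpu_mask_t *irq_masks, size_t nr_irqs)
{
	cpu_mask_t seen = 0;
	size_t i;

	for (i = 0; i < nr_irqs; i++) {
		if (seen & irq_masks[i])
			return false;
		seen |= irq_masks[i];
	}

	return true;
}

/* A set is usable as a queue mapping only if both rules hold. */
bool set_is_valid_queue_mapping(const cpu_mask_t *irq_masks, size_t nr_irqs,
				unsigned int nr_cpus)
{
	return set_covers_all_cpus(irq_masks, nr_irqs, nr_cpus) &&
	       set_has_no_shared_cpu(irq_masks, nr_irqs);
}
```

With 4 CPUs and 2 IRQs, the masks { 0x3, 0xc } satisfy both rules. If
user space then rewrites the second IRQ's affinity to 0x6, CPU 1 is
shared between both IRQs and CPU 3 is no longer covered, so both rules
are broken.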
Recently HPSA tried to add a module parameter to use non-managed
IRQs[1].
Also NVMe RDMA uses non-managed interrupts, and at least one CPU hotplug
issue has not been fixed yet[2].
[1] https://marc.info/?t=154387665200001&r=1&w=2
[2] https://www.spinics.net/lists/linux-block/msg24140.html
thanks,
Ming