Message-ID: <alpine.DEB.2.21.1902101731370.8784@nanos.tec.linutronix.de>
Date: Sun, 10 Feb 2019 17:39:20 +0100 (CET)
From: Thomas Gleixner <tglx@...utronix.de>
To: Ming Lei <ming.lei@...hat.com>
cc: Christoph Hellwig <hch@....de>, Bjorn Helgaas <helgaas@...nel.org>,
Jens Axboe <axboe@...nel.dk>, linux-block@...r.kernel.org,
Sagi Grimberg <sagi@...mberg.me>,
linux-nvme@...ts.infradead.org, linux-kernel@...r.kernel.org,
linux-pci@...r.kernel.org
Subject: Re: [PATCH 4/5] nvme-pci: simplify nvme_setup_irqs() via .setup_affinity
callback
On Fri, 25 Jan 2019, Ming Lei wrote:
> Use the .setup_affinity() callback to re-calculate the number
> of queues, and build the irq affinity with the help of irq_build_affinity().
>
> Then nvme_setup_irqs() gets simplified a lot.
I'm pretty sure you can achieve the same by reworking the core code without
that callback.
> +	/* Fill out vectors at the beginning that don't need affinity */
> +	for (curvec = 0; curvec < affd->pre_vectors; curvec++)
> +		cpumask_copy(&masks[curvec].mask, cpu_possible_mask);
cpu_possible_mask is wrong. Why are you deliberately trying to make this
special? There is absolutely no reason to do so.
These interrupts are not managed, so their initial affinity has to be
irq_default_affinity. Setting them to cpu_possible_mask can, and probably
will, defeat a well-thought-out default affinity mask which was set up to
isolate a set of cores from general purpose interrupts.
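For illustration, a minimal sketch of what the non-managed pre_vectors
should look like, reusing the masks/affd/curvec names from the quoted hunk
(taken from the patch context, not verified against the final code):

	/* Non-managed vectors get the system-wide default affinity */
	for (curvec = 0; curvec < affd->pre_vectors; curvec++)
		cpumask_copy(&masks[curvec].mask, irq_default_affinity);

That keeps the pre_vectors on whatever irq_default_affinity was configured
to, instead of spreading them over every possible CPU.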
This is exactly the kind of driver-special behaviour which needs to be
avoided. There is nothing special about this NVMe setup, and yes, I can see
why the current core code is a bit tedious to work with, but that does not
justify extra driver magic by any means.
Thanks,
tglx