Message-ID: <Zj7C5fPCAdGwGsrI@fedora>
Date: Sat, 11 May 2024 08:59:17 +0800
From: Ming Lei <ming.lei@...hat.com>
To: Keith Busch <kbusch@...nel.org>
Cc: Christoph Hellwig <hch@....de>, Keith Busch <kbusch@...a.com>,
linux-nvme@...ts.infradead.org, linux-kernel@...r.kernel.org,
tglx@...utronix.de
Subject: Re: [PATCH 2/2] nvme-pci: allow unmanaged interrupts

On Fri, May 10, 2024 at 06:41:58PM -0600, Keith Busch wrote:
> On Sat, May 11, 2024 at 07:50:21AM +0800, Ming Lei wrote:
> > On Fri, May 10, 2024 at 10:20:02AM -0600, Keith Busch wrote:
> > > On Fri, May 10, 2024 at 05:10:47PM +0200, Christoph Hellwig wrote:
> > > > On Fri, May 10, 2024 at 07:14:59AM -0700, Keith Busch wrote:
> > > > > From: Keith Busch <kbusch@...nel.org>
> > > > >
> > > > > Some people _really_ want to control their interrupt affinity.
> > > >
> > > > So let them argue why. I'd rather have a really, really, really
> > > > good argument for this crap, and I'd like to hear it from the horses
> > > > mouth.
> > >
> > > It's just prioritizing predictable user task scheduling for a subset of
> > > CPUs instead of having consistently better storage performance.
> > >
> > > We already have "isolcpus=managed_irq," parameter to prevent managed
> > > interrupts from running on a subset of CPUs, so the use case is already
> > > kind of supported. The problem with that parameter is it is a no-op if
> > > the starting affinity spread contains only isolated CPUs.
> >
> > Can you explain a bit why it is a no-op? If only isolated CPUs are
> > spread on one queue, there will be no IO originated from these isolated
> > CPUs, that is exactly what the isolation needs.
>
> The "isolcpus=managed_irq," option doesn't limit the dispatching CPUs.

Please see commit a46c27026da1 ("blk-mq: don't schedule block kworker on
isolated CPUs") in for-6.10/block.

> It only limits where the managed irq will assign the effective_cpus as a
> best effort.

Most of the time it does work.

>
> Example, I boot with a system with 4 threads, one nvme device, and
> kernel parameter:
>
> isolcpus=managed_irq,2-3
>
> Run this:
>
> for i in $(seq 0 3); do taskset -c $i dd if=/dev/nvme0n1 of=/dev/null bs=4k count=1000 iflag=direct; done

That is a problem with the test: when you isolate CPUs 2-3, you are not
expected to submit IO or run applications on those isolated CPUs.
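
A minimal sketch of a test loop that respects the isolation, assuming the
same 4-CPU system booted with isolcpus=managed_irq,2-3. The helper names
expand_cpulist and housekeeping_cpus are hypothetical, the CPU count and
isolated list are hard-coded from the example, and the dd commands are only
printed rather than executed:

```shell
# Hypothetical helper: expand a kernel-style CPU list such as "0,2-3"
# into a space-separated list "0 2 3".
expand_cpulist() {
    out=""
    old_ifs=$IFS
    IFS=,
    for part in $1; do
        lo=${part%-*}          # "2-3" -> 2, "2" -> 2
        hi=${part#*-}          # "2-3" -> 3, "2" -> 2
        c=$lo
        while [ "$c" -le "$hi" ]; do
            out="$out${out:+ }$c"
            c=$((c + 1))
        done
    done
    IFS=$old_ifs
    echo "$out"
}

# Hypothetical helper: print CPUs 0..($1 - 1) that are NOT in the
# isolated list given as $2.
housekeeping_cpus() {
    isolated=" $(expand_cpulist "$2") "
    hk=""
    n=0
    while [ "$n" -lt "$1" ]; do
        case $isolated in
            *" $n "*) ;;                    # isolated CPU: skip it
            *) hk="$hk${hk:+ }$n" ;;        # housekeeping CPU: keep it
        esac
        n=$((n + 1))
    done
    echo "$hk"
}

# Assumed values: 4 CPUs, CPUs 2-3 isolated (taken from the example above).
for i in $(housekeeping_cpus 4 2-3); do
    # Printed only; remove the leading "echo" to actually issue the reads.
    echo taskset -c "$i" dd if=/dev/nvme0n1 of=/dev/null bs=4k count=1000 iflag=direct
done
```

With these assumed values the loop only pins dd to CPUs 0 and 1, so no IO
is submitted from the isolated CPUs.
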
Thanks,
Ming