linux-kernel - Re: [PATCH 0/3] fix interrupt swamp in NVMe

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <alpine.DEB.2.21.1908221143060.1983@nanos.tec.linutronix.de>
Date:   Thu, 22 Aug 2019 11:48:29 +0200 (CEST)
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Keith Busch <keith.busch@...il.com>
cc:     Ming Lei <ming.lei@...hat.com>, Long Li <longli@...rosoft.com>,
        Jens Axboe <axboe@...com>, Sagi Grimberg <sagi@...mberg.me>,
        chenxiang <chenxiang66@...ilicon.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Ming Lei <tom.leiming@...il.com>,
        John Garry <john.garry@...wei.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        linux-nvme <linux-nvme@...ts.infradead.org>,
        Keith Busch <keith.busch@...el.com>,
        Ingo Molnar <mingo@...hat.com>, Christoph Hellwig <hch@....de>,
        "longli@...uxonhyperv.com" <longli@...uxonhyperv.com>
Subject: Re: [PATCH 0/3] fix interrupt swamp in NVMe

On Wed, 21 Aug 2019, Keith Busch wrote:
> On Wed, Aug 21, 2019 at 7:34 PM Ming Lei <ming.lei@...hat.com> wrote:
> > On Wed, Aug 21, 2019 at 04:27:00PM +0000, Long Li wrote:
> > > Here is the command to benchmark it:
> > >
> > > fio --bs=4k --ioengine=libaio --iodepth=128 --filename=/dev/nvme0n1:/dev/nvme1n1:/dev/nvme2n1:/dev/nvme3n1:/dev/nvme4n1:/dev/nvme5n1:/dev/nvme6n1:/dev/nvme7n1:/dev/nvme8n1:/dev/nvme9n1 --direct=1 --runtime=120 --numjobs=80 --rw=randread --name=test --group_reporting --gtod_reduce=1
> > >
> >
> > I can reproduce the issue on one machine(96 cores) with 4 NVMes(32 queues), so
> > each queue is served on 3 CPUs.
> >
> > IOPS drops > 20% when 'use_threaded_interrupts' is enabled. From fio log, CPU
> > context switch is increased a lot.
> 
> Interestingly use_threaded_interrupts shows a marginal improvement on
> my machine with the same fio profile. It was only 5 NVMes, but they've
> one queue per-cpu on 112 cores.

Which is not surprising because the thread and the hard interrupt are on
the same CPU and there is just that little overhead of the context switch.

The thing is that this really depends on how the scheduler decides to place
the interrupt thread.

If you have a queue for several CPUs, then depending on the load situation
allowing a multi-cpu affinity for the thread can cause lots of task
migration.

But restricting the irq thread to the CPU on which the interrupt is affine
can also starve that CPU. There is no universal rule for that.

Tracing should tell.

Thanks,

	tglx