Message-ID: <539B14A9.8010204@fb.com>
Date:	Fri, 13 Jun 2014 09:11:37 -0600
From:	Jens Axboe <axboe@...com>
To:	Keith Busch <keith.busch@...el.com>
CC:	Matias Bjørling <m@...rling.me>,
	Matthew Wilcox <willy@...ux.intel.com>,
	"sbradshaw@...ron.com" <sbradshaw@...ron.com>,
	"tom.leiming@...il.com" <tom.leiming@...il.com>,
	"hch@...radead.org" <hch@...radead.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-nvme@...ts.infradead.org" <linux-nvme@...ts.infradead.org>
Subject: Re: [PATCH v7] NVMe: conversion to blk-mq

On 06/13/2014 09:05 AM, Keith Busch wrote:
> On Fri, 13 Jun 2014, Jens Axboe wrote:
>> On 06/12/2014 06:06 PM, Keith Busch wrote:
>>> When cancelling IOs, we have to check if the hwctx has valid tags
>>> for some reason. I have 32 cores in my system and as many queues, but
>>
>> It's because unused queues are torn down, to save memory.
>>
>>> blk-mq is only using half of those queues and freed the "tags" for the
>>> rest after they'd been initialized without telling the driver. Why is
>>> blk-mq not utilizing all my queues?
>>
>> You have 31 + 1 queues, so only 31 mappable queues. blk-mq distributes
>> these symmetrically, so each of 16 queues serves a core plus its thread
>> sibling. And yes, that leaves 15 idle hardware queues for this specific
>> case. I like the symmetry; it makes things more predictable when they
>> are spread out evenly.
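
For illustration, the symmetric mapping above works out roughly like this
(a sketch, assuming 32 logical CPUs where cpu N and cpu N+16 are thread
siblings, as on this box; not the actual blk-mq mapping code):

  for cpu in $(seq 0 31); do
          # each sibling pair (N, N+16) shares one of the 16 queues in use;
          # the other 15 mappable queues sit idle
          echo "cpu ${cpu} -> hw queue $((cpu % 16))"
  done
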
> 
> You'll see performance differences on some workloads depending on which
> core your process runs on and which one services the interrupt. We can
> play games with cores and see what happens on my 32-cpu system. I usually
> run 'irqbalance --hint=exact' for best performance, but that doesn't do
> anything with blk-mq since the affinity hint is gone.

Huh wtf, that hint is not supposed to be gone. I'm guessing it went away
with the removal of the manual queue assignments.
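
A quick way to check from userspace whether the hints are really gone
(a sketch, assuming the vectors show up as "nvme" in /proc/interrupts;
an all-zero mask in affinity_hint means no hint is registered):

  for irq in $(awk -F: '/nvme/ {gsub(/ /, "", $1); print $1}' /proc/interrupts); do
          # unset hints read back as an all-zero cpumask
          echo "irq $irq hint: $(cat /proc/irq/$irq/affinity_hint)"
  done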

> I ran the following script several times on each version of the
> driver. This will pin a sequential read test to cores 0, 8, and 16. The
> device is local to the NUMA node containing cores 0-7 and 16-23; the second
> test runs on the remote node and the third on the thread sibling of 0.
> Results were averaged, but were very consistent anyway. The system was
> otherwise idle.
> 
>  # for i in $(seq 0 8 16); do
>   > let "cpu=1<<$i"
>   > cpu=`echo $cpu | awk '{printf "%#x\n", $1}'`
>   > taskset ${cpu} dd if=/dev/nvme0n1 of=/dev/null bs=4k count=1000000 iflag=direct
>   > done
> 
> Here are the performance drops observed with blk-mq, using the existing
> driver as the baseline:
> 
>  CPU : Drop
>  ....:.....
>    0 : -6%
>    8 : -36%
>   16 : -12%

We need the hints back for sure; I'll run some of the same tests to
verify. Out of curiosity, what is the topology like on your box? Are 0/1
siblings, and 0..7 one node?
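
Something like this should show it (a sketch, assuming a util-linux lscpu
new enough for -e; cpu0's thread_siblings_list answers the siblings
question directly):

  lscpu -e=CPU,NODE,SOCKET,CORE
  cat /sys/devices/system/cpu/cpu0/topology/thread_siblings_list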

