Message-ID: <56A263A7.6070406@fb.com>
Date: Fri, 22 Jan 2016 10:15:19 -0700
From: Jens Axboe <axboe@...com>
To: Keith Busch <keith.busch@...el.com>,
Linus Torvalds <torvalds@...ux-foundation.org>
CC: Stefan Haberland <sth@...ux.vnet.ibm.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
linux-s390 <linux-s390@...r.kernel.org>,
Sebastian Ott <sebott@...ux.vnet.ibm.com>
Subject: Re: [BUG] Regression introduced with "block: split bios to max possible length"

On 01/22/2016 07:56 AM, Keith Busch wrote:
> On Thu, Jan 21, 2016 at 08:15:37PM -0800, Linus Torvalds wrote:
>> For the case of nvme, for example, I think the max sector number is so
>> high that you'll never hit that anyway, and you'll only ever hit the
>> chunk limit. No?
>
> The device's max transfer and chunk size are not very large, both fixed
> at 128KB. We can lose ~70% of potential throughput when IO isn't aligned,
> and end users reported this when the block layer stopped splitting on
> alignment for the NVMe drive.
>
> So it's a big deal for this h/w, but now I feel awkward defending a
> device specific feature for the generic block layer.

Honestly, the splitting code is what is a piece of crap; we never should
have gone down that route. Hopefully we can get rid of it soon. In the
meantime, this does need to work. It's an odd hw construct (basically
two devices bolted together), but it's not really an esoteric thing to
support.

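To illustrate the chunk-alignment point quoted above, here is a minimal userspace sketch of splitting an IO so that no piece crosses a chunk boundary. It is not the kernel's splitting code: the names max_sectors_at() and split_count() are hypothetical, and it assumes 512-byte sectors (so a 128KB chunk is 256 sectors) and a power-of-two chunk size.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: how many sectors may be issued starting at
 * `offset` without crossing the next chunk boundary or exceeding the
 * device's max transfer size. Assumes chunk_sectors is a power of two.
 */
static uint32_t max_sectors_at(uint64_t offset, uint32_t chunk_sectors,
                               uint32_t max_sectors)
{
    assert(chunk_sectors && (chunk_sectors & (chunk_sectors - 1)) == 0);

    /* Sectors left before the next chunk boundary. */
    uint32_t to_boundary =
        chunk_sectors - (uint32_t)(offset & (chunk_sectors - 1));

    return to_boundary < max_sectors ? to_boundary : max_sectors;
}

/*
 * Hypothetical helper: number of pieces an IO of `len` sectors at
 * `offset` is cut into when every piece must respect both limits.
 */
static unsigned split_count(uint64_t offset, uint32_t len,
                            uint32_t chunk_sectors, uint32_t max_sectors)
{
    unsigned pieces = 0;

    while (len > 0) {
        uint32_t piece = max_sectors_at(offset, chunk_sectors, max_sectors);

        if (piece > len)
            piece = len;
        offset += piece;
        len -= piece;
        pieces++;
    }
    return pieces;
}
```

With both limits at 256 sectors, a 256-sector IO starting at sector 0 goes out as one piece, while the same IO starting at sector 8 is cut into 248 + 8 sectors so that neither piece straddles the boundary at sector 256.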
> Anyway, the patch was developed with incorrect assumptions. I'd still
> like to try again after reconciling the queue limit constraints, but
> I defer to Jens for the near term.
Instead of scrambling for -rc1, I'd suggest we just revert again and
ensure what we merge for -rc2 is clean and passes the test cases.
--
Jens Axboe