Message-ID: <20180925210418.GA9854@bombadil.infradead.org>
Date:   Tue, 25 Sep 2018 14:04:18 -0700
From:   Matthew Wilcox <willy@...radead.org>
To:     Dave Chinner <david@...morbit.com>
Cc:     Jens Axboe <axboe@...nel.dk>, Christopher Lameter <cl@...ux.com>,
        Christoph Hellwig <hch@....de>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        Ming Lei <tom.leiming@...il.com>,
        linux-block <linux-block@...r.kernel.org>,
        linux-mm <linux-mm@...ck.org>,
        Linux FS Devel <linux-fsdevel@...r.kernel.org>,
        "open list:XFS FILESYSTEM" <linux-xfs@...r.kernel.org>,
        Dave Chinner <dchinner@...hat.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Ming Lei <ming.lei@...hat.com>
Subject: Re: block: DMA alignment of IO buffer allocated from slab

On Tue, Sep 25, 2018 at 05:49:10PM +1000, Dave Chinner wrote:
> On Mon, Sep 24, 2018 at 12:09:37PM -0600, Jens Axboe wrote:
> > On 9/24/18 12:00 PM, Christopher Lameter wrote:
> > > On Mon, 24 Sep 2018, Jens Axboe wrote:
> > > 
> > >> The situation is making me a little uncomfortable, though. If we export
> > >> such a setting, we really should be honoring it...
> 
> That's what I said up front, but you replied to this with:
> 
> | I think this is all crazy talk. We've never done this, [...]
> 
> Now I'm not sure what you are saying we should do....
> 
> > > Various subsystems create custom slab arrays with their particular
> > > alignment requirement for these allocations.
> > 
> > Oh yeah, I think the solution is basic enough for XFS, for instance.
> > They just have to err on the side of caution, by going full
> > sector alignment for memory...
> 
> How does the filesystem find out about hardware alignment
> requirements? Isn't probing through the block device to find out
> about the request queue configurations considered a layering
> violation?
> 
> What if sector alignment is not sufficient?  And how would this work
> if we start supporting sector sizes larger than page size? (which the
> XFS buffer cache supports just fine, even if nothing else in
> Linux does).
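
FWIW, the block layer does advertise this through the request queue:
drivers set a DMA alignment mask with blk_queue_dma_alignment(), and
queue_dma_alignment() reads it back (the default mask is 511, i.e.
512-byte alignment).  Whether a filesystem is *allowed* to peek at it
is exactly the layering question you're raising.  A minimal sketch,
where the wrapper function itself is purely illustrative:

#include <linux/blkdev.h>

/* Hypothetical fs-side helper: ask the request queue what the
 * hardware actually needs.  bdev_get_queue() and
 * queue_dma_alignment() are real block-layer interfaces. */
static int fs_io_buf_alignment(struct block_device *bdev)
{
        struct request_queue *q = bdev_get_queue(bdev);

        return queue_dma_alignment(q) + 1;      /* mask -> bytes */
}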

I've never quite understood the O_DIRECT sector size alignment
restriction.  The sector size has literally nothing to do with the
limitations of the controller that's doing the DMA.  OK, NVMe smooshes the
two components into one, but back in the SCSI era, the DMA abilities were
the HBA's responsibility and the sector size was a property of the LUN!
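
For userspace the rule is at least concrete; a hedged sketch of the
conventional posix_memalign() dance (the function, path and sizes are
made up):

#define _GNU_SOURCE             /* for O_DIRECT */
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* O_DIRECT traditionally wants the buffer address, the file offset
 * and the length all aligned to the logical sector size, whatever
 * the DMA engine underneath could actually cope with. */
static int read_direct(const char *path, void **bufp, size_t len)
{
        int fd = open(path, O_RDONLY | O_DIRECT);
        ssize_t n;

        if (fd < 0)
                return -1;
        /* 512 is the traditional assumption; strictly it should be
         * the device's logical block size.  len must also be a
         * multiple of it. */
        if (posix_memalign(bufp, 512, len)) {
                close(fd);
                return -1;
        }
        n = read(fd, *bufp, len);       /* caller frees *bufp */
        close(fd);
        return n < 0 ? -1 : 0;
}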

Heck, with a sufficiently advanced HBA (e.g. one supporting scatterlists with
bitbuckets), you could even ask for sub-sector-*sized* IOs.  Not terribly
useful since the bytes still had to be transferred over the SCSI cable,
but you'd save transferring them across the PCI bus.

Anyway, why would we require *larger* than 512-byte alignment for
in-kernel users?  I doubt there are any remaining HBAs that can't do
8-byte aligned I/Os (for the record, NVMe requires controllers to be
able to do 4-byte aligned I/Os).
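
For comparison, the per-subsystem workaround Christopher mentions
above is just a dedicated cache created with an explicit alignment;
the cache name and object size here are made up:

#include <linux/errno.h>
#include <linux/init.h>
#include <linux/slab.h>

static struct kmem_cache *io_buf_cache;

/* Unlike plain kmalloc(), a cache created with an explicit align
 * argument guarantees every object is, here, 512-byte aligned. */
static int __init io_buf_cache_init(void)
{
        io_buf_cache = kmem_cache_create("fs_io_buf", 4096, 512,
                                         0, NULL);
        return io_buf_cache ? 0 : -ENOMEM;
}

That only helps, of course, if you know what alignment to ask for in
the first place.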
