Date:	Wed, 29 Jul 2015 07:21:47 -0400
From:	Ming Lei <ming.lei@...onical.com>
To:	Dave Chinner <david@...morbit.com>
Cc:	Christoph Hellwig <hch@...radead.org>,
	Jens Axboe <axboe@...nel.dk>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	"Justin M. Forbes" <jforbes@...oraproject.org>,
	Jeff Moyer <jmoyer@...hat.com>, Tejun Heo <tj@...nel.org>,
	linux-api@...r.kernel.org
Subject:	Re: [PATCH v7 4/6] block: loop: prepare for supporting direct IO

On Wed, Jul 29, 2015 at 4:41 AM, Dave Chinner <david@...morbit.com> wrote:
> On Wed, Jul 29, 2015 at 03:33:52AM -0400, Ming Lei wrote:
>> On Mon, Jul 27, 2015 at 1:33 PM, Christoph Hellwig <hch@...radead.org> wrote:
>> > On Mon, Jul 27, 2015 at 05:53:33AM -0400, Ming Lei wrote:
>> >> Because size has to be 4k aligned too.
>> >
>> > Yes.  But again I don't see any reason to limit us to a hardcoded 512
>> > byte block size here, especially considering the patches to finally
>>
>> From the loop block device's view, a request can be any number of
>> 512-byte sectors, so the transfer size to the backing device can't be
>> guaranteed to always be 4k aligned.
>
> In theory, yes. In practice, it doesn't happen very often.
>
>> > allow enabling other block sizes from userspace.
>>
>> I have some questions about the patchset, and it looks like the author
>> hasn't replied to them yet.
>>
>> On Mon, Jul 27, 2015 at 6:06 PM, Dave Chinner <david@...morbit.com> wrote:
>> >> Because size has to be 4k aligned too.
>> >
>> > So check that, too. Any >= 4k block size filesystem should be doing
>> > mostly 4k aligned and sized IO...
>>
>> I guess you mean we only use direct IO for the 4k aligned and sized IO?
>> If so, that won't be efficient because the page cache has to be flushed
>> during the switch.
>
> It will be extremely rare for a 4k block size filesystem to do
> anything other than 4k aligned and sized IO. Think about it for a
> minute: what does the page cache do to unaligned IO patterns (i.e.
> buffered IO)?  It does IO in page sizes, and so if the application
> is doing badly aligned or sized IO with buffered IO, then the
> underlying device will only ever see page sized and aligned IO.
>
> Hence sector aligned IO will only come from applications doing
> direct IO.  If the application is doing direct IO and it's not
> properly aligned, then it is already going to get sucky performance,
> because most filesystems serialise sub-block size direct IO: concurrent
> sub-block IOs to the same block usually lead to data
> corruption.
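
To make the alignment constraint in the quoted paragraph concrete, here is a
small standalone model (illustrative only -- the helper name is made up, this
is not the actual drivers/block/loop.c code) of the per-request check loop
would need before using direct IO against the backing file: both the byte
offset and the length must be multiples of the backing device's logical block
size, otherwise the request has to fall back to buffered IO.

/* Standalone model of the alignment gate discussed above -- illustrative
 * only, not the actual loop driver code.  A loop request is expressed in
 * 512-byte sectors; direct IO against the backing file is only possible
 * when both offset and length are multiples of the backing device's
 * logical block size (4096 here).
 */
#include <stdbool.h>
#include <stdio.h>

static bool req_dio_aligned(unsigned long long start_sector,
			    unsigned int nr_sectors,
			    unsigned int backing_bs)
{
	unsigned long long pos = start_sector * 512;		/* byte offset */
	unsigned long long len = (unsigned long long)nr_sectors * 512;

	return (pos % backing_bs == 0) && (len % backing_bs == 0);
}

int main(void)
{
	/* 4k aligned request: sector 8, 8 sectors (4096 bytes) -> dio ok */
	printf("%d\n", req_dio_aligned(8, 8, 4096));	/* prints 1 */
	/* sector-granular request: sector 1, 3 sectors -> buffered fallback */
	printf("%d\n", req_dio_aligned(1, 3, 4096));	/* prints 0 */
	return 0;
}

Requests failing such a check would take the buffered path, which is exactly
the slow case Dave argues below that loop only has to handle correctly, not
quickly.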

The block size of a filesystem over loop can be 512, 1024 or 2048 bytes.
Suppose the sector size of the backing device is 4096: the filesystem
sees aligned direct IO whenever the application's IO size/offset is
aligned with the fs block size, but loop still can't do direct IO
against the backing file for all such requests.
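
A concrete instance of this mismatch (the numbers are illustrative, not taken
from a real trace): with a 1024-byte block size filesystem on the loop device
and a 4096-byte backing sector size, a 1024-byte IO at byte offset 5120 is
block-aligned for the filesystem but not 4k aligned for the backing file:

/* Illustrative numbers only: fs block size 1024 on the loop device,
 * backing device logical block size 4096.  The same IO is aligned at
 * fs granularity but misaligned at backing-device granularity.
 */
#include <stdio.h>

int main(void)
{
	unsigned long long pos = 5120;	/* byte offset of the IO */
	unsigned long long len = 1024;	/* IO size in bytes */
	unsigned int fs_bs = 1024, backing_bs = 4096;

	printf("aligned for fs:      %d\n",
	       pos % fs_bs == 0 && len % fs_bs == 0);		/* prints 1 */
	printf("aligned for backing: %d\n",
	       pos % backing_bs == 0 && len % backing_bs == 0);	/* prints 0 */
	return 0;
}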

Another case is that an application may access the loop block device
directly, such as 'dd if=/dev/loopN', but that may not be common and
perhaps doesn't need to be considered.

Thanks,

>
> So, really, sector aligned/sized direct IO is a sucky performance
> path before we even get to the loop device, so we don't really need
> to care how fast the loop device handles this case. The loop device
> just needs to ensure that it doesn't corrupt data when badly aligned
> IOs come in... ;)
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@...morbit.com
