Message-ID: <CAPcyv4iRz=+11PHxaNMuWGiC67C5C1GKaq=0sUfqZ9xO5eHSUA@mail.gmail.com>
Date:	Fri, 23 Oct 2015 16:32:57 -0700
From:	Dan Williams <dan.j.williams@...el.com>
To:	Jan Kara <jack@...e.cz>
Cc:	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"jmoyer@...hat.com" <jmoyer@...hat.com>, "hch@....de" <hch@....de>,
	"axboe@...com" <axboe@...com>,
	"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
	"linux-nvdimm@...ts.01.org" <linux-nvdimm@...ts.01.org>,
	"willy@...ux.intel.com" <willy@...ux.intel.com>,
	"ross.zwisler@...ux.intel.com" <ross.zwisler@...ux.intel.com>,
	"david@...morbit.com" <david@...morbit.com>
Subject: Re: [PATCH 5/5] block: enable dax for raw block devices

On Thu, Oct 22, 2015 at 2:08 PM, Jan Kara <jack@...e.cz> wrote:
> On Thu 22-10-15 16:05:46, Williams, Dan J wrote:
[..]
>> This text was aimed at the request from Ross to document the differences
>> vs the generic_file_mmap() path.  Is the following incremental change
>> more clear?
>
> Well, not really. I thought you'd just delete that paragraph :) The thing
> is: When doing IO directly to the block device, it makes no sense to look
> at a filesystem on top of it - hopefully there is none since you'd be
> corrupting it. So the paragraph that would make sense to me would be:
>
>  * Finally, in contrast to filemap_page_mkwrite(), we don't bother calling
>  * sb_start_pagefault(). There is no filesystem which could be frozen here
>  * and when bdev gets frozen, IO gets blocked in the request queue.

I'm not following the assertion that "IO gets blocked in the request
queue" when the device is frozen.  As far as I can see in the code,
outside of tracking the freeze depth count, the request_queue does not
check whether the device is frozen.  freeze_bdev() is moot when no
filesystem is present.

> But when spelled out like this, I've realized that with DAX, this blocking
> of requests in the request queue doesn't really block the IO to the device.
> So block device freezing (aka blk_stop_queue()) doesn't work reliably with
> DAX. That should be fixed but it's not easy as the only way to do that
> would be to hook into blk_stop_queue() and unmap (or at least
> write-protect) all the mappings of the device. Ugh...

Again, I'm missing how this is guaranteed in the non-DAX case.
freeze_bdev() will sync_blockdev(), but it does nothing to prevent
re-dirtying through raw device mmaps while the fs is frozen.  Should
it?  That's at least a separate patch.
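
To make the re-dirtying concern concrete, here is a minimal user-space
sketch.  It is not part of the patch: dirty_via_mmap() is a hypothetical
helper name, and the path it is given is whatever the caller chooses (a
raw block device node, or a regular file for a harmless experiment).  It
only shows that a store through a MAP_SHARED mapping dirties the backing
page without going through write(2).

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only): copy `len` bytes of `msg` to
 * offset 0 of `path` through a shared, writable mapping.  The target must
 * be at least one page long.  Returns 0 on success, -1 on error.
 */
static int dirty_via_mmap(const char *path, const char *msg, size_t len)
{
	long page = sysconf(_SC_PAGESIZE);
	void *p;
	int fd, rc;

	if (page <= 0 || len > (size_t)page)
		return -1;

	fd = open(path, O_RDWR);
	if (fd < 0)
		return -1;

	p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED) {
		close(fd);
		return -1;
	}

	/* A plain store: the page is dirtied without any write(2) call. */
	memcpy(p, msg, len);

	/* Ask for the dirty page to be written back before unmapping. */
	rc = msync(p, (size_t)page, MS_SYNC);

	munmap(p, (size_t)page);
	close(fd);
	return rc;
}
```

Running this against a scratch file and reading the bytes back with
pread() is enough to see the store land.  Nothing in the sketch consults
freeze state; whether the kernel should intercept such stores while
frozen is exactly the open question.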

> Ugh2: Now I realized that DAX mmap isn't safe wrt fs freezing even for
> filesystems since there's nothing which writeprotects pages that are
> writeably mapped. In normal path, page writeback does this but that doesn't
> happen for DAX. I remember we once talked about this but it got lost.
> We need something like walk all filesystem inodes during fs freeze and
> writeprotect all pages that are mapped. But that's going to be slow...

This is what I'm attempting to tackle with the next revision of this series...