Message-Id: <1455680059-20126-2-git-send-email-ross.zwisler@linux.intel.com>
Date: Tue, 16 Feb 2016 20:34:14 -0700
From: Ross Zwisler <ross.zwisler@...ux.intel.com>
To: linux-kernel@...r.kernel.org
Cc: Dan Williams <dan.j.williams@...el.com>,
"J. Bruce Fields" <bfields@...ldses.org>,
"Theodore Ts'o" <tytso@....edu>,
Alexander Viro <viro@...iv.linux.org.uk>,
Andreas Dilger <adilger.kernel@...ger.ca>,
Andrew Morton <akpm@...ux-foundation.org>,
Dave Chinner <david@...morbit.com>, Jan Kara <jack@...e.com>,
Jeff Layton <jlayton@...chiereds.net>,
Jens Axboe <axboe@...nel.dk>,
Matthew Wilcox <willy@...ux.intel.com>,
linux-block@...r.kernel.org, linux-ext4@...r.kernel.org,
linux-fsdevel@...r.kernel.org, linux-mm@...ck.org,
linux-nvdimm@...ts.01.org, xfs@....sgi.com,
Jan Kara <jack@...e.cz>, Jens Axboe <axboe@...com>,
Matthew Wilcox <matthew.r.wilcox@...el.com>,
Al Viro <viro@....linux.org.uk>,
Ross Zwisler <ross.zwisler@...ux.intel.com>
Subject: [PATCH v3 1/6] block: disable block device DAX by default
From: Dan Williams <dan.j.williams@...el.com>
The recent *sync enabling work discovered that we are inserting into the
block_device pagecache, counter to the expectations of the dirty data
tracking for DAX mappings. This can lead to data corruption.
We want to support DAX for block devices eventually, but it requires
wider changes to properly manage the pagecache.
[<ffffffff81576d93>] dump_stack+0x85/0xc2
[<ffffffff812b9ee0>] dax_writeback_mapping_range+0x60/0xe0
[<ffffffff812a1d4f>] blkdev_writepages+0x3f/0x50
[<ffffffff811db011>] do_writepages+0x21/0x30
[<ffffffff811cb6a6>] __filemap_fdatawrite_range+0xc6/0x100
[<ffffffff811cb75a>] filemap_write_and_wait+0x4a/0xa0
[<ffffffff812a15e0>] set_blocksize+0x70/0xd0
[<ffffffff812a273d>] sb_set_blocksize+0x1d/0x50
[<ffffffff8132ac9b>] ext4_fill_super+0x75b/0x3360
[<ffffffff81583381>] ? vsnprintf+0x201/0x4c0
[<ffffffff815836d9>] ? snprintf+0x49/0x60
[<ffffffff81263010>] mount_bdev+0x180/0x1b0
[<ffffffff8132a540>] ? ext4_calculate_overhead+0x370/0x370
[<ffffffff8131ad95>] ext4_mount+0x15/0x20
[<ffffffff81263908>] mount_fs+0x38/0x170
Mark the support broken so it is disabled by default, but otherwise still
available for testing.
Cc: Jan Kara <jack@...e.cz>
Cc: Jens Axboe <axboe@...com>
Cc: Matthew Wilcox <matthew.r.wilcox@...el.com>
Cc: Al Viro <viro@....linux.org.uk>
Reported-by: Ross Zwisler <ross.zwisler@...ux.intel.com>
Suggested-by: Dave Chinner <david@...morbit.com>
Signed-off-by: Dan Williams <dan.j.williams@...el.com>
Signed-off-by: Ross Zwisler <ross.zwisler@...ux.intel.com>
---
block/Kconfig | 13 +++++++++++++
fs/block_dev.c | 6 +++++-
2 files changed, 18 insertions(+), 1 deletion(-)
diff --git a/block/Kconfig b/block/Kconfig
index 161491d..0363cd7 100644
--- a/block/Kconfig
+++ b/block/Kconfig
@@ -88,6 +88,19 @@ config BLK_DEV_INTEGRITY
T10/SCSI Data Integrity Field or the T13/ATA External Path
Protection. If in doubt, say N.
+config BLK_DEV_DAX
+ bool "Block device DAX support"
+ depends on FS_DAX
+ depends on BROKEN
+ help
+ When DAX support is available (CONFIG_FS_DAX) raw block
+ devices can also support direct userspace access to the
+ storage capacity via MMAP(2) similar to a file on a
+ DAX-enabled filesystem. However, the DAX I/O-path disables
+ some standard I/O-statistics, and the MMAP(2) path has some
+ operational differences due to bypassing the page
+ cache. If in doubt, say N.
+
config BLK_DEV_THROTTLING
bool "Block layer bio throttling support"
depends on BLK_CGROUP=y
diff --git a/fs/block_dev.c b/fs/block_dev.c
index 39b3a17..31c6d10 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -1201,7 +1201,11 @@ static int __blkdev_get(struct block_device *bdev, fmode_t mode, int for_part)
bdev->bd_disk = disk;
bdev->bd_queue = disk->queue;
bdev->bd_contains = bdev;
- bdev->bd_inode->i_flags = disk->fops->direct_access ? S_DAX : 0;
+ if (IS_ENABLED(CONFIG_BLK_DEV_DAX) && disk->fops->direct_access)
+ bdev->bd_inode->i_flags = S_DAX;
+ else
+ bdev->bd_inode->i_flags = 0;
+
if (!partno) {
ret = -ENXIO;
bdev->bd_part = disk_get_part(disk, partno);
--
2.5.0
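
For readers following the fs/block_dev.c hunk, here is a minimal userspace
sketch of the gating logic: the DAX flag is set only when the build option is
on and the driver provides a direct_access method. Everything prefixed with
sketch_ or SKETCH_ is a hypothetical stand-in, not a kernel definition; the
flag value is a placeholder, and the kernel's IS_ENABLED() is approximated by
a plain integer argument.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder flag bit, assumed for illustration only. */
#define SKETCH_S_DAX 0x2000u

/* 0 mirrors "depends on BROKEN": the option is off unless forced on. */
#ifndef SKETCH_BLK_DEV_DAX
#define SKETCH_BLK_DEV_DAX 0
#endif

/* Hypothetical, simplified stand-in for a driver's operations table. */
typedef long (*sketch_direct_access_fn)(void);

struct sketch_fops {
	sketch_direct_access_fn direct_access;
};

/* Dummy driver method so a table with direct_access set can be built. */
long sketch_direct_access(void)
{
	return 0;
}

/*
 * Return the inode flags the block device would get: the DAX bit only
 * when the option is enabled AND the driver has direct_access,
 * otherwise 0.
 */
unsigned int sketch_bdev_flags(const struct sketch_fops *fops, int dax_enabled)
{
	assert(fops != NULL);

	if (dax_enabled && fops->direct_access)
		return SKETCH_S_DAX;
	return 0;
}
```

Calling sketch_bdev_flags(&fops, SKETCH_BLK_DEV_DAX) with the default build
setting returns 0 even for a driver that provides direct_access, which is the
default behavior this patch is after.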