Message-ID: <55C3124F.3020602@plexistor.com>
Date: Thu, 06 Aug 2015 10:52:47 +0300
From: Boaz Harrosh <boaz@...xistor.com>
To: Dave Chinner <david@...morbit.com>,
Linda Knippers <linda.knippers@...com>
CC: Jeff Moyer <jmoyer@...hat.com>,
"matthew r. wilcox" <matthew.r.wilcox@...el.com>,
linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org
Subject: Re: regression introduced by "block: Add support for DAX reads/writes
to block devices"
On 08/06/2015 06:24 AM, Dave Chinner wrote:
> On Wed, Aug 05, 2015 at 09:42:54PM -0400, Linda Knippers wrote:
>> On 08/05/2015 06:01 PM, Dave Chinner wrote:
>>> On Wed, Aug 05, 2015 at 04:19:08PM -0400, Jeff Moyer wrote:
<>
>>>>
>>>> I sat down with Linda to look into it, and the problem is that mkfs.xfs
>>>> sets the blocksize of the device to 512 (via BLKBSZSET), and then reads
>>>> from the last sector of the device. This results in dax_io trying to do
>>>> a page-sized I/O at 512 bytes from the end of the device.
>>>

This part I do not understand. How is mkfs.xfs reading the sector?
Is it through open("/dev/pmem0", ...)? With O_DIRECT?
If so, then yes, the inode of /dev/pmem0 is IS_DAX() and the read will
go through the dax.c code. (I think so; which kernel is this?)
Which means this is a bug.

>>> Right - we have to be able to do IO to that last sector, so this is
>>> a sanity check to tell if the block dev is large enough. The XFS
>>> kernel code does the same end-of-device sector read when the
>>> filesystem is mounted, too.
>>>
>>>> bdev_direct_access, receiving this bogus pos/size combo, returns
>>>> -ERANGE:
>>>>
>>>> if ((sector + DIV_ROUND_UP(size, 512)) >
>>>> part_nr_sects_read(bdev->bd_part))
>>>> return -ERANGE;
>>>>
>>>> Given that file systems supporting dax refuse to mount with a blocksize
>>>> != page size, I'm guessing this is sort of expected behavior. However,
>>>> we really shouldn't be breaking direct I/O on pmem devices.
>>>

No, this is a BUG. A buffered or direct read/write to an IS_DAX() inode
should work at any alignment and size, since with DAX buffered and direct
I/O take the exact same code path, and buffered I/O has to handle any size.
This is probably a bug in the DAX handling of the bdev inode. Let me
test this. I will send a fix ASAP.
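
To make the failure concrete, here is a minimal userspace sketch of the
arithmetic in the check quoted above. It is not the kernel code:
range_check() and the 2048-sector device size used in the example are made
up for illustration, and only the comparison itself is taken from the
quoted snippet.

```c
#include <errno.h>
#include <stdint.h>

#define SECTOR_SIZE 512u

/*
 * Hypothetical stand-in for the quoted bdev_direct_access() check.
 * dev_sectors: device size in 512-byte sectors
 * sector:      first sector of the request
 * size:        request length in bytes
 */
static int range_check(uint64_t dev_sectors, uint64_t sector, uint64_t size)
{
	/* Same rounding as DIV_ROUND_UP(size, 512). */
	uint64_t nr = (size + SECTOR_SIZE - 1) / SECTOR_SIZE;

	if (sector + nr > dev_sectors)
		return -ERANGE;
	return 0;
}
```

On a 2048-sector device, a 512-byte request at sector 2047 fits
(2047 + 1 = 2048), but a page-sized 4096-byte request at the same sector
needs 8 sectors and runs past the end, so it gets -ERANGE. That matches the
report: the read of the last sector is valid, the page-sized I/O that DAX
turns it into is not.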
<>
>>> the output of:
>>>
>>> /sys/block/pmem0/queue/logical_block_size
>> 512
>>
>>> /sys/block/pmem0/queue/physical_block_size
>> 512
>>

There is a pending fix for this.
Do you need it sent to stable?

>>> /sys/block/pmem0/queue/hw_sector_size
>> 512
>>
>>> /sys/block/pmem0/queue/minimum_io_size
>> 512
>>
>>> /sys/block/pmem0/queue/optimal_io_size
>> 0
Thanks
Boaz