Message-ID: <CAPcyv4iwZjmLxtEZWcExB5hCH765NVeNgiiYzKbeefGYDMBHWQ@mail.gmail.com>
Date: Fri, 29 Jan 2016 16:18:22 -0800
From: Dan Williams <dan.j.williams@...el.com>
To: Ross Zwisler <ross.zwisler@...ux.intel.com>,
Christoph Hellwig <hch@...radead.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Alexander Viro <viro@...iv.linux.org.uk>,
Andrew Morton <akpm@...ux-foundation.org>,
Dan Williams <dan.j.williams@...el.com>,
Dave Chinner <david@...morbit.com>, Jan Kara <jack@...e.com>,
Matthew Wilcox <willy@...ux.intel.com>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>,
linux-nvdimm <linux-nvdimm@...1.01.org>
Subject: Re: [PATCH 2/2] dax: fix bdev NULL pointer dereferences

On Fri, Jan 29, 2016 at 3:34 PM, Ross Zwisler
<ross.zwisler@...ux.intel.com> wrote:
> On Fri, Jan 29, 2016 at 11:28:15AM -0700, Ross Zwisler wrote:
>> On Thu, Jan 28, 2016 at 01:38:58PM -0800, Christoph Hellwig wrote:
>> > On Thu, Jan 28, 2016 at 12:35:04PM -0700, Ross Zwisler wrote:
>> > > There are a number of places in dax.c that look up the struct block_device
>> > > associated with an inode. Previously this was done by just using
>> > > inode->i_sb->s_bdev. This is correct for inodes that exist within the
>> > > filesystems supported by DAX (ext2, ext4 & XFS), but when running DAX
>> > > against raw block devices this value is NULL. This causes NULL pointer
>> > > dereferences when these block_device pointers are used.
>> >
>> > It's also wrong for an XFS file system with a RT device..
>> >
>> > > +#define DAX_BDEV(inode) (S_ISBLK(inode->i_mode) ? I_BDEV(inode) \
>> > > + : inode->i_sb->s_bdev)
>> >
>> > .. but this isn't going to fix it. You must use a bdev returned by
>> > get_blocks or a similar file system method.
>>
>> I guess I need to go off and understand if we can have DAX mappings on such a
>> device. If we can, we may have a problem - we can get the block_device from
>> get_block() in I/O path and the various fault paths, but we don't have access
>> to get_block() when flushing via dax_writeback_mapping_range(). We avoid
>> needing it the normal case by storing the sector results from get_block() in
>> the radix tree.
>>
>> /me is off to play with RT devices...
>
> Well, RT devices are completely broken as far as I can see. I've reported the
> breakage to the XFS list. Anything I do that triggers a RT block allocation
> in XFS causes a lockdep splat + a kernel BUG - I've tried regular pwrite(),
> xfs_rtcp and mmap() + write to address. Not a new bug either - happens just
> the same with v4.4. Happens with both PMEM and BRD, and has no relationship
> to whether I'm using DAX or not.
>
> Does it work for this patch to go in as-is since it fixes an immediate OOPS
> with raw block devices + DAX, and when RT devices are alive again I'll figure
> out how to make them work too?

Can we step back and be clear about which lookups should be coming
from get_blocks()? Which ones are critical, and which ones do we just
opportunistically look up for a debug print?

Right now xfs and ext4 are basically disagreeing on whether
get_blocks() reliably sets bh->b_bdev, and checking for a raw
block-device inode in dax_clear_blocks() does not make sense. So this
all seems a bit confused.
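
To make the critical-versus-debug distinction concrete, here is a minimal
userspace sketch, not kernel code. The structs are cut-down stand-ins for
the kernel types, the `rt_bdev` field is a hypothetical way to model a file
on an XFS RT device, and `critical_bdev()` and `debug_bdev()` are
hypothetical helpers that do not exist in dax.c. The two get_block variants
model a filesystem that fills in the buffer_head's `b_bdev` field and one
that leaves it unset.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for kernel types; only the fields discussed here. */
struct block_device { const char *name; };
struct super_block { struct block_device *s_bdev; };
struct inode {
	struct super_block *i_sb;
	struct block_device *rt_bdev;	/* hypothetical: set for a file on an RT device */
};
struct buffer_head { struct block_device *b_bdev; };

typedef int (*get_block_fn)(struct inode *inode, unsigned long block,
			    struct buffer_head *bh);

/* Models a filesystem that reports the device actually backing the block. */
int get_block_sets_bdev(struct inode *inode, unsigned long block,
			struct buffer_head *bh)
{
	(void)block;
	bh->b_bdev = inode->rt_bdev ? inode->rt_bdev : inode->i_sb->s_bdev;
	return 0;
}

/* Models a filesystem that maps the block but leaves b_bdev untouched. */
int get_block_leaves_bdev(struct inode *inode, unsigned long block,
			  struct buffer_head *bh)
{
	(void)inode;
	(void)block;
	(void)bh;
	return 0;
}

/*
 * Hypothetical helper for lookups whose result is used for real I/O:
 * trust only what get_block reported, and return NULL otherwise so the
 * caller has to handle the missing device explicitly.
 */
struct block_device *critical_bdev(struct inode *inode, unsigned long block,
				   get_block_fn get_block)
{
	struct buffer_head bh = { .b_bdev = NULL };

	assert(inode && get_block);
	if (get_block(inode, block, &bh) != 0)
		return NULL;
	return bh.b_bdev;
}

/*
 * Hypothetical helper for debug prints only: fall back to the superblock
 * device as a best guess. The guess is wrong for an RT file, which is
 * why it must not feed a critical path.
 */
struct block_device *debug_bdev(struct inode *inode, unsigned long block,
				get_block_fn get_block)
{
	struct block_device *bdev = critical_bdev(inode, block, get_block);

	return bdev ? bdev : inode->i_sb->s_bdev;
}
```

With an RT-backed inode, `critical_bdev()` returns the RT device only when
get_block fills in `b_bdev`. `debug_bdev()` always returns something, but
for that same inode it may name the wrong device.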