lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Tue, 2 Feb 2016 16:33:16 -0800
From:	Jared Hulbert <jaredeh@...il.com>
To:	Dan Williams <dan.j.williams@...el.com>
Cc:	Al Viro <viro@...iv.linux.org.uk>,
	Ross Zwisler <ross.zwisler@...ux.intel.com>,
	Jeff Layton <jlayton@...chiereds.net>,
	linux-nvdimm <linux-nvdimm@...1.01.org>,
	Dave Chinner <david@...morbit.com>,
	LKML <linux-kernel@...r.kernel.org>,
	XFS Developers <xfs@....sgi.com>,
	"J. Bruce Fields" <bfields@...ldses.org>, Jan Kara <jack@...e.com>,
	Linux FS Devel <linux-fsdevel@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH] dax: allow DAX to look up an inode's block device

On Tue, Feb 2, 2016 at 3:41 PM, Dan Williams <dan.j.williams@...el.com> wrote:
> On Tue, Feb 2, 2016 at 3:36 PM, Jared Hulbert <jaredeh@...il.com> wrote:
>> On Tue, Feb 2, 2016 at 3:19 PM, Al Viro <viro@...iv.linux.org.uk> wrote:
>>>
>>> On Tue, Feb 02, 2016 at 04:11:42PM -0700, Ross Zwisler wrote:
>>>
>>> > However, for raw block devices and for XFS with a real-time device, the
>>> > value in inode->i_sb->s_bdev is not correct.  With the code as it is
>>> > currently written, an fsync or msync to a DAX enabled raw block device will
>>> > cause a NULL pointer dereference kernel BUG.  For this to work correctly we
>>> > need to ask the block device or filesystem what struct block_device is
>>> > appropriate for our inode.
>>> >
>>> > To that end, add a get_bdev(struct inode *) entry point to struct
>>> > super_operations.  If this function pointer is non-NULL, this notifies DAX
>>> > that it needs to use it to look up the correct block_device.  If
>>> > i_sb->get_bdev() is NULL DAX will default to inode->i_sb->s_bdev.
>>>
>>> Umm...  It assumes that bdev will stay pinned for as long as inode is
>>> referenced, presumably?  If so, that needs to be documented (and verified
>>> for existing fs instances).  In principle, multi-disk fs might want to
>>> support things like "silently move the inodes backed by that disk to other
>>> ones"...
>>
>> Dan, This is exactly the kind of thing I'm taking about WRT the
>> weirder device models and directly calling bdev_direct_access().
>> Filesystems don't have the monogamous relationship with a device that
>> is implicitly assumed in DAX, you have to ask the filesystem what the
>> relationship is and is migrating to, and allow the filesystem to
>> update DAX when the relationship is changing.
>
> That's precisely what ->get_bdev() does.  When the answer
> inode->i_sb->s_bdev lookup is invalid, use ->get_bdev().
>
>> As we start to see many
>> DIMM's and 10s TiB pmem systems this is going be an even bigger deal
>> as load balancing, wear leveling, and fault tolerance concerned are
>> inevitably driven by the filesystem.
>
> No, there are no plans on the horizon for an fs to manage these media
> specific concerns for persistent memory.

So the filesystem is now directly in charge of mapping user pages to
physical memory.  The filesystem is effectively bypassing NUMA and
zones and all that stuff that tries to balance memory bus and QPI
traffic etc.  You don't think the filesystem will therefore be in
charge of memory bus hotspots?

Alright.  We can just agree to disagree on that point.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ