Message-ID: <20150507173641.GA21781@gmail.com>
Date:	Thu, 7 May 2015 19:36:41 +0200
From:	Ingo Molnar <mingo@...nel.org>
To:	Dan Williams <dan.j.williams@...el.com>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Boaz Harrosh <boaz@...xistor.com>, Jan Kara <jack@...e.cz>,
	Mike Snitzer <snitzer@...hat.com>, Neil Brown <neilb@...e.de>,
	Benjamin Herrenschmidt <benh@...nel.crashing.org>,
	Dave Hansen <dave.hansen@...ux.intel.com>,
	Heiko Carstens <heiko.carstens@...ibm.com>,
	Chris Mason <clm@...com>, Paul Mackerras <paulus@...ba.org>,
	"H. Peter Anvin" <hpa@...or.com>, Christoph Hellwig <hch@....de>,
	Alasdair Kergon <agk@...hat.com>,
	"linux-nvdimm@...ts.01.org" <linux-nvdimm@...ts.01.org>,
	Mel Gorman <mgorman@...e.de>,
	Matthew Wilcox <willy@...ux.intel.com>,
	Ross Zwisler <ross.zwisler@...ux.intel.com>,
	Rik van Riel <riel@...hat.com>,
	Martin Schwidefsky <schwidefsky@...ibm.com>,
	Jens Axboe <axboe@...nel.dk>, Theodore Ts'o <tytso@....edu>,
	"Martin K. Petersen" <martin.petersen@...cle.com>,
	Julia Lawall <Julia.Lawall@...6.fr>, Tejun Heo <tj@...nel.org>,
	linux-fsdevel <linux-fsdevel@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH v2 00/10] evacuate struct page from the block layer,
 introduce __pfn_t


* Dan Williams <dan.j.williams@...el.com> wrote:

> > Anyway, I did want to say that while I may not be convinced about 
> > the approach, I think the patches themselves don't look horrible. 
> > I actually like your "__pfn_t". So while I (very obviously) have 
> > some doubts about this approach, it may be that the most 
> > convincing argument is just in the code.
> 
> Ok, I'll keep thinking about this and come back when we have a 
> better story about passing mmap'd persistent memory around in 
> userspace.

So is there anything fundamentally wrong about creating struct page 
backing at mmap() time (and making sure aliased mmaps share struct 
page arrays)?

Because if that is done, then the DMA agent won't even know about the 
memory being persistent RAM. It's just a regular struct page, that 
happens to point to persistent RAM. Same goes for all the high level 
VM APIs, futexes, etc. Everything will Just Work.

It will also be cheap enough: mmap() is a slowpath, comparatively, so 
setting up the struct page backing there should not hurt.

As far as RAID is concerned: that's a relatively easy situation, as 
there's only a single user of the devices, the RAID context that 
manages all component devices exclusively. Device to device DMA can 
use the block layer directly, i.e. most of the patches you've got here 
in this series, except:

  [PATCH v2 09/10] dax: convert to __pfn_t

I think DAX mmap()s need struct page backing.

I think there's a simple rule: if a page is visible to user-space via 
the MMU then it needs struct page backing. If it's "hidden", like 
behind a RAID abstraction, it probably doesn't.
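
To make the rule above concrete, here is a minimal sketch of it as a 
predicate. Everything named demo_* is hypothetical and invented for 
illustration; none of it is kernel API or part of the patch series, and 
the "hidden" cases return false only in the "probably doesn't" sense:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical classification of how a persistent-memory range is used. */
enum demo_pmem_use {
	DEMO_USE_RAID_INTERNAL,	/* hidden behind a RAID abstraction */
	DEMO_USE_BLOCK_DMA,	/* device to device DMA via the block layer */
	DEMO_USE_USER_MMAP,	/* visible to user-space via the MMU */
};

/* Hypothetical descriptor of a persistent-memory range. */
struct demo_pmem_range {
	unsigned long start_pfn;
	unsigned long nr_pages;
	enum demo_pmem_use use;
};

/*
 * The rule from the mail: MMU-visible to user-space means struct page
 * backing is needed; ranges that stay hidden probably do not need it.
 */
static bool demo_needs_struct_page(const struct demo_pmem_range *r)
{
	assert(r != NULL);
	return r->use == DEMO_USE_USER_MMAP;
}
```

With this, a range behind RAID would skip the struct page allocation, 
while a user-space mmap() of the same pfns would trigger it.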

With the remaining patches a high level RAID driver ought to be able 
to send pfn-to-sector and sector-to-pfn requests to other block 
drivers, without any unnecessary struct page allocation overhead, 
right?

As long as the pfn concept remains a clever way to reuse our 
ram<->sector interfaces to implement sector<->sector IO, in the cases 
where the IO has no serialization or MMU concerns, not using struct 
page and using __pfn_t looks natural.
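
As a rough sketch of what such a struct-page-free request could look 
like: the descriptor below names its memory side by pfn only. All 
demo_* names, the 4K page size and the 512-byte sector size are 
assumptions made for this example; this is not the __pfn_t from the 
series and not the block layer's real request format:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_PAGE_SHIFT		12	/* assumed 4K pages */
#define DEMO_SECTOR_SHIFT	9	/* assumed 512-byte sectors */

/* Hypothetical pfn <-> sector transfer; note: no struct page pointer. */
struct demo_pfn_io {
	unsigned long pfn;		/* memory side, by page frame number */
	unsigned long long sector;	/* storage side, by sector number */
	unsigned int nr_sectors;	/* transfer length in sectors */
	bool pfn_to_sector;		/* direction: true means pfn -> sector */
};

/* Physical address of the memory side, computed from the pfn alone. */
static unsigned long long demo_io_phys_addr(const struct demo_pfn_io *io)
{
	assert(io != NULL);
	return (unsigned long long)io->pfn << DEMO_PAGE_SHIFT;
}

/* Transfer length in bytes. */
static unsigned long long demo_io_bytes(const struct demo_pfn_io *io)
{
	assert(io != NULL);
	return (unsigned long long)io->nr_sectors << DEMO_SECTOR_SHIFT;
}
```

Nothing in this descriptor needs per-page state, which is why it only 
fits IO that has no serialization or MMU concerns.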

The moment it starts reaching user space APIs, like in the DAX case, 
and especially if it becomes user-MMU visible, it's a mistake to not 
have struct page backing, I think.

(In that sense the current DAX mmap() code is already a partial 
mistake.)

Thanks,

	Ingo