Message-ID: <20150507090217.GA4467@gmail.com>
Date:	Thu, 7 May 2015 11:02:17 +0200
From:	Ingo Molnar <mingo@...nel.org>
To:	Dan Williams <dan.j.williams@...el.com>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Boaz Harrosh <boaz@...xistor.com>, Jan Kara <jack@...e.cz>,
	Mike Snitzer <snitzer@...hat.com>, Neil Brown <neilb@...e.de>,
	Benjamin Herrenschmidt <benh@...nel.crashing.org>,
	Dave Hansen <dave.hansen@...ux.intel.com>,
	Heiko Carstens <heiko.carstens@...ibm.com>,
	Chris Mason <clm@...com>, Paul Mackerras <paulus@...ba.org>,
	"H. Peter Anvin" <hpa@...or.com>, Christoph Hellwig <hch@....de>,
	Alasdair Kergon <agk@...hat.com>,
	"linux-nvdimm@...ts.01.org" <linux-nvdimm@...ts.01.org>,
	Mel Gorman <mgorman@...e.de>,
	Matthew Wilcox <willy@...ux.intel.com>,
	Ross Zwisler <ross.zwisler@...ux.intel.com>,
	Rik van Riel <riel@...hat.com>,
	Martin Schwidefsky <schwidefsky@...ibm.com>,
	Jens Axboe <axboe@...nel.dk>, Theodore Ts'o <tytso@....edu>,
	"Martin K. Petersen" <martin.petersen@...cle.com>,
	Julia Lawall <Julia.Lawall@...6.fr>, Tejun Heo <tj@...nel.org>,
	linux-fsdevel <linux-fsdevel@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH v2 00/10] evacuate struct page from the block layer,
 introduce __pfn_t


* Dan Williams <dan.j.williams@...el.com> wrote:

> > What is the primary thing that is driving this need? Do we have a 
> > very concrete example?
> 
> My pet concrete example is covered by __pfn_t.  Referencing 
> persistent memory in an md/dm hierarchical storage configuration.  
> Setting aside the thrash to get existing block users to do 
> "bvec_set_page(page)" instead of "bvec->page = page" the onus is on 
> that md/dm implementation and backing storage device driver to 
> operate on __pfn_t.  That use case is simple because there is no use 
> of page locking or refcounting in that path, just dma_map_page() and 
> kmap_atomic().  The more difficult use case is precisely what Al 
> picked up on, O_DIRECT and RDMA.  This patchset does nothing to 
> address those use cases outside of not needing a struct page when 
> they eventually craft a bio.
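
The "bvec_set_page(page)" versus "bvec->page = page" point quoted above can be illustrated with a minimal userspace sketch. It assumes a simplified model: struct bvec_model, bvec_model_set_page() and bvec_model_page() are hypothetical names, and struct page here is a stand-in, not the kernel's definition. The idea is only that once callers go through accessors, the field behind them can later change representation without touching every user.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct page. */
struct page {
	int id;
};

/* Simplified model of a bio vector; the real struct bio_vec differs. */
struct bvec_model {
	struct page *page;	/* representation detail hidden behind accessors */
	unsigned int len;
	unsigned int offset;
};

/* Callers use this instead of writing bv->page directly. */
static inline void bvec_model_set_page(struct bvec_model *bv, struct page *page)
{
	assert(bv != NULL);
	bv->page = page;
}

/* Read side of the same encapsulation. */
static inline struct page *bvec_model_page(const struct bvec_model *bv)
{
	assert(bv != NULL);
	return bv->page;
}
```

With this shape, only the two helpers need to change if the member stops being a plain struct page pointer.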

So why not do a dual approach?

There are code paths where the 'pfn' of a persistent device is mostly 
used as a sector_t equivalent of terabytes of storage, not as an index 
of a memory object.

It's not an address to a cache, it's an index into a huge storage 
space - which happens to be (flash) RAM. For them using pfn_t seems 
natural and using struct page * is a strained (not to mention 
expensive) model.

For more complex facilities, where persistent memory is used as a 
memory object, especially where the underlying device is true, 
infinitely writable RAM (not flash), treating it as a memory zone, or 
setting up a dynamic struct page would be the natural approach (with 
the inevitable cost of setup/teardown in the latter case).

I'd say that for anything where the dynamic struct page is torn down 
unconditionally after completion of only a single use, the natural API 
is probably pfn_t, not struct page. Any synchronization is already 
handled at the block request layer, and it's storage op 
synchronization, not memory access synchronization really.

For anything more complex, that maps any of this storage to 
user-space, or exposes it to higher level struct page based APIs, 
etc., where references matter and it's more of a cache with 
potentially multiple users, not an IO space, the natural API is struct 
page.
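
As a rough illustration of the dual approach described above, here is a minimal userspace sketch, assuming a tagged value that carries either a bare frame number or a page pointer. All names (struct mem_ref, ref_from_pfn(), ref_from_page(), ref_get()) are hypothetical, struct page is a stand-in, and this is not the layout of the __pfn_t type from the patch series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct page; not the real layout. */
struct page {
	int refcount;
};

/* Which view of persistent memory a value represents. */
enum ref_kind {
	REF_PFN,	/* "IO space": an index, much like sector_t */
	REF_PAGE	/* "memory space": a refcounted memory object */
};

struct mem_ref {
	enum ref_kind kind;
	union {
		uint64_t pfn;
		struct page *page;
	} u;
};

static inline struct mem_ref ref_from_pfn(uint64_t pfn)
{
	struct mem_ref r = { .kind = REF_PFN, .u.pfn = pfn };
	return r;
}

static inline struct mem_ref ref_from_page(struct page *page)
{
	assert(page != NULL);
	struct mem_ref r = { .kind = REF_PAGE, .u.page = page };
	return r;
}

/*
 * Take a reference only when a page backs the value. A bare pfn has
 * nothing to refcount; in this model its lifetime is assumed to be
 * covered by the storage request that carries it.
 */
static inline bool ref_get(struct mem_ref *r)
{
	if (r->kind != REF_PAGE)
		return false;
	r->u.page->refcount++;
	return true;
}
```

In this model the single-use IO path would only ever see REF_PFN values and never pay for page setup or refcounting, while paths that map the storage to user-space would carry REF_PAGE values.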

I'd say that this particular series mostly addresses the 'pfn as 
sector_t' side of the equation, where persistent memory is IO space, 
not memory space, and as such it is the more natural and thus also the 
cheaper/faster approach.

Linus probably disagrees? :-)

Thanks,

	Ingo