Message-ID: <554CBE17.4070904@redhat.com>
Date:	Fri, 08 May 2015 09:45:59 -0400
From:	Rik van Riel <riel@...hat.com>
To:	Ingo Molnar <mingo@...nel.org>,
	Dave Hansen <dave.hansen@...ux.intel.com>
CC:	Dan Williams <dan.j.williams@...el.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Boaz Harrosh <boaz@...xistor.com>, Jan Kara <jack@...e.cz>,
	Mike Snitzer <snitzer@...hat.com>, Neil Brown <neilb@...e.de>,
	Benjamin Herrenschmidt <benh@...nel.crashing.org>,
	Heiko Carstens <heiko.carstens@...ibm.com>,
	Chris Mason <clm@...com>, Paul Mackerras <paulus@...ba.org>,
	"H. Peter Anvin" <hpa@...or.com>, Christoph Hellwig <hch@....de>,
	Alasdair Kergon <agk@...hat.com>,
	"linux-nvdimm@...ts.01.org" <linux-nvdimm@...1.01.org>,
	Mel Gorman <mgorman@...e.de>,
	Matthew Wilcox <willy@...ux.intel.com>,
	Ross Zwisler <ross.zwisler@...ux.intel.com>,
	Martin Schwidefsky <schwidefsky@...ibm.com>,
	Jens Axboe <axboe@...nel.dk>, "Theodore Ts'o" <tytso@....edu>,
	"Martin K. Petersen" <martin.petersen@...cle.com>,
	Julia Lawall <Julia.Lawall@...6.fr>, Tejun Heo <tj@...nel.org>,
	linux-fsdevel <linux-fsdevel@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce
 __pfn_t

On 05/07/2015 03:11 PM, Ingo Molnar wrote:

> Stable, global page-struct descriptors are a given for real RAM, where 
> we allocate a struct page for every page in nice, large, mostly linear 
> arrays.
> 
> We'd really need that for pmem too, to get the full power of struct 
> page: and that means allocating them in nice, large, predictable 
> places - such as on the device itself ...
> 
> It might even be 'scattered' across the device, with 64 byte struct 
> page size we can pack 64 descriptors into a single page, so every 65 
> pages we could have a page-struct page.
> 
> Finding a pmem page's struct page would thus involve rounding it 
> modulo 65 and reading that page.
> 
> The problem with that is fourfold:
> 
>  - that we now turn a very kernel internal API and data structure into 
>    an ABI. If struct page grows beyond 64 bytes it's a problem.
> 
>  - on bootup (or device discovery time) we'd have to initialize all 
>    the page structs. We could probably do this in a hierarchical way, 
>    by dividing continuous pmem ranges into power-of-two groups of 
>    blocks, and organizing them like the buddy allocator does.
> 
>  - 1.5% of storage space lost.
> 
>  - will wear-leveling properly migrate these 'hot' pages around?
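
To make the quoted layout concrete, here is a rough sketch of the
index arithmetic. It assumes 4kB pages, 64-byte descriptors, and that
the descriptor page is the first page of each 65-page group. The
placement of the descriptor page and all names (pmem_desc_locate,
struct pmem_desc_loc, GROUP_PAGES) are made up for illustration and
are not from any patch in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative assumptions: 4kB pages, 64-byte descriptors. */
#define PMEM_PAGE_SIZE  4096u
#define PMEM_DESC_SIZE  64u
#define DESCS_PER_PAGE  (PMEM_PAGE_SIZE / PMEM_DESC_SIZE)  /* 64 */
#define GROUP_PAGES     (DESCS_PER_PAGE + 1u)              /* 65 */

/* Where the descriptor for one data page lives on the device. */
struct pmem_desc_loc {
	uint64_t desc_page;  /* device page holding the descriptor */
	uint32_t offset;     /* byte offset inside that page */
};

/*
 * dev_page is the index of a data page on the device. Page 0 of each
 * 65-page group is assumed to be the descriptor page, so it must not
 * be passed in here.
 */
static struct pmem_desc_loc pmem_desc_locate(uint64_t dev_page)
{
	uint64_t group = dev_page / GROUP_PAGES;
	uint32_t slot = (uint32_t)(dev_page % GROUP_PAGES);
	struct pmem_desc_loc loc;

	assert(slot != 0);  /* slot 0 is the descriptor page itself */
	loc.desc_page = group * GROUP_PAGES;
	loc.offset = (slot - 1u) * PMEM_DESC_SIZE;
	return loc;
}
```

With this layout one page in every 65 holds descriptors, which is
where the roughly 1.5% of lost storage space comes from.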

MST and I have been doing some thinking about how to address some of
the issues above.

One way could be to invert the PG_compound logic we have today, by
allocating one struct page for every PMD / THP sized area (2MB on
x86), and dynamically allocating struct pages for the 4kB pages
inside only if the area gets split. They can be freed again when
the area is not being accessed in 4kB chunks.

That way we would always look at the struct page for the 2MB area
first, and if the PG_split bit is set, we look at the array of
dynamically allocated struct pages for this area.
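
A minimal userspace sketch of that two-level lookup, assuming 2MB
areas of 512 4kB pages. The types and helpers (struct huge_page,
struct small_page, lookup_page, area_split, area_merge) and the
PG_SPLIT flag value are hypothetical stand-ins, not existing kernel
code, and calloc/free stand in for whatever allocator would be used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SMALL_SHIFT     12u  /* 4kB pages */
#define HUGE_SHIFT      21u  /* 2MB areas, as on x86 */
#define SMALL_PER_HUGE  (1u << (HUGE_SHIFT - SMALL_SHIFT))  /* 512 */
#define PG_SPLIT        (1u << 0)  /* hypothetical flag bit */

struct small_page {
	uint32_t flags;
};

/* One of these always exists per 2MB area. */
struct huge_page {
	uint32_t flags;
	struct small_page *split;  /* valid only while PG_SPLIT is set */
};

struct lookup_result {
	struct huge_page *huge;
	struct small_page *small;  /* NULL when the area is not split */
};

/* Always start at the 2MB descriptor; indirect only if it is split. */
static struct lookup_result lookup_page(struct huge_page *huge_map,
					uint64_t pfn)
{
	struct lookup_result r;

	assert(huge_map != NULL);
	r.huge = &huge_map[pfn / SMALL_PER_HUGE];
	r.small = NULL;
	if (r.huge->flags & PG_SPLIT)
		r.small = &r.huge->split[pfn % SMALL_PER_HUGE];
	return r;
}

/* Allocate the 512 small descriptors when 4kB access is needed. */
static int area_split(struct huge_page *hp)
{
	if (hp->flags & PG_SPLIT)
		return 0;
	hp->split = calloc(SMALL_PER_HUGE, sizeof(*hp->split));
	if (!hp->split)
		return -1;
	hp->flags |= PG_SPLIT;
	return 0;
}

/* Free them again once the area is no longer used in 4kB chunks. */
static void area_merge(struct huge_page *hp)
{
	free(hp->split);
	hp->split = NULL;
	hp->flags &= ~PG_SPLIT;
}
```

The extra pointer chase in lookup_page is the indirection that 4kB
pages would pay for under this scheme.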

The advantages are obvious: boot time memory overhead and
initialization time are reduced by a factor of 512. CPUs could also
take a whole 2MB area in order to do CPU-local 4kB allocations,
defragmentation policies may become a little clearer, etc...
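
The factor comes straight from the ratio of area sizes (2MB / 4kB =
512). As a back-of-the-envelope check, the sketch below assumes the
64-byte struct page size mentioned in the quoted text;
desc_overhead_bytes is a hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <stdint.h>

#define DESC_BYTES 64u  /* struct page size assumed in the quoted text */

/*
 * Bytes of descriptors needed to cover mem_bytes of memory when one
 * descriptor tracks an area of (1 << area_shift) bytes.
 */
static uint64_t desc_overhead_bytes(uint64_t mem_bytes,
				    unsigned int area_shift)
{
	assert(area_shift < 64);
	return (mem_bytes >> area_shift) * DESC_BYTES;
}
```

For 1TB of memory, desc_overhead_bytes(1ULL << 40, 12) gives 16GB of
descriptors at 4kB granularity, while desc_overhead_bytes(1ULL << 40,
21) gives 32MB at 2MB granularity.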

The disadvantage is pretty obvious too: 4kB pages would no longer
be the fast case, since finding their struct page takes an extra
indirection. I do not know how much of an issue that would be, or
whether it even makes sense for 4kB pages to continue being the
fast case going forward.

Memory trends point in one direction, file size trends in another.

For persistent memory, we would not need 4kB struct pages unless
memory from a particular area was in small files AND those files were
being actively accessed. Large files (mapped in 2MB chunks) or inactive
small files would not need the 4kB struct pages around.

-- 
All rights reversed