Message-ID: <5184D93C.7000806@parallels.com>
Date:	Sat, 04 May 2013 13:47:40 +0400
From:	Pavel Emelyanov <xemul@...allels.com>
To:	Matt Helsley <matthltc@...ux.vnet.ibm.com>
CC:	Andrew Morton <akpm@...ux-foundation.org>,
	Linux MM <linux-mm@...ck.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 4/5] pagemap: Introduce the /proc/PID/pagemap2 file

On 05/02/2013 09:08 PM, Matt Helsley wrote:
> On Thu, Apr 11, 2013 at 03:29:41PM +0400, Pavel Emelyanov wrote:
>> This file is the same as the pagemap one, but shows entries with bits
>> 55-60 being zero (reserved for future use). Next patch will occupy one
>> of them.
> 
> This approach doesn't scale as well as it could. As best I can see
> CRIU would do:
> 
> for each vma in /proc/<pid>/smaps
> 	for each page in /proc/<pid>/pagemap2
> 		if soft dirty bit
> 			copy page
> 
> (possibly with pfn checks to avoid copying the same page mapped in
> multiple locations..)
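
For illustration, the per-page scan quoted above could look roughly like this in user space. This is only a sketch under assumptions: 4 KiB pages, 64-bit entries in native byte order, and bit 55 as the soft-dirty bit (the source says only that the next patch will occupy one of bits 55-60). The function name soft_dirty_addrs is hypothetical, and the input is a buffer already read from the pagemap2 file for one vma.

```python
import struct

PAGE_SIZE = 4096          # assumption: 4 KiB pages
PM_ENTRY_BYTES = 8        # each pagemap entry is one 64-bit word
PM_SOFT_DIRTY = 1 << 55   # assumption: the bit the follow-up patch takes
PM_PRESENT = 1 << 63      # page is present in RAM


def soft_dirty_addrs(pagemap_bytes, start_vaddr):
    """Return virtual addresses of pages whose soft-dirty bit is set.

    pagemap_bytes holds the entries for consecutive pages starting at
    start_vaddr. Every entry is visited, so the cost grows with the
    number of mapped pages, not with the number of changed ones.
    """
    count = len(pagemap_bytes) // PM_ENTRY_BYTES
    entries = struct.unpack(f"={count}Q",
                            pagemap_bytes[:count * PM_ENTRY_BYTES])
    return [start_vaddr + i * PAGE_SIZE
            for i, entry in enumerate(entries)
            if entry & PM_SOFT_DIRTY]
```

The list comprehension is the "for each page" loop from the pseudocode; the copy step is left out.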

Comparing the pfns obtained from two successive pagemap reads doesn't help at
all. If they are equal, either the page is shared or (less likely, but still
possible) the page seen in the 1st pagemap was reclaimed and then mapped into
the 2nd between the two reads. If they differ, the pages are either not shared
(most likely) or shared (the pfns were equal, but the page got reclaimed and
swapped back in).

A better API for detecting page sharing would be nice; such an API could
probably also be reused for user-space KSM :)

> However, if soft dirty bit changes could be queued up (from say the
> fault handler and page table ops that map/unmap pages) and accumulated
> in something like an interval tree it could be something like:
> 
> for each range of changed pages
> 	for each page in range
> 		copy page
> 
> IOW something that scales with the number of changed pages rather
> than the number of mapped pages.
> 
> So I wonder if CRIU would abandon pagemap2 in the future for something
> like this.

We'd surely adopt such an API if one exists. One thing to note is that we'd
also appreciate it if this API could batch the "present" bits as well as the
"swapped" and "page-file" ones. We use these three in CRIU as well, and
scanning for these bits can also be optimized.
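
To make the request concrete, here is a rough user-space sketch of what batched output might look like: runs of consecutive pages with the same present/swapped/page-file state collapsed into one range. It is not a proposed kernel interface. The function names are hypothetical, and the bit positions (63 present, 62 swapped, 61 page-file) are those of the existing pagemap format as I understand it.

```python
from itertools import groupby

PM_FILE = 1 << 61      # "page-file" bit
PM_SWAP = 1 << 62      # "swapped" bit
PM_PRESENT = 1 << 63   # "present" bit


def page_state(entry):
    """Reduce one 64-bit pagemap entry to (present, swapped, file)."""
    return (bool(entry & PM_PRESENT),
            bool(entry & PM_SWAP),
            bool(entry & PM_FILE))


def batch_states(entries):
    """Collapse per-page entries into (first_page, npages, state) ranges.

    Consecutive pages with the same state become a single range, so a
    consumer walks ranges instead of individual pages.
    """
    ranges = []
    index = 0
    for state, run in groupby(entries, key=page_state):
        length = sum(1 for _ in run)
        ranges.append((index, length, state))
        index += length
    return ranges
```

This still reads every entry to build the ranges; the point of an in-kernel API would be to hand back such ranges directly.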

> Cheers,
> 	-Matt Helsley
> 

Thanks,
Pavel
