Message-ID: <20101105171703.GA1760@sundance.ccs.neu.edu>
Date:	Fri, 5 Nov 2010 13:17:03 -0400
From:	Gene Cooperman <gene@....neu.edu>
To:	"Luck, Tony" <tony.luck@...el.com>
Cc:	Kapil Arya <kapil@....neu.edu>,
	Oren Laadan <orenl@...columbia.edu>,
	"ksummit-2010-discuss@...ts.linux-foundation.org" 
	<ksummit-2010-discuss@...ts.linux-foundation.org>,
	Gene Cooperman <gene@....neu.edu>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [Ksummit-2010-discuss] checkpoint-restart: naked patch

On Fri, Nov 05, 2010 at 04:57:33AM -0700, Luck, Tony wrote:
> > Oren noted that sometimes it's important to stop the process only
> > for a few milliseconds while one checkpoints. In DMTCP, we do that
> > by configuring with --enable-forked-checkpointing. This causes us
> > to fork a child process taking advantage of copy-on-write and then
> > checkpoint the memory pages of the child while the parent continues
> > to execute.
> 
> Interesting ... but while the process is only stopped for the duration
> of the fork, it may be taking COW faults on almost every page it
> touches.  I think this will not work well for large HPC applications
> that allocate most of physical memory as anonymous pages for the
> application. It may even result in an OOM kill if you don't complete
> the checkpoint of the child and have it exit in a timely manner.
> 
> -Tony
> 

I agree with you that forked checkpointing is probably not what you
want in the middle of an HPC computation.  But isn't that part of
the nature of COW?  Whether the COW is invoked within the kernel,
or from outside the kernel via fork --- in either case, when you have
mostly dirty pages, you will have to copy most of the pages.
Do I understand your point correctly?			Thanks,
							- Gene
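
To make the mechanism under discussion concrete, here is a minimal sketch of fork-based checkpointing in Python. It is not DMTCP's implementation; the names `forked_checkpoint` and `demo`, the 1 MiB buffer, and the temporary file are all assumptions made for illustration. The child sees the parent's memory as it was at fork() and writes it out, while the parent keeps running and overwrites the same buffer.

```python
import os
import tempfile


def forked_checkpoint(state, path):
    """Fork a child that writes `state` to `path`; return the child's pid.

    Hypothetical helper for illustration only, not DMTCP's API.
    """
    pid = os.fork()
    if pid == 0:
        # Child: its address space is a copy-on-write view of the parent's
        # memory as of the fork(), so `state` is frozen from its perspective.
        status = 1
        try:
            with open(path, "wb") as f:
                f.write(state)
            status = 0
        finally:
            # Exit without running the parent's cleanup handlers.
            os._exit(status)
    return pid


def demo():
    """Checkpoint a buffer, mutate it in the parent, return both versions."""
    state = bytearray(b"A" * (1 << 20))  # stand-in for application memory
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        pid = forked_checkpoint(state, path)
        # Parent continues right away. Each page it dirties while the child
        # is alive must be copied by the kernel.
        state[:] = b"B" * len(state)
        _, wait_status = os.waitpid(pid, 0)
        if not (os.WIFEXITED(wait_status) and os.WEXITSTATUS(wait_status) == 0):
            raise RuntimeError("checkpoint child failed")
        with open(path, "rb") as f:
            snapshot = f.read()
    finally:
        os.unlink(path)
    return snapshot, bytes(state)


if __name__ == "__main__":
    snapshot, current = demo()
    print(snapshot[:4], current[:4])
```

The parent here dirties every page of the buffer while the child is still alive, which is the worst case raised in the thread: until the child exits, both the old and the new copy of each touched page have to stay resident.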