Message-ID: <20101104160428.GA10656@sundance.ccs.neu.edu>
Date: Thu, 4 Nov 2010 12:04:28 -0400
From: Gene Cooperman <gene@....neu.edu>
To: Tejun Heo <tj@...nel.org>
Cc: Nathan Lynch <ntl@...ox.com>, Christoph Hellwig <hch@....de>,
Oren Laadan <orenl@...columbia.edu>,
ksummit-2010-discuss@...ts.linux-foundation.org,
linux-kernel@...r.kernel.org, kapil@....neu.edu, gene@....neu.edu
Subject: Re: [Ksummit-2010-discuss] checkpoint-restart: naked patch
Yes, we are working with Condor to have them validate DMTCP. Time will tell.
- Gene
On Thu, Nov 04, 2010 at 08:36:16AM +0100, Tejun Heo wrote:
> Hello,
>
> On 11/04/2010 02:47 AM, Nathan Lynch wrote:
> >> In this case whitelisting the allowed
> >> state by requiring special APIs for all I/O (or even just standard
> >> APIs as long as they are supported by the C/R lib you're linked against)
> >> is the more pragmatic, and I think faithful approach.
> >
> > I don't think users will go for it. They'll continue to use dodgy
> > out-of-tree kernel modules and/or LD_PRELOAD hacks instead of porting
> > their applications to a new library. I think a C/R library is an
> > "ideal" solution, but it's one that nobody would use - especially in
> > HPC, unless the library somehow provides better performance.
>
> I hear that there are plans to integrate one of the userland
> snapshotting implementations with an HPC workload manager. ISTR the
> combination to be condor + dmtcp but not sure. I think things like
> that make a lot of sense. Scientists writing programs for HPC
> clusters already work in given frameworks and what those applications
> do and how to recover are pretty well confined/defined. If you
> integrate snapshotting with such frameworks, it becomes pretty easy
> for both the admins and users.
>
> I'll talk about other issues in the reply to Oren's email.
>
> Thanks.
>
> --
> tejun