Date:	Fri, 6 Jul 2007 09:40:22 +0200
From:	"Rafael J. Wysocki" <rjw@...k.pl>
To:	nigel@...pend2.net
Cc:	Pavel Machek <pavel@....cz>, Oliver Neukum <oliver@...kum.org>,
	Miklos Szeredi <miklos@...redi.hu>, benh@...nel.crashing.org,
	mjg59@...f.ucam.org, linux-kernel@...r.kernel.org,
	linux-pm@...ts.linux-foundation.org
Subject: Re: [PATCH] Remove process freezer from suspend to RAM pathway

Hi,

On Thursday, 5 July 2007 23:49, Nigel Cunningham wrote:
> Good morning!
> 
> On Thursday 05 July 2007 23:59:57 Rafael J. Wysocki wrote:
> > On Thursday, 5 July 2007 15:36, Nigel Cunningham wrote:
> > > On Thursday 05 July 2007 23:35:45 Rafael J. Wysocki wrote:
> > > > On Thursday, 5 July 2007 14:38, Nigel Cunningham wrote:
> > > > > On Thursday 05 July 2007 22:25:06 Rafael J. Wysocki wrote:
> > > > > > On Thursday, 5 July 2007 01:45, Pavel Machek wrote:
> > > > > > > On Tue 2007-07-03 21:32:20, Oliver Neukum wrote:
> > > > > > > > Am Dienstag, 3. Juli 2007 schrieb Miklos Szeredi:
> > > > > > > > > > And a further question. The freezer is not atomic. What do
> > > > > > > > > > you do if a task not yet frozen calls sys_sync(), but fuse
> > > > > > > > > > is already frozen?
> > > > > > > > > 
> > > > > > > > > What do you do if a task not yet frozen writes to a pipe, on
> > > > > > > > > the other end of which is a task already frozen?
> > > > > > > 
> > > > > > > There's some difference between uninterruptible and interruptible
> > > > > > > sleep I'd say.
> > > > > > > 
> > > > > > > > > It doesn't matter.  The only thing that should matter during
> > > > > > > > > suspend (not hibernate) is saving the state of devices to ram,
> > > > > > > > > and putting the devices to sleep.
> > > > > > > > 
> > > > > > > > Well, but you did remove sys_sync() from the freezer, which is
> > > > > > > > and must be called in the hibernate path.
> > > > > > > 
> > > > > > > Not "must". In fact, hibernation should be safe without
> > > > > > > sys_sync(). It is just user un-friendly.
> > > > > > 
> > > > > > In fact, I'd like to remove the sys_sync() from the freezer
> > > > > > entirely, because it just doesn't belong in there.
> > > > > > 
> > > > > > The only advantage of having sys_sync() in freeze_processes() is
> > > > > > that we have a chance to write out everything when applications
> > > > > > cannot produce more data to write, but there are filesystems which
> > > > > > don't do that anyway (e.g. XFS), so generally there's no reason to
> > > > > > bother.
> > > > > 
> > > > > Shouldn't XFS - and fuse - be considered to be broken? Sync should
> > > > > sync data and if XFS isn't doing that, it's wrong.
> > > > > 
> > > > > In the case of fuse, we should have a mechanism by which fuse
> > > > > processes can be made to sync if they do have any pending I/O, and
> > > > > by which they can be frozen later than other userspace processes.
> > > > > 
> > > > > I'd like to see the sync stay, because it improves reliability and
> > > > > data integrity in the fail-to-resume case. Calling scripts would
> > > > > probably invoke sync themselves if they don't already, but that's
> > > > > racy. As it is at the moment, we know userspace is stopped, so
> > > > > syncing isn't racy.
> > > > 
> > > > I'd like to move the sync out of the freezer, but to call it from the
> > > > suspend/hibernation code, so that we do
> > > > 
> > > > sys_sync();
> > > > error = freeze_processes();
> > > 
> > > Yeah, I understand that. The problem then is that you're racing against
> > > userspace. That's not usually a problem, but that doesn't mean it's never
> > > a problem. Try running the stress suite while testing hibernating and
> > > you'll see what I mean. If something is submitting lots of I/O when you
> > > try to suspend, your sync call will race against that process if it's not
> > > yet frozen, and its continued activity will make your sync pointless
> > > (there'll be more unsynced data when your sys_sync call finishes).
> > > Stopping userspace before syncing removes that race.
> > 
> > Yes, that will make the suspend/hibernation less reliable in case the resume
> > fails (some data, written after the sync, may be lost).  However, the sync
> > done from within the freezer doesn't guarantee that there are no data lost
> > anyway, so we don't lose much by not doing it.
> > 
> > Now, there's a question how much data may be lost, potentially, if we do the
> > sync before the freezer and I don't think that's a lot.
> 
> You're missing the point. I'm arguing that a sync from within the freezer 
> should guarantee that there is no data loss.

Well, it should, but it doesn't ...

Moreover, if FUSE implements syncing, then the sync from within the freezer
will almost certainly deadlock.
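
To make the trade-off concrete, here is a toy user-space model of the two
orderings discussed above. It is only a sketch, not kernel code: the names
model_suspend, struct outcome, rate, ticks and daemon_backed are all made up
for illustration, and the "blocked" case simply assumes that a sync on a
daemon-backed filesystem (as FUSE would be if it implemented syncing) cannot
finish while its daemon is frozen.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names exist in the kernel. */
enum order { SYNC_THEN_FREEZE, FREEZE_THEN_SYNC };

struct outcome {
	int  dirty_left;   /* unsynced units when suspend proceeds; not
			    * meaningful if sync_blocked is set */
	bool sync_blocked; /* sync is waiting on a frozen userspace daemon */
};

/*
 * rate:          units a writer dirties per tick while it still runs
 * ticks:         ticks the writer keeps running after the sync, until
 *                the freezer finally stops it
 * daemon_backed: the filesystem needs a userspace daemon to finish a sync
 */
struct outcome model_suspend(enum order o, int rate, int ticks,
			     bool daemon_backed)
{
	struct outcome r = { 0, false };

	assert(rate >= 0 && ticks >= 0);

	if (o == SYNC_THEN_FREEZE) {
		/* Everything is still running, so the sync completes, but
		 * the writer keeps dirtying data until it is frozen. */
		r.dirty_left = rate * ticks;
	} else {
		/* Writers are stopped first, so nothing new is dirtied... */
		r.dirty_left = 0;
		/* ...but a frozen daemon can never answer the sync request. */
		r.sync_blocked = daemon_backed;
	}
	return r;
}
```

In this model, model_suspend(SYNC_THEN_FREEZE, 3, 5, false) leaves dirty_left
at 15, while model_suspend(FREEZE_THEN_SYNC, 3, 5, true) reports sync_blocked,
which is the case I'm worried about.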

> As I said above, XFS should be fixed to properly sync its data, and
> something should be done about fuse filesystems too.

I think that we can't do anything about it, so we should just live with it.

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth