lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <200707060749.08665.nigel@nigel.suspend2.net>
Date:	Fri, 6 Jul 2007 07:49:07 +1000
From:	Nigel Cunningham <nigel@...el.suspend2.net>
To:	"Rafael J. Wysocki" <rjw@...k.pl>
Cc:	nigel@...pend2.net, Pavel Machek <pavel@....cz>,
	Oliver Neukum <oliver@...kum.org>,
	Miklos Szeredi <miklos@...redi.hu>, benh@...nel.crashing.org,
	mjg59@...f.ucam.org, linux-kernel@...r.kernel.org,
	linux-pm@...ts.linux-foundation.org
Subject: Re: [PATCH] Remove process freezer from suspend to RAM pathway

Good morning!

On Thursday 05 July 2007 23:59:57 Rafael J. Wysocki wrote:
> On Thursday, 5 July 2007 15:36, Nigel Cunningham wrote:
> > On Thursday 05 July 2007 23:35:45 Rafael J. Wysocki wrote:
> > > On Thursday, 5 July 2007 14:38, Nigel Cunningham wrote:
> > > > On Thursday 05 July 2007 22:25:06 Rafael J. Wysocki wrote:
> > > > > On Thursday, 5 July 2007 01:45, Pavel Machek wrote:
> > > > > > On Tue 2007-07-03 21:32:20, Oliver Neukum wrote:
> > > > > > > Am Dienstag, 3. Juli 2007 schrieb Miklos Szeredi:
> > > > > > > > > And a further question. The freezer is not atomic. What do 
you 
> > do
> > > > > > > > > if a task not yet frozen calls sys_sync(), but fuse is 
already 
> > > > frozen?
> > > > > > > > 
> > > > > > > > What do you do if a task not yet frozen writes to a pipe, on 
the 
> > other
> > > > > > > > end of which is a task already frozen?
> > > > > > 
> > > > > > There's some difference between uninterruptible and interruptible
> > > > > > sleep I'd say.
> > > > > > 
> > > > > > > > It doesn't matter.  The only thing that should matter during 
> > suspend
> > > > > > > > (not hibernate) is saving the state of devices to ram, and 
putting 
> > the
> > > > > > > > devices to sleep.
> > > > > > > 
> > > > > > > Well, but you did remove sys_sync() from the freezer, which is
> > > > > > > and must be called in the hibernate path.
> > > > > > 
> > > > > > Not "must". In fact, hibernation should be safe without 
sys_sync(). It
> > > > > > is just user un-friendly.
> > > > > 
> > > > > In fact, I'd like to remove the sys_sync() from the freezer 
entirely, 
> > > > because
> > > > > it just doesn't belong in there.
> > > > > 
> > > > > The only advantege of having sys_sync() in freeze_processes() is 
that we
> > > > > have a chance to write out everything when applications cannot 
produce 
> > more
> > > > > data to write, but there are filesystems which don't do that anyway 
(eg. 
> > > > XFS),
> > > > > so generally there's no reason to bother.
> > > > 
> > > > Shouldn't XFS - and fuse - be considered to be broken? Sync should 
sync 
> > data 
> > > > and if XFS isn't doing that, it's wrong.
> > > > 
> > > > In the case of fuse, we should have a mechanism by which fuse 
processes 
> > can be 
> > > > made to sync if they do have any pending I/O, and by which they can be 
> > frozen 
> > > > later than other userspace processes.
> > > > 
> > > > I'd like to see the sync stay, because it improves reliability and 
data 
> > > > integrity in the fail-to-resume case. Calling scripts would probably 
> > invoke 
> > > > sync themselves if they don't already, but that's racy. As it is at 
the 
> > > > moment, we know userspace is stopped, so syncing isn't racy.
> > > 
> > > I'd like to move the sync out of the freezer, but to call it from the
> > > suspend/hibernation code, so that we do
> > > 
> > > sys_sync();
> > > error = freeze_processes();
> > 
> > Yeah, I understand that. The problem then is that you're racing against 
> > userspace. That's not usually a problem, but that doesn't mean it's never 
a 
> > problem. Try running the stress suite while testing hibernating and you'll 
> > see what I mean. If something is submitting lots of I/O when you try to 
> > suspend, your sync call will race against that process if it's not yet 
> > frozen, and its continued activity will make your sync pointless (there'll 
be 
> > more unsynced data when you sys_sync call finishes). Stopping userspace 
> > before syncing removes that race.
> 
> Yes, that will make the suspend/hibernation less reliable in case the resume
> fails (some data, written after the sync, may be lost).  However, the sync 
done
> from within the freezer doesn't guarantee that there are no data lost 
anyway,
> so we don't lose much by not doing it.
> 
> Now, there's a question how much data may be lost, potentially, if we do the
> sync before the freezer and I don't think that's a lot.

You're missing the point. I'm arguing that a sync from within the freezer 
should guarantee that there is no data loss. As I said about, XFS should be 
fixed to properly sync its data, and something should be done about fuse 
filesystems too.

Regards,

Nigel
-- 
See http://www.tuxonice.net for Howtos, FAQs, mailing
lists, wiki and bugzilla info.

Content of type "application/pgp-signature" skipped

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ