Message-ID: <20111024100408.0627ac02@notabene.brown>
Date:	Mon, 24 Oct 2011 10:04:08 +1100
From:	NeilBrown <neilb@...e.de>
To:	"Rafael J. Wysocki" <rjw@...k.pl>
Cc:	Alan Stern <stern@...land.harvard.edu>,
	John Stultz <john.stultz@...aro.org>,
	mark gross <markgross@...gnar.org>,
	Linux PM list <linux-pm@...r.kernel.org>,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: lsusd - The Linux SUSpend Daemon

On Sun, 23 Oct 2011 14:48:22 +0200 "Rafael J. Wysocki" <rjw@...k.pl> wrote:

> On Sunday, October 23, 2011, NeilBrown wrote:
> > On Fri, 21 Oct 2011 22:00:13 -0400 (EDT) Alan Stern
> > <stern@...land.harvard.edu> wrote:
> > 
> > > On Sat, 22 Oct 2011, NeilBrown wrote:
> > > 
> > > > > >     It uses files in /var/run/suspend for all communication.
> > > > > 
> > > > > I'm not so keen on using files for communication.  At best, they are
> > > > > rather awkward for two-way messaging.  If you really want to use them,
> > > > > then at least put them on a non-backed filesystem, like something under
> > > > > /dev.
> > > > 
> > > > Isn't /var/run a tmpfs filesystem?  It should be.
> > > > Surely /run is, so in the new world order the files should probably go
> > > > there.   But that is just a detail.
> > > 
> > > On my Fedora-14 systems there is no /run, and /var/run is a regular 
> > > directory in a regular filesystem.
> > > 
> > > > I like files...  I particularly like 'flock' to block suspend.   The
> > > > rest.... whatever..
> > > > With files, you only need a context switch when there is real communication.
> > > > With sockets, every message sent must be read so there will be a context
> > > > switch.
> > > > 
> > > > Maybe we could do something with futexes...
> > > 
> > > Not easily -- as far as I can tell, futexes enjoy relatively little 
> > > support.  In any case, they provide the same service as a mutex, which 
> > > means you'd have to build a shared lock on top of them.
> > > 
> > > > > >     lsusd does not try to be event-loop based because:
> > > > > >       - /sys/power/wakeup_count is not pollable.  This could probably be
> > > > > >         'fixed' but I want code to work with today's kernel.  It will probably
> > > > > 
> > > > > Why does this matter?
> > > > 
> > > > In my mind an event based program should never block.  Every action should be
> > > > non-blocking and only taken when 'poll' says it can.
> > > > > /sys/power/wakeup_count can be read non-blocking, but you cannot find
> > > > > out when it is sensible to try to read it again.  So it doesn't fit.
> > > 
> > > There shouldn't be any trouble about making wakeup_count pollable.  It
> > > also would need to respect nonblocking reads, which it currently does 
> > > not do.
> > 
> > Hmm.. you are correct.  I wonder why I thought it did support non-blocking
> > reads...
> > I guess it was the code for handling an interrupted system call.
> > 
> > I feel a bit uncomfortable with the idea of sysfs files that block but I
> > don't think I can convincingly argue against it.
> > A non-blocking flag could be passed in, but it would be a very messy change -
> > lots of function call signatures changing needlessly:  we would need a flag
> > to the 'show' method ... or add a 'show_nonblock' method which would also be
> > ugly.
> > 
> > 
> > But I think there is a need to block - if there is an in-progress event then
> > it must be possible to wait for it to complete as it may not be visible to
> > userspace until then.
> > We could easily enable 'poll' for wakeup_count and then make it always
> > non-blocking, but I'm not really sure I want to require programs to use poll,
> > only to allow them.  And without using poll there is no way to wait.
> > 
> > As wakeup_count really has to be single-access we could possibly fudge
> > something by remembering the last value read (like we remember the last value
> > written).
> > 
> > - if the current count is different from the last value read, then return
> >   it even if there are in-progress events.
> > - if the current count is the same as the last value read, then block until
> >   there are no in-progress events and return the new value.
> > - enable sysfs_poll on wakeup_count by calling sysfs_notify_dirent at the
> >   end of wakeup_source_deactivated .... or calling something in
> >   kernel/power/main.c which calls that.  However we would need to make
> >   sysfs_notify_dirent a lot lighter weight first.  Maybe I should do that.
> > 
> > Then a process that uses 'poll' could avoid reading wakeup_count except when
> > it has changed, and then it won't block.  And a process that doesn't use poll
> > can block by simply reading twice - either explicitly or by going around a
> > "read, then write, and it fails" loop a second time.
> > 
> > I'm not sure I'm completely comfortable with that, but it is the best I could
> > come up with.
> 
> Well, you're now considering doing more and more changes to the kernel
> just to be able to implement something in user space to avoid making
> some _other_ changes to the kernel.  That doesn't sound right to me.

:-)   I thought I might get challenged on something like that.

I think the cases are different though.

I'm not presenting this code as a new feature.  I don't need new features -
I have user-space code which works correctly with the current kernel features.

However the precise usage of wakeup_count is a little unusual in that it
blocks when you read.  That doesn't mean that it cannot be used correctly,
but it might limit the options available to a user-space program which wants
to use it.   I was just looking at ways to generalise the existing interface
so that it matches the rest of the kernel better.  I see it much more as a
bug fix than as a new feature.
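
For readers who have not used the interface, here is a minimal sketch of the
read-then-write loop as it works with the current kernel.  It is not the lsusd
code; the helper names (read_count, write_count, try_suspend) and the retry
limit are made up for illustration.  Only the two sysfs paths and the "mem"
keyword are real.  The paths are parameters so the logic can be tried against
ordinary files.

```python
WAKEUP_COUNT = "/sys/power/wakeup_count"  # sysfs file discussed in this thread
STATE = "/sys/power/state"                # writing "mem" here requests suspend


def read_count(path=WAKEUP_COUNT):
    """Read the current wakeup count as a string.

    On a real kernel this read may block while wakeup events are
    still in progress.
    """
    with open(path, "r") as f:
        return f.read().strip()


def write_count(count, path=WAKEUP_COUNT):
    """Write the count back; return False if the write is refused.

    The kernel rejects the write when the count has changed since it
    was read, which shows up here as an OSError.
    """
    try:
        with open(path, "w") as f:
            f.write(count)
        return True
    except OSError:
        return False


def try_suspend(max_attempts=5, count_path=WAKEUP_COUNT, state_path=STATE):
    """Hypothetical helper: repeat read/write until the count is stable."""
    for _ in range(max_attempts):
        count = read_count(count_path)
        # A real daemon would check its user-space suspend blockers here.
        if write_count(count, count_path):
            with open(state_path, "w") as f:
                f.write("mem")
            return True
        # The write failed: an event arrived, so go around the loop again.
    return False
```

The blocking read is what lets a program that does not use poll wait for an
in-progress event; the failed write is what sends it around the loop again.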

I'm not saying we need this patch, and I'm not even sure I like it.  I just
presented it as part of exploring exactly how the wakeup_count interface
really works.  It is an interface that I like and that does allow the
original suspend-race problem to be solved, but that does not mean it is
necessarily perfect.
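
To make the proposed read rule from the quoted text concrete, here is a toy
user-space model of it.  This is not kernel code and not a patch: the class
and its method names are invented, events are assumed to be counted when they
complete, and "would block" is represented by returning None.

```python
class WakeupCountModel:
    """Toy model of the proposed 'remember the last value read' rule."""

    def __init__(self):
        self.count = 0         # completed wakeup events (simplifying assumption)
        self.in_progress = 0   # events currently being processed
        self.last_read = None  # last value handed to a reader

    def event_start(self):
        self.in_progress += 1

    def event_end(self):
        self.in_progress -= 1
        self.count += 1

    def read(self):
        """Return the count, or None where a real read would block."""
        if self.count == self.last_read and self.in_progress > 0:
            # Same value as last time and events still pending: block.
            return None
        # Either the count changed, or nothing is in progress: return it.
        self.last_read = self.count
        return self.count


m = WakeupCountModel()
first = m.read()    # nothing in progress, so the count is returned
m.event_start()
second = m.read()   # unchanged count with an event pending: would block
m.event_end()
third = m.read()    # count has changed, so it is returned at once
```

A reader that polls would only read after a change and so never hits the None
case; a reader that does not poll gets the blocking behaviour on its second
read.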

Thanks,
NeilBrown

