lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Tue, 23 Aug 2011 16:43:29 +0100
From:	Russell King - ARM Linux <linux@....linux.org.uk>
To:	Mark Brown <broonie@...nsource.wolfsonmicro.com>
Cc:	Tejun Heo <tj@...nel.org>, "Rafael J. Wysocki" <rjw@...k.pl>,
	linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org
Subject: Re: try_to_freeze() called with IRQs disabled on ARM

On Tue, Aug 23, 2011 at 04:19:36PM +0100, Mark Brown wrote:
> The recent series of commits reworking the freezer appear to have
> caused serious issues on ARM.  The kernel constantly complains that
> try_to_freeze() is bring called with interrupts disabled:
> 
> [   75.380000] BUG: sleeping function called from invalid context at include/linux/freezer.h:44
> [   75.380000] in_atomic(): 0, irqs_disabled(): 128, pid: 1517, name: Xorg
> [   75.380000] no locks held by Xorg/1517.
> [   75.380000] [<c0014308>] (unwind_backtrace+0x0/0x12c) from
> [<c0464400>] (dump_stack+0x20/0x24)
> [   75.380000] [<c0464400>] (dump_stack+0x20/0x24) from [<c0022b80>]
> (__might_sleep+0xfc/0x11c)
> [   75.380000] [<c0022b80>] (__might_sleep+0xfc/0x11c) from [<c0011520>]
> (do_signal+0x94/0x230)
> [   75.380000] [<c0011520>] (do_signal+0x94/0x230) from [<c00116e4>]
> (do_notify_resume+0x28/0x6c)
> [   75.380000] [<c00116e4>] (do_notify_resume+0x28/0x6c) from
> [<c000eaf8>] (work_pending+0x24/0x28)
> 
> and the boot runs very slowly.  Reverting the series merged in 56f0be
> appears to resolve the issue, though looking at the changes I'd expect
> there's some underlying bug here that just doesn't trigger very often.
> I don't really have the bandwidth to understand what's gone wrong right
> now but should be able to run tests if you've got anything you'd like
> looking atl.

The signal handling is rather yucky, and it seems it needs to run with
IRQs enabled - but we run it with IRQs disabled.

There's horrible issues in there.  One such issue is the syscall restart
handling.  Here's an example:

	sys_ppoll
	...
	receives signal
	returns ERESTARTNOHAND (restart syscall if no handler)

On the exit path, we notice that we have TIF_SIGPENDING set.  So we call
do_notify_resume() and eventually do_signal().  At this point, 'syscall'
is set.

The first thing do_signal() does is fiddle with the stack to fake a
restart.  If we don't gain a signal here, everything is fine and we
return.  So far so good.

Lets say for the sake of argument that we've run do_signal() with IRQs
enabled.  Now, a scheduling event has come along and set TIF_NEED_RESCHED.

So, having returned from do_notify_resume(), we disable interrupts, reload
the work mask, and notice that TIF_NEED_RESCHED is set.  We now switch
away from this thread and do something else.

While something else is running, lets say a SIGHUP is sent to our process,
and there's a handler installed.

When we switch back, we again disable interrupts, reload the work mask,
and notice this time that TIF_SIGPENDING is again set.  So, just like
last time, we call do_notify_resume() and eventually do_signal().  This
time, syscall is false.

So we obtain the signal, save the current state, and set userspace up to
call the handler.

The state which we've saved though is the state required for the syscall
restart - but that's wrong, because the return code said 'restart if
no handler' and we're now invoking a handler.

What should happen is that the syscall is not restarted, and -EINTR is
returned instead.

What has this got to do with the issue you raise?  Having interrupts
off during signal handling helps to avoid this problem by ensuring that
we can't get resched events if there's a signal with no handler present.
Enabling interrupts for do_signal opens that race up.

Now, having interrupts disabled for writing to the processes stack is
technically bad news (although in the last 18 or so years its never
caused a problem.)  That said, we really should enable interrupts
before calling handle_signal() - which should be safe from the above
race.

But... without serious amounts of rework of the signal handling code
to avoid the above race I don't see how we can allow try_to_freeze()
to run with IRQs enabled and still have race-free syscall restarting.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ