linux-kernel - Re: [PATCH] locking/mutexes: Revert "locking/mutexes: Add extra reschedule point"

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20140801132754.GI19379@twins.programming.kicks-ass.net>
Date:	Fri, 1 Aug 2014 15:27:54 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	Ilya Dryomov <ilya.dryomov@...tank.com>
Cc:	Mike Galbraith <umgwanakikbuti@...il.com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Ingo Molnar <mingo@...nel.org>,
	Ceph Development <ceph-devel@...r.kernel.org>,
	davidlohr@...com, jason.low2@...com, axboe@...nel.dk
Subject: Re: [PATCH] locking/mutexes: Revert "locking/mutexes: Add extra
 reschedule point"

On Fri, Aug 01, 2014 at 04:56:27PM +0400, Ilya Dryomov wrote:
> I'm going to fix up rbd_request_fn(), but I want to make sure
> I understand this in full.
> 
> - Previously the danger of calling blocking primitives on the way to
>   schedule(), i.e. with task->state != TASK_RUNNING, was that if the
>   blocking primitive was indeed to block the task state would be set
>   back to TASK_RUNNING and the schedule() that that task was on the way
>   to wouldn't have any effect.  Your "Add extra reschedule point" patch
>   essentially made calling mutex_lock() and probably others much more
>   wrong that it used to be, because mutex_lock() may now reschedule
>   when the task is not on the mutex wait queue.

Right and in general we cannot allow spurious wakeups (although we try
very hard to deal with them in generic code, which is why things more or
less worked).

But if you do a patch that 'randomly' ignores ->state on schedule (I did
one) stuff comes apart _real_ quick.

Therefore you should very much not destroy ->state on the way to
schedule.

> - There is nothing wrong with releasing queue_lock and reenabling IRQs
>   in rbd_request_fn() as long as it doesn't block and I remember to
>   disable IRQs and take queue_lock back on return.

Releasing queue_lock might be ok, dunno about the blk locking, however
reenabling IRQs it is actually wrong as per blk_flush_plug_list() since
that uses local_irq_save()/restore() which means it can be called from
contexts which cannot deal with enabling IRQs, and then your
->request_fn() goes and does that.

Now maybe blk_flush_plug_list() is overly paranoid and it could use
local_irq_disable()/enable() instead, I don't know. But until it does, a
request_fn() should never reenable IRQs.

> I'm asking because rbd_request_fn() is probably not the only broken in
> this way code path.  I poked around and found read_events() in aio.c,
> it seems to have been written with the "danger" assumption that
> I outlined above and there is even a comment to it.

I'm fairly sure there's more broken stuff, I didn't dare looking.

> Does that above make sense or am I missing something?

I think that's about it.

Content of type "application/pgp-signature" skipped