Message-ID: <CALFYKtDDnebBbQSXBOGwabJ5cNga8b_MFq8bREJ+Qnw2f4eYmw@mail.gmail.com>
Date: Fri, 1 Aug 2014 17:50:05 +0400
From: Ilya Dryomov <ilya.dryomov@...tank.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Mike Galbraith <umgwanakikbuti@...il.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Ingo Molnar <mingo@...nel.org>,
Ceph Development <ceph-devel@...r.kernel.org>,
davidlohr@...com, jason.low2@...com, axboe@...nel.dk,
Kent Overstreet <kmo@...erainc.com>
Subject: Re: [PATCH] locking/mutexes: Revert "locking/mutexes: Add extra
reschedule point"
On Fri, Aug 1, 2014 at 5:27 PM, Peter Zijlstra <peterz@...radead.org> wrote:
> On Fri, Aug 01, 2014 at 04:56:27PM +0400, Ilya Dryomov wrote:
>> I'm going to fix up rbd_request_fn(), but I want to make sure
>> I understand this in full.
>>
>> - Previously the danger of calling blocking primitives on the way to
>> schedule(), i.e. with task->state != TASK_RUNNING, was that if the
>> blocking primitive was indeed to block the task state would be set
>> back to TASK_RUNNING and the schedule() that that task was on the way
>> to wouldn't have any effect. Your "Add extra reschedule point" patch
>> essentially made calling mutex_lock() and probably others much more
>> wrong than it used to be, because mutex_lock() may now reschedule
>> when the task is not on the mutex wait queue.
>
> Right and in general we cannot allow spurious wakeups (although we try
> very hard to deal with them in generic code, which is why things more or
> less worked).
>
> But if you do a patch that 'randomly' ignores ->state on schedule (I did
> one) stuff comes apart _real_ quick.
>
> Therefore you should very much not destroy ->state on the way to
> schedule.
>
>> - There is nothing wrong with releasing queue_lock and reenabling IRQs
>> in rbd_request_fn() as long as it doesn't block and I remember to
>> disable IRQs and take queue_lock back on return.
>
> Releasing queue_lock might be ok, dunno about the blk locking, however
> reenabling IRQs is actually wrong as per blk_flush_plug_list() since
> that uses local_irq_save()/restore() which means it can be called from
> contexts which cannot deal with enabling IRQs, and then your
> ->request_fn() goes and does that.
>
> Now maybe blk_flush_plug_list() is overly paranoid and it could use
> local_irq_disable()/enable() instead, I don't know. But until it does, a
> request_fn() should never reenable IRQs.
>
>> I'm asking because rbd_request_fn() is probably not the only code
>> path broken in this way. I poked around and found read_events() in
>> aio.c; it seems to have been written with the "danger" assumption
>> that I outlined above, and there is even a comment to that effect.
>
> I'm fairly sure there's more broken stuff, I didn't dare looking.
>
>> Does that above make sense or am I missing something?
>
> I think that's about it.
Thanks for clarifying things. CC'ing Kent to draw attention to
read_events().
Ilya