Message-ID: <CAKywueT4xRHZGomoPvnA9rob3FpzkPYuuUPe+vEDiOFvTbUiGA@mail.gmail.com>
Date:	Fri, 21 Mar 2014 12:32:12 +0400
From:	Pavel Shilovsky <piastry@...rsoft.ru>
To:	Steven Rostedt <rostedt@...dmis.org>
Cc:	Jeff Layton <jlayton@...hat.com>,
	LKML <linux-kernel@...r.kernel.org>,
	linux-cifs <linux-cifs@...r.kernel.org>,
	Steve French <sfrench@...ba.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Clark Williams <williams@...hat.com>,
	"Luis Claudio R. Goncalves" <lclaudio@...g.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Tejun Heo <tj@...nel.org>, uobergfe@...hat.com
Subject: Re: [RFC PATCH] cifs: Fix possible deadlock with cifs and work queues

2014-03-21 6:23 GMT+04:00 Steven Rostedt <rostedt@...dmis.org>:
> On Thu, 20 Mar 2014 17:02:39 -0400
> Jeff Layton <jlayton@...hat.com> wrote:
>
>> Eventually the server should just allow the read to complete even if
>> the client doesn't respond to the oplock break. It has to since clients
>> can suddenly drop off the net while holding an oplock. That should
>> allow everything to unwedge eventually (though it may take a while).
>>
>> If that's not happening then I'd be curious as to why...
>
> The problem is that the data is being filled in the page and the reader
> is waiting for the page lock to be released. The kworker for the reader
> will issue the complete() and unlock the page to wake up the reader.
>
> But because the other workqueue callback calls down_read(), and there
> can be a down_write() waiting for the reader to finish, this
> down_read() will block on the lock as well (rwsems are fair locks).
> This blocks the other workqueue callback from issuing the complete and
> page_unlock() that will wake up the reader that is holding the rwsem
> with down_read().
>
> DEADLOCK.

Thank you for reporting and clarifying the issue!

The read and write codepaths both take lock_sem for read, then wait
for their work item on cifsiod_wq to complete before releasing
lock_sem. They don't touch lock_sem inside the work items they queue
to cifsiod_wq. The oplock code, however, can take and release lock_sem
inside its work item. That's why I agree with Jeff and suggest moving
the oplock code to a different work queue (cifsioopd_wq?) while
leaving the read and write codepaths on cifsiod_wq.
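
To illustrate why the split helps, here is a rough userspace model in
Python. It is not kernel code. Each work queue is approximated by a
single-worker thread pool (an assumption that mirrors the case where
one blocked work item stalls the items queued behind it), and the
oplock work item blocking in down_read() behind a waiting writer is
approximated by waiting on an event. The names read_completes,
oplock_break and read_complete are made up for this sketch.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, TimeoutError


def read_completes(separate_oplock_queue, timeout=0.5):
    """Return True if the read completion runs, False if it stalls."""
    # Stand-in for cifsiod_wq: one worker, so a blocked item stalls the rest.
    io_wq = ThreadPoolExecutor(max_workers=1)
    # Stand-in for the proposed separate oplock queue (or the shared one).
    oplock_wq = ThreadPoolExecutor(max_workers=1) if separate_oplock_queue else io_wq

    # Set only once the reader has dropped lock_sem.
    lock_sem_released = threading.Event()

    def oplock_break():
        # Models down_read(lock_sem) queued behind a waiting writer:
        # it cannot proceed until the reader releases lock_sem.
        lock_sem_released.wait()

    def read_complete():
        # Models the completion that wakes the reader; needs no lock_sem.
        return "data"

    # The reader holds lock_sem. The oplock break is queued first,
    # then the read completion the reader is waiting for.
    oplock_wq.submit(oplock_break)
    read_future = io_wq.submit(read_complete)

    try:
        read_future.result(timeout=timeout)
        completed = True
    except TimeoutError:
        # In the real scenario the reader would wait here forever.
        completed = False
    finally:
        # Unwedge the model so the worker threads can exit cleanly.
        lock_sem_released.set()
        io_wq.shutdown(wait=True)
        if oplock_wq is not io_wq:
            oplock_wq.shutdown(wait=True)
    return completed
```

With a shared queue, read_completes(False) reports a stall because the
read completion sits behind the blocked oplock item. With a separate
oplock queue, read_completes(True) finishes, since the read completion
no longer depends on the oplock item making progress.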

-- 
Best regards,
Pavel Shilovsky.