lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20140321075956.500fe42d@tlielax.poochiereds.net>
Date:	Fri, 21 Mar 2014 07:59:56 -0400
From:	Jeff Layton <jlayton@...hat.com>
To:	Pavel Shilovsky <piastry@...rsoft.ru>
Cc:	Steven Rostedt <rostedt@...dmis.org>,
	LKML <linux-kernel@...r.kernel.org>,
	linux-cifs <linux-cifs@...r.kernel.org>,
	Steve French <sfrench@...ba.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Clark Williams <williams@...hat.com>,
	"Luis Claudio R. Goncalves" <lclaudio@...g.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Tejun Heo <tj@...nel.org>, uobergfe@...hat.com
Subject: Re: [RFC PATCH] cifs: Fix possible deadlock with cifs and work
 queues

On Fri, 21 Mar 2014 12:32:12 +0400
Pavel Shilovsky <piastry@...rsoft.ru> wrote:

> 2014-03-21 6:23 GMT+04:00 Steven Rostedt <rostedt@...dmis.org>:
> > On Thu, 20 Mar 2014 17:02:39 -0400
> > Jeff Layton <jlayton@...hat.com> wrote:
> >
> >> Eventually the server should just allow the read to complete even if
> >> the client doesn't respond to the oplock break. It has to since clients
> >> can suddenly drop off the net while holding an oplock. That should
> >> allow everything to unwedge eventually (though it may take a while).
> >>
> >> If that's not happening then I'd be curious as to why...
> >
> > The problem is that the data is being filled in the page and the reader
> > is waiting for the page lock to be released. The kworker for the reader
> > will issue the complete() and unlock the page to wake up the reader.
> >
> > But because the other workqueue callback calls down_read(), and there
> > can be a down_write() waiting for the reader to finish, this
> > down_read() will block on the lock as well (rwsems are fair locks).
> > This blocks the other workqueue callback from issuing the complete and
> > page_unlock() that will wake up the reader that is holding the rwsem
> > with down_read().
> >
> > DEADLOCK.
> 
> Thank you for reporting and clarifying the issue!
> 
> Read and write codepaths both obtain lock_sem for read and then wait
> for cifsiod_wq to complete and release lock_sem. They don't do any
> lock_sem operations inside their work task queued to cifsiod_wq. But
> oplock code can obtain/release lock_sem in its work task. So, that's
> why I agree with Jeff and suggest to move the oplock code to a
> different work queue (cifsioopd_wq?) but leave read and write
> codepaths use cifsiod_wq.
> 

Yes, thanks for the clarification. I missed the fact that a 3rd task
was blocked on a down_write. I think that fix will help prevent the
full-blown deadlock, but it'll still be susceptible to long stalls in
some situations.

In Steven's example the work on CPU0 is blocked on a SMB_READ. That
read may not be completing because the server has issued an oplock
break to the client and is waiting on the response. That oplock break
response is blocked because it's blocked on the down_write.

In that situation, what breaks the deadlock is that the server
eventually gives up waiting on the oplock break response, but that can
take on the order of a minute.

So yeah, I think we should add a dedicated workqueue for oplock breaks
as an interim fix, but I also think we need to consider revamping the
lock pushing code such that oplock breaks are not subject to being
blocked like that.

-- 
Jeff Layton <jlayton@...hat.com>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ