linux-kernel - Re: block: fix blk_queue_split() resource exhaustion

Date:	Tue, 28 Jun 2016 10:24:49 +0200
From:	Lars Ellenberg <lars.ellenberg@...bit.com>
To:	Mike Snitzer <snitzer@...hat.com>
Cc:	Ming Lei <ming.lei@...onical.com>, linux-block@...r.kernel.org,
	Roland Kammerer <roland.kammerer@...bit.com>,
	Jens Axboe <axboe@...nel.dk>, NeilBrown <neilb@...e.com>,
	Kent Overstreet <kent.overstreet@...il.com>,
	Shaohua Li <shli@...nel.org>, Alasdair Kergon <agk@...hat.com>,
	"open list:DEVICE-MAPPER (LVM)" <dm-devel@...hat.com>,
	Ingo Molnar <mingo@...hat.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Takashi Iwai <tiwai@...e.de>, Jiri Kosina <jkosina@...e.cz>,
	Zheng Liu <gnehzuil.liu@...il.com>,
	Keith Busch <keith.busch@...el.com>,
	"Martin K. Petersen" <martin.petersen@...cle.com>,
	"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	"open list:BCACHE (BLOCK LAYER CACHE)" <linux-bcache@...r.kernel.org>,
	"open list:SOFTWARE RAID (Multiple Disks) SUPPORT" 
	<linux-raid@...r.kernel.org>
Subject: Re: block: fix blk_queue_split() resource exhaustion

On Fri, Jun 24, 2016 at 11:15:47AM -0400, Mike Snitzer wrote:
> On Fri, Jun 24 2016 at 10:27am -0400,
> Lars Ellenberg <lars.ellenberg@...bit.com> wrote:
> 
> > On Fri, Jun 24, 2016 at 07:36:57PM +0800, Ming Lei wrote:
> > > >
> > > > This is not a theoretical problem.
> > > > At least in DRBD, with an unfortunately high IO concurrency wrt. the
> > > > "max-buffers" setting, without this patch we have a reproducible deadlock.
> > > 
> > > Is there any log about the deadlock? And is there any lockdep warning
> > > if it is enabled?
> > 
> > In DRBD, to avoid potentially very long internal queues as we wait for
> > our replication peer device and local backend, we limit the number of
> > in-flight bios we accept, and block in our ->make_request_fn() if that
> > number exceeds a configured watermark ("max-buffers").
> > 
> > Works fine, as long as we could assume that once our make_request_fn()
> > returns, any bios we "recursively" submitted against the local backend
> > would be dispatched. Which used to be the case.
> 
> It'd be useful to know whether this patch fixes your issue:
> https://patchwork.kernel.org/patch/7398411/

I would assume so,
because if current is blocked for any reason,
that patch will dispatch all bios that are still on current->bio_list
to be processed from other contexts.

Which means we will not deadlock, but make progress,
if the unblock of current depends on processing of those bios.

Also see my other mail on the issue,
where I try to better explain the mechanics of "my" deadlock.

    Lars