linux-kernel - Re: block: fix blk_queue_split() resource exhaustion

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20160624151547.GA13898@redhat.com>
Date:	Fri, 24 Jun 2016 11:15:47 -0400
From:	Mike Snitzer <snitzer@...hat.com>
To:	Lars Ellenberg <lars.ellenberg@...bit.com>
Cc:	Ming Lei <ming.lei@...onical.com>, linux-block@...r.kernel.org,
	Roland Kammerer <roland.kammerer@...bit.com>,
	Jens Axboe <axboe@...nel.dk>, NeilBrown <neilb@...e.com>,
	Kent Overstreet <kent.overstreet@...il.com>,
	Shaohua Li <shli@...nel.org>, Alasdair Kergon <agk@...hat.com>,
	"open list:DEVICE-MAPPER (LVM)" <dm-devel@...hat.com>,
	Ingo Molnar <mingo@...hat.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Takashi Iwai <tiwai@...e.de>, Jiri Kosina <jkosina@...e.cz>,
	Zheng Liu <gnehzuil.liu@...il.com>,
	Keith Busch <keith.busch@...el.com>,
	"Martin K. Petersen" <martin.petersen@...cle.com>,
	"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	"open list:BCACHE (BLOCK LAYER CACHE)" <linux-bcache@...r.kernel.org>,
	"open list:SOFTWARE RAID (Multiple Disks) SUPPORT" 
	<linux-raid@...r.kernel.org>
Subject: Re: block: fix blk_queue_split() resource exhaustion

On Fri, Jun 24 2016 at 10:27am -0400,
Lars Ellenberg <lars.ellenberg@...bit.com> wrote:

> On Fri, Jun 24, 2016 at 07:36:57PM +0800, Ming Lei wrote:
> > >
> > > This is not a theoretical problem.
> > > At least int DRBD, and an unfortunately high IO concurrency wrt. the
> > > "max-buffers" setting, without this patch we have a reproducible deadlock.
> > 
> > Is there any log about the deadlock? And is there any lockdep warning
> > if it is enabled?
> 
> In DRBD, to avoid potentially very long internal queues as we wait for
> our replication peer device and local backend, we limit the number of
> in-flight bios we accept, and block in our ->make_request_fn() if that
> number exceeds a configured watermark ("max-buffers").
> 
> Works fine, as long as we could assume that once our make_request_fn()
> returns, any bios we "recursively" submitted against the local backend
> would be dispatched. Which used to be the case.

It'd be useful to know whether this patch fixes your issue:
https://patchwork.kernel.org/patch/7398411/

Ming Lei didn't like it due to concerns about I contexts changing
(whereby breaking merging that occurs via plugging).

But if it _does_ fix your issue then the case for the change is
increased; and we just need to focus on addressing Ming's concerns
(Mikulas has some ideas).

Conversely, and in parallel, Mikulas can look to see if your approach
fixes the observed dm-snapshot deadlock that he set out to fix.