lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Tue, 28 Aug 2012 20:25:58 -0700
From:	Kent Overstreet <koverstreet@...gle.com>
To:	Vivek Goyal <vgoyal@...hat.com>
Cc:	Tejun Heo <tj@...nel.org>, linux-bcache@...r.kernel.org,
	linux-kernel@...r.kernel.org, dm-devel@...hat.com,
	mpatocka@...hat.com, bharrosh@...asas.com,
	Jens Axboe <axboe@...nel.dk>
Subject: Re: [PATCH v7 9/9] block: Avoid deadlocks with bio allocation by
 stacking drivers

On Tue, Aug 28, 2012 at 09:31:50PM -0400, Vivek Goyal wrote:
> On Tue, Aug 28, 2012 at 04:01:08PM -0700, Kent Overstreet wrote:
> > On Tue, Aug 28, 2012 at 03:28:00PM -0700, Kent Overstreet wrote:
> > > On Tue, Aug 28, 2012 at 01:49:10PM -0700, Tejun Heo wrote:
> > > > Overall, I *think* this is correct but need to think more about it to
> > > > be sure.
> > > 
> > > Please do. As much time as I've spent staring at this kind of stuff,
> > > I'm pretty sure I've got it correct but it still makes my head hurt to
> > > work out all the various possible deadlocks.
> > 
> > Hilarious thought: We're punting bios to a rescuer thread that's
> > specific to a certain bio_set, right? What if we happen to punt bios
> > from a different bio_set? And then the rescuer goes to resubmit those
> > bios, and in the process they happen to have dependencies on the
> > original bio_set...
> 
> Are they not fully allocated bios and when you submit these to underlying
> device, ideally we should not be sharing memory pool at different layers
> of stack otherwise we will deadlock any way as stack depth increases. So
> there should not be a dependency on original bio_set?
> 
> Or, am I missing something. May be an example will help.

Uh, it's more complicated than that. My brain is too fried right now to
walk through it in detail, but the problem (if it is a problem; I can't
convince myself one way or the other) is roughly:

one virt block device stacked on top of another - they both do arbitrary
splitting:

So once they've submitted a bio, that bio needs to make forward progress
even if the thread goes to allocate another bio and blocks before it
returns from its make_request fn.

That much my patch solves, with the rescuer thread; if the thread goes
to block, it punts those blocked bios off to a rescuer thread - and we
create one rescuer per bio set.

So going back to the stacked block devices, if you've got say dm on top
of md (or something else since md doesn't really do much splitting) -
each block device will have its own rescuer and everything should be
hunky dory.

Except that when thread a goes to punt those blocked bios to its
rescuer, it punts _all_ the bios on current->bio_list. Even those
generated by/belonging to other bio_sets.

So thread 1 in device b punts bios to its rescuer, thread 2

But thread 2 ends up with bios for both device a and b - because they're
stacked.

Thread 2 starts on bios for device a before it gets to those for device
b. But a is stacked on top of b, so in the process it generates more
bios for b.

So now it's uhm...

yeah, I'm gonna sleep on this. I'm pretty sure to be rigorously correct
filtering the right bios when we punt them to the rescuer is needed,
though.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ