lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20120518081444.GA27205@dhcp-172-18-216-138.mtv.corp.google.com>
Date:	Fri, 18 May 2012 04:14:44 -0400
From:	Kent Overstreet <koverstreet@...gle.com>
To:	NeilBrown <neilb@...e.de>
Cc:	linux-bcache@...r.kernel.org, linux-kernel@...r.kernel.org,
	dm-devel@...hat.com, linux-fsdevel@...r.kernel.org, tj@...nel.org,
	axboe@...nel.dk, agk@...hat.com
Subject: Re: [PATCH 12/13] Make generic_make_request handle arbitrarily large
 bios

On Fri, May 18, 2012 at 06:05:50PM +1000, NeilBrown wrote:
> 
> Hi Kent,
>  there is lots of good stuff in this series and I certainly hope a lot of it
>  can eventually go upstream.
> 
>  However there is a part of this patch that I don't think is safe:
> 
> 
> > +static void __bio_submit_split(struct closure *cl)
> > +{
> > +	struct bio_split_hook *s = container_of(cl, struct bio_split_hook, cl);
> > +	struct bio *bio = s->bio, *n;
> > +
> > +	do {
> > +		n = bio_split(bio, bio_max_sectors(bio),
> > +			      GFP_NOIO, s->q->bio_split);
> > +		if (!n)
> > +			continue_at(cl, __bio_submit_split, bio_split_wq);
> > +
> > +		closure_get(cl);
> > +		generic_make_request(n);
> > +	} while (n != bio);
> > +
> > +	continue_at(cl, bio_submit_split_done, NULL);
> > +}
> 
> Firstly a small point:  Can bio_split ever return NULL here?  I don't
> think it can, so there is no need to test.
> But if it can, then calling generic_make_request(NULL) doesn't seem like a
> good idea.
> 
> More significantly though::
>   This is called from generic_make_request which can be called recursively and
> enforces a tail-recursion semantic.
> If that generic_make_request was a recursive call, then the
> generic_make_request in __bio_submit_split will not start the request, but
> will queue the bio for later handling.  If we then call bio_split again, we
> could try to allocation from a mempool while we are holding one entry
> allocated from that pool captive.  This can deadlock.
> 
> i.e. if the original bio is so large that it needs to be split into 3 pieces,
> then we will try to allocate the second piece before the first piece has a
> chance to be released.  If this happens in enough threads to exhaust the pool
> (4 I think), it will deadlock.
> 
> I realise this sounds like a very unlikely case, but of course they happen.
> 
> One possible approach might be to count how many splits will be required,
> then have an interface to mempools so you can allocate them all at once,
> or block and wait.

Yeah, I actually thought of that (I probably should've documented it,
though).

That's why the code is checking if bio_split() returned NULL; my verison
of bio_split() checks if current->bio_list is non NULL, and if it is it
doesn't pass __GFP_WAIT to bio_alloc_bioset().

(I was debating whether bio_split() should do that itself or leave it up
to the caller. I tend to think it's too easy to accidentally screw up -
and not notice it - so it should be enforced by generic code. Possibly
the check should be moved to bio_alloc_bioset(), though.)

If we do get a memory allocation failure, then we just punt to workqueue
- continue_at() runs __bio_submit_split from the bio_split workqueue -
where we can safely use the mempool.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ