lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Tue, 4 Feb 2014 09:16:07 +0100
From:	Roger Pau Monné <roger.pau@...rix.com>
To:	Jan Beulich <JBeulich@...e.com>
CC:	Matt Rushton <mrushton@...zon.com>, Matt Wilson <msw@...zon.com>,
	DavidVrabel <david.vrabel@...rix.com>,
	Ian Campbell <Ian.Campbell@...rix.com>,
	<xen-devel@...ts.xenproject.org>,
	Boris Ostrovsky <boris.ostrovsky@...cle.com>,
	<linux-kernel@...r.kernel.org>
Subject: Re: [Xen-devel] [PATCH 3/3] xen-blkback: fix shutdown race

On 04/02/14 09:02, Jan Beulich wrote:
>>>> On 03.02.14 at 17:58, Roger Pau Monné<roger.pau@...rix.com> wrote:
>> On 29/01/14 09:52, Jan Beulich wrote:
>>>>>> On 28.01.14 at 18:43, Roger Pau Monne <roger.pau@...rix.com> wrote:
>>>> +		free_req(blkif, pending_req);
>>>> +		/*
>>>> +		 * Make sure the request is freed before releasing blkif,
>>>> +		 * or there could be a race between free_req and the
>>>> +		 * cleanup done in xen_blkif_free during shutdown.
>>>> +		 *
>>>> +		 * NB: The fact that we might try to wake up pending_free_wq
>>>> +		 * before drain_complete (in case there's a drain going on)
>>>> +		 * it's not a problem with our current implementation
>>>> +		 * because we can assure there's no thread waiting on
>>>> +		 * pending_free_wq if there's a drain going on, but it has
>>>> +		 * to be taken into account if the current model is changed.
>>>> +		 */
>>>> +		xen_blkif_put(blkif);
>>>> +		if (atomic_read(&blkif->refcnt) <= 2) {
>>>> +			if (atomic_read(&blkif->drain))
>>>> +				complete(&blkif->drain_complete);
>>>>  		}
>>>> -		free_req(pending_req->blkif, pending_req);
>>>>  	}
>>>>  }
>>>
>>> The put is still too early imo - you're explicitly accessing field in the
>>> structure immediately afterwards. This may not be an issue at
>>> present, but I think it's at least a latent one.
>>>
>>> Apart from that, the two if()s would - at least to me - be more
>>> clear if combined into one.
>>
>> In order to get rid of the race I had to introduce yet another atomic_t 
>> in xen_blkif struct, which is something I don't really like, but I 
>> could not see any other way to solve this. If that's fine I will resend 
>> the series, here is the reworked patch:
> 
> Mind explaining why you can't simply move the xen_blkif_put()
> down between the if() and the free_ref().

You mean doing something like:

if (atomic_read(&blkif->refcnt) <= 3) {
	if (atomic_read(&blkif->drain))
		complete(&blkif->drain_complete);
}
xen_blkif_put(blkif);
free_req(blkif, pending_req);

This would not be safe, because we are comparing refcnt before
decrementing it, and also we are not doing it atomically (the dec and
compare). If for example we have two inflight requests, both could
perform the atomic_read of refcnt, see there's still another inflight
request, and then both decrement, without anyone calling complete.

There's also the issue that with this approach as we are freeing the
request after we have put blkif, which is a race with the cleanup being
done in xen_blkif_free.

Roger.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ