[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <52E7D472.1070007@citrix.com>
Date: Tue, 28 Jan 2014 17:01:54 +0100
From: Roger Pau Monné <roger.pau@...rix.com>
To: Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>
CC: <xen-devel@...ts.xenproject.org>, <linux-kernel@...r.kernel.org>,
David Vrabel <david.vrabel@...rix.com>,
Boris Ostrovsky <boris.ostrovsky@...cle.com>,
Matt Rushton <mrushton@...zon.com>,
Matt Wilson <msw@...zon.com>,
Ian Campbell <Ian.Campbell@...rix.com>
Subject: Re: [PATCH] xen-blkback: fix memory leaks
On 28/01/14 16:37, Konrad Rzeszutek Wilk wrote:
> On Tue, Jan 28, 2014 at 01:44:37PM +0100, Roger Pau Monné wrote:
>> On 27/01/14 22:21, Konrad Rzeszutek Wilk wrote:
>>> On Mon, Jan 27, 2014 at 11:13:41AM +0100, Roger Pau Monne wrote:
>>>> @@ -976,17 +983,19 @@ static void __end_block_io_op(struct pending_req *pending_req, int error)
>>>> * the proper response on the ring.
>>>> */
>>>> if (atomic_dec_and_test(&pending_req->pendcnt)) {
>>>> - xen_blkbk_unmap(pending_req->blkif,
>>>> + struct xen_blkif *blkif = pending_req->blkif;
>>>> +
>>>> + xen_blkbk_unmap(blkif,
>>>> pending_req->segments,
>>>> pending_req->nr_pages);
>>>> - make_response(pending_req->blkif, pending_req->id,
>>>> + make_response(blkif, pending_req->id,
>>>> pending_req->operation, pending_req->status);
>>>> - xen_blkif_put(pending_req->blkif);
>>>> - if (atomic_read(&pending_req->blkif->refcnt) <= 2) {
>>>> - if (atomic_read(&pending_req->blkif->drain))
>>>> - complete(&pending_req->blkif->drain_complete);
>>>> + free_req(blkif, pending_req);
>>>> + xen_blkif_put(blkif);
>>>> + if (atomic_read(&blkif->refcnt) <= 2) {
>>>> + if (atomic_read(&blkif->drain))
>>>> + complete(&blkif->drain_complete);
>>>> }
>>>> - free_req(pending_req->blkif, pending_req);
>>>
>>> I keep coming back to this and I am not sure what to think - especially
>>> in the context of WRITE_BARRIER and disconnecting the vbd.
>>>
>>> You moved the 'free_req' to be done before you do atomic_read/dec.
>>>
>>> Which means that we do:
>>>
>>> list_add(&req->free_list, &blkif->pending_free);
>>> wake_up(&blkif->pending_free_wq);
>>>
>>> atomic_dec
>>> if atomic_read <= 2 poke thread that is waiting for drain.
>>>
>>>
>>> while in the past we did:
>>>
>>> atomic_dec
>>> if atomic_read <= 2 poke thread that is waiting for drain.
>>>
>>> list_add(&req->free_list, &blkif->pending_free);
>>> wake_up(&blkif->pending_free_wq);
>>>
>>> which means that we are giving the 'req' _before_ we decrement
>>> the refcnts.
>>>
>>> Could that mean that __do_block_io_op takes it for a spin - oh
>>> wait it won't as it is sitting on a WRITE_BARRIER and waiting:
>>>
>>> 1226 if (drain)
>>> 1227 xen_blk_drain_io(pending_req->blkif);
>>>
>>> But still that feels 'wrong'?
>>
>> Mmmm, the wake_up call in free_req in the context of WRITE_BARRIER is
>> harmless since the thread is waiting on drain_complete as you say, but I
>> take your point that it's all confusing. Do you think it will feel
>> better if we gate the call to wake_up in free_req with this condition:
>>
>> if (was_empty && !atomic_read(&blkif->drain))
>>
>> Or is this just going to make it even messier?
>
> My head spins around when thinking about the refcnt, drain, the two or
> three workqueues.
>
>>
>> Maybe just adding a comment in free_req saying that the wake_up call is
>> going to be ignored in the context of a WRITE_BARRIER, since the thread
>> is already waiting on drain_complete is enough.
>
> Perhaps. You do pass in the 'force' bool flag and we could piggyback
> on that. Meaning you could do
In the new version I'm preparing I'm no longer calling drain_io on
xen_blkif_schedule (so there's no "force" flag), instead I've moved the
cleanup code to xen_blkif_free where I think it makes more sense.
Also the force flag was just a local variable to drain_io, I think it
would get even messier if we add yet another variable (force) to the
xen_blkif struct.
>
> /* a comment about what we just mentioned */
>
> if (!force) {
> // do it the old way
> } else {
>
> /* A comment mentioning _why_ we need the code reshuffled */
>
> // do it the new way
> }
>
> It would be a bit messy - but:
> - We won't have to worry about breaking WRITE_BARRIER as the old
> logic would be preserved. So less worry about regressions.
> - The bug-fix would be easy to backport as it would inject code for
> just the usage you want - that is to drain all I/Os.
> - It would make a nice distinction and allows us to refactor
> this in future patches.
> The cons are that:
> - It would add extra path for just the use-case of shutting down
> without using the existing one.
> - It would be messy
>
>
> But I think when it comes to fixes like these that are
> candidates for backports - messy is OK and if they don't have any
> posibility of introducing regressions on existing other behaviors -
> then we should stick with that.
>
>
> Then in the future we can refactor this to use less of these
> workqueues, refcnt and atomics we have. It is getting confusing.
>
> Thoughts?
Let me post the whole series as I have them now, and we can pick it up
again from that.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists