[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <40ab27a0-0f16-b010-1834-6d08ab0838c3@suse.com>
Date: Wed, 27 Jan 2021 12:37:12 +0100
From: Jan Beulich <jbeulich@...e.com>
To: paul@....org
Cc: 'Paul Durrant' <pdurrant@...zon.com>,
'Konrad Rzeszutek Wilk' <konrad.wilk@...cle.com>,
'Roger Pau Monné' <roger.pau@...rix.com>,
'Jens Axboe' <axboe@...nel.dk>,
'Dongli Zhang' <dongli.zhang@...cle.com>,
linux-kernel@...r.kernel.org, linux-block@...r.kernel.org,
xen-devel@...ts.xenproject.org
Subject: Re: [PATCH] xen-blkback: fix compatibility bug with single page rings
On 27.01.2021 12:33, Paul Durrant wrote:
>> -----Original Message-----
>> From: Jan Beulich <jbeulich@...e.com>
>> Sent: 27 January 2021 11:21
>> To: paul@....org
>> Cc: 'Paul Durrant' <pdurrant@...zon.com>; 'Konrad Rzeszutek Wilk' <konrad.wilk@...cle.com>; 'Roger Pau
>> Monné' <roger.pau@...rix.com>; 'Jens Axboe' <axboe@...nel.dk>; 'Dongli Zhang'
>> <dongli.zhang@...cle.com>; linux-kernel@...r.kernel.org; linux-block@...r.kernel.org; xen-
>> devel@...ts.xenproject.org
>> Subject: Re: [PATCH] xen-blkback: fix compatibility bug with single page rings
>>
>> On 27.01.2021 12:09, Paul Durrant wrote:
>>>> -----Original Message-----
>>>> From: Jan Beulich <jbeulich@...e.com>
>>>> Sent: 27 January 2021 10:57
>>>> To: Paul Durrant <paul@....org>
>>>> Cc: Paul Durrant <pdurrant@...zon.com>; Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>; Roger Pau
>>>> Monné <roger.pau@...rix.com>; Jens Axboe <axboe@...nel.dk>; Dongli Zhang <dongli.zhang@...cle.com>;
>>>> linux-kernel@...r.kernel.org; linux-block@...r.kernel.org; xen-devel@...ts.xenproject.org
>>>> Subject: Re: [PATCH] xen-blkback: fix compatibility bug with single page rings
>>>>
>>>> On 27.01.2021 11:30, Paul Durrant wrote:
>>>>> From: Paul Durrant <pdurrant@...zon.com>
>>>>>
>>>>> Prior to commit 4a8c31a1c6f5 ("xen/blkback: rework connect_ring() to avoid
>>>>> inconsistent xenstore 'ring-page-order' set by malicious blkfront"), the
>>>>> behaviour of xen-blkback when connecting to a frontend was:
>>>>>
>>>>> - read 'ring-page-order'
>>>>> - if not present then expect a single page ring specified by 'ring-ref'
>>>>> - else expect a ring specified by 'ring-refX' where X is between 0 and
>>>>> 1 << ring-page-order
>>>>>
>>>>> This was correct behaviour, but was broken by the afforementioned commit to
>>>>> become:
>>>>>
>>>>> - read 'ring-page-order'
>>>>> - if not present then expect a single page ring
>>>>> - expect a ring specified by 'ring-refX' where X is between 0 and
>>>>> 1 << ring-page-order
>>>>> - if that didn't work then see if there's a single page ring specified by
>>>>> 'ring-ref'
>>>>>
>>>>> This incorrect behaviour works most of the time but fails when a frontend
>>>>> that sets 'ring-page-order' is unloaded and replaced by one that does not
>>>>> because, instead of reading 'ring-ref', xen-blkback will read the stale
>>>>> 'ring-ref0' left around by the previous frontend will try to map the wrong
>>>>> grant reference.
>>>>>
>>>>> This patch restores the original behaviour.
>>>>
>>>> Isn't this only the 2nd of a pair of fixes that's needed, the
>>>> first being the drivers, upon being unloaded, to fully clean up
>>>> after itself? Any stale key left may lead to confusion upon
>>>> re-use of the containing directory.
>>>
>>> In a backend we shouldn't be relying on, nor really expect IMO, a frontend to clean up after itself.
>> Any backend should know *exactly* what xenstore nodes it’s looking for from a frontend.
>>
>> But the backend can't know whether a node exists because the present
>> frontend has written it, or because an earlier instance forgot to
>> delete it. It can only honor what's there. (In fact the other day I
>> was wondering whether some of the writes of boolean "false" nodes
>> wouldn't better be xenbus_rm() instead.)
>
> In the particular case this patch is fixing for me, the frontends are the Windows XENVBD driver and the Windows crash version of the same driver (actually built from different code). The 'normal' instance is multi-page aware and the crash instance is not quite, i.e. it uses the old ring-ref but knows to clean up 'ring-page-order'.
> Clearly, in a crash situation, we cannot rely on frontend to clean up
Ah, I see (and agree).
> so what you say does highlight that there indeed needs to be a second patch to xen-blkback to make sure it removes 'ring-page-order' itself as 'state' cycles through Closed and back to InitWait.
And not just this one node then, I suppose?
> I think this patch does still stand on its own though.
Perhaps, yes.
Jan
Powered by blists - more mailing lists