netdev - Re: [Xen-devel][PATCH] xen/netfront: Remove unneeded .resume callback


Message-ID: <46fe25f2-2db7-496a-cd2c-071cd211ea50@gmail.com>
Date:   Thu, 14 Mar 2019 18:33:29 +0200
From:   Oleksandr Andrushchenko <andr2000@...il.com>
To:     Boris Ostrovsky <boris.ostrovsky@...cle.com>,
        netdev@...r.kernel.org, xen-devel@...ts.xenproject.org,
        linux-kernel@...r.kernel.org, jgross@...e.com,
        sstabellini@...nel.org, davem@...emloft.net
Cc:     Oleksandr Andrushchenko <oleksandr_andrushchenko@...m.com>,
        Volodymyr Babchuk <Volodymyr_Babchuk@...m.com>
Subject: Re: [Xen-devel][PATCH] xen/netfront: Remove unneeded .resume callback

On 3/14/19 17:40, Boris Ostrovsky wrote:
> On 3/14/19 11:10 AM, Oleksandr Andrushchenko wrote:
>> On 3/14/19 5:02 PM, Boris Ostrovsky wrote:
>>> On 3/14/19 10:52 AM, Oleksandr Andrushchenko wrote:
>>>> On 3/14/19 4:47 PM, Boris Ostrovsky wrote:
>>>>> On 3/14/19 9:17 AM, Oleksandr Andrushchenko wrote:
>>>>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@...m.com>
>>>>>>
>>>>>> Currently on driver resume we remove all the network queues and
>>>>>> destroy shared Tx/Rx rings leaving the driver in its current state
>>>>>> and never signaling the backend of this frontend's state change.
>>>>>> This leads to a number of consequences:
>>>>>> - when the frontend withdraws granted references to the rings etc.,
>>>>>>   this cannot be done cleanly as the backend still holds those (it
>>>>>>   was not told to free the resources)
>>>>>> - it is not possible to resume driver operation as all the means of
>>>>>>   communication with the backend were destroyed by the frontend,
>>>>>>   so the frontend appears functional to the guest OS when it
>>>>>>   really is not.
>>>>> What do you mean? Are you saying that after resume you lose
>>>>> connectivity?
>>>> Exactly. If you take a look at the .resume callback as it is now,
>>>> it destroys the rings etc. and never notifies the backend of that,
>>>> i.e. the frontend stays in, say, the connected state with its
>>>> communication channels destroyed. It never goes into any other Xen
>>>> bus state, so there is no way its state machine can help it recover.
>>> My tree is about a month old so perhaps there is some sort of regression
>>> but this certainly works for me. After resume netfront gets
>>> XenbusStateInitWait from backend which causes xennet_connect().
>> Ah, the difference may be in the way we get the guest to enter
>> the suspend state. I make my guest suspend with:
>> echo mem > /sys/power/state
>> And then I use an interrupt to the guest (this is test code)
>> to wake it up.
>> Could you please share your exact use-case when the guest enters suspend
>> and what you do to resume it?
>
> xl save / xl restore
>
>> I can see no way the backend would enter XenbusStateInitWait in my
>> use-case, as it simply doesn't know we want it to.
>
> Yours looks like the ACPI path; I don't know how well it was tested TBH.

Hm, so it does work for your use-case, but doesn't for mine.
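
To make the difference between the two paths concrete, here is a toy model in plain C. It is only a sketch, not kernel code: the enum values match Xen's public io/xenbus.h, but struct toy_link and every toy_* function are hypothetical names invented for illustration. It assumes, as Boris describes, that xl restore makes the backend report XenbusStateInitWait, while in the "echo mem" path nothing prompts the backend to change state.

```c
#include <stdbool.h>

/* State values as defined in Xen's public io/xenbus.h. */
enum xenbus_state {
	XenbusStateUnknown      = 0,
	XenbusStateInitialising = 1,
	XenbusStateInitWait     = 2,
	XenbusStateInitialised  = 3,
	XenbusStateConnected    = 4,
	XenbusStateClosing      = 5,
	XenbusStateClosed       = 6,
};

/* Hypothetical toy model of one frontend/backend pair. */
struct toy_link {
	enum xenbus_state backend; /* state last reported by the backend */
	bool rings_valid;          /* are the shared Tx/Rx rings usable? */
};

/* Models the current .resume: destroy the rings, notify nobody. */
static void toy_front_resume(struct toy_link *l)
{
	l->rings_valid = false;
}

/* Assumption: xl restore makes the backend report InitWait. */
static void toy_toolstack_restore(struct toy_link *l)
{
	l->backend = XenbusStateInitWait;
}

/* Models the frontend's reaction to a backend state change. */
static void toy_front_backend_changed(struct toy_link *l)
{
	if (l->backend == XenbusStateInitWait) {
		l->rings_valid = true; /* stands in for xennet_connect() */
		l->backend = XenbusStateConnected;
	}
}

/* Traffic can only flow with a connected backend and valid rings. */
static bool toy_link_usable(const struct toy_link *l)
{
	return l->backend == XenbusStateConnected && l->rings_valid;
}
```

In this model, calling toy_front_resume() followed by toy_front_backend_changed() leaves the link unusable (my use-case), whereas inserting toy_toolstack_restore() between the two brings it back (the xl save / xl restore use-case).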

What would be the best way forward?

1. Implement .resume properly as, for example, block front does [1]

2. Remove .resume completely: this works as long as the backend doesn't change anything

I am still a bit unsure whether we really need to re-initialize the rings,
re-read the frontend's config from Xenstore etc. What changes on the backend
side are expected when we resume the front driver?

>
>
> -boris

Thank you,

Oleksandr


[1] https://elixir.bootlin.com/linux/v5.0.2/source/drivers/block/xen-blkfront.c#L2072