Message-ID: <4918ed7c-4c63-6f19-530b-8e16b0c496d4@redhat.com>
Date: Fri, 12 Oct 2018 16:23:48 +0800
From: Jason Wang <jasowang@...hat.com>
To: ake <ake@...l.co.jp>
Cc: "Michael S. Tsirkin" <mst@...hat.com>,
"David S. Miller" <davem@...emloft.net>,
virtualization@...ts.linux-foundation.org, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] virtio_net: enable tx after resuming from suspend
On 2018-10-12 12:30, ake wrote:
>
> On 2018-10-11 22:06, Jason Wang wrote:
>>
>> On 2018-10-11 18:22, ake wrote:
>>> On 2018-10-11 18:44, Jason Wang wrote:
>>>> On 2018-10-11 15:51, Ake Koomsin wrote:
>>>>> commit 713a98d90c5e ("virtio-net: serialize tx routine during reset")
>>>>> disabled the virtio tx before going to suspend to avoid a use after
>>>>> free.
>>>>> However, after resuming, it causes the virtio_net device to lose its
>>>>> network connectivity.
>>>>>
>>>>> To solve the issue, we need to enable tx after resuming.
>>>>>
>>>>> Fixes: 713a98d90c5e ("virtio-net: serialize tx routine during reset")
>>>>> Signed-off-by: Ake Koomsin <ake@...l.co.jp>
>>>>> ---
>>>>> drivers/net/virtio_net.c | 1 +
>>>>> 1 file changed, 1 insertion(+)
>>>>>
>>>>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>>>>> index dab504ec5e50..3453d80f5f81 100644
>>>>> --- a/drivers/net/virtio_net.c
>>>>> +++ b/drivers/net/virtio_net.c
>>>>> @@ -2256,6 +2256,7 @@ static int virtnet_restore_up(struct virtio_device *vdev)
>>>>> }
>>>>> netif_device_attach(vi->dev);
>>>>> + netif_start_queue(vi->dev);
>>>> I believe this duplicates the netif_tx_wake_all_queues() call in
>>>> netif_device_attach() above?
>>> Thank you for your review.
>>>
>>> If both netif_tx_wake_all_queues() and netif_start_queue() result in
>>> clearing __QUEUE_STATE_DRV_XOFF, then is it possible that some
>>> conditions in netif_device_attach() are not satisfied?
>> Yes, maybe. One case I can see now is when the device is down; in this
>> case netif_device_attach() won't try to wake up the queues.
>>
>>> Without
>>> netif_start_queue(), the virtio_net device does not resume properly
>>> after waking up.
>> How do you trigger the issue? Just do suspend/resume?
> Yes, simply suspend and resume.
>
> Here is how I trigger the issue:
>
> 1) Start the Virtual Machine Manager GUI program.
> 2) Create a guest Linux OS. Make sure that the guest OS kernel is
> >= 4.12. Make sure that it uses virtio_net as its network device.
> In addition, make sure that the video adapter is VGA. Otherwise,
> waking up with the virtual power button does not work.
> 3) After installing the guest OS, log in, and test the network
> connectivity by pinging the host machine.
> 4) Suspend. After this, the screen is blank.
> 5) Resume by hitting the virtual power button. The login screen
> appears again.
> 6) Log in again. The guest loses its network connection.
>
> In my test:
> Guest: Ubuntu 16.04/18.04 with kernel 4.15.0-36-generic
> Host: Ubuntu 16.04 with kernel 4.15.0-36-generic/4.4.0-137-generic
I cannot reproduce this issue if the virtio-net interface is up in the
guest before suspend. I'm using net-next.git and qemu master. But I can
reproduce it when the virtio-net interface is down in the guest before
suspend: after resume, even if I bring it up, the network is still lost.
I think the interface is up in your case, but please confirm this.
>
>>> Is it better to report this as a bug first?
>> Nope, you're very welcome to post patch directly.
>>
>>> If I am to do more
>>> investigation, what areas should I look into?
>> As you've figured out, you can start with why netif_tx_wake_all_queues()
>> was not executed?
>>
>> (Btw, does the issue disappear if you move netif_tx_disable() under the
>> check of netif_running() in virtnet_freeze_down()?)
> The issue disappears if I move netif_tx_disable() under the check of
> netif_running() in virtnet_freeze_down(). Moving netif_tx_disable()
> is probably better as its logic is consistent with
> netif_device_attach() implementation. If you are OK with this idea,
> I will submit another patch.
I think it helps for the case when the interface is down before suspend.
But it's still unclear why it helps even if the interface is up
(netif_running() is true).
Please submit a patch, but we should figure out why it helps for an up
interface as well.
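
To make the reasoning above concrete, here is a rough user-space sketch of the queue state across suspend/resume. It is not driver code: struct model_dev and every model_* function are hypothetical stand-ins, reduced to three flags. It assumes that netif_device_detach() and netif_device_attach() only touch the tx queues when netif_running() is true, that netif_tx_disable() stops them unconditionally, and that bringing the interface up later does not clear the stop bit (which is what the down-before-suspend test suggests).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct net_device, reduced to three flags. */
struct model_dev {
	bool present;    /* models __LINK_STATE_PRESENT */
	bool running;    /* models netif_running() */
	bool tx_stopped; /* models __QUEUE_STATE_DRV_XOFF on all tx queues */
};

/* Models netif_device_detach(): stops tx only if the interface is running. */
static void model_device_detach(struct model_dev *dev)
{
	bool was_present = dev->present;

	dev->present = false;
	if (was_present && dev->running)
		dev->tx_stopped = true;
}

/* Models netif_device_attach(): wakes tx only if the interface is running. */
static void model_device_attach(struct model_dev *dev)
{
	bool was_present = dev->present;

	dev->present = true;
	if (!was_present && dev->running)
		dev->tx_stopped = false;
}

/*
 * Models the freeze path. With guard == false the tx disable is
 * unconditional (current behaviour); with guard == true it is moved
 * under the running check (the variant discussed in this thread).
 */
static void model_freeze_down(struct model_dev *dev, bool guard)
{
	assert(dev->present); /* freezing is only modelled for an attached device */
	model_device_detach(dev);
	if (!guard || dev->running)
		dev->tx_stopped = true; /* models netif_tx_disable() */
}

/* Models the part of the restore path that re-attaches the device. */
static void model_restore_up(struct model_dev *dev)
{
	model_device_attach(dev);
}

/* Returns true if tx is still stopped after suspend, resume and ifup. */
static bool tx_stopped_after_resume(bool iface_up, bool guard)
{
	struct model_dev dev = {
		.present = true, .running = iface_up, .tx_stopped = false,
	};

	model_freeze_down(&dev, guard);
	model_restore_up(&dev);
	/* ASSUMPTION: a later ifup does not restart the tx queues. */
	dev.running = true;
	return dev.tx_stopped;
}
```

In this model, tx_stopped_after_resume(false, false) is the only combination that leaves tx stopped, which matches the down-before-suspend reproduction. For an interface that is up, the model predicts no difference between the two variants, so it does not explain why the move also helps in that case.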
Thanks
>
>> Thanks
>>
>>> Best Regards
>>> Ake Koomsin
>>>
> Best Regards