[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1cc933e6-cde4-ba20-3c54-7391db93a9a1@redhat.com>
Date: Fri, 4 Jun 2021 10:37:30 +0800
From: Jason Wang <jasowang@...hat.com>
To: wangyunjian <wangyunjian@...wei.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Cc: "kuba@...nel.org" <kuba@...nel.org>,
"davem@...emloft.net" <davem@...emloft.net>,
"mst@...hat.com" <mst@...hat.com>,
"virtualization@...ts.linux-foundation.org"
<virtualization@...ts.linux-foundation.org>,
dingxiaoxiong <dingxiaoxiong@...wei.com>
Subject: Re: [PATCH net-next] virtio_net: set link state down when virtqueue
is broken
在 2021/6/3 下午7:34, wangyunjian 写道:
>> -----Original Message-----
>> From: Jason Wang [mailto:jasowang@...hat.com]
>> Sent: Monday, May 31, 2021 11:29 AM
>> To: wangyunjian <wangyunjian@...wei.com>; netdev@...r.kernel.org
>> Cc: kuba@...nel.org; davem@...emloft.net; mst@...hat.com;
>> virtualization@...ts.linux-foundation.org; dingxiaoxiong
>> <dingxiaoxiong@...wei.com>
>> Subject: Re: [PATCH net-next] virtio_net: set link state down when virtqueue is
>> broken
>>
>>
>> 在 2021/5/28 下午6:58, wangyunjian 写道:
>>>> -----Original Message-----
>>>>> From: Yunjian Wang <wangyunjian@...wei.com>
>>>>>
>>>>> The NIC can't receive/send packets if a rx/tx virtqueue is broken.
>>>>> However, the link state of the NIC is still normal. As a result, the
>>>>> user cannot detect the NIC exception.
>>>> Doesn't we have:
>>>>
>>>> /* This should not happen! */
>>>> if (unlikely(err)) {
>>>> dev->stats.tx_fifo_errors++;
>>>> if (net_ratelimit())
>>>> dev_warn(&dev->dev,
>>>> "Unexpected TXQ (%d)
>> queue
>>>> failure: %d\n",
>>>> qnum, err);
>>>> dev->stats.tx_dropped++;
>>>> dev_kfree_skb_any(skb);
>>>> return NETDEV_TX_OK;
>>>> }
>>>>
>>>> Which should be sufficient?
>>> There may be other reasons for this error, e.g -ENOSPC(no free desc).
>>
>> This should not happen unless the device or driver is buggy. We always reserved
>> sufficient slots:
>>
>> if (sq->vq->num_free < 2+MAX_SKB_FRAGS) {
>> netif_stop_subqueue(dev, qnum); ...
>>
>>
>>> And if rx virtqueue is broken, there is no error statistics.
>>
>> Feel free to add one if it's necessary.
> Currently receiving scenario, it is impossible to distinguish whether the reason for
> not receiving packet is virtqueue's broken or no packet.
Can we introduce rx_fifo_errors for that?
>
>> Let's leave the policy decision (link down) to userspace.
>>
>>
>>>>> The driver can set the link state down when the virtqueue is broken.
>>>>> If the state is down, the user can switch over to another NIC.
>>>> Note that, we probably need the watchdog for virtio-net in order to
>>>> be a complete solution.
>>> Yes, I can think of is that the virtqueue's broken exception is detected on
>> watchdog.
>>> Is there anything else that needs to be done?
>>
>> Basically, it's all about TX stall which watchdog tries to catch. Broken vq is only
>> one of the possible reason.
> Are there any plans for the watchdog?
Somebody posted a prototype 3 or 4 years ago, you can search it and
maybe we can start from there.
Thanks
>
> Thanks
>
>> Thanks
>>
>>
>>> Thanks
>>>
>>>> Thanks
>>>>
>>>>
Powered by blists - more mailing lists