Message-ID: <0fcc1413-cb20-7a17-bdcd-6f9994990432@redhat.com>
Date: Mon, 31 May 2021 11:28:57 +0800
From: Jason Wang <jasowang@...hat.com>
To: wangyunjian <wangyunjian@...wei.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Cc: "kuba@...nel.org" <kuba@...nel.org>,
"davem@...emloft.net" <davem@...emloft.net>,
"mst@...hat.com" <mst@...hat.com>,
"virtualization@...ts.linux-foundation.org"
<virtualization@...ts.linux-foundation.org>,
dingxiaoxiong <dingxiaoxiong@...wei.com>
Subject: Re: [PATCH net-next] virtio_net: set link state down when virtqueue
is broken
On 2021/5/28 6:58 PM, wangyunjian wrote:
>> -----Original Message-----
>>> From: Yunjian Wang <wangyunjian@...wei.com>
>>>
>>> The NIC can't receive/send packets if a rx/tx virtqueue is broken.
>>> However, the link state of the NIC is still normal. As a result, the
>>> user cannot detect the NIC exception.
>>
>> Don't we have:
>>
>>     /* This should not happen! */
>>     if (unlikely(err)) {
>>         dev->stats.tx_fifo_errors++;
>>         if (net_ratelimit())
>>             dev_warn(&dev->dev,
>>                      "Unexpected TXQ (%d) queue failure: %d\n",
>>                      qnum, err);
>>         dev->stats.tx_dropped++;
>>         dev_kfree_skb_any(skb);
>>         return NETDEV_TX_OK;
>>     }
>>
>> Which should be sufficient?
> There may be other reasons for this error, e.g. -ENOSPC (no free descriptors).
This should not happen unless the device or driver is buggy. We always
reserve sufficient slots:
    if (sq->vq->num_free < 2+MAX_SKB_FRAGS) {
        netif_stop_subqueue(dev, qnum);
        ...
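
To illustrate why the reservation makes -ENOSPC unexpected, here is a
minimal userspace sketch. It is only a model, not the driver code: the
struct, the function names and the value of MAX_SKB_FRAGS are assumptions
made for the example. The one thing taken from the snippet above is the
2+MAX_SKB_FRAGS threshold.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Assumption: stand-in value; the kernel's MAX_SKB_FRAGS depends on its configuration. */
#define MAX_SKB_FRAGS 17

/* Same threshold as the quoted check: room for one worst-case packet. */
#define MODEL_RESERVE (2u + MAX_SKB_FRAGS)

/* Hypothetical userspace model of a send queue, not the driver's own struct. */
struct model_sq {
	unsigned int num_free;	/* free descriptors left in the ring */
	bool stopped;		/* set when the stack must stop submitting */
};

/* Queue one packet needing `descs` descriptors; callers only submit while !stopped. */
static int model_xmit(struct model_sq *sq, unsigned int descs)
{
	assert(descs >= 1 && descs <= MODEL_RESERVE);
	if (descs > sq->num_free)
		return -ENOSPC;	/* the "should not happen" case */
	sq->num_free -= descs;
	/* Stop the queue before a worst-case packet could fail to fit. */
	if (sq->num_free < MODEL_RESERVE)
		sq->stopped = true;
	return 0;
}

/* Return `descs` completed descriptors and restart the queue once there is room again. */
static void model_complete(struct model_sq *sq, unsigned int descs)
{
	sq->num_free += descs;
	if (sq->num_free >= MODEL_RESERVE)
		sq->stopped = false;
}
```

As long as the caller respects `stopped`, every accepted packet finds at
least MODEL_RESERVE free descriptors, so model_xmit() never returns
-ENOSPC. Seeing that error in the model would mean the free count itself
is wrong, which corresponds to the buggy device or driver case.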
> And if the rx virtqueue is broken, there are no error statistics.
Feel free to add one if it's necessary.
Let's leave the policy decision (link down) to userspace.
>
>>
>>> The driver can set the link state down when the virtqueue is broken.
>>> If the state is down, the user can switch over to another NIC.
>>
>> Note that we probably need a watchdog for virtio-net for this to be a
>> complete solution.
> Yes, what I can think of is that the broken virtqueue is detected by the watchdog.
> Is there anything else that needs to be done?
Basically, it's all about the TX stall that the watchdog tries to catch. A
broken vq is only one of the possible reasons.
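
For reference, the stall condition can be sketched in userspace as below.
This is a simplified model of the kind of check a TX watchdog makes (a
queue that has been stopped for longer than a timeout since its last
transmit), not the kernel implementation. All names here are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one TX queue's state as a watchdog would see it. */
struct model_txq {
	bool stopped;			/* queue is stopped, nothing can be submitted */
	unsigned long trans_start;	/* tick of the last transmit */
};

/* Wraparound-safe "a is later than b" on a free-running tick counter. */
static bool model_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* A queue counts as stalled if it is stopped and the timeout has elapsed. */
static bool model_tx_stalled(const struct model_txq *q,
			     unsigned long now, unsigned long timeo)
{
	return q->stopped && model_time_after(now, q->trans_start + timeo);
}

/* Return the index of the first stalled queue, or -1 if none is stalled. */
static int model_find_stalled(const struct model_txq *qs, size_t n,
			      unsigned long now, unsigned long timeo)
{
	for (size_t i = 0; i < n; i++)
		if (model_tx_stalled(&qs[i], now, timeo))
			return (int)i;
	return -1;
}
```

The check does not care why the queue stopped making progress, so a
broken vq, a lost notification or a stuck device would all look the same
to it.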
Thanks
>
> Thanks
>
>> Thanks
>>
>>