netdev - Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [day] [month] [year] [list]
Message-ID: <68ebd92e-cf7a-43af-af71-386495606d90@linux.dev>
Date: Fri, 26 Jan 2024 11:13:48 +0800
From: Zhu Yanjun <yanjun.zhu@...ux.dev>
To: Xuan Zhuo <xuanzhuo@...ux.alibaba.com>, Jason Wang <jasowang@...hat.com>
Cc: Andrew Lunn <andrew@...n.ch>, Heng Qi <hengqi@...ux.alibaba.com>,
 Paolo Abeni <pabeni@...hat.com>, Zhu Yanjun <yanjun.zhu@...el.com>,
 mst@...hat.com, davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
 virtualization@...ts.linux.dev, netdev@...r.kernel.org
Subject: Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang


在 2024/1/26 11:11, Zhu Yanjun 写道:
>
>
> 在 2024/1/22 15:02, Xuan Zhuo 写道:
>> On Mon, 22 Jan 2024 14:58:09 +0800, Jason Wang<jasowang@...hat.com>  wrote:
>>> On Mon, Jan 22, 2024 at 2:55 PM Jason Wang<jasowang@...hat.com>  wrote:
>>>> On Mon, Jan 22, 2024 at 2:20 PM Xuan Zhuo<xuanzhuo@...ux.alibaba.com>  wrote:
>>>>> On Mon, 22 Jan 2024 12:16:27 +0800, Jason Wang<jasowang@...hat.com>  wrote:
>>>>>> On Mon, Jan 22, 2024 at 12:00 PM Xuan Zhuo<xuanzhuo@...ux.alibaba.com>  wrote:
>>>>>>> On Mon, 22 Jan 2024 11:14:30 +0800, Jason Wang<jasowang@...hat.com>  wrote:
>>>>>>>> On Mon, Jan 22, 2024 at 10:12 AM Zhu Yanjun<yanjun.zhu@...ux.dev>  wrote:
>>>>>>>>> 在 2024/1/20 1:29, Andrew Lunn 写道:
>>>>>>>>>>>>>>         while (!virtqueue_get_buf(vi->cvq, &tmp) &&
>>>>>>>>>>>>>> -           !virtqueue_is_broken(vi->cvq))
>>>>>>>>>>>>>> +           !virtqueue_is_broken(vi->cvq)) {
>>>>>>>>>>>>>> +        if (timeout)
>>>>>>>>>>>>>> +            timeout--;
>>>>>>>>>>>>> This is not really a timeout, just a loop counter. 200 iterations could
>>>>>>>>>>>>> be a very short time on reasonable H/W. I guess this avoid the soft
>>>>>>>>>>>>> lockup, but possibly (likely?) breaks the functionality when we need to
>>>>>>>>>>>>> loop for some non negligible time.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I fear we need a more complex solution, as mentioned by Micheal in the
>>>>>>>>>>>>> thread you quoted.
>>>>>>>>>>>> Got it. I also look forward to the more complex solution to this problem.
>>>>>>>>>>> Can we add a device capability (new feature bit) such as ctrq_wait_timeout
>>>>>>>>>>> to get a reasonable timeout？
>>>>>>>>>> The usual solution to this is include/linux/iopoll.h. If you can sleep
>>>>>>>>>> read_poll_timeout() otherwise read_poll_timeout_atomic().
>>>>>>>>> I read carefully the functions read_poll_timeout() and
>>>>>>>>> read_poll_timeout_atomic(). The timeout is set by the caller of the 2
>>>>>>>>> functions.
>>>>>>>> FYI, in order to avoid a swtich of atomic or not, we need convert rx
>>>>>>>> mode setting to workqueue first:
>>>>>>>>
>>>>>>>> https://www.mail-archive.com/virtualization@lists.linux-foundation.org/msg60298.html
>>>>>>>>
>>>>>>>>> As such, can we add a module parameter to customize this timeout value
>>>>>>>>> by the user?
>>>>>>>> Who is the "user" here, or how can the "user" know the value?
>>>>>>>>
>>>>>>>>> Or this timeout value is stored in device register, virtio_net driver
>>>>>>>>> will read this timeout value at initialization?
>>>>>>>> See another thread. The design needs to be general, or you can post a RFC.
>>>>>>>>
>>>>>>>> In another thought, we've already had a tx watchdog, maybe we can have
>>>>>>>> something similar to cvq and use timeout + reset in that case.
>>>>>>> But we may block by the reset ^_^ if the device is broken?
>>>>>> I mean vq reset here.
>>>>> I see.
>>>>>
>>>>> I mean when the deivce is broken, the vq reset also many be blocked.
>>>>>
>>>>>          void vp_modern_set_queue_reset(struct virtio_pci_modern_device *mdev, u16 index)
>>>>>          {
>>>>>                  struct virtio_pci_modern_common_cfg __iomem *cfg;
>>>>>
>>>>>                  cfg = (struct virtio_pci_modern_common_cfg __iomem *)mdev->common;
>>>>>
>>>>>                  vp_iowrite16(index, &cfg->cfg.queue_select);
>>>>>                  vp_iowrite16(1, &cfg->queue_reset);
>>>>>
>>>>>                  while (vp_ioread16(&cfg->queue_reset))
>>>>>                          msleep(1);
>>>>>
>>>>>                  while (vp_ioread16(&cfg->cfg.queue_enable))
>>>>>                          msleep(1);
>>>>>          }
>>>>>          EXPORT_SYMBOL_GPL(vp_modern_set_queue_reset);
>>>>>
>>>>> In this function, for the broken device, we can not expect something.
>>>> Yes, it's best effort, there's no guarantee then. But it doesn't harm to try.
>>>>
>>>> Thanks
>>>>
>>>>>> It looks like we have multiple goals here
>>>>>>
>>>>>> 1) avoid lockups, using workqueue + cond_resched() seems to be
>>>>>> sufficient, it has issue but nothing new
>>>>>> 2) recover from the unresponsive device, the issue for timeout is that
>>>>>> it needs to deal with false positives
>>>>> I agree.
>>>>>
>>>>> But I want to add a new goal, cvq async. In the netdim, we will
>>>>> send many requests via the cvq, so the cvq async will be nice.
>>> Then you need an interrupt for cvq.
>>>
>>> FYI, I've posted a series that use interrupt for cvq in the past:
>>>
>>> https://lore.kernel.org/lkml/6026e801-6fda-fee9-a69b-d06a80368621@redhat.com/t/
>> I know this. But the interrupt maybe not a good solution without new space.
>>
>>> Haven't found time in working on this anymore, maybe we can start from
>>> this or not.
>> I said async, but my aim is to put many requests to the cvq before getting the
>> response.
>>
>> Heng Qi posted thishttps://lore.kernel.org/all/1705410693-118895-4-git-send-email-hengqi@linux.alibaba.com/

Sorry. This mail is rejected by netdev maillist. So I have to resend it.


Thanks a lot. I read Heng Qi's commits carefully. This patch series are 
similiar with the NIC feature xmit_more.

But if cvq command is urgent, can we let this urgent cvq command be 
passed ASAP?

I mean, can we set a flag similar to xmit_more? if cvq command is not 
urgent, it can be queued. If it is urgent,

this cvq command is passed ASAP.

Zhu Yanjun

> Zhu Yanjun
>
>> Thanks.
>>
>>
>>> Thanks
>>>
>>>>> Thanks.
>>>>>
>>>>>
>>>>>> Thanks
>>>>>>
>>>>>>> Thanks.
>>>>>>>
>>>>>>>
>>>>>>>> Thans
>>>>>>>>
>>>>>>>>> Zhu Yanjun
>>>>>>>>>
>>>>>>>>>>        Andrew