Date: Tue, 26 Mar 2024 10:46:41 +0800
From: Heng Qi <hengqi@...ux.alibaba.com>
To: Jason Wang <jasowang@...hat.com>
Cc: netdev@...r.kernel.org, virtualization@...ts.linux.dev,
 "Michael S. Tsirkin" <mst@...hat.com>, Jakub Kicinski <kuba@...nel.org>,
 Paolo Abeni <pabeni@...hat.com>, Eric Dumazet <edumazet@...gle.com>,
 "David S. Miller" <davem@...emloft.net>,
 Xuan Zhuo <xuanzhuo@...ux.alibaba.com>
Subject: Re: [PATCH 2/2] virtio-net: reduce the CPU consumption of dim worker



On 2024/3/25 at 4:42 PM, Jason Wang wrote:
> On Mon, Mar 25, 2024 at 4:22 PM Heng Qi <hengqi@...ux.alibaba.com> wrote:
>>
>>
>> On 2024/3/25 at 3:56 PM, Jason Wang wrote:
>>> On Mon, Mar 25, 2024 at 3:18 PM Heng Qi <hengqi@...ux.alibaba.com> wrote:
>>>>
>>>> On 2024/3/25 at 1:57 PM, Jason Wang wrote:
>>>>> On Mon, Mar 25, 2024 at 10:21 AM Heng Qi <hengqi@...ux.alibaba.com> wrote:
>>>>>> On 2024/3/22 at 1:19 PM, Jason Wang wrote:
>>>>>>> On Thu, Mar 21, 2024 at 7:46 PM Heng Qi <hengqi@...ux.alibaba.com> wrote:
>>>>>>>> Currently, ctrlq processes commands in a synchronous manner,
>>>>>>>> which increases the delay of dim commands when configuring
>>>>>>>> multi-queue VMs, which in turn causes the CPU utilization to
>>>>>>>> increase and interferes with the performance of dim.
>>>>>>>>
>>>>>>>> Therefore we asynchronously process ctlq's dim commands.
>>>>>>>>
>>>>>>>> Signed-off-by: Heng Qi <hengqi@...ux.alibaba.com>
>>>>>>> I may miss some previous discussions.
>>>>>>>
>>>>>>> But at least the changelog needs to explain why you don't use interrupt.
>>>>>> Will add, but reply here first.
>>>>>>
>>>>>> When upgrading the driver's ctrlq to use interrupt, problems may occur
>>>>>> with some existing devices.
>>>>>> For example, when existing devices are replaced with new drivers, they
>>>>>> may not work.
>>>>>> Or, if the guest OS supported by the new device is replaced by an old
>>>>>> downstream OS product, it will not be usable.
>>>>>>
>>>>>> Although, ctrlq has the same capabilities as IOq in the virtio spec,
>>>>>> this does have historical baggage.
>>>>> I don't think the upstream Linux drivers need to workaround buggy
>>>>> devices. Or it is a good excuse to block configure interrupts.
>>>> Of course I agree. Our DPU devices support ctrlq irq natively, as long
>>>> as the guest os opens irq to ctrlq.
>>>>
>>>> If other products have no problem with this, I would prefer to use irq
>>>> to solve this problem, which is the most essential solution.
>>> Let's do that.
>> Ok, will do.
>>
>> Do you have the link to the patch where you previously modified the
>> control queue for interrupt notifications?
>> I think a new patch could be made on top of it, but I can't seem to find it.
> Something like this?

YES. Thanks Jason.

>
> https://lore.kernel.org/lkml/6026e801-6fda-fee9-a69b-d06a80368621@redhat.com/t/
>
> Note that
>
> 1) some patch has been merged
> 2) we probably need to drop the timeout logic as it's another topic
> 3) need to address other comments

I did a quick read of the previous 5 versions of your patch set:
[1] https://lore.kernel.org/lkml/6026e801-6fda-fee9-a69b-d06a80368621@redhat.com/t/
[2] https://lore.kernel.org/all/20221226074908.8154-1-jasowang@redhat.com/
[3] https://lore.kernel.org/all/20230413064027.13267-1-jasowang@redhat.com/
[4] https://lore.kernel.org/all/20230524081842.3060-1-jasowang@redhat.com/
[5] https://lore.kernel.org/all/20230720083839.481487-1-jasowang@redhat.com/

Regarding adding the interrupt to ctrlq, there are a few points on which
no agreement has been reached, which I summarize below.

1. It requires an additional interrupt vector resource
https://lore.kernel.org/all/20230516165043-mutt-send-email-mst@kernel.org/
2. Adding the interrupt for ctrlq may break some devices
https://lore.kernel.org/all/f9e75ce5-e6df-d1be-201b-7d0f18c1b6e7@redhat.com/
3. RTNL breaks surprise removal
https://lore.kernel.org/all/20230720170001-mutt-send-email-mst@kernel.org/

Regarding the above, there seems to be no conclusion yet.
If these problems still exist, I think this patch is good enough and we 
can merge it first.

The third point seems to be being addressed by Daniel now [6], but a
spin lock is used there, which I think conflicts with the approach of
adding interrupts to ctrlq.

[6] https://lore.kernel.org/all/20240325214912.323749-1-danielj@nvidia.com/


Thanks,
Heng

>
> Thanks
>
>
>> Thanks,
>> Heng
>>
>>> Thanks
>>>
>>>>> And I remember you told us your device doesn't have such an issue.
>>>> YES.
>>>>
>>>> Thanks,
>>>> Heng
>>>>
>>>>> Thanks
>>>>>
>>>>>> Thanks,
>>>>>> Heng
>>>>>>
>>>>>>> Thanks

