lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <75f1ae35-aeee-404a-be1c-2ffa05126cdb@linux.alibaba.com>
Date: Tue, 26 Mar 2024 13:57:21 +0800
From: Heng Qi <hengqi@...ux.alibaba.com>
To: Jason Wang <jasowang@...hat.com>
Cc: netdev@...r.kernel.org, virtualization@...ts.linux.dev,
 "Michael S. Tsirkin" <mst@...hat.com>, Jakub Kicinski <kuba@...nel.org>,
 Paolo Abeni <pabeni@...hat.com>, Eric Dumazet <edumazet@...gle.com>,
 "David S. Miller" <davem@...emloft.net>,
 Xuan Zhuo <xuanzhuo@...ux.alibaba.com>
Subject: Re: [PATCH 2/2] virtio-net: reduce the CPU consumption of dim worker



在 2024/3/26 下午12:08, Jason Wang 写道:
> On Tue, Mar 26, 2024 at 10:46 AM Heng Qi <hengqi@...ux.alibaba.com> wrote:
>>
>>
>> 在 2024/3/25 下午4:42, Jason Wang 写道:
>>> On Mon, Mar 25, 2024 at 4:22 PM Heng Qi <hengqi@...ux.alibaba.com> wrote:
>>>>
>>>> 在 2024/3/25 下午3:56, Jason Wang 写道:
>>>>> On Mon, Mar 25, 2024 at 3:18 PM Heng Qi <hengqi@...ux.alibaba.com> wrote:
>>>>>> 在 2024/3/25 下午1:57, Jason Wang 写道:
>>>>>>> On Mon, Mar 25, 2024 at 10:21 AM Heng Qi <hengqi@...ux.alibaba.com> wrote:
>>>>>>>> 在 2024/3/22 下午1:19, Jason Wang 写道:
>>>>>>>>> On Thu, Mar 21, 2024 at 7:46 PM Heng Qi <hengqi@...ux.alibaba.com> wrote:
>>>>>>>>>> Currently, ctrlq processes commands in a synchronous manner,
>>>>>>>>>> which increases the delay of dim commands when configuring
>>>>>>>>>> multi-queue VMs, which in turn causes the CPU utilization to
>>>>>>>>>> increase and interferes with the performance of dim.
>>>>>>>>>>
>>>>>>>>>> Therefore we asynchronously process ctlq's dim commands.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Heng Qi <hengqi@...ux.alibaba.com>
>>>>>>>>> I may miss some previous discussions.
>>>>>>>>>
>>>>>>>>> But at least the changelog needs to explain why you don't use interrupt.
>>>>>>>> Will add, but reply here first.
>>>>>>>>
>>>>>>>> When upgrading the driver's ctrlq to use interrupt, problems may occur
>>>>>>>> with some existing devices.
>>>>>>>> For example, when existing devices are replaced with new drivers, they
>>>>>>>> may not work.
>>>>>>>> Or, if the guest OS supported by the new device is replaced by an old
>>>>>>>> downstream OS product, it will not be usable.
>>>>>>>>
>>>>>>>> Although, ctrlq has the same capabilities as IOq in the virtio spec,
>>>>>>>> this does have historical baggage.
>>>>>>> I don't think the upstream Linux drivers need to workaround buggy
>>>>>>> devices. Or it is a good excuse to block configure interrupts.
>>>>>> Of course I agree. Our DPU devices support ctrlq irq natively, as long
>>>>>> as the guest os opens irq to ctrlq.
>>>>>>
>>>>>> If other products have no problem with this, I would prefer to use irq
>>>>>> to solve this problem, which is the most essential solution.
>>>>> Let's do that.
>>>> Ok, will do.
>>>>
>>>> Do you have the link to the patch where you previously modified the
>>>> control queue for interrupt notifications.
>>>> I think a new patch could be made on top of it, but I can't seem to find it.
>>> Something like this?
>> YES. Thanks Jason.
>>
>>> https://lore.kernel.org/lkml/6026e801-6fda-fee9-a69b-d06a80368621@redhat.com/t/
>>>
>>> Note that
>>>
>>> 1) some patch has been merged
>>> 2) we probably need to drop the timeout logic as it's another topic
>>> 3) need to address other comments
>> I did a quick read of your patch sets from the previous 5 version:
>> [1]
>> https://lore.kernel.org/lkml/6026e801-6fda-fee9-a69b-d06a80368621@redhat.com/t/
>> [2] https://lore.kernel.org/all/20221226074908.8154-1-jasowang@redhat.com/
>> [3] https://lore.kernel.org/all/20230413064027.13267-1-jasowang@redhat.com/
>> [4] https://lore.kernel.org/all/20230524081842.3060-1-jasowang@redhat.com/
>> [5] https://lore.kernel.org/all/20230720083839.481487-1-jasowang@redhat.com/
>>
>> Regarding adding the interrupt to ctrlq, there are a few points where
>> there is no agreement,
>> which I summarize below.
>>
>> 1. Require additional interrupt vector resource
>> https://lore.kernel.org/all/20230516165043-mutt-send-email-mst@kernel.org/
> I don't think one more vector is a big problem. Multiqueue will
> require much more than this.
>
> Even if it is, we can try to share an interrupt as Michael suggests.
>
> Let's start from something that is simple, just one more vector.

OK, that puts my concerns to rest.

>
>> 2. Adding the interrupt for ctrlq may break some devices
>> https://lore.kernel.org/all/f9e75ce5-e6df-d1be-201b-7d0f18c1b6e7@redhat.com/
> These devices need to be fixed. It's hard to imagine the evolution of
> virtio-net is blocked by buggy devices.

Agree.

>
>> 3. RTNL breaks surprise removal
>> https://lore.kernel.org/all/20230720170001-mutt-send-email-mst@kernel.org/
> The comment is for indefinite waiting for ctrl vq which turns out to
> be another issue.
>
> For the removal, we just need to do the wakeup then everything is fine.

Then I will make a patch set based on irq and without timeout.

>
>> Regarding the above, there seems to be no conclusion yet.
>> If these problems still exist, I think this patch is good enough and we
>> can merge it first.
> I don't think so, poll turns out to be problematic for a lot of cases.
>
>> For the third point, it seems to be being solved by Daniel now [6], but
>> spink lock is used,
>> which I think conflicts with the way of adding interrupts to ctrlq.
>>
>> [6] https://lore.kernel.org/all/20240325214912.323749-1-danielj@nvidia.com/
> I don't see how it conflicts with this.

I'll just make changes on top of it. Can I?

Thanks,
Heng

>
> Thanks
>
>>
>> Thanks,
>> Heng
>>
>>> THanks
>>>
>>>
>>>> Thanks,
>>>> Heng
>>>>
>>>>> Thanks
>>>>>
>>>>>>> And I remember you told us your device doesn't have such an issue.
>>>>>> YES.
>>>>>>
>>>>>> Thanks,
>>>>>> Heng
>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Heng
>>>>>>>>
>>>>>>>>> Thanks


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ