[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CACGkMEsvsmyUaTD35kp=4qJhMdDYG=hGVbT0JGGTwGTb3XRuLg@mail.gmail.com>
Date: Tue, 26 Mar 2024 12:08:16 +0800
From: Jason Wang <jasowang@...hat.com>
To: Heng Qi <hengqi@...ux.alibaba.com>
Cc: netdev@...r.kernel.org, virtualization@...ts.linux.dev,
"Michael S. Tsirkin" <mst@...hat.com>, Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Eric Dumazet <edumazet@...gle.com>, "David S. Miller" <davem@...emloft.net>,
Xuan Zhuo <xuanzhuo@...ux.alibaba.com>
Subject: Re: [PATCH 2/2] virtio-net: reduce the CPU consumption of dim worker
On Tue, Mar 26, 2024 at 10:46 AM Heng Qi <hengqi@...ux.alibaba.com> wrote:
>
>
>
> 在 2024/3/25 下午4:42, Jason Wang 写道:
> > On Mon, Mar 25, 2024 at 4:22 PM Heng Qi <hengqi@...ux.alibaba.com> wrote:
> >>
> >>
> >> 在 2024/3/25 下午3:56, Jason Wang 写道:
> >>> On Mon, Mar 25, 2024 at 3:18 PM Heng Qi <hengqi@...ux.alibaba.com> wrote:
> >>>>
> >>>> 在 2024/3/25 下午1:57, Jason Wang 写道:
> >>>>> On Mon, Mar 25, 2024 at 10:21 AM Heng Qi <hengqi@...ux.alibaba.com> wrote:
> >>>>>> 在 2024/3/22 下午1:19, Jason Wang 写道:
> >>>>>>> On Thu, Mar 21, 2024 at 7:46 PM Heng Qi <hengqi@...ux.alibaba.com> wrote:
> >>>>>>>> Currently, ctrlq processes commands in a synchronous manner,
> >>>>>>>> which increases the delay of dim commands when configuring
> >>>>>>>> multi-queue VMs, which in turn causes the CPU utilization to
> >>>>>>>> increase and interferes with the performance of dim.
> >>>>>>>>
> >>>>>>>> Therefore we asynchronously process ctlq's dim commands.
> >>>>>>>>
> >>>>>>>> Signed-off-by: Heng Qi <hengqi@...ux.alibaba.com>
> >>>>>>> I may miss some previous discussions.
> >>>>>>>
> >>>>>>> But at least the changelog needs to explain why you don't use interrupt.
> >>>>>> Will add, but reply here first.
> >>>>>>
> >>>>>> When upgrading the driver's ctrlq to use interrupt, problems may occur
> >>>>>> with some existing devices.
> >>>>>> For example, when existing devices are replaced with new drivers, they
> >>>>>> may not work.
> >>>>>> Or, if the guest OS supported by the new device is replaced by an old
> >>>>>> downstream OS product, it will not be usable.
> >>>>>>
> >>>>>> Although, ctrlq has the same capabilities as IOq in the virtio spec,
> >>>>>> this does have historical baggage.
> >>>>> I don't think the upstream Linux drivers need to workaround buggy
> >>>>> devices. Or it is a good excuse to block configure interrupts.
> >>>> Of course I agree. Our DPU devices support ctrlq irq natively, as long
> >>>> as the guest os opens irq to ctrlq.
> >>>>
> >>>> If other products have no problem with this, I would prefer to use irq
> >>>> to solve this problem, which is the most essential solution.
> >>> Let's do that.
> >> Ok, will do.
> >>
> >> Do you have the link to the patch where you previously modified the
> >> control queue for interrupt notifications.
> >> I think a new patch could be made on top of it, but I can't seem to find it.
> > Something like this?
>
> YES. Thanks Jason.
>
> >
> > https://lore.kernel.org/lkml/6026e801-6fda-fee9-a69b-d06a80368621@redhat.com/t/
> >
> > Note that
> >
> > 1) some patch has been merged
> > 2) we probably need to drop the timeout logic as it's another topic
> > 3) need to address other comments
>
> I did a quick read of your patch sets from the previous 5 version:
> [1]
> https://lore.kernel.org/lkml/6026e801-6fda-fee9-a69b-d06a80368621@redhat.com/t/
> [2] https://lore.kernel.org/all/20221226074908.8154-1-jasowang@redhat.com/
> [3] https://lore.kernel.org/all/20230413064027.13267-1-jasowang@redhat.com/
> [4] https://lore.kernel.org/all/20230524081842.3060-1-jasowang@redhat.com/
> [5] https://lore.kernel.org/all/20230720083839.481487-1-jasowang@redhat.com/
>
> Regarding adding the interrupt to ctrlq, there are a few points where
> there is no agreement,
> which I summarize below.
>
> 1. Require additional interrupt vector resource
> https://lore.kernel.org/all/20230516165043-mutt-send-email-mst@kernel.org/
I don't think one more vector is a big problem. Multiqueue will
require much more than this.
Even if it is, we can try to share an interrupt as Michael suggests.
Let's start from something that is simple, just one more vector.
> 2. Adding the interrupt for ctrlq may break some devices
> https://lore.kernel.org/all/f9e75ce5-e6df-d1be-201b-7d0f18c1b6e7@redhat.com/
These devices need to be fixed. It's hard to imagine the evolution of
virtio-net is blocked by buggy devices.
> 3. RTNL breaks surprise removal
> https://lore.kernel.org/all/20230720170001-mutt-send-email-mst@kernel.org/
The comment is for indefinite waiting for ctrl vq which turns out to
be another issue.
For the removal, we just need to do the wakeup then everything is fine.
>
> Regarding the above, there seems to be no conclusion yet.
> If these problems still exist, I think this patch is good enough and we
> can merge it first.
I don't think so, poll turns out to be problematic for a lot of cases.
>
> For the third point, it seems to be being solved by Daniel now [6], but
> spink lock is used,
> which I think conflicts with the way of adding interrupts to ctrlq.
>
> [6] https://lore.kernel.org/all/20240325214912.323749-1-danielj@nvidia.com/
I don't see how it conflicts with this.
Thanks
>
>
> Thanks,
> Heng
>
> >
> > THanks
> >
> >
> >> Thanks,
> >> Heng
> >>
> >>> Thanks
> >>>
> >>>>> And I remember you told us your device doesn't have such an issue.
> >>>> YES.
> >>>>
> >>>> Thanks,
> >>>> Heng
> >>>>
> >>>>> Thanks
> >>>>>
> >>>>>> Thanks,
> >>>>>> Heng
> >>>>>>
> >>>>>>> Thanks
>
Powered by blists - more mailing lists