Message-ID: <CAEA6p_Bi1OMTas0W4VuxAMz8Frf+vBNc8c7xCDUxb_uwUy8Zgw@mail.gmail.com>
Date: Tue, 9 Feb 2021 10:00:22 -0800
From: Wei Wang <weiwan@...gle.com>
To: Willem de Bruijn <willemdebruijn.kernel@...il.com>,
Jason Wang <jasowang@...hat.com>,
"Michael S. Tsirkin" <mst@...hat.com>
Cc: David Miller <davem@...emloft.net>,
Network Development <netdev@...r.kernel.org>,
Jakub Kicinski <kuba@...nel.org>,
virtualization@...ts.linux-foundation.org
Subject: Re: [PATCH net] virtio-net: suppress bad irq warning for tx napi
On Tue, Feb 9, 2021 at 6:58 AM Willem de Bruijn
<willemdebruijn.kernel@...il.com> wrote:
>
> > >>> I have no preference. Just curious, especially if it complicates the patch.
> > >>>
> > >> My understanding is that it's probably ok for net. But we probably need
> > >> to document the assumptions to make sure it is not abused in other drivers.
> > >>
> > >> Introducing new parameters for find_vqs() can help eliminate the subtle
> > >> issues, but I agree it looks like overkill.
> > >>
> > >> (Btw, I forget the numbers, but I wonder how much difference it makes if
> > >> we simply remove free_old_xmit_skbs() from the rx NAPI path?)
> > > The committed patchset did not record those numbers, but I found them
> > > in an earlier iteration:
> > >
> > > [PATCH net-next 0/3] virtio-net tx napi
> > > https://lists.openwall.net/netdev/2017/04/02/55
> > >
> > > It did seem to significantly reduce compute cycles ("Gcyc") at the
> > > time. For instance:
> > >
> > > TCP_RR Latency (us):
> > > 1x:
> > > p50 24 24 21
> > > p99 27 27 27
> > > Gcycles 299 432 308
> > >
> > > I'm concerned that removing it now may cause a regression report in a
> > > few months. That is higher risk than the spurious interrupt warning
> > > that was only reported after years of use.
> >
> >
> > Right.
> >
> > So if Michael is fine with this approach, I'm ok with it. But we
> > probably need a TODO to invent interrupt handlers that can be
> > used for more than one virtqueue. When MSI-X is enabled, the interrupt
> > handler (vring_interrupt()) assumes the interrupt is used by a single
> > virtqueue.
>
> Thanks.
>
> The approach to schedule tx-napi from virtnet_poll_cleantx instead of
> cleaning directly in this rx-napi function was not effective at
> suppressing the warning, I understand.
Correct. I tried scheduling tx napi instead of calling
free_old_xmit_skbs() directly in virtnet_poll_cleantx(), but the
warning still occurs.
>
> It should be easy to drop the spurious rate to a little under 99%,
> if only to suppress the warning, by probabilistically cleaning in
> virtnet_poll_cleantx only 98 times out of 100, say. But that
> really is a hack.
>
> There does seem to be a huge potential for cycle savings if we can
> really avoid the many spurious interrupts.
>
> As scheduling napi_tx from virtnet_poll_cleantx is not effective,
> agreed that we should probably look at the more complete solution to
> handle both tx and rx virtqueues from the same IRQ.
>
> And revert this explicit warning suppression patch once we have that.
If everyone agrees, I will submit a new version of this patch to
address previous comments, before a more complete solution is
proposed.
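
For reference, the "clean only 98 times out of 100" idea quoted above can
be illustrated with a small user-space model. This is only a sketch under
stated assumptions, not driver code: the names should_clean_tx(),
simulate() and struct tx_irq_stats are hypothetical, a deterministic
counter stands in for the probabilistic choice, and the model assumes the
worst case in which every rx poll races with exactly one tx interrupt.
The kernel's warning threshold itself is not modeled.

```c
/* Hypothetical toy model; none of these names exist in the kernel. */
#define CLEAN_NUM 98   /* rx-napi cleans tx in 98 ...        */
#define CLEAN_DEN 100  /* ... out of every 100 polls (assumed) */

struct tx_irq_stats {
    unsigned long total;     /* tx interrupts delivered        */
    unsigned long unhandled; /* tx interrupts that found no work */
};

/* Decide whether this rx poll should also clean the tx queue.
 * A counter replaces the random draw so the result is repeatable. */
static int should_clean_tx(unsigned long poll_idx)
{
    return (poll_idx % CLEAN_DEN) < CLEAN_NUM;
}

/* Worst case assumed: each rx poll is followed by one tx interrupt. */
static struct tx_irq_stats simulate(unsigned long polls)
{
    struct tx_irq_stats s = {0, 0};
    unsigned long i;

    for (i = 0; i < polls; i++) {
        int pending = 1;          /* one completed tx buffer */

        if (should_clean_tx(i))
            pending = 0;          /* rx-napi freed it first */

        s.total++;
        if (!pending)
            s.unhandled++;        /* tx irq had nothing to do */
    }
    return s;
}
```

In this model, simulate(100000) counts 98000 unhandled tx interrupts out
of 100000, i.e. the skipped polls are the only ones that leave work for
the tx interrupt handler. It says nothing about the cycle cost of the
interrupts that remain.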