Message-ID: <CAF=yD-K9xTBStGR5BEiS6WZd=znqM_ENcj9_nb=rrpcMORqE8g@mail.gmail.com>
Date:   Tue, 9 Feb 2021 09:58:04 -0500
From:   Willem de Bruijn <willemdebruijn.kernel@...il.com>
To:     Jason Wang <jasowang@...hat.com>
Cc:     Wei Wang <weiwan@...gle.com>, David Miller <davem@...emloft.net>,
        Network Development <netdev@...r.kernel.org>,
        Jakub Kicinski <kuba@...nel.org>,
        virtualization@...ts.linux-foundation.org
Subject: Re: [PATCH net] virtio-net: suppress bad irq warning for tx napi

> >>> I have no preference. Just curious, especially if it complicates the patch.
> >>>
> >> My understanding is this: it's probably OK for net, but we probably
> >> need to document the assumptions to make sure they are not abused in
> >> other drivers.
> >>
> >> Introducing new parameters for find_vqs() could help eliminate the
> >> subtle stuff, but I agree it looks like overkill.
> >>
> >> (Btw, I forget the numbers, but I wonder how much difference it makes
> >> if we simply remove the free_old_xmits() call from the rx NAPI path?)
> > The committed patchset did not record those numbers, but I found them
> > in an earlier iteration:
> >
> >    [PATCH net-next 0/3] virtio-net tx napi
> >    https://lists.openwall.net/netdev/2017/04/02/55
> >
> > It did seem to significantly reduce compute cycles ("Gcyc") at the
> > time. For instance:
> >
> >      TCP_RR Latency (us):
> >      1x:
> >        p50              24       24       21
> >        p99              27       27       27
> >        Gcycles         299      432      308
> >
> > I'm concerned that removing it now may cause a regression report in a
> > few months. That is a higher risk than the spurious interrupt warning,
> > which was only reported after years of use.
>
>
> Right.
>
> So if Michael is fine with this approach, I'm OK with it. But we
> probably need to add a TODO to invent interrupt handlers that can be
> used for more than one virtqueue. When MSI-X is enabled, the interrupt
> handler (vring_interrupt()) assumes the interrupt is used by a single
> virtqueue.

Thanks.

The approach of scheduling tx napi from virtnet_poll_cleantx, instead of
cleaning directly in this rx napi function, was not effective at
suppressing the warning, I understand.

It should be easy to drop the spurious rate to a little under 99
percent, if only to suppress the warning, by probabilistically cleaning
in virtnet_poll_cleantx only 98 times out of 100, say. But that really
is a hack.
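
Just to spell out what I mean by that hack (purely illustrative; the
sq->clean_skip counter below is a hypothetical per-queue field, not
something that exists in the driver):

	/* In virtnet_poll_cleantx(): skip the reclaim ~2% of the time so
	 * the tx interrupt occasionally finds work pending and the
	 * spurious ratio stays just below the warning threshold.
	 */
	if (++sq->clean_skip % 100 >= 98)
		return;

	if (__netif_tx_trylock(txq)) {
		free_old_xmit_skbs(sq, true);
		__netif_tx_unlock(txq);
	}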

There does seem to be a huge potential for cycle savings if we can
really avoid the many spurious interrupts.

As scheduling napi_tx from virtnet_poll_cleantx is not effective,
agreed that we should probably look at the more complete solution of
handling both tx and rx virtqueues from the same IRQ.
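
Very roughly, what I picture for that longer-term direction (entirely a
sketch; the virtio core has no such registration today, and the
virtnet_pair structure and handler below are made-up names):

static irqreturn_t virtnet_pair_intr(int irq, void *data)
{
	struct virtnet_pair *pair = data;	/* hypothetical rq/sq pairing */

	/* Disable further callbacks on both vqs and let napi poll both
	 * directions. Because one handler owns both virtqueues and always
	 * returns IRQ_HANDLED, an interrupt that only carried rx (or only
	 * tx) work is no longer counted as spurious for the other queue.
	 */
	virtqueue_disable_cb(pair->rq->vq);
	virtqueue_disable_cb(pair->sq->vq);
	napi_schedule(&pair->rq->napi);
	napi_schedule(&pair->sq->napi);

	return IRQ_HANDLED;
}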

And revert this explicit warning suppression patch once we have that.
