Message-ID: <CACT4ouf-0AVHvwyPMGN9q-C70Sjm-PFqBnAz7L4rJGKcsVeYXA@mail.gmail.com>
Date:   Mon, 12 Jul 2021 15:40:16 +0200
From:   Íñigo Huguet <ihuguet@...hat.com>
To:     Jesper Dangaard Brouer <jbrouer@...hat.com>
Cc:     Edward Cree <ecree.xilinx@...il.com>,
        "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>, ivan@...udflare.com,
        ast@...nel.org, daniel@...earbox.net, hawk@...nel.org,
        john.fastabend@...il.com, netdev@...r.kernel.org,
        linux-kernel@...r.kernel.org, brouer@...hat.com
Subject: Re: [PATCH 1/3] sfc: revert "reduce the number of requested xdp ev queues"

On Fri, Jul 9, 2021 at 5:07 PM Jesper Dangaard Brouer
<jbrouer@...hat.com> wrote:
> > I think it's less about that and more about avoiding lock contention.
> > If two sources (XDP and the regular stack) are both trying to use a TXQ,
> >   and contending for a lock, it's possible that the resulting total
> >   throughput could be far less than either source alone would get if it
> >   had exclusive use of a queue.
> > There don't really seem to be any good answers to this; any CPU in the
> >   system can initiate an XDP_REDIRECT at any time and if they can't each
> >   get a queue to themselves then I don't see how the arbitration can be
> >   performant.  (There is the middle-ground possibility of TXQs shared by
> >   multiple XDP CPUs but not shared with the regular stack, in which case
> >   if only a subset of CPUs are actually handling RX on the device(s) with
> >   an XDP_REDIRECTing program it may be possible to avoid contention if
> >   the core-to-XDP-TXQ mapping can be carefully configured.)
>
> Yes, I prefer the 'middle-ground' fallback you describe.  XDP gets its
> own set of TXQs, and when the driver detects that there are fewer TXQs
> than CPUs that can redirect packets, it uses an ndo_xdp_xmit function
> that takes a (hashed) lock, once per packet burst (max 16 packets).

That's a good idea, which in fact I had already considered, but had
(almost) discarded because I still see two problems with it:
1. If there are no free MSI-X vectors remaining at all,
XDP_TX/REDIRECT will still be disabled.
2. If the number of free MSI-X vectors is small, many CPUs will be
contending for very few queues/locks: not for normal traffic, but for
XDP traffic. Anyone who wants to use XDP_TX/REDIRECT intensively will
get very poor performance, with no way to reach a better tradeoff
between normal and XDP traffic.
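
For concreteness, my understanding of that hashed-lock fallback is
roughly the sketch below (all names are illustrative, not actual sfc
code; only the ndo_xdp_xmit signature and the XDP_XMIT_FLUSH flag come
from the real kernel API):

/* Sketch: fewer dedicated XDP TXQs than CPUs, so hash the current
 * CPU onto a queue and take that queue's lock once per burst
 * (n <= 16 frames), not once per packet.
 */
static int sketch_xdp_xmit(struct net_device *dev, int n,
			   struct xdp_frame **frames, u32 flags)
{
	struct sketch_priv *priv = netdev_priv(dev);	/* hypothetical */
	struct sketch_txq *txq;
	int i, sent = 0;

	if (unlikely(flags & ~XDP_XMIT_FLUSH))
		return -EINVAL;

	/* Hash the CPU id over the smaller set of XDP TXQs */
	txq = &priv->xdp_txq[smp_processor_id() % priv->n_xdp_txq];

	spin_lock(&txq->lock);		/* contended when CPUs > queues */
	for (i = 0; i < n; i++) {
		if (sketch_tx_enqueue(txq, frames[i]))	/* hypothetical */
			break;
		sent++;
	}
	if (flags & XDP_XMIT_FLUSH)
		sketch_tx_push(txq);	/* hypothetical doorbell write */
	spin_unlock(&txq->lock);

	return sent;
}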

We have to consider that both scenarios are quite plausible, because
this problem appears on machines with a high number of CPUs. Even if
support for more channels and more queues per channel is added, who
knows what crazy CPU core counts we will be using in a few years? And
we also have to consider VFs, which usually have far fewer MSI-X
vectors available and can be assigned to many different virtual
machine configurations.

So I think that we still need a last-resort fallback of sharing TXQs
with the network stack (see the pseudo-C sketch after this list):
1. If there are enough resources: 1 queue per CPU, dedicated to XDP
2. If there are not enough resources, but still a fair amount: several
queues dedicated only to XDP, with (hashed) locking contention
3. If there are no free resources, or very few: TXQs shared between
the network core and XDP
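
In pseudo-C, the selection could look something like this (the
thresholds, mode names and helpers are all hypothetical, not actual
sfc code):

/* Sketch: choose an XDP TXQ strategy at probe time based on how many
 * MSI-X vectors are left after the regular channels are set up.
 */
static void sketch_setup_xdp_txqs(struct sketch_priv *priv,
				  unsigned int free_vectors)
{
	unsigned int n_cpus = num_possible_cpus();

	if (free_vectors >= n_cpus) {
		/* 1. One dedicated TXQ per CPU: lockless fast path */
		priv->n_xdp_txq = n_cpus;
		priv->xdp_mode = SKETCH_XDP_DEDICATED;
	} else if (free_vectors > 0) {
		/* 2. Fewer dedicated TXQs than CPUs: hashed locking,
		 * as in the xmit sketch above
		 */
		priv->n_xdp_txq = free_vectors;
		priv->xdp_mode = SKETCH_XDP_HASHED_LOCK;
	} else {
		/* 3. Nothing free: share the regular stack's TXQs
		 * instead of disabling XDP_TX/REDIRECT entirely
		 */
		priv->n_xdp_txq = 0;
		priv->xdp_mode = SKETCH_XDP_STACK_SHARED;
	}
}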

Of course, there is always the option of tweaking driver and hardware
parameters to, for example, increase the amount of available
resources. But if the user doesn't touch them, I think we should still
give them a good enough tradeoff. A user who doesn't use XDP won't
notice anything at all; one who uses it intensively and doesn't get
the desired performance will have to tweak the parameters.
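
(For example, assuming the hardware allows it, something like
"ethtool -L <iface> combined <N>" could be used to raise the channel
count, which changes how many vectors and queues end up available.)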

Jesper has a workshop tomorrow at netdevconf where this topic will be
discussed. Please let us know if you come up with any good new ideas.

Regards
-- 
Íñigo Huguet
