Message-ID: <CAKgT0UcXLi5fK3UiOpfPKu6FxJh1tH4r+_ZjCNsH=cEqHztOOg@mail.gmail.com>
Date:   Fri, 11 Sep 2020 11:41:58 -0700
From:   Alexander Duyck <alexander.duyck@...il.com>
To:     "Samudrala, Sridhar" <sridhar.samudrala@...el.com>
Cc:     Maciej Fijalkowski <maciej.fijalkowski@...el.com>,
        Magnus Karlsson <magnus.karlsson@...il.com>,
        Maciej Fijalkowski <maciejromanfijalkowski@...il.com>,
        Network Development <netdev@...r.kernel.org>,
        intel-wired-lan <intel-wired-lan@...ts.osuosl.org>,
        Björn Töpel <bjorn.topel@...el.com>,
        "Karlsson, Magnus" <magnus.karlsson@...el.com>
Subject: Re: [Intel-wired-lan] [PATCH net-next] i40e: allow VMDQs to be used
 with AF_XDP zero-copy

On Fri, Sep 11, 2020 at 11:05 AM Samudrala, Sridhar
<sridhar.samudrala@...el.com> wrote:
>
>
>
> On 9/11/2020 6:10 AM, Maciej Fijalkowski wrote:
> > On Fri, Sep 11, 2020 at 02:29:50PM +0200, Magnus Karlsson wrote:
> >> On Fri, Sep 11, 2020 at 2:11 PM Maciej Fijalkowski
> >> <maciej.fijalkowski@...el.com> wrote:
> >>>
> >>> On Fri, Sep 11, 2020 at 02:08:26PM +0200, Magnus Karlsson wrote:
> >>>> From: Magnus Karlsson <magnus.karlsson@...el.com>
> >>>>
> >>>> Allow VMDQs to be used with AF_XDP sockets in zero-copy mode. For some
> >>>> reason, we only allowed main VSIs to be used with zero-copy, but
> >>>> there is now reason to not allow VMDQs also.
> >>>
> >>> You meant 'to allow' I suppose. And what reason? :)
> >>
> >> Yes, sorry. Should be "not to allow". I was too trigger happy ;-).
> >>
> >> I have gotten requests from users that they want to use VMDQs in
> >> conjunction with containers. Basically small slices of the i40e
> >> portioned out as netdevs. Do you see any problems with using a VMDQ
> >> with zero-copy?
>
> Today VMDQ VSIs are used when a macvlan interface is created on top of an
> i40e PF with l2-fwd-offload on. But I don't think we can create an
> AF_XDP zerocopy socket on top of a macvlan netdev as it doesn't support
> ndo_bpf or ndo_xdp_xxx APIs or expose hw queues directly.
>
> We need to expose VMDQ VSI as a native netdev that can expose its own
> queues and support ndo_ ops in order to enable AF_XDP zerocopy on a
> VMDQ. We talked about this approach at the recent netdev conference to
> expose VMDQ VSI as a subdevice with its own netdev.
>
> https://netdevconf.info/0x14/session.html?talk-hardware-acceleration-of-container-networking-interfaces

I still hold the opinion that macvlan is the best way to go about
addressing most of these needs. The problem with doing isolation as
separate netdevs is that the device then has to deal with
broadcast/multicast replication and east/west traffic, and that
traffic essentially starts to swamp the PCIe bus on the device.
Leaving the replication and east/west traffic up to software while
allowing the unicast traffic to be directed is the best way to go in
my opinion.
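
To put rough numbers on the replication concern, here is a minimal
back-of-the-envelope sketch in Python. It is a simplified model, not a
measurement: it assumes the device DMAs one copy of every
broadcast/multicast frame across PCIe to each netdev when it does the
replication, versus a single copy when software does the fan-out. The
function names and the example figures are hypothetical.

```python
def pcie_bytes_device_replication(frame_bytes: int, frames: int, netdevs: int) -> int:
    """PCIe bytes when the device delivers one copy of each frame per netdev."""
    return frame_bytes * frames * netdevs


def pcie_bytes_software_replication(frame_bytes: int, frames: int) -> int:
    """PCIe bytes when a single copy crosses the bus and software fans it out."""
    return frame_bytes * frames


if __name__ == "__main__":
    # Hypothetical load: 10,000 broadcast frames of 128 bytes, 64 netdevs.
    hw = pcie_bytes_device_replication(128, 10_000, 64)
    sw = pcie_bytes_software_replication(128, 10_000)
    print(f"device replication:   {hw} bytes over PCIe")
    print(f"software replication: {sw} bytes over PCIe")
    print(f"ratio: {hw // sw}x")
```

In this model the bus cost of replicated traffic grows linearly with
the number of netdevs, which is the effect I am worried about.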

The problem with just spawning netdevs is that each vendor can do it
differently, so the functionality you get varies. If anything, we
would need to come up with a standardized interface to define what
features can be used and exposed. That was one of the motivations
behind using macvlan. So it seems like it might make more sense to
look at extending the macvlan interface to enable offloading
additional features to the lower level device.

With that said I am not certain VMDq is even the right kind of
interface to use for containers. I would be more interested in
something like what we did in fm10k for macvlan offload where we used
resource tags to identify traffic that belonged to a given interface
and just dedicated that to it rather than queues and interrupts. The
problem with dedicating queues and interrupts is that those are a
limited resource so scaling will become an issue when you get to any
decent count of containers.
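
As a rough illustration of that scaling limit, here is a minimal Python
sketch. It assumes each container gets a fixed number of dedicated
queue pairs and interrupt vectors; the function name and the resource
totals are made-up placeholders, not i40e figures.

```python
def max_containers(total_queues: int, total_vectors: int,
                   queues_per_container: int, vectors_per_container: int) -> int:
    """Containers that fit when each one needs dedicated queues and vectors."""
    if queues_per_container <= 0 or vectors_per_container <= 0:
        raise ValueError("per-container requirements must be positive")
    by_queues = total_queues // queues_per_container
    by_vectors = total_vectors // vectors_per_container
    # Whichever resource runs out first sets the ceiling.
    return min(by_queues, by_vectors)


if __name__ == "__main__":
    # Hypothetical device: 512 queue pairs, 128 interrupt vectors.
    print(max_containers(512, 128, queues_per_container=4, vectors_per_container=4))
```

Whichever of the two resources is scarcer caps the container count, no
matter how much of the other is left over.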

- Alex
