Message-ID: <87czx7r0w8.fsf@toke.dk>
Date:   Wed, 10 Feb 2021 23:52:39 +0100
From:   Toke Høiland-Jørgensen <toke@...hat.com>
To:     Jakub Kicinski <kuba@...nel.org>
Cc:     Marek Majtyka <alardam@...il.com>,
        Saeed Mahameed <saeed@...nel.org>,
        David Ahern <dsahern@...il.com>,
        Maciej Fijalkowski <maciej.fijalkowski@...el.com>,
        John Fastabend <john.fastabend@...il.com>,
        Jesper Dangaard Brouer <jbrouer@...hat.com>,
        Daniel Borkmann <daniel@...earbox.net>,
        Maciej Fijalkowski <maciejromanfijalkowski@...il.com>,
        Björn Töpel <bjorn.topel@...el.com>,
        Andrii Nakryiko <andrii.nakryiko@...il.com>,
        Jonathan Lemon <jonathan.lemon@...il.com>,
        Alexei Starovoitov <ast@...nel.org>,
        Network Development <netdev@...r.kernel.org>,
        "David S. Miller" <davem@...emloft.net>, hawk@...nel.org,
        bpf <bpf@...r.kernel.org>,
        intel-wired-lan <intel-wired-lan@...ts.osuosl.org>,
        "Karlsson, Magnus" <magnus.karlsson@...el.com>,
        jeffrey.t.kirsher@...el.com
Subject: Re: [PATCH v2 bpf 1/5] net: ethtool: add xdp properties flag set

Jakub Kicinski <kuba@...nel.org> writes:

> On Wed, 10 Feb 2021 11:53:53 +0100 Toke Høiland-Jørgensen wrote:
>> >> I am a bit confused now. Did you mean validation tests of those XDP
>> >> flags, which I am working on or some other validation tests?
>> >> What should these tests verify? Can you please elaborate on the
>> >> topic - just a few sentences on how you see it?
>> >
>> > Conformance tests can be written for all features, whether they have 
>> > an explicit capability in the uAPI or not. But for those that do IMO
>> > the tests should be required.
>> >
>> > Let me give you an example. This set adds a bit that says Intel NICs 
>> > can do XDP_TX and XDP_REDIRECT, yet we both know of the Tx queue
>> > shenanigans. So can i40e do XDP_REDIRECT or can it not?
>> >
>> > If we have exhaustive conformance tests we can confidently answer that
>> > question. And the answer may not be "yes" or "no", it may actually be
>> > "we need more options because many implementations fall in between".
>> >
>> > I think readable (IOW not written in some insane DSL) tests can also 
>> > be useful for users who want to check which features their program /
>> > deployment will require.  
>> 
>> While I do agree that that kind of conformance test would be great, I
>> don't think it has to hold up this series (the perfect being the enemy
>> of the good, and all that). We have a real problem today that userspace
>> can't tell if a given driver implements, say, XDP_REDIRECT, and so
>> people try to use it and spend days wondering which black hole their
>> packets disappear into. And for things like container migration we need
>> to be able to predict whether a given host supports a feature *before*
>> we start the migration and try to use it.
>
> Unless you have a strong definition of what XDP_REDIRECT means the flag
> itself is not worth much. We're not talking about normal ethtool feature
> flags which are primarily stack-driven, XDP is implemented mostly by
> the driver, each vendor can do their own thing. Maybe I've seen one
> vendor incompatibility too many at my day job to hope for the best...

I'm totally on board with documenting what a feature means. E.g., for
XDP_REDIRECT, whether it's acceptable to fail the redirect in some
situations even when it's active, or if there should always be a
slow-path fallback.

But I disagree that the flag is worthless without it. People are running
into real issues with trying to run XDP_REDIRECT programs on a driver
that doesn't support it at all, and it's incredibly confusing. The
latest example popped up literally yesterday:

https://lore.kernel.org/xdp-newbies/CAM-scZPPeu44FeCPGO=Qz=03CrhhfB1GdJ8FNEpPqP_G27c6mQ@mail.gmail.com/

>> I view the feature flags as a list of features *implemented* by the
>> driver. Which should be pretty static in a given kernel, but may be
>> different than the features currently *enabled* on a given system (due
>> to, e.g., the TX queue stuff).
>
> Hm, maybe I'm not being clear enough. The way XDP_REDIRECT (your
> example) is implemented across drivers differs in meaningful ways.
> Hence the need for conformance testing. We don't have a golden SW
> standard to fall back on, like we do with HW offloads.

I'm not disagreeing that we need to harmonise what "implementing a
feature" means. Maybe I'm just not sure what you mean by "conformance
testing"? What would that look like, specifically? A script in selftest
that sets up a redirect between two interfaces that we tell people to
run? Or what? How would you catch, say, that issue where if a machine
has more CPUs than the NIC has TXQs things start falling apart?
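
To make that last question concrete, here is a rough sketch of the kind
of check such a script might start with. It is only an illustration: the
helper names are made up, it assumes the tx-N directories under
/sys/class/net/<ifname>/queues/ are a usable proxy for the queues a
driver has, and it assumes that "one TX queue per CPU" is the condition
worth testing. The queues a driver actually reserves for XDP may not
show up there at all, so this is not a real conformance test.

```python
import glob
import os
import sys


def count_tx_queues(ifname, sysfs_root="/sys/class/net"):
    """Count the tx-N queue directories sysfs exposes for an interface.

    Returns 0 if the interface (or the sysfs tree) does not exist.
    """
    pattern = os.path.join(sysfs_root, ifname, "queues", "tx-*")
    return len(glob.glob(pattern))


def txq_shortfall(n_cpus, n_txqs):
    """Return how many CPUs would be left without their own TX queue.

    Assumption for illustration only: each CPU wants a dedicated queue.
    """
    return max(0, n_cpus - n_txqs)


if __name__ == "__main__":
    ifname = sys.argv[1] if len(sys.argv) > 1 else "lo"
    cpus = os.cpu_count() or 1
    txqs = count_tx_queues(ifname)
    missing = txq_shortfall(cpus, txqs)
    print(f"{ifname}: {cpus} CPUs, {txqs} TX queues, shortfall {missing}")
```

A check like this only reports a mismatch in counts; it says nothing
about what a given driver does when the mismatch exists, which is the
part that would need an actual redirect test between two interfaces.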

> Also IDK why those tests are considered such a huge ask. As I said most
> vendors probably already have them, and so I'd guess do good distros.
> So let's work together.

I guess what I'm afraid of is that this will end up delaying or stalling
a fix for a long-standing issue (which is how I see this series, as the
example above shows). Maybe you can alleviate that by expanding a bit on
what you mean?

-Toke
