netdev - Re: [xdp-hints] Re: [PATCH RFCv2 bpf-next 04/18] net: create xdp_hints

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <74a68399-35b2-c0f2-92cb-236a0773837e@redhat.com>
Date:   Fri, 9 Sep 2022 16:13:53 +0200
From:   Jesper Dangaard Brouer <jbrouer@...hat.com>
To:     "Burakov, Anatoly" <anatoly.burakov@...el.com>, bpf@...r.kernel.org
Cc:     brouer@...hat.com, netdev@...r.kernel.org,
        xdp-hints@...-project.net, larysa.zaremba@...el.com,
        memxor@...il.com, Lorenzo Bianconi <lorenzo@...nel.org>,
        mtahhan@...hat.com,
        Alexei Starovoitov <alexei.starovoitov@...il.com>,
        Daniel Borkmann <borkmann@...earbox.net>,
        Andrii Nakryiko <andrii.nakryiko@...il.com>,
        dave@...cker.co.uk, Magnus Karlsson <magnus.karlsson@...el.com>,
        bjorn@...nel.org, Alexander Lobakin <alexandr.lobakin@...el.com>
Subject: Re: [xdp-hints] Re: [PATCH RFCv2 bpf-next 04/18] net: create
 xdp_hints_common and set functions



On 09/09/2022 12.49, Burakov, Anatoly wrote:
> On 07-Sep-22 4:45 PM, Jesper Dangaard Brouer wrote:
>> XDP-hints via BTF are about giving drivers the ability to extend the
>> common set of hardware offload hints in a flexible way.
>>
>> This patch start out with defining the common set, based on what is
>> used available in the SKB. Having this as a common struct in core
>> vmlinux makes it easier to implement xdp_frame to SKB conversion
>> routines as normal C-code, see later patches.
>>
>> Drivers can redefine the layout of the entire metadata area, but are
>> encouraged to use this common struct as the base, on which they can
>> extend on top for their extra hardware offload hints. When doing so,
>> drivers can mark the xdp_buff (and xdp_frame) with flags indicating
>> this it compatible with the common struct.
>>
>> Patch also provides XDP-hints driver helper functions for updating the
>> common struct. Helpers gets inlined and are defined for maximum
>> performance, which does require some extra care in drivers, e.g. to
>> keep track of flags to reduce data dependencies, see code DOC.
>>
>> Userspace and BPF-prog's MUST not consider the common struct UAPI.
>> The common struct (and enum flags) are only exposed via BTF, which
>> implies consumers must read and decode this BTF before using/consuming
>> data layout.
>>
>> Signed-off-by: Jesper Dangaard Brouer <brouer@...hat.com>
>> ---
>>   include/net/xdp.h |  147 
>> +++++++++++++++++++++++++++++++++++++++++++++++++++++
>>   net/core/xdp.c    |    5 ++
>>   2 files changed, 152 insertions(+)
>>
>> diff --git a/include/net/xdp.h b/include/net/xdp.h
>> index 04c852c7a77f..ea5836ccee82 100644
>> --- a/include/net/xdp.h
>> +++ b/include/net/xdp.h
>> @@ -8,6 +8,151 @@
>>   #include <linux/skbuff.h> /* skb_shared_info */
>> +/**
>> + * struct xdp_hints_common - Common XDP-hints offloads shared with 
>> netstack
>> + * @btf_full_id: The modules BTF object + type ID for specific struct
>> + * @vlan_tci: Hardware provided VLAN tag + proto type in 
>> @xdp_hints_flags
>> + * @rx_hash32: Hardware provided RSS hash value
>> + * @xdp_hints_flags: see &enum xdp_hints_flags
>> + *
>> + * This structure contains the most commonly used hardware offloads 
>> hints
>> + * provided by NIC drivers and supported by the SKB.
>> + *
>> + * Driver are expected to extend this structure by include &struct
>> + * xdp_hints_common as part of the drivers own specific xdp_hints 
>> struct's, but
>> + * at the end-of their struct given XDP metadata area grows backwards.
>> + *
>> + * The member @btf_full_id is populated by driver modules to uniquely 
>> identify
>> + * the BTF struct.  The high 32-bits store the modules BTF object ID 
>> and the
>> + * lower 32-bit the BTF type ID within that BTF object.
>> + */
>> +struct xdp_hints_common {
>> +    union {
>> +        __wsum        csum;
>> +        struct {
>> +            __u16    csum_start;
>> +            __u16    csum_offset;
>> +        };
>> +    };
>> +    u16 rx_queue;
>> +    u16 vlan_tci;
>> +    u32 rx_hash32;
>> +    u32 xdp_hints_flags;
>> +    u64 btf_full_id; /* BTF object + type ID */
>> +} __attribute__((aligned(4))) __attribute__((packed));
> 
> I'm assuming any Tx metadata will have to go before the Rx checksum union?
> 

Nope.  The plan is that the TX metadata can reuse the same metadata area
with its own layout.  I imagine a new xdp_buff->flags bit that tell us
the layout is now TX-layout with xdp_hints_common_tx.

We could rename xdp_hints_common to xdp_hints_common_rx to anticipate
and prepare for this. But that would be getting a head of ourselves,
because someone in the community might have a smarter solution, e.g.
that could combine common RX and TX in a single struct. e.g. overlapping
csum and vlan_tci might make sense.

>> +
>> +
>> +/**
>> + * enum xdp_hints_flags - flags used by &struct xdp_hints_common
>> + *
>> + * The &enum xdp_hints_flags have reserved the first 16 bits for 
>> common flags
>> + * and drivers can introduce use their own flags bits from BIT(16). For
>> + * BPF-progs to find these flags (via BTF) drivers should define an enum
>> + * xdp_hints_flags_driver.
>> + */
>> +enum xdp_hints_flags {
>> +    HINT_FLAG_CSUM_TYPE_BIT0  = BIT(0),
>> +    HINT_FLAG_CSUM_TYPE_BIT1  = BIT(1),
>> +    HINT_FLAG_CSUM_TYPE_MASK  = 0x3,
>> +
>> +    HINT_FLAG_CSUM_LEVEL_BIT0 = BIT(2),
>> +    HINT_FLAG_CSUM_LEVEL_BIT1 = BIT(3),
>> +    HINT_FLAG_CSUM_LEVEL_MASK = 0xC,
>> +    HINT_FLAG_CSUM_LEVEL_SHIFT = 2,
>> +
>> +    HINT_FLAG_RX_HASH_TYPE_BIT0 = BIT(4),
>> +    HINT_FLAG_RX_HASH_TYPE_BIT1 = BIT(5),
>> +    HINT_FLAG_RX_HASH_TYPE_MASK = 0x30,
>> +    HINT_FLAG_RX_HASH_TYPE_SHIFT = 0x4,
>> +
>> +    HINT_FLAG_RX_QUEUE = BIT(7),
>> +
>> +    HINT_FLAG_VLAN_PRESENT            = BIT(8),
>> +    HINT_FLAG_VLAN_PROTO_ETH_P_8021Q  = BIT(9),
>> +    HINT_FLAG_VLAN_PROTO_ETH_P_8021AD = BIT(10),
>> +    /* Flags from BIT(16) can be used by drivers */
> 
> If we assumed we also have Tx section, would 16 bits be enough? For a 
> basic implementation of UDP checksumming, AF_XDP would need 3x16 more 
> bits (to store L2/L3/L4 offsets) plus probably a flag field indicating 
> presence of each. Is there any way to expand common fields in the future 
> (or is it at all intended to be expandable)?
> 

As above we could have separate flags for TX side, e.g.
xdp_hints_flags_tx.  But some of the flags might still be valid for
TX-side, so they could potentially share some.

BUT it is also important to realize that I'm saying this is not UAPI
flags being exposed (like in include/uapi/bpf.h).  The runtime value of
these enum defined flags MUST be obtained via BTF (through help of
libbpf CO-RE or in userspace by parsing BTF).

Thus, in principle the kernel is free to change these structs and enums.
In practice it will be very annoying for BPF-progs and AF_XDP userspace
code if we change the names of the struct's and somewhat annoying if
members change name.  CO-RE can deal with kernel changes and feature
detection[1] down to the avail enums e.g. via using
bpf_core_enum_value_exists().  But we should avoid too many changes as
the code becomes harder to read.

--Jesper

[1] 
https://nakryiko.com/posts/bpf-core-reference-guide/#bpf-core-enum-value-exists