Message-ID: <74a68399-35b2-c0f2-92cb-236a0773837e@redhat.com>
Date: Fri, 9 Sep 2022 16:13:53 +0200
From: Jesper Dangaard Brouer <jbrouer@...hat.com>
To: "Burakov, Anatoly" <anatoly.burakov@...el.com>, bpf@...r.kernel.org
Cc: brouer@...hat.com, netdev@...r.kernel.org,
xdp-hints@...-project.net, larysa.zaremba@...el.com,
memxor@...il.com, Lorenzo Bianconi <lorenzo@...nel.org>,
mtahhan@...hat.com,
Alexei Starovoitov <alexei.starovoitov@...il.com>,
Daniel Borkmann <borkmann@...earbox.net>,
Andrii Nakryiko <andrii.nakryiko@...il.com>,
dave@...cker.co.uk, Magnus Karlsson <magnus.karlsson@...el.com>,
bjorn@...nel.org, Alexander Lobakin <alexandr.lobakin@...el.com>
Subject: Re: [xdp-hints] Re: [PATCH RFCv2 bpf-next 04/18] net: create
xdp_hints_common and set functions
On 09/09/2022 12.49, Burakov, Anatoly wrote:
> On 07-Sep-22 4:45 PM, Jesper Dangaard Brouer wrote:
>> XDP-hints via BTF are about giving drivers the ability to extend the
>> common set of hardware offload hints in a flexible way.
>>
>> This patch starts out by defining the common set, based on what is
>> available in the SKB. Having this as a common struct in core
>> vmlinux makes it easier to implement xdp_frame to SKB conversion
>> routines as normal C-code, see later patches.
>>
>> Drivers can redefine the layout of the entire metadata area, but are
>> encouraged to use this common struct as the base, which they can
>> extend with their extra hardware offload hints. When doing so,
>> drivers can mark the xdp_buff (and xdp_frame) with flags indicating
>> that it is compatible with the common struct.
>>
>> Patch also provides XDP-hints driver helper functions for updating the
>> common struct. Helpers get inlined and are defined for maximum
>> performance, which does require some extra care in drivers, e.g. to
>> keep track of flags to reduce data dependencies; see the code DOC.
>>
>> Userspace and BPF-progs MUST NOT consider the common struct UAPI.
>> The common struct (and enum flags) are only exposed via BTF, which
>> implies consumers must read and decode this BTF before using/consuming
>> data layout.
>>
>> Signed-off-by: Jesper Dangaard Brouer <brouer@...hat.com>
>> ---
>> include/net/xdp.h | 147
>> +++++++++++++++++++++++++++++++++++++++++++++++++++++
>> net/core/xdp.c | 5 ++
>> 2 files changed, 152 insertions(+)
>>
>> diff --git a/include/net/xdp.h b/include/net/xdp.h
>> index 04c852c7a77f..ea5836ccee82 100644
>> --- a/include/net/xdp.h
>> +++ b/include/net/xdp.h
>> @@ -8,6 +8,151 @@
>> #include <linux/skbuff.h> /* skb_shared_info */
>> +/**
>> + * struct xdp_hints_common - Common XDP-hints offloads shared with
>> netstack
>> + * @btf_full_id: The modules BTF object + type ID for specific struct
>> + * @vlan_tci: Hardware provided VLAN tag + proto type in
>> @xdp_hints_flags
>> + * @rx_hash32: Hardware provided RSS hash value
>> + * @xdp_hints_flags: see &enum xdp_hints_flags
>> + *
>> + * This structure contains the most commonly used hardware offloads
>> hints
>> + * provided by NIC drivers and supported by the SKB.
>> + *
>> + * Drivers are expected to extend this structure by including &struct
>> + * xdp_hints_common as part of their own driver-specific xdp_hints
>> + * structs, placed at the end of the struct because the XDP metadata
>> + * area grows backwards.
>> + *
>> + * The member @btf_full_id is populated by driver modules to uniquely
>> + * identify the BTF struct. The high 32 bits store the module's BTF
>> + * object ID and the lower 32 bits the BTF type ID within that object.
>> + */
>> +struct xdp_hints_common {
>> + union {
>> + __wsum csum;
>> + struct {
>> + __u16 csum_start;
>> + __u16 csum_offset;
>> + };
>> + };
>> + u16 rx_queue;
>> + u16 vlan_tci;
>> + u32 rx_hash32;
>> + u32 xdp_hints_flags;
>> + u64 btf_full_id; /* BTF object + type ID */
>> +} __attribute__((aligned(4))) __attribute__((packed));
>
> I'm assuming any Tx metadata will have to go before the Rx checksum union?
>
Nope. The plan is that the TX metadata can reuse the same metadata area
with its own layout. I imagine a new xdp_buff->flags bit that tells us
the layout is now the TX layout, with xdp_hints_common_tx.

We could rename xdp_hints_common to xdp_hints_common_rx to anticipate
and prepare for this. But that would be getting ahead of ourselves,
because someone in the community might have a smarter solution, e.g.
one that combines common RX and TX in a single struct. For example,
overlapping csum and vlan_tci might make sense.
>> +
>> +
>> +/**
>> + * enum xdp_hints_flags - flags used by &struct xdp_hints_common
>> + *
>> + * The &enum xdp_hints_flags reserves the first 16 bits for common
>> + * flags; drivers can introduce their own flag bits from BIT(16). For
>> + * BPF-progs to find these flags (via BTF), drivers should define an
>> + * enum xdp_hints_flags_driver.
>> + */
>> +enum xdp_hints_flags {
>> + HINT_FLAG_CSUM_TYPE_BIT0 = BIT(0),
>> + HINT_FLAG_CSUM_TYPE_BIT1 = BIT(1),
>> + HINT_FLAG_CSUM_TYPE_MASK = 0x3,
>> +
>> + HINT_FLAG_CSUM_LEVEL_BIT0 = BIT(2),
>> + HINT_FLAG_CSUM_LEVEL_BIT1 = BIT(3),
>> + HINT_FLAG_CSUM_LEVEL_MASK = 0xC,
>> + HINT_FLAG_CSUM_LEVEL_SHIFT = 2,
>> +
>> + HINT_FLAG_RX_HASH_TYPE_BIT0 = BIT(4),
>> + HINT_FLAG_RX_HASH_TYPE_BIT1 = BIT(5),
>> + HINT_FLAG_RX_HASH_TYPE_MASK = 0x30,
>> + HINT_FLAG_RX_HASH_TYPE_SHIFT = 0x4,
>> +
>> + HINT_FLAG_RX_QUEUE = BIT(7),
>> +
>> + HINT_FLAG_VLAN_PRESENT = BIT(8),
>> + HINT_FLAG_VLAN_PROTO_ETH_P_8021Q = BIT(9),
>> + HINT_FLAG_VLAN_PROTO_ETH_P_8021AD = BIT(10),
>> + /* Flags from BIT(16) can be used by drivers */
>
> If we assumed we also have Tx section, would 16 bits be enough? For a
> basic implementation of UDP checksumming, AF_XDP would need 3x16 more
> bits (to store L2/L3/L4 offsets) plus probably a flag field indicating
> presence of each. Is there any way to expand common fields in the future
> (or is it at all intended to be expandable)?
>
As above, we could have separate flags for the TX side, e.g.
xdp_hints_flags_tx. But some of the flags might still be valid for the
TX side, so the two could potentially share some.

BUT it is also important to realize that I'm saying these are not UAPI
flags being exposed (like in include/uapi/linux/bpf.h). The runtime
value of these enum-defined flags MUST be obtained via BTF (with the
help of libbpf CO-RE, or in userspace by parsing BTF).

Thus, in principle the kernel is free to change these structs and enums.
In practice it will be very annoying for BPF-progs and AF_XDP userspace
code if we change the names of the structs, and somewhat annoying if
members change name. CO-RE can deal with kernel changes and feature
detection[1] down to the available enum values, e.g. by using
bpf_core_enum_value_exists(). But we should avoid too many changes, as
the code becomes harder to read.
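
A small user-space sketch of what "not UAPI" means for a consumer: the
flag masks and shifts live in a table that is filled in at load time,
instead of being compiled in as constants. struct hints_flag_values and
the helper names are hypothetical, and the step that fills the table
from BTF is deliberately left out; in a BPF-prog, CO-RE would do that
job.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Flag values as resolved at load time. A real consumer would fill this
 * from BTF; the struct and its field names are assumptions of this sketch.
 */
struct hints_flag_values {
	bool     has_vlan_present;  /* was HINT_FLAG_VLAN_PRESENT found in BTF? */
	uint32_t vlan_present;
	uint32_t csum_level_mask;
	uint32_t csum_level_shift;
	uint32_t rx_hash_type_mask;
	uint32_t rx_hash_type_shift;
};

/* Extract a multi-bit field from the flags word using a mask and shift. */
static inline uint32_t hints_field(uint32_t flags, uint32_t mask,
				   uint32_t shift)
{
	assert(shift < 32); /* a larger shift would be undefined behaviour */
	return (flags & mask) >> shift;
}

static inline uint32_t hints_csum_level(const struct hints_flag_values *v,
					uint32_t flags)
{
	return hints_field(flags, v->csum_level_mask, v->csum_level_shift);
}

static inline uint32_t hints_rx_hash_type(const struct hints_flag_values *v,
					  uint32_t flags)
{
	return hints_field(flags, v->rx_hash_type_mask, v->rx_hash_type_shift);
}

/* A flag that the running kernel does not define is reported as not set. */
static inline bool hints_vlan_present(const struct hints_flag_values *v,
				      uint32_t flags)
{
	return v->has_vlan_present && (flags & v->vlan_present) != 0;
}
```

With the values currently in this patch, the table would hold
csum_level_mask = 0xC, csum_level_shift = 2, rx_hash_type_mask = 0x30
and rx_hash_type_shift = 4. If the kernel later moves the bits, only
the table contents change, not the consumer code.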

--Jesper

[1] https://nakryiko.com/posts/bpf-core-reference-guide/#bpf-core-enum-value-exists