lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Message-ID: <20230914210452.2588884-10-sdf@google.com> Date: Thu, 14 Sep 2023 14:04:52 -0700 From: Stanislav Fomichev <sdf@...gle.com> To: bpf@...r.kernel.org Cc: ast@...nel.org, daniel@...earbox.net, andrii@...nel.org, martin.lau@...ux.dev, song@...nel.org, yhs@...com, john.fastabend@...il.com, kpsingh@...nel.org, sdf@...gle.com, haoluo@...gle.com, jolsa@...nel.org, kuba@...nel.org, toke@...nel.org, willemb@...gle.com, dsahern@...nel.org, magnus.karlsson@...el.com, bjorn@...nel.org, maciej.fijalkowski@...el.com, hawk@...nel.org, yoong.siang.song@...el.com, netdev@...r.kernel.org, xdp-hints@...-project.net Subject: [PATCH bpf-next v2 9/9] xsk: document tx_metadata_len layout - how to use - how to query features - pointers to the examples Signed-off-by: Stanislav Fomichev <sdf@...gle.com> --- Documentation/networking/index.rst | 1 + Documentation/networking/xsk-tx-metadata.rst | 77 ++++++++++++++++++++ 2 files changed, 78 insertions(+) create mode 100644 Documentation/networking/xsk-tx-metadata.rst diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst index 5b75c3f7a137..9b2accb48df7 100644 --- a/Documentation/networking/index.rst +++ b/Documentation/networking/index.rst @@ -123,6 +123,7 @@ Refer to :ref:`netdev-FAQ` for a guide on netdev development process specifics. xfrm_sync xfrm_sysctl xdp-rx-metadata + xsk-tx-metadata .. only:: subproject and html diff --git a/Documentation/networking/xsk-tx-metadata.rst b/Documentation/networking/xsk-tx-metadata.rst new file mode 100644 index 000000000000..b7289f06745c --- /dev/null +++ b/Documentation/networking/xsk-tx-metadata.rst @@ -0,0 +1,77 @@ +================== +AF_XDP TX Metadata +================== + +This document describes how to enable offloads when transmitting packets +via :doc:`af_xdp`. Refer to :doc:`xdp-rx-metadata` on how to access similar +metadata on the receive side. + +General Design +============== + +The headroom for the metadata is reserved via ``tx_metadata_len`` in +``struct xdp_umem_reg``. The metadata length is therefore the same for +every socket that shares the same umem. The metadata layout is a fixed UAPI, +refer to ``union xsk_tx_metadata`` in ``include/uapi/linux/if_xdp.h``. +Thus, generally, the ``tx_metadata_len`` field above should contain +``sizeof(union xsk_tx_metadata)``. + +The headroom and the metadata itself should be located right before +``xdp_desc->addr`` in the umem frame. Within a frame, the metadata +layout is as follows:: + + tx_metadata_len + / \ + +-----------------+---------+----------------------------+ + | xsk_tx_metadata | padding | payload | + +-----------------+---------+----------------------------+ + ^ + | + xdp_desc->addr + +An AF_XDP application can request headrooms larger than ``sizeof(struct +xsk_tx_metadata)``. The kernel will ignore the padding (and will still +use ``xdp_desc->addr - tx_metadata_len`` to locate +the ``xsk_tx_metadata``). For the frames that shouldn't carry +any metadata (i.e., the ones that don't have ``XDP_TX_METADATA`` option), +the metadata area is ignored by the kernel as well. + +The flags field enables the particular offload: + +- ``XDP_TX_METADATA_TIMESTAMP``: requests the device to put transmission + timestamp into ``tx_timestamp`` field of ``union xsk_tx_metadata``. +- ``XDP_TX_METADATA_CHECKSUM``: requests the device to calculate L4 + checksum. ``csum_start`` specifies byte offset of there the checksumming + should start and ``csum_offset`` specifies byte offset where the + device should store the computed checksum. +- ``XDP_TX_METADATA_CHECKSUM_SW``: requests checksum calculation to + be done in software; this mode works only in ``XSK_COPY`` mode and + is mostly intended for testing. Do not enable this option, it + will negatively affect performance. + +Besides the flags above, in order to trigger the offloads, the first +packet's ``struct xdp_desc`` descriptor should set ``XDP_TX_METADATA`` +bit in the ``options`` field. Also not that in a multi-buffer packet +only the first chunk should carry the metadata. + +Querying Device Capabilities +============================ + +Every devices exports its offloads capabilities via netlink netdev family. +Refer to ``xsk-flags`` features bitmask in +``Documentation/netlink/specs/netdev.yaml``. + +- ``tx-timestamp``: device supports ``XDP_TX_METADATA_TIMESTAMP`` +- ``tx-checksum``: device supports ``XDP_TX_METADATA_CHECKSUM`` + +Note that every devices supports ``XDP_TX_METADATA_CHECKSUM_SW`` when +running in ``XSK_COPY`` mode. + +See ``tools/net/ynl/samples/netdev.c`` on how to query this information. + +Example +======= + +See ``tools/testing/selftests/bpf/xdp_hw_metadata.c`` for an example +program that handles TX metadata. Also see https://github.com/fomichev/xskgen +for a more bare-bones example. -- 2.42.0.459.ge4e396fd5e-goog
Powered by blists - more mailing lists