[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <IA3PR11MB9254057C82D0CEAFDF58BA73D856A@IA3PR11MB9254.namprd11.prod.outlook.com>
Date: Wed, 16 Jul 2025 02:04:23 +0000
From: "Song, Yoong Siang" <yoong.siang.song@...el.com>
To: Stanislav Fomichev <stfomichev@...il.com>
CC: "David S . Miller" <davem@...emloft.net>, Eric Dumazet
<edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>, Paolo Abeni
<pabeni@...hat.com>, Simon Horman <horms@...nel.org>, Jonathan Corbet
<corbet@....net>, Alexei Starovoitov <ast@...nel.org>, Daniel Borkmann
<daniel@...earbox.net>, Jesper Dangaard Brouer <hawk@...nel.org>, "John
Fastabend" <john.fastabend@...il.com>, Stanislav Fomichev <sdf@...ichev.me>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"bpf@...r.kernel.org" <bpf@...r.kernel.org>
Subject: RE: [PATCH bpf-next,v4 1/1] doc: clarify XDP Rx metadata handling and
driver requirements
On Wednesday, July 16, 2025 4:49 AM, Stanislav Fomichev <stfomichev@...il.com> wrote:
>On 07/15, Song Yoong Siang wrote:
>> Improves the documentation for XDP Rx metadata handling, especially for
>> AF_XDP use cases. It clarifies that drivers must remove any device-reserved
>> metadata from the data_meta area before passing the frame to the XDP
>> program.
>>
>> Besides, expand the explanation of how userspace and BPF programs should
>> coordinate the use of METADATA_SIZE, and adds a detailed diagram to
>> illustrate pointer adjustments and metadata layout.
>>
>> Additional, describe the requirements and constraints enforced by
>> bpf_xdp_adjust_meta().
>>
>> Signed-off-by: Song Yoong Siang <yoong.siang.song@...el.com>
>> ---
>>
>> V4:
>> - update the documentation to indicate that drivers are expected to copy
>> any device-reserved metadata from the metadata area (Jakub)
>> - remove selftest tool changes.
>>
>> V3: https://lore.kernel.org/netdev/20250702165757.3278625-1-
>yoong.siang.song@...el.com/
>> - update doc and commit msg accordingly.
>>
>> V2: https://lore.kernel.org/netdev/20250702030349.3275368-1-
>yoong.siang.song@...el.com/
>> - unconditionally do bpf_xdp_adjust_meta with -XDP_METADATA_SIZE (Stanislav)
>>
>> V1: https://lore.kernel.org/netdev/20250701042940.3272325-1-
>yoong.siang.song@...el.com/
>> ---
>> Documentation/networking/xdp-rx-metadata.rst | 47 ++++++++++++++------
>> 1 file changed, 34 insertions(+), 13 deletions(-)
>>
>> diff --git a/Documentation/networking/xdp-rx-metadata.rst
>b/Documentation/networking/xdp-rx-metadata.rst
>> index a6e0ece18be5..2e067eb6c5d6 100644
>> --- a/Documentation/networking/xdp-rx-metadata.rst
>> +++ b/Documentation/networking/xdp-rx-metadata.rst
>> @@ -49,7 +49,10 @@ as follows::
>> | |
>> xdp_buff->data_meta xdp_buff->data
>>
>> -An XDP program can store individual metadata items into this ``data_meta``
>> +Certain devices may utilize the ``data_meta`` area for specific purposes.
>> +Drivers for these devices must move any hardware-related metadata out from
>the
>> +``data_meta`` area before presenting the frame to the XDP program. This ensures
>> +that the XDP program can store individual metadata items into this ``data_meta``
>> area in whichever format it chooses. Later consumers of the metadata
>> will have to agree on the format by some out of band contract (like for
>> the AF_XDP use case, see below).
>> @@ -63,18 +66,36 @@ the final consumer. Thus the BPF program manually
>allocates a fixed number of
>> bytes out of metadata via ``bpf_xdp_adjust_meta`` and calls a subset
>> of kfuncs to populate it. The userspace ``XSK`` consumer computes
>> ``xsk_umem__get_data() - METADATA_SIZE`` to locate that metadata.
>> -Note, ``xsk_umem__get_data`` is defined in ``libxdp`` and
>> -``METADATA_SIZE`` is an application-specific constant (``AF_XDP`` receive
>> -descriptor does _not_ explicitly carry the size of the metadata).
>> -
>> -Here is the ``AF_XDP`` consumer layout (note missing ``data_meta`` pointer)::
>> -
>> - +----------+-----------------+------+
>> - | headroom | custom metadata | data |
>> - +----------+-----------------+------+
>> - ^
>> - |
>> - rx_desc->address
>> +Note, ``xsk_umem__get_data`` is defined in ``libxdp`` and ``METADATA_SIZE`` is
>> +an application-specific constant. Since the ``AF_XDP`` receive descriptor does
>> +_not_ explicitly carry the size of the metadata, it is the responsibility of the
>> +driver to copy any device-reserved metadata out from the metadata area and
>> +ensure that ``xdp_buff->data_meta`` is set equal to ``xdp_buff->data`` before a
>> +BPF program is executed. This is necessary so that, after the BPF program
>> +adjusts the metadata area, the consumer can reliably retrieve the metadata
>> +address using ``METADATA_SIZE`` offset.
>> +
>> +The following diagram shows how custom metadata is positioned relative to the
>> +packet data and how pointers are adjusted for metadata access (note the
>absence
>> +of the ``data_meta`` pointer in ``xdp_desc``)::
>> +
>> + |<-- bpf_xdp_adjust_meta(xdp_buff, -METADATA_SIZE) --|
>> + new xdp_buff->data_meta old xdp_buff->data_meta
>> + | |
>> + | xdp_buff->data
>> + | |
>> + +----------+----------------------------------------------------+------+
>> + | headroom | custom metadata | data |
>> + +----------+----------------------------------------------------+------+
>> + | |
>> + | xdp_desc->addr
>> + |<------ xsk_umem__get_data() - METADATA_SIZE -------|
>> +
>> +``bpf_xdp_adjust_meta`` ensures that ``METADATA_SIZE`` is aligned to 4 bytes,
>> +does not exceed 252 bytes, and leaves sufficient space for building the
>> +xdp_frame. If these conditions are not met, it returns a negative error. In this
>> +case, the BPF program should not proceed to populate data into the
>``data_meta``
>> +area.
>>
>> XDP_PASS
>> ========
>
>Can we move these details into a new section? Call it 'Driver implementation'
>or something similar and explain all the above. Because the original
>purpose of the doc was to explain the API to the user applications.
>Since we are hiding these details from the users, explaining them
>separately seems more clear.
Sure. I will move all to new section and name the section 'Driver implementation'
Powered by blists - more mailing lists