Message-ID: <PH0PR11MB5830D33B679A0ACD3FD6E23CD8132@PH0PR11MB5830.namprd11.prod.outlook.com>
Date: Thu, 9 Jan 2025 07:19:26 +0000
From: "Song, Yoong Siang" <yoong.siang.song@...el.com>
To: Stanislav Fomichev <stfomichev@...il.com>
CC: "David S . Miller" <davem@...emloft.net>, Eric Dumazet
	<edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>, Paolo Abeni
	<pabeni@...hat.com>, Simon Horman <horms@...nel.org>, Willem de Bruijn
	<willemb@...gle.com>, "Bezdeka, Florian" <florian.bezdeka@...mens.com>,
	Donald Hunter <donald.hunter@...il.com>, Jonathan Corbet <corbet@....net>,
	Bjorn Topel <bjorn@...nel.org>, "Karlsson, Magnus"
	<magnus.karlsson@...el.com>, "Fijalkowski, Maciej"
	<maciej.fijalkowski@...el.com>, Jonathan Lemon <jonathan.lemon@...il.com>,
	Andrew Lunn <andrew+netdev@...n.ch>, Alexei Starovoitov <ast@...nel.org>,
	Daniel Borkmann <daniel@...earbox.net>, Jesper Dangaard Brouer
	<hawk@...nel.org>, John Fastabend <john.fastabend@...il.com>, "Damato, Joe"
	<jdamato@...tly.com>, Stanislav Fomichev <sdf@...ichev.me>, Xuan Zhuo
	<xuanzhuo@...ux.alibaba.com>, Mina Almasry <almasrymina@...gle.com>, "Daniel
 Jurgens" <danielj@...dia.com>, Amritha Nambiar <amritha.nambiar@...el.com>,
	Andrii Nakryiko <andrii@...nel.org>, Eduard Zingerman <eddyz87@...il.com>,
	Mykola Lysenko <mykolal@...com>, Martin KaFai Lau <martin.lau@...ux.dev>,
	Song Liu <song@...nel.org>, Yonghong Song <yonghong.song@...ux.dev>, KP Singh
	<kpsingh@...nel.org>, Hao Luo <haoluo@...gle.com>, Jiri Olsa
	<jolsa@...nel.org>, Shuah Khan <shuah@...nel.org>, Alexandre Torgue
	<alexandre.torgue@...s.st.com>, Jose Abreu <joabreu@...opsys.com>, "Maxime
 Coquelin" <mcoquelin.stm32@...il.com>, "Nguyen, Anthony L"
	<anthony.l.nguyen@...el.com>, "Kitszel, Przemyslaw"
	<przemyslaw.kitszel@...el.com>, "netdev@...r.kernel.org"
	<netdev@...r.kernel.org>, "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>, "linux-doc@...r.kernel.org"
	<linux-doc@...r.kernel.org>, "bpf@...r.kernel.org" <bpf@...r.kernel.org>,
	"linux-kselftest@...r.kernel.org" <linux-kselftest@...r.kernel.org>,
	"linux-stm32@...md-mailman.stormreply.com"
	<linux-stm32@...md-mailman.stormreply.com>,
	"linux-arm-kernel@...ts.infradead.org"
	<linux-arm-kernel@...ts.infradead.org>, "intel-wired-lan@...ts.osuosl.org"
	<intel-wired-lan@...ts.osuosl.org>, "xdp-hints@...-project.net"
	<xdp-hints@...-project.net>
Subject: RE: [PATCH bpf-next v4 1/4] xsk: Add launch time hardware offload
 support to XDP Tx metadata

On Wednesday, January 8, 2025 12:50 AM, Stanislav Fomichev <stfomichev@...il.com> wrote:
>On 01/06, Song Yoong Siang wrote:
>> Extend the XDP Tx metadata framework so that users can request launch time
>> hardware offload, where the Ethernet device will schedule the packet for
>> transmission at a pre-determined time called launch time. The value of
>> launch time is communicated from user space to Ethernet driver via
>> launch_time field of struct xsk_tx_metadata.
>>
>> Suggested-by: Stanislav Fomichev <sdf@...gle.com>

Hi Stanislav Fomichev,

Thanks for your review comments.
I noticed that you have two email addresses:
sdf@...gle.com and stfomichev@...il.com

Which one should I use in the Suggested-by tag?

>> Signed-off-by: Song Yoong Siang <yoong.siang.song@...el.com>
>> ---
>>  Documentation/netlink/specs/netdev.yaml      |  4 ++
>>  Documentation/networking/xsk-tx-metadata.rst | 64 ++++++++++++++++++++
>>  include/net/xdp_sock.h                       | 10 +++
>>  include/net/xdp_sock_drv.h                   |  1 +
>>  include/uapi/linux/if_xdp.h                  | 10 +++
>>  include/uapi/linux/netdev.h                  |  3 +
>>  net/core/netdev-genl.c                       |  2 +
>>  net/xdp/xsk.c                                |  3 +
>>  tools/include/uapi/linux/if_xdp.h            | 10 +++
>>  tools/include/uapi/linux/netdev.h            |  3 +
>>  10 files changed, 110 insertions(+)
>>
>> diff --git a/Documentation/netlink/specs/netdev.yaml b/Documentation/netlink/specs/netdev.yaml
>> index cbb544bd6c84..e59c8a14f7d1 100644
>> --- a/Documentation/netlink/specs/netdev.yaml
>> +++ b/Documentation/netlink/specs/netdev.yaml
>> @@ -70,6 +70,10 @@ definitions:
>>          name: tx-checksum
>>          doc:
>>            L3 checksum HW offload is supported by the driver.
>> +      -
>> +        name: tx-launch-time
>> +        doc:
>> +          Launch time HW offload is supported by the driver.
>>    -
>>      name: queue-type
>>      type: enum
>> diff --git a/Documentation/networking/xsk-tx-metadata.rst b/Documentation/networking/xsk-tx-metadata.rst
>> index e76b0cfc32f7..3cec089747ce 100644
>> --- a/Documentation/networking/xsk-tx-metadata.rst
>> +++ b/Documentation/networking/xsk-tx-metadata.rst
>> @@ -50,6 +50,10 @@ The flags field enables the particular offload:
>>    checksum. ``csum_start`` specifies byte offset of where the checksumming
>>    should start and ``csum_offset`` specifies byte offset where the
>>    device should store the computed checksum.
>> +- ``XDP_TXMD_FLAGS_LAUNCH_TIME``: requests the device to schedule the
>> +  packet for transmission at a pre-determined time called launch time. The
>> +  value of launch time is indicated by ``launch_time`` field of
>> +  ``union xsk_tx_metadata``.
>>
>>  Besides the flags above, in order to trigger the offloads, the first
>>  packet's ``struct xdp_desc`` descriptor should set ``XDP_TX_METADATA``
>> @@ -65,6 +69,65 @@ In this case, when running in ``XDK_COPY`` mode, the TX checksum
>>  is calculated on the CPU. Do not enable this option in production because
>>  it will negatively affect performance.
>>
>> +Launch Time
>> +===========
>> +
>> +The value of the requested launch time should be based on the device's PTP
>> +Hardware Clock (PHC) to ensure accuracy. AF_XDP takes a different data path
>> +compared to the ETF queuing discipline, which organizes packets and delays
>> +their transmission. Instead, AF_XDP immediately hands off the packets to
>> +the device driver without rearranging their order or holding them prior to
>> +transmission. In scenarios where the launch time offload feature is
>> +disabled, the device driver is expected to disregard the launch time
>> +request. For correct interpretation and meaningful operation, the launch
>> +time should never be set to a value larger than the farthest programmable
>> +time in the future (the horizon). Different devices have different hardware
>> +limitations on the launch time offload feature.
>> +
>> +stmmac driver
>> +-------------
>> +
>> +For stmmac, TSO and launch time (TBS) features are mutually exclusive for
>> +each individual Tx Queue. By default, the driver configures Tx Queue 0 to
>> +support TSO and the rest of the Tx Queues to support TBS. The launch time
>> +hardware offload feature can be enabled or disabled by using the tc-etf
>> +command to call the driver's ndo_setup_tc() callback.
>> +
>> +The value of the launch time that is programmed in the Enhanced Normal
>> +Transmit Descriptors is a 32-bit value, where the most significant 8 bits
>> +represent the time in seconds and the remaining 24 bits represent the time
>> +in 256 ns increments. The programmed launch time is compared against the
>> +PTP time (bits[39:8]) and rolls over after 256 seconds. Therefore, the
>> +horizon of the launch time for dwmac4 and dwxlgmac2 is 128 seconds in the
>> +future.
>> +
>> +The stmmac driver maintains FIFO behavior and does not perform packet
>> +reordering. This means that a packet with a launch time request will block
>> +other packets in the same Tx Queue until it is transmitted.
>> +
>> +igc driver
>> +----------
>> +
>> +For igc, all four Tx Queues support the launch time feature. The launch
>> +time hardware offload feature can be enabled or disabled by using the
>> +tc-etf command to call the driver's ndo_setup_tc() callback. When entering
>> +TSN mode, the igc driver will reset the device and create a default Qbv
>> +schedule with a 1-second cycle time, with all Tx Queues open at all times.
>> +
>> +The value of the launch time that is programmed in the Advanced Transmit
>> +Context Descriptor is a relative offset to the starting time of the Qbv
>> +transmission window of the queue. The Frst flag of the descriptor can be
>> +set to schedule the packet for the next Qbv cycle. Therefore, the horizon
>> +of the launch time for i225 and i226 is the ending time of the next cycle
>> +of the Qbv transmission window of the queue. For example, when the Qbv
>> +cycle time is set to 1 second, the horizon of the launch time ranges
>> +from 1 second to 2 seconds, depending on where the Qbv cycle is currently
>> +running.
>> +
>> +The igc driver maintains FIFO behavior and does not perform packet
>> +reordering. This means that a packet with a launch time request will block
>> +other packets in the same Tx Queue until it is transmitted.
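
To make the stmmac encoding described above concrete, here is a minimal
sketch in plain C. It is not taken from the driver: the helper names and
the constant are hypothetical, and it assumes the launch time is given as
an absolute PHC time in nanoseconds, with the low 8 bits of the seconds
count placed in the top byte and the sub-second part divided by 256 ns
placed in the remaining 24 bits.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC		1000000000ULL
/* Assumption from the text above: the horizon is 128 seconds ahead. */
#define LAUNCH_HORIZON_NS	(128ULL * NSEC_PER_SEC)

/*
 * Hypothetical helper: pack an absolute PHC time (in ns) into the 32-bit
 * layout described above. Bits [31:24] hold the seconds modulo 256 and
 * bits [23:0] hold the sub-second part in units of 256 ns.
 */
static uint32_t encode_launch_time(uint64_t phc_ns)
{
	uint32_t sec = (uint32_t)((phc_ns / NSEC_PER_SEC) & 0xff);
	uint32_t sub = (uint32_t)((phc_ns % NSEC_PER_SEC) >> 8);

	/* Less than one second of ns, shifted by 8, always fits in 24 bits. */
	assert(sub <= 0xffffff);
	return (sec << 24) | sub;
}

/*
 * Hypothetical helper: return 1 if the requested launch time is not in the
 * past and not beyond the horizon, so the wrapped 8-bit seconds field
 * cannot be misread by the comparison.
 */
static int launch_time_within_horizon(uint64_t now_ns, uint64_t launch_ns)
{
	return launch_ns >= now_ns && launch_ns - now_ns <= LAUNCH_HORIZON_NS;
}
```

With a PHC time of 302 s + 512 ns, the seconds field wraps to 46 (0x2e)
and the sub-second field is 2, so the packed value is 0x2e000002.
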
>
>Since two devices we initially support are using FIFO mode, should we more
>explicitly target this case? Maybe even call netdev features
>tx-launch-time-fifo? In the future, if/when we get support timing-wheel-like
>queues, we can export another tx-launch-time-wheel?
>
>It seems important for the userspace to know which mode it's running.
>In a fifo mode, it might make sense to allocate separate queues
>for scheduling things far into the future/etc.

You are right. Users should dedicate one queue to packets scheduled far
into the future and use another queue for normal traffic.
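
As a rough illustration of why this matters, below is a simplified model
of a single FIFO Tx queue. It is only a sketch, not driver code:
fifo_tx_times() is a hypothetical helper, wire time is ignored, and a
launch time of 0 is assumed to mean "no launch time requested".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative model: compute when each packet of one Tx queue is sent if
 * the device strictly preserves FIFO order. A packet cannot leave before
 * the packet ahead of it, so a far-future launch time at the head of the
 * queue delays every packet queued behind it.
 */
static void fifo_tx_times(uint64_t now_ns, const uint64_t *launch_ns,
			  uint64_t *tx_ns, size_t n)
{
	uint64_t head = now_ns;	/* earliest time the next packet may go out */
	size_t i;

	assert(n == 0 || (launch_ns && tx_ns));

	for (i = 0; i < n; i++) {
		if (launch_ns[i] > head)
			head = launch_ns[i];	/* wait for the launch time */
		tx_ns[i] = head;
	}
}
```

For example, with now = 1000 and launch times {0, 5000, 0}, the third
packet has no launch time request but is still held until 5000 because it
sits behind the second one. Placing such packets on a separate queue
avoids that delay.
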

>
>Thoughts? No code changes required, just more explicitly state the
>expectations.

Agreed. I will rename tx-launch-time to tx-launch-time-fifo to state the
FIFO behavior explicitly.

Thanks & Regards
Siang
