Message-ID: <20250822113519.y6maeu4ifoqx4mxe@skbuf>
Date: Fri, 22 Aug 2025 14:35:19 +0300
From: Vladimir Oltean <vladimir.oltean@....com>
To: Oleksij Rempel <o.rempel@...gutronix.de>
Cc: Andrew Lunn <andrew@...n.ch>, Heiner Kallweit <hkallweit1@...il.com>,
	"David S. Miller" <davem@...emloft.net>,
	Eric Dumazet <edumazet@...gle.com>,
	Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
	Rob Herring <robh@...nel.org>,
	Krzysztof Kozlowski <krzk+dt@...nel.org>,
	Florian Fainelli <f.fainelli@...il.com>,
	Maxime Chevallier <maxime.chevallier@...tlin.com>,
	Kory Maincent <kory.maincent@...tlin.com>,
	Lukasz Majewski <lukma@...x.de>, Jonathan Corbet <corbet@....net>,
	Donald Hunter <donald.hunter@...il.com>,
	Vadim Fedorenko <vadim.fedorenko@...ux.dev>,
	Jiri Pirko <jiri@...nulli.us>, Alexei Starovoitov <ast@...nel.org>,
	Daniel Borkmann <daniel@...earbox.net>,
	Jesper Dangaard Brouer <hawk@...nel.org>,
	John Fastabend <john.fastabend@...il.com>, kernel@...gutronix.de,
	linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
	Russell King <linux@...linux.org.uk>, Divya.Koppera@...rochip.com,
	Sabrina Dubroca <sd@...asysnail.net>,
	Stanislav Fomichev <sdf@...ichev.me>
Subject: Re: [PATCH net-next v3 3/3] Documentation: net: add flow control
 guide and document ethtool API

On Wed, Aug 20, 2025 at 03:10:23PM +0200, Oleksij Rempel wrote:
>          name: stats-src
> +        doc: |
> +          Selects the source of the MAC statistics, values from
> +          enum ethtool_mac_stats_src. This allows requesting statistics
> +          from an aggregated MAC or a specific PHY, for example.

"This allows requesting statistics from the individual components of the
MAC Merge layer" would be better - nothing to do with PHYs.

>          type: u32
>    -
>      name: eee
> diff --git a/Documentation/networking/flow_control.rst b/Documentation/networking/flow_control.rst
> new file mode 100644
> index 000000000000..ba315a5bcb87
> --- /dev/null
> +++ b/Documentation/networking/flow_control.rst
> @@ -0,0 +1,379 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +.. _ethernet-flow-control:
> +
> +=====================
> +Ethernet Flow Control
> +=====================
> +
> +This document is a practical guide to Ethernet Flow Control in Linux, covering
> +what it is, how it works, and how to configure it.
> +
> +What is Flow Control?
> +=====================
> +
> +Flow control is a mechanism to prevent a fast sender from overwhelming a
> +slow receiver with data, which would cause buffer overruns and dropped packets.
> +The receiver can signal the sender to temporarily stop transmitting, giving it
> +time to process its backlog.
> +
> +Standards references
> +====================
> +
> +Ethernet flow control mechanisms are specified across consolidated IEEE base
> +standards; some originated as amendments:
> +
> +- Collision-based flow control is part of CSMA/CD in **IEEE 802.3**
> +  (half-duplex).
> +- Link‑wide PAUSE is defined in **IEEE 802.3 Annex 31B**

There are some odd characters here: "Link‑wide" uses a non-ASCII hyphen
instead of a plain "-".

> +  (originally **802.3x**).
> +- Priority-based Flow Control (PFC) is defined in **IEEE 802.1Q Clause 36**
> +  (originally **802.1Qbb**).
> +
> +In the remainder of this document, the consolidated clause numbers are used.
> +
> +How It Works: The Mechanisms
> +============================
> +
> +The method used for flow control depends on the link's duplex mode.
> +
> +.. note::
> +   The user-visible ``ethtool`` pause API described in this document controls
> +   **link-wide PAUSE** (IEEE 802.3 Annex 31B) only. It does not control the
> +   collision-based behavior that exists on half-duplex links.
> +
> +2. Full-Duplex: Link-wide PAUSE (IEEE 802.3 Annex 31B)
> +------------------------------------------------------
> +On full-duplex links, devices can send and receive at the same time. Flow
> +control is achieved by sending a special **PAUSE frame**, defined by IEEE
> +802.3 Annex 31B. This mechanism pauses all traffic on the link and is therefore
> +called *link-wide PAUSE*.
> +
> +* **What it is**: A standard Ethernet frame with a globally reserved
> +    destination MAC address (``01-80-C2-00-00-01``). This address is in a range
> +    that standard IEEE 802.1D-compliant bridges do not forward. However, some
> +    unmanaged or misconfigured bridges have been reported to forward these
> +    frames, which can disrupt flow control across a network.
> +
> +* **How it works**: The frame contains a MAC Control opcode for PAUSE
> +    (``0x0001``) and a ``pause_time`` value, telling the sender how long to
> +    wait before sending more data frames. This time is specified in units of
> +    "pause quanta," where one quantum is the time it takes to transmit 512 bits.
> +    For example, one pause quantum is 51.2 microseconds on a 10 Mbit/s link,
> +    and 512 nanoseconds on a 1 Gbit/s link.

I would also mention that a pause_time of 0 is special: it means the
transmitter can resume immediately, even if the previously requested
quanta have not elapsed.
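
For what it's worth, the quantum arithmetic from the quoted text can be
sketched as below. The helper names are made up for illustration and are
not taken from any kernel API; this only assumes that one quantum is 512
bit times, as the document states.

```c
#include <stdint.h>

/* Hypothetical helpers, not kernel API. One pause quantum is 512 bit times. */
#define PAUSE_QUANTUM_BITS 512ULL

/* Duration of one pause quantum in picoseconds at the given link speed. */
static inline uint64_t pause_quantum_ps(uint32_t speed_mbps)
{
	if (!speed_mbps)
		return 0; /* unknown speed: nothing sensible to report */

	/* One bit at 1 Mbit/s lasts 1 us, i.e. 1,000,000 ps. */
	return PAUSE_QUANTUM_BITS * 1000000ULL / speed_mbps;
}

/*
 * Pause duration requested by a PAUSE frame carrying @pause_time quanta.
 * A pause_time of 0 yields 0, i.e. "resume now".
 */
static inline uint64_t pause_time_ps(uint16_t pause_time, uint32_t speed_mbps)
{
	return (uint64_t)pause_time * pause_quantum_ps(speed_mbps);
}
```

With these, pause_quantum_ps(10) gives 51.2 us and pause_quantum_ps(1000)
gives 512 ns, matching the figures in the text, and the largest possible
pause_time (0xffff) at 1 Gbit/s comes out at roughly 33.5 ms.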

> +
> +* **Who uses it**: Any full-duplex link, from 10 Mbit/s to multi-gigabit speeds.
> +
> +The MAC (Media Access Controller)
> +---------------------------------
> +The MAC is the hardware component that actually sends and receives PAUSE
> +frames. Its capabilities define the upper limit of what the driver can support.
> +For link-wide PAUSE, MACs can vary in their support for symmetric (both
> +directions) or asymmetric (independent TX/RX) flow control.
> +
> +For PFC, the MAC must be capable of generating and interpreting the
> +priority-based PAUSE frames and managing separate pause states for each
> +traffic class.
> +
> +Many MACs also implement automatic PAUSE frame transmission based on the fill
> +level of their internal RX FIFO. This is typically configured with two
> +thresholds:
> +
> +* **FLOW_ON (High Water Mark)**: When the RX FIFO usage reaches this
> +  threshold, the MAC automatically transmits a PAUSE frame to stop the sender.
> +
> +* **FLOW_OFF (Low Water Mark)**: When the RX FIFO usage drops below this
> +  threshold, the MAC transmits a PAUSE frame with a quanta of zero to tell

I think "quanta" is plural; this should be "a pause_time of zero" (or
"zero quanta").

> +  the sender it can resume transmission.
> +
> +The optimal values for these thresholds depend on the link's round-trip-time
> +(RTT) and the peer's internal processing latency. The high water mark must be
> +set low enough so that the MAC's RX FIFO does not overflow while waiting for
> +the peer to react to the PAUSE frame. The driver is responsible for configuring
> +sensible defaults according to the IEEE specification. User tuning should only
> +be necessary in special cases, such as on links with unusually long cable
> +lengths (e.g., long-haul fiber).

How would user tuning be achieved?
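
Independently of which interface would expose these knobs, the headroom
reasoning in the quoted paragraph can be made concrete with a rough
sketch. Everything below is an assumption for illustration (function
names, the choice of terms, the example numbers); it is not taken from
any driver or from the IEEE text.

```c
#include <stdint.h>

/*
 * Hypothetical helper. Estimates how many bytes can still arrive after the
 * local MAC decides to send PAUSE:
 *  - the local MAC may first have to finish a frame it is already sending,
 *    during which the peer sends up to @max_frame bytes;
 *  - the PAUSE frame travels to the peer while data already on the wire
 *    keeps arriving (@rtt_ns covers both directions);
 *  - the peer needs @peer_delay_ns to react, and may then still finish a
 *    frame it has started (another @max_frame bytes).
 */
static inline uint64_t pause_headroom_bytes(uint32_t speed_mbps,
					    uint64_t rtt_ns,
					    uint64_t peer_delay_ns,
					    uint32_t max_frame)
{
	/* speed_mbps Mbit/s equals speed_mbps / 8000 bytes per nanosecond. */
	uint64_t in_flight = (rtt_ns + peer_delay_ns) * speed_mbps / 8000;

	return in_flight + 2ULL * max_frame;
}

/* Highest fill level at which PAUSE can be sent without risking overflow. */
static inline uint64_t flow_on_threshold(uint64_t fifo_bytes,
					 uint64_t headroom)
{
	return fifo_bytes > headroom ? fifo_bytes - headroom : 0;
}
```

On a 1 Gbit/s link with an assumed 1 us RTT, 1024 ns peer reaction time
and 1518-byte frames, this gives about 3.3 kB of headroom. On 10 km of
fiber at 10 Gbit/s (roughly 100 us RTT), the in-flight term alone is
about 125 kB, which is why defaults chosen for short cables would not be
enough there.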

> diff --git a/include/uapi/linux/ethtool_netlink_generated.h b/include/uapi/linux/ethtool_netlink_generated.h
> index 46de09954042..0af7b90101c1 100644
> --- a/include/uapi/linux/ethtool_netlink_generated.h
> +++ b/include/uapi/linux/ethtool_netlink_generated.h
> @@ -394,7 +400,25 @@ enum {
>  	ETHTOOL_A_PAUSE_STAT_MAX = (__ETHTOOL_A_PAUSE_STAT_CNT - 1)
>  };
>  
> -enum {
> +/**
> + * enum ethtool_pause - Parameters for link-wide PAUSE (IEEE 802.3 Annex 31B).
> + * @ETHTOOL_A_PAUSE_AUTONEG: Acts as a mode selector for the driver. On GET:
> + *   indicates the driver's behavior. If true, the driver will respect the
> + *   negotiated outcome; if false, the driver will use a forced configuration.
> + *   On SET: if true, the driver configures the PHY's advertisement based on
> + *   the rx and tx attributes. If false, the driver forces the MAC into the
> + *   state defined by the rx and tx attributes.
> + * @ETHTOOL_A_PAUSE_RX: Enable receiving PAUSE frames (pausing local TX). On
> + *   GET: reflects the currently preferred configuration state.
> + * @ETHTOOL_A_PAUSE_TX: Enable transmitting PAUSE frames (pausing peer TX). On
> + *   GET: reflects the currently preferred configuration state.
> + * @ETHTOOL_A_PAUSE_STATS: Contains the pause statistics counters. The source
> + *   of these statistics is determined by stats-src.
> + * @ETHTOOL_A_PAUSE_STATS_SRC: Selects the source of the MAC statistics, values
> + *   from enum ethtool_mac_stats_src. This allows requesting statistics from an
> + *   aggregated MAC or a specific PHY, for example.

Same here: this should refer to the individual components of the MAC
Merge layer, not PHYs.

> + */
> +enum ethtool_a_pause {
>  	ETHTOOL_A_PAUSE_UNSPEC,
>  	ETHTOOL_A_PAUSE_HEADER,
>  	ETHTOOL_A_PAUSE_AUTONEG,
