Message-ID: <20251025160905.3857885-387-sashal@kernel.org>
Date: Sat, 25 Oct 2025 12:00:18 -0400
From: Sasha Levin <sashal@...nel.org>
To: patches@...ts.linux.dev,
stable@...r.kernel.org
Cc: Ido Schimmel <idosch@...dia.com>,
Petr Machata <petrm@...dia.com>,
Nikolay Aleksandrov <razor@...ckwall.org>,
Jakub Kicinski <kuba@...nel.org>,
Sasha Levin <sashal@...nel.org>,
bridge@...ts.linux.dev,
netdev@...r.kernel.org
Subject: [PATCH AUTOSEL 6.17-5.4] bridge: Redirect to backup port when port is administratively down
From: Ido Schimmel <idosch@...dia.com>
[ Upstream commit 3d05b24429e1de7a17c8fdccb04a04dbc8ad297b ]
If a backup port is configured for a bridge port, the bridge will
redirect known unicast traffic towards the backup port when the primary
port is administratively up but without a carrier. This is useful, for
example, in MLAG configurations where a system is connected to two
switches and there is a peer link between both switches. The peer link
serves as the backup port in case one of the switches loses its
connection to the multi-homed system.
In order to avoid flooding when the primary port loses its carrier, the
bridge does not flush dynamic FDB entries pointing to the port upon STP
disablement, if the port has a backup port.
The above means that known unicast traffic destined to the primary port
will be blackholed when the port is put administratively down, until the
FDB entries pointing to it are aged-out.
Given that the current behavior is quite weird and unlikely to be
depended on by anyone, amend the bridge to redirect to the backup port
also when the primary port is administratively down and not only when it
does not have a carrier.
The change is motivated by a report from a user who expected traffic to
be redirected to the backup port when the primary port was put
administratively down while debugging a network issue.
Reviewed-by: Petr Machata <petrm@...dia.com>
Signed-off-by: Ido Schimmel <idosch@...dia.com>
Acked-by: Nikolay Aleksandrov <razor@...ckwall.org>
Link: https://patch.msgid.link/20250812080213.325298-2-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@...nel.org>
Signed-off-by: Sasha Levin <sashal@...nel.org>
---
LLM Generated explanations, may be completely bogus:
YES
- What it fixes
- Prevents known-unicast blackholing when a bridge port with a
configured backup is put administratively down. Today, with a backup
port configured, FDB entries are intentionally not flushed on STP
disable (net/bridge/br_stp_if.c:116), so known unicast continues to
target the primary port. However, br_forward() only redirects to the
backup when the primary has no carrier, not when it’s
administratively down, so traffic can be dropped until FDB aging.
- The patch extends the existing redirection criterion to cover both
“no carrier” and “admin down,” aligning behavior with user
expectations in MLAG-like deployments and eliminating a surprising
failure mode.
- Why it’s a stable-worthy bugfix
- User-visible impact: Traffic blackhole in a common operational
scenario (admin down during maintenance/debug), even though a backup
port is configured and FDB entries are retained specifically to
allow continued forwarding.
- Small, contained change: One condition widened in a single function;
no API/ABI or architectural changes.
- Consistent with existing semantics: It broadens an
already-established fast-failover behavior (originally for
link/carrier loss) to the equivalent “port down” state, which serves
the same operational intent.
- Maintainer acks: Reviewed-by and Acked-by from bridge maintainers;
Signed-off-by from the net maintainer.
- Code reference and rationale
- Current redirection only when carrier is down:
- net/bridge/br_forward.c:151
    if (rcu_access_pointer(to->backup_port) &&
        !netif_carrier_ok(to->dev)) { ... }
- Patch adds admin-down to the same decision, effectively:
- net/bridge/br_forward.c:151
    if (rcu_access_pointer(to->backup_port) &&
        (!netif_carrier_ok(to->dev) || !netif_running(to->dev))) { ... }
- This ensures redirection also when `!netif_running()`
(administratively down).
- The reason blackholing occurs without this patch:
- On STP port disable, FDB entries are not flushed if a backup port
is configured:
- net/bridge/br_stp_if.c:116
    if (!rcu_access_pointer(p->backup_port))
        br_fdb_delete_by_port(br, p, 0, 0);
- This optimization (commit 8dc350202d32, “optimize backup_port fdb
convergence”) intentionally keeps FDB entries to enable seamless
redirection, but br_forward() fails to redirect when the port is
admin down, causing drops.
- Risk assessment
- Minimal regression risk: The patch adds only a
`netif_running(to->dev)` check to a path that already conditionally
redirects; `should_deliver()` still gates actual forwarding on the
backup port’s state and policy.
- No new features, no data structure changes, no timing-sensitive
logic added.
- Behavior remains unchanged unless a backup port is configured, and
then only in the admin-down case, which is the intended failover
scenario.
- Backport considerations
- Applicable to stable series that include backup port support and the
FDB-retention optimization (e.g., post-2018/2019 kernels). It will
not apply to trees that predate `backup_port`.
- The change widens a single condition in `br_forward()` (2
insertions, 1 deletion); no dependencies beyond existing
`netif_running()` and `netif_carrier_ok()`.
Conclusion: This is a clear bugfix to prevent data-plane blackholes in a
supported configuration with minimal risk. It should be backported to
stable kernels that have bridge backup-port support.
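
The difference between the old and new redirect condition can be
sketched in userspace C. This is only an illustrative model, not kernel
code: `struct model_port`, `pick_egress_old()` and `pick_egress_new()`
are hypothetical names, and the two booleans merely stand in for
`netif_carrier_ok()` and `netif_running()`. The model treats them as
independent flags, which assumes a device can report carrier while being
administratively down; it also ignores the `should_deliver()` check that
the kernel still applies to the backup port.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a bridge port; NOT struct net_bridge_port. */
struct model_port {
	const char *name;
	bool carrier_ok;           /* models netif_carrier_ok(to->dev) */
	bool running;              /* models netif_running(to->dev), i.e. admin up */
	struct model_port *backup; /* models to->backup_port, NULL if unset */
};

/* Pre-patch rule: redirect to the backup only on carrier loss. */
static const struct model_port *pick_egress_old(const struct model_port *to)
{
	if (to->backup && !to->carrier_ok)
		return to->backup;
	return to;
}

/* Post-patch rule: redirect on carrier loss OR administrative down. */
static const struct model_port *pick_egress_new(const struct model_port *to)
{
	if (to->backup && (!to->carrier_ok || !to->running))
		return to->backup;
	return to;
}
```

With a primary port that still reports carrier but is administratively
down, `pick_egress_old()` keeps returning the primary (the blackhole
case described above, since the FDB entry is retained), while
`pick_egress_new()` returns the backup. Without a backup port both
functions return the primary in every state.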
net/bridge/br_forward.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/bridge/br_forward.c b/net/bridge/br_forward.c
index 29097e984b4f7..870bdf2e082c4 100644
--- a/net/bridge/br_forward.c
+++ b/net/bridge/br_forward.c
@@ -148,7 +148,8 @@ void br_forward(const struct net_bridge_port *to,
 		goto out;
 
 	/* redirect to backup link if the destination port is down */
-	if (rcu_access_pointer(to->backup_port) && !netif_carrier_ok(to->dev)) {
+	if (rcu_access_pointer(to->backup_port) &&
+	    (!netif_carrier_ok(to->dev) || !netif_running(to->dev))) {
 		struct net_bridge_port *backup_port;
 
 		backup_port = rcu_dereference(to->backup_port);
--
2.51.0