Message-ID: <0be119c2-4b5f-4aa6-bb20-b3e8a8b4cf82@kernel.org>
Date: Wed, 26 Nov 2025 12:15:05 +0100
From: Matthieu Baerts <matttbe@...nel.org>
To: Maciej Żenczykowski <zenczykowski@...il.com>
Cc: Linux Network Development Mailing List <netdev@...r.kernel.org>,
"David S . Miller" <davem@...emloft.net>, Eric Dumazet
<edumazet@...gle.com>, Paolo Abeni <pabeni@...hat.com>,
Lorenzo Colitti <lorenzo@...gle.com>, Neal Cardwell <ncardwell@...gle.com>,
bpf@...r.kernel.org, MPTCP Linux <mptcp@...ts.linux.dev>,
Jakub Kicinski <kuba@...nel.org>
Subject: Re: [PATCH net] net: fix propagation of EPERM from tcp_connect()
Hi Maciej,
(+cc MPTCP list)
On 26/11/2025 02:08, Maciej Żenczykowski wrote:
>> FWIW this breaks the mptcp_join.sh test, too:
>
> What do you mean by 'too', does it break something else as well, or
> just the quoted mptcp_join?
>
>> https://netdev-3.bots.linux.dev/vmksft-mptcp/results/394900/1-mptcp-join-sh/stdout
>
> My still very preliminary investigation is that this is actually
> correct (though obviously the tests need to be adjusted).
>
> See tools/testing/selftests/net/mptcp/mptcp_join.sh:89
>
> # generated using "nfbpf_compile '(ip && (ip[54] & 0xf0) == 0x30) ||
> # (ip6 && (ip6[74] & 0xf0) == 0x30)'"
> CBPF_MPTCP_SUBOPTION_ADD_ADDR=...
>
> mptcp_join.sh:365
> if ! ip netns exec $ns2 $tables -A OUTPUT -p tcp \
> -m tcp --tcp-option 30 \
> -m bpf --bytecode \
> "$CBPF_MPTCP_SUBOPTION_ADD_ADDR" \
> -j DROP
>
> So basically this is using iptables -j DROP which presumably
> propagates to EPERM and thus results in a faster local failure...
I don't think that's what caused the issue. According to the logs, two
tests failed, "delete and re-add" and "flush re-add", and neither of
them uses the snippet mentioned above. Still, it might be caused by a
Netfilter rule, because they both call:
  reset_with_tcp_filter "..." ns2 10.0.3.2 REJECT OUTPUT
This helper will call:
  iptables -A OUTPUT -s 10.0.3.2 -j REJECT
Note that you can easily reproduce the issue by only launching the
problematic tests with './mptcp_join.sh "<test name or id>"', e.g.
  cd tools/testing/selftests/net/mptcp
  ./mptcp_join.sh "delete and re-add"
> Although this is probably trying to replicate packet loss rather than
> a local error...
>
> So I'm not sure if I should:
> (a) fix the asserts with new values (presumably easiest by far),
> or
> (b) change how it does DROP to make it more like network packet loss
> (maybe an extra namespace, so the drop is in a diff netns, during
> forwarding??? not even sure if that would help though, or maybe add
> drop on other netns INPUT instead of OUTPUT).
> or
> (c) introduce some iptables -j DROP_CN type return... (seems like that
> might be worthwhile anyway)
From what I understand, with your RFC patch, a "connect()" (or
"kernel_connect()") will get an error because of the Netfilter rule. If
that's normal, then probably the expected results can be adapted, e.g.
from ...
  join_syn_tx=3 join_connect_err=1 \
      chk_join_nr 2 2 2
... to ...
  join_connect_err=2 \
      chk_join_nr 2 2 2
At least that would show the effect of your patch.
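If the selftest has to pass both with and without the patch, the two
sets of expected counters quoted above could be selected at run time.
A minimal sketch only: "expected_join_env" and the
"connect_sees_filter_err" flag are hypothetical names, not existing
helpers in mptcp_join.sh, and how to detect the new behaviour is left
open here.

```shell
# Hypothetical helper, not part of mptcp_join.sh: print the counter
# overrides to put in front of "chk_join_nr 2 2 2".
# $1: 1 if connect() reports the Netfilter error (patched kernel), else 0.
expected_join_env() {
    if [ "$1" = "1" ]; then
        echo "join_connect_err=2"
    else
        echo "join_syn_tx=3 join_connect_err=1"
    fi
}

# Assumption: this flag would be set by some detection logic not shown here.
connect_sees_filter_err=0
echo "$(expected_join_env "$connect_sees_filter_err") chk_join_nr 2 2 2"
```

This only prints the line that would be used; it does not run the
check itself.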
An important note: the selftests can be executed on older kernels. If
your patch changes the behaviour and it is a fix that is going to be
backported to stable, then adapting the expected results is fine. If it
is not a fix, the selftests should continue to work with and without the
kernel patch. In that case, the Netfilter rule should probably be
adapted instead, maybe by moving it to the other side (ns1) in INPUT,
if that still makes the tests valid.
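To make the two placements concrete, here is a dry-run sketch that only
prints the rule that would be installed on each side. "build_filter_rule"
is a hypothetical name used for illustration, not the real
reset_with_tcp_filter helper, and the namespace names are passed as plain
strings. Whether the ns1/INPUT variant keeps the tests meaningful still
has to be checked.

```shell
# Hypothetical dry-run helper: print the iptables command instead of
# running it, so the two placements can be compared without root.
# $1: netns name, $2: chain, $3: source address, $4: target
build_filter_rule() {
    printf 'ip netns exec %s iptables -A %s -s %s -j %s\n' \
        "$1" "$2" "$3" "$4"
}

# Current placement: local rule on the sender side (ns2, OUTPUT).
build_filter_rule ns2 OUTPUT 10.0.3.2 REJECT
# Possible alternative: same match on the receiver side (ns1, INPUT).
build_filter_rule ns1 INPUT 10.0.3.2 REJECT
```

With the rule in ns1, the packet is filtered on the receiving side
rather than locally in ns2, so the connect() in ns2 would presumably not
see a local Netfilter verdict.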
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.