Message-ID: <0be119c2-4b5f-4aa6-bb20-b3e8a8b4cf82@kernel.org>
Date: Wed, 26 Nov 2025 12:15:05 +0100
From: Matthieu Baerts <matttbe@...nel.org>
To: Maciej Żenczykowski <zenczykowski@...il.com>
Cc: Linux Network Development Mailing List <netdev@...r.kernel.org>,
 "David S . Miller" <davem@...emloft.net>, Eric Dumazet
 <edumazet@...gle.com>, Paolo Abeni <pabeni@...hat.com>,
 Lorenzo Colitti <lorenzo@...gle.com>, Neal Cardwell <ncardwell@...gle.com>,
 bpf@...r.kernel.org, MPTCP Linux <mptcp@...ts.linux.dev>,
 Jakub Kicinski <kuba@...nel.org>
Subject: Re: [PATCH net] net: fix propagation of EPERM from tcp_connect()

Hi Maciej,

(+cc MPTCP list)

On 26/11/2025 02:08, Maciej Żenczykowski wrote:
>> FWIW this breaks the mptcp_join.sh test, too:
> 
> What do you mean by 'too', does it break something else as well, or
> just the quoted mptcp_join?
> 
>> https://netdev-3.bots.linux.dev/vmksft-mptcp/results/394900/1-mptcp-join-sh/stdout
> 
> My still very preliminary investigation is that this is actually
> correct (though obviously the tests need to be adjusted).
> 
> See tools/testing/selftests/net/mptcp/mptcp_join.sh:89
> 
> # generated using "nfbpf_compile '(ip && (ip[54] & 0xf0) == 0x30) ||
> #                                (ip6 && (ip6[74] & 0xf0) == 0x30)'"
> CBPF_MPTCP_SUBOPTION_ADD_ADDR=...
> 
> mptcp_join.sh:365
>       if ! ip netns exec $ns2 $tables -A OUTPUT -p tcp \
>                       -m tcp --tcp-option 30 \
>                       -m bpf --bytecode \
>                       "$CBPF_MPTCP_SUBOPTION_ADD_ADDR" \
>                       -j DROP
> 
> So basically this is using iptables -j DROP which presumably
> propagates to EPERM and thus results in a faster local failure...

I don't think that's what caused the issue. According to the logs, two
tests have failed, "delete and re-add" and "flush re-add", and they don't
use the mentioned snippet. Still, it might be caused by a Netfilter rule,
because they both call:

  reset_with_tcp_filter "..." ns2 10.0.3.2 REJECT OUTPUT

This helper will call:

  iptables -A OUTPUT -s 10.0.3.2 -j REJECT

Note that you can easily reproduce the issue by only launching the
problematic tests with './mptcp_join.sh "<test name or id>"', e.g.

  cd tools/testing/selftests/net/mptcp
  ./mptcp_join.sh "delete and re-add"

> Although this is probably trying to replicate packet loss rather than
> a local error...
> 
> So I'm not sure if I should:
> (a) fix the asserts with new values (presumably easiest by far),
> or
> (b) change how it does DROP to make it more like network packet loss
> (maybe an extra namespace, so the drop is in a diff netns, during
> forwarding??? not even sure if that would help though, or maybe add
> drop on other netns INPUT instead of OUTPUT).
> or
> (c) introduce some iptables -j DROP_CN type return... (seems like that
> might be worthwhile anyway)

From what I understand, with your RFC patch, a "connect()" (or
"kernel_connect()") will get an error because of the Netfilter rule. If
that's normal, then probably the expected results can be adapted, e.g.
from ...

  join_syn_tx=3 join_connect_err=1 \
          chk_join_nr 2 2 2

... to ...

  join_connect_err=2 \
          chk_join_nr 2 2 2

At least that would show the effect of your patch.

An important note: the selftests can be executed on older kernels. If
your patch changes the behaviour and it is a fix that is going to be
backported to stable, that's fine. If it is not a fix, the selftests
should continue to work with and without the kernel patch. In that case,
the Netfilter rule should probably be adapted instead, maybe by moving it
to the other side (ns1) in INPUT, if that still makes the tests valid.
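
A rough sketch of what moving the rule could look like, for comparison
with the current placement. The helper name "print_filter_rule" is
hypothetical and is not part of mptcp_join.sh; it only prints the
commands so that the two placements can be compared without touching
any namespace. Whether REJECT on ns1's INPUT chain keeps the tests
valid is an open question here, not something this sketch confirms.

```shell
#!/bin/sh
# Hypothetical helper (not from mptcp_join.sh): print, but do not run,
# the iptables command for a given namespace, chain, source and target.
print_filter_rule() {
    ns="$1"
    chain="$2"
    src="$3"
    target="$4"
    printf 'ip netns exec %s iptables -A %s -s %s -j %s\n' \
        "$ns" "$chain" "$src" "$target"
}

# Current placement, as described earlier in this mail: the sender's
# (ns2) OUTPUT chain.
print_filter_rule ns2 OUTPUT 10.0.3.2 REJECT

# Possible alternative: the receiver's (ns1) INPUT chain. Packets from
# ns2 still carry 10.0.3.2 as source, so the "-s" match is unchanged.
print_filter_rule ns1 INPUT 10.0.3.2 REJECT
```

With the rule on the receiving side, the sender's own Netfilter hooks no
longer see a verdict, so the local connect() result should not depend on
the kernel patch; this is an assumption that would need to be checked by
running the two failing tests.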

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.

