Message-ID: <a74fbb54-2594-fd37-c5fe-3a027d9a5ea3@alu.unizg.hr>
Date: Sat, 10 Jun 2023 20:04:02 +0200
From: Mirsad Goran Todorovac <mirsad.todorovac@....unizg.hr>
To: Guillaume Nault <gnault@...hat.com>
Cc: netdev@...r.kernel.org, "David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>, Shuah Khan <shuah@...nel.org>,
linux-kernel@...r.kernel.org, linux-kselftest@...r.kernel.org
Subject: Re: POSSIBLE BUG: selftests/net/fcnal-test.sh: [FAIL][FIX TESTED] in
vrf "bind - ns-B IPv6 LLA" test
On 6/9/23 18:13, Guillaume Nault wrote:
> On Thu, Jun 08, 2023 at 07:37:15AM +0200, Mirsad Goran Todorovac wrote:
>> On 6/7/23 18:51, Guillaume Nault wrote:
>>> On Wed, Jun 07, 2023 at 12:04:52AM +0200, Mirsad Goran Todorovac wrote:
>>>> [...]
>>>> TEST: ping local, VRF bind - ns-A IP [ OK ]
>>>> TEST: ping local, VRF bind - VRF IP [FAIL]
>>>> TEST: ping local, VRF bind - loopback [ OK ]
>>>> TEST: ping local, device bind - ns-A IP [FAIL]
>>>> TEST: ping local, device bind - VRF IP [ OK ]
>>>> [...]
>>>> TEST: ping local, VRF bind - ns-A IP [ OK ]
>>>> TEST: ping local, VRF bind - VRF IP [FAIL]
>>>> TEST: ping local, VRF bind - loopback [ OK ]
>>>> TEST: ping local, device bind - ns-A IP [FAIL]
>>>> TEST: ping local, device bind - VRF IP [ OK ]
>>>> [...]
>>>
>>> I have the same failures here. They don't seem to be recent.
>>> I'll take a look.
>>
>> Certainly. I thought it might be something architecture-specific?
>>
>> I have reproduced it also on a Lenovo IdeaPad 3 with Ubuntu 22.10,
>> but on Lenovo desktop with AlmaLinux 8.8 (CentOS fork), the result
>> was "888/888 passed".
>
> I've taken a deeper look at these failures. That's actually a problem in
> ping. That's probably why you have different results depending on the
> distribution.
Thank you for your work. I feel encouraged by your aim to get to the bottom
of the problem ...
> The problem is that, for some versions, 'ping -I netdev ...' doesn't
> bind the socket to 'netdev' if the IPv4 address to ping is set on that
> same device. The VRF tests depend on this socket binding, so they fail
> when ping refuses to bind. That was fixed upstream with commit
> 92ce8ef21393 ("Revert "ping: do not bind to device when destination IP
> is on device"") (https://github.com/iputils/iputils/commit/92ce8ef2139353da3bf55fe2280bd4abd2155c9f).
>
> Long story short, the tests should pass with the latest upstream ping
> version.
>
> Alternatively, you can modify the commands run by fcnal-test.sh and
> provide the -I option twice: one for setting the device binding and one
> for setting the source IPv4 address. This way ping should accept to
> bind its socket.
>
> Something like (not tested):
>
> - run_cmd ping -c1 -w1 -I ${VRF} ${a}
> + run_cmd ping -c1 -w1 -I ${VRF} -I ${a} ${a}
> [...]
> - run_cmd ping -c1 -w1 -I ${NSA_DEV} ${a}
> + run_cmd ping -c1 -w1 -I ${NSA_DEV} -I ${a} ${a}
I have tested this and the fix appears to work:
#################################################################
With VRF
SYSCTL: net.ipv4.raw_l3mdev_accept=1
TEST: ping out, VRF bind - ns-B IP [ OK ]
TEST: ping out, device bind - ns-B IP [ OK ]
TEST: ping out, vrf device + dev address bind - ns-B IP [ OK ]
TEST: ping out, vrf device + vrf address bind - ns-B IP [ OK ]
TEST: ping out, VRF bind - ns-B loopback IP [ OK ]
TEST: ping out, device bind - ns-B loopback IP [ OK ]
TEST: ping out, vrf device + dev address bind - ns-B loopback IP [ OK ]
TEST: ping out, vrf device + vrf address bind - ns-B loopback IP [ OK ]
TEST: ping in - ns-A IP [ OK ]
TEST: ping in - VRF IP [ OK ]
TEST: ping local, VRF bind - ns-A IP [ OK ]
TEST: ping local, VRF bind - VRF IP [ OK ]
TEST: ping local, VRF bind - loopback [ OK ]
TEST: ping local, device bind - ns-A IP [ OK ]
TEST: ping local, device bind - VRF IP [ OK ]
TEST: ping local, device bind - loopback [ OK ]
TEST: ping out, vrf bind, blocked by rule - ns-B loopback IP [ OK ]
TEST: ping out, device bind, blocked by rule - ns-B loopback IP [ OK ]
TEST: ping in, blocked by rule - ns-A loopback IP [ OK ]
TEST: ping out, vrf bind, unreachable route - ns-B loopback IP [ OK ]
TEST: ping out, device bind, unreachable route - ns-B loopback IP [ OK ]
TEST: ping in, unreachable route - ns-A loopback IP [ OK ]
SYSCTL: net.ipv4.ping_group_range=0 2147483647
SYSCTL: net.ipv4.raw_l3mdev_accept=1
TEST: ping out, VRF bind - ns-B IP [ OK ]
TEST: ping out, device bind - ns-B IP [ OK ]
TEST: ping out, vrf device + dev address bind - ns-B IP [ OK ]
TEST: ping out, vrf device + vrf address bind - ns-B IP [ OK ]
TEST: ping out, VRF bind - ns-B loopback IP [ OK ]
TEST: ping out, device bind - ns-B loopback IP [ OK ]
TEST: ping out, vrf device + dev address bind - ns-B loopback IP [ OK ]
TEST: ping out, vrf device + vrf address bind - ns-B loopback IP [ OK ]
TEST: ping in - ns-A IP [ OK ]
TEST: ping in - VRF IP [ OK ]
TEST: ping local, VRF bind - ns-A IP [ OK ]
TEST: ping local, VRF bind - VRF IP [ OK ]
TEST: ping local, VRF bind - loopback [ OK ]
TEST: ping local, device bind - ns-A IP [ OK ]
TEST: ping local, device bind - VRF IP [ OK ]
TEST: ping local, device bind - loopback [ OK ]
TEST: ping out, vrf bind, blocked by rule - ns-B loopback IP [ OK ]
TEST: ping out, device bind, blocked by rule - ns-B loopback IP [ OK ]
TEST: ping in, blocked by rule - ns-A loopback IP [ OK ]
TEST: ping out, vrf bind, unreachable route - ns-B loopback IP [ OK ]
TEST: ping out, device bind, unreachable route - ns-B loopback IP [ OK ]
TEST: ping in, unreachable route - ns-A loopback IP [ OK ]
###########################################################################
The fix also works on the Lenovo IdeaPad 3 laptop with Ubuntu 22.10, but on the
Lenovo desktop with AlmaLinux 8.8, which previously passed all 888 tests, the
modified script now fails:
[root@...mtodorov net]# grep FAIL ../fcnal-test-4.log
TEST: ping local, VRF bind - ns-A IP [FAIL]
TEST: ping local, VRF bind - VRF IP [FAIL]
TEST: ping local, device bind - ns-A IP [FAIL]
TEST: ping local, VRF bind - ns-A IP [FAIL]
TEST: ping local, VRF bind - VRF IP [FAIL]
TEST: ping local, device bind - ns-A IP [FAIL]
[root@...mtodorov net]#
The kernel is a recent one:
[root@...mtodorov net]# uname -rms
Linux 6.4.0-rc5-testnet-00003-g5b23878f7ed9 x86_64
[root@...mtodorov net]#
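For reference, the double -I form suggested above can be sketched as a small
helper. This is only an illustration: ping_bound_cmd is a hypothetical name that
does not exist in fcnal-test.sh, and the device name and address in the usage
line are placeholders, not values from the test script. It only prints the
command line, so it can be inspected before anything is run.

```shell
#!/bin/sh
# Hypothetical helper (not part of fcnal-test.sh): print the ping command
# line with -I given twice, once for the device binding and once for the
# source IPv4 address.
ping_bound_cmd() {
    dev=$1   # device or VRF to bind to, e.g. ${VRF} or ${NSA_DEV}
    addr=$2  # IPv4 address, used both as source and as destination
    printf 'ping -c1 -w1 -I %s -I %s %s\n' "$dev" "$addr" "$addr"
}

# Placeholder values for illustration only.
ping_bound_cmd vrf-example 192.0.2.1
```

Whether a given ping binary binds the socket with this form still depends on
the iputils version, which may explain the differing AlmaLinux result.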
>> However, I have a question:
>>
>> In the ping + "With VRF" section, the tests with net.ipv4.raw_l3mdev_accept=1
>> are repeated twice, while "No VRF" section has the versions:
>>
>> SYSCTL: net.ipv4.raw_l3mdev_accept=0
>>
>> and
>>
>> SYSCTL: net.ipv4.raw_l3mdev_accept=1
>>
>> The same happens with the IPv6 ping tests.
>>
>> In that case, it could be that we have only 2 actual FAIL cases,
>> because the error is reported twice.
>>
>> Is this intentional?
>
> I don't know why the non-VRF tests are run once with raw_l3mdev_accept=0
> and once with raw_l3mdev_accept=1. Unless I'm missing something, this
> option shouldn't affect non-VRF users. Maybe the objective is to make
> sure that it really doesn't affect them. David certainly knows better.
The problem appears to be that the non-VRF tests are run with
raw_l3mdev_accept={0|1}, while the VRF tests are run with
raw_l3mdev_accept={1|1} ...
I will try to fix that, but I am not sure of the semantics either.
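A rough sketch of what I have in mind, assuming the VRF tests are meant to
cover both values (which is exactly the part I am unsure about). The function
names here are hypothetical and are not the ones used in fcnal-test.sh; the
stand-in test function only prints what it would do.

```shell
#!/bin/sh
# Hypothetical sketch, not taken from fcnal-test.sh: run one test group
# once per raw_l3mdev_accept value, so that 0 and 1 are both covered.
run_with_each_l3mdev_accept() {
    test_group=$1   # name of a shell function that runs the tests
    for val in 0 1; do
        echo "SYSCTL: net.ipv4.raw_l3mdev_accept=${val}"
        # A real run would apply the value before the tests, e.g. with
        # sysctl -q -w net.ipv4.raw_l3mdev_accept=${val} (requires root).
        "$test_group" "$val"
    done
}

# Stand-in for the real VRF ping tests; it only reports what would be run.
demo_vrf_ping_tests() {
    echo "would run VRF ping tests with raw_l3mdev_accept=$1"
}

run_with_each_l3mdev_accept demo_vrf_ping_tests
```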
Regards,
Mirsad