netdev - Re: [PATCH net] selftests/net: big_tcp: longer netperf session on slow machines

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87tt8sfmlk.fsf@nvidia.com>
Date: Mon, 17 Feb 2025 18:50:08 +0100
From: Petr Machata <petrm@...dia.com>
To: Pablo Martin Medrano <pablmart@...hat.com>
CC: <netdev@...r.kernel.org>, "David S . Miller" <davem@...emloft.net>, "Eric
 Dumazet" <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>, "Paolo
 Abeni" <pabeni@...hat.com>, Simon Horman <horms@...nel.org>, Shuah Khan
	<shuah@...nel.org>
Subject: Re: [PATCH net] selftests/net: big_tcp: longer netperf session on
 slow machines


Pablo Martin Medrano <pablmart@...hat.com> writes:

> After debugging the following output for big_tcp.sh on a board:
>
> CLI GSO | GW GRO | GW GSO | SER GRO
> on        on       on       on      : [PASS]
> on        off      on       off     : [PASS]
> off       on       on       on      : [FAIL_on_link1]
> on        on       off      on      : [FAIL_on_link1]
>
> Davide Caratti found that by default the test duration 1s is too short
> in slow systems to reach the correct cwd size necessary for tcp/ip to
> generate at least one packet bigger than 65536 to hit the iptables match
> on length rule the test uses.
>
> This skips (with xfail) the aforementioned failing combinations when
> KSFT_MACHINE_SLOW is set.
> ---
>  tools/testing/selftests/net/big_tcp.sh | 23 +++++++++++++++--------
>  1 file changed, 15 insertions(+), 8 deletions(-)
>
> diff --git a/tools/testing/selftests/net/big_tcp.sh b/tools/testing/selftests/net/big_tcp.sh
> index 2db9d15cd45f..e613dc3d84ad 100755
> --- a/tools/testing/selftests/net/big_tcp.sh
> +++ b/tools/testing/selftests/net/big_tcp.sh
> @@ -21,8 +21,7 @@ CLIENT_GW6="2001:db8:1::2"
>  MAX_SIZE=128000
>  CHK_SIZE=65535
>  
> -# Kselftest framework requirement - SKIP code is 4.
> -ksft_skip=4
> +source lib.sh
>  
>  setup() {
>  	ip netns add $CLIENT_NS
> @@ -157,12 +156,20 @@ do_test() {
>  }
>  
>  testup() {
> -	echo "CLI GSO | GW GRO | GW GSO | SER GRO" && \
> -	do_test "on"  "on"  "on"  "on"  && \
> -	do_test "on"  "off" "on"  "off" && \
> -	do_test "off" "on"  "on"  "on"  && \
> -	do_test "on"  "on"  "off" "on"  && \
> -	do_test "off" "on"  "off" "on"
> +	echo "CLI GSO | GW GRO | GW GSO | SER GRO"
> +	input_by_test=(
> +	" on  on  on  on"
> +	" on off  on off"
> +	"off  on  on  on"
> +	" on  on off  on"
> +	"off  on off  on"
> +	)
> +	for test_values in "${input_by_test[@]}"; do
> +		do_test ${test_values[0]}
> +		xfail_on_slow check_err $? "test failed"
> +		# check_err sets $RET with $ksft_xfail or $ksft_fail (or 0)
> +		test $RET = 0 || return $RET

This bails out on first failure though, whereas previously it would run
all the tests. Is that intentional?

Looking at the test, it looks like do_test itself could be converted to
lib.sh as follows (sorry, this is a cut-n-paste from the terminal, so
tabs are gone):

@@ -134,3 +133,4 @@ do_test() {
         local ser_gro=$4
-        local ret="PASS"
+
+        RET=0

@@ -145,7 +145,8 @@ do_test() {

-        if check_counter link1 $ROUTER_NS; then
-                check_counter link3 $SERVER_NS || ret="FAIL_on_link3"
-        else
-                ret="FAIL_on_link1"
-        fi
+        check_counter link1 $ROUTER_NS
+        false
+        check_err $? "fail on link1"
+
+        check_counter link3 $SERVER_NS
+        check_err $? "fail on link3"

@@ -153,5 +154,6 @@ do_test() {
         stop_counter link3 $SERVER_NS
-        printf "%-9s %-8s %-8s %-8s: [%s]\n" \
-                $cli_tso $gw_gro $gw_tso $ser_gro $ret
-        test $ret = "PASS"
+
+        log_test "$(printf "%-9s %-8s %-8s %-8s" \
+                            $cli_tso $gw_gro $gw_tso $ser_gro)"
+        :
 }
@@ -159,3 +161,3 @@ do_test() {
 testup() {
-        echo "CLI GSO | GW GRO | GW GSO | SER GRO" && \
+        echo "      CLI GSO | GW GRO | GW GSO | SER GRO" && \
         do_test "on"  "on"  "on"  "on"  && \
@@ -178,2 +177,3 @@ fi
 trap cleanup EXIT
+xfail_on_slow
 setup && echo "Testing for BIG TCP:" && \
@@ -181,2 +181,2 @@ NF=4 testup && echo "***v4 Tests Done***" && \
 NF=6 testup && echo "***v6 Tests Done***"
-exit $?
+exit $EXIT_STATUS

That way you only really touch the bits that do the actual checks to
port them over to the log_test framework. xfail_on_slow() is usually
called on a per-check basis, but if anything in the test can fail, I
think it's fair to just call it like I show so that it toggles the
condition globally.

Then I'm getting this for slow machine with an injected failure:

bash-5.2# KSFT_MACHINE_SLOW=yes ./big_tcp.sh                                                        │
Error: Failed to load TC action module.                                                             │
We have an error talking to the kernel                                                              │
Error: Failed to load TC action module.                                                             │
We have an error talking to the kernel                                                              │
Testing for BIG TCP:                                                                                │
      CLI GSO | GW GRO | GW GSO | SER GRO                                                           │
TEST: on        on       on       on                                [XFAIL]                         │
        fail on link1                                                                               │
TEST: on        off      on       off                               [XFAIL]                         │
        fail on link1                                                                               │
TEST: off       on       on       on                                [XFAIL]                         │
        fail on link1                                                                               │
TEST: on        on       off      on                                [XFAIL]                         │
        fail on link1                                                                               │
TEST: off       on       off      on                                [XFAIL]                         │
        fail on link1                                                                               │
***v4 Tests Done***                                                                                 │
      CLI GSO | GW GRO | GW GSO | SER GRO                                                           │
TEST: on        on       on       on                                [XFAIL]                         │
        fail on link1                                                                               │
TEST: on        off      on       off                               [XFAIL]                         │
        fail on link1                                                                               │
TEST: off       on       on       on                                [XFAIL]                         │
        fail on link1                                                                               │
TEST: on        on       off      on                                [XFAIL]                         │
        fail on link1                                                                               │
TEST: off       on       off      on                                [XFAIL]                         │
        fail on link1                                                                               │
***v6 Tests Done***                                                                                 │
bash-5.2# echo $?                                                                                   │
0                                                                                                   │

... and for non-KSFT_MACHINE_SLOW, I get FAILs with $? of 1, i.e. what
we are after.

> +	done
>  }
>  
>  if ! netperf -V &> /dev/null; then