Date:   Tue, 14 Mar 2023 14:52:07 +0100
From:   Mirsad Todorovac <mirsad.todorovac@....unizg.hr>
To:     netdev@...r.kernel.org, linux-kselftest@...r.kernel.org
Cc:     "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        Paolo Abeni <pabeni@...hat.com>, Shuah Khan <shuah@...nel.org>,
        Eric Dumazet <edumazet@...gle.com>,
        linux-kernel@...r.kernel.org
Subject: Re: BUG: selftest/net/tun: Hang in unregister_netdevice

On 3/14/23 12:45, Mirsad Todorovac wrote:
> Hi, all!
> 
> After running tools/testing/selftests/net/tun, there seems to be some kind of hang
> in test "FAIL  tun.reattach_delete_close" or "FAIL  tun.reattach_close_delete".
> 
> Two tests exit by timeout, but the processes left are unkillable, even with kill -9 PID:
> 
> [root@...mtodorov linux_torvalds]# ps -ef | grep tun
> root        1140       1  0 12:16 ?        00:00:00 /bin/bash /usr/sbin/ksmtuned
> root        1333       1  0 12:16 ?        00:00:01 /usr/libexec/platform-python -Es /usr/sbin/tuned -l -P
> root        3930    2309  0 12:20 pts/1    00:00:00 tools/testing/selftests/net/tun
> root        3952    2309  0 12:21 pts/1    00:00:00 tools/testing/selftests/net/tun
> root        4056    3765  0 12:25 pts/1    00:00:00 grep --color=auto tun
> [root@...mtodorov linux_torvalds]# kill -9 3930 3952
> [root@...mtodorov linux_torvalds]# ps -ef | grep tun
> root        1140       1  0 12:16 ?        00:00:00 /bin/bash /usr/sbin/ksmtuned
> root        1333       1  0 12:16 ?        00:00:01 /usr/libexec/platform-python -Es /usr/sbin/tuned -l -P
> root        3930    2309  0 12:20 pts/1    00:00:00 tools/testing/selftests/net/tun
> root        3952    2309  0 12:21 pts/1    00:00:00 tools/testing/selftests/net/tun
> root        4060    3765  0 12:25 pts/1    00:00:00 grep --color=auto tun
> [root@...mtodorov linux_torvalds]#
> 
> The kernel seems to be stuck in some loop, and filling the log with the
> following messages until reboot, where it is also waiting very long on the
> situation to timeout, which apparently never happens.
> 
> Mar 14 11:54:09 pc-mtodorov kernel: unregister_netdevice: waiting for tap0 to become free. Usage count = 3
> Mar 14 11:54:19 pc-mtodorov kernel: unregister_netdevice: waiting for tap0 to become free. Usage count = 3
> Mar 14 11:54:29 pc-mtodorov kernel: unregister_netdevice: waiting for tap0 to become free. Usage count = 3
> Mar 14 11:54:40 pc-mtodorov kernel: unregister_netdevice: waiting for tap0 to become free. Usage count = 3
> Mar 14 11:54:50 pc-mtodorov kernel: unregister_netdevice: waiting for tap0 to become free. Usage count = 3
> 
> The platform is kernel 6.3.0-rc2 on AlmaLinux 8.7 and a LENOVO_MT_10TX_BU_Lenovo_FM_V530S-07ICB
> (lshw output attached).
> 
> The .config is here:
> 
> https://domac.alu.hr/~mtodorov/linux/selftests/net-tun/config-6.3.0-rc2-mg-andy-devres-00006-gfc89d7fb499b
> 
> Basically, it is a vanilla Torvalds tree kernel with MGLRU, KMEMLEAK, and CONFIG_DEBUG_KOBJECT enabled.
> And devres patch.
> 
> Please find the strace of the net/tun run attached.
> 
> I am available for additional diagnostics.

Hi, again!

While waiting for a reply, I wondered how a vanilla kernel would fare in
this test, considering that I've been testing a number of patches lately.

I did a fresh git clone of the repo, and whoa.

Surprisingly, with CONFIG_DEBUG_KOBJECT turned off, the test passes:

[root@...mtodorov linux_torvalds]# tools/testing/selftests/net/tun
TAP version 13
1..5
# Starting 5 tests from 1 test cases.
#  RUN           tun.delete_detach_close ...
#            OK  tun.delete_detach_close
ok 1 tun.delete_detach_close
#  RUN           tun.detach_delete_close ...
#            OK  tun.detach_delete_close
ok 2 tun.detach_delete_close
#  RUN           tun.detach_close_delete ...
#            OK  tun.detach_close_delete
ok 3 tun.detach_close_delete
#  RUN           tun.reattach_delete_close ...
#            OK  tun.reattach_delete_close
ok 4 tun.reattach_delete_close
#  RUN           tun.reattach_close_delete ...
#            OK  tun.reattach_close_delete
ok 5 tun.reattach_close_delete
# PASSED: 5 / 5 tests passed.
# Totals: pass:5 fail:0 xfail:0 xpass:0 skip:0 error:0
[root@...mtodorov linux_torvalds]#
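
For comparing runs across kernel builds, the TAP result lines above can be checked mechanically. What follows is only a minimal sketch, not part of the selftest itself: the function name parse_tap_results is hypothetical, and it assumes result lines of the form "ok N name" / "not ok N name" as in the output above.

```python
import re

# Hypothetical helper: collect passed/failed test names from kselftest TAP output.
RESULT_RE = re.compile(r"^(ok|not ok) (\d+) (\S+)")

def parse_tap_results(text):
    """Return (passed, failed) lists of test names found in TAP text."""
    passed, failed = [], []
    for line in text.splitlines():
        m = RESULT_RE.match(line.strip())
        if m:
            # "# ..." diagnostic lines never match, only real result lines do.
            (passed if m.group(1) == "ok" else failed).append(m.group(3))
    return passed, failed

sample = """\
TAP version 13
1..5
ok 1 tun.delete_detach_close
ok 2 tun.detach_delete_close
ok 3 tun.detach_close_delete
ok 4 tun.reattach_delete_close
ok 5 tun.reattach_close_delete
"""

passed, failed = parse_tap_results(sample)
print(len(passed), "passed;", len(failed), "failed")
```

Feeding it the saved output of a failing run would list the reattach tests under "failed", assuming they are reported as "not ok" lines.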

So there are no hanging, unkillable processes this time.
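
To double-check that the kernel is not looping on the device any more, the log can be scanned for the "unregister_netdevice: waiting for ..." message quoted earlier. This is a rough sketch under the assumption that the message format is exactly as in the quoted log lines; the helper name stuck_devices is made up for illustration.

```python
import re

# Message format assumed from the log lines quoted earlier in this thread.
WAIT_RE = re.compile(
    r"unregister_netdevice: waiting for (\S+) to become free\. "
    r"Usage count = (\d+)"
)

def stuck_devices(log_text):
    """Map each device name to the list of usage counts reported for it."""
    seen = {}
    for line in log_text.splitlines():
        m = WAIT_RE.search(line)
        if m:
            seen.setdefault(m.group(1), []).append(int(m.group(2)))
    return seen

sample = (
    "Mar 14 11:54:09 pc-mtodorov kernel: unregister_netdevice: "
    "waiting for tap0 to become free. Usage count = 3\n"
    "Mar 14 11:54:19 pc-mtodorov kernel: unregister_netdevice: "
    "waiting for tap0 to become free. Usage count = 3\n"
)

print(stuck_devices(sample))
```

An empty result for the log of the new run would be consistent with the test having completed cleanly.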

If you think it is worth exploring the lockup that occurs with
CONFIG_DEBUG_KOBJECT=y, I will rebuild once more with these options
turned on, to clear up any doubts.
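
When comparing the two builds, it may help to confirm which options each .config actually has set. The following is a small sketch, assuming the usual .config text format ("CONFIG_FOO=y", or "# CONFIG_FOO is not set" for disabled options); config_enabled is a hypothetical helper and the sample text is made up, not taken from the real config.

```python
def config_enabled(config_text, option):
    """Return True if `option` is set to y or m in kernel .config text."""
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith("#"):
            # Covers "# CONFIG_FOO is not set" and ordinary comments.
            continue
        key, sep, value = line.partition("=")
        if sep and key == option:
            return value in ("y", "m")
    return False

# Made-up sample for illustration only.
sample = "CONFIG_KMEMLEAK=y\n# CONFIG_DEBUG_KOBJECT is not set\n"

print(config_enabled(sample, "CONFIG_DEBUG_KOBJECT"))
print(config_enabled(sample, "CONFIG_KMEMLEAK"))
```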

Until later.

Best regards,
Mirsad

-- 
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu

System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia
