lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20220430144902.GB27004@xsang-OptiPlex-9020>
Date:   Sat, 30 Apr 2022 22:49:02 +0800
From:   kernel test robot <oliver.sang@...el.com>
To:     Jens Axboe <axboe@...nel.dk>
Cc:     Ammar Faizi <ammarfaizi2@...weeb.org>, lkp@...ts.01.org,
        lkp@...el.com, LKML <linux-kernel@...r.kernel.org>
Subject: [net]  02dcea6024: INFO:task_blocked_for_more_than#seconds


(please be noted we reported
"[net]  02dcea6024:  netperf.Throughput_tps -9.1% regression"
on https://lore.kernel.org/all/20220429062049.GA8992@xsang-OptiPlex-9020/
yesterday, but now we found a func issue on fsmark tests.
FYI)

Greeting,

FYI, we noticed the following commit (built with gcc-11):

commit: 02dcea6024d411cda23bd77e5e604f94a32c332d ("net: allow sk_prot->release_cb() without sock lock held")
https://github.com/ammarfaizi2/linux-block axboe/linux-block/sock-nolock.2

in testcase: fsmark
version: fsmark-x86_64-698ee57-1_20220410
with following parameters:

	iterations: 1x
	nr_threads: 1t
	disk: 1HDD
	fs: xfs
	fs2: nfsv4
	filesize: 4K
	test_size: 50M
	sync_method: fsyncBeforeClose
	nr_files_per_directory: 1fpd
	cpufreq_governor: performance
	ucode: 0xd000331

test-description: The fsmark is a file system benchmark to test synchronous write workloads, for example, mail servers workload.
test-url: https://sourceforge.net/projects/fsmark/


on test machine: 128 threads 2 sockets Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz with 128G memory

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):



If you fix the issue, kindly add following tag
Reported-by: kernel test robot <oliver.sang@...el.com>


[  991.054690][  T722] INFO: task nfsd:4136 blocked for more than 491 seconds.
[  991.061875][  T722]       Not tainted 5.18.0-rc4-00072-g02dcea6024d4 #2
[  991.068688][  T722] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  991.077381][  T722] task:nfsd            state:D stack:    0 pid: 4136 ppid:     2 flags:0x00004000
[  991.086602][  T722] Call Trace:
[  991.089925][  T722]  <TASK>
[ 991.092899][ T722] __schedule (kbuild/src/x86_64/kernel/sched/core.c:5073 kbuild/src/x86_64/kernel/sched/core.c:6388) 
[ 991.097265][ T722] schedule (kbuild/src/x86_64/arch/x86/include/asm/preempt.h:85 (discriminator 1) kbuild/src/x86_64/kernel/sched/core.c:6461 (discriminator 1)) 
[ 991.101278][ T722] __lock_sock (kbuild/src/x86_64/include/linux/spinlock.h:354 kbuild/src/x86_64/net/core/sock.c:2771) 
[ 991.105553][ T722] ? prepare_to_wait_exclusive (kbuild/src/x86_64/kernel/sched/wait.c:414) 
[ 991.111222][ T722] lock_sock_nested (kbuild/src/x86_64/net/core/sock.c:3317) 
[ 991.115944][ T722] tcp_recvmsg (kbuild/src/x86_64/net/ipv4/tcp.c:2574) 
[ 991.120310][ T722] inet6_recvmsg (kbuild/src/x86_64/net/ipv6/af_inet6.c:674 (discriminator 8)) 
[ 991.124854][ T722] svc_tcp_read_marker+0xa6/0x180 
[ 991.130949][ T722] svc_tcp_recvfrom (kbuild/src/x86_64/net/sunrpc/svcsock.c:994) 
[ 991.135746][ T722] ? del_timer_sync (kbuild/src/x86_64/kernel/time/timer.c:1384) 
[ 991.140456][ T722] ? schedule_timeout (kbuild/src/x86_64/kernel/time/timer.c:1890) 
[ 991.145422][ T722] svc_handle_xprt (kbuild/src/x86_64/net/sunrpc/svc_xprt.c:846) 
[ 991.150212][ T722] svc_recv (kbuild/src/x86_64/net/sunrpc/svc_xprt.c:884) 
[ 991.154311][ T722] ? nfsd_shutdown_threads (kbuild/src/x86_64/fs/nfsd/nfssvc.c:932) nfsd
[ 991.160251][ T722] nfsd (kbuild/src/x86_64/fs/nfsd/nfssvc.c:973 (discriminator 1)) nfsd
[ 991.164629][ T722] kthread (kbuild/src/x86_64/kernel/kthread.c:376) 
[ 991.168651][ T722] ? kthread_complete_and_exit (kbuild/src/x86_64/kernel/kthread.c:331) 
[ 991.174316][ T722] ret_from_fork (kbuild/src/x86_64/arch/x86/entry/entry_64.S:298) 
[  991.178782][  T722]  </TASK>
[ 1482.574672][  T722] INFO: task nfsd:4136 blocked for more than 983 seconds.
[ 1482.581894][  T722]       Not tainted 5.18.0-rc4-00072-g02dcea6024d4 #2
[ 1482.588737][  T722] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1482.597456][  T722] task:nfsd            state:D stack:    0 pid: 4136 ppid:     2 flags:0x00004000
[ 1482.606714][  T722] Call Trace:
[ 1482.610068][  T722]  <TASK>
[ 1482.613066][ T722] __schedule (kbuild/src/x86_64/kernel/sched/core.c:5073 kbuild/src/x86_64/kernel/sched/core.c:6388) 
[ 1482.617455][ T722] schedule (kbuild/src/x86_64/arch/x86/include/asm/preempt.h:85 (discriminator 1) kbuild/src/x86_64/kernel/sched/core.c:6461 (discriminator 1)) 
[ 1482.621488][ T722] __lock_sock (kbuild/src/x86_64/include/linux/spinlock.h:354 kbuild/src/x86_64/net/core/sock.c:2771) 
[ 1482.625796][ T722] ? prepare_to_wait_exclusive (kbuild/src/x86_64/kernel/sched/wait.c:414) 
[ 1482.631482][ T722] lock_sock_nested (kbuild/src/x86_64/net/core/sock.c:3317) 
[ 1482.636203][ T722] tcp_recvmsg (kbuild/src/x86_64/net/ipv4/tcp.c:2574) 
[ 1482.640590][ T722] inet6_recvmsg (kbuild/src/x86_64/net/ipv6/af_inet6.c:674 (discriminator 8)) 
[ 1482.645145][ T722] svc_tcp_read_marker+0xa6/0x180 
[ 1482.651256][ T722] svc_tcp_recvfrom (kbuild/src/x86_64/net/sunrpc/svcsock.c:994) 
[ 1482.656065][ T722] ? del_timer_sync (kbuild/src/x86_64/kernel/time/timer.c:1384) 
[ 1482.660796][ T722] ? schedule_timeout (kbuild/src/x86_64/kernel/time/timer.c:1890) 
[ 1482.665790][ T722] svc_handle_xprt (kbuild/src/x86_64/net/sunrpc/svc_xprt.c:846) 
[ 1482.670602][ T722] svc_recv (kbuild/src/x86_64/net/sunrpc/svc_xprt.c:884) 
[ 1482.674717][ T722] ? nfsd_shutdown_threads (kbuild/src/x86_64/fs/nfsd/nfssvc.c:932) nfsd
[ 1482.680676][ T722] nfsd (kbuild/src/x86_64/fs/nfsd/nfssvc.c:973 (discriminator 1)) nfsd
[ 1482.685049][ T722] kthread (kbuild/src/x86_64/kernel/kthread.c:376) 
[ 1482.689067][ T722] ? kthread_complete_and_exit (kbuild/src/x86_64/kernel/kthread.c:331) 
[ 1482.694735][ T722] ret_from_fork (kbuild/src/x86_64/arch/x86/entry/entry_64.S:298) 
[ 1482.699180][  T722]  </TASK>


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        sudo bin/lkp install job.yaml           # job file is attached in this email
        bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
        sudo bin/lkp run generated-yaml-file

        # if come across any failure that blocks the test,
        # please remove ~/.lkp and /lkp dir to run from a clean state.



-- 
0-DAY CI Kernel Test Service
https://01.org/lkp



View attachment "config-5.18.0-rc4-00072-g02dcea6024d4" of type "text/plain" (162679 bytes)

View attachment "job-script" of type "text/plain" (8644 bytes)

Download attachment "dmesg.xz" of type "application/x-xz" (35312 bytes)

View attachment "job.yaml" of type "text/plain" (5783 bytes)

View attachment "reproduce" of type "text/plain" (885 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ