lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20220503102803.GB22788@xsang-OptiPlex-9020>
Date:   Tue, 3 May 2022 18:28:03 +0800
From:   kernel test robot <oliver.sang@...el.com>
To:     Baokun Li <libaokun1@...wei.com>
Cc:     0day robot <lkp@...el.com>, Hulk Robot <hulkci@...wei.com>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
        linux-ext4@...r.kernel.org, tytso@....edu,
        adilger.kernel@...ger.ca, jack@...e.cz, yi.zhang@...wei.com,
        yebin10@...wei.com, yukuai3@...wei.com, libaokun1@...wei.com,
        stable@...r.kernel.org
Subject: [ext4]  bc88befa0c: INFO:task_blocked_for_more_than#seconds



Greeting,

FYI, we noticed the following commit (built with gcc-11):

commit: bc88befa0c58cf6a97b141854fba66eb26c1a8f4 ("[PATCH v2] ext4: fix race condition between ext4_write and ext4_convert_inline_data")
url: https://github.com/intel-lab-lkp/linux/commits/Baokun-Li/ext4-fix-race-condition-between-ext4_write-and-ext4_convert_inline_data/20220426-220149
base: https://git.kernel.org/cgit/linux/kernel/git/tytso/ext4.git dev
patch link: https://lore.kernel.org/linux-ext4/20220426140658.1046700-1-libaokun1@huawei.com

in testcase: xfstests
version: xfstests-x86_64-46e1b83-1_20220414
with following parameters:

	nr_pmem: 4
	fs: ext4
	test: generic-dax
	ucode: 0x700001c

test-description: xfstests is a regression test suite for xfs and other files ystems.
test-url: git://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git


on test machine: 16 threads 1 sockets Intel(R) Xeon(R) CPU D-1541 @ 2.10GHz with 48G memory

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):



If you fix the issue, kindly add following tag
Reported-by: kernel test robot <oliver.sang@...el.com>


[  993.306815][  T112] INFO: task xfs_io:2214 blocked for more than 491 seconds.
[  993.315644][  T112]       Not tainted 5.17.0-rc5-00046-gbc88befa0c58 #1
[  993.323888][  T112] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  993.334067][  T112] task:xfs_io          state:D stack:    0 pid: 2214 ppid:  2213 flags:0x00000000
[  993.344788][  T112] Call Trace:
[  993.349603][  T112]  <TASK>
[ 993.354064][ T112] __schedule (kernel/sched/core.c:4995 kernel/sched/core.c:6304) 
[ 993.359989][ T112] ? io_schedule_timeout (kernel/sched/core.c:6183) 
[ 993.366775][ T112] ? _raw_spin_lock_irq (arch/x86/include/asm/atomic.h:202 include/linux/atomic/atomic-instrumented.h:543 include/asm-generic/qspinlock.h:82 include/linux/spinlock.h:185 include/linux/spinlock_api_smp.h:120 kernel/locking/spinlock.c:170) 
[ 993.373317][ T112] schedule (include/linux/instrumented.h:71 (discriminator 1) include/asm-generic/bitops/instrumented-non-atomic.h:134 (discriminator 1) include/linux/thread_info.h:118 (discriminator 1) include/linux/sched.h:2127 (discriminator 1) kernel/sched/core.c:6379 (discriminator 1)) 
[ 993.378880][ T112] rwsem_down_write_slowpath (arch/x86/include/asm/current.h:15 kernel/locking/rwsem.c:1144) 
[ 993.386003][ T112] ? check_preempt_curr (arch/x86/include/asm/bitops.h:207 (discriminator 1) include/asm-generic/bitops/instrumented-non-atomic.h:135 (discriminator 1) include/linux/thread_info.h:118 (discriminator 1) include/linux/sched.h:1991 (discriminator 1) include/linux/sched.h:2006 (discriminator 1) kernel/sched/core.c:2135 (discriminator 1)) 
[ 993.392684][ T112] ? attach_entity_cfs_rq (arch/x86/include/asm/atomic64_64.h:46 include/linux/atomic/atomic-long.h:53 include/linux/atomic/atomic-instrumented.h:1295 kernel/sched/fair.c:3332 kernel/sched/fair.c:11316) 
[ 993.399523][ T112] ? up_read (kernel/locking/rwsem.c:1053) 
[ 993.405085][ T112] ? __cond_resched (kernel/sched/core.c:8200) 
[ 993.411216][ T112] ? down_write (arch/x86/include/asm/atomic64_64.h:34 include/linux/atomic/atomic-long.h:41 include/linux/atomic/atomic-instrumented.h:1280 kernel/locking/rwsem.c:138 kernel/locking/rwsem.c:255 kernel/locking/rwsem.c:1258 kernel/locking/rwsem.c:1268 kernel/locking/rwsem.c:1515) 
[ 993.417072][ T112] ? ext4_convert_inline_data (arch/x86/include/asm/bitops.h:75 include/asm-generic/bitops/instrumented-atomic.h:42 fs/ext4/ext4.h:1933 fs/ext4/inline.c:2007) 
[ 993.424240][ T112] down_write (kernel/locking/rwsem.c:1516) 
[ 993.430013][ T112] ? down_write_killable (kernel/locking/rwsem.c:1512) 
[ 993.436732][ T112] ext4_fallocate (include/linux/instrumented.h:71 (discriminator 12) include/asm-generic/bitops/instrumented-non-atomic.h:134 (discriminator 12) fs/ext4/ext4.h:1918 (discriminator 12) fs/ext4/extents.c:4730 (discriminator 12)) 
[ 993.442821][ T112] vfs_fallocate (fs/open.c:308) 
[ 993.448801][ T112] __x64_sys_fallocate (include/linux/file.h:45 fs/open.c:332 fs/open.c:339 fs/open.c:337 fs/open.c:337) 
[ 993.455223][ T112] do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80) 
[ 993.461038][ T112] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:113) 
[  993.468293][  T112] RIP: 0033:0x7f391abb246d
[  993.474055][  T112] RSP: 002b:00007ffe20742f10 EFLAGS: 00000293 ORIG_RAX: 000000000000011d
[  993.483796][  T112] RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007f391abb246d
[  993.493068][  T112] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000004
[  993.502334][  T112] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[  993.511582][  T112] R10: 0000000000100000 R11: 0000000000000293 R12: 0000000000000000
[  993.520862][  T112] R13: 0000000000100000 R14: 000000000252bef0 R15: 0000000000000001
[  993.530120][  T112]  </TASK>
[ 1484.826867][  T112] INFO: task xfs_io:2214 blocked for more than 983 seconds.
[ 1484.835749][  T112]       Not tainted 5.17.0-rc5-00046-gbc88befa0c58 #1
[ 1484.843764][  T112] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1484.853792][  T112] task:xfs_io          state:D stack:    0 pid: 2214 ppid:  2213 flags:0x00000000
[ 1484.864324][  T112] Call Trace:
[ 1484.868910][  T112]  <TASK>
[ 1484.873163][ T112] __schedule (kernel/sched/core.c:4995 kernel/sched/core.c:6304) 
[ 1484.878929][ T112] ? io_schedule_timeout (kernel/sched/core.c:6183) 
[ 1484.885546][ T112] ? _raw_spin_lock_irq (arch/x86/include/asm/atomic.h:202 include/linux/atomic/atomic-instrumented.h:543 include/asm-generic/qspinlock.h:82 include/linux/spinlock.h:185 include/linux/spinlock_api_smp.h:120 kernel/locking/spinlock.c:170) 
[ 1484.892010][ T112] schedule (include/linux/instrumented.h:71 (discriminator 1) include/asm-generic/bitops/instrumented-non-atomic.h:134 (discriminator 1) include/linux/thread_info.h:118 (discriminator 1) include/linux/sched.h:2127 (discriminator 1) kernel/sched/core.c:6379 (discriminator 1)) 
[ 1484.897384][ T112] rwsem_down_write_slowpath (arch/x86/include/asm/current.h:15 kernel/locking/rwsem.c:1144) 
[ 1484.904336][ T112] ? check_preempt_curr (arch/x86/include/asm/bitops.h:207 (discriminator 1) include/asm-generic/bitops/instrumented-non-atomic.h:135 (discriminator 1) include/linux/thread_info.h:118 (discriminator 1) include/linux/sched.h:1991 (discriminator 1) include/linux/sched.h:2006 (discriminator 1) kernel/sched/core.c:2135 (discriminator 1)) 
[ 1484.910819][ T112] ? attach_entity_cfs_rq (arch/x86/include/asm/atomic64_64.h:46 include/linux/atomic/atomic-long.h:53 include/linux/atomic/atomic-instrumented.h:1295 kernel/sched/fair.c:3332 kernel/sched/fair.c:11316) 
[ 1484.917512][ T112] ? up_read (kernel/locking/rwsem.c:1053) 
[ 1484.922830][ T112] ? __cond_resched (kernel/sched/core.c:8200) 
[ 1484.928787][ T112] ? down_write (arch/x86/include/asm/atomic64_64.h:34 include/linux/atomic/atomic-long.h:41 include/linux/atomic/atomic-instrumented.h:1280 kernel/locking/rwsem.c:138 kernel/locking/rwsem.c:255 kernel/locking/rwsem.c:1258 kernel/locking/rwsem.c:1268 kernel/locking/rwsem.c:1515) 
[ 1484.934453][ T112] ? ext4_convert_inline_data (arch/x86/include/asm/bitops.h:75 include/asm-generic/bitops/instrumented-atomic.h:42 fs/ext4/ext4.h:1933 fs/ext4/inline.c:2007) 
[ 1484.941387][ T112] down_write (kernel/locking/rwsem.c:1516) 
[ 1484.946908][ T112] ? down_write_killable (kernel/locking/rwsem.c:1512) 
[ 1484.953416][ T112] ext4_fallocate (include/linux/instrumented.h:71 (discriminator 12) include/asm-generic/bitops/instrumented-non-atomic.h:134 (discriminator 12) fs/ext4/ext4.h:1918 (discriminator 12) fs/ext4/extents.c:4730 (discriminator 12)) 
[ 1484.959250][ T112] vfs_fallocate (fs/open.c:308) 
[ 1484.964958][ T112] __x64_sys_fallocate (include/linux/file.h:45 fs/open.c:332 fs/open.c:339 fs/open.c:337 fs/open.c:337) 
[ 1484.971154][ T112] do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80) 
[ 1484.976720][ T112] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:113) 
[ 1484.983783][  T112] RIP: 0033:0x7f391abb246d
[ 1484.989416][  T112] RSP: 002b:00007ffe20742f10 EFLAGS: 00000293 ORIG_RAX: 000000000000011d
[ 1484.999026][  T112] RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007f391abb246d
[ 1485.008215][  T112] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000004
[ 1485.017408][  T112] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[ 1485.026622][  T112] R10: 0000000000100000 R11: 0000000000000293 R12: 0000000000000000
[ 1485.035801][  T112] R13: 0000000000100000 R14: 000000000252bef0 R15: 0000000000000001
[ 1485.045007][  T112]  </TASK>


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        sudo bin/lkp install job.yaml           # job file is attached in this email
        bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
        sudo bin/lkp run generated-yaml-file

        # if come across any failure that blocks the test,
        # please remove ~/.lkp and /lkp dir to run from a clean state.



-- 
0-DAY CI Kernel Test Service
https://01.org/lkp



View attachment "config-5.17.0-rc5-00046-gbc88befa0c58" of type "text/plain" (164819 bytes)

View attachment "job-script" of type "text/plain" (6114 bytes)

Download attachment "dmesg.xz" of type "application/x-xz" (27476 bytes)

View attachment "xfstests" of type "text/plain" (769 bytes)

View attachment "job.yaml" of type "text/plain" (4902 bytes)

View attachment "reproduce" of type "text/plain" (873 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ