Message-ID: <20220119021536.GA27703@xsang-OptiPlex-9020>
Date: Wed, 19 Jan 2022 10:15:36 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: Michael Kelley <mikelley@...rosoft.com>,
Nishanth Menon <nm@...com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Jason Gunthorpe <jgg@...dia.com>,
LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
lkp@...el.com, ltp@...ts.linux.it
Subject: [genirq/msi] [confidence: ] cd6cf06590: stack_segment:#[##]
(
Please note that we previously reported
"[genirq/msi] cf24208bdb: RIP:_raw_spin_lock_irqsave"
on https://lists.01.org/hyperkitty/list/lkp@lists.01.org/thread/E63WJZCXP327GCKAQGACKDK3ZQ7JVZCX/
when this commit was in:
commit: cf24208bdbd0a9e0238c2514a10c49a610e26ee5 ("genirq/msi: Convert storage to xarray")
https://git.kernel.org/cgit/linux/kernel/git/tglx/devel.git msi
We are reporting it again here as a reminder that the issue still exists on mainline.
)
Greetings,
FYI, we noticed the following commit (built with gcc-9):
commit: cd6cf06590b9792340dceaa285138777f3cc4d90 ("genirq/msi: Convert storage to xarray")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
in testcase: ltp
version: ltp-x86_64-14c1f76-1_20211218
with the following parameters:
test: numa
ucode: 0x42e
test-description: The LTP testsuite contains a collection of tools for testing the Linux kernel and related features.
test-url: http://linux-test-project.github.io/
on test machine: 48 threads 2 sockets Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz with 112G memory
caused the changes below (please refer to the attached dmesg/kmsg for the entire log/backtrace):
If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <oliver.sang@...el.com>
(
Note: the parent commit's dmesg also contains a similar report:
[ 15.367201][ T306] ==================================================================
[ 15.368136][ T306] BUG: KASAN: use-after-free in __pci_enable_msi_range+0x618/0x640
[ 15.368136][ T306] Read of size 2 at addr ffff888109acb464 by task kworker/0:2/306
but it is not followed by
"stack segment: 0000 [#1] "
or
"RIP: 0010:_raw_spin_lock_irqsave"
and the ltp tests can continue.
)
[ 20.017068][ T306] BUG: KASAN: use-after-free in __pci_enable_msi_range (drivers/pci/msi/msi.h:36 drivers/pci/msi/msi.c:474 drivers/pci/msi/msi.c:905)
[ 20.017068][ T306] Read of size 2 at addr ffff888f47a57854 by task kworker/0:2/306
[ 20.017068][ T306]
[ 20.017068][ T306] CPU: 0 PID: 306 Comm: kworker/0:2 Not tainted 5.16.0-rc5-00095-gcd6cf06590b9 #1
[ 20.017068][ T306] Hardware name: Intel Corporation S2600WP/S2600WP, BIOS SE5C600.86B.02.02.0002.122320131210 12/23/2013
[ 20.017068][ T306] Workqueue: events work_for_cpu_fn
[ 20.017068][ T306] Call Trace:
[ 20.017068][ T306] <TASK>
[ 20.017068][ T306] dump_stack_lvl (lib/dump_stack.c:107)
[ 20.017068][ T306] print_address_description+0x21/0x140
[ 20.017068][ T306] ? __pci_enable_msi_range (drivers/pci/msi/msi.h:36 drivers/pci/msi/msi.c:474 drivers/pci/msi/msi.c:905)
[ 20.017068][ T306] kasan_report.cold (mm/kasan/report.c:434 mm/kasan/report.c:450)
[ 20.017068][ T306] ? msi_domain_alloc_irqs_descs_locked (kernel/irq/msi.c:938)
[ 20.017068][ T306] ? __pci_enable_msi_range (drivers/pci/msi/msi.h:36 drivers/pci/msi/msi.c:474 drivers/pci/msi/msi.c:905)
[ 20.017068][ T306] __pci_enable_msi_range (drivers/pci/msi/msi.h:36 drivers/pci/msi/msi.c:474 drivers/pci/msi/msi.c:905)
[ 20.017068][ T306] pci_alloc_irq_vectors_affinity (drivers/pci/msi/msi.c:1029)
[ 20.017068][ T306] ? pci_enable_msix_range (drivers/pci/msi/msi.c:1008)
[ 20.017068][ T306] ? pci_address_to_pio+0x40/0x40
[ 20.017068][ T306] pcie_port_device_register (include/linux/pci.h:1882 drivers/pci/pcie/portdrv_core.c:107 drivers/pci/pcie/portdrv_core.c:178 drivers/pci/pcie/portdrv_core.c:353)
[ 20.017068][ T306] ? pcie_port_service_unregister (drivers/pci/pcie/portdrv_core.c:316)
[ 20.017068][ T306] ? dequeue_entity (kernel/sched/fair.c:4379)
[ 20.017068][ T306] ? _raw_read_unlock_irqrestore (kernel/locking/spinlock.c:161)
[ 20.017068][ T306] ? __switch_to (arch/x86/include/asm/bitops.h:55 include/asm-generic/bitops/instrumented-atomic.h:29 include/linux/thread_info.h:89 arch/x86/include/asm/fpu/sched.h:65 arch/x86/kernel/process_64.c:622)
[ 20.017068][ T306] ? pcie_portdrv_remove (drivers/pci/pcie/portdrv_pci.c:103)
[ 20.017068][ T306] pcie_portdrv_probe (drivers/pci/pcie/portdrv_pci.c:117)
[ 20.017068][ T306] ? pcie_portdrv_remove (drivers/pci/pcie/portdrv_pci.c:103)
[ 20.017068][ T306] local_pci_probe (drivers/pci/pci-driver.c:323)
[ 20.017068][ T306] ? pci_device_shutdown (drivers/pci/pci-driver.c:305)
[ 20.017068][ T306] work_for_cpu_fn (kernel/workqueue.c:5194)
[ 20.017068][ T306] process_one_work (arch/x86/include/asm/jump_label.h:27 include/linux/jump_label.h:212 include/trace/events/workqueue.h:108 kernel/workqueue.c:2303)
[ 20.017068][ T306] worker_thread (include/linux/list.h:284 kernel/workqueue.c:2358 kernel/workqueue.c:2450)
[ 20.017068][ T306] ? __kthread_parkme (arch/x86/include/asm/bitops.h:207 (discriminator 4) include/asm-generic/bitops/instrumented-non-atomic.h:135 (discriminator 4) kernel/kthread.c:249 (discriminator 4))
[ 20.017068][ T306] ? schedule (arch/x86/include/asm/bitops.h:207 (discriminator 1) include/asm-generic/bitops/instrumented-non-atomic.h:135 (discriminator 1) include/linux/thread_info.h:118 (discriminator 1) include/linux/sched.h:2120 (discriminator 1) kernel/sched/core.c:6328 (discriminator 1))
[ 20.017068][ T306] ? process_one_work (kernel/workqueue.c:2388)
[ 20.017068][ T306] ? process_one_work (kernel/workqueue.c:2388)
[ 20.017068][ T306] kthread (kernel/kthread.c:327)
[ 20.017068][ T306] ? set_kthread_struct (kernel/kthread.c:272)
[ 20.017068][ T306] ret_from_fork (arch/x86/entry/entry_64.S:301)
[ 20.017068][ T306] </TASK>
[ 20.017068][ T306]
[ 20.017068][ T306] Allocated by task 306:
[ 20.017068][ T306] kasan_save_stack (mm/kasan/common.c:38)
[ 20.017068][ T306] __kasan_kmalloc (mm/kasan/common.c:46 mm/kasan/common.c:434 mm/kasan/common.c:513 mm/kasan/common.c:522)
[ 20.017068][ T306] msi_add_msi_desc (include/linux/slab.h:590 include/linux/slab.h:724 kernel/irq/msi.c:38 kernel/irq/msi.c:85)
[ 20.017068][ T306] msi_setup_msi_desc (drivers/pci/msi/msi.c:366)
[ 20.017068][ T306] __pci_enable_msi_range (drivers/pci/msi/msi.c:448 drivers/pci/msi/msi.c:905)
[ 20.017068][ T306] pci_alloc_irq_vectors_affinity (drivers/pci/msi/msi.c:1029)
[ 20.017068][ T306] pcie_port_device_register (include/linux/pci.h:1882 drivers/pci/pcie/portdrv_core.c:107 drivers/pci/pcie/portdrv_core.c:178 drivers/pci/pcie/portdrv_core.c:353)
[ 20.017068][ T306] pcie_portdrv_probe (drivers/pci/pcie/portdrv_pci.c:117)
[ 20.017068][ T306] local_pci_probe (drivers/pci/pci-driver.c:323)
[ 20.017068][ T306] work_for_cpu_fn (kernel/workqueue.c:5194)
[ 20.017068][ T306] process_one_work (arch/x86/include/asm/jump_label.h:27 include/linux/jump_label.h:212 include/trace/events/workqueue.h:108 kernel/workqueue.c:2303)
[ 20.017068][ T306] worker_thread (include/linux/list.h:284 kernel/workqueue.c:2358 kernel/workqueue.c:2450)
[ 20.017068][ T306] kthread (kernel/kthread.c:327)
[ 20.017068][ T306] ret_from_fork (arch/x86/entry/entry_64.S:301)
[ 20.017068][ T306]
[ 20.017068][ T306] Freed by task 306:
[ 20.017068][ T306] kasan_save_stack (mm/kasan/common.c:38)
[ 20.017068][ T306] kasan_set_track (mm/kasan/common.c:46)
[ 20.017068][ T306] kasan_set_free_info (mm/kasan/generic.c:372)
[ 20.017068][ T306] __kasan_slab_free (mm/kasan/common.c:368 mm/kasan/common.c:328 mm/kasan/common.c:374)
[ 20.017068][ T306] kfree (mm/slub.c:1749 mm/slub.c:3513 mm/slub.c:4561)
[ 20.017068][ T306] msi_free_msi_descs_range (kernel/irq/msi.c:59 kernel/irq/msi.c:160)
[ 20.017068][ T306] msi_domain_alloc_irqs_descs_locked (kernel/irq/msi.c:940)
[ 20.017068][ T306] __pci_enable_msi_range (drivers/pci/msi/msi.c:458 drivers/pci/msi/msi.c:905)
[ 20.017068][ T306] pci_alloc_irq_vectors_affinity (drivers/pci/msi/msi.c:1029)
[ 20.017068][ T306] pcie_port_device_register (include/linux/pci.h:1882 drivers/pci/pcie/portdrv_core.c:107 drivers/pci/pcie/portdrv_core.c:178 drivers/pci/pcie/portdrv_core.c:353)
[ 20.017068][ T306] pcie_portdrv_probe (drivers/pci/pcie/portdrv_pci.c:117)
[ 20.017068][ T306] local_pci_probe (drivers/pci/pci-driver.c:323)
[ 20.017068][ T306] work_for_cpu_fn (kernel/workqueue.c:5194)
[ 20.017068][ T306] process_one_work (arch/x86/include/asm/jump_label.h:27 include/linux/jump_label.h:212 include/trace/events/workqueue.h:108 kernel/workqueue.c:2303)
[ 20.017068][ T306] worker_thread (include/linux/list.h:284 kernel/workqueue.c:2358 kernel/workqueue.c:2450)
[ 20.017068][ T306] kthread (kernel/kthread.c:327)
[ 20.017068][ T306] ret_from_fork (arch/x86/entry/entry_64.S:301)
[ 20.017068][ T306]
[ 20.017068][ T306] The buggy address belongs to the object at ffff888f47a57800
[ 20.017068][ T306] which belongs to the cache kmalloc-128 of size 128
[ 20.017068][ T306] The buggy address is located 84 bytes inside of
[ 20.017068][ T306] 128-byte region [ffff888f47a57800, ffff888f47a57880)
[ 20.017068][ T306] The buggy address belongs to the page:
[ 20.017068][ T306] page:00000000821cb941 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0xf47a56
[ 20.017068][ T306] head:00000000821cb941 order:1 compound_mapcount:0
[ 20.017068][ T306] flags: 0x57ffffc0010200(slab|head|node=1|zone=2|lastcpupid=0x1fffff)
[ 20.017068][ T306] raw: 0057ffffc0010200 0000000000000000 dead000000000122 ffff88810004c8c0
[ 20.017068][ T306] raw: 0000000000000000 0000000080200020 00000001ffffffff 0000000000000000
[ 20.017068][ T306] page dumped because: kasan: bad access detected
[ 20.017068][ T306]
[ 20.017068][ T306] Memory state around the buggy address:
[ 20.017068][ T306] ffff888f47a57700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fc
[ 20.017068][ T306] ffff888f47a57780: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 20.017068][ T306] >ffff888f47a57800: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 20.017068][ T306]                                                  ^
[ 20.017068][ T306] ffff888f47a57880: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 20.017068][ T306] ffff888f47a57900: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 20.017068][ T306] ==================================================================
[ 20.017068][ T306] Disabling lock debugging due to kernel taint
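The address arithmetic in the KASAN report above can be cross-checked with a short sketch. All numeric values are copied from the report; the 8-byte shadow granule and the reading of the shadow bytes (fa/fb for a freed kmalloc object, fc for a redzone) are assumptions based on generic KASAN as it appears to behave in this kernel version, and the helper names are made up for illustration.

```python
# Values copied from the KASAN report above.
BUGGY_ADDR = 0xFFFF888F47A57854
OBJECT_START = 0xFFFF888F47A57800
OBJECT_SIZE = 128          # kmalloc-128
ACCESS_SIZE = 2            # "Read of size 2"

# Assumption: generic KASAN, one shadow byte per 8 bytes of memory.
KASAN_GRANULE = 8

# Shadow row printed for ffff888f47a57800 (16 shadow bytes = 128 bytes).
SHADOW_ROW = "fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb".split()

# Assumed meaning of the shadow values seen in this dump.
SHADOW_MEANING = {
    "fa": "freed kmalloc object (free track stored)",
    "fb": "freed kmalloc object",
    "fc": "kmalloc redzone",
    "00": "fully accessible",
}


def offset_in_object(addr, start):
    """Byte offset of the faulting address inside the object."""
    return addr - start


def access_within_object(offset, access_size, object_size):
    """True if the whole access lies inside the object's bounds."""
    return 0 <= offset and offset + access_size <= object_size


def shadow_index(addr, row_start, granule=KASAN_GRANULE):
    """Index of the shadow byte in the row that starts at row_start."""
    return (addr - row_start) // granule


offset = offset_in_object(BUGGY_ADDR, OBJECT_START)
inside = access_within_object(offset, ACCESS_SIZE, OBJECT_SIZE)
index = shadow_index(BUGGY_ADDR, OBJECT_START)
shadow = SHADOW_ROW[index]

print(f"offset inside object: {offset} (0x{offset:x})")
print(f"access within object bounds: {inside}")
print(f"shadow byte #{index}: {shadow} -> {SHADOW_MEANING[shadow]}")
```

The access is inside the bounds of the 128-byte object, and the shadow byte covering it is marked as freed, which is consistent with a use-after-free rather than an out-of-bounds read.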
[ 20.616644][ T306] stack segment: 0000 [#1] SMP KASAN PTI
[ 20.617620][ T306] CPU: 0 PID: 306 Comm: kworker/0:2 Tainted: G B 5.16.0-rc5-00095-gcd6cf06590b9 #1
[ 20.617620][ T306] Hardware name: Intel Corporation S2600WP/S2600WP, BIOS SE5C600.86B.02.02.0002.122320131210 12/23/2013
[ 20.617620][ T306] Workqueue: events work_for_cpu_fn
[ 20.617620][ T306] RIP: 0010:_raw_spin_lock_irqsave (arch/x86/include/asm/atomic.h:202 include/linux/atomic/atomic-instrumented.h:513 include/asm-generic/qspinlock.h:82 include/linux/spinlock.h:185 include/linux/spinlock_api_smp.h:111 kernel/locking/spinlock.c:162)
[ 20.617620][ T306] Code: be 04 00 00 00 c7 44 24 20 00 00 00 00 e8 88 c0 2c fe be 04 00 00 00 48 8d 7c 24 20 e8 79 c0 2c fe ba 01 00 00 00 8b 44 24 20 <f0> 0f b1 55 00 75 2e 48 b8 00 00 00 00 00 fc ff df 48 c7 04 03 00
All code
========
0: be 04 00 00 00 mov $0x4,%esi
5: c7 44 24 20 00 00 00 movl $0x0,0x20(%rsp)
c: 00
d: e8 88 c0 2c fe callq 0xfffffffffe2cc09a
12: be 04 00 00 00 mov $0x4,%esi
17: 48 8d 7c 24 20 lea 0x20(%rsp),%rdi
1c: e8 79 c0 2c fe callq 0xfffffffffe2cc09a
21: ba 01 00 00 00 mov $0x1,%edx
26: 8b 44 24 20 mov 0x20(%rsp),%eax
2a:* f0 0f b1 55 00 lock cmpxchg %edx,0x0(%rbp) <-- trapping instruction
2f: 75 2e jne 0x5f
31: 48 b8 00 00 00 00 00 movabs $0xdffffc0000000000,%rax
38: fc ff df
3b: 48 rex.W
3c: c7 .byte 0xc7
3d: 04 03 add $0x3,%al
...
Code starting with the faulting instruction
===========================================
0: f0 0f b1 55 00 lock cmpxchg %edx,0x0(%rbp)
5: 75 2e jne 0x35
7: 48 b8 00 00 00 00 00 movabs $0xdffffc0000000000,%rax
e: fc ff df
11: 48 rex.W
12: c7 .byte 0xc7
13: 04 03 add $0x3,%al
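As a cross-check of the disassembly above, the trapping byte can be located directly from the "Code:" line, where the oops marks it with angle brackets. The sketch below is only an illustration: the byte string is copied from the log, and the function name is hypothetical.

```python
# "Code:" bytes copied from the oops above; <f0> marks the trapping byte.
CODE_LINE = (
    "be 04 00 00 00 c7 44 24 20 00 00 00 00 e8 88 c0 2c fe "
    "be 04 00 00 00 48 8d 7c 24 20 e8 79 c0 2c fe ba 01 00 00 00 "
    "8b 44 24 20 <f0> 0f b1 55 00 75 2e 48 b8 00 00 00 00 00 fc ff df "
    "48 c7 04 03 00"
)


def split_at_trap(code_line):
    """Return (offset of the marked byte, bytes from that byte onward)."""
    tokens = code_line.split()
    for i, tok in enumerate(tokens):
        if tok.startswith("<") and tok.endswith(">"):
            tail = [tok[1:-1]] + tokens[i + 1:]
            return i, bytes.fromhex("".join(tail))
    raise ValueError("no <xx> marker found in Code: line")


trap_offset, trap_bytes = split_at_trap(CODE_LINE)

print(f"trapping instruction starts at offset 0x{trap_offset:x}")
print("first bytes:", trap_bytes[:5].hex(" "))
```

The offset matches the "2a:*" line in the listing, and the first five bytes are the "lock cmpxchg %edx,0x0(%rbp)" encoding shown there. Since the memory operand is addressed through %rbp, one plausible reading is that %rbp held a non-canonical pointer, which on x86-64 raises a stack-segment fault rather than a general protection fault; the register dump is not included in this excerpt, so this remains an assumption.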
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
sudo bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
sudo bin/lkp run generated-yaml-file
# if you come across any failure that blocks the test,
# please remove the ~/.lkp and /lkp directories to run from a clean state.
---
0DAY/LKP+ Test Infrastructure Open Source Technology Center
https://lists.01.org/hyperkitty/list/lkp@lists.01.org Intel Corporation
Thanks,
Oliver Sang
View attachment "config-5.16.0-rc5-00095-gcd6cf06590b9" of type "text/plain" (177583 bytes)
View attachment "job-script" of type "text/plain" (5412 bytes)
Download attachment "dmesg.xz" of type "application/x-xz" (15036 bytes)
View attachment "ltp" of type "text/plain" (60065 bytes)
View attachment "job.yaml" of type "text/plain" (4466 bytes)