Message-ID: <20180116061641.GB32639@localhost.localdomain>
Date: Mon, 15 Jan 2018 23:16:41 -0700
From: Keith Busch <keith.busch@...el.com>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: LKML <linux-kernel@...r.kernel.org>
Subject: Re: [BUG 4.15-rc7] IRQ matrix management errors
On Mon, Jan 15, 2018 at 01:13:29AM -0800, Thomas Gleixner wrote:
>
> The dmesg is not that interesting. The traces definitely are if you can
> identify the point where it goes into lala land.
The attached is with irq_matrix and irq_vector trace events enabled,
but I've stripped out the irq_work_*, local_timer_*, reschedule_*,
and call_function_* messages, as these appear to be traced
continuously and are unrelated to the allocations. I stopped at the first
warning rather than let it continue.
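For anyone wanting to reproduce the filtering, here is a minimal sketch of how the noisy events could be stripped from a saved trace. It assumes the usual ftrace text layout, where each line carries the event name followed by a colon; the function name and the sample lines are made up for illustration and are not taken from the attachment.

```python
import re

# Event-name prefixes dropped from the trace, per the description above.
# Assumption: each trace line contains "<event_name>:" as in the usual
# ftrace text output.
NOISY_EVENT = re.compile(
    r"\b(?:irq_work|local_timer|reschedule|call_function)_\w+:"
)


def strip_noisy_events(lines):
    """Return only the lines that do not belong to the noisy event groups."""
    return [line for line in lines if not NOISY_EVENT.search(line)]


if __name__ == "__main__":
    # Hypothetical sample lines, only to show the filter's effect.
    sample = [
        "  <idle>-0  [003] d.h. 334.100000: local_timer_entry: vector=236",
        "  kworker/u674:3-1421 [028] d... 334.200000: irq_matrix_alloc_managed: bit=238",
        "  <idle>-0  [005] d.h. 334.300000: reschedule_exit: vector=253",
    ]
    for line in strip_noisy_events(sample):
        print(line)
```

Matching on the event name rather than on arbitrary substrings keeps the irq_matrix_* and vector_* lines intact even when their payload happens to mention one of the filtered names.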
Here's the warning that was captured along with the trace events in the attachment:
[ 334.567321] WARNING: CPU: 28 PID: 1421 at kernel/irq/matrix.c:222 irq_matrix_remove_managed+0x10f/0x120
[ 334.567323] Modules linked in: nvme ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables sunrpc vfat fat intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel intel_uncore intel_rapl_perf iTCO_wdt joydev iTCO_vendor_support ipmi_ssif ipmi_si mei_me ipmi_devintf mei shpchp i2c_i801 ipmi_msghandler lpc_ich ioatdma wmi dca acpi_power_meter acpi_pad xfs libcrc32c ast i2c_algo_bit drm_kms_helper ttm drm i40e crc32c_intel
[ 334.567391] ptp nvme_core pps_core [last unloaded: nvme]
[ 334.567398] CPU: 28 PID: 1421 Comm: kworker/u674:3 Not tainted 4.15.0-rc8+ #6
[ 334.567401] Hardware name: Intel Corporation S2600STB/S2600STB, BIOS SE5C620.86B.00.01.2001.062220170731 06/22/2017
[ 334.567407] Workqueue: nvme-wq nvme_reset_work [nvme]
[ 334.567412] RIP: 0010:irq_matrix_remove_managed+0x10f/0x120
[ 334.567414] RSP: 0018:ffffbada0af13a88 EFLAGS: 00010046
[ 334.567417] RAX: 00000000000000ee RBX: ffff9c45bd824900 RCX: 0000000000000000
[ 334.567419] RDX: 0000000000000100 RSI: 00000000000000ee RDI: ffff9c45bd410c50
[ 334.567420] RBP: 0000000000000000 R08: 0000000000000100 R09: 0000000000000000
[ 334.567422] R10: 0000000000000018 R11: 0000000000000003 R12: ffff9c45bd410c00
[ 334.567423] R13: ffff9c45bd410c30 R14: 00000000000000ee R15: 00000000000000ee
[ 334.567426] FS: 0000000000000000(0000) GS:ffff9c55bcc00000(0000) knlGS:0000000000000000
[ 334.567427] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 334.567429] CR2: 0000557bd66ab730 CR3: 00000019f9209002 CR4: 00000000007606e0
[ 334.567431] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 334.567432] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 334.567433] PKRU: 55555554
[ 334.567434] Call Trace:
[ 334.567445] x86_vector_free_irqs+0xa1/0x180
[ 334.567451] x86_vector_alloc_irqs+0x1e4/0x3a0
[ 334.567455] msi_domain_alloc+0x62/0x130
[ 334.567463] ? kmem_cache_alloc_node_trace+0x1ac/0x1d0
[ 334.567467] __irq_domain_alloc_irqs+0x121/0x300
[ 334.567471] msi_domain_alloc_irqs+0x99/0x2e0
[ 334.567477] native_setup_msi_irqs+0x54/0x90
[ 334.567484] __pci_enable_msix+0xfb/0x4e0
[ 334.567489] pci_alloc_irq_vectors_affinity+0x8e/0x130
[ 334.567495] nvme_reset_work+0x919/0x153b [nvme]
[ 334.567503] ? update_curr+0xe4/0x1d0
[ 334.567508] ? account_entity_dequeue+0xa4/0xd0
[ 334.567512] ? dequeue_entity+0xd5/0x430
[ 334.567515] ? pick_next_task_fair+0x14f/0x5f0
[ 334.567525] ? __switch_to+0xa2/0x430
[ 334.567532] ? sched_clock+0x5/0x10
[ 334.567536] ? put_prev_entity+0x1e/0xe0
[ 334.567542] process_one_work+0x182/0x370
[ 334.567546] worker_thread+0x2e/0x380
[ 334.567549] ? process_one_work+0x370/0x370
[ 334.567554] kthread+0x111/0x130
[ 334.567560] ? kthread_create_worker_on_cpu+0x70/0x70
[ 334.567566] ? do_syscall_64+0x61/0x170
[ 334.567573] ? SyS_exit_group+0x10/0x10
[ 334.567580] ret_from_fork+0x1f/0x30
[ 334.567584] Code: 89 ea 44 89 f6 41 ff d1 4d 8b 0f 4d 85 c9 75 e2 e9 2a ff ff ff 48 83 c4 08 5b 5d 41 5c 41 5d 41 5e 41 5f c3 0f ff e9 14 ff ff ff <0f> ff e9 0d ff ff ff 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00
[ 334.567637] ---[ end trace b97ec0c6a11aa61f ]---
View attachment "irq-trace-events" of type "text/plain" (201307 bytes)