[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20180921164118.3dfb13db@cakuba.netronome.com>
Date: Fri, 21 Sep 2018 16:41:18 -0700
From: Jakub Kicinski <jakub.kicinski@...ronome.com>
To: Eric Dumazet <edumazet@...gle.com>
Cc: "David S . Miller" <davem@...emloft.net>,
netdev <netdev@...r.kernel.org>,
Michael Chan <michael.chan@...adcom.com>,
Ariel Elior <ariel.elior@...ium.com>,
Eric Dumazet <eric.dumazet@...il.com>,
Tariq Toukan <tariqt@...lanox.com>,
Saeed Mahameed <saeedm@...lanox.com>,
Jeff Kirsher <jeffrey.t.kirsher@...el.com>,
Song Liu <songliubraving@...com>,
Jay Vosburgh <j.vosburgh@...il.com>,
Veaceslav Falico <vfalico@...il.com>,
Andy Gospodarek <andy@...yhouse.net>
Subject: Re: [PATCH net 14/15] nfp: remove ndo_poll_controller
On Fri, 21 Sep 2018 15:27:51 -0700, Eric Dumazet wrote:
> As diagnosed by Song Liu, ndo_poll_controller() can
> be very dangerous on loaded hosts, since the cpu
> calling ndo_poll_controller() might steal all NAPI
> contexts (for all RX/TX queues of the NIC). This capture
> can last for unlimited amount of time, since one
> cpu is generally not able to drain all the queues under load.
>
> nfp uses NAPI for TX completions, so we better let core
> networking stack call the napi->poll() to avoid the capture.
>
> Signed-off-by: Eric Dumazet <edumazet@...gle.com>
Acked-by: Jakub Kicinski <jakub.kicinski@...ronome.com>
Tested-by: Jakub Kicinski <jakub.kicinski@...ronome.com>
My testing only detects the pre-existing lockdep splat :)
The test runs multiple netperf sessions and it has threads spewing the
"noise_xyz" lines into /dev/kmsg, then another thread sends series of
the "this_is_line_X_" lines which are counted on the other end to make
sure we don't drop the messages.
And every now and then the test will do something random like reload
the driver or bring the link down or install an XDP program.
[ 1511.417141] BUG: workqueue lockup - pool cpus=4 node=0 flags=0x0 nice=0
[ 1511.417173] noise_KVEQAAHZ
[ 1511.417178] stuck for 35s!
[ 1511.425191] noise_KVEQAAHZ
[ 1511.428134] Showing busy workqueues and worker pools:
[ 1511.434080] noise_KVEQAAHZ
[ 1511.434744] workqueue events: flags=0x0
[ 1511.440908] noise_KVEQAAHZ
[ 1511.443919] pwq 8: cpus=4 node=0 flags=0x0 nice=0 active=2/256
[ 1511.458404] in-flight: 197:free_obj_work
[ 1511.463337] pending: free_obj_work
[ 1511.467670]
[ 1511.467671] ======================================================
[ 1511.467672] WARNING: possible circular locking dependency detected
[ 1511.467673] 4.19.0-rc4-debug-00636-g4718a743bc48 #23 Not tainted
[ 1511.467674] ------------------------------------------------------
[ 1511.467674] ksoftirqd/0/10 is trying to acquire lock:
[ 1511.467675] 0000000081e5d4d3 (console_owner){-.-.}, at: vprintk_emit+0x2de/0x480
[ 1511.467678]
[ 1511.467679] but task is already holding lock:
[ 1511.467680] 00000000ebab6c31 (&(&pool->lock)->rlock){-.-.}, at: show_workqueue_state.cold.34+0x36d/0x13d7
[ 1511.467683]
[ 1511.467684] which lock already depends on the new lock.
[ 1511.467684]
[ 1511.467685]
[ 1511.467686] the existing dependency chain (in reverse order) is:
[ 1511.467686]
[ 1511.467687] -> #2 (&(&pool->lock)->rlock){-.-.}:
[ 1511.467690] _raw_spin_lock+0x2f/0x40
[ 1511.467691] __queue_work+0x2f4/0x11d0
[ 1511.467691] __queue_delayed_work+0x22d/0x380
[ 1511.467692] queue_delayed_work_on+0x8d/0xa0
[ 1511.467693] netpoll_send_skb_on_dev+0x307/0x7e0
[ 1511.467694] netpoll_send_udp+0x1249/0x1910
[ 1511.467695] write_msg+0x285/0x2f0 [netconsole]
[ 1511.467695] console_unlock+0x6a0/0xc90
[ 1511.467696] vprintk_emit+0x1dc/0x480
[ 1511.467697] printk_emit+0x88/0xae
[ 1511.467697] devkmsg_write+0x2e5/0x360
[ 1511.467698] __vfs_write+0x4e9/0x890
[ 1511.467699] vfs_write+0x156/0x4a0
[ 1511.467699] ksys_write+0xd6/0x1d0
[ 1511.467700] do_syscall_64+0x146/0x5c0
[ 1511.467701] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 1511.467701]
[ 1511.467702] -> #1 (target_list_lock){-.-.}:
[ 1511.467705] _raw_spin_lock_irqsave+0x46/0x60
[ 1511.467706] write_msg+0xe1/0x2f0 [netconsole]
[ 1511.467706] console_unlock+0x6a0/0xc90
[ 1511.467707] vprintk_emit+0x1dc/0x480
[ 1511.467708] printk+0x9f/0xc5
[ 1511.467708] netpoll_print_options+0x77/0x2f0
[ 1511.467709] enabled_store+0x174/0x320 [netconsole]
[ 1511.467710] configfs_write_file+0x250/0x400 [configfs]
[ 1511.467711] __vfs_write+0xde/0x890
[ 1511.467711] vfs_write+0x156/0x4a0
[ 1511.467712] ksys_write+0xd6/0x1d0
[ 1511.467713] do_syscall_64+0x146/0x5c0
[ 1511.467713] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 1511.467714]
[ 1511.467714] -> #0 (console_owner){-.-.}:
[ 1511.467717] lock_acquire+0x183/0x400
[ 1511.467718] vprintk_emit+0x339/0x480
[ 1511.467719] printk+0x9f/0xc5
[ 1511.467720] show_workqueue_state.cold.34+0x40d/0x13d7
[ 1511.467721] wq_watchdog_timer_fn+0x598/0x850
[ 1511.467721] call_timer_fn+0x21b/0x720
[ 1511.467722] expire_timers+0x34a/0x550
[ 1511.467723] run_timer_softirq+0x45e/0x6a0
[ 1511.467724] __do_softirq+0x2ca/0xb1d
[ 1511.467725] run_ksoftirqd+0x3c/0x60
[ 1511.467725] smpboot_thread_fn+0x58a/0x9c0
[ 1511.467726] kthread+0x315/0x3d0
[ 1511.467727] ret_from_fork+0x24/0x30
[ 1511.467728]
[ 1511.467729] other info that might help us debug this:
[ 1511.467729]
[ 1511.467730] Chain exists of:
[ 1511.467730] console_owner --> target_list_lock --> &(&pool->lock)->rlock
[ 1511.467734]
[ 1511.467735] Possible unsafe locking scenario:
[ 1511.467735]
[ 1511.467736] CPU0 CPU1
[ 1511.467737] ---- ----
[ 1511.467737] lock(&(&pool->lock)->rlock);
[ 1511.467739] lock(target_list_lock);
[ 1511.467741] lock(&(&pool->lock)->rlock);
[ 1511.467743] lock(console_owner);
[ 1511.467744]
[ 1511.467745] *** DEADLOCK ***
[ 1511.467746]
[ 1511.467746] 3 locks held by ksoftirqd/0/10:
[ 1511.467747] #0: 0000000007b706eb ((&wq_watchdog_timer)){+.-.}, at: call_timer_fn+0x1c3/0x720
[ 1511.467750] #1: 00000000470e0b00 (rcu_read_lock_sched){....}, at: show_workqueue_state+0x5/0x1a0
[ 1511.467753] #2: 00000000ebab6c31 (&(&pool->lock)->rlock){-.-.}, at: show_workqueue_state.cold.34+0x36d/0x13d7
[ 1511.467757]
[ 1511.467757] stack backtrace:
[ 1511.467758] CPU: 0 PID: 10 Comm: ksoftirqd/0 Not tainted 4.19.0-rc4-debug-00636-g4718a743bc48 #23
[ 1511.467759] Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.3.4 11/08/2016
[ 1511.467760] Call Trace:
[ 1511.467761] dump_stack+0xea/0x1b0
[ 1511.467761] ? dump_stack_print_info.cold.0+0x1b/0x1b
[ 1511.467762] ? print_stack_trace+0xf3/0x140
[ 1511.467763] print_circular_bug.isra.13.cold.36+0x2e2/0x417
[ 1511.467764] ? print_circular_bug_header+0x80/0x80
[ 1511.467764] check_prev_add.constprop.21+0xccd/0x1600
[ 1511.467765] ? check_usage+0x7b0/0x7b0
[ 1511.467766] ? __print_lock_name+0x110/0x110
[ 1511.467767] ? graph_lock+0x65/0x190
[ 1511.467767] ? __print_lock_name+0x110/0x110
[ 1511.467768] __lock_acquire+0x216f/0x35b0
[ 1511.467769] ? mark_held_locks+0x140/0x140
[ 1511.467770] ? check_usage+0x7b0/0x7b0
[ 1511.467770] ? print_irqtrace_events+0x270/0x270
[ 1511.467771] ? lock_downgrade+0x6b0/0x6b0
[ 1511.467772] ? do_raw_spin_unlock+0xb1/0x350
[ 1511.467772] ? __print_lock_name+0x110/0x110
[ 1511.467773] ? __lock_acquire+0x6a5/0x35b0
[ 1511.467774] ? mark_held_locks+0x140/0x140
[ 1511.467774] ? down_trylock+0x4b/0x70
[ 1511.467775] ? down_trylock+0x13/0x70
[ 1511.467776] lock_acquire+0x183/0x400
[ 1511.467776] ? vprintk_emit+0x2de/0x480
[ 1511.467777] ? lock_release+0x870/0x870
[ 1511.467778] ? do_raw_spin_trylock+0x1b0/0x1b0
[ 1511.467779] ? vprintk_emit+0x215/0x480
[ 1511.467779] ? trace_hardirqs_off+0x95/0x230
[ 1511.467780] vprintk_emit+0x339/0x480
[ 1511.467781] ? vprintk_emit+0x2de/0x480
[ 1511.467781] ? wake_up_klogd+0x1d0/0x1d0
[ 1511.467782] ? do_raw_spin_unlock+0xb1/0x350
[ 1511.467783] ? lock_acquire+0x183/0x400
[ 1511.467784] ? show_workqueue_state.cold.34+0x36d/0x13d7
[ 1511.467784] ? _raw_spin_unlock_irqrestore+0x59/0x70
[ 1511.467785] printk+0x9f/0xc5
[ 1511.467786] ? kmsg_dump_rewind_nolock+0xd9/0xd9
[ 1511.467786] show_workqueue_state.cold.34+0x40d/0x13d7
[ 1511.467787] ? print_worker_info+0x4a0/0x4a0
[ 1511.467788] ? llist_reverse_order+0x60/0x60
[ 1511.467789] ? radix_tree_iter_resume+0x190/0x190
[ 1511.467789] ? devkmsg_read+0x840/0x840
[ 1511.467790] ? radix_tree_next_chunk+0x1b2/0xae0
[ 1511.467791] ? __radix_tree_next_slot+0x810/0x810
[ 1511.467792] ? vprintk_emit+0x15f/0x480
[ 1511.467792] ? wake_up_klogd+0x1d0/0x1d0
[ 1511.467793] ? print_irqtrace_events+0x270/0x270
[ 1511.467794] ? mark_held_locks+0x140/0x140
[ 1511.467795] ? lock_downgrade+0x6b0/0x6b0
[ 1511.467796] ? _raw_spin_unlock_irqrestore+0x59/0x70
[ 1511.467796] ? wq_watchdog_timer_fn+0x4ce/0x850
[ 1511.467797] ? lock_downgrade+0x6b0/0x6b0
[ 1511.467798] ? idr_get_next_ul+0x290/0x290
[ 1511.467799] ? rcu_read_unlock_special.part.22+0x1660/0x1660
[ 1511.467800] ? rcu_dynticks_curr_cpu_in_eqs+0xd9/0x1d0
[ 1511.467801] wq_watchdog_timer_fn+0x598/0x850
[ 1511.467801] ? show_workqueue_state+0x1a0/0x1a0
[ 1511.467802] ? quarantine_put+0xd9/0x300
[ 1511.467803] ? lockdep_hardirqs_on+0x278/0x580
[ 1511.467804] ? trace_hardirqs_on+0x74/0x230
[ 1511.467804] ? lock_acquire+0x183/0x400
[ 1511.467805] ? call_timer_fn+0x1c3/0x720
[ 1511.467806] ? rcu_lockdep_current_cpu_online+0x212/0x310
[ 1511.467807] ? rcu_process_callbacks+0x660/0x660
[ 1511.467807] call_timer_fn+0x21b/0x720
[ 1511.467808] ? show_workqueue_state+0x1a0/0x1a0
[ 1511.467809] ? init_timer_key+0x380/0x380
[ 1511.467810] ? _raw_spin_unlock_irqrestore+0x3c/0x70
[ 1511.467810] ? expire_timers+0x33f/0x550
[ 1511.467811] ? _raw_spin_unlock_irqrestore+0x46/0x70
[ 1511.467812] ? lock_downgrade+0x6b0/0x6b0
[ 1511.467812] ? _raw_spin_unlock_irq+0x29/0x50
[ 1511.467813] ? show_workqueue_state+0x1a0/0x1a0
[ 1511.467814] ? lockdep_hardirqs_on+0x278/0x580
[ 1511.467815] ? _raw_spin_unlock_irq+0x29/0x50
[ 1511.467815] ? trace_hardirqs_on+0x74/0x230
[ 1511.467816] ? show_workqueue_state+0x1a0/0x1a0
[ 1511.467817] expire_timers+0x34a/0x550
[ 1511.467817] ? call_timer_fn+0x720/0x720
[ 1511.467818] ? lock_acquire+0x183/0x400
[ 1511.467819] ? run_timer_softirq+0x39d/0x6a0
[ 1511.467819] ? lock_release+0x870/0x870
[ 1511.467820] ? trace_hardirqs_off+0x95/0x230
[ 1511.467821] ? __next_timer_interrupt+0xf0/0xf0
[ 1511.467822] ? do_raw_spin_trylock+0x1b0/0x1b0
[ 1511.467822] run_timer_softirq+0x45e/0x6a0
[ 1511.467823] ? expire_timers+0x550/0x550
[ 1511.467824] ? tcp_tasklet_func+0x3d6/0x650
[ 1511.467824] ? tcp_tsq_handler+0x100/0x100
[ 1511.467825] ? rcu_process_callbacks+0x660/0x660
[ 1511.467826] ? tasklet_action_common.isra.2+0xcb/0x350
[ 1511.467827] ? lockdep_hardirqs_on+0x278/0x580
[ 1511.467827] ? tasklet_action_common.isra.2+0xcb/0x350
[ 1511.467828] ? trace_hardirqs_on+0x74/0x230
[ 1511.467829] ? __bpf_trace_preemptirq_template+0x10/0x10
[ 1511.467830] ? __rcu_process_callbacks+0x1750/0x1750
[ 1511.467830] ? __do_softirq+0x286/0xb1d
[ 1511.467831] ? rcu_lockdep_current_cpu_online+0x212/0x310
[ 1511.467832] ? rcu_process_callbacks+0x660/0x660
[ 1511.467832] __do_softirq+0x2ca/0xb1d
[ 1511.467833] ? trace_hardirqs_on_thunk+0x1a/0x1c
[ 1511.467834] ? __irqentry_text_end+0x1fa0ca/0x1fa0ca
[ 1511.467835] ? trace_hardirqs_on+0x230/0x230
[ 1511.467835] ? __schedule+0x6/0x2130
[ 1511.467836] ? trace_hardirqs_on_thunk+0x1a/0x1c
[ 1511.467837] ? run_ksoftirqd+0x11/0x60
[ 1511.467837] ? trace_hardirqs_off+0x95/0x230
[ 1511.467838] ? trace_hardirqs_on_caller+0x240/0x240
[ 1511.467839] ? finish_task_switch+0xa20/0xa20
[ 1511.467840] ? ___preempt_schedule+0x16/0x18
[ 1511.467840] run_ksoftirqd+0x3c/0x60
[ 1511.467841] smpboot_thread_fn+0x58a/0x9c0
[ 1511.467842] ? __smpboot_create_thread+0x410/0x410
[ 1511.467843] ? _raw_spin_unlock_irqrestore+0x46/0x70
[ 1511.467843] ? __kthread_parkme+0xb6/0x180
[ 1511.467844] ? __smpboot_create_thread+0x410/0x410
[ 1511.467845] kthread+0x315/0x3d0
[ 1511.467845] ? kthread_park+0x120/0x120
[ 1511.467846] ret_from_fork+0x24/0x30
[ 1511.468035] this_is_line_0_
[ 1511.469707] workqueue mm_percpu_wq: flags=0x8
[ 1511.475150] noise_KVEQAAHZ
[ 1511.475474] noise_KVEQAAHZ
[ 1511.475740] noise_KVEQAAHZ
[ 1511.476125] noise_KVEQAAHZ
[ 1511.476409] noise_KVEQAAHZ
[ 1511.476700] noise_KVEQAAHZ
[ 1511.476996] noise_KVEQAAHZ
[ 1511.477149] this_is_line_1_
[ 1511.477558] this_is_line_2_
[ 1511.477814] this_is_line_3_
[ 1511.478132] this_is_line_4_
[ 1511.478463] this_is_line_5_
[ 1511.478762] this_is_line_6_
[ 1511.479077] this_is_line_7_
[ 1511.479359] this_is_line_8_
[ 1511.479647] this_is_line_9_
[ 1511.479943] this_is_line_10_
[ 1511.480192] this_is_line_11_
[ 1511.480495] this_is_line_12_
[ 1511.480801] this_is_line_13_
[ 1511.483873] pwq 8: cpus=4 node=0 flags=0x0 nice=0 active=1/256
[ 1511.491063] noise_KVEQAAHZ
[ 1511.496337] this_is_line_14_
[ 1511.496649] this_is_line_15_
[ 1511.496905] this_is_line_16_
[ 1511.497257] this_is_line_17_
[ 1511.497521] this_is_line_18_
[ 1511.497797] pending: vmstat_update
[ 1511.497827] this_is_line_19_
[ 1511.497881] pool 8: cpus=4 node=0 flags=0x0 nice=0 hung=35s workers=3 idle: 12260 9068
Powered by blists - more mailing lists