lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Thu, 30 Jun 2022 11:56:48 +0200 From: Paolo Abeni <pabeni@...hat.com> To: Qiao Ma <mqaio@...ux.alibaba.com>, davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org, gustavoars@...nel.org, cai.huoqing@...ux.dev, aviad.krawczyk@...wei.com, zhaochen6@...wei.com Cc: netdev@...r.kernel.org, linux-kernel@...r.kernel.org Subject: Re: [PATCH] net: hinic: avoid kernel hung in hinic_get_stats64() On Wed, 2022-06-29 at 15:28 +0800, Qiao Ma wrote: > When using hinic device as a bond slave device, and reading device stats of > master bond device, the kernel may hung. > > The kernel panic calltrace as follows: > Kernel panic - not syncing: softlockup: hung tasks > Call trace: > native_queued_spin_lock_slowpath+0x1ec/0x31c > dev_get_stats+0x60/0xcc > dev_seq_printf_stats+0x40/0x120 > dev_seq_show+0x1c/0x40 > seq_read_iter+0x3c8/0x4dc > seq_read+0xe0/0x130 > proc_reg_read+0xa8/0xe0 > vfs_read+0xb0/0x1d4 > ksys_read+0x70/0xfc > __arm64_sys_read+0x20/0x30 > el0_svc_common+0x88/0x234 > do_el0_svc+0x2c/0x90 > el0_svc+0x1c/0x30 > el0_sync_handler+0xa8/0xb0 > el0_sync+0x148/0x180 > > And the calltrace of task that actually caused kernel hungs as follows: > __switch_to+124 > __schedule+548 > schedule+72 > schedule_timeout+348 > __down_common+188 > __down+24 > down+104 > hinic_get_stats64+44 [hinic] > dev_get_stats+92 > bond_get_stats+172 [bonding] > dev_get_stats+92 > dev_seq_printf_stats+60 > dev_seq_show+24 > seq_read_iter+964 > seq_read+220 > proc_reg_read+164 > vfs_read+172 > ksys_read+108 > __arm64_sys_read+28 > el0_svc_common+132 > do_el0_svc+40 > el0_svc+24 > el0_sync_handler+164 > el0_sync+324 > > When getting device stats from bond, kernel will call bond_get_stats(). > It first holds the spinlock bond->stats_lock, and then call > hinic_get_stats64() to collect hinic device's stats. > However, hinic_get_stats64() calls `down(&nic_dev->mgmt_lock)` to > protect its critical section, which may schedule current task out. > And if system is under high pressure, the task cannot be woken up > immediately, which eventually triggers kernel hung panic. > > Fixes: edd384f682cc ("net-next/hinic: Add ethtool and stats") > Signed-off-by: Qiao Ma <mqaio@...ux.alibaba.com> Side note: it looks like that after this patch every section protected by the mgmt_lock is already under rtnl lock protection, so you could probably remove the hinic specific lock (in a separate, net-next, patch). Please double check the above as I skimmed upon that quickly. Thanks, Paolo
Powered by blists - more mailing lists