lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CH2PR15MB3575CECCB65404A0292B68759AF20@CH2PR15MB3575.namprd15.prod.outlook.com>
Date:   Fri, 12 Jul 2019 13:35:26 +0000
From:   Jon Maloy <jon.maloy@...csson.com>
To:     Chris Packham <chris.packham@...iedtelesis.co.nz>,
        "eric.dumazet@...il.com" <eric.dumazet@...il.com>,
        "ying.xue@...driver.com" <ying.xue@...driver.com>,
        "davem@...emloft.net" <davem@...emloft.net>
CC:     "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "tipc-discussion@...ts.sourceforge.net" 
        <tipc-discussion@...ts.sourceforge.net>
Subject: RE: [PATCH v2] tipc: ensure head->lock is initialised

Acked-by: Jon Maloy <jon.maloy@...csson.com>

Actually, this was not what I meant, but it solves our problem in a simple and safe way, for now.
Later, when net-next opens, I will revert this and do it the right way, which is to change from skb_queue_purge() to __skb_queue_purge as Eric correctly noted.
That change has to be done at four locations, at least,  and is too intrusive to post to 'net' now.
I'll send a cleanup patch when net-next re-opens.

BR
///jon
 

> -----Original Message-----
> From: Chris Packham <chris.packham@...iedtelesis.co.nz>
> Sent: 11-Jul-19 18:41
> To: Jon Maloy <jon.maloy@...csson.com>; eric.dumazet@...il.com;
> ying.xue@...driver.com; davem@...emloft.net
> Cc: linux-kernel@...r.kernel.org; netdev@...r.kernel.org; tipc-
> discussion@...ts.sourceforge.net; Chris Packham
> <chris.packham@...iedtelesis.co.nz>
> Subject: [PATCH v2] tipc: ensure head->lock is initialised
> 
> tipc_named_node_up() creates a skb list. It passes the list to
> tipc_node_xmit() which has some code paths that can call
> skb_queue_purge() which relies on the list->lock being initialised.
> 
> The spin_lock is only needed if the messages end up on the receive path but
> when the list is created in tipc_named_node_up() we don't necessarily know if
> it is going to end up there.
> 
> Once all the skb list users are updated in tipc it will then be possible to update
> them to use the unlocked variants of the skb list functions and initialise the
> lock when we know the message will follow the receive path.
> 
> Signed-off-by: Chris Packham <chris.packham@...iedtelesis.co.nz>
> ---
> 
> I'm updating our products to use the latest kernel. One change that we have
> that doesn't appear to have been upstreamed is related to the following soft
> lockup.
> 
> NMI watchdog: BUG: soft lockup - CPU#3 stuck for 23s! [swapper/3:0]
> Modules linked in: tipc jitterentropy_rng echainiv drbg platform_driver(O)
> ipifwd(PO)
> CPU: 3 PID: 0 Comm: swapper/3 Tainted: P           O    4.4.6-at1 #1
> task: a3054e00 ti: ac6b4000 task.ti: a307a000
> NIP: 806891c4 LR: 804f5060 CTR: 804f50d0
> REGS: ac6b59b0 TRAP: 0901   Tainted: P           O     (4.4.6-at1)
> MSR: 00029002 <CE,EE,ME>  CR: 84002088  XER: 20000000
> 
> GPR00: 804f50fc ac6b5a60 a3054e00 00029002 00000101 01001011
> 00000000 00000001
> GPR08: 00021002 c1502d1c ac6b5ae4 00000000 804f50d0 NIP [806891c4]
> _raw_spin_lock_irqsave+0x44/0x80 LR [804f5060] skb_dequeue+0x20/0x90
> Call Trace:
> [ac6b5a80] [804f50fc] skb_queue_purge+0x2c/0x50 [ac6b5a90] [c1511058]
> tipc_node_xmit+0x138/0x170 [tipc] [ac6b5ad0] [c1509e58]
> tipc_named_node_up+0x88/0xa0 [tipc] [ac6b5b00] [c150fc1c]
> tipc_netlink_compat_stop+0x9bc/0xf50 [tipc] [ac6b5b20] [c1511638]
> tipc_rcv+0x418/0x9b0 [tipc] [ac6b5bc0] [c150218c]
> tipc_bcast_stop+0xfc/0x7b0 [tipc] [ac6b5bd0] [80504e38]
> __netif_receive_skb_core+0x468/0xa10
> [ac6b5c70] [805082fc] netif_receive_skb_internal+0x3c/0xe0
> [ac6b5ca0] [80642a48] br_handle_frame_finish+0x1d8/0x4d0
> [ac6b5d10] [80642f30] br_handle_frame+0x1f0/0x330 [ac6b5d60]
> [80504ec8] __netif_receive_skb_core+0x4f8/0xa10
> [ac6b5e00] [805082fc] netif_receive_skb_internal+0x3c/0xe0
> [ac6b5e30] [8044c868] _dpa_rx+0x148/0x5c0 [ac6b5ea0] [8044b0c8]
> priv_rx_default_dqrr+0x98/0x170 [ac6b5ed0] [804d1338]
> qman_p_poll_dqrr+0x1b8/0x240 [ac6b5f00] [8044b1c0]
> dpaa_eth_poll+0x20/0x60 [ac6b5f20] [805087cc]
> net_rx_action+0x15c/0x320 [ac6b5f80] [8002594c]
> __do_softirq+0x13c/0x250 [ac6b5fe0] [80025c34] irq_exit+0xb4/0xf0
> [ac6b5ff0] [8000d81c] call_do_irq+0x24/0x3c [a307be60] [80004acc]
> do_IRQ+0x8c/0x120 [a307be80] [8000f450] ret_from_except+0x0/0x18
> --- interrupt: 501 at arch_cpu_idle+0x24/0x70
> 
> Eyeballing the code I think it can still happen since tipc_named_node_up
> allocates struct sk_buff_head head on the stack so it could have arbitrary
> content.
> 
> Changes in v2:
> - fixup commit subject
> - add more information to commit message from mailing list discussion
> 
>  net/tipc/name_distr.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/net/tipc/name_distr.c b/net/tipc/name_distr.c index
> 61219f0b9677..44abc8e9c990 100644
> --- a/net/tipc/name_distr.c
> +++ b/net/tipc/name_distr.c
> @@ -190,7 +190,7 @@ void tipc_named_node_up(struct net *net, u32
> dnode)
>  	struct name_table *nt = tipc_name_table(net);
>  	struct sk_buff_head head;
> 
> -	__skb_queue_head_init(&head);
> +	skb_queue_head_init(&head);
> 
>  	read_lock_bh(&nt->cluster_scope_lock);
>  	named_distribute(net, &head, dnode, &nt->cluster_scope);
> --
> 2.22.0

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ