lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Mon, 27 Nov 2017 15:46:22 +0000
From:   Jon Maloy <jon.maloy@...csson.com>
To:     Tommi Rantala <tommi.t.rantala@...ia.com>,
        Ying Xue <ying.xue@...driver.com>,
        "David S. Miller" <davem@...emloft.net>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "tipc-discussion@...ts.sourceforge.net" 
        <tipc-discussion@...ts.sourceforge.net>,
        open list <linux-kernel@...r.kernel.org>
Subject: RE: tipc_node_calculate_timer div-by-zero

Hi Tommy,
Yes, this clearly doesn't look right. tipc_rcv() should never be called unless the bearer exists and is up.
The simplest and safest way to fix this is by just to remove the second call to tipc_rcv(), - there will be more discovery messages arriving later.
An alternative is to move the (user == LINK_CONFIG) test to before the (b && test_bit() test. This is nicer, but will cost another few cpu cycles on the most critical data path.  I think I favor the first approach.

Maybe something like this:

if (likely (b && test_bit(0, &b->up))) {
                tipc_rcv(sock_net(sk), skb, b);
} else  if (msg_user(hdr) == LINK_CONFIG)
                tipc_udp_rcast_disc(b, skb);
                kfree_skb(skb);
}
rcu_read_unlock();
return 0;

If you want you can send this to 'net' yourself. Otherwise I can do it, but that would take a couple of days.

///jon


> -----Original Message-----
> From: Tommi Rantala [mailto:tommi.t.rantala@...ia.com]
> Sent: Monday, November 27, 2017 04:51
> To: Jon Maloy <jon.maloy@...csson.com>; Ying Xue
> <ying.xue@...driver.com>; David S. Miller <davem@...emloft.net>;
> netdev@...r.kernel.org; tipc-discussion@...ts.sourceforge.net; open list
> <linux-kernel@...r.kernel.org>
> Subject: tipc_node_calculate_timer div-by-zero
> 
> Hi,
> 
> I'm seeing a rare TIPC div-by-zero crash in tipc_node_calculate_timer().
> 
> If I get it right, we're receiving a discovery packet while enabling a bearer.
> The bearer is not yet fully initialized, causing the discovery packet processing
> to use zero tolerance value, which then causes the div-by-zero in
> tipc_node_calculate_timer().
> 
> I can reproduce the issue easily by adding some sleep in
> tipc_enable_bearer(), and the machine always crashes after booting it up:
> 
> diff --git a/net/tipc/bearer.c b/net/tipc/bearer.c index
> 47ec121574ce..00bdd87cd614 100644
> --- a/net/tipc/bearer.c
> +++ b/net/tipc/bearer.c
> @@ -327,12 +327,16 @@ static int tipc_enable_bearer(struct net *net, const
> char *name,
>                  return -EINVAL;
>          }
> 
> +       pr_warn("HACK: sleeping 2 seconds!");
> +       usleep_range(2000000, 2100000);
> +
>          b->identity = bearer_id;
>          b->tolerance = m->tolerance;
>          b->window = m->window;
> 
> 
> In 4.14 removing the latter tipc_rcv() call in tipc_udp_recv() [which was
> added in in commit c9b64d492b1f (tipc: add replicast peer discovery)] seems
> to fix the crash, but I do not really understand the code:
> 
> diff --git a/net/tipc/udp_media.c b/net/tipc/udp_media.c index
> ecca64fc6a6f..371653acf1f6 100644
> --- a/net/tipc/udp_media.c
> +++ b/net/tipc/udp_media.c
> @@ -371,9 +371,11 @@ static int tipc_udp_recv(struct sock *sk, struct
> sk_buff *skb)
>                          goto rcu_out;
>          }
> 
> +       /*
>          tipc_rcv(sock_net(sk), skb, b);
>          rcu_read_unlock();
>          return 0;
> +       */
> 
>   rcu_out:
>          rcu_read_unlock();
> 
> 
> 
> I have tested 4.4 and 4.14, and the bug is reproducible in both versions.
> 
> Comments?
> 
> -Tommi
> 
> 
> 
> [   12.576885] tipc: Activated (version 2.0.0)
> [   12.577506] NET: Registered protocol family 30
> [   12.578212] tipc: Started in single node mode
> [   12.589907] Started in network mode
> [   12.590450] Own node address <1.1.1>, network identity 1
> [   12.592022] HACK: sleeping 2 seconds!
> [   12.668088] divide error: 0000 [#1] SMP
> [   12.670993] Modules linked in: tipc ip6_udp_tunnel udp_tunnel nfsd
> auth_rpcgss oid_registry nfs_acl nfs lockd grace sunrpc isofs aesni_intel
> aes_x86_64 crypto_simd cryptd glue_helper ata_piix sch_fq_codel autofs4
> [   12.676952] CPU: 2 PID: 0 Comm: swapper/2 Not tainted
> 4.14.2-pc64-dirty #1
> [   12.679225] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> BIOS 1.10.2-2.fc27 04/01/2014
> [   12.682095] task: ffff8c2a761edb80 task.stack: ffffa41cc0cac000
> [   12.684087] RIP: 0010:tipc_node_calculate_timer.isra.12+0x45/0x60 [tipc]
> [   12.686486] RSP: 0018:ffff8c2a7fc838a0 EFLAGS: 00010246
> [   12.688451] RAX: 0000000000000000 RBX: ffff8c2a5b382600 RCX:
> 0000000000000000
> [   12.691197] RDX: 0000000000000000 RSI: ffff8c2a5b382600 RDI:
> ffff8c2a5b382600
> [   12.693945] RBP: ffff8c2a7fc838b0 R08: 0000000000000001 R09:
> 0000000000000001
> [   12.696632] R10: 0000000000000000 R11: 0000000000000000 R12:
> ffff8c2a5d8949d8
> [   12.699491] R13: ffffffff95ede400 R14: 0000000000000000 R15:
> ffff8c2a5d894800
> [   12.702338] FS:  0000000000000000(0000) GS:ffff8c2a7fc80000(0000)
> knlGS:0000000000000000
> [   12.705099] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   12.706776] CR2: 0000000001bb9440 CR3: 00000000bd009001 CR4:
> 00000000003606e0
> [   12.708847] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> [   12.711016] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
> 0000000000000400
> [   12.712627] Call Trace:
> [   12.713390]  <IRQ>
> [   12.714011]  tipc_node_check_dest+0x2e8/0x350 [tipc]
> [   12.715286]  tipc_disc_rcv+0x14d/0x1d0 [tipc]
> [   12.716370]  tipc_rcv+0x8b0/0xd40 [tipc]
> [   12.717396]  ? minmax_running_min+0x2f/0x60
> [   12.718248]  ? dst_alloc+0x4c/0xa0
> [   12.718964]  ? tcp_ack+0xaf1/0x10b0
> [   12.719658]  ? tipc_udp_is_known_peer+0xa0/0xa0 [tipc]
> [   12.720634]  tipc_udp_recv+0x71/0x1d0 [tipc]
> [   12.721459]  ? dst_alloc+0x4c/0xa0
> [   12.722130]  udp_queue_rcv_skb+0x264/0x490
> [   12.722924]  __udp4_lib_rcv+0x21e/0x990
> [   12.723670]  ? ip_route_input_rcu+0x2dd/0xbf0
> [   12.724442]  ? tcp_v4_rcv+0x958/0xa40
> [   12.725039]  udp_rcv+0x1a/0x20
> [   12.725587]  ip_local_deliver_finish+0x97/0x1d0
> [   12.726323]  ip_local_deliver+0xaf/0xc0
> [   12.726959]  ? ip_route_input_noref+0x19/0x20
> [   12.727689]  ip_rcv_finish+0xdd/0x3b0
> [   12.728307]  ip_rcv+0x2ac/0x360
> [   12.728839]  __netif_receive_skb_core+0x6fb/0xa90
> [   12.729580]  ? udp4_gro_receive+0x1a7/0x2c0
> [   12.730274]  __netif_receive_skb+0x1d/0x60
> [   12.730953]  ? __netif_receive_skb+0x1d/0x60
> [   12.731637]  netif_receive_skb_internal+0x37/0xd0
> [   12.732371]  napi_gro_receive+0xc7/0xf0
> [   12.732920]  receive_buf+0x3c3/0xd40
> [   12.733441]  virtnet_poll+0xb1/0x250
> [   12.733944]  net_rx_action+0x23e/0x370
> [   12.734476]  __do_softirq+0xc5/0x2f8
> [   12.734922]  irq_exit+0xfa/0x100
> [   12.735315]  do_IRQ+0x4f/0xd0
> [   12.735680]  common_interrupt+0xa2/0xa2
> [   12.736126]  </IRQ>
> [   12.736416] RIP: 0010:native_safe_halt+0x6/0x10
> [   12.736925] RSP: 0018:ffffa41cc0cafe90 EFLAGS: 00000246 ORIG_RAX:
> ffffffffffffff4d
> [   12.737756] RAX: 0000000000000000 RBX: ffff8c2a761edb80 RCX:
> 0000000000000000
> [   12.738504] RDX: 0000000000000000 RSI: 0000000000000000 RDI:
> 0000000000000000
> [   12.739258] RBP: ffffa41cc0cafe90 R08: 0000014b5b9795e5 R09:
> ffffa41cc12c7e88
> [   12.740118] R10: 0000000000000000 R11: 0000000000000000 R12:
> 0000000000000002
> [   12.740964] R13: ffff8c2a761edb80 R14: 0000000000000000 R15:
> 0000000000000000
> [   12.741831]  default_idle+0x2a/0x100
> [   12.742323]  arch_cpu_idle+0xf/0x20
> [   12.742796]  default_idle_call+0x28/0x40
> [   12.743312]  do_idle+0x179/0x1f0
> [   12.743761]  cpu_startup_entry+0x1d/0x20
> [   12.744291]  start_secondary+0x112/0x120
> [   12.744816]  secondary_startup_64+0xa5/0xa5
> [   12.745367] Code: b9 f4 01 00 00 48 89 c2 48 c1 ea 02 48 3d d3 07 00
> 00 48 0f 47 d1 49 8b 0c 24 48 39 d1 76 07 49 89 14 24 48 89 d1 31 d2 48
> 89 df <48> f7 f1 89 c6 e8 81 6e ff ff 5b 41 5c 5d c3 66 90 66 2e 0f 1f
> [   12.747527] RIP: tipc_node_calculate_timer.isra.12+0x45/0x60 [tipc]
> RSP: ffff8c2a7fc838a0
> [   12.748555] ---[ end trace 1399ab83390650fd ]---
> [   12.749296] Kernel panic - not syncing: Fatal exception in interrupt
> [   12.750123] Kernel Offset: 0x13200000 from 0xffffffff82000000
> (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> [   12.751215] Rebooting in 60 seconds..
> 
> 
> [   12.561265] tipc: Activated (version 2.0.0)
> [   12.561997] NET: Registered protocol family 30
> [   12.562686] tipc: Started in single node mode
> [   12.569696] Started in network mode
> [   12.570401] Own node address <1.1.1>, network identity 1
> [   12.592022] HACK: sleeping 2 seconds!
> [   13.792592] divide error: 0000 [#1] SMP
> [   13.793539] Modules linked in: tipc ...
> [   13.799203] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.4.101-dirty #9
> [   13.800235] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> BIOS 1.10.2-2.fc27 04/01/2014
> [   13.801531] task: ffff8802355ebc00 ti: ffff8802355f8000 task.ti:
> ffff8802355f8000
> [   13.803134] RIP: 0010:[<ffffffffa03b6252>]  [<ffffffffa03b6252>]
> tipc_node_calculate_timer.isra.6+0x52/0x60 [tipc]
> [   13.804974] RSP: 0018:ffff88023fc838a0  EFLAGS: 00010246
> [   13.805806] RAX: 0000000000000000 RBX: ffff88023fc83996 RCX:
> 0000000000000000
> [   13.806826] RDX: 0000000000000000 RSI: ffff8802275fd454 RDI:
> 0000000000000000
> [   13.807890] RBP: ffff88023fc838b0 R08: 0000000000000001 R09:
> 0000000000000001
> [   13.808915] R10: 0000000000000000 R11: ffff8802275fd470 R12:
> ffff8802275fd460
> [   13.809865] R13: ffff8802275fd1d8 R14: 0000000000000000 R15:
> ffff8802275fd000
> [   13.810932] FS:  0000000000000000(0000) GS:ffff88023fc80000(0000)
> knlGS:0000000000000000
> [   13.812152] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   13.813140] CR2: 00007fa9aac6b6e0 CR3: 00000000b2e96000 CR4:
> 00000000003406e0
> [   13.814257] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> [   13.815513] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
> 0000000000000400
> [   13.816727] Stack:
> [   13.817256]  ffff88023258b400 0000000000000000 ffff88023fc83960
> ffffffffa03b7102
> [   13.819080]  ffffffff00000000 0000000000000000 ffffea000000edd8
> 0000000001001001
> [   13.820998]  ffff880001001004 ffff880200000002 ffff8800b928b000
> ffff8802275fcc00
> [   13.822880] Call Trace:
> [   13.823472]  <IRQ>
> [   13.823869]  [<ffffffffa03b7102>] tipc_node_check_dest+0x342/0x3f0 [tipc]
> [   13.825389]  [<ffffffffa03af2db>] tipc_disc_rcv+0x14b/0x1d0 [tipc]
> [   13.826571]  [<ffffffffa03b7ab1>] tipc_rcv+0x6d1/0xc40 [tipc]
> [   13.827619]  [<ffffffff8207dde5>] ? wake_up_process+0x15/0x20
> [   13.828656]  [<ffffffffa0288de0>] ? svc_xprt_do_enqueue+0x100/0x240
> [sunrpc]
> [   13.829865]  [<ffffffffa03be3c0>] ? tipc_udp_send_msg+0x240/0x240 [tipc]
> [   13.831033]  [<ffffffffa03be401>] tipc_udp_recv+0x41/0x80 [tipc]
> [   13.832126]  [<ffffffff82500c77>] udp_queue_rcv_skb+0x327/0x440
> [   13.833234]  [<ffffffff825013c4>] __udp4_lib_rcv+0x154/0x6d0
> [   13.834290]  [<ffffffff824cf321>] ? ip_route_input_noref+0x331/0xe20
> [   13.835410]  [<ffffffff82501e4a>] udp_rcv+0x1a/0x20

Powered by blists - more mailing lists