lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 27 Nov 2017 11:50:48 +0200
From:   Tommi Rantala <tommi.t.rantala@...ia.com>
To:     Jon Maloy <jon.maloy@...csson.com>,
        Ying Xue <ying.xue@...driver.com>,
        "David S. Miller" <davem@...emloft.net>, netdev@...r.kernel.org,
        tipc-discussion@...ts.sourceforge.net,
        open list <linux-kernel@...r.kernel.org>
Subject: tipc_node_calculate_timer div-by-zero

Hi,

I'm seeing a rare TIPC div-by-zero crash in tipc_node_calculate_timer().

If I get it right, we're receiving a discovery packet while enabling a 
bearer. The bearer is not yet fully initialized, causing the discovery 
packet processing to use zero tolerance value, which then causes the 
div-by-zero in tipc_node_calculate_timer().

I can reproduce the issue easily by adding some sleep in 
tipc_enable_bearer(), and the machine always crashes after booting it up:

diff --git a/net/tipc/bearer.c b/net/tipc/bearer.c
index 47ec121574ce..00bdd87cd614 100644
--- a/net/tipc/bearer.c
+++ b/net/tipc/bearer.c
@@ -327,12 +327,16 @@ static int tipc_enable_bearer(struct net *net, 
const char *name,
                 return -EINVAL;
         }

+       pr_warn("HACK: sleeping 2 seconds!");
+       usleep_range(2000000, 2100000);
+
         b->identity = bearer_id;
         b->tolerance = m->tolerance;
         b->window = m->window;


In 4.14 removing the latter tipc_rcv() call in tipc_udp_recv() [which 
was added in in commit c9b64d492b1f (tipc: add replicast peer 
discovery)] seems to fix the crash, but I do not really understand the code:

diff --git a/net/tipc/udp_media.c b/net/tipc/udp_media.c
index ecca64fc6a6f..371653acf1f6 100644
--- a/net/tipc/udp_media.c
+++ b/net/tipc/udp_media.c
@@ -371,9 +371,11 @@ static int tipc_udp_recv(struct sock *sk, struct 
sk_buff *skb)
                         goto rcu_out;
         }

+       /*
         tipc_rcv(sock_net(sk), skb, b);
         rcu_read_unlock();
         return 0;
+       */

  rcu_out:
         rcu_read_unlock();



I have tested 4.4 and 4.14, and the bug is reproducible in both versions.

Comments?

-Tommi



[   12.576885] tipc: Activated (version 2.0.0)
[   12.577506] NET: Registered protocol family 30
[   12.578212] tipc: Started in single node mode
[   12.589907] Started in network mode
[   12.590450] Own node address <1.1.1>, network identity 1
[   12.592022] HACK: sleeping 2 seconds!
[   12.668088] divide error: 0000 [#1] SMP
[   12.670993] Modules linked in: tipc ip6_udp_tunnel udp_tunnel nfsd 
auth_rpcgss oid_registry nfs_acl nfs lockd grace sunrpc isofs 
aesni_intel aes_x86_64 crypto_simd cryptd glue_helper ata_piix 
sch_fq_codel autofs4
[   12.676952] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 
4.14.2-pc64-dirty #1
[   12.679225] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), 
BIOS 1.10.2-2.fc27 04/01/2014
[   12.682095] task: ffff8c2a761edb80 task.stack: ffffa41cc0cac000
[   12.684087] RIP: 0010:tipc_node_calculate_timer.isra.12+0x45/0x60 [tipc]
[   12.686486] RSP: 0018:ffff8c2a7fc838a0 EFLAGS: 00010246
[   12.688451] RAX: 0000000000000000 RBX: ffff8c2a5b382600 RCX: 
0000000000000000
[   12.691197] RDX: 0000000000000000 RSI: ffff8c2a5b382600 RDI: 
ffff8c2a5b382600
[   12.693945] RBP: ffff8c2a7fc838b0 R08: 0000000000000001 R09: 
0000000000000001
[   12.696632] R10: 0000000000000000 R11: 0000000000000000 R12: 
ffff8c2a5d8949d8
[   12.699491] R13: ffffffff95ede400 R14: 0000000000000000 R15: 
ffff8c2a5d894800
[   12.702338] FS:  0000000000000000(0000) GS:ffff8c2a7fc80000(0000) 
knlGS:0000000000000000
[   12.705099] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   12.706776] CR2: 0000000001bb9440 CR3: 00000000bd009001 CR4: 
00000000003606e0
[   12.708847] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 
0000000000000000
[   12.711016] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 
0000000000000400
[   12.712627] Call Trace:
[   12.713390]  <IRQ>
[   12.714011]  tipc_node_check_dest+0x2e8/0x350 [tipc]
[   12.715286]  tipc_disc_rcv+0x14d/0x1d0 [tipc]
[   12.716370]  tipc_rcv+0x8b0/0xd40 [tipc]
[   12.717396]  ? minmax_running_min+0x2f/0x60
[   12.718248]  ? dst_alloc+0x4c/0xa0
[   12.718964]  ? tcp_ack+0xaf1/0x10b0
[   12.719658]  ? tipc_udp_is_known_peer+0xa0/0xa0 [tipc]
[   12.720634]  tipc_udp_recv+0x71/0x1d0 [tipc]
[   12.721459]  ? dst_alloc+0x4c/0xa0
[   12.722130]  udp_queue_rcv_skb+0x264/0x490
[   12.722924]  __udp4_lib_rcv+0x21e/0x990
[   12.723670]  ? ip_route_input_rcu+0x2dd/0xbf0
[   12.724442]  ? tcp_v4_rcv+0x958/0xa40
[   12.725039]  udp_rcv+0x1a/0x20
[   12.725587]  ip_local_deliver_finish+0x97/0x1d0
[   12.726323]  ip_local_deliver+0xaf/0xc0
[   12.726959]  ? ip_route_input_noref+0x19/0x20
[   12.727689]  ip_rcv_finish+0xdd/0x3b0
[   12.728307]  ip_rcv+0x2ac/0x360
[   12.728839]  __netif_receive_skb_core+0x6fb/0xa90
[   12.729580]  ? udp4_gro_receive+0x1a7/0x2c0
[   12.730274]  __netif_receive_skb+0x1d/0x60
[   12.730953]  ? __netif_receive_skb+0x1d/0x60
[   12.731637]  netif_receive_skb_internal+0x37/0xd0
[   12.732371]  napi_gro_receive+0xc7/0xf0
[   12.732920]  receive_buf+0x3c3/0xd40
[   12.733441]  virtnet_poll+0xb1/0x250
[   12.733944]  net_rx_action+0x23e/0x370
[   12.734476]  __do_softirq+0xc5/0x2f8
[   12.734922]  irq_exit+0xfa/0x100
[   12.735315]  do_IRQ+0x4f/0xd0
[   12.735680]  common_interrupt+0xa2/0xa2
[   12.736126]  </IRQ>
[   12.736416] RIP: 0010:native_safe_halt+0x6/0x10
[   12.736925] RSP: 0018:ffffa41cc0cafe90 EFLAGS: 00000246 ORIG_RAX: 
ffffffffffffff4d
[   12.737756] RAX: 0000000000000000 RBX: ffff8c2a761edb80 RCX: 
0000000000000000
[   12.738504] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 
0000000000000000
[   12.739258] RBP: ffffa41cc0cafe90 R08: 0000014b5b9795e5 R09: 
ffffa41cc12c7e88
[   12.740118] R10: 0000000000000000 R11: 0000000000000000 R12: 
0000000000000002
[   12.740964] R13: ffff8c2a761edb80 R14: 0000000000000000 R15: 
0000000000000000
[   12.741831]  default_idle+0x2a/0x100
[   12.742323]  arch_cpu_idle+0xf/0x20
[   12.742796]  default_idle_call+0x28/0x40
[   12.743312]  do_idle+0x179/0x1f0
[   12.743761]  cpu_startup_entry+0x1d/0x20
[   12.744291]  start_secondary+0x112/0x120
[   12.744816]  secondary_startup_64+0xa5/0xa5
[   12.745367] Code: b9 f4 01 00 00 48 89 c2 48 c1 ea 02 48 3d d3 07 00 
00 48 0f 47 d1 49 8b 0c 24 48 39 d1 76 07 49 89 14 24 48 89 d1 31 d2 48 
89 df <48> f7 f1 89 c6 e8 81 6e ff ff 5b 41 5c 5d c3 66 90 66 2e 0f 1f
[   12.747527] RIP: tipc_node_calculate_timer.isra.12+0x45/0x60 [tipc] 
RSP: ffff8c2a7fc838a0
[   12.748555] ---[ end trace 1399ab83390650fd ]---
[   12.749296] Kernel panic - not syncing: Fatal exception in interrupt
[   12.750123] Kernel Offset: 0x13200000 from 0xffffffff82000000 
(relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[   12.751215] Rebooting in 60 seconds..


[   12.561265] tipc: Activated (version 2.0.0)
[   12.561997] NET: Registered protocol family 30
[   12.562686] tipc: Started in single node mode
[   12.569696] Started in network mode
[   12.570401] Own node address <1.1.1>, network identity 1
[   12.592022] HACK: sleeping 2 seconds!
[   13.792592] divide error: 0000 [#1] SMP
[   13.793539] Modules linked in: tipc ...
[   13.799203] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.4.101-dirty #9
[   13.800235] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), 
BIOS 1.10.2-2.fc27 04/01/2014
[   13.801531] task: ffff8802355ebc00 ti: ffff8802355f8000 task.ti: 
ffff8802355f8000
[   13.803134] RIP: 0010:[<ffffffffa03b6252>]  [<ffffffffa03b6252>] 
tipc_node_calculate_timer.isra.6+0x52/0x60 [tipc]
[   13.804974] RSP: 0018:ffff88023fc838a0  EFLAGS: 00010246
[   13.805806] RAX: 0000000000000000 RBX: ffff88023fc83996 RCX: 
0000000000000000
[   13.806826] RDX: 0000000000000000 RSI: ffff8802275fd454 RDI: 
0000000000000000
[   13.807890] RBP: ffff88023fc838b0 R08: 0000000000000001 R09: 
0000000000000001
[   13.808915] R10: 0000000000000000 R11: ffff8802275fd470 R12: 
ffff8802275fd460
[   13.809865] R13: ffff8802275fd1d8 R14: 0000000000000000 R15: 
ffff8802275fd000
[   13.810932] FS:  0000000000000000(0000) GS:ffff88023fc80000(0000) 
knlGS:0000000000000000
[   13.812152] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   13.813140] CR2: 00007fa9aac6b6e0 CR3: 00000000b2e96000 CR4: 
00000000003406e0
[   13.814257] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 
0000000000000000
[   13.815513] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 
0000000000000400
[   13.816727] Stack:
[   13.817256]  ffff88023258b400 0000000000000000 ffff88023fc83960 
ffffffffa03b7102
[   13.819080]  ffffffff00000000 0000000000000000 ffffea000000edd8 
0000000001001001
[   13.820998]  ffff880001001004 ffff880200000002 ffff8800b928b000 
ffff8802275fcc00
[   13.822880] Call Trace:
[   13.823472]  <IRQ>
[   13.823869]  [<ffffffffa03b7102>] tipc_node_check_dest+0x342/0x3f0 [tipc]
[   13.825389]  [<ffffffffa03af2db>] tipc_disc_rcv+0x14b/0x1d0 [tipc]
[   13.826571]  [<ffffffffa03b7ab1>] tipc_rcv+0x6d1/0xc40 [tipc]
[   13.827619]  [<ffffffff8207dde5>] ? wake_up_process+0x15/0x20
[   13.828656]  [<ffffffffa0288de0>] ? svc_xprt_do_enqueue+0x100/0x240 
[sunrpc]
[   13.829865]  [<ffffffffa03be3c0>] ? tipc_udp_send_msg+0x240/0x240 [tipc]
[   13.831033]  [<ffffffffa03be401>] tipc_udp_recv+0x41/0x80 [tipc]
[   13.832126]  [<ffffffff82500c77>] udp_queue_rcv_skb+0x327/0x440
[   13.833234]  [<ffffffff825013c4>] __udp4_lib_rcv+0x154/0x6d0
[   13.834290]  [<ffffffff824cf321>] ? ip_route_input_noref+0x331/0xe20
[   13.835410]  [<ffffffff82501e4a>] udp_rcv+0x1a/0x20

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ