Message-Id: <8E92BAA8-0FC6-4D29-BB4D-B6B60047A1D2@gmail.com>
Date: Thu, 7 Dec 2023 00:26:38 +0200
From: Martin Zaharinov <micron10@...il.com>
To: Eric Dumazet <edumazet@...gle.com>
Cc: netdev <netdev@...r.kernel.org>,
 Paolo Abeni <pabeni@...hat.com>,
 patchwork-bot+netdevbpf@...nel.org,
 Jakub Kicinski <kuba@...nel.org>,
 Stephen Hemminger <stephen@...workplumber.org>,
 kuba+netdrv@...nel.org,
 dsahern@...il.com
Subject: Re: Urgent Bug Report Kernel crash 6.5.2

Hi all,

It is strange: the same problem occurs on 6.6.4, with the same debug log.

The hardware is different, the number of users is different, and …

The debug log points to the same place: lib/rcuref.c

The code at that line is:


        /*
         * If the reference count was already in the dead zone, then this
         * put() operation is imbalanced. Warn, put the reference count back to
         * DEAD and tell the caller to not deconstruct the object.
         */
        if (WARN_ONCE(cnt >= RCUREF_RELEASED, "rcuref - imbalanced put()")) {
                atomic_set(&ref->refcnt, RCUREF_DEAD);
                return false;
        }
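
To illustrate what this branch catches, here is a minimal user-space sketch of the put path. It is a single-threaded toy, not the kernel implementation: `struct model_ref`, `model_init()`, `model_put()` and `model_put_slowpath()` are hypothetical names, C11 atomics stand in for the kernel's atomic API, a counter stands in for WARN_ONCE, and the zone constants are reproduced from include/linux/rcuref.h as best recalled (the RCUREF_DEAD value matches the `movl $0xe0000000,(%rbx)` in the decoded code further down).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Zone boundaries; assumed to mirror include/linux/rcuref.h. */
#define RCUREF_ONEREF    0x00000000U
#define RCUREF_MAXREF    0x7FFFFFFFU
#define RCUREF_SATURATED 0xA0000000U
#define RCUREF_RELEASED  0xC0000000U
#define RCUREF_DEAD      0xE0000000U
#define RCUREF_NOREF     0xFFFFFFFFU

/* Hypothetical user-space stand-in for the kernel's rcuref_t. */
struct model_ref {
    atomic_uint refcnt;
    unsigned int imbalanced_puts; /* replaces WARN_ONCE in this sketch */
};

static void model_init(struct model_ref *ref, unsigned int refs)
{
    assert(refs >= 1);
    atomic_init(&ref->refcnt, refs - 1); /* stored value is count - 1 */
    ref->imbalanced_puts = 0;
}

static bool model_put_slowpath(struct model_ref *ref)
{
    unsigned int cnt = atomic_load(&ref->refcnt);

    /* The last reference was dropped: mark DEAD, caller may free. */
    if (cnt == RCUREF_NOREF)
        return atomic_compare_exchange_strong(&ref->refcnt, &cnt,
                                              RCUREF_DEAD);

    /* Already released or dead: this put() has no matching get(). */
    if (cnt >= RCUREF_RELEASED) {
        ref->imbalanced_puts++;
        fprintf(stderr, "model: imbalanced put()\n");
        atomic_store(&ref->refcnt, RCUREF_DEAD);
        return false;
    }

    /* Saturated counter: restore the mean saturation value. */
    if (cnt > RCUREF_MAXREF)
        atomic_store(&ref->refcnt, RCUREF_SATURATED);
    return false;
}

/* Returns true only when the caller dropped the last reference. */
static bool model_put(struct model_ref *ref)
{
    unsigned int now = atomic_fetch_sub(&ref->refcnt, 1) - 1;

    if (now <= RCUREF_MAXREF) /* still in the valid zone */
        return false;
    return model_put_slowpath(ref);
}
```

With an object initialised with a single reference, the first model_put() returns true (the object may be freed) and a second model_put() on the same object lands in the imbalanced branch. In the traces here that would correspond to dst_release() being called once more than the dst was held, although the sketch says nothing about where the extra put comes from.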


[529520.875413] CPU: 13 PID: 0 Comm: swapper/13 Tainted: G           O       6.6.3 #1
[529520.875533] Hardware name: Supermicro SYS-5038MR-H8TRF/X10SRD-F, BIOS 3.3 10/28/2020
[529520.875653] RIP: 0010:rcuref_put_slowpath+0x5f/0x70
[529520.875748] Code: 31 c0 eb e2 80 3d 9e d1 e6 00 00 74 0a c7 03 00 00 00 e0 31 c0 eb cf 48 c7 c7 d9 96 e3 8f c6 05 84 d1 e6 00 01 e8 41 9d c7 ff <0f> 0b eb df cc cc cc cc cc cc cc cc cc cc cc cc cc 48 89 fa 83 e2
[529520.875908] RSP: 0018:ffffa823c052cde8 EFLAGS: 00010296
[529520.876003] RAX: 0000000000000019 RBX: ffffa0f049053180 RCX: 00000000fff7ffff
[529520.876122] RDX: 00000000fff7ffff RSI: 0000000000000001 RDI: 00000000ffffffea
[529520.876244] RBP: ffffa0f0a8fffec0 R08: 0000000000000000 R09: 00000000fff7ffff
[529520.876364] R10: ffffa0f79ae00000 R11: 0000000000000003 R12: ffffa0f04655f000
[529520.876482] R13: 0000000000000258 R14: ffffa0f16ade1000 R15: ffffa0f79f964bd0
[529520.876601] FS:  0000000000000000(0000) GS:ffffa0f79f940000(0000) knlGS:0000000000000000
[529520.876723] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[529520.876822] CR2: 00007fa9bd56b3c8 CR3: 000000016e43e002 CR4: 00000000003706e0
[529520.877043] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[529520.877164] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[529520.877287] Call Trace:
[529520.877382]  <IRQ>
[529520.877472]  ? __warn+0x6c/0x130
[529520.877566]  ? report_bug+0x1b8/0x200
[529520.877661]  ? handle_bug+0x36/0x70
[529520.877753]  ? exc_invalid_op+0x17/0x1a0
[529520.877849]  ? asm_exc_invalid_op+0x16/0x20
[529520.877947]  ? rcuref_put_slowpath+0x5f/0x70
[529520.878043]  ? rcuref_put_slowpath+0x5f/0x70
[529520.878136]  dst_release+0x1c/0x40
[529520.878229]  __dev_queue_xmit+0x594/0xcd0
[529520.878324]  ? eth_header+0x25/0xc0
[529520.878417]  ip_finish_output2+0x1a0/0x530
[529520.878514]  process_backlog+0x107/0x210
[529520.878610]  __napi_poll+0x20/0x180
[529520.878702]  net_rx_action+0x29f/0x380
[529520.878935]  __do_softirq+0xd0/0x202
[529520.879033]  do_softirq+0x3a/0x50
[529520.879127]  </IRQ>
[529520.879217]  <TASK>
[529520.879306]  flush_smp_call_function_queue+0x3f/0x50
[529520.879407]  do_idle+0x14d/0x210
[529520.879500]  cpu_startup_entry+0x21/0x30
[529520.879597]  start_secondary+0xe1/0xf0
[529520.879693]  secondary_startup_64_no_verify+0x166/0x16b
[529520.879793]  </TASK>
[529520.879884] ---[ end trace 0000000000000000 ]---


m.

> On 16 Nov 2023, at 16:17, Martin Zaharinov <micron10@...il.com> wrote:
> 
> Hi All
> 
> I am reporting the same problem with kernel 6.6.1. I think the problem is in RCU, but … if there is an option to add the RCU people here, please do.
> 
> See the report:
> 
> 
> 
> [141229.505339] ------------[ cut here ]------------
> [141229.505492] rcuref - imbalanced put()
> [141229.505504] WARNING: CPU: 8 PID: 0 at lib/rcuref.c:267 rcuref_put_slowpath (lib/rcuref.c:267 (discriminator 1))
> [141229.505821] Modules linked in: xsk_diag unix_diag iptable_filter xt_TCPMSS iptable_mangle xt_addrtype xt_nat xt_MASQUERADE iptable_nat ip_tables netconsole coretemp e1000 ixgbe mdio pppoe pppox sha1_ssse3 sha1_generic ppp_mppe libarc4 ppp_generic slhc nf_nat_sip nf_conntrack_sip nf_nat_pptp nf_conntrack_pptp nf_nat_tftp nf_conntrack_tftp nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4
> [141229.506349] CPU: 8 PID: 0 Comm: swapper/8 Tainted: G           O       6.6.1 #1
> [141229.506527] Hardware name: Persy Super Server/X11DDW-L, BIOS 4.0 07/11/2023
> [141229.506701] RIP: 0010:rcuref_put_slowpath (lib/rcuref.c:267 (discriminator 1))
> [141229.506843] Code: 31 c0 eb e2 80 3d ef 4e e6 00 00 74 0a c7 03 00 00 00 e0 31 c0 eb cf 48 c7 c7 07 99 e3 97 c6 05 d5 4e e6 00 01 e8 d1 1f c7 ff <0f> 0b eb df cc cc cc cc cc cc cc cc cc cc cc cc cc 48 89 fa 83 e2
> All code
> ========
>   0: 31 c0                 xor    %eax,%eax
>   2: eb e2                 jmp    0xffffffffffffffe6
>   4: 80 3d ef 4e e6 00 00 cmpb   $0x0,0xe64eef(%rip)        # 0xe64efa
>   b: 74 0a                 je     0x17
>   d: c7 03 00 00 00 e0     movl   $0xe0000000,(%rbx)
>  13: 31 c0                 xor    %eax,%eax
>  15: eb cf                 jmp    0xffffffffffffffe6
>  17: 48 c7 c7 07 99 e3 97 mov    $0xffffffff97e39907,%rdi
>  1e: c6 05 d5 4e e6 00 01 movb   $0x1,0xe64ed5(%rip)        # 0xe64efa
>  25: e8 d1 1f c7 ff        call   0xffffffffffc71ffb
>  2a:* 0f 0b                 ud2     <-- trapping instruction
>  2c: eb df                 jmp    0xd
>  2e: cc                    int3
>  2f: cc                    int3
>  30: cc                    int3
>  31: cc                    int3
>  32: cc                    int3
>  33: cc                    int3
>  34: cc                    int3
>  35: cc                    int3
>  36: cc                    int3
>  37: cc                    int3
>  38: cc                    int3
>  39: cc                    int3
>  3a: cc                    int3
>  3b: 48 89 fa              mov    %rdi,%rdx
>  3e: 83                    .byte 0x83
>  3f: e2                    .byte 0xe2
> 
> Code starting with the faulting instruction
> ===========================================
>   0: 0f 0b                 ud2
>   2: eb df                 jmp    0xffffffffffffffe3
>   4: cc                    int3
>   5: cc                    int3
>   6: cc                    int3
>   7: cc                    int3
>   8: cc                    int3
>   9: cc                    int3
>   a: cc                    int3
>   b: cc                    int3
>   c: cc                    int3
>   d: cc                    int3
>   e: cc                    int3
>   f: cc                    int3
>  10: cc                    int3
>  11: 48 89 fa              mov    %rdi,%rdx
>  14: 83                    .byte 0x83
>  15: e2                    .byte 0xe2
> [141229.507086] RSP: 0018:ffffa444449e0978 EFLAGS: 00010296
> [141229.507229] RAX: 0000000000000019 RBX: ffff9b54866a4100 RCX: 00000000fff7ffff
> [141229.507404] RDX: 00000000fff7ffff RSI: 0000000000000001 RDI: 00000000ffffffea
> [141229.507577] RBP: ffff9b53e57b1ec0 R08: 0000000000000000 R09: 00000000fff7ffff
> [141229.507751] R10: ffff9b62db200000 R11: 0000000000000003 R12: ffff9b5b0595c000
> [141229.507929] R13: ffff9b5b09c32200 R14: ffff9b5b09e29a00 R15: ffff9b5b0557e080
> [141229.508101] FS:  0000000000000000(0000) GS:ffff9b62dfa00000(0000) knlGS:0000000000000000
> [141229.508279] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [141229.508425] CR2: 00007fbadced6a80 CR3: 000000096f014002 CR4: 00000000003706e0
> [141229.508599] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [141229.508773] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [141229.508947] Call Trace:
> [141229.509079]  <IRQ>
> [141229.509206] ? __warn (kernel/panic.c:235 kernel/panic.c:673)
> [141229.509342] ? report_bug (lib/bug.c:180 lib/bug.c:219)
> [141229.509482] ? handle_bug (arch/x86/kernel/traps.c:237)
> [141229.509617] ? exc_invalid_op (arch/x86/kernel/traps.c:258 (discriminator 1))
> [141229.509751] ? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:568)
> [141229.509892] ? rcuref_put_slowpath (lib/rcuref.c:267 (discriminator 1))
> [141229.510028] ? rcuref_put_slowpath (lib/rcuref.c:267 (discriminator 1))
> [141229.510164] dst_release (./arch/x86/include/asm/preempt.h:95 ./include/linux/rcuref.h:151 net/core/dst.c:166)
> [141229.510302] __dev_queue_xmit (./include/net/dst.h:283 net/core/dev.c:4324)
> [141229.510441] vlan_dev_hard_start_xmit (net/8021q/vlan_dev.c:130)
> [141229.510584] dev_hard_start_xmit (./include/linux/netdevice.h:4904 net/core/dev.c:3573 net/core/dev.c:3589)
> [141229.510722] __dev_queue_xmit (./include/linux/netdevice.h:3278 (discriminator 25) net/core/dev.c:4370 (discriminator 25))
> [141229.510862] ? eth_header (net/ethernet/eth.c:85)
> [141229.510998] ip_finish_output2 (./include/net/neighbour.h:542 (discriminator 2) net/ipv4/ip_output.c:233 (discriminator 2))
> [141229.511135] ip_sabotage_in (net/bridge/br_netfilter_hooks.c:881 net/bridge/br_netfilter_hooks.c:866)
> [141229.511269] nf_hook_slow (./include/linux/netfilter.h:144 net/netfilter/core.c:626)
> [141229.511406] ip_rcv (./include/linux/netfilter.h:259 ./include/linux/netfilter.h:302 net/ipv4/ip_input.c:569)
> [141229.511540] ? ip_rcv_core.constprop.0 (net/ipv4/ip_input.c:436)
> [141229.511678] netif_receive_skb (net/core/dev.c:5552 net/core/dev.c:5666 net/core/dev.c:5752 net/core/dev.c:5811)
> [141229.511814] br_handle_frame_finish (net/bridge/br_input.c:216)
> [141229.511954] ? br_pass_frame_up (net/bridge/br_input.c:75)
> [141229.512092] br_nf_hook_thresh (net/bridge/br_netfilter_hooks.c:1051)
> [141229.512227] ? br_pass_frame_up (net/bridge/br_input.c:75)
> [141229.512363] br_nf_pre_routing_finish (net/bridge/br_netfilter_hooks.c:427)
> [141229.512501] ? br_pass_frame_up (net/bridge/br_input.c:75)
> [141229.512644] ? nf_nat_ipv4_pre_routing (net/netfilter/nf_nat_proto.c:656) nf_nat
> [141229.512792] br_nf_pre_routing (net/bridge/br_netfilter_hooks.c:538)
> [141229.512928] ? br_nf_hook_thresh (net/bridge/br_netfilter_hooks.c:354)
> [141229.513061] br_handle_frame (./include/linux/netfilter.h:144 net/bridge/br_input.c:272 net/bridge/br_input.c:417)
> [141229.513196] ? br_pass_frame_up (net/bridge/br_input.c:75)
> [141229.513333] __netif_receive_skb_core.constprop.0 (net/core/dev.c:5446 (discriminator 1))
> [141229.513475] ? ip_finish_output2 (net/ipv4/ip_output.c:243)
> [141229.513613] process_backlog (net/core/dev.c:5551 net/core/dev.c:5666 net/core/dev.c:5994)
> [141229.513749] __napi_poll (net/core/dev.c:6556)
> [141229.513887] net_rx_action (net/core/dev.c:6625 net/core/dev.c:6756)
> [141229.514023] __do_softirq (./arch/x86/include/asm/preempt.h:27 kernel/softirq.c:564)
> [141229.514158] do_softirq (kernel/softirq.c:463 (discriminator 32) kernel/softirq.c:450 (discriminator 32))
> [141229.514292]  </IRQ>
> [141229.514420]  <TASK>
> [141229.514548] flush_smp_call_function_queue (./arch/x86/include/asm/irqflags.h:134 (discriminator 1) kernel/smp.c:579 (discriminator 1))
> [141229.514688] do_idle (kernel/sched/idle.c:314)
> [141229.514822] cpu_startup_entry (kernel/sched/idle.c:379)
> [141229.516148] start_secondary (arch/x86/kernel/smpboot.c:326)
> [141229.516291] secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:433)
> [141229.516435]  </TASK>
> [141229.516562] ---[ end trace 0000000000000000 ]---
> 
> 
> Best regards,
> Martin
> 
> 
> 
>> On 15 Sep 2023, at 9:45, Eric Dumazet <edumazet@...gle.com> wrote:
>> 
>> scripts/decode_stacktrace.sh
> 
> 

