Message-ID: <b833aea6-b499-4b9c-90fe-aab31510544d@intel.com>
Date: Mon, 16 Sep 2024 12:32:09 +0200
From: Przemek Kitszel <przemyslaw.kitszel@...el.com>
To: Ben Greear <greearb@...delatech.com>, netdev <netdev@...r.kernel.org>
CC: Jan Glaza <jan.glaza@...el.com>, Aleksandr Loktionov
	<aleksandr.loktionov@...el.com>, "intel-wired-lan@...ts.osuosl.org"
	<intel-wired-lan@...ts.osuosl.org>
Subject: Re: tcp_ack __list_del crash in 6.10.3+ hacks

On 9/14/24 07:27, Ben Greear wrote:
> Hello,
> 
> We found this during a long duration network test where we are using
> lots of wifi network devices in a single system, talking with

This will be really hard for us to reproduce, but we would still like to help.

> an intel 10g

You are more likely to get Intel's help if you also mail our IWL list
(CCed, and adding Aleksandr for ixgbe expertise).


> NIC in the same system (using vrfs and such).  The system ran around
> 7 hours before it crashed.  Seems to be a null pointer in a list, but
> I'm not having great luck understanding where exactly in the large tcp_ack
> method this is happening.  Any suggestions for how to get more relevant
> info out of gdb?
> 
> BUG: kernel NULL pointer dereference, address: 0000000000000008
> #PF: supervisor write access in kernel mode
> #PF: error_code(0x0002) - not-present page
> PGD 115855067 P4D 115855067 PUD 283ed3067 PMD 0
> Oops: Oops: 0002 [#1] PREEMPT SMP
> CPU: 6 PID: 115673 Comm: btserver Tainted: G           O       6.10.3+ #57
> Hardware name: Default string Default string/SKYBAY, BIOS 5.12 08/04/2020
> RIP: 0010:tcp_ack+0x62e/0x1530
> Code: 9c 24 80 05 00 00 0f 84 56 09 00 00 49 39 9c 24 50 06 00 00 0f 84 b2 04 00 00 48 8b 53 58 48 8b 43 60 48 89 df 48 8b 74 24 28 <48> 89 42 08 48 89 10 48 c7 43 60 00 00 00 00 48 c7 43 58 00 00 00
> RSP: 0018:ffffc9000027c998 EFLAGS: 00010207
> RAX: 0000000000000000 RBX: ffff8881226a8800 RCX: ffff8881226abe01
> RDX: 0000000000000000 RSI: ffff888126a3d4c8 RDI: ffff8881226a8800
> RBP: ffffc9000027ca28 R08: 000000000005edf6 R09: 0000000000000000
> R10: 0000000000000008 R11: 0000000084d9074f R12: ffff888126a3d340
> R13: 0000000000000004 R14: ffff8881226aac00 R15: 0000000000000000
> FS:  00007efc82a2f7c0(0000) GS:ffff88845dd80000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000008 CR3: 0000000125477006 CR4: 00000000003706f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>   <IRQ>
>   ? __die+0x1a/0x60
>   ? page_fault_oops+0x150/0x500
>   ? exc_page_fault+0x6f/0x160
>   ? asm_exc_page_fault+0x22/0x30
>   ? tcp_ack+0x62e/0x1530
>   ? tcp_ack+0x5f1/0x1530
>   ? tcp_schedule_loss_probe+0x101/0x1d0
>   tcp_rcv_established+0x168/0x750
>   tcp_v4_do_rcv+0x13f/0x270
>   tcp_v4_rcv+0x1236/0x15f0
>   ? udp_lib_lport_inuse+0x100/0x100
>   ? raw_local_deliver+0xc8/0x250
>   ip_protocol_deliver_rcu+0x1b/0x290
>   ip_local_deliver_finish+0x6d/0x90
>   ip_sublist_rcv_finish+0x2d/0x40
>   ip_sublist_rcv+0x160/0x200
>   ? __netif_receive_skb_core.constprop.0+0x30d/0xf80
>   ip_list_rcv+0xca/0x120
>   __netif_receive_skb_list_core+0x17f/0x1e0
>   netif_receive_skb_list_internal+0x1c5/0x290
>   napi_complete_done+0x69/0x180
>   ixgbe_poll+0xd93/0x13d0 [ixgbe]
>   __napi_poll+0x20/0x1a0
>   net_rx_action+0x2af/0x310
>   handle_softirqs+0xc8/0x2b0
>   __irq_exit_rcu+0x5f/0x80
>   common_interrupt+0x81/0xa0
>   </IRQ>
> 
> (gdb) l *(tcp_ack+0x62e)
> 0xffffffff81c8601e is in tcp_ack (/home/greearb/git/linux-6.10.dev.y/include/linux/list.h:195).
> 190     * This is only for internal list manipulation where we know
> 191     * the prev/next entries already!
> 192     */
> 193    static inline void __list_del(struct list_head * prev, struct list_head * next)
> 194    {
> 195        next->prev = prev;
> 196        WRITE_ONCE(prev->next, next);
> 197    }
> 198
> 199    /*
> (gdb) l *(tcp_rcv_established+0x168)
> 0xffffffff81c88b88 is in tcp_rcv_established (/home/greearb/git/linux-6.10.dev.y/net/ipv4/tcp_input.c:6209).
> 6204
> 6205        if (!tcp_validate_incoming(sk, skb, th, 1))
> 6206            return;
> 6207
> 6208    step5:
> 6209        reason = tcp_ack(sk, skb, FLAG_SLOWPATH | FLAG_UPDATE_TS_RECENT);
> 6210        if ((int)reason < 0) {
> 6211            reason = -reason;
> 6212            goto discard;
> 6213        }
> (gdb)
> 
> Thanks,
> Ben
> 
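
A note on reading the oops quoted above, offered as a sketch rather than
a diagnosis: CR2 is 0x8 and RDX is 0, which is what the store
"next->prev = prev" at list.h:195 would produce if "next" were NULL,
since "prev" is the second pointer in struct list_head. The snippet below
is a userspace stand-in that only does the address arithmetic. The struct
mirrors the kernel layout; list_del_write_addr() and
check_reported_fault() are hypothetical helper names, and 64-bit pointers
are assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in with the same layout as the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/*
 * Address that "next->prev = prev" would write to for a given value of
 * next. Computed arithmetically, so nothing is actually dereferenced.
 */
static uintptr_t list_del_write_addr(uintptr_t next)
{
	return next + offsetof(struct list_head, prev);
}

/*
 * Assumption: 64-bit build (8-byte pointers). With next == NULL the write
 * lands at address 0x8, the value reported in CR2 in the oops.
 */
static void check_reported_fault(void)
{
	assert(list_del_write_addr(0) == 0x8);
}
```

This only shows that the fault address is consistent with a NULL "next"
pointer in __list_del(); it says nothing about how the list entry came to
be zeroed.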

