Message-ID: <CAA85sZunA=tf0FgLH=MNVYq3Edewb1j58oBAoXE1Tyuy3GJObg@mail.gmail.com>
Date: Mon, 26 Jun 2023 20:01:50 +0200
From: Ian Kumlien <ian.kumlien@...il.com>
To: Paolo Abeni <pabeni@...hat.com>
Cc: Alexander Lobakin <aleksander.lobakin@...el.com>, 
	intel-wired-lan <intel-wired-lan@...ts.osuosl.org>, Jakub Kicinski <kuba@...nel.org>, 
	Eric Dumazet <edumazet@...gle.com>, "netdev@...r.kernel.org" <netdev@...r.kernel.org>, 
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?

On Mon, Jun 26, 2023 at 7:56 PM Paolo Abeni <pabeni@...hat.com> wrote:
>
> On Mon, 2023-06-26 at 19:30 +0200, Ian Kumlien wrote:
> > There, that didn't take long, even with wireguard disabled
> >
> > [14079.678380] BUG: kernel NULL pointer dereference, address: 00000000000000c0
> > [14079.685456] #PF: supervisor read access in kernel mode
> > [14079.690686] #PF: error_code(0x0000) - not-present page
> > [14079.695915] PGD 0 P4D 0
> > [14079.698540] Oops: 0000 [#1] PREEMPT SMP NOPTI
> > [14079.702996] CPU: 11 PID: 891 Comm: napi/eno2-80 Not tainted 6.4.0 #360
> > [14079.709614] Hardware name: Supermicro Super Server/A2SDi-12C-HLN4F,
> > BIOS 1.7a 10/13/2022
> > [14079.717796] RIP: 0010:__udp_gso_segment+0x346/0x4f0
> > [14079.722778] Code: c3 08 66 89 5c 02 04 45 84 e4 0f 85 27 fd ff ff
> > 49 8b 1e 49 8b ae c0 00 00 00 41 0f b7 86 b4 00 00 00 45 0f b7 a6 b2
> > 00 00 00 <48> 8b b3 c0 00 00 00 0f b7 8b b2 00 00 00 49 01 ec 48 01 c5
> > 48 8d
> > [14079.741645] RSP: 0018:ffffa83643a4f818 EFLAGS: 00010246
> > [14079.746966] RAX: 00000000000000ce RBX: 0000000000000000 RCX: 0000000000000000
> > [14079.754195] RDX: ffffa2ad1403b000 RSI: 0000000000000028 RDI: ffffa2afc9d302d4
> > [14079.761422] RBP: ffffa2ad1403b000 R08: 0000000000000022 R09: 00002000001558c9
> > [14079.768650] R10: 0000000000000000 R11: ffffa2b02fcea888 R12: 00000000000000e2
> > [14079.775879] R13: ffffa2afc9d30200 R14: ffffa2afc9d30200 R15: 00002000001558c9
> > [14079.783106] FS:  0000000000000000(0000) GS:ffffa2b02fcc0000(0000)
> > knlGS:0000000000000000
> > [14079.791305] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [14079.797162] CR2: 00000000000000c0 CR3: 0000000151ff4000 CR4: 00000000003526e0
> > [14079.804408] Call Trace:
> > [14079.806961]  <TASK>
> > [14079.809170]  ? __die+0x1a/0x60
> > [14079.812340]  ? page_fault_oops+0x158/0x440
> > [14079.816551]  ? ip6_route_output_flags+0xe3/0x160
> > [14079.821284]  ? exc_page_fault+0x3f4/0x820
> > [14079.825408]  ? update_load_avg+0x77/0x710
> > [14079.829534]  ? asm_exc_page_fault+0x22/0x30
> > [14079.833836]  ? __udp_gso_segment+0x346/0x4f0
> > [14079.838218]  ? __udp_gso_segment+0x2fa/0x4f0
> > [14079.842600]  ? _raw_spin_unlock_irqrestore+0x16/0x30
> > [14079.847679]  ? try_to_wake_up+0x8e/0x5a0
> > [14079.851713]  inet_gso_segment+0x150/0x3c0
> > [14079.855827]  ? vhost_poll_wakeup+0x31/0x40
> > [14079.860032]  skb_mac_gso_segment+0x9b/0x110
> > [14079.864331]  __skb_gso_segment+0xae/0x160
> > [14079.868455]  ? netif_skb_features+0x144/0x290
> > [14079.872928]  validate_xmit_skb+0x167/0x370
> > [14079.877139]  validate_xmit_skb_list+0x43/0x70
> > [14079.881612]  sch_direct_xmit+0x267/0x380
> > [14079.885641]  __qdisc_run+0x140/0x590
> > [14079.889324]  __dev_queue_xmit+0x44d/0xba0
> > [14079.893450]  ? nf_hook_slow+0x3c/0xb0
> > [14079.897229]  br_dev_queue_push_xmit+0xb2/0x1c0
> > [14079.901788]  maybe_deliver+0xa9/0x100
> > [14079.905564]  br_flood+0x8a/0x180
> > [14079.908903]  br_handle_frame_finish+0x31f/0x5b0
> > [14079.913547]  br_handle_frame+0x28f/0x3a0
> > [14079.917585]  ? ipv6_find_hdr+0x1f0/0x3e0
> > [14079.921622]  ? br_handle_local_finish+0x20/0x20
> > [14079.926267]  __netif_receive_skb_core.constprop.0+0x4c5/0xc90
> > [14079.932125]  ? br_handle_frame_finish+0x5b0/0x5b0
> > [14079.936946]  ? ___slab_alloc+0x4bf/0xaf0
> > [14079.940986]  __netif_receive_skb_list_core+0x107/0x250
> > [14079.946240]  netif_receive_skb_list_internal+0x194/0x2b0
> > [14079.951660]  ? napi_gro_flush+0x97/0xf0
> > [14079.955604]  napi_complete_done+0x69/0x180
> > [14079.959808]  ixgbe_poll+0xe10/0x12e0
> > [14079.963506]  __napi_poll+0x26/0x1b0
> > [14079.967106]  napi_threaded_poll+0x232/0x250
> > [14079.971405]  ? __napi_poll+0x1b0/0x1b0
> > [14079.975260]  kthread+0xee/0x120
> > [14079.978510]  ? kthread_complete_and_exit+0x20/0x20
> > [14079.983415]  ret_from_fork+0x22/0x30
> > [14079.987102]  </TASK>
> > [14079.989395] Modules linked in: chaoskey
> > [14079.993347] CR2: 00000000000000c0
> > [14079.996773] ---[ end trace 0000000000000000 ]---
> > [14080.018013] pstore: backend (erst) writing error (-28)
> > [14080.023274] RIP: 0010:__udp_gso_segment+0x346/0x4f0
> > [14080.028264] Code: c3 08 66 89 5c 02 04 45 84 e4 0f 85 27 fd ff ff
> > 49 8b 1e 49 8b ae c0 00 00 00 41 0f b7 86 b4 00 00 00 45 0f b7 a6 b2
> > 00 00 00 <48> 8b b3 c0 00 00 00 0f b7 8b b2 00 00 00 49 01 ec 48 01 c5
> > 48 8d
> > [14080.047181] RSP: 0018:ffffa83643a4f818 EFLAGS: 00010246
> > [14080.052522] RAX: 00000000000000ce RBX: 0000000000000000 RCX: 0000000000000000
> > [14080.059765] RDX: ffffa2ad1403b000 RSI: 0000000000000028 RDI: ffffa2afc9d302d4
> > [14080.067012] RBP: ffffa2ad1403b000 R08: 0000000000000022 R09: 00002000001558c9
> > [14080.074257] R10: 0000000000000000 R11: ffffa2b02fcea888 R12: 00000000000000e2
> > [14080.081502] R13: ffffa2afc9d30200 R14: ffffa2afc9d30200 R15: 00002000001558c9
> > [14080.088746] FS:  0000000000000000(0000) GS:ffffa2b02fcc0000(0000)
> > knlGS:0000000000000000
> > [14080.096964] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [14080.102823] CR2: 00000000000000c0 CR3: 0000000151ff4000 CR4: 00000000003526e0
> > [14080.110067] Kernel panic - not syncing: Fatal exception in interrupt
> > [14080.325501] Kernel Offset: 0x12600000 from 0xffffffff81000000
> > (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> > [14080.353129] ---[ end Kernel panic - not syncing: Fatal exception in
> > interrupt ]---
>
> Could you please provide a decoded stack trace?
>
> # in your git tree:
> cat <stacktrace file > | ./scripts/decode_stacktrace.sh vmlinux

I'm afraid it doesn't really yield any more information, and I can't say why:

 cat bug.txt | ./scripts/decode_stacktrace.sh vmlinux
[14079.678380] BUG: kernel NULL pointer dereference, address: 00000000000000c0
[14079.685456] #PF: supervisor read access in kernel mode
[14079.690686] #PF: error_code(0x0000) - not-present page
[14079.695915] PGD 0 P4D 0
[14079.698540] Oops: 0000 [#1] PREEMPT SMP NOPTI
[14079.702996] CPU: 11 PID: 891 Comm: napi/eno2-80 Not tainted 6.4.0 #360
[14079.709614] Hardware name: Supermicro Super Server/A2SDi-12C-HLN4F,
BIOS 1.7a 10/13/2022
[14079.717796] RIP: 0010:__udp_gso_segment (??:?)
[14079.722778] Code: c3 08 66 89 5c 02 04 45 84 e4 0f 85 27 fd ff ff

Code starting with the faulting instruction
===========================================
   0: c3                    ret
   1: 08 66 89              or     %ah,-0x77(%rsi)
   4: 5c                    pop    %rsp
   5: 02 04 45 84 e4 0f 85 add    -0x7af01b7c(,%rax,2),%al
   c: 27                    (bad)
   d: fd                    std
   e: ff                    (bad)
   f: ff                    .byte 0xff
49 8b 1e 49 8b ae c0 00 00 00 41 0f b7 86 b4 00 00 00 45 0f b7 a6 b2
00 00 00 <48> 8b b3 c0 00 00 00 0f b7 8b b2 00 00 00 49 01 ec 48 01 c5
48 8d
[14079.741645] RSP: 0018:ffffa83643a4f818 EFLAGS: 00010246
[14079.746966] RAX: 00000000000000ce RBX: 0000000000000000 RCX: 0000000000000000
[14079.754195] RDX: ffffa2ad1403b000 RSI: 0000000000000028 RDI: ffffa2afc9d302d4
[14079.761422] RBP: ffffa2ad1403b000 R08: 0000000000000022 R09: 00002000001558c9
[14079.768650] R10: 0000000000000000 R11: ffffa2b02fcea888 R12: 00000000000000e2
[14079.775879] R13: ffffa2afc9d30200 R14: ffffa2afc9d30200 R15: 00002000001558c9
[14079.783106] FS:  0000000000000000(0000) GS:ffffa2b02fcc0000(0000)
knlGS:0000000000000000
[14079.791305] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[14079.797162] CR2: 00000000000000c0 CR3: 0000000151ff4000 CR4: 00000000003526e0
[14079.804408] Call Trace:
[14079.806961]  <TASK>
[14079.809170] ? __die (??:?)
[14079.812340] ? page_fault_oops (fault.c:?)
[14079.816551] ? ip6_route_output_flags (??:?)
[14079.821284] ? exc_page_fault (??:?)
[14079.825408] ? update_load_avg (fair.c:?)
[14079.829534] ? asm_exc_page_fault (??:?)
[14079.833836] ? __udp_gso_segment (??:?)
[14079.838218] ? __udp_gso_segment (??:?)
[14079.842600] ? _raw_spin_unlock_irqrestore (??:?)
[14079.847679] ? try_to_wake_up (core.c:?)
[14079.851713] inet_gso_segment (??:?)
[14079.855827] ? vhost_poll_wakeup (vhost.c:?)
[14079.860032] skb_mac_gso_segment (??:?)
[14079.864331] __skb_gso_segment (??:?)
[14079.868455] ? netif_skb_features (??:?)
[14079.872928] validate_xmit_skb (dev.c:?)
[14079.877139] validate_xmit_skb_list (??:?)
[14079.881612] sch_direct_xmit (??:?)
[14079.885641] __qdisc_run (??:?)
[14079.889324] __dev_queue_xmit (??:?)
[14079.893450] ? nf_hook_slow (??:?)
[14079.897229] br_dev_queue_push_xmit (??:?)
[14079.901788] maybe_deliver (br_forward.c:?)
[14079.905564] br_flood (??:?)
[14079.908903] br_handle_frame_finish (??:?)
[14079.913547] br_handle_frame (br_input.c:?)
[14079.917585] ? ipv6_find_hdr (??:?)
[14079.921622] ? br_handle_local_finish (??:?)
[14079.926267] __netif_receive_skb_core.constprop.0 (dev.c:?)
[14079.932125] ? br_handle_frame_finish (br_input.c:?)
[14079.936946] ? ___slab_alloc (slub.c:?)
[14079.940986] __netif_receive_skb_list_core (dev.c:?)
[14079.946240] netif_receive_skb_list_internal (??:?)
[14079.951660] ? napi_gro_flush (??:?)
[14079.955604] napi_complete_done (??:?)
[14079.959808] ixgbe_poll (??:?)
[14079.963506] __napi_poll (dev.c:?)
[14079.967106] napi_threaded_poll (dev.c:?)
[14079.971405] ? __napi_poll (dev.c:?)
[14079.975260] kthread (kthread.c:?)
[14079.978510] ? kthread_complete_and_exit (kthread.c:?)
[14079.983415] ret_from_fork (??:?)
[14079.987102]  </TASK>
[14079.989395] Modules linked in: chaoskey
[14079.993347] CR2: 00000000000000c0
[14079.996773] ---[ end trace 0000000000000000 ]---
[14080.018013] pstore: backend (erst) writing error (-28)
[14080.023274] RIP: 0010:__udp_gso_segment (??:?)
[14080.028264] Code: c3 08 66 89 5c 02 04 45 84 e4 0f 85 27 fd ff ff

Code starting with the faulting instruction
===========================================
   0: c3                    ret
   1: 08 66 89              or     %ah,-0x77(%rsi)
   4: 5c                    pop    %rsp
   5: 02 04 45 84 e4 0f 85 add    -0x7af01b7c(,%rax,2),%al
   c: 27                    (bad)
   d: fd                    std
   e: ff                    (bad)
   f: ff                    .byte 0xff
49 8b 1e 49 8b ae c0 00 00 00 41 0f b7 86 b4 00 00 00 45 0f b7 a6 b2
00 00 00 <48> 8b b3 c0 00 00 00 0f b7 8b b2 00 00 00 49 01 ec 48 01 c5
48 8d
[14080.047181] RSP: 0018:ffffa83643a4f818 EFLAGS: 00010246
[14080.052522] RAX: 00000000000000ce RBX: 0000000000000000 RCX: 0000000000000000
[14080.059765] RDX: ffffa2ad1403b000 RSI: 0000000000000028 RDI: ffffa2afc9d302d4
[14080.067012] RBP: ffffa2ad1403b000 R08: 0000000000000022 R09: 00002000001558c9
[14080.074257] R10: 0000000000000000 R11: ffffa2b02fcea888 R12: 00000000000000e2
[14080.081502] R13: ffffa2afc9d30200 R14: ffffa2afc9d30200 R15: 00002000001558c9
[14080.088746] FS:  0000000000000000(0000) GS:ffffa2b02fcc0000(0000)
knlGS:0000000000000000
[14080.096964] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[14080.102823] CR2: 00000000000000c0 CR3: 0000000151ff4000 CR4: 00000000003526e0
[14080.110067] Kernel panic - not syncing: Fatal exception in interrupt
[14080.325501] Kernel Offset: 0x12600000 from 0xffffffff81000000
(relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[14080.353129] ---[ end Kernel panic - not syncing: Fatal exception in
interrupt ]---
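
Note that the "Code starting with the faulting instruction" listing above begins at c3 rather than at the <48> marker, which suggests the script only saw the first line of the wrapped "Code:" entry. A minimal sketch of a filter that rejoins wrapped log lines before decoding, assuming every real log entry starts with a "[seconds.micros]" timestamp (the function name is made up for illustration, and this would not by itself explain the "??:?" locations):

```python
import re

# A kernel log entry starts with a "[seconds.micros]" timestamp.
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]")
# Leading mail quote markers such as "> > ".
QUOTE = re.compile(r"^(?:>\s?)+")


def unwrap_oops(text):
    """Rejoin lines that a mail client wrapped, one log entry per line."""
    out = []
    for raw in text.splitlines():
        line = QUOTE.sub("", raw).rstrip()
        if not line:
            continue  # drop empty (or quote-only) lines
        if TIMESTAMP.match(line) or not out:
            out.append(line)
        else:
            # No timestamp: treat it as the tail of the previous entry.
            out[-1] += " " + line.lstrip()
    return "\n".join(out) + "\n"
```

To use it as a filter, write `unwrap_oops(sys.stdin.read())` to stdout and pipe the result into ./scripts/decode_stacktrace.sh in place of the plain cat.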

The binaries aren't stripped, so I don't currently know why the output looks like this...
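
An unstripped vmlinux still has its symbol table, but file:line lookups need DWARF debug info, which is a separate build option. A rough sketch of a check on the kernel .config, assuming the mainline 6.x option names CONFIG_DEBUG_INFO, CONFIG_DEBUG_INFO_NONE, CONFIG_DEBUG_INFO_REDUCED and CONFIG_GDB_SCRIPTS (the helper names are hypothetical, and missing debug info is only a guess at the cause here):

```python
def debug_info_status(config_text):
    """Return the debug-related options set to 'y' in a kernel .config."""
    wanted = {
        "CONFIG_DEBUG_INFO",
        "CONFIG_DEBUG_INFO_NONE",
        "CONFIG_DEBUG_INFO_REDUCED",
        "CONFIG_GDB_SCRIPTS",
    }
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # "# CONFIG_FOO is not set" lines are comments
        name, _, value = line.partition("=")
        if name in wanted and value == "y":
            enabled.add(name)
    return enabled


def has_line_info(config_text):
    """True if the config asks for debug info at all."""
    enabled = debug_info_status(config_text)
    return ("CONFIG_DEBUG_INFO" in enabled
            and "CONFIG_DEBUG_INFO_NONE" not in enabled)
```

For example, `has_line_info(open(".config").read())` run in the kernel tree would report whether the build was configured with debug info; if not, "??:?" from decode_stacktrace.sh would be expected.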

But I also get this from gdb:
gdb vmlinux
GNU gdb (Gentoo 13.2 vanilla) 13.2
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://bugs.gentoo.org/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from vmlinux...
(No debugging symbols found in vmlinux)
Traceback (most recent call last):
  File "/usr/src/linux/vmlinux-gdb.py", line 25, in <module>
    import linux.constants
  File "/usr/src/linux/scripts/gdb/linux/constants.py", line 10, in <module>
    LX_hrtimer_resolution = gdb.parse_and_eval("hrtimer_resolution")
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
gdb.error: 'hrtimer_resolution' has unknown type; cast it to its declared type
---
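
Even without symbols, the "Code:" bytes and the register dump in the oops can be read by hand. A small sketch that decodes only the faulting instruction (the bytes after the <48> marker) and combines it with the RBX value from the dump; it handles just this one encoding (REX.W MOV with a 32-bit displacement and no SIB byte), not x86 in general:

```python
# Bytes from the "Code:" line, starting at the <48> marker.
FAULT_BYTES = bytes.fromhex("48 8b b3 c0 00 00 00")
REGS64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
          "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]


def decode_mov_load(code):
    """Decode 'REX.W 8B /r' with a disp32 memory operand (mod == 2, no SIB)."""
    rex, opcode, modrm = code[0], code[1], code[2]
    assert (rex & 0xF0) == 0x40 and (rex & 0x08), "expected a REX.W prefix"
    assert opcode == 0x8B, "expected MOV r64, r/m64"
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    assert mod == 2 and rm != 4, "only the disp32, no-SIB form is handled"
    reg |= (rex & 0x04) << 1  # REX.R extends the destination register
    rm |= (rex & 0x01) << 3   # REX.B extends the base register
    disp = int.from_bytes(code[3:7], "little", signed=True)
    return REGS64[reg], REGS64[rm], disp


dest, base, disp = decode_mov_load(FAULT_BYTES)
rbx = 0x0  # RBX value from the register dump in the oops
fault_addr = rbx + disp
print(f"mov {disp:#x}(%{base}),%{dest} -> reads address {fault_addr:#x}")
```

The computed address matches CR2 (00000000000000c0), so the fault is a load at offset 0xc0 from a NULL pointer held in RBX. Which structure and field that corresponds to cannot be told from the bytes alone.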

> Thanks!
>
> Paolo
>
