[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CANNWa06MVDaEGKVyNW=LGUsFg+OM-RRW-vfBBkR5Vpb+pk4pxw@mail.gmail.com>
Date: Sat, 29 Nov 2025 21:59:29 +0530
From: SHAURYA RANE <ssrane_b23@...vjti.ac.in>
To: Felix Maurer <fmaurer@...hat.com>
Cc: davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
pabeni@...hat.com, horms@...nel.org, jkarrenpalo@...il.com,
arvid.brodin@...en.se, netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
skhan@...uxfoundation.org, linux-kernel-mentees@...ts.linux.dev,
david.hunter.linux@...il.com, khalid@...nel.org,
syzbot+2fa344348a579b779e05@...kaller.appspotmail.com, stable@...r.kernel.org
Subject: Re: [PATCH net v2] net/hsr: fix NULL pointer dereference in skb_clone
with hw tag insertion
On Fri, Nov 28, 2025 at 1:51 PM Felix Maurer <fmaurer@...hat.com> wrote:
>
> On Thu, Nov 27, 2025 at 10:02:19PM +0530, Shaurya Rane wrote:
> > When NETIF_F_HW_HSR_TAG_INS is enabled and frame->skb_std is NULL,
> > hsr_create_tagged_frame() and prp_create_tagged_frame() call skb_clone()
> > with a NULL pointer.
>
> Have you acually tested this or do you have any other indication that
> this can happen? After all, you are suggesting that this kernel crash in
> a syzbot VM is caused by a (very uncommon) feature of hardware NICs.
>
Agreed. The syzbot reproducer uses PRP (protocol=1), and the crash
path goes through prp_get_untagged_frame() where __pskb_copy() can
fail and return NULL. I tested the patch with syzbot and the bug no
longer reproduces, confirming this is the actual root cause.
> > Similarly, prp_get_untagged_frame() doesn't check
> > if __pskb_copy() fails before calling skb_clone().
>
> I suspect that this is really the only condition that can trigger the
> crash in question. This would also match that the syzbot reproducer hits
> this with a PRP interface (IFLA_HSR_PROTOCOL=0x1).
That makes sense. I've dropped the NETIF_F_HW_HSR_TAG_INS checks and
sent v3 with only the prp_get_untagged_frame() fix.That makes sense.
I've dropped the NETIF_F_HW_HSR_TAG_INS checks and sent v3 with only
the prp_get_untagged_frame() fix and tested it with the syzbot
reproducer .
>
> > This causes a kernel crash reported by Syzbot:
> >
> > Oops: general protection fault, probably for non-canonical address 0xdffffc000000000f: 0000 [#1] SMP KASAN NOPTI
> > KASAN: null-ptr-deref in range [0x0000000000000078-0x000000000000007f]
> > CPU: 0 UID: 0 PID: 5625 Comm: syz.1.18 Not tainted syzkaller #0 PREEMPT(full)
> > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
> > RIP: 0010:skb_clone+0xd7/0x3a0 net/core/skbuff.c:2041
> > Code: 03 42 80 3c 20 00 74 08 4c 89 f7 e8 23 29 05 f9 49 83 3e 00 0f 85 a0 01 00 00 e8 94 dd 9d f8 48 8d 6b 7e 49 89 ee 49 c1 ee 03 <43> 0f b6 04 26 84 c0 0f 85 d1 01 00 00 44 0f b6 7d 00 41 83 e7 0c
> > RSP: 0018:ffffc9000d00f200 EFLAGS: 00010207
> > RAX: ffffffff892235a1 RBX: 0000000000000000 RCX: ffff88803372a480
> > RDX: 0000000000000000 RSI: 0000000000000820 RDI: 0000000000000000
> > RBP: 000000000000007e R08: ffffffff8f7d0f77 R09: 1ffffffff1efa1ee
> > R10: dffffc0000000000 R11: fffffbfff1efa1ef R12: dffffc0000000000
> > R13: 0000000000000820 R14: 000000000000000f R15: ffff88805144cc00
> > FS: 0000555557f6d500(0000) GS:ffff88808d72f000(0000) knlGS:0000000000000000
> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 0000555581d35808 CR3: 000000005040e000 CR4: 0000000000352ef0
> > Call Trace:
> > <TASK>
> > hsr_forward_do net/hsr/hsr_forward.c:-1 [inline]
> > hsr_forward_skb+0x1013/0x2860 net/hsr/hsr_forward.c:741
> > hsr_handle_frame+0x6ce/0xa70 net/hsr/hsr_slave.c:84
> > __netif_receive_skb_core+0x10b9/0x4380 net/core/dev.c:5966
> > __netif_receive_skb_one_core net/core/dev.c:6077 [inline]
> > __netif_receive_skb+0x72/0x380 net/core/dev.c:6192
> > netif_receive_skb_internal net/core/dev.c:6278 [inline]
> > netif_receive_skb+0x1cb/0x790 net/core/dev.c:6337
> > tun_rx_batched+0x1b9/0x730 drivers/net/tun.c:1485
> > tun_get_user+0x2b65/0x3e90 drivers/net/tun.c:1953
> > tun_chr_write_iter+0x113/0x200 drivers/net/tun.c:1999
> > new_sync_write fs/read_write.c:593 [inline]
> > vfs_write+0x5c9/0xb30 fs/read_write.c:686
> > ksys_write+0x145/0x250 fs/read_write.c:738
> > do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
> > do_syscall_64+0xfa/0xfa0 arch/x86/entry/syscall_64.c:94
> > entry_SYSCALL_64_after_hwframe+0x77/0x7f
> > RIP: 0033:0x7f0449f8e1ff
> > Code: 89 54 24 18 48 89 74 24 10 89 7c 24 08 e8 f9 92 02 00 48 8b 54 24 18 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 31 44 89 c7 48 89 44 24 08 e8 4c 93 02 00 48
> > RSP: 002b:00007ffd7ad94c90 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
> > RAX: ffffffffffffffda RBX: 00007f044a1e5fa0 RCX: 00007f0449f8e1ff
> > RDX: 000000000000003e RSI: 0000200000000500 RDI: 00000000000000c8
> > RBP: 00007ffd7ad94d20 R08: 0000000000000000 R09: 0000000000000000
> > R10: 000000000000003e R11: 0000000000000293 R12: 0000000000000001
> > R13: 00007f044a1e5fa0 R14: 00007f044a1e5fa0 R15: 0000000000000003
> > </TASK>
> >
> > Fix this by adding NULL checks for frame->skb_std before calling
> > skb_clone() in the affected functions.
> >
> > Reported-by: syzbot+2fa344348a579b779e05@...kaller.appspotmail.com
> > Closes: https://syzkaller.appspot.com/bug?extid=2fa344348a579b779e05
> > Fixes: f266a683a480 ("net/hsr: Better frame dispatch")
> > Cc: stable@...r.kernel.org
> > Signed-off-by: Shaurya Rane <ssrane_b23@...vjti.ac.in>
> > ---
> > net/hsr/hsr_forward.c | 6 ++++++
> > 1 file changed, 6 insertions(+)
> >
> > diff --git a/net/hsr/hsr_forward.c b/net/hsr/hsr_forward.c
> > index 339f0d220212..8a8559f0880f 100644
> > --- a/net/hsr/hsr_forward.c
> > +++ b/net/hsr/hsr_forward.c
> > @@ -205,6 +205,8 @@ struct sk_buff *prp_get_untagged_frame(struct hsr_frame_info *frame,
> > __pskb_copy(frame->skb_prp,
> > skb_headroom(frame->skb_prp),
> > GFP_ATOMIC);
> > + if (!frame->skb_std)
> > + return NULL;
> > } else {
> > /* Unexpected */
> > WARN_ONCE(1, "%s:%d: Unexpected frame received (port_src %s)\n",
>
> This check looks good to me.
>
> > @@ -341,6 +343,8 @@ struct sk_buff *hsr_create_tagged_frame(struct hsr_frame_info *frame,
> > hsr_set_path_id(frame, hsr_ethhdr, port);
> > return skb_clone(frame->skb_hsr, GFP_ATOMIC);
> > } else if (port->dev->features & NETIF_F_HW_HSR_TAG_INS) {
> > + if (!frame->skb_std)
> > + return NULL;
> > return skb_clone(frame->skb_std, GFP_ATOMIC);
> > }
> >
> > @@ -385,6 +389,8 @@ struct sk_buff *prp_create_tagged_frame(struct hsr_frame_info *frame,
> > }
> > return skb_clone(frame->skb_prp, GFP_ATOMIC);
> > } else if (port->dev->features & NETIF_F_HW_HSR_TAG_INS) {
> > + if (!frame->skb_std)
> > + return NULL;
> > return skb_clone(frame->skb_std, GFP_ATOMIC);
> > }
>
> If any of these two conditions happen we have a different, serious
> problem that needs to be fixed elsewhere.
>
> In hsr_create_tagged_frame(), we first check if we have an skb_hsr (an
> skb containing an already tagged message). If we don't, it has to be an
> skb_std (an skb without any tag). If !skb_std, we are either 1) handing
> around a frame without any skb; or 2) handing a PRP frame to an HSR
> function. In both cases, this would need to be fixed where the problem
> is introduced.
>
> Thanks,
> Felix
>
Thanks, Shaurya
Powered by blists - more mailing lists