Message-ID: <d6077d36-93ed-4a6d-9eed-42b1b22cdffb@hartkopp.net>
Date: Sun, 30 Nov 2025 20:09:48 +0100
From: Oliver Hartkopp <socketcan@...tkopp.net>
To: Prithvi Tambewagh <activprithvi@...il.com>,
Marc Kleine-Budde <mkl@...gutronix.de>
Cc: linux-can@...r.kernel.org, linux-kernel@...r.kernel.org,
syzkaller-bugs@...glegroups.com, netdev@...r.kernel.org
Subject: Re: Question about to KMSAN: uninit-value in can_receive
Hi Prithvi,
On 30.11.25 18:29, Prithvi Tambewagh wrote:
> On Sun, Nov 30, 2025 at 01:44:32PM +0100, Oliver Hartkopp wrote:
>>> shall I send this patch upstream and mention your name in
>>> Suggested-by tag?
>>
>> No. Neither of that - as it will not fix the root cause.
>>
>> IMO we need to check who is using the headroom in CAN skbs and for
>> what reason first. And when we are not able to safely control the
>> headroom for our struct can_skb_priv content we might need to find
>> another way to store that content.
>> E.g. by creating this space behind skb->data or add new attributes to
>> struct sk_buff.
>
> I will work in this direction. Just to confirm, what you mean is
> that first it should be checked where the headroom is used while also
> checking whether the data from region covered by struct can_skb_priv is
> intact, and if not then we need to ensure that it is intact by other
> measures, right?
I have added skb_dump(KERN_WARNING, skb, true) in my local dummy_can.c
and sent some CAN frames with cansend.
CAN CC:
[ 3351.708018] skb len=16 headroom=16 headlen=16 tailroom=288
mac=(16,0) mac_len=0 net=(16,0) trans=16
shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
csum(0x0 start=0 offset=0 ip_summed=1 complete_sw=0
valid=0 level=0)
hash(0x0 sw=0 l4=0) proto=0x000c pkttype=5 iif=0
priority=0x0 mark=0x0 alloc_cpu=5 vlan_all=0x0
encapsulation=0 inner(proto=0x0000, mac=0, net=0, trans=0)
[ 3351.708151] dev name=can0 feat=0x0000000000004008
[ 3351.708159] sk family=29 type=3 proto=0
[ 3351.708166] skb headroom: 00000000: 07 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 3351.708173] skb linear: 00000000: 23 01 00 00 04 00 00 00 11 22 33 44 00 00 00 00
(..)
CAN FD:
[ 3557.069471] skb len=72 headroom=16 headlen=72 tailroom=232
mac=(16,0) mac_len=0 net=(16,0) trans=16
shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
csum(0x0 start=0 offset=0 ip_summed=1 complete_sw=0
valid=0 level=0)
hash(0x0 sw=0 l4=0) proto=0x000d pkttype=5 iif=0
priority=0x0 mark=0x0 alloc_cpu=6 vlan_all=0x0
encapsulation=0 inner(proto=0x0000, mac=0, net=0, trans=0)
[ 3557.069499] dev name=can0 feat=0x0000000000004008
[ 3557.069507] sk family=29 type=3 proto=0
[ 3557.069513] skb headroom: 00000000: 07 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 3557.069520] skb linear: 00000000: 33 03 00 00 10 05 00 00 00 11 22 33 44 55 66 77
[ 3557.069526] skb linear: 00000010: 88 aa bb cc dd ee ff 00 00 00 00 00 00 00 00 00
(..)
CAN XL:
[ 5477.498205] skb len=908 headroom=16 headlen=908 tailroom=804
mac=(16,0) mac_len=0 net=(16,0) trans=16
shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
csum(0x0 start=0 offset=0 ip_summed=1 complete_sw=0
valid=0 level=0)
hash(0x0 sw=0 l4=0) proto=0x000e pkttype=5 iif=0
priority=0x0 mark=0x0 alloc_cpu=6 vlan_all=0x0
encapsulation=0 inner(proto=0x0000, mac=0, net=0, trans=0)
[ 5477.498236] dev name=can0 feat=0x0000000000004008
[ 5477.498244] sk family=29 type=3 proto=0
[ 5477.498251] skb headroom: 00000000: 07 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 5477.498258] skb linear: 00000000: b0 05 92 00 81 cd 80 03 cd b4 92 58 4c a1 f6 0c
[ 5477.498264] skb linear: 00000010: 1a c9 6d 0a 4c a1 f6 0c 1a c9 6d 0a 4c a1 f6 0c
[ 5477.498269] skb linear: 00000020: 1a c9 6d 0a 4c a1 f6 0c 1a c9 6d 0a 4c a1 f6 0c
[ 5477.498275] skb linear: 00000030: 1a c9 6d 0a 4c a1 f6 0c 1a c9 6d 0a 4c a1 f6 0c
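For reading the dumps above, here is a minimal userspace sketch that decodes the 16 headroom bytes and the frame headers. It rests on assumptions: that struct can_skb_priv begins with ifindex, skbcnt and frame_len (recalled from include/linux/can/skb.h, not shown in this mail), that the dump comes from a little-endian host, and that the CAN XL header is 12 bytes with the length at offset 6. The names priv_view, le32(), decode_priv() and xl_len() are hypothetical helpers made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed mirror of the fixed part of the kernel's struct can_skb_priv;
 * the trailing flexible struct can_frame cf[] member is left out. */
struct priv_view {
	int32_t  ifindex;
	int32_t  skbcnt;
	uint32_t frame_len;
};
static_assert(sizeof(struct priv_view) == 12, "three 32 bit fields expected");

/* Hypothetical helper: read a little-endian 32 bit value byte by byte. */
static uint32_t le32(const uint8_t *p)
{
	return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
	       (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Interpret the start of the skb headroom as the private CAN data. */
static struct priv_view decode_priv(const uint8_t *head)
{
	struct priv_view v = {
		.ifindex   = (int32_t)le32(head),
		.skbcnt    = (int32_t)le32(head + 4),
		.frame_len = le32(head + 8),
	};
	return v;
}

/* CAN XL: assumed 16 bit little-endian data length at offset 6. */
static uint16_t xl_len(const uint8_t *p)
{
	return (uint16_t)(p[6] | p[7] << 8);
}

/* "skb headroom" bytes, identical in all three dumps (07 then zeros). */
static const uint8_t headroom[16] = { 0x07 };

/* Leading "skb linear" bytes copied from the CC, FD and XL dumps. */
static const uint8_t cc_hdr[8]  = { 0x23, 0x01, 0x00, 0x00, 0x04, 0x00, 0x00, 0x00 };
static const uint8_t fd_hdr[8]  = { 0x33, 0x03, 0x00, 0x00, 0x10, 0x05, 0x00, 0x00 };
static const uint8_t xl_hdr[12] = { 0xb0, 0x05, 0x92, 0x00, 0x81, 0xcd, 0x80, 0x03,
				    0xcd, 0xb4, 0x92, 0x58 };
```

Under these assumptions decode_priv(headroom) yields ifindex 7 with skbcnt and frame_len both 0, le32(cc_hdr) is CAN ID 0x123 with length byte 4, le32(fd_hdr) is CAN ID 0x333 with length 16 and flags 0x05, and xl_len(xl_hdr) is 896, which together with the assumed 12 byte header matches skb len=908.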
I will also add skb_dump(KERN_WARNING, skb, true) in the CAN receive
path to see what's going on there.
My main problem with the KMSAN message
https://lore.kernel.org/linux-can/68bae75b.050a0220.192772.0190.GAE@google.com/
is that it uses NAPI, XDP and therefore pskb_expand_head():
kmalloc_reserve+0x23e/0x4a0 net/core/skbuff.c:609
pskb_expand_head+0x226/0x1a60 net/core/skbuff.c:2275
netif_skb_check_for_xdp net/core/dev.c:5081 [inline]
netif_receive_generic_xdp net/core/dev.c:5112 [inline]
do_xdp_generic+0x9e3/0x15a0 net/core/dev.c:5180
__netif_receive_skb_core+0x25c3/0x6f10 net/core/dev.c:5524
__netif_receive_skb_one_core net/core/dev.c:5702 [inline]
__netif_receive_skb+0xca/0xa00 net/core/dev.c:5817
process_backlog+0x4ad/0xa50 net/core/dev.c:6149
__napi_poll+0xe7/0x980 net/core/dev.c:6902
napi_poll net/core/dev.c:6971 [inline]
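One possible reading of why pskb_expand_head() matters in this trace (an assumption, not something confirmed in this thread): when the head is reallocated with extra headroom, the old buffer contents end up behind the newly added bytes, so anything that reads the private CAN data from the very start of the buffer would then look at fresh, uninitialized memory. The toy model below only illustrates that idea in userspace. struct toy_skb, toy_alloc(), toy_expand_head() and the POISON marker are all hypothetical and are not kernel APIs.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define POISON 0xAA /* stands in for uninitialized memory in this model */

/* Hypothetical, heavily reduced stand-in for struct sk_buff. */
struct toy_skb {
	uint8_t *head;   /* start of the buffer */
	size_t headroom; /* payload starts at head + headroom */
	size_t len;      /* payload bytes */
};

/* Allocate a zeroed buffer with the given headroom and payload size. */
static struct toy_skb toy_alloc(size_t headroom, size_t len)
{
	struct toy_skb skb = {
		.head = calloc(1, headroom + len),
		.headroom = headroom,
		.len = len,
	};
	return skb;
}

/* Model of a head reallocation: nhead new bytes in front, the old
 * headroom and payload copied behind them. Returns 0 on success. */
static int toy_expand_head(struct toy_skb *skb, size_t nhead)
{
	size_t old = skb->headroom + skb->len;
	uint8_t *data;

	assert(skb && skb->head);
	data = malloc(nhead + old);
	if (!data)
		return -1;

	memset(data, POISON, nhead);          /* the model marks the new bytes */
	memcpy(data + nhead, skb->head, old); /* old contents move back by nhead */
	free(skb->head);
	skb->head = data;
	skb->headroom += nhead;
	return 0;
}
```

Starting from a buffer with 16 bytes of headroom whose first byte is 0x07 and expanding it by 240 bytes, the model leaves POISON at head[0] and the 0x07 at head[240]. Whether the real code path behaves like this for CAN skbs is exactly what still needs to be checked.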
As you can see in
https://syzkaller.appspot.com/x/log.txt?x=144ece64580000
[pid 5804] socket(AF_CAN, SOCK_DGRAM, CAN_ISOTP) = 5
[pid 5804] ioctl(5, SIOCGIFINDEX, {ifr_name="vxcan0", ifr_ifindex=20}) = 0
they are using the vxcan driver which is mainly derived from vcan.c and
veth.c (~2017). The veth.c driver supports all those GRO, NAPI and XDP
features today which vxcan.c still does NOT support.
Therefore I wonder how the NAPI and XDP code can be used together with
vxcan, and whether this is still the case today, as the syzkaller kernel
6.13.0-rc7-syzkaller-00039-gc3812b15000c is already one year old.
Many questions ...
Best regards,
Oliver