lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <914ddb6c-79ca-49c2-82b1-33be1986eff5@infotecs.ru>
Date: Tue, 7 Oct 2025 13:34:11 +0000
From: Ilia Gavrilov <Ilia.Gavrilov@...otecs.ru>
To: Magnus Karlsson <magnus.karlsson@...il.com>, Alexander Lobakin
	<aleksander.lobakin@...el.com>
CC: Song Yoong Siang <yoong.siang.song@...el.com>, Maciej Fijalkowski
	<maciej.fijalkowski@...el.com>, Jesper Dangaard Brouer <hawk@...nel.org>,
	Daniel Borkmann <daniel@...earbox.net>, "netdev@...r.kernel.org"
	<netdev@...r.kernel.org>, John Fastabend <john.fastabend@...il.com>, "Alexei
 Starovoitov" <ast@...nel.org>, "stable@...r.kernel.org"
	<stable@...r.kernel.org>, "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>, Eric Dumazet <edumazet@...gle.com>,
	"Stanislav Fomichev" <sdf@...ichev.me>, Simon Horman <horms@...nel.org>,
	Jakub Kicinski <kuba@...nel.org>, "bpf@...r.kernel.org"
	<bpf@...r.kernel.org>, Paolo Abeni <pabeni@...hat.com>, "David S. Miller"
	<davem@...emloft.net>, Magnus Karlsson <magnus.karlsson@...el.com>,
	"lvc-project@...uxtesting.org" <lvc-project@...uxtesting.org>
Subject: Re: [lvc-project] [PATCH net] xsk: Fix overflow in descriptor
 validation@@

On 10/7/25 15:44, Magnus Karlsson wrote:
> On Tue, 7 Oct 2025 at 14:11, Alexander Lobakin
> <aleksander.lobakin@...el.com> wrote:
>>
>> From: Ilia Gavrilov <Ilia.Gavrilov@...otecs.ru>
>> Date: Tue, 7 Oct 2025 11:19:19 +0000
>>
>>> On 10/6/25 18:19, Alexander Lobakin wrote:
>>>> From: Ilia Gavrilov <Ilia.Gavrilov@...otecs.ru>
>>>> Date: Mon, 6 Oct 2025 08:53:17 +0000
>>>>
>>>>> The desc->len value can be set up to U32_MAX. If umem tx_metadata_len
>>>>
>>>> In theory. Never in practice.
>>>>
>>>
>>> Hi Alexander,
>>> Thank you for the review.
>>>
>>> It seems to me that this problem should be considered not from the point of view of practical use,
>>> but from the point of view of security. An attacker can set any length of the packet in the descriptor
>>> from the user space and descriptor validation will pass.
>>>
>>>
>>>>> option is also set, then the value of the expression
>>>>> 'desc->len + pool->tx_metadata_len' can overflow and validation
>>>>> of the incorrect descriptor will be successfully passed.
>>>>> This can lead to a subsequent chain of arithmetic overflows
>>>>> in the xsk_build_skb() function and incorrect sk_buff allocation.
>>>>>
>>>>> Found by InfoTeCS on behalf of Linux Verification Center
>>>>> (linuxtesting.org) with SVACE.
>>>>
>>>> I think the general rule for sending fixes is that a fix must fix a real
>>>> bug which can be reproduced in real life scenarios.
>>>
>>> I agree with that, so I make a test program (PoC). Something like that:
>>>
>>>       struct xdp_umem_reg umem_reg;
>>>       umem_reg.addr = (__u64)(void *)umem;
>>>       ...
>>>       umem_reg.chunk_size = 4096;
>>>       umem_reg.tx_metadata_len = 16;
>>>       umem_reg.flags = XDP_UMEM_TX_METADATA_LEN;
>>>       setsockopt(sfd, SOL_XDP, XDP_UMEM_REG, &umem_reg, sizeof(umem_reg));
>>>       ...
>>>
>>>       xsk_ring_prod__reserve(tq, batch_size, &idx);
>>>
>>>       for (i = 0; i < nr_packets; ++i) {
>>>               struct xdp_desc *tx_desc = xsk_ring_prod__tx_desc(tq, idx + i);
>>>               tx_desc->addr = packets[i].addr;
>>>               tx_desc->addr += umem->tx_metadata_len;
>>>               tx_desc->options = XDP_TX_METADATA;
>>>               tx_desc->len = UINT32_MAX;
>>>       }
>>>
>>>       xsk_ring_prod__submit(tq, nr_packets);
>>>       ...
>>>       sendto(sfd, NULL, 0, MSG_DONTWAIT, NULL, 0);
>>>
>>> Since the check of an invalid descriptor has passed, kernel try to allocate
>>> a skb with size of 'hr + len + tr' in the sock_alloc_send_pskb() function
>>> and this is where the next overflow occurs.
>>> skb allocates with a size of 63. Next the skb_put() is called, which adds U32_MAX to skb->tail and skb->end.
>>> Next the skb_store_bits() tries to copy -1 bytes, but fails.
>>>
>>>  __xsk_generic_xmit
>>>       xsk_build_skb
>>>               len = desc->len; // from descriptor
>>>               sock_alloc_send_skb(..., hr + len + tr, ...) // the next overflow
>>>                       sock_alloc_send_pskb
>>>                               alloc_skb_with_frags
>>>               skb_put(skb, len)  // len casts to int
>>>               skb_store_bits(skb, 0, buffer, len)
>>
>> Oh, so you actually have a repro for this. This is good. I suggest you
>> resubmitting the patch and include this repro in the commit message, so
>> that it will be clear that it's actually possible to trigger the problem
>> in the kernel using a malicious/broken userspace application.
>>

I'll add the repro from this e-mail in the next patch version,
the full source is too long.

>> (also pls remove those double `@@` from the subject next time)
>>

ok

>> I'd also like to hear from Maciej and/or others what they think about
>> this problem (that the userspace can set packet len to U32_MAX). Should
>> we just go with this proposed u64 propagation or maybe we need to limit
>> the maximum length which could be sent from the userspace?
> 
> I prefer that we do not set a limit on it and go with the proposed
> solution since I do not know what a future proof size limit would be.
> Somebody could come up with a new virtual device that can send really
> large packets, who knows.
> 

The limit is already checked in the xp_aligned_validate_desc() function:

static inline bool xp_aligned_validate_desc(struct xsk_buff_pool *pool,
					    struct xdp_desc *desc)
{

...
	if (offset + len > pool->chunk_size)
		return false;
...
}

and pool->chunk_size can't be more than PAGE_SIZE:

static int xdp_umem_reg(struct xdp_umem *umem, struct xdp_umem_reg *mr)
{
	u32 chunk_size = mr->chunk_size, headroom = mr->headroom;
	...

	if (chunk_size < XDP_UMEM_MIN_CHUNK_SIZE || chunk_size > PAGE_SIZE) {
		/* Strictly speaking we could support this, if:
		 * - huge pages, or*
		 * - using an IOMMU, or
		 * - making sure the memory area is consecutive
		 * but for now, we simply say "computer says no".
		 */
	
		return -EINVAL;
	}
	...
}

The problem is exactly the overflow.

>> In any case, you raised a good topic.
>>
>>>
>>>> Static Analysis Tools have no idea that nobody sends 4 Gb sized network
>>>> packets.
>>>>
>>>
>>> That's right. Static analyzer is only a tool, but in this case, the overflow
>>> highlighted by the static analyzer can be used for malicious purposes.
>>
>> +1
>>
>> Also I really do hope Infotecs stayed independent from the govs and
>> doesn't take part in any dual-purpose/gov-related projects.
>>
>> Thanks,
>> Olek
>>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ