lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJ8uoz3CeFdAuOd6mjnMBDW45KcY-CQoOUxZt7PB_xAONVYD4w@mail.gmail.com>
Date: Tue, 7 Oct 2025 15:52:52 +0200
From: Magnus Karlsson <magnus.karlsson@...il.com>
To: Ilia Gavrilov <Ilia.Gavrilov@...otecs.ru>
Cc: Alexander Lobakin <aleksander.lobakin@...el.com>, 
	Song Yoong Siang <yoong.siang.song@...el.com>, 
	Maciej Fijalkowski <maciej.fijalkowski@...el.com>, Jesper Dangaard Brouer <hawk@...nel.org>, 
	Daniel Borkmann <daniel@...earbox.net>, "netdev@...r.kernel.org" <netdev@...r.kernel.org>, 
	John Fastabend <john.fastabend@...il.com>, Alexei Starovoitov <ast@...nel.org>, 
	"stable@...r.kernel.org" <stable@...r.kernel.org>, 
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, Eric Dumazet <edumazet@...gle.com>, 
	Stanislav Fomichev <sdf@...ichev.me>, Simon Horman <horms@...nel.org>, Jakub Kicinski <kuba@...nel.org>, 
	"bpf@...r.kernel.org" <bpf@...r.kernel.org>, Paolo Abeni <pabeni@...hat.com>, 
	"David S. Miller" <davem@...emloft.net>, Magnus Karlsson <magnus.karlsson@...el.com>, 
	"lvc-project@...uxtesting.org" <lvc-project@...uxtesting.org>
Subject: Re: [lvc-project] [PATCH net] xsk: Fix overflow in descriptor validation@@

On Tue, 7 Oct 2025 at 15:34, Ilia Gavrilov <Ilia.Gavrilov@...otecs.ru> wrote:
>
> On 10/7/25 15:44, Magnus Karlsson wrote:
> > On Tue, 7 Oct 2025 at 14:11, Alexander Lobakin
> > <aleksander.lobakin@...el.com> wrote:
> >>
> >> From: Ilia Gavrilov <Ilia.Gavrilov@...otecs.ru>
> >> Date: Tue, 7 Oct 2025 11:19:19 +0000
> >>
> >>> On 10/6/25 18:19, Alexander Lobakin wrote:
> >>>> From: Ilia Gavrilov <Ilia.Gavrilov@...otecs.ru>
> >>>> Date: Mon, 6 Oct 2025 08:53:17 +0000
> >>>>
> >>>>> The desc->len value can be set up to U32_MAX. If umem tx_metadata_len
> >>>>
> >>>> In theory. Never in practice.
> >>>>
> >>>
> >>> Hi Alexander,
> >>> Thank you for the review.
> >>>
> >>> It seems to me that this problem should be considered not from the point of view of practical use,
> >>> but from the point of view of security. An attacker can set any length of the packet in the descriptor
> >>> from the user space and descriptor validation will pass.
> >>>
> >>>
> >>>>> option is also set, then the value of the expression
> >>>>> 'desc->len + pool->tx_metadata_len' can overflow and validation
> >>>>> of the incorrect descriptor will be successfully passed.
> >>>>> This can lead to a subsequent chain of arithmetic overflows
> >>>>> in the xsk_build_skb() function and incorrect sk_buff allocation.
> >>>>>
> >>>>> Found by InfoTeCS on behalf of Linux Verification Center
> >>>>> (linuxtesting.org) with SVACE.
> >>>>
> >>>> I think the general rule for sending fixes is that a fix must fix a real
> >>>> bug which can be reproduced in real life scenarios.
> >>>
> >>> I agree with that, so I make a test program (PoC). Something like that:
> >>>
> >>>       struct xdp_umem_reg umem_reg;
> >>>       umem_reg.addr = (__u64)(void *)umem;
> >>>       ...
> >>>       umem_reg.chunk_size = 4096;
> >>>       umem_reg.tx_metadata_len = 16;
> >>>       umem_reg.flags = XDP_UMEM_TX_METADATA_LEN;
> >>>       setsockopt(sfd, SOL_XDP, XDP_UMEM_REG, &umem_reg, sizeof(umem_reg));
> >>>       ...
> >>>
> >>>       xsk_ring_prod__reserve(tq, batch_size, &idx);
> >>>
> >>>       for (i = 0; i < nr_packets; ++i) {
> >>>               struct xdp_desc *tx_desc = xsk_ring_prod__tx_desc(tq, idx + i);
> >>>               tx_desc->addr = packets[i].addr;
> >>>               tx_desc->addr += umem->tx_metadata_len;
> >>>               tx_desc->options = XDP_TX_METADATA;
> >>>               tx_desc->len = UINT32_MAX;
> >>>       }
> >>>
> >>>       xsk_ring_prod__submit(tq, nr_packets);
> >>>       ...
> >>>       sendto(sfd, NULL, 0, MSG_DONTWAIT, NULL, 0);
> >>>
> >>> Since the check of an invalid descriptor has passed, kernel try to allocate
> >>> a skb with size of 'hr + len + tr' in the sock_alloc_send_pskb() function
> >>> and this is where the next overflow occurs.
> >>> skb allocates with a size of 63. Next the skb_put() is called, which adds U32_MAX to skb->tail and skb->end.
> >>> Next the skb_store_bits() tries to copy -1 bytes, but fails.
> >>>
> >>>  __xsk_generic_xmit
> >>>       xsk_build_skb
> >>>               len = desc->len; // from descriptor
> >>>               sock_alloc_send_skb(..., hr + len + tr, ...) // the next overflow
> >>>                       sock_alloc_send_pskb
> >>>                               alloc_skb_with_frags
> >>>               skb_put(skb, len)  // len casts to int
> >>>               skb_store_bits(skb, 0, buffer, len)
> >>
> >> Oh, so you actually have a repro for this. This is good. I suggest you
> >> resubmitting the patch and include this repro in the commit message, so
> >> that it will be clear that it's actually possible to trigger the problem
> >> in the kernel using a malicious/broken userspace application.
> >>
>
> I'll add the repro from this e-mail in the next patch version,
> the full source is too long.
>
> >> (also pls remove those double `@@` from the subject next time)
> >>
>
> ok
>
> >> I'd also like to hear from Maciej and/or others what they think about
> >> this problem (that the userspace can set packet len to U32_MAX). Should
> >> we just go with this proposed u64 propagation or maybe we need to limit
> >> the maximum length which could be sent from the userspace?
> >
> > I prefer that we do not set a limit on it and go with the proposed
> > solution since I do not know what a future proof size limit would be.
> > Somebody could come up with a new virtual device that can send really
> > large packets, who knows.
> >
>
> The limit is already checked in the xp_aligned_validate_desc() function:

What I meant was, let us not introduce a new limit. I like your
proposed solution.

> static inline bool xp_aligned_validate_desc(struct xsk_buff_pool *pool,
>                                             struct xdp_desc *desc)
> {
>
> ...
>         if (offset + len > pool->chunk_size)
>                 return false;
> ...
> }
>
> and pool->chunk_size can't be more than PAGE_SIZE:
>
> static int xdp_umem_reg(struct xdp_umem *umem, struct xdp_umem_reg *mr)
> {
>         u32 chunk_size = mr->chunk_size, headroom = mr->headroom;
>         ...
>
>         if (chunk_size < XDP_UMEM_MIN_CHUNK_SIZE || chunk_size > PAGE_SIZE) {
>                 /* Strictly speaking we could support this, if:
>                  * - huge pages, or*
>                  * - using an IOMMU, or
>                  * - making sure the memory area is consecutive
>                  * but for now, we simply say "computer says no".
>                  */
>
>                 return -EINVAL;
>         }
>         ...
> }
>
> The problem is exactly the overflow.
>
> >> In any case, you raised a good topic.
> >>
> >>>
> >>>> Static Analysis Tools have no idea that nobody sends 4 Gb sized network
> >>>> packets.
> >>>>
> >>>
> >>> That's right. Static analyzer is only a tool, but in this case, the overflow
> >>> highlighted by the static analyzer can be used for malicious purposes.
> >>
> >> +1
> >>
> >> Also I really do hope Infotecs stayed independent from the govs and
> >> doesn't take part in any dual-purpose/gov-related projects.
> >>
> >> Thanks,
> >> Olek
> >>
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ