lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <218c929c-b2a2-4965-861b-5733a265acfa@redhat.com>
Date: Wed, 31 Dec 2025 05:01:11 +0200
From: mohammad heib <mheib@...hat.com>
To: Willem de Bruijn <willemdebruijn.kernel@...il.com>, netdev@...r.kernel.org
Cc: davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
 pabeni@...hat.com, horms@...nel.org, kernelxing@...cent.com,
 kuniyu@...gle.com, steffen.klassert@...unet.com, atenart@...nel.org
Subject: Re: [PATCH net] net: skbuff: fix truesize and head state corruption
 in skb_segment_list

Hi Willem,

Thank you for the review and for tracing the history of the accounting 
logic.


On 12/30/25 6:31 PM, Willem de Bruijn wrote:
> mheib@ wrote:
>> From: Mohammad Heib <mheib@...hat.com>
>>
>> When skb_segment_list is called during packet forwarding through
>> a bridge or VXLAN, it assumes that every fragment in a frag_list
>> carries its own socket ownership and head state. While this is true for
>> GSO packets created by the transmit path (via __ip_append_data), it is
>> not true for packets built by the GRO receive path.
> 
> We have to separate packets that use frag_list, a broader category,
> from those that use fraglist gso chaining. This code path is only
> exercised by the latter.
That makes perfect sense. I will remove the redundant is_flist check 
since this path is only reached for fraglist GSO, and I'll update the 
Fixes tag to ed4cccef64c1 in v2.
> 
>> In the GRO path, fragments are "orphans" (skb->sk == NULL) and were
>> never charged to a socket. However, the current logic in
>> skb_segment_list unconditionally adds every fragment's truesize to
>> delta_truesize and subsequently subtracts this from the parent SKB.
> 
> This was not present in the original fraglist chaining patch cited in
> the Fixes tag. It was added in commit ed4cccef64c1 ("gro: fix
> ownership transfer"). Which was a follow-on to commit 5e10da5385d2
> ("skbuff: allow 'slow_gro' for skb carring sock reference") removing
> the skb->destructor reference.
>   
>> This results a memory accounting leak, Since GRO fragments were never
>> charged to the socket in the first place, the "refund" results in the
>> parent SKB returning less memory than originally charged when it is
>> finally freed. This leads to a permanent leak in sk_wmem_alloc, which
>> prevents the socket from being destroyed, resulting in a persistent memory
>> leak of the socket object and its related metadata.
>>
>> The leak can be observed via KMEMLEAK when tearing down the networking
>> environment:
>>
>> unreferenced object 0xffff8881e6eb9100 (size 2048):
>>    comm "ping", pid 6720, jiffies 4295492526
>>    backtrace:
>>      kmem_cache_alloc_noprof+0x5c6/0x800
>>      sk_prot_alloc+0x5b/0x220
>>      sk_alloc+0x35/0xa00
>>      inet6_create.part.0+0x303/0x10d0
>>      __sock_create+0x248/0x640
>>      __sys_socket+0x11b/0x1d0
>>
>> This patch modifies skb_segment_list to only perform head state release
>> and truesize subtraction if the fragment explicitly owns a socket
>> reference. For GRO-forwarded packets where fragments are not owners,
>> the parent maintains the full truesize and acts as the single anchor for
>> the memory refund upon destruction.
> 
> Thanks for the report and fix. It can probably be simplified a bit
> based on knowledge that only fraglist chaining skbs reach this path.
> And the Fixes tag should reflect the patch that changed this
> accounting in the GRO patch. Matching that in the GSO path makes
> sense.
> 
>> Fixes: 3a1296a38d0c ("net: Support GRO/GSO fraglist chaining.")
>> Signed-off-by: Mohammad Heib <mheib@...hat.com>
>> ---
>>   net/core/skbuff.c | 18 ++++++++++++++++--
>>   1 file changed, 16 insertions(+), 2 deletions(-)
>>
>> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
>> index a00808f7be6a..aee9be42409b 100644
>> --- a/net/core/skbuff.c
>> +++ b/net/core/skbuff.c
>> @@ -4641,6 +4641,7 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb,
>>   	struct sk_buff *tail = NULL;
>>   	struct sk_buff *nskb, *tmp;
>>   	int len_diff, err;
>> +	bool is_flist = !!(skb_shinfo(skb)->gso_type & SKB_GSO_FRAGLIST);
> 
> This is guaranteed when entering skb_segment_list.
> 
>>   
>>   	skb_push(skb, -skb_network_offset(skb) + offset);
>>   
>> @@ -4656,7 +4657,15 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb,
>>   		list_skb = list_skb->next;
>>   
>>   		err = 0;
>> -		delta_truesize += nskb->truesize;
>> +
>> +		/* Only track truesize delta if the fragment is being orphaned.
>> +		 * In the GRO path, fragments don't have a socket owner (sk=NULL),
>> +		 * so the parent must maintain the total truesize to prevent
>> +		 * memory accounting leaks.
>> +		 */
>> +		if (!is_flist || nskb->sk)
>> +			delta_truesize += nskb->truesize;
>> +
I considered combining the two if (nskb->sk) blocks into one to simplify 
the code further, but I decided against it to avoid reordering the 
skb_clone and skb_push logic, which felt too risky for this path.
>>   		if (skb_shared(nskb)) {
>>   			tmp = skb_clone(nskb, GFP_ATOMIC);
>>   			if (tmp) {
>> @@ -4684,7 +4693,12 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb,
>>   
>>   		skb_push(nskb, -skb_network_offset(nskb) + offset);
>>   
>> -		skb_release_head_state(nskb);
>> +		/* For GRO-forwarded packets, fragments have no head state
>> +		 * (no sk/destructor) to release. Skip this.
>> +		 */
>> +		if (!is_flist || nskb->sk)
>> +			skb_release_head_state(nskb);
>> +
>>   		len_diff = skb_network_header_len(nskb) - skb_network_header_len(skb);
>>   		__copy_skb_header(nskb, skb);
>>   
>> -- 
>> 2.52.0
>>
> 
> 
Thanks


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ